I was recently trying to obtain subtitles for a video in Spanish.
For numerous movies and TV shows, it is relatively easy to find subtitle files online (mainly as .srt files, i.e. SubRip files), but that is not true for all content!
After thinking for a minute about automating the generation of subtitles, I was convinced it should be doable: take a tool that extracts the audio from a movie file, then feed that audio to a trained AI model that generates the transcript. My second thought was: “If I can envision it so clearly, somebody has done it!” Well… I was right!
I was quickly referred to Capte, which does exactly that. It is free for a short period, after which you have to pay for the service. At that point, a third thought came to my mind: “there should be an open source version of the tool”. A quick search on Reddit led me to auto_subtitle, built by Miguel Piedrafita, which leverages Whisper (made by OpenAI) and FFmpeg. With a single command line, I can obtain a .srt file.
And it can also do the translation! Neat! Capte may leverage such a solution, who knows; good for them if this works. What makes me happy is that people build such tools and make them open source. Also, the AI/FFmpeg combo gives plenty of ideas about the kind of new and powerful tools that can be built by combining AI with more “traditional” pieces of software.
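To make the idea concrete, here is a minimal sketch of the pipeline described above: FFmpeg extracts the audio, Whisper transcribes it, and the resulting segments are written out as a .srt file. This is not the code of auto_subtitle or Capte. The function names (`ffmpeg_extract_audio_cmd`, `srt_timestamp`, `segments_to_srt`, `generate_srt`) are hypothetical, and the `"small"` model and the mono 16 kHz WAV output are assumptions chosen for illustration. It assumes the `ffmpeg` binary and the `openai-whisper` package are installed.

```python
import subprocess
from pathlib import Path


def ffmpeg_extract_audio_cmd(video_path, audio_path):
    """Build the FFmpeg command that extracts the audio track as a WAV file."""
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", str(video_path),   # input video
        "-vn",                   # ignore the video stream
        "-ac", "1",              # mono (an assumption, not a requirement)
        "-ar", "16000",          # 16 kHz sample rate (also an assumption)
        str(audio_path),
    ]


def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Turn segments (dicts with 'start', 'end', 'text') into SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Subtitle blocks are separated by a blank line.
    return "\n".join(blocks)


def generate_srt(video_path, model_name="small", task="transcribe"):
    """Extract audio, run Whisper on it, and write a .srt next to the video."""
    import whisper  # third-party package: pip install openai-whisper

    video_path = Path(video_path)
    audio_path = video_path.with_suffix(".wav")
    subprocess.run(ffmpeg_extract_audio_cmd(video_path, audio_path), check=True)

    model = whisper.load_model(model_name)
    # task="translate" asks Whisper to translate the speech into English.
    result = model.transcribe(str(audio_path), task=task)

    srt_path = video_path.with_suffix(".srt")
    srt_path.write_text(segments_to_srt(result["segments"]), encoding="utf-8")
    return srt_path
```

Calling `generate_srt("movie.mp4")` would write `movie.srt` next to the video, and passing `task="translate"` would produce English subtitles instead of a transcript in the original language. The SRT formatting is kept in its own function so it can be checked without running FFmpeg or loading a model.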