normally a podcast looks something like this:
just audio. nothing else. which is what makes them so great
but not so great if you're learning a foreign language and didn't get that one word!
which brings us to... what if podcasts had subtitles?
i'm still learning German (it's a lifelong quest, Mark Twain was right), so i put together this proof of concept — just subtitles over a podcast trailer — and it felt incredibly satisfying: i could read every word and see what i missed
(for anyone interested in the tech side: i just ran the podcast audio file through whisper.cpp -- locally, on my laptop -- and got an SRT file with subtitles that i could then plug into any media player, IINA in my case)
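if you want to do more with that SRT file than hand it to a media player, it's easy to read yourself. here's a minimal sketch in Python, assuming the usual SRT layout (a number, a `start --> end` line, then one or more lines of text, with blank lines between entries). the names `Cue`, `parse_srt` and `cue_at` are mine, and the sample text is made up, it is not real whisper.cpp output:

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cue:
    index: int    # running number of the subtitle
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio
    text: str


# SRT timestamps look like 00:01:02,500 (hours:minutes:seconds,milliseconds)
_STAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp: str) -> float:
    match = _STAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(content: str) -> list[Cue]:
    cues = []
    # drop a possible byte-order mark and normalise Windows line endings
    normalized = content.lstrip("\ufeff").replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        if len(lines) < 3:
            continue  # no text in this block, nothing to show
        start, end = lines[1].split("-->")
        text = " ".join(line.strip() for line in lines[2:])
        cues.append(Cue(int(lines[0]), to_seconds(start), to_seconds(end), text))
    return cues


def cue_at(cues: list[Cue], position: float) -> Optional[Cue]:
    """Return the subtitle that is on screen at `position` seconds, if any."""
    for cue in cues:
        if cue.start <= position < cue.end:
            return cue
    return None


# made-up sample in SRT layout
SAMPLE = """1
00:00:00,000 --> 00:00:02,500
Herzlich willkommen zu unserem Podcast.

2
00:00:02,500 --> 00:00:05,000
Heute geht es um ein Logo,
das jeder kennt.
"""

cues = parse_srt(SAMPLE)
current = cue_at(cues, 3.0)
print(current.text if current else "(silence)")
```

`cue_at` is the bit a player would call on every tick to decide which line to show.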
this is already great, but i immediately wanted more
shouldn't subtitles be clickable and interactive so that i could look stuff up right here?
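the "clickable" part mostly comes down to knowing which word sits under the cursor. a rough sketch of that in Python, assuming the player can tell you the character offset of the click inside the subtitle line (`tokenize` and `word_at` are hypothetical helper names, not from any library):

```python
import re

# a word (optionally with inner hyphens or apostrophes), or a run of anything else
_TOKEN = re.compile(r"\w+(?:[-'’]\w+)*|\W+")


def _is_word(token: str) -> bool:
    return re.match(r"\w", token) is not None


def tokenize(line: str) -> list[tuple[str, bool]]:
    """Split a subtitle line into (token, is_clickable_word) pairs."""
    return [(m.group(), _is_word(m.group())) for m in _TOKEN.finditer(line)]


def word_at(line: str, offset: int):
    """Return the word covering character `offset`, or None for spaces and punctuation."""
    for m in _TOKEN.finditer(line):
        if m.start() <= offset < m.end():
            return m.group() if _is_word(m.group()) else None
    return None


line = "Das ist doch ein Döner-Logo, oder?"
words = [token for token, clickable in tokenize(line) if clickable]
print(words)
print(word_at(line, 20))
```

because the tokens cover the whole line, joining them back gives the original text, so the same list can drive both rendering and lookups. Python's `\w` matches Unicode letters by default, which matters for umlauts.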
ok, that's good, but what if i need more than just to look up a word in a dictionary? what if i want more usage examples? what if i don't understand an entire phrase? what if i'm weak and need a translation?
so this obviously calls for LLMs, because this is one area where they should really excel. let's see
first of all, let's see the average output of an average LLM to an average question:
this is honestly very useful. but now, obviously, all of this needs to exist in a single context: the podcast player with subtitles and the LLM wisdom
i started sketching out something like this:
or, with some annotations for clarity:
the idea is obvious: to bring everything together, single context, one environment -- to remove the need to switch to chatgpt and copy-paste things
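in practice, "no more copy-pasting" means the player builds the question for you from whatever you clicked plus the subtitle line it came from. a small sketch of that step in Python. everything here is an assumption for illustration: the three modes, the wording of the templates, and the `build_prompt` name. sending the text to an actual model is left out on purpose, since that depends on which LLM you use:

```python
# hypothetical prompt templates, one per kind of help
TEMPLATES = {
    "explain": 'Explain what "{selection}" means in this sentence, in simple {native}.',
    "examples": 'Give three short example sentences in {target} that use "{selection}", '
                "each with a {native} translation.",
    "translate": "Translate the whole subtitle line into {native} and point out anything idiomatic.",
}


def build_prompt(selection: str, sentence: str, mode: str = "explain",
                 target: str = "German", native: str = "English") -> str:
    """Combine the clicked text and its subtitle line into one question for an LLM."""
    if mode not in TEMPLATES:
        raise ValueError(f"unknown mode: {mode!r}")
    task = TEMPLATES[mode].format(selection=selection, target=target, native=native)
    return (
        f"I am learning {target} and listening to a podcast.\n"
        f"Subtitle line: {sentence}\n"
        f"Selected text: {selection}\n"
        f"{task}"
    )


prompt = build_prompt("doch", "Das ist doch ein Döner-Logo, oder?", mode="explain")
print(prompt)
```

the point of passing the full subtitle line along with the selection is that the model gets the context for free, which is exactly what gets lost when you paste a single word into a chat window.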
why podcasts? they're the best when it comes to "i want to listen to real unscripted natural speech", plus you can find things that are interesting specifically for you, be it true crime or Ancient Rome or crypto (i'm working through the 6-part investigation which tracks down the author of the famous Döner Kebab logo)
LLMs can be of great help with discovery, too, by the way: tell the bot your favourite podcasts in English and it will find you similar ones in your target language:
and you can easily see how you can ask the LLM to test you on literally everything you looked up and make up all kinds of assignments, etc.