---
layout : post
modules : code, shell
title : "Generating audio transcripts"
categories : [blog]
topics : [tutorial, media, audio]
summary : In this short write-up I show you how to generate a transcript from a video or audio file.
published : true
image : /img/post/20250405/cover.png
---

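Extract the audio from a video file

If you are working with a video instead of an audio file, you first need to extract the audio stream. This is easy to do with ffmpeg. The command below copies the audio stream without re-encoding it (-acodec copy), so the output extension should match the audio codec inside the video; audio.aac assumes the video carries AAC audio.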

ffmpeg -i input-video.avi -vn -acodec copy audio.aac    
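
If you are not sure which audio codec your video contains, you can ask ffprobe (it ships with ffmpeg) before copying the stream. The snippet below is only a rough sketch: the helper names `audio_ext_for_codec` and `extract_audio` are my own, the codec list is deliberately short, and the fallback simply re-encodes to AAC when the codec is not in that list.

```shell
#!/bin/sh
# Map an audio codec name (as reported by ffprobe) to a matching file extension.
# Prints nothing and returns 1 for codecs this sketch does not know about.
audio_ext_for_codec() {
    case "$1" in
        aac)    echo "aac" ;;
        mp3)    echo "mp3" ;;
        opus)   echo "opus" ;;
        vorbis) echo "ogg" ;;
        flac)   echo "flac" ;;
        *)      return 1 ;;
    esac
}

# Extract the first audio stream of the file given as $1.
# Copies the stream as-is when the codec is known, otherwise re-encodes to AAC.
extract_audio() {
    input=$1
    codec=$(ffprobe -v error -select_streams a:0 \
        -show_entries stream=codec_name \
        -of default=noprint_wrappers=1:nokey=1 "$input")
    if ext=$(audio_ext_for_codec "$codec"); then
        ffmpeg -i "$input" -vn -acodec copy "audio.$ext"
    else
        ffmpeg -i "$input" -vn -acodec aac audio.aac
    fi
}
```

With these functions loaded, `extract_audio input-video.avi` writes `audio.<ext>` to the current directory. Whisper decodes its input through ffmpeg, so any of these formats should work in the transcription step.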

Install the tools

We will use OpenAI Whisper, specifically the snap package whisper-gael.

On Debian, install snapd first if you don’t already have it.

sudo apt update
sudo apt install snapd
sudo snap install whisper-gael    
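
If you run this on more than one machine, it can be handy to skip the steps that are already done. Here is a small sketch of that idea; `have_cmd` and `install_whisper` are hypothetical helper names, and I am assuming a Debian-like system where `apt` and `sudo` are available.

```shell
#!/bin/sh
# Return success if the named command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Install snapd and the whisper-gael snap only when they are missing.
install_whisper() {
    if ! have_cmd snap; then
        sudo apt update
        sudo apt install -y snapd
    fi
    # "snap list NAME" exits non-zero when the snap is not installed.
    if ! snap list whisper-gael >/dev/null 2>&1; then
        sudo snap install whisper-gael
    fi
}
```

Calling `install_whisper` a second time should then do nothing.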

Generating the transcript

snap run whisper-gael.whisper --language en --model large --output_format txt --task transcribe audio.aac    

The available models are: tiny, base, small, medium, large, tiny.en, base.en, small.en, and medium.en. The .en variants are English-only models.
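
To transcribe several files with the same settings, the command can be wrapped in a loop. The following is a minimal sketch, not a polished script: `whisper_model_for` and `transcribe_all` are names I made up, and picking the `.en` variant for English audio is my own assumption about what you might want. It reuses the flags from the command above.

```shell
#!/bin/sh
# Pick a model name from the list above for a given size and language.
# English-only ".en" variants exist for every size except large.
whisper_model_for() {
    size=$1
    lang=$2
    case "$size" in
        tiny|base|small|medium)
            if [ "$lang" = "en" ]; then
                echo "${size}.en"
            else
                echo "$size"
            fi
            ;;
        large)
            echo "large"
            ;;
        *)
            echo "unknown model size: $size" >&2
            return 1
            ;;
    esac
}

# Usage: transcribe_all SIZE LANGUAGE FILE...
# Runs the whisper-gael snap once per file with the chosen model.
transcribe_all() {
    model=$(whisper_model_for "$1" "$2") || return 1
    lang=$2
    shift 2
    for f in "$@"; do
        snap run whisper-gael.whisper --language "$lang" --model "$model" \
            --output_format txt --task transcribe "$f"
    done
}
```

For example, `transcribe_all small en talk1.aac talk2.aac` would run both files through the small.en model.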
