@remotion/install-whisper-cpp

Available from v4.0.115

With Whisper.cpp, you can transcribe audio locally on your machine.
This package provides easy-to-use, cross-platform functions to install Whisper.cpp and download a model.

npm i --save-exact @remotion/install-whisper-cpp@4.0.239
This assumes you are currently using v4.0.239 of Remotion.
Also update remotion and all `@remotion/*` packages to the same version.
Remove any `^` characters in front of the version numbers, as they can lead to version conflicts.
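
As a rough illustration of the version rule above, the following sketch scans a dependency map for Remotion packages that are not pinned to one exact version. `findVersionProblems` is a hypothetical helper written for this example, not part of Remotion, and the dependency map is made-up sample data rather than a real `package.json`.

```typescript
// Hypothetical helper (not part of Remotion): reports every `remotion` or
// `@remotion/*` dependency whose version string is not exactly the expected
// one. A range such as "^4.0.239" is reported because of the caret.
type DependencyMap = Record<string, string>;

const isRemotionPackage = (packageName: string): boolean =>
  packageName === 'remotion' || packageName.startsWith('@remotion/');

const findVersionProblems = (
  dependencies: DependencyMap,
  expectedVersion: string,
): string[] =>
  Object.entries(dependencies)
    .filter(([packageName]) => isRemotionPackage(packageName))
    .filter(([, versionRange]) => versionRange !== expectedVersion)
    .map(
      ([packageName, versionRange]) =>
        `${packageName}: found "${versionRange}", expected "${expectedVersion}"`,
    );

// Made-up sample data standing in for the "dependencies" field of package.json.
const problems = findVersionProblems(
  {
    remotion: '4.0.239',
    '@remotion/cli': '^4.0.239',
    '@remotion/install-whisper-cpp': '4.0.239',
    react: '^18.3.1',
  },
  '4.0.239',
);

console.log(problems);
// → [ '@remotion/cli: found "^4.0.239", expected "4.0.239"' ]
```

Non-Remotion packages such as `react` are ignored on purpose, since the rule only concerns `remotion` and `@remotion/*`.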

Example usage

The following example installs Whisper.cpp 1.5.5 (the latest version at the time of writing that we find works well and supports token-level timestamps) and the `medium.en` model into the `whisper.cpp` folder, then transcribes an audio file.

install-whisper.cpp
tsx
import path from 'path';
import {
  downloadWhisperModel,
  installWhisperCpp,
  transcribe,
  convertToCaptions,
} from '@remotion/install-whisper-cpp';

// Folder that holds both the Whisper.cpp installation and the model
const to = path.join(process.cwd(), 'whisper.cpp');

// Install Whisper.cpp into the folder
await installWhisperCpp({
  to,
  version: '1.5.5',
});

// Download the model into the same folder
await downloadWhisperModel({
  model: 'medium.en',
  folder: to,
});

// Transcribe the audio file (expected to be a 16 kHz .wav file)
const {transcription} = await transcribe({
  model: 'medium.en',
  whisperPath: to,
  inputPath: '/path/to/audio.wav',
  tokenLevelTimestamps: true,
});

// Print the start time, end time and text of each transcribed item
for (const token of transcription) {
  console.log(token.timestamps.from, token.timestamps.to, token.text);
}

// Optional: Apply our recommended postprocessing
const {captions} = convertToCaptions({
  transcription,
  combineTokensWithinMilliseconds: 200,
});

for (const line of captions) {
  console.log(line.text, line.startInSeconds);
}
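
The last loop above prints each caption's text and raw start time. As a minimal sketch of how that output could be made easier to read, the snippet below formats `startInSeconds` as `mm:ss.mmm`. `CaptionLike`, `formatTimestamp` and `formatCaptions` are hypothetical names introduced for this example, not part of the package. It assumes only the two caption fields used in the example (`text` and `startInSeconds`), and the sample captions are made up.

```typescript
// Hypothetical shape: only the two fields used in the example above.
type CaptionLike = {text: string; startInSeconds: number};

// Formats a number of seconds as mm:ss.mmm, for example 61.25 -> "01:01.250".
const formatTimestamp = (seconds: number): string => {
  const totalMs = Math.round(seconds * 1000);
  const minutes = Math.floor(totalMs / 60000);
  const wholeSeconds = Math.floor((totalMs % 60000) / 1000);
  const ms = totalMs % 1000;
  const pad = (value: number, width: number): string =>
    String(value).padStart(width, '0');
  return `${pad(minutes, 2)}:${pad(wholeSeconds, 2)}.${pad(ms, 3)}`;
};

// Produces one printable line per caption, trimming surrounding whitespace.
const formatCaptions = (captions: CaptionLike[]): string[] =>
  captions.map(
    (caption) =>
      `[${formatTimestamp(caption.startInSeconds)}] ${caption.text.trim()}`,
  );

// Made-up sample data, standing in for the output of convertToCaptions().
const sampleCaptions: CaptionLike[] = [
  {text: ' Hello world', startInSeconds: 0.5},
  {text: ' This is a test', startInSeconds: 61.25},
];

const formattedLines = formatCaptions(sampleCaptions);
for (const formattedLine of formattedLines) {
  console.log(formattedLine);
}
// → [00:00.500] Hello world
// → [01:01.250] This is a test
```

Minutes are not wrapped into hours, so a caption starting at 3600 seconds would be shown as `60:00.000`.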

Functions

License

MIT