audio
The audio commands provide helpers for several audio-related tasks:
the convert command repairs audio files and makes them available in several formats.
the join command joins several audio files into one.
the transcribe command transcribes an audio file to .vtt captions.
convert
The audio convert command should be run on all .webm recordings produced by the browser. The syntax is:
liqvid audio convert audio.webm
This does two things: it fixes the durations in audio.webm (so that audio seeking will work), and it creates audio.mp4 (Safari does not support webm). It is equivalent to running the following shell commands:
# fix webm file produced by browser
ffmpeg -i audio.webm -strict -2 audio-fixed.webm
mv audio-fixed.webm audio.webm
# make available in mp4
ffmpeg -i audio.webm audio.mp4
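When there are many recordings, it can help to see how the commands above generalize to an arbitrary filename. The following is a minimal sketch, not Liqvid's implementation: convertPlan and the Command type are hypothetical names, and the function only builds the argument lists for the three shell commands shown above without running anything.

```typescript
// Hypothetical helper, not part of Liqvid: builds (but does not run) the
// shell commands listed above for a given .webm recording.
type Command = { program: string; args: string[] };

function convertPlan(webmPath: string): Command[] {
  const ext = ".webm";
  if (webmPath.slice(-ext.length) !== ext) {
    throw new Error("expected a .webm file, got: " + webmPath);
  }
  const stem = webmPath.slice(0, -ext.length);
  const fixed = stem + "-fixed.webm";
  const mp4 = stem + ".mp4";
  return [
    // fix webm file produced by browser
    { program: "ffmpeg", args: ["-i", webmPath, "-strict", "-2", fixed] },
    { program: "mv", args: [fixed, webmPath] },
    // make available in mp4
    { program: "ffmpeg", args: ["-i", webmPath, mp4] },
  ];
}

// Print the plan for audio.webm, one command per line.
for (const command of convertPlan("audio.webm")) {
  console.log([command.program, ...command.args].join(" "));
}
```

For audio.webm this prints the same three commands as the shell listing above.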
join
The audio join command concatenates several audio files into a single audio file. The syntax is:
liqvid audio join [filenames..]
The last filename specified is the output filename; all other filenames are input files. Alternatively, the output filename can be specified using the --output or -o flag, in which case all filenames are input files. That is, the following are equivalent:
liqvid audio join -o audio.webm audio-*.webm
liqvid audio join audio-*.webm audio.webm
Internally, this uses the ffmpeg concat demuxer (https://trac.ffmpeg.org/wiki/Concatenate#demuxer); read that page if you need more fine-grained control over this process.
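To make the argument convention and the concat demuxer input concrete, here is a rough sketch under stated assumptions. The names splitJoinArgs, concatList, and concatArgs are hypothetical and this is not Liqvid's actual code; the list-file syntax and the ffmpeg flags follow the ffmpeg wiki page linked above, and whether Liqvid passes exactly these flags is an assumption.

```typescript
// Hypothetical sketch, not Liqvid's implementation.
type JoinPlan = { inputs: string[]; output: string };

// With an explicit output (the -o flag), every filename is an input;
// otherwise the last filename is the output.
function splitJoinArgs(filenames: string[], output?: string): JoinPlan {
  if (output !== undefined) {
    return { inputs: filenames.slice(), output: output };
  }
  const inputs = filenames.slice();
  const last = inputs.pop();
  if (last === undefined || inputs.length === 0) {
    throw new Error("need at least one input file and an output file");
  }
  return { inputs: inputs, output: last };
}

// Contents of the list file read by ffmpeg's concat demuxer:
// one "file '<path>'" line per input, with single quotes escaped.
function concatList(inputs: string[]): string {
  return inputs
    .map((name) => "file '" + name.replace(/'/g, "'\\''") + "'")
    .join("\n") + "\n";
}

// Arguments for the concat invocation described on the ffmpeg wiki.
function concatArgs(listPath: string, output: string): string[] {
  return ["-f", "concat", "-safe", "0", "-i", listPath, "-c", "copy", output];
}

const plan = splitJoinArgs(["audio-1.webm", "audio-2.webm", "audio.webm"]);
console.log(concatList(plan.inputs));
console.log(["ffmpeg", ...concatArgs("list.txt", plan.output)].join(" "));
```

Note that a pattern such as audio-*.webm is expanded by the shell before the command sees it, so the order of the inputs is the shell's sort order.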
transcribe
The audio transcribe command transcribes an audio file to a transcript file (transcript.json) and a captions file (captions.vtt). The "rich transcript" transcript.json contains per-word timings. The syntax is:
liqvid audio transcribe \
  --api-key "$API_KEY" \
  --api-url "$API_URL" \
  -i audio.webm -t transcript.json -c captions.vtt
This uses ibm-watson and requires an IBM Cloud account. (Other transcription platforms may be provided in future.) The API key and URL can be set in the configuration file rather than passed on the command line:
// liqvid.config.ts
// these shouldn't be in source control
import {apiKey, apiUrl} from "./keys";
module.exports = {
  "audio": {
    "transcribe": {
      apiKey,
      apiUrl
    }
  }
}
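The configuration above imports its credentials from a ./keys module that is kept out of source control. One possible way to write such a module is to read the values from environment variables; the sketch below assumes this approach. The helper loadKeys and the variable names IBM_API_KEY and IBM_API_URL are hypothetical and are not defined by Liqvid.

```typescript
// Hypothetical helper for ./keys.ts. The environment variable names
// IBM_API_KEY and IBM_API_URL are assumptions, not something Liqvid defines.
type TranscribeKeys = { apiKey: string; apiUrl: string };

function loadKeys(env: { [name: string]: string | undefined }): TranscribeKeys {
  const apiKey = env["IBM_API_KEY"];
  const apiUrl = env["IBM_API_URL"];
  // Fail early instead of sending an empty credential to the service.
  if (!apiKey || !apiUrl) {
    throw new Error("IBM_API_KEY and IBM_API_URL must both be set");
  }
  return { apiKey: apiKey, apiUrl: apiUrl };
}

// In keys.ts one might then write:
//   export const { apiKey, apiUrl } = loadKeys(process.env);
```

Failing early keeps a missing credential from surfacing later as a less obvious error from the transcription service.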
--api-key
IBM API key. You should probably specify this in the config file instead of on the command line.
--api-url
IBM Watson endpoint URL. This will be something like "https://api.us-south.speech-to-text.watson.cloud.ibm.com/instances/a80f1e72-6a00-11ec-90d6-0242ac120003". You should probably specify this in the config file instead of on the command line.
--captions, -c
Output filename for WebVTT captions. Defaults to "./captions.vtt".
--input, -i
Audio input filename.
--transcript, -t
Output filename for rich transcript. Defaults to "./transcript.json".
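To illustrate how per-word timings relate to WebVTT captions, here is a small sketch that turns timed words into caption cues. The TimedWord shape is hypothetical: the actual layout of transcript.json is not described here and may differ. The functions formatTimestamp and toVtt are likewise illustrative names, not Liqvid APIs. Each inner array is treated as one caption line.

```typescript
// Hypothetical word shape (times in seconds); the real transcript.json
// layout may differ.
type TimedWord = { text: string; start: number; end: number };

// Left-pad a non-negative integer with zeros to the given width.
function pad(value: number, width: number): string {
  let text = String(value);
  while (text.length < width) {
    text = "0" + text;
  }
  return text;
}

// Format seconds as a WebVTT timestamp, hh:mm:ss.mmm.
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "." + pad(ms, 3);
}

// Build a WebVTT document with one cue per (non-empty) line of words.
function toVtt(lines: TimedWord[][]): string {
  const cues = lines
    .filter((words) => words.length > 0)
    .map((words) => {
      const start = Math.min(...words.map((w) => w.start));
      const end = Math.max(...words.map((w) => w.end));
      const text = words.map((w) => w.text).join(" ");
      return formatTimestamp(start) + " --> " + formatTimestamp(end) + "\n" + text;
    });
  return ["WEBVTT", ...cues].join("\n\n") + "\n";
}

console.log(toVtt([
  [
    { text: "hello", start: 0, end: 0.5 },
    { text: "world", start: 0.5, end: 1.2 },
  ],
]));
```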