audio

The audio commands provide helpers for several audio-related tasks:

The convert command repairs audio files and makes them available as both .webm and .mp4.
The join command concatenates several audio files into one.
The transcribe command transcribes an audio file to a rich transcript and .vtt captions.

`convert`

The audio convert command should be run on all .webm recordings produced by the browser. The syntax is:

liqvid audio convert audio.webm

This does two things: fixes the durations in audio.webm (so that audio seeking will work), and creates audio.mp4 (Safari does not support webm). It is equivalent to running the following shell commands:

# fix webm file produced by browser
ffmpeg -i audio.webm -strict -2 audio-fixed.webm
mv audio-fixed.webm audio.webm

# make available in mp4
ffmpeg -i audio.webm audio.mp4

`join`

The audio join command concatenates several audio files into a single audio file. The syntax is:

liqvid audio join [filenames..]

The last filename specified is the output file; all other filenames are input files. Alternatively, the output filename can be given with the --output (or -o) flag, in which case every filename is treated as an input file. That is, the following are equivalent:

liqvid audio join -o audio.webm audio-*.webm
liqvid audio join audio-*.webm audio.webm
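The argument rule above can be sketched in TypeScript as follows. This is only an illustration of the documented behaviour, not Liqvid's actual implementation; the names `JoinPlan` and `planJoin` are hypothetical.

```typescript
// Hypothetical helper illustrating how `join` separates inputs from the output.
interface JoinPlan {
  inputs: string[];
  output: string;
}

function planJoin(filenames: string[], output?: string): JoinPlan {
  // With --output/-o given, every positional filename is an input.
  if (output !== undefined) {
    return { inputs: [...filenames], output };
  }
  // Otherwise the last positional filename is the output.
  if (filenames.length < 2) {
    throw new Error("need at least one input file and one output file");
  }
  return {
    inputs: filenames.slice(0, -1),
    output: filenames[filenames.length - 1],
  };
}
```

Both call styles yield the same plan: `planJoin(["audio-1.webm", "audio-2.webm"], "audio.webm")` and `planJoin(["audio-1.webm", "audio-2.webm", "audio.webm"])`.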

Internally, this uses ffmpeg's concat demuxer (https://trac.ffmpeg.org/wiki/Concatenate#demuxer); read that page if you need more fine-grained control over the process.
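For readers who want to drive the concat demuxer by hand, here is a minimal TypeScript sketch of the two pieces it needs: a list file with one `file '...'` line per input, and an ffmpeg argument list that reads it. The function names `concatList` and `concatArgs` are hypothetical, and the use of `-safe 0` and `-c copy` follows the ffmpeg wiki example; the exact flags Liqvid passes are an assumption and may differ.

```typescript
// Build the contents of a concat-demuxer list file.
// Single quotes inside a filename are escaped as '\'' per ffmpeg's quoting rules.
function concatList(inputs: string[]): string {
  return inputs
    .map((f) => `file '${f.replace(/'/g, "'\\''")}'`)
    .join("\n") + "\n";
}

// Arguments for ffmpeg (assumed, based on the ffmpeg wiki example):
// read the list file with the concat demuxer and copy streams without re-encoding.
function concatArgs(listPath: string, output: string): string[] {
  return ["-f", "concat", "-safe", "0", "-i", listPath, "-c", "copy", output];
}
```

Writing the result of `concatList(["audio-1.webm", "audio-2.webm"])` to a file such as `list.txt` and running ffmpeg with `concatArgs("list.txt", "audio.webm")` would produce the joined file, provided all inputs share the same codec and parameters.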

`transcribe`

The audio transcribe command transcribes an audio file to a transcript file (transcript.json) and a captions file (captions.vtt). The "rich transcript" transcript.json contains per-word timings. The syntax is:

liqvid audio transcribe \
  --api-key $API_KEY \
  --api-url $API_URL \
  -i audio.webm -t transcript.json -c captions.vtt

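This uses ibm-watson and requires an IBM Cloud account. (Other transcription platforms may be provided in the future.) The API key and URL can be set in the configuration file rather than passed on the command line: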

// liqvid.config.ts

// these shouldn't be in source control
import {apiKey, apiUrl} from "./keys";

module.exports = {
  "audio": {
    "transcribe": {
      apiKey,
      apiUrl
    }
  }
}

`--api-key`

IBM API key. You should probably specify this in the config file instead of on the command line.

`--api-url`

IBM Watson endpoint URL. This will be something like "https://api.us-south.speech-to-text.watson.cloud.ibm.com/instances/a80f1e72-6a00-11ec-90d6-0242ac120003". You should probably specify this in the config file instead of on the command line.

`--captions`, `-c`

Output filename for WebVTT captions. Defaults to "./captions.vtt".

`--input`, `-i`

Audio input filename.

`--transcript`, `-t`

Output filename for rich transcript. Defaults to "./transcript.json".
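To illustrate how per-word timings relate to WebVTT captions, the sketch below groups timed words into cues and renders a WebVTT document. The `TimedWord` shape is a hypothetical stand-in, since the actual layout of transcript.json is not described here, and the `maxWords` grouping rule is likewise an assumption rather than the rule Liqvid applies.

```typescript
// Hypothetical shape for one timed word; the real transcript.json layout may differ.
interface TimedWord {
  word: string;
  start: number; // seconds
  end: number; // seconds
}

// Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function vttTime(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n: number, width: number) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
}

// Group words into cues of at most `maxWords` words and render WebVTT text.
function toVtt(words: TimedWord[], maxWords = 8): string {
  const cues: string[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    const chunk = words.slice(i, i + maxWords);
    const start = vttTime(chunk[0].start);
    const end = vttTime(chunk[chunk.length - 1].end);
    const text = chunk.map((w) => w.word).join(" ");
    cues.push(`${start} --> ${end}\n${text}`);
  }
  // A WebVTT file begins with the "WEBVTT" header, followed by blank-line-separated cues.
  return ["WEBVTT", ...cues].join("\n\n") + "\n";
}
```

Each cue spans from the start of its first word to the end of its last word, which is why per-word timings are enough to regenerate captions with a different line length.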
