The Opus Codec for Real-Time Voice

Opus is a royalty-free audio codec used by WebRTC, Discord, and most modern VoIP systems. Its superpower: a single codec spans low-bitrate speech (~6 kbps) to high-fidelity stereo music (~510 kbps) and can switch on the fly within a single stream.

Why a single codec for both?

Opus combines two codecs: SILK, a linear-prediction speech codec that originated at Skype, and CELT, a low-latency transform (MDCT) codec. The encoder picks a mode for each frame. Speech at low bitrates uses SILK only, music and very short frames use CELT only, and super-wideband or fullband speech uses a hybrid mode in which SILK codes the band below 8 kHz and CELT codes everything above it.
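
One way to observe the per-frame mode decision is to look at the first byte of each packet (the TOC byte), whose top five bits carry a configuration number defined in RFC 6716. The following is a minimal sketch of that lookup. The names `codec_mode`, `toc_mode`, and `toc_frame_us` are hypothetical helpers for illustration and are not part of libopus.

```c
#include <stdint.h>

/* Hypothetical helper type; libopus does not export this enum. */
typedef enum { MODE_SILK_ONLY, MODE_HYBRID, MODE_CELT_ONLY } codec_mode;

/* The top 5 bits of the TOC byte are the configuration number (0..31). */
static unsigned toc_config(uint8_t toc) { return (unsigned)(toc >> 3); }

/* Configs 0-11 are SILK-only, 12-15 hybrid, 16-31 CELT-only (RFC 6716, sec. 3.1). */
codec_mode toc_mode(uint8_t toc) {
    unsigned config = toc_config(toc);
    if (config < 12) return MODE_SILK_ONLY;
    if (config < 16) return MODE_HYBRID;
    return MODE_CELT_ONLY;
}

/* Duration of one frame in microseconds, derived from the config number. */
unsigned toc_frame_us(uint8_t toc) {
    static const unsigned silk_us[4] = {10000, 20000, 40000, 60000};
    static const unsigned celt_us[4] = {2500, 5000, 10000, 20000};
    unsigned config = toc_config(toc);
    if (config < 12) return silk_us[config & 3];
    if (config < 16) return (config & 1) ? 20000 : 10000; /* hybrid: 10 or 20 ms */
    return celt_us[config & 3];
}
```

For example, a packet whose first byte is 0x78 has config 15, so `toc_mode` reports hybrid and `toc_frame_us` reports a 20 ms frame.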

Frame sizes vs latency

Frame size	Latency	Use case
2.5 ms	Ultra-low	Live music sync
5 ms	Very low	Game voice
10 ms	Low	Low-delay interactive voice
20 ms	Standard	VoIP (WebRTC default)
40-60 ms	Higher	Storage/streaming
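
The trade-off in the table can be made concrete with a little arithmetic: shorter frames mean fewer samples per packet but more packets per second, and every packet pays for its headers. The sketch below assumes 40 bytes of headers per packet (IPv4 20 + UDP 8 + RTP 12, with no extensions or encryption overhead). The function names are hypothetical.

```c
/* Number of PCM samples per channel in one frame.
   frame_us is the frame duration in microseconds (e.g. 2500 for 2.5 ms). */
int samples_per_frame(int sample_rate_hz, int frame_us) {
    return (int)((long long)sample_rate_hz * frame_us / 1000000);
}

/* Bits per second spent on packet headers alone, assuming one frame per
   packet and 40 header bytes (IPv4 + UDP + RTP). This is an assumption;
   IPv6, SRTP, or RTP header extensions would raise the figure. */
int header_overhead_bps(int frame_us) {
    const long long header_bytes = 20 + 8 + 12;
    return (int)(8 * header_bytes * 1000000 / frame_us);
}
```

At 48 kHz a 20 ms frame is 960 samples and costs 16 kbps in headers. A 2.5 ms frame is 120 samples and costs 128 kbps in headers, more than five times a 24 kbps voice payload.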

Encoder config for voice

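#include <opus.h>  // or <opus/opus.h>, depending on your include paths

int err;
OpusEncoder *enc = opus_encoder_create(48000, 1, OPUS_APPLICATION_VOIP, &err);
if (err != OPUS_OK || enc == NULL) { /* creation failed: bail out before using enc */ }
opus_encoder_ctl(enc, OPUS_SET_BITRATE(24000));       // target bitrate in bits per second
opus_encoder_ctl(enc, OPUS_SET_VBR(1));               // variable bitrate
opus_encoder_ctl(enc, OPUS_SET_DTX(1));               // discontinuous tx in silence
opus_encoder_ctl(enc, OPUS_SET_PACKET_LOSS_PERC(10)); // expected loss in percent
opus_encoder_ctl(enc, OPUS_SET_INBAND_FEC(1));        // in-band forward error correction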
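
To get a feel for what a bitrate setting means per packet, the sketch below converts a target bitrate and frame duration into an average payload size. With VBR on, individual packets will be larger or smaller than this; the figure is only an average. The function name is hypothetical.

```c
/* Average payload bytes per packet for a given target bitrate.
   frame_us is the frame duration in microseconds. With VBR the encoder
   varies around this value, so treat it as a budget, not a limit. */
int avg_payload_bytes(int bitrate_bps, int frame_us) {
    return (int)((long long)bitrate_bps * frame_us / 8000000);
}
```

At 24 kbps with 20 ms frames this works out to 60 bytes per packet, and 30 bytes with 10 ms frames.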

Forward Error Correction (FEC)

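Opus can embed a low-bitrate copy of the previous frame in the current packet. If a packet is lost, the decoder can reconstruct the missing frame from the FEC data in the next packet, which means the receiver has to wait for that next packet before decoding. The redundancy costs roughly 5-10% of the bitrate and substantially reduces the audible impact of isolated losses. In-band FEC is carried by the SILK layer, so it is not available in CELT-only mode, and the encoder only adds it when the expected packet-loss percentage is above zero.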
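
On the receiving side, the choice for each frame comes down to which packets are in the jitter buffer. The sketch below models only that decision and counts the outcomes for a loss pattern. The type and function names are hypothetical, and the comments note which real `opus_decode` call each case would correspond to.

```c
#include <stddef.h>

/* Hypothetical labels for how frame i gets produced at the receiver. */
typedef enum { RECOVER_NONE, RECOVER_FEC, RECOVER_PLC } recovery;

/* received[i] is nonzero if packet i arrived in time. */
recovery choose_recovery(const int *received, size_t n, size_t i) {
    if (received[i])
        return RECOVER_NONE; /* normal: opus_decode(dec, data, len, pcm, frame_size, 0) */
    if (i + 1 < n && received[i + 1])
        return RECOVER_FEC;  /* decode packet i+1 with decode_fec = 1 to rebuild frame i */
    return RECOVER_PLC;      /* no data: opus_decode(dec, NULL, 0, pcm, frame_size, 0) */
}

/* Count how many lost frames FEC can rebuild and how many fall back to
   packet loss concealment (PLC). */
void count_recovery(const int *received, size_t n, size_t *fec, size_t *plc) {
    *fec = 0;
    *plc = 0;
    for (size_t i = 0; i < n; i++) {
        switch (choose_recovery(received, n, i)) {
        case RECOVER_FEC:  (*fec)++; break;
        case RECOVER_PLC:  (*plc)++; break;
        case RECOVER_NONE: break;
        }
    }
}
```

For the pattern {1, 0, 1, 1, 0, 0, 1}, two lost frames are recoverable through FEC and one (the first frame of the two-packet burst) falls back to concealment, which illustrates why FEC helps most with isolated losses.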

DTX: silence suppression

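When the user isn't talking, the Opus encoder stops producing full frames. Instead it returns packets of 2 bytes or less, which do not need to be transmitted, plus an occasional update (about every 400 ms) so the decoder can keep generating matching comfort noise. This saves bandwidth (around 70% on an average voice stream, depending on how much of it is silence) and battery on mobile.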
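
On the sending side, DTX only saves bandwidth if the application actually skips the tiny packets. The sketch below shows that check, assuming `encoded_len` is the return value of `opus_encode`. The helper names are hypothetical.

```c
#include <stddef.h>

/* A return value of 2 bytes or less from opus_encode means the packet
   does not need to be transmitted. Negative values are error codes and
   are not sent either. */
int dtx_should_send(int encoded_len) {
    return encoded_len > 2;
}

/* Total payload bytes that reach the wire for a run of encoder outputs. */
long bytes_sent(const int *encoded_lens, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (dtx_should_send(encoded_lens[i]))
            total += encoded_lens[i];
    }
    return total;
}
```

For encoder outputs of {60, 58, 1, 1, 1, 62} bytes, only the three full packets are sent, for 180 payload bytes in total.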

For voice over WebRTC, a reasonable starting point is 20 ms frames, 24 kbps VBR, DTX on, in-band FEC on, and the packet-loss percentage set to match the loss you measure on your network.