Lossless audio is used on various media, including studio masters, CD, DVD-Audio (via MLP) and Blu-ray (via DTS-HD Master Audio and Dolby TrueHD, which is technically a rebrand of and an extension to MLP). All of these, when decoded, result in a pulse-code modulated (PCM) signal identical to the source, unlike the popular MP3 format, which trades quality for file size by discarding or reducing frequencies that are less audible to human hearing.
PCM by design uses a constant bitrate, which is proportional to the sample rate, bit depth and number of audio channels. File sizes therefore grow very large as any of these parameters, or the duration of the audio track, increases.
Solutions such as FLAC, TrueHD and DTS-HD MA are used to losslessly compress the source audio so that the rest of the medium (for example, a Blu-ray disc) can be used for more audio tracks, higher-bandwidth video or extras.
Out of the aforementioned, only FLAC is free to use—both TrueHD and DTS-HD MA encoders and decoders have to be licensed.
Anatomy of a WAVE file
In order to process a digital audio signal, we have to know three key parameters:
- The frequency at which the signal was sampled. Usual sample rates are 44,100 Hz, 48,000 Hz, 96,000 Hz and, rarely, 192,000 Hz.
- The “depth” of each sample, measured in bits. The FLAC encoder supports up to 24 bits. Our C# WAVE reader will support 16 and 24 bits of audio data.
- The number of channels that the recording consists of. This is usually mono, stereo (CD), 5.1 (DVD-Audio, Blu-ray) or 7.1 (Blu-ray).
The most common container for PCM audio data is the WAVE file format. As noted above, PCM has a constant bitrate of
SampleRate * BitDepth * Channels,
which makes it very easy to predict the size of each block of audio samples: a single second of audio data takes Bitrate / 8 bytes (8 bits in a byte), e.g. 176.4 KB (176,400 bytes) for a second of CD-quality audio.
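The arithmetic above can be sketched as a small helper. This is only an illustration: the PcmMath class and its method names are hypothetical and are not part of WavReader.

```csharp
using System;

static class PcmMath
{
    // Bits per second of uncompressed PCM audio.
    public static long Bitrate(int sampleRate, int bitDepth, int channels)
    {
        return (long)sampleRate * bitDepth * channels;
    }

    // Bytes needed to store one second of audio (8 bits in a byte).
    public static long BytesPerSecond(int sampleRate, int bitDepth, int channels)
    {
        return Bitrate(sampleRate, bitDepth, channels) / 8;
    }
}
```

For CD-quality audio, PcmMath.BytesPerSecond(44100, 16, 2) gives 176,400 bytes, while a 96 kHz/24-bit stereo track needs 576,000 bytes per second.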
We can create the initialization method of the WavReader class by starting with an input Stream object. We have to ensure that there is enough available data for the wave format header, and check that the file is indeed a RIFF/WAVE file to avoid unnecessary reading and processing.
uRiffHeader = reader.ReadInt32();
uRiffHeaderSize = reader.ReadInt32();
uWaveHeader = reader.ReadInt32();
if (uRiffHeader != 0x46464952 /* RIFF */ ||
    uWaveHeader != 0x45564157 /* WAVE */)
    throw new Exception("Invalid WAVE header!");
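To try the header check in isolation, here is a minimal self-contained sketch. The RiffCheck class is a hypothetical stand-in for the corresponding part of WavReader, and it returns false instead of throwing so that it is easy to test. BinaryReader always reads integers as little-endian, which is why the four ASCII characters appear reversed in the constants.

```csharp
using System;
using System.IO;

static class RiffCheck
{
    const int Riff = 0x46464952; // "RIFF" read as a little-endian Int32
    const int Wave = 0x45564157; // "WAVE" read as a little-endian Int32

    // Returns true when the stream begins with a RIFF/WAVE header.
    public static bool IsWave(Stream input)
    {
        // The header is 12 bytes long: "RIFF", chunk size, "WAVE"
        if (input.Length - input.Position < 12)
            return false;

        var reader = new BinaryReader(input);
        int riff = reader.ReadInt32();
        reader.ReadInt32(); // RIFF chunk size, not needed for the check
        int wave = reader.ReadInt32();
        return riff == Riff && wave == Wave;
    }
}
```

Checking the available length first means that a truncated file is rejected before any read is attempted.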
Right after the RIFF header there can be a number of JUNK (padding) chunks, which we can skip, as well as the fmt and data chunks, whose contents we need.
// Read all WAVE chunks until the 'data' chunk is found
while (uDataHeader == 0 &&
       reader.BaseStream.Position < reader.BaseStream.Length)
{
    int type = reader.ReadInt32();
    int size = reader.ReadInt32();
    long last = reader.BaseStream.Position;
    switch (type)
    {
        case 0x61746164: /* data */
            uDataHeader = type;
            nTotalAudioBytes = size;
            break;
        case 0x20746d66: /* fmt */
            uFmtHeader = type;
            uFmtHeaderSize = size;
            format.wFormatTag = reader.ReadInt16();
            format.nChannels = reader.ReadInt16();
            format.nSamplesPerSec = reader.ReadInt32();
            format.nAvgBytesPerSec = reader.ReadInt32();
            format.nBlockAlign = reader.ReadInt16();
            format.wBitsPerSample = reader.ReadInt16();
            // cbSize is only present when the chunk is longer than 16 bytes
            if (size >= 18)
                format.cbSize = reader.ReadInt16();
            break;
    }
    // Skip to the next chunk, but stay at the start of the 'data' payload
    if (uDataHeader == 0)
        reader.BaseStream.Position = last + size;
}
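When a file does not parse as expected, it can help to list the chunks it contains. The following is a minimal sketch of such a diagnostic; the ChunkLister class is hypothetical and not part of WavReader. It also accounts for a detail of the RIFF format: chunks are word-aligned, so a chunk with an odd size is followed by one pad byte.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class ChunkLister
{
    // Walks the chunks that follow the 12-byte RIFF/WAVE header and
    // returns each chunk id together with its declared size.
    public static List<KeyValuePair<string, int>> List(Stream input)
    {
        var chunks = new List<KeyValuePair<string, int>>();
        var reader = new BinaryReader(input);
        input.Position = 12; // skip "RIFF", size, "WAVE"

        // Every chunk starts with an 8-byte header: 4-byte id, 4-byte size
        while (input.Length - input.Position >= 8)
        {
            string id = Encoding.ASCII.GetString(reader.ReadBytes(4));
            int size = reader.ReadInt32();
            chunks.Add(new KeyValuePair<string, int>(id, size));

            // Chunks are word-aligned: an odd size is followed by a pad byte
            input.Position += size + (size & 1);
        }
        return chunks;
    }
}
```

For a typical file this returns entries such as "fmt " (note the trailing space) and "data", possibly preceded by "JUNK".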
Our WavReader class only supports 16 and 24-bit PCM samples, so we have to ensure that format.wFormatTag is 1 (PCM) and format.wBitsPerSample is either 16 or 24. These limitations can be lifted by implementing bit depth conversion on-the-fly.
After all headers are read, nTotalAudioBytes will contain the total count of audio data bytes in the WAVE file. To determine the duration of the audio file in seconds, we can simply divide it by the number of bytes per second (Bitrate / 8).
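As a worked example of the duration calculation, here is a minimal sketch. The WavDuration class and its parameter names are hypothetical; in WavReader the values would come from nTotalAudioBytes and the fmt chunk.

```csharp
using System;

static class WavDuration
{
    // Duration of the audio data, given its size in bytes and the PCM format.
    public static TimeSpan FromBytes(long totalAudioBytes, int sampleRate,
                                     int bitDepth, int channels)
    {
        long bytesPerSecond = (long)sampleRate * bitDepth * channels / 8;
        return TimeSpan.FromSeconds((double)totalAudioBytes / bytesPerSecond);
    }
}
```

For instance, 1,764,000 bytes of CD-quality audio (44,100 Hz, 16-bit, stereo) correspond to 10 seconds.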
The input stream will now be at the start of the audio samples. Each sample occupies 2 or 3 bytes (16 or 24 bits, respectively), so a new sample begins at every 2nd or 3rd byte. All audio samples are interleaved, so the stream consists of:
channel 0, sample 0
channel 1, sample 0
channel n, sample 0
channel 0, sample 1
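The interleaved layout can be illustrated with a minimal sketch. The Interleaving class is hypothetical and only serves to show how a position in the stream maps to a channel and a sample index; the encoder itself is fed interleaved data, so this step is not needed for encoding.

```csharp
using System;

static class Interleaving
{
    // Byte offset of a sample within the audio data.
    public static long Offset(long sampleIndex, int channel,
                              int channels, int bytesPerSample)
    {
        return (sampleIndex * channels + channel) * bytesPerSample;
    }

    // Splits interleaved samples into one array per channel.
    public static int[][] Split(int[] interleaved, int channels)
    {
        int perChannel = interleaved.Length / channels;
        var result = new int[channels][];
        for (int c = 0; c < channels; c++)
            result[c] = new int[perChannel];

        for (int i = 0; i < perChannel * channels; i++)
            result[i % channels][i / channels] = interleaved[i];
        return result;
    }
}
```

With two channels, the sequence 1, 2, 3, 4, 5, 6 splits into 1, 3, 5 for channel 0 and 2, 4, 6 for channel 1.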
Now that we have reached the audio samples, we can start feeding them to FLAC.
Free Lossless Audio Codec
The FLAC encoder is an open-source C/C++ project. In order to use it in a C# application, we have to use P/Invoke to call the functions exported by its library, LibFlac.dll. The latest version of LibFlac can be found at http://sourceforge.net/projects/flac/files/flac-win/.
When the encoder processes a block of audio samples, our callback functions will write the compressed data to the output stream.
In a nutshell, what our FlacWriter class is going to do is:
delegate int WriteCallback(IntPtr context, IntPtr buffer, int bytes, uint samples, uint current_frame, IntPtr userData);
delegate int SeekCallback(IntPtr context, long absoluteOffset, IntPtr userData);
delegate int TellCallback(IntPtr context, out long absoluteOffset, IntPtr userData);
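To show what a write callback might do with the unmanaged buffer it receives, here is a minimal sketch. The CallbackSketch class and the CreateWriter method are hypothetical, and returning 0 assumes that this value corresponds to the encoder's "OK" write status. The sketch can be exercised without LibFlac.dll by passing it a block of unmanaged memory.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

static class CallbackSketch
{
    // Same shape as the WriteCallback delegate shown above.
    public delegate int WriteCallback(IntPtr context, IntPtr buffer, int bytes,
        uint samples, uint current_frame, IntPtr userData);

    // Builds a callback that copies every encoded block into 'output'.
    public static WriteCallback CreateWriter(Stream output)
    {
        return (context, buffer, bytes, samples, currentFrame, userData) =>
        {
            var managed = new byte[bytes];
            Marshal.Copy(buffer, managed, 0, bytes); // unmanaged -> managed
            output.Write(managed, 0, bytes);
            return 0; // assumed to mean "OK" to the encoder
        };
    }
}
```

When a delegate like this is handed to native code, it should be stored in a field for as long as the encoder is alive; otherwise the garbage collector may reclaim it while the native side still holds a pointer to it.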
When a buffer of PCM samples has been read, we can pass it to FlacWriter. Internally, it has to pad every 16 or 24-bit sample to a 32-bit window (the input bytes are in little-endian order).
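int bytes = inputBitDepth / 8;              // bytes per input sample
int paddedSamples = buffer.Length / bytes;  // samples across all channels
int[] padded = new int[paddedSamples];      // one 32-bit slot per sample
if (inputBitDepth == 16)
{
    for (int i = 0; i < paddedSamples; i++)
        // The cast to short sign-extends negative samples
        padded[i] = (short)(buffer[i * bytes + 1] << 8 |
                            buffer[i * bytes + 0]);
}
else if (inputBitDepth == 24)
{
    for (int i = 0; i < paddedSamples; i++)
        // Place the sample in the top 24 bits, then shift back to sign-extend
        padded[i] = (buffer[i * bytes + 2] << 24 |
                     buffer[i * bytes + 1] << 16 |
                     buffer[i * bytes + 0] << 8) >> 8;
}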
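Because PCM samples of 16 and 24 bits are signed, the widening step has to preserve the sign of each sample. The following is a minimal, self-contained sketch of that idea; the SamplePadding class is hypothetical and is not the FlacWriter implementation.

```csharp
using System;

static class SamplePadding
{
    // Widens little-endian 16 or 24-bit samples to sign-extended 32-bit values.
    public static int[] ToInt32(byte[] buffer, int inputBitDepth)
    {
        if (inputBitDepth != 16 && inputBitDepth != 24)
            throw new ArgumentException("Only 16 and 24-bit samples are handled.");

        int bytes = inputBitDepth / 8;
        int shift = 32 - inputBitDepth;
        var padded = new int[buffer.Length / bytes];

        for (int i = 0; i < padded.Length; i++)
        {
            // Assemble the sample from its bytes, least significant first
            int value = 0;
            for (int b = 0; b < bytes; b++)
                value |= buffer[i * bytes + b] << (8 * b);

            // Move the sign bit to bit 31, then shift back arithmetically
            padded[i] = (value << shift) >> shift;
        }
        return padded;
    }
}
```

For example, the 16-bit bytes 00 80 become -32768, and the 24-bit bytes FF FF FF become -1.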
The main program will use a combination of WavReader and FlacWriter to perform the encoding of WAVE files. Because both classes implement the IDisposable interface, they close all input/output streams, as well as the FLAC encoder instance, when the Dispose() method is called, either explicitly or at the end of a using statement.
using (WavReader wav = new WavReader(inputFile))
using (FlacWriter flac = new FlacWriter(
{
    // Buffer for 1 second's worth of audio data
    byte[] buffer = new byte[wav.Bitrate / 8];
    int bytesRead;
    do
    {
        bytesRead = wav.InputStream.Read(buffer, 0, buffer.Length);
        flac.Write(buffer, 0, bytesRead);
    } while (bytesRead > 0);
}
Testing the program
All source code and the compiled 32-bit FLAC library can be downloaded from here.
The test program uses three sample WAVE files from the ‘wav’ folder and encodes them to ‘flac’. You can download the sample files from here, courtesy of 2L (Username: HD Password: 2L), and place them in ‘wav’:
- Britten: Simple Symphony, Op. 4 96kHz/24-bit stereo sample
- Britten: Simple Symphony, Op. 4 48kHz/24-bit 5.1 surround sample
- Haydn: String Quartet in D 96kHz/24-bit stereo sample
The test program also uses the ConsoleProgress class I have posted earlier.
The code can be further optimized and extended to support various other bit depths. The next part of the series will explore the decoding of DTS-HD Master Audio and its encoding using FlacWriter.
I would appreciate any questions, suggestions or corrections!