High Fidelity, FLAC and other stuff


MOTIVATION

Recently I sent a FLAC-compressed audio file to a friend who loves music and just told him, “play it on your PC”. The next day he came back saying the sound had blown his mind: it seemed to come from all around the room. He was impressed by the fidelity and asked what it was. It was only music in CD quality (44.1 kHz/16 bits/stereo), but he had been listening to MP3 files for so long that he had completely forgotten how an original disc sounds. He asked me to explain the difference in layman’s terms, without cumbersome math and technical words. I took the challenge, and this is my modest explanation.

FIDELITY

When it comes to audio there are two types of fidelity: objective and subjective. Objective fidelity is a measurement, a figure calculated by a well-known procedure that tells how closely an output signal matches its input signal. For instance, if an amplifier claims 0.01% total harmonic distortion for a 1 kHz input, that is easy to verify with the proper lab equipment, and anyone who repeats the measurement will most probably get the same results. You don’t even need to listen to the amplifier to do this.

However, given two amplifiers with the same objective characteristics, there will be people who prefer the sound of one over the other. This is the long-running argument among audiophiles about vacuum tube versus solid state amplifiers. Some people can tell the difference, and some prefer one of them even though both have the same objective fidelity. This preference is clearly subjective and very hard to measure, but it is real. Some elements of those differences can be described with qualities such as brightness, warmth, etc., but subjective quality in general cannot be measured. Different observers won’t agree on the results of the same subjective test, so those results are anecdotal and of little use for statistical purposes.

CODECS

A CODEC is a mechanism to encode and decode. In audio slang, a codec is a computer program that converts an audio signal (sound) into a digital file and converts a digital file back into an audio signal. Examples are MP3, FLAC, WMA, AAC, Vorbis (OGG), Speex and AC3.

FIDELITY, ARTIFACTS AND DIFFERENCES

A loss in fidelity implies differences between the input and output signals, but it doesn’t mean those differences are unpleasant or that the sound quality is poor. A vacuum tube amplifier has no more fidelity than any modern solid state amplifier; it just adds a form of distortion and coloration that is barely audible and even gladly accepted by most people.

We call an artifact any element added to an output signal that is easy to identify, obviously strange and undesirable.

OMNIPRESENT MP3 CODEC

The MP3 codec was developed largely at, and long licensed by, the German Fraunhofer IIS laboratories. MP3 stands for MPEG-1 Audio Layer III, and MPEG means Moving Picture Experts Group, a working group of the International Organization for Standardization (ISO). It belongs to the perceptual family of codecs, which aim to decrease the bitrate, and thus the size of audio files, at the expense of objective fidelity; this is a fact. The idea is to lose fidelity in ways that cannot be perceived. However, some applications require bitrates so low that they can’t be reached by sacrificing objective fidelity alone; this is also a fact, regardless of what the big names in the industry publish in specialized media. Subjective fidelity will eventually suffer somehow.

The result is a trade-off in which the loss in fidelity has to be elegant, hard to perceive and easy to ignore.

HOW MP3 WORKS 

To achieve such a decrease in the amount of data, MP3 relies on several techniques and psychoacoustic tricks. Let’s describe just five of the most used:

1_Minimum Hearing Threshold

The minimum hearing threshold is not flat: it is not the same for all frequencies. It is represented by a curve that dips sharply between 2 kHz and 5 kHz, where the ear is most sensitive, and rises at lower and higher frequencies; it is closely related to the Fletcher and Munson curves (ISO 226:2003, Acoustics -- Normal equal-loudness-level contours). Since human ears barely perceive sounds that fall below this curve at their frequency, it is not necessary to encode them.
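To get a feel for the shape of that curve, here is a minimal sketch that evaluates a widely used approximation of the threshold in quiet, usually attributed to Terhardt. This formula is my own choice for illustration, not something taken from the MP3 standard, and the function name is hypothetical.

```python
import math

def threshold_in_quiet_db(freq_hz):
    """Approximate hearing threshold in quiet, in dB SPL (Terhardt-style formula)."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8                            # rise toward low frequencies
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)     # dip where the ear is most sensitive
            + 1e-3 * khz ** 4)                            # steep rise toward high frequencies

# Print the threshold at a few frequencies to see the dip around 3 kHz.
for freq in (100, 1000, 3300, 10000, 16000):
    print(f"{freq:>6} Hz: {threshold_in_quiet_db(freq):6.1f} dB SPL")
```

A tone quieter than the printed level at its frequency would, under this model, be inaudible in a silent room, so an encoder can skip it.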

2_Masking Effect

This is based on the fact that within a loud audio fragment, weaker sounds playing at the same time cannot be heard, so there is no need to encode every sound. This is the main mechanism MP3 uses to reduce the size of audio files, and it relies on psychoacoustic models of how the human ear behaves.

3_Bytes Reservoir

Some audio fragments cannot be encoded at the desired bitrate without compromising audio quality. In those cases MP3 uses a byte reservoir (a buffer, usually called the bit reservoir) borrowed from the unused space of fragments that could be encoded at a lower bitrate.
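The bookkeeping behind the reservoir can be pictured with a toy model. The sketch below is only an illustration under my own assumptions: the function name, the frame budget and the reservoir cap are hypothetical, and a real MP3 encoder follows more detailed rules.

```python
def allocate_with_reservoir(demands, budget, max_reservoir):
    """Grant bits to each frame, letting hard frames borrow what easy frames saved.

    demands: bits each frame would like to use (hypothetical numbers)
    budget: bits nominally available per frame at the target bitrate
    max_reservoir: cap on how many saved bits can be carried forward
    """
    reservoir = 0
    granted = []
    for demand in demands:
        available = budget + reservoir           # own budget plus earlier savings
        used = min(demand, available)
        granted.append(used)
        reservoir = min(available - used, max_reservoir)
    return granted

demands = [300, 300, 700, 400]                   # the third frame is a "hard" one
granted = allocate_with_reservoir(demands, budget=400, max_reservoir=500)
print(granted)
```

In this toy run the two easy frames leave bits unused, and the hard third frame gets more than its nominal budget without raising the average bitrate.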

4_Joint-Stereo Encoding

Some mid- to high-fidelity systems have a single subwoofer, yet the sound seems to come not from the subwoofer but from the satellite speakers. In fact, the human ear cannot determine the direction of the source for sounds at very high or very low frequencies; it is very hard to tell where they’re coming from. MP3 exploits this with a technique called Intensity Stereo (IS), which records some signals as monophonic, followed by just enough information to rebuild a minimum of spatial placement during decoding and playback.

Another tool is Mid/Side Stereo (M/S Stereo), often used when both stereo channel signals are very similar. In this case only the composite channels Middle (L+R) and Side (L-R) are encoded instead of the L and R channels. This drastically reduces the size of the resulting audio file, and the original L and R channels are easily rebuilt during playback.
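A small sketch can show why M/S helps when the channels are similar. The sample values and function names are made up for illustration, and the sum and difference are halved here to keep the numbers in the same range as the originals (a scaling assumption of mine, not necessarily what an MP3 encoder does).

```python
def to_mid_side(left, right):
    """Convert L/R sample lists to Mid (scaled sum) and Side (scaled difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def to_left_right(mid, side):
    """Rebuild the original channels from Mid and Side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Two nearly identical channels (hypothetical sample values).
left = [0.50, 0.62, 0.70, 0.66]
right = [0.48, 0.60, 0.71, 0.65]

mid, side = to_mid_side(left, right)
print("mid: ", mid)
print("side:", side)    # values close to zero, so they need very few bits
print(to_left_right(mid, side))
```

Because the Side channel is almost silent, nearly all of the bit budget can go to the Mid channel.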

5_Huffman Encoding

The Huffman algorithm (Huffman Coding homepage) is used at the end of the encoding process. It generates variable-length output codes from fixed-length input codes (codes with the same number of bits): values that occur frequently are represented by shorter codes, which saves up to 20% in space.

The bottom line is that the sound of an MP3 file is a version of the original sound, modified in ways that are assumed to be pleasant to the human ear.
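Going back to the Huffman step for a moment, the idea of giving frequent values shorter codes can be sketched in a few lines. This is a generic textbook construction on made-up data, with hypothetical names; as far as I know, real MP3 encoders pick from predefined code tables rather than building a new one for each file.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from a sequence of symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:                      # degenerate case: one distinct symbol
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)   # the two least frequent groups
        w2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

values = [0, 0, 0, 0, 0, 1, 1, -1, -1, 2]     # hypothetical quantized values
code = huffman_code(values)
encoded = "".join(code[v] for v in values)
fixed_bits = 2 * len(values)                  # 4 distinct values need 2 bits each

print(code)
print(len(encoded), "bits with Huffman vs", fixed_bits, "bits fixed-length")
```

The most common value ends up with the shortest code, which is where the saving comes from.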

FLAC, A LOSSLESS CODEC?

FLAC stands for Free Lossless Audio Codec. Audio is compressed without any loss in original quality. The software is free to use, and the technical specifications are available under an open source license (FLAC codec homepage). It runs on most operating systems, be it Linux, Windows, Unix, *BSD, Solaris, Mac OS, BeOS, OS/2 or Amiga.

It is well known that no algorithm can losslessly compress every type of input signal, so compression algorithms are restricted to a certain class of input signals (called the domain) that they can handle very efficiently. FLAC’s domain is audio signals. Even though FLAC can encode any audio signal, only some of them can be reduced in size. FLAC exploits the fact that audio signals have a very high degree of correlation between consecutive samples.

HOW FLAC WORKS

It works in four stages, summarized as follows:

1_Block Building

The input signal is split into several contiguous blocks of variable size. The optimal size for each block depends on the sample rate and spectral characteristics, among other factors. These blocks are passed to the next stage.

2_Inter-Channels Decorrelation

Similar to the Mid/Side Stereo (M/S Stereo) mechanism described earlier, two signals, Mid and Side, are created from the average and the difference, respectively, of the L and R channels. Only the best representation, chosen frame by frame, is passed to the next stage.
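Because FLAC must be lossless, the average cannot simply throw away the bit lost when halving an odd sum. One common integer trick, sketched below with hypothetical function names, is to recover that bit from the Side value, since L+R and L-R are always both even or both odd. Take it as an illustration of the idea rather than a copy of any particular encoder.

```python
def encode_mid_side(left, right):
    """Integer mid/side for one pair of samples."""
    mid = (left + right) >> 1          # average, rounded down
    side = left - right                # difference
    return mid, side

def decode_mid_side(mid, side):
    """Exact inverse of encode_mid_side."""
    total = (mid << 1) | (side & 1)    # restore the bit dropped by the shift
    left = (total + side) >> 1
    right = (total - side) >> 1
    return left, right

# Round trip on a few hypothetical 16-bit sample pairs, including odd sums.
for pair in [(1000, 998), (-5, 4), (7, 7), (-32768, 32767)]:
    assert decode_mid_side(*encode_mid_side(*pair)) == pair
print("all pairs restored exactly")
```

The round trip is exact for every pair of integers, so nothing is lost by storing Mid and Side instead of L and R.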

3_Prediction

The encoder tries to find an approximate mathematical description of the input signal. Typically this description is much smaller than the signal itself. Since the mathematical predictive model is known to both the encoder and the decoder, only the parameters of the model are transmitted, not the whole signal. FLAC uses at least four types of prediction models: Verbatim, Constant, Fixed Linear and Finite Impulse Response (FIR) linear prediction. The models themselves will not be described in this article. FLAC allows changing the prediction model from block to block, and even from channel to channel within the same block.
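Without going into the models in depth, the core idea of prediction fits in a short sketch. It uses small fixed polynomial predictors of order 0 to 4, which to my understanding match FLAC's fixed predictors; the sample block, the helper names and the way the best order is picked (smallest sum of absolute residuals, ignoring the cost of the warm-up samples) are simplifying assumptions of mine.

```python
# Coefficients applied to the previous samples x[n-1], x[n-2], ...
FIXED_COEFFS = {
    0: [],
    1: [1],
    2: [2, -1],
    3: [3, -3, 1],
    4: [4, -6, 4, -1],
}

def predict(samples, n, order):
    """Predict sample n from the `order` samples before it."""
    return sum(c * samples[n - 1 - i] for i, c in enumerate(FIXED_COEFFS[order]))

def residuals(samples, order):
    """Prediction errors for every sample after the warm-up samples."""
    return [samples[n] - predict(samples, n, order)
            for n in range(order, len(samples))]

def restore(warmup, resid, order):
    """Decoder side: rebuild the block from warm-up samples plus residuals."""
    samples = list(warmup)
    for r in resid:
        samples.append(predict(samples, len(samples), order) + r)
    return samples

block = [10, 14, 19, 25, 32, 40, 49, 59]      # hypothetical smooth signal
best_order = min(FIXED_COEFFS,
                 key=lambda k: sum(abs(r) for r in residuals(block, k)))
print("best order:", best_order)
print("residuals: ", residuals(block, best_order))
print(restore(block[:best_order], residuals(block, best_order), best_order) == block)
```

A smooth signal leaves residuals that are tiny compared with the samples themselves, and those residuals are what the next stage has to store.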

4_Residual Coding

If the prediction model doesn’t describe the original input signal exactly, the difference (called the residual error) has to be encoded losslessly. The more efficient the prediction model, the smaller the residual error and thus the fewer bits needed. Residual coding is done with the Rice coding algorithm.

The final result is an audio file with the extension *.flac that reproduces the original audio exactly. Its final size is bigger than the (lossy compressed) MP3 version, but smaller than the (uncompressed) WAV version.
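As a rough illustration of the residual coding stage, here is a minimal Rice coder that works on strings of "0" and "1" for readability. The signed-to-unsigned mapping, the unary convention (zeros ended by a one) and the function names are assumptions made for this sketch; a real FLAC stream packs bits and chooses the parameter k per partition.

```python
def zigzag(value):
    """Map signed to unsigned: 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ..."""
    return 2 * value if value >= 0 else -2 * value - 1

def unzigzag(u):
    """Inverse of zigzag."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)

def rice_encode(values, k):
    """Encode signed integers with Rice parameter k, as a bit string."""
    bits = []
    for v in values:
        u = zigzag(v)
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        bits.append("0" * quotient + "1")             # quotient in unary
        if k:
            bits.append(format(remainder, f"0{k}b"))  # remainder in k bits
    return "".join(bits)

def rice_decode(bits, k, count):
    """Decode `count` signed integers from a Rice-coded bit string."""
    values, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "0":
            quotient += 1
            pos += 1
        pos += 1                                      # skip the terminating "1"
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append(unzigzag((quotient << k) | remainder))
    return values

resid = [0, 1, -1, 2, -3, 0, 1, 0]                    # hypothetical small residuals
encoded = rice_encode(resid, k=1)
print(len(encoded), "bits instead of", 16 * len(resid), "for raw 16-bit samples")
print(rice_decode(encoded, k=1, count=len(resid)) == resid)
```

Small residuals cost only a few bits each, which is why a good predictor translates directly into a smaller file.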

FINAL THOUGHTS

I didn’t mean to establish the superiority of one codec over the other; each was designed with different goals in mind. Even their developer communities are different. This plurality, far from doing any harm, benefits users. Take me as an example: my whole CD collection is backed up in FLAC format. On the other hand, web applications require moderate bitrates due to bandwidth constraints, and in that environment MP3 prevails. But what will happen when bandwidth stops being a restriction?

Arguments have been made in favor of MP3 and others because FLAC files are larger. However, storage media such as hard drives and memory cards gain capacity and drop in price as time goes by, so this difference in size will soon be irrelevant.

High fidelity music is worthless without equipment able to reproduce it completely. Recorded music is played through a chain of elements such as amplifiers, players, audio interfaces, speakers and headphones. Like every chain, it is only as strong as its weakest link. High fidelity equipment is becoming more affordable every day as technology costs come down, and we can take advantage of that.

I’ve been asked why manufacturers include MP3 support but not FLAC. Apple even created a codec of its own for the iPod. The reasons are far from technical and come down to marketing and business models. Fortunately for many of us those models are changing, and you can hear it... pay attention, open your minds and listen.
