A little more about audio - What is lossless and what's not? How to read spectrograms?

A lot of people keep asking how to tell whether a file is really lossless or lossy. They run their audio files through Spek (or SoX) to generate a spectrogram image. But what exactly should you look for in that image to tell lossy from lossless? Before we get into it, let's recall what lossless means.

Well according to ChatGPT or Google,

"Lossless music refers to audio files that are compressed in a way that retains all the original data from the source. Unlike lossy compression methods (such as MP3 or AAC), which discard some audio information to reduce file size, lossless compression formats (such as FLAC, ALAC, or WAV) allow the original audio data to be perfectly reconstructed from the compressed data. This means that, in theory, there is no loss of quality compared to the original recording."

So basically, a lossless audio file should be identical in quality to its source, i.e. it should contain the same audio data and therefore the same frequencies. (Strictly speaking, WAV usually holds uncompressed PCM rather than compressed audio, but it is lossless all the same.)

But keep in mind that, strictly speaking, "nothing is ever lossless". No matter how expensive your recording setup is, it can only capture part of the original analog signal. Some of it will always be left out during recording, although the missing part is generally beyond what humans can hear. But that's just theory. Should we follow it blindly? Nope, because then nothing could ever be called lossless.

So how do we determine what's lossless or lossy when our source is not the live recording but the Audio CD we purchased, or the songs we downloaded from torrents or public forums? Should I use lossless music detector programs like Lossless Audio Checker or Fakin' the funk?

Nope.

The reason is that these programs rely on fixed algorithms with predefined checks or rules for what counts as lossless. If the audio passes 'their' bar or threshold, the track is called legit; otherwise it is declared upsampled, upscaled, lossy, etc. So you can't fully rely on them, as they are not always accurate.

So how do we know what is lossless or lossy? Well, you can use the spectrogram, but how?

Let's understand how to read spectrograms. But before that, what is a spectrogram?

Well, each musical note corresponds to a distinct frequency: lower notes correspond to lower frequencies, while higher notes correspond to higher frequencies. These frequencies are represented on a spectral diagram, which graphs all the frequencies against time in a music file. Frequencies are measured in hertz (Hz) and kilohertz (kHz; 1 kHz = 1,000 Hz). The human hearing range spans from approximately 20 Hz to 20 kHz (20,000 Hz).

So basically, a spectrogram is a visual representation of the intensity of the various frequencies in an audio file over time. In simpler words, it's a picture that shows how strong each frequency of the audio is at each moment.

A spectrogram is laid out like this: the horizontal X-axis (left to right) is time and the vertical Y-axis (bottom to top) is frequency. The sound's frequencies are plotted on these axes, and colour shows how intense each frequency is at each point in time. Now, how is sound converted into an image? It is done using techniques like the Fast Fourier Transform (FFT). What is that? In simple words -

Suppose you have a song, and you want to see all the different sounds that make up that song, like a big rainbow of sounds.

First, you take a tiny piece of the song, just a small bit. Then, you use a special tool called FFT, which is like a magical magnifying glass that can see all the different sounds (frequencies) in that tiny piece and how strong each one is. Each frequency then gets a colour according to its strength.

Next, you do the same thing with the next tiny piece of the song, and the next, and so on. When you're done, you put all these pieces together to make a big picture. This big picture, showing all the different sounds over time, is called a spectrogram. It's like a colourful map that shows you how the song changes from moment to moment.
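If you'd like to see the "tiny pieces" idea in code, here is a minimal Python sketch. It is my own illustration, not how Spek or SoX actually do it: the names `fft`, `spectrogram` and `bin_to_hz`, and the frame size of 256 samples, are assumptions picked for the demo. It slices a test tone into overlapping pieces, runs an FFT on each piece, and keeps the strength of every frequency.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def spectrogram(samples, frame_size=256, hop=128):
    """Return one list of magnitudes (0 Hz up to Nyquist) per tiny piece."""
    # A Hann window fades the edges of each piece to reduce smearing.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_size)
              for i in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        piece = [samples[start + i] * window[i] for i in range(frame_size)]
        spectrum = fft(piece)
        # Only the first half of the FFT output is unique for real audio.
        frames.append([abs(c) for c in spectrum[: frame_size // 2 + 1]])
    return frames


def bin_to_hz(bin_index, sample_rate, frame_size):
    """Convert an FFT bin number to the frequency it represents."""
    return bin_index * sample_rate / frame_size


# Demo: a pure 1000 Hz tone sampled at 8000 Hz (made-up test signal).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(2048)]
frames = spectrogram(tone)
loudest_bin = max(range(len(frames[0])), key=lambda b: frames[0][b])
print(bin_to_hz(loudest_bin, sample_rate, 256))  # 1000.0
```

A real spectrogram tool would then map each magnitude to a colour and draw the frames side by side, left to right.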

Now we know what a spectrogram is and how it is generated. So how do we use it to detect whether the audio is lossless?

Each kind of file, such as MP3 at 128/256/320 kbps, WAV or FLAC sourced from an Audio CD, or files at 48 kHz or 96 kHz, typically has a fairly standard frequency cut-off. So we need to know what the spectrogram of lossy audio looks like.

But first, let's look at the spectrograms of tracks extracted from Audio CDs -

Songs from an Audio CD, and lossless tracks made from them, have frequencies that reach up to about 22 kHz (22.05 kHz to be exact, which is half of the CD's 44.1 kHz sampling rate). Because transcoding from one lossless format to another retains all the data in the music file, the spectrogram of a lossless song will look identical whether it is in FLAC, WAV (PCM), ALAC, or any other lossless format.
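That ceiling of roughly 22 kHz is no accident: a digital file can only store frequencies up to half of its sampling rate (the Nyquist limit). Here is a tiny sketch of that arithmetic, where `nyquist_khz` is just a name I made up for illustration:

```python
def nyquist_khz(sample_rate_hz):
    """Highest frequency (in kHz) a file at this sampling rate can hold."""
    return sample_rate_hz / 2 / 1000


for rate in (44100, 48000, 96000):
    print(rate, "Hz file -> up to", nyquist_khz(rate), "kHz")
# 44100 Hz file -> up to 22.05 kHz
# 48000 Hz file -> up to 24.0 kHz
# 96000 Hz file -> up to 48.0 kHz
```

These are the same ceilings you will see at the top of the spectrogram for CD, 48 kHz and 96 kHz files.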

Spectrogram of a 16-bit 44.1 kHz audio file sourced from an authentic Audio CD below -

This track is Taal Se Taal from the movie Taal. See how the frequencies reach all the way to 22 kHz, without any cuts or voids in between. This is a fully lossless copy sourced from an Audio CD.

But now let's take a look at a different track. I am taking an instrumental track from the movie Interstellar, Cornfield Chase, composed by Hans Zimmer (I love the music). Let's see its spectrogram -

Spectrogram of a 24-bit 44.1 kHz audio file sourced from an authentic web release below -

Hmm, here the frequencies are strong only up to about 10 kHz, then they start fading and stop at around 15 kHz. What you see above that is just noise or artifacts that creep in when audio is recorded. The audio is not even touching 22 kHz, so is it lossy?

No.

Why?

Because different music genres exhibit distinct spectral patterns. The first track (Taal Se Taal) features many instruments and vocal performances, resulting in a wide range of frequencies throughout the song, which is why its spectrogram looks so vibrant and full. In contrast, the instrumental track by Hans Zimmer uses only organ and piano, leading to a spectrogram with fewer frequencies. Despite these differences, both tracks are lossless. To tell whether a file is lossy, look out for an abrupt cut-off of frequencies in the spectrogram. Let me now show you lossy spectrograms of both these songs -

Let's look at the spectrogram of the MP3 320 kbps CBR version of that Taal song -

[Spectrogram image: 320 kbps CBR MP3 encode of Taal Se Taal]

Notice that the frequencies cut off at around 20 kHz. MP3 files encoded at 320 kbps (CBR) typically have a frequency cut-off at around 20.5 kHz, which means that frequencies above 20.5 kHz are generally not present in such files.

Now let's take a look at the 320 kbps CBR MP3 encode of Cornfield Chase.

[Spectrogram image: 320 kbps CBR MP3 encode of Cornfield Chase]

Here you can see that the noise data above 16 kHz is lost. Not just that: if you look very closely for abrupt little voids in the Spek image, there are some spots where frequencies are missing compared to the spectrogram of the lossless file.

You can try it yourself. Download foobar2000 or dBpoweramp on your PC, drop in a FLAC file, convert it to 128 kbps or 256 kbps MP3, and notice how far up the frequencies go. A 128 kbps MP3 typically tops out at around 16 kHz, though the exact cut-off depends on the encoder and its settings.
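If you would rather get a number than squint at the image, the same check can be roughly sketched in Python. This is only an illustration under assumptions: `estimate_cutoff_hz` is a hypothetical helper, the -60 dB floor is a threshold I picked, and the two "spectra" are made-up lists of levels rather than real files. It simply walks down from the top of the spectrum and reports the first frequency that is not (almost) empty.

```python
def estimate_cutoff_hz(magnitudes, sample_rate, floor_db=-60.0):
    """Return the highest frequency whose level is within floor_db of the peak.

    magnitudes: averaged spectrum, one value per bin from 0 Hz up to Nyquist.
    floor_db: assumed threshold; anything quieter counts as empty.
    """
    peak = max(magnitudes)
    if peak <= 0:
        return 0.0  # silent input, nothing to measure
    threshold = peak * 10 ** (floor_db / 20)
    bin_hz = (sample_rate / 2) / (len(magnitudes) - 1)
    # Scan from the highest bin downwards until something is loud enough.
    for index in range(len(magnitudes) - 1, -1, -1):
        if magnitudes[index] >= threshold:
            return index * bin_hz
    return 0.0


# Made-up spectra for a 44.1 kHz file: one cut at 16 kHz, one full range.
sample_rate = 44100
bins = 257
bin_hz = (sample_rate / 2) / (bins - 1)
fake_mp3 = [1.0 if i * bin_hz <= 16000 else 1e-5 for i in range(bins)]
fake_cd = [1.0] * bins

print(round(estimate_cutoff_hz(fake_mp3, sample_rate)))  # 15935
print(round(estimate_cutoff_hz(fake_cd, sample_rate)))   # 22050
```

Treat the result as a hint only. A quiet track with little treble will also report a low number, so you still need to look at the spectrogram for a sharp, flat edge.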

So now we know how to detect lossy and lossless with the help of a spectrogram, but there are some special cases too! For example, I bought the soundtrack Audio CD of Agneepath, composed by Ajay Atul. I love those tracks, but I found something strange. When I ripped the tracks from that Audio CD to FLAC or WAV on my PC and checked their spectrograms, I saw this -

[Spectrogram image: track ripped from the Agneepath Audio CD]

All the tracks on this Audio CD are like this, i.e. their Spek images show that the core audio goes up to 16 kHz and above that there are just artifacts. But I ripped this in lossless, so why did this happen? Well, there are a few possible reasons -

- The original recording might not have significant audio content above 16 kHz. Many musical instruments and vocal recordings do not produce much energy in the higher frequency range. But the instruments and vocals used in these tracks can easily cross 16 kHz, so this is unlikely.

- Lossy mastering. As per my understanding, the real issue is neither the CD nor my ripping skills but the mastering and pressing. During recording and mastering, high frequencies might have been filtered out, by accident or intentionally, to get the current sound profile (which I don't prefer, as to me it is lossy).

- Lossy source. The most logical and common explanation is that the audio was compressed or encoded in a lossy format before being burned onto the CD, which led to this lossy release on Audio CD.

So yes, in my opinion this is not actually lossless, but it is lossless in theory, as the rip is the same as its source (the CD).

Let's see some spectrograms of Hi-res files, shall we?

[Spectrogram image: 96 kHz file of Bulleya from Ae Dil Hai Mushkil]

The spectrogram above is of a 96 kHz audio file, the track Bulleya from the movie Ae Dil Hai Mushkil. But it's not actually a 96 kHz track, because the frequencies cut off at 24 kHz. That means it is really 48 kHz audio that has just been resampled to 96 kHz. These types of tracks are called upsampled tracks: the real sampling rate is lower, but the file is stored at a higher sampling rate.
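The reasoning here (a hard cut-off sitting right at half of a lower sampling rate) can be written down as a small Python sketch. It is a rough heuristic of my own, not a reliable detector: `likely_source_rate` is a hypothetical name, the 5% tolerance is an assumption, and only 44.1 kHz and 48 kHz are considered as possible original rates.

```python
def likely_source_rate(cutoff_hz, file_rate, tolerance=0.05):
    """Guess a lower original sampling rate from a hard cut-off, or None.

    cutoff_hz: where the spectrogram goes empty (read off the image).
    file_rate: the sampling rate the file claims to have.
    tolerance: assumed wiggle room around each candidate's Nyquist limit.
    """
    for candidate in (44100, 48000):
        nyquist = candidate / 2
        close_enough = abs(cutoff_hz - nyquist) <= tolerance * nyquist
        if candidate < file_rate and close_enough:
            return candidate
    return None


print(likely_source_rate(24000, 96000))  # 48000
print(likely_source_rate(45000, 96000))  # None
```

A match only means the file deserves a closer look. Music that naturally fades out around those frequencies would trigger it too.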

Well, sound engineers often master a track at a higher sampling rate during post-processing and later publish it at that higher sampling rate or resolution too. Let's see an example of that too -

[Spectrogram image: 96 kHz release with audio content up to 24 kHz]

The track above is a true 96 kHz track as far as the release goes, but not in reality! The audio only reaches 24 kHz; it was post-processed at 96 kHz and then released at 96 kHz. So it was not an actual 96 kHz recording.

[Spectrogram image: 96 kHz release with audio content up to 48 kHz]

Now the track above looks like a legit 96 kHz release: the presence of substantial audio content above 24 kHz, extending up to 48 kHz, suggests that it was recorded or processed at a high sample rate. Therefore, it is reasonable to conclude that this is a true 96 kHz audio recording, as it contains high-frequency information consistent with such a sample rate.

Alright, that's enough for now. I will post another blog on Hi-res audio and codecs!

I hope this blog helped you with reading spectrograms and gave you an idea of the differences between lossy and lossless audio.

Thanks for reading. 
