Project I invented an algorithm that allows the Vorbis codec to achieve a terrifying level of audio compression at 36KHz/128kbps/Stereo.

FamVanHa

Member
OP
Newcomer
Joined
Apr 29, 2024
Messages
20
Trophies
1
Age
17
XP
154
Country
Vietnam
I invented an algorithm that allows the Vorbis codec to achieve a terrifying level of audio compression at 36KHz/128kbps/Stereo.

The algorithm is very simple, simple enough for a human to do by hand. Its principle is "slow down and speed up".

"When you do something slowly, in most cases, the accuracy will increase significantly". Applying this principle, the algorithm adjusts the audio encoding and decoding process: it slows the audio down by 2 times and encodes it at half the Sample Rate of the source audio file, then decodes it and plays it back at 2x speed.

For example, if the source audio file has Sample Rate = 88.2kHz and Bit Depth = 16, we slow it down by 2 times by converting it to PCM format and reading the resulting PCM file with Sample Rate = 44.1kHz, which is half that of the original file. We then convert it to a WAV file, keeping the same parameters. Finally, we convert that WAV file to OGG Vorbis format using the FFmpeg command line:

ffmpeg -i %1 -vn -c:a libvorbis -b:a 64k -ar R1 -ac 2 -cutoff 48k %2

where %1 is the path to the WAV file from the previous step, %2 is the path to the destination OGG Vorbis file, and R1 is the Sample Rate of that WAV file, in this case 44.1kHz.
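The slow-down step that comes before this FFmpeg call (keep the PCM samples, but declare half the Sample Rate) could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name `reinterpret_wav_rate` and its arguments are made up for this example, the input is assumed to be a plain PCM WAV file that Python's standard `wave` module can read, and the whole file is loaded into memory.

```python
import wave


def reinterpret_wav_rate(src_path, dst_path, hs=2):
    """Copy the PCM frames unchanged, but declare the Sample Rate as original / hs.

    No resampling happens: the same samples simply play hs times slower
    (and hs times lower in pitch) when read at the new rate.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(params.nframes)  # whole file in memory (sketch only)

    if params.framerate % hs != 0:
        raise ValueError("Sample Rate is not divisible by the encoding factor")
    new_rate = params.framerate // hs

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        dst.setframerate(new_rate)  # only the declared rate changes
        dst.writeframes(frames)
    return new_rate
```

For an 88.2kHz source, `reinterpret_wav_rate("in.wav", "slow.wav")` would write a file with the same samples declared as 44.1kHz, which is the WAV file that %1 refers to above.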

The resulting OGG Vorbis file cannot be played normally because it has been slowed down 2 times. To play it normally, I use the software "Music Speed Changer": enable the "Join tempo and pitch" feature, change the playback speed to 200% and... BOOM! Now the compression is up to 36KHz/128kbps!
Another example: you have a 3-minute input audio file with Sample Rate = 96kHz and Bit Depth = 16.
You want the output audio file to have Sample Rate = 96kHz and bitrate = 128kbps, with encoding factor (abbreviated as HS) = 2. Convert the input audio file to PCM format, keeping the input file's parameters, then read that PCM file with Sample Rate = 96kHz / HS = 48kHz. Convert it to WAV format, keeping Sample Rate = 48kHz, then convert that WAV file to OGG Vorbis format using the FFmpeg command line:
ffmpeg -i %1 -vn -c:a libvorbis -b:a B1 -ar R1 -ac 2 -cutoff 48k %2
B1 is the bitrate you aim for, divided by HS; here the result is 64kbps.
R1 is the Sample Rate you aim for, divided by HS; here the result is 48kHz.
After conversion, you will get an audio file with:
Duration: double that of the original file (6 minutes).
Sample Rate: 1/2 that of the original file (48kHz).
Bitrate: 64kbps.
Now if you use the software "Music Speed Changer", activate the Join Tempo and Pitch function, and play that audio file back at 200% speed:
Duration: 6 / HS = 3 minutes.
Sample Rate: 48kHz x HS = 96kHz.
Bitrate: 64kbps x HS = 128kbps.
And the sound quality is much higher than when encoding with the libopus, aac, libfdk_aac, aac_he, or aac_he_v2 codecs at the same bitrate of 128kbps.
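The arithmetic in this example can be checked with a few lines of Python. This is just a sketch: the function name `slowdown_parameters` and its dictionary keys are invented for illustration, and the size estimate uses the nominal bitrate only (it ignores container overhead and the fact that libvorbis does not hold the bitrate exactly).

```python
def slowdown_parameters(target_rate_hz, target_bitrate_kbps, duration_s, hs=2):
    """Derive the encode-side values (R1, B1) and sizes for encoding factor hs."""
    encode_rate_hz = target_rate_hz / hs            # R1
    encode_bitrate_kbps = target_bitrate_kbps / hs  # B1
    encoded_duration_s = duration_s * hs            # slowed-down file is hs times longer
    return {
        "encode_rate_hz": encode_rate_hz,
        "encode_bitrate_kbps": encode_bitrate_kbps,
        "encoded_duration_s": encoded_duration_s,
        # nominal size of the slowed-down file, in kilobits
        "encoded_size_kbit": encode_bitrate_kbps * encoded_duration_s,
        # nominal size of an ordinary encode at the target bitrate, in kilobits
        "ordinary_size_kbit": target_bitrate_kbps * duration_s,
    }


if __name__ == "__main__":
    # 3-minute file, 96kHz / 128kbps target, HS = 2 (the example above)
    for key, value in slowdown_parameters(96_000, 128, 180, hs=2).items():
        print(key, value)
```

With these inputs the two nominal sizes come out equal (64kbps over 6 minutes is the same number of bits as 128kbps over 3 minutes), which is the sense in which the file played back at 200% counts as "128kbps".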

I also tested many different parameters. Here are the standard parameters for the Nintendo DSi, provided there is software support:

ffmpeg -i %1 -vn -c:a libvorbis -b:a 32k -ar 24k -ac 2 -cutoff 48k %2

High-quality parameters:
ffmpeg -i %1 -vn -c:a libvorbis -b:a 64k -ar 24k -ac 2 -cutoff 48k %2

Both with HS factor = 2.

A/ Audio spectrum when encoding with libvorbis codec.

B/ Audio spectrum when encoding with libvorbis codec with the -cutoff 48k option.

C/ Audio spectrum of wav file received after decoding ogg file using Vorbis codec with "slow down and speed up" algorithm

D/ Spectrogram of the original audio file.

Note: the software "Music Speed Changer" has the package name com.smp.musicspeed. You can find it on Google Play or on other stores such as APKCombo, APKPure, etc.

Attached is an audio file encoded with the libvorbis codec using the "slow down and speed up" algorithm.

Now, if I want this to be really automatic and useful, I need a few things:
1/ Software that can run this algorithm automatically, including encoding and decoding.
2/ A file format to hold audio data encoded with this algorithm (so it is easily distinguished from ordinary ogg, ogv, oga, webm, webp, or other common files) that includes the file's encoding configuration [containing information about the encoding factor].
Is anyone able to do these?!
 

Attachments

  • A.png (3 MB)
  • B.png (3.3 MB)
  • D.png (3.3 MB)
  • Libvorbis + Algorithm.ogg (3.3 MB)
  • C.png (3 MB)
Last edited by FamVanHa,

IC_

GBAtemp's ???
Member
Joined
Aug 24, 2017
Messages
1,611
Trophies
3
Location
The Forest
XP
6,237
Country
Serbia, Republic of
Scary? Why are people scared of this? What element of this is scary?
This is an interesting concept, but it definitely needs more research. If this really allows for such an efficiency and quality improvement, it would be weird if the creators of the codec didn't discover this.

"1/ a software that can do this algorithm automatically, including encoding and decoding."
For now, you could simply do this with a shell/batch script.
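As a rough idea of what such a script might run, here is a Python sketch that only builds the FFmpeg command lines for one round trip and prints them, without executing anything. Several things here are assumptions rather than anything from the thread: the function name `build_commands`, the scratch `.pcm` file name, 16-bit little-endian PCM as the intermediate format, and the last step, which uses FFmpeg's `asetrate` filter to restore the original speed instead of Music Speed Changer. It has not been tested against real files, and flag spelling may vary between FFmpeg versions.

```python
import shlex


def build_commands(src, slow_wav, ogg, restored_wav,
                   src_rate_hz, target_bitrate_kbps, hs=2, channels=2):
    """Return the FFmpeg invocations for one encode/decode round trip (not executed)."""
    slow_rate = src_rate_hz // hs
    raw_pcm = slow_wav + ".pcm"  # hypothetical scratch file name
    return [
        # 1. Dump the source to headerless 16-bit PCM, keeping its parameters.
        ["ffmpeg", "-i", src, "-vn", "-f", "s16le", "-c:a", "pcm_s16le",
         "-ar", str(src_rate_hz), "-ac", str(channels), raw_pcm],
        # 2. Read the same PCM back while declaring the Sample Rate as rate / HS.
        ["ffmpeg", "-f", "s16le", "-ar", str(slow_rate), "-ac", str(channels),
         "-i", raw_pcm, slow_wav],
        # 3. Encode the slowed-down WAV with the command line from the first post.
        ["ffmpeg", "-i", slow_wav, "-vn", "-c:a", "libvorbis",
         "-b:a", f"{target_bitrate_kbps // hs}k", "-ar", str(slow_rate),
         "-ac", str(channels), "-cutoff", "48k", ogg],
        # 4. Assumed decode step: reset the rate without touching the samples.
        ["ffmpeg", "-i", ogg, "-af", f"asetrate={src_rate_hz}", restored_wav],
    ]


if __name__ == "__main__":
    for cmd in build_commands("in.wav", "slow.wav", "out.ogg", "restored.wav",
                              src_rate_hz=96_000, target_bitrate_kbps=128):
        print(shlex.join(cmd))
```

Each list could be passed to `subprocess.run(cmd, check=True)` once the commands have been checked by hand; the same four lines would also work pasted into a shell or batch file.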

This thread should probably be moved to the Computer Technology section.
 

FamVanHa

The biggest problem lies in the decoding process. Most media players will not be able to play the file normally: the sound will be slow and distorted because it has been stretched to twice its length. I need something that integrates the whole process: convert to PCM format, slow it down 2 times by reading it at half the original Sample Rate, convert it to WAV format while keeping those parameters, then convert to OGG Vorbis format and play it back normally.
This is the FFmpeg command line to convert WAV to OGG Vorbis:
ffmpeg -i %1 -vn -c:a libvorbis -b:a B1 -ar R1 -ac 2 -cutoff 48k %2
Here %1 is the path to the WAV file you have just received, %2 is the path to the output OGG Vorbis file, and B1 is the bitrate you target divided by the encoding factor HS.
For example, with HS = 2, if you want the output file to have bitrate = 128kbps, you take 128 / HS = 64kbps.
R1 is the Sample Rate you target divided by HS, here 48kHz.
What happens if you play the audio file you just received at 200% speed?!
For example, if you have a 2-second sound containing 10,000 wave cycles, its average frequency is about 5000Hz. If you double the speed, the duration shortens to 1 second and the frequency doubles to 10,000Hz (because the 10,000 wave peaks that were spread over 2 seconds are now squeezed into 1 second). Because the sound has a higher frequency, it is also higher-pitched; conversely, if you slow down the sampling speed, the frequency decreases and the sound becomes deeper.
An OGG file compressed by the standard libvorbis codec, without the -cutoff 48k option, has a reconstructed frequency range from 0 to 15.5kHz at 64kbps and 48kHz Sample Rate. If we increase the speed to x2, try to guess how far that frequency range will extend?!
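The numbers in this post can be worked through in a short Python sketch. The function names are made up for illustration, and the 15.5kHz figure is taken as stated above rather than measured here.

```python
def average_frequency_hz(cycles, duration_s):
    """Average frequency of a sound with `cycles` wave cycles in `duration_s` seconds."""
    return cycles / duration_s


def after_speed_change(frequency_hz, speed):
    """Frequency after playing back `speed` times faster (tempo and pitch joined)."""
    return frequency_hz * speed


if __name__ == "__main__":
    print(average_frequency_hz(10_000, 2))  # 5000.0 Hz at normal speed
    print(average_frequency_hz(10_000, 1))  # 10000.0 Hz once squeezed into 1 second

    top_hz = after_speed_change(15_500, 2)  # upper edge of the range at 200% speed
    nyquist_hz = (48_000 * 2) / 2           # half of the 96kHz playback Sample Rate
    print(top_hz, nyquist_hz, top_hz <= nyquist_hz)
```

Under these figures the 0 to 15.5kHz range stretches to 0 to 31kHz at 200% speed, which still fits below the 48kHz limit of a 96kHz playback rate.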
 
