# Ultra File Compression



## JFTS (Jun 8, 2013)

So, does anyone know an effective method to ultra-compress files? You might have seen some PC games compressed to incredibly small sizes here and there (like GTA San Andreas, which was compressed to a 1MB file!). I tried many high-compression programs like 7zip, UHARC and KGB Archiver, but I can't get the same results on other types of files, like a music album for example. Why can't some files be compressed? I don't get why games can be reduced to 70% of their size, but a 60MB album or a 5MB PDF stays pretty much the same after compression.


----------



## FAST6191 (Jun 8, 2013)

If San Andreas came as a 1 meg file it is because the 1 meg was either 
1) Fake
2) A virus
3) A downloader for the rest of the game/frontend for a service like onlive.
4) Possibly a combination of all three previous options.

There are ultra small games that do things like procedural generation ( http://pouet.net/prod.php?which=12036 being one of the more noted examples of it happening in the modern world) but GTA is not one of them nor one able to be reduced to it.
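To give a rough feel for procedural generation, here is a minimal sketch. It is not how the linked demo works; the function name `generate_terrain` and the random-walk rule are made up for illustration. The point is only that a few lines of code plus a seed can stand in for a large pile of data.

```python
import random

def generate_terrain(seed: int, length: int = 100_000) -> bytes:
    """Random-walk heights (0-255) derived entirely from the seed (hypothetical example)."""
    rng = random.Random(seed)
    height = 128
    out = bytearray()
    for _ in range(length):
        # Move up, down, or stay level, clamped to the 0-255 range.
        height = min(255, max(0, height + rng.choice((-1, 0, 1))))
        out.append(height)
    return bytes(out)

terrain = generate_terrain(seed=12036)
print(len(terrain), "bytes of terrain rebuilt from one small seed")
```

The same seed always gives the same terrain, so only the code and the seed need to ship. That works for content designed this way from the start, which is why it does not apply to a game built from hand-made assets.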

Compression is based on repetition of patterns in files, and many files (multimedia ( http://www.cmlab.csie.ntu.edu.tw/cml/dsp/training/coding/mpeg1/ ) and things like PDF being great examples) are already compressed as part of how they are made. There are some area-specific compressions and there are some half compression/half file recovery methods more suited to use on a supercomputer, but let us not get into that right now.
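A quick way to see this for yourself is the sketch below, which uses Python's built-in `zlib` (DEFLATE, the same family as basic zip). The sample data is invented for the demonstration, and random bytes stand in for "already compressed" content such as an MP3.

```python
import os
import random
import zlib

def packed_size(data: bytes) -> int:
    """Size in bytes after zlib (DEFLATE) at the highest level."""
    return len(zlib.compress(data, 9))

# Made-up sample data, purely for illustration.
repetitive = b"the quick brown fox " * 5000    # one phrase repeated: lots of pattern
rng = random.Random(0)
words = ["game", "texture", "sound", "level", "menu", "save", "map", "model"]
text = " ".join(rng.choice(words) for _ in range(20000)).encode()
noise = os.urandom(100_000)                    # no patterns at all
once = zlib.compress(text, 9)                  # data that has already been compressed

for name, data in [("repetitive", repetitive), ("word soup", text),
                   ("random bytes", noise), ("already compressed", once)]:
    print(f"{name:20s} {len(data):7d} -> {packed_size(data):7d} bytes")
```

The repeated phrase and the word soup shrink a lot, while the random bytes and the already compressed data come out at about the size they went in, because the first pass has already squeezed out the patterns a second pass would look for.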


----------



## JFTS (Jun 8, 2013)

Check out this link to see what I mean: https://forums.digitalpoint.com/threads/super-ultra-compressor-compress-1-7-gb-to-173mb.1271631/

Also, can you explain to me a bit more the thing you said about PDFs. Is it the same with music files, video etc?


----------



## FAST6191 (Jun 8, 2013)

Command & Conquer: Renegade is a 2002 textured 3D game. It would not have used much compression on anything and thus is a prime candidate for this. The link you gave does not compare it to other methods, but I reckon that if done right you would see similar savings from a basic zip file. It is a bit of a distraction from the point, but if you just tried to zip the whole directory it would likely not work very well, as basic zip works on a file-by-file basis where things like 7zip take all the files and treat them as a single data pool to compress.

PDF
The PDF is halfway between javascript, html, a word document and lisp (it is a horrible standard as far as such things go... which is half the reason PDF readers are one of the three main avenues for malware along with flash and java). However, being largely composed of text and markup and being billed as a portable format, the programs that make PDFs tend to compress them as they make them, for basic text and markup compress quite well. As compression eliminates repetition it will leave little for a second round of compression to find.

Music and video... I should have mentioned this first time around. It is called lossy compression and as the name implies it is compression that loses data.
If your document does not decompress exactly as it went in then bad things happen (your contract for 10 million now reads 10 and such like) -- for this you need lossless compression.
If your video happens to not have as deep a black for a few frames as it originally had, some area away from the actor happens to be a bit blurry, or it jerks a tiny bit between frames, it is no big deal. It gets very, very complex (I linked MPEG1 already; that is some 20 years old at this point, now consider what has happened in the last 20 years of computing and how multimedia still pushes computers to the limit) and draws on everything from high end physics to psychology, but those sorts of things are what is done.
Music is kind of similar (in fact very similar if you take it down to the lowest levels and underlying maths) in that people are not good at telling closely related pitches apart (though recent studies have shown it is better than some assumed... if you are a trained musician), so at points the encoder will treat as identical the things that nobody should be able to tell apart and gain a space saving.
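As a very crude illustration of the lossy idea, the sketch below rounds audio samples to a coarser grid before compressing them. This is plain quantization, not what MP3 or MPEG actually do (real codecs use psychoacoustic models and transforms), and the "recording" is a synthetic tone with hiss made up for the test.

```python
import math
import random
import struct
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Made-up "recording": one second of a 440 Hz tone plus faint random hiss,
# stored as signed 16-bit samples at 44.1 kHz.
samples = [
    int(20000 * math.sin(2 * math.pi * 440 * n / 44100)) + rng.randint(-100, 100)
    for n in range(44100)
]

def packed_size(values):
    """Size after zlib of the samples written as raw little-endian 16-bit data."""
    raw = struct.pack(f"<{len(values)}h", *values)
    return len(zlib.compress(raw, 9))

# The lossy step: snap each sample to the nearest multiple of 256.
# Fine detail (here, mostly the hiss) is gone for good.
STEP = 256
quantized = [round(s / STEP) * STEP for s in samples]

print("lossless only:", packed_size(samples), "bytes")
print("after quantizing:", packed_size(quantized), "bytes")
print("worst error per sample:", max(abs(a - b) for a, b in zip(samples, quantized)))
```

The quantized version compresses smaller because there is less detail left to describe, but the original samples cannot be recovered from it. That trade is acceptable for sound and video and unacceptable for a contract.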


----------



## JFTS (Jun 17, 2013)

Sorry for not responding sooner.

Thanks for your explanation! The only thing I don't get is the "random" rate of compression between files. I have a 23MB folder containing many 10-20 sec mp3s and a PDF of about 1.5MB. When I compressed this folder with 7-zip, it went from 23 to 5MB! How can it be compressed so much, but other files/folders can't?


----------



## FAST6191 (Jun 17, 2013)

The MP3 thing is odd unless there is a lot of silence.

Generally when you compare something like 7zip to something else you have to consider that 7zip treats all the files as one continuous/contiguous data set where basic zip and some implementations of rar do not. If you have a lot of files of the same type (again MP3 is a bit odd) then it may see similarities between the files where lesser methods might not. It is also why if you try to extract just one file from the 7zip archive it will take ages.
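The difference between file-by-file and "one big pool" compression can be sketched with `zlib`. This is a stand-in, not 7zip's actual LZMA, and the ten "files" are invented: each one is mostly a shared chunk with a small unique tail.

```python
import os
import zlib

# Hypothetical files: 1500 bytes common to all of them plus 100 unique bytes each.
shared = os.urandom(1500)
files = [shared + os.urandom(100) for _ in range(10)]

# File by file, as basic zip does: each file is compressed with no knowledge of the others.
separate = sum(len(zlib.compress(f, 9)) for f in files)

# As one continuous data set: the shared chunk only has to be stored once.
solid = len(zlib.compress(b"".join(files), 9))

print("file by file:", separate, "bytes")
print("one data set:", solid, "bytes")
```

Within a single file there is nothing to find, since the content is random, so file-by-file compression gains nothing. Across files the shared chunk repeats, and only the single-pool approach can see that.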

That might be a bit overly simplistic as there is also the dictionary size to deal with (short version is it only considers so much of the file before it stops looking) but for small files that will not come into play.
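The dictionary size effect can be shown with `zlib`'s `wbits` parameter, which sets how far back the compressor looks (2**wbits bytes). The numbers here are tiny compared with the dictionaries 7zip uses, and the data is made up so that repeats sit exactly 4096 bytes apart.

```python
import os
import zlib

def deflate_size(data: bytes, wbits: int) -> int:
    """Compressed size using a history window of 2**wbits bytes."""
    comp = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=wbits)
    return len(comp.compress(data) + comp.flush())

block = os.urandom(4096)   # a random chunk...
data = block * 8           # ...repeated eight times, 4096 bytes apart

small_window = deflate_size(data, 9)    # looks back about 512 bytes: never sees the repeat
large_window = deflate_size(data, 15)   # looks back about 32 KB: sees every repeat

print("input:       ", len(data), "bytes")
print("512 B window:", small_window, "bytes")
print("32 KB window:", large_window, "bytes")
```

With the small window the repeats are out of reach and the data does not shrink; with the large one it drops to roughly the size of a single block.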


----------



## JFTS (Jun 17, 2013)

Thanks for your advice! I guess it's kind of a "trial-and-error" method. Some files can be ultra compressed and some cannot.


----------



## Xuphor (Jun 17, 2013)

Two methods come to my mind for extreme file compression:

1 - Bethesda's compression software. They managed to fit the entirety of Skyrim into just a little over 5 GB on the PC version. That's insane. It's also not available to the public without some very fancy (and illegal) third-party software.

2 - UnARC, some sort of program I've discovered from uh.... less than legal places (Torrent websites). In short, it once managed to get a 7GB game down to ~2GB somehow, without losing/ripping anything. However, the program itself is *NOT* illegal, so here is a link: http://membled.com/work/apps/unarc/
I do not know how to use UnARC myself, but I've seen the wonders it can do. Beware, however, if you do use the most extreme compression method in UnARC (like the 7GB down to 2GB thing): it took my computer a good hour and a half to extract it, using an AMD Phenom II X4 965 processor. While it's not the best CPU, it's certainly not terrible, so decompression times are very long. I can't imagine how long it would take to compress something using that.


----------



## DinohScene (Jun 17, 2013)

Why compress files as the technology these days allows for large harddrives and large flashdrives ._.
Not only that but the internet gets faster daily. 

Besides, you really need 6+ TB of data?


----------



## JFTS (Jun 18, 2013)

Xuphor said:


> Two methods come to my mind for extreme file compression:
> 
> 1 - Bethesda's compression software. They managed to fit the entirety of Skyrim into just a little over 5 GB on the PC version. That's insane. It's also not available to the public without some very fancy (and illegal) third-party software.
> 
> ...


I think that for big games or professional programs, all companies must have "internal" compressors/extractors in order to fit them on discs.

Thanks for the reference about UnARC. I'll try it later and see if the results are any different.




DinohScene said:


> Why compress files as the technology these days allows for large harddrives and large flashdrives ._.
> Not only that but the internet gets faster daily.
> 
> Besides, you really need 6+ TB of data?


It's not so much about compression itself, but when you archive raw uncompressed files of any kind, you always have to think about the space they need. Furthermore, when working with professional audio/video programs, 6+TB are not so uncommon for HDDs.


----------

