ROM Hack compression?

bugubugu

Member
OP
Newcomer
Joined
Aug 26, 2010
Messages
9
Trophies
0
XP
1
Country
Canada
Hi, I know this is the NDS hacking section, but I'm running into a rather big, unbreakable brick wall in my attempt to find the script in Fate/Extra.
I'm fairly certain it is in a bunch of files with the extension .cmp. At first I thought they were Character MaP files, since there are about 300 MB of them, but now I'm starting to think that they are actually CoMPressed files that happen to have some script in them. In other files that did have text, flags like #C21022158, #CDEF and #REND kept popping up, which I assumed tell the program which character portrait to display and whatnot.
While looking for the script, I wrote a quick 10-line program to search for instances of #REND in all the files. I was rather surprised to find them in the .cmp files. When I looked at them, I did indeed find #REND a few times, but I also found mangled versions of it. There were also a bunch of sections that were NEARLY sentences: from the hex, it looked like normal Shift JIS with 83 22 83 54 and then an occasional random byte that messed everything up. Little by little, the mangled and mutilated sentences got more mangled and more mutilated until it really was nothing but gibberish.
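A search like the one described can be sketched in a few lines of Python. This is not the original 10-line program, only a minimal stand-in; the function name `find_marker` and the folder name in the usage note are made up for illustration.

```python
from pathlib import Path

MARKER = b"#REND"  # script tag quoted in the post


def find_marker(root, marker=MARKER):
    """Return {path: [offsets]} for every file under root containing marker."""
    hits = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()  # files are read whole; fine for small files
        offsets = []
        pos = data.find(marker)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(marker, pos + 1)
        if offsets:
            hits[str(path)] = offsets
    return hits
```

Calling `find_marker("extracted_iso")` (a hypothetical folder holding the files pulled from the ISO) lists every file with an intact #REND and where it sits. Mangled copies will not match, which is itself a hint that the surrounding data is compressed.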
Does anyone know if this is a common form of compression and if so, how I would go about uncompressing it?
In the header, the first thing it said was IECP, if that helps any...
Here's an example if my explanation is too unclear:


Here are some example files, if you wouldn't mind taking a look:
http://ifile.it/zerpa4h

Thank you for your time, and my apologies if PSP-related questions aren't welcome here.
 

bugubugu

rastsan said:
they are compressed and there is a decompressor(s).
I have not tried this one but it should help.
Game Archive UnPacker 0.6.0.3 PRO
get the pro and let me know if it works please.
I got that and put it with Total Commander, but it doesn't seem to be working for me... I think the .cmp files he made the plugin for aren't the same as the .cmp files I'm dealing with.
Unfortunately, it always gives me "Error packing files" and "Error in archive file" whenever I attempt to compress or decompress.
The plugin is 5 years old, by the way. Would it still be applicable to current-gen games?
 

FAST6191

Techromancer
Editorial Team
Joined
Nov 21, 2005
Messages
36,798
Trophies
3
XP
28,321
Country
United Kingdom
You would be surprised at how much, and for how long, formats get reused, although .cmp is probably second only to .bin in the most common extension/magic stamp stakes.

Anyhow, what you described is classic compression: it starts out reasonably readable and then increasingly jumps back to earlier data.

This points to some form of LZ compression* (as opposed to the other two common methods, Huffman and run-length encoding, aka RLE), but this is where the existing tools might fall flat: the GBA and DS have their own LZ decompression functions built into the BIOS (not that it means we never see LZ compression with a custom algorithm or settings).

* http://www.romhacking.net/docs/281/ and do also poke around the utilities section there, and maybe have a look at crystaltile2 http://gbatemp.net/t232718-crystaltile2-2010-06-12 as it has great compression search abilities. The gist of it is that the data is broken up into sections (each marked by a flag byte, your "random byte"), and later data refers backwards to what came before (hence the increasing levels of gibberish).
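To make the flag-byte-plus-back-reference idea concrete, here is a toy decoder in Python. The layout is invented for illustration and is not the format of these .cmp files: each flag byte covers up to 8 items, read from the lowest bit up, where 1 means "copy one literal byte" and 0 means "a two-byte pair follows: distance back, then length".

```python
def toy_lzss_decode(data):
    """Decode the made-up toy format described in the lead-in."""
    out = bytearray()
    i = 0
    while i < len(data):
        flags = data[i]
        i += 1
        for bit in range(8):
            if i >= len(data):
                break  # ran out of input in the middle of a flag group
            if flags & (1 << bit):
                # Literal: copy one byte straight through.
                out.append(data[i])
                i += 1
            else:
                # Back-reference: (distance back, length), one byte each.
                if i + 1 >= len(data):
                    return bytes(out)
                distance, length = data[i], data[i + 1]
                i += 2
                if distance == 0 or distance > len(out):
                    raise ValueError("reference points outside decoded data")
                for _ in range(length):
                    out.append(out[-distance])
    return bytes(out)


stream = bytes([0xBF]) + b"#RUBS " + bytes([6, 4]) + b"E"
print(toy_lzss_decode(stream))  # b'#RUBS #RUBE'
```

The second `#RUB` never appears in the stream; it is rebuilt from the first one. Viewed in a hex editor, the literals stay readable while the pairs look like stray bytes, and the further into a file you go, the more pairs there are.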
 

bugubugu

Thank you so much FAST6191! That tutorial was exactly what I needed.
So, it's definitely compressed according to the LZSS scheme and actually seems to be quite similar to the examples they gave...
The one thing I don't understand, however, is the offset/length pair, as Wikipedia calls it.
In the section I posted above, the part with #RUBS DERI--gibberish, two kanji, #REND is supposed to be:
(note: I'm romanizing the kana, one character per byte; kanji are represented as Kk)
#RUBS DeRiIiTo #RUBE KkKk #REND
So, logically, the #RUB in #RUBS can be referred to in place of the #RUB in #RUBE, which is exactly what's happening.
The compressed version has:
#RU (FF) BS DeRiIi (FB) To (2C E1) E KkKk (FF)
The FB flags the back-reference, the 2C E1, just like the tutorial said it would, and then it goes on. However, I can't quite see how 2C E1 can be an offset/length pair pointing to #RUB, since the match is only about 14 bytes behind and only 4 bytes long.
In the other file, it has the same thing except with F4 71 instead, which is strange because in both files, that is the first time #RU ever appears.
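The flag bytes in that excerpt can be checked by expanding them bit by bit. The sketch below assumes the bits are read from the lowest bit up and that 1 means a literal byte while 0 means a two-byte pair; that reading is a guess, but it fits the bytes quoted above if the spaces in the romanized line are only there for readability.

```python
def flag_bits(flag):
    """Return the 8 bits of a flag byte, least significant bit first."""
    return [(flag >> bit) & 1 for bit in range(8)]


def describe(flag):
    # Assumption: 1 = one literal byte, 0 = one two-byte offset/length pair.
    return ["literal" if b else "pair" for b in flag_bits(flag)]


print(describe(0xFF))
# ['literal', 'literal', 'literal', 'literal', 'literal', 'literal', 'literal', 'literal']
print(describe(0xFB))
# ['literal', 'literal', 'pair', 'literal', 'literal', 'literal', 'literal', 'literal']
```

Under that reading, FF covers the eight literal bytes of "BS DeRiIi" (spaces ignored), and FB covers two literals ("To"), one pair (2C E1), then five literals ("E KkKk"), which lines up with the excerpt.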

I also can't quite seem to get crystaltile to find anything for me. All it seems to do is look for instances of 10...
 

FAST6191

(flag) 10 is a magic stamp that appears at the start of compressed files for a type of LZ common on the GBA and DS (flag 11 has appeared in the last year or so, though, and did not work with a few tools). It was worth a try, I guess.

"14 bytes behind as well as only 4 bytes long." The offset need not be counted back from the current position; it can be counted from the start of the compressed section (or compression window). Have a look for some other compression examples in your files and see what they say. Note also that it can reference the uncompressed work (or work which was uncompressed up to this point, same thing really).

I have no idea if it was done here, but on the PC, 7zip and some of the other high-end compression methods can compress multiple files at the same time (that is to say, treat all the files as one long file), making one dictionary for all the files (or a bunch of them) and, at least in theory, providing better compression.

Your best bet, if you find yourself at this stage, is to load up the ISO in an emulator or debugger and see how things work at the assembly level. I have no idea how far the PSP has got in this world though (remember hackers do not need an emulator to be anywhere near playable). If nothing else, the text will probably be decompressed into RAM, in which case you can snatch it from RAM and set about translating it while you sort the technical issues.

Sidenote: if my hunch is correct you have a really nice game there, in that it will detect/work with ASCII rather than assume every character is 16 bits.
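On the point above about offsets being counted from the start of the window rather than back from the current position: one widely copied LZSS layout works exactly that way, storing an absolute position in a 4096-byte ring buffer. The sketch below assumes that layout (12-bit position, 4-bit length stored minus 3, first write at position 0xFEE, buffer pre-filled with zero bytes, flag bits read lowest first with 1 = literal). None of this is confirmed for these .cmp files, and the IECP header would have to be skipped before feeding data in.

```python
def lzss_ring_decode(data, window=4096, start=0xFEE, fill=0x00):
    """Decode LZSS pairs that hold an absolute ring-buffer position.

    All defaults are assumptions borrowed from a common LZSS layout,
    not values known to apply to the game's files.
    """
    ring = bytearray([fill]) * window
    w = start  # next write position inside the ring buffer
    out = bytearray()
    i = 0
    while i < len(data):
        flags = data[i]
        i += 1
        for bit in range(8):
            if flags & (1 << bit):
                if i >= len(data):
                    return bytes(out)
                chunk = data[i:i + 1]  # one literal byte
                i += 1
            else:
                if i + 1 >= len(data):
                    return bytes(out)
                lo, hi = data[i], data[i + 1]
                i += 2
                pos = lo | ((hi & 0xF0) << 4)  # 12-bit absolute position
                length = (hi & 0x0F) + 3       # 4-bit length, stored minus 3
                chunk = None
            if chunk is not None:
                out += chunk
                ring[w] = chunk[0]
                w = (w + 1) % window
                continue
            # Copy byte by byte so a match may overlap its own output.
            for k in range(length):
                byte = ring[(pos + k) % window]
                out.append(byte)
                ring[w] = byte
                w = (w + 1) % window
    return bytes(out)


sample = bytes([0x0F]) + b"ABCD" + bytes([0xEE, 0xF1])
print(lzss_ring_decode(sample))  # b'ABCDABCD'
```

In the sample, the pair EE F1 means "position 0xFEE, length 4", which is where the first literal was written. With this kind of addressing, the same text can produce different pair bytes in different files simply because it was decoded at a different point in the ring.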
 

bugubugu

The emulator's not working with the game (at least not on my computer), so much for the shortcut.
I don't think that all the files share one dictionary, because the pattern of being coherent at the beginning and gibberish at the end holds in each file.
Oh, and would it be too much wishful thinking to expect the window (the maximum distance between a matched pair) to be 256 bytes (FFh, naturally)? I'm fairly certain there weren't any longer offset/length pairs, but it got quite hard to tell near the end of the file.
Do you know of any other decompression tools that might work? Or am I more than likely going to have to write my own...?
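One way to test the 256-byte guess is to try candidate bit layouts on the two pairs quoted earlier in the thread, 2C E1 and F4 71, both of which stand in for the 4-byte #RUB. The two layouts below are hypothetical: an 8/8 split (one byte of offset, one byte of length, so a 256-byte window) and a 12/4 split (12-bit position, 4-bit length stored minus 3, so a 4096-byte window).

```python
PAIRS = [(0x2C, 0xE1), (0xF4, 0x71)]  # the two pairs quoted in the thread


def split_8_8(b0, b1):
    # Hypothetical layout: first byte is the offset, second is the length.
    return {"offset": b0, "length": b1, "window": 256}


def split_12_4(b0, b1):
    # Hypothetical layout: high nibble of the second byte extends the offset,
    # low nibble holds the length minus 3.
    return {
        "offset": b0 | ((b1 & 0xF0) << 4),
        "length": (b1 & 0x0F) + 3,
        "window": 4096,
    }


for b0, b1 in PAIRS:
    print(f"{b0:02X} {b1:02X}", split_8_8(b0, b1), split_12_4(b0, b1))
```

Under the 12/4 split both pairs come out with length 4, matching #RUB, while the 8/8 split gives lengths of 225 and 113. That hints at a 4096-byte window rather than 256 bytes, though two samples are thin evidence and more pairs from the files would need to be checked.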
 
