I recently got the idea to reverse engineer Zero Escape: 999 as a way to better understand how the visual novel storylines and room escapes are implemented. I am particularly interested in how the ROM is structured, and where the control for the different storylines is located.
Since I have never analyzed an NDS ROM before (I've done plenty of reversing, but mostly Linux and Windows executables), I started by reading the docs on the NDS cartridge header and layout. From those, I was able to extract the ARM9 binary (which turns out to be < 1 MB out of the total 128 MB). I also extracted the game icon and, after learning how tiles work, I was able to recreate it on my computer (which was cool to see).
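For reference, the ARM9 extraction step looks roughly like the sketch below. The field offsets (ARM9 ROM offset at 0x20, entry address at 0x24, RAM load address at 0x28, size at 0x2C, all little-endian 32-bit) follow the publicly documented cartridge header layout; the function names are my own and purely illustrative.

```python
import struct


def read_arm9_info(rom: bytes) -> dict:
    """Read the ARM9-related fields from an NDS cartridge header."""
    # Game title: 12 ASCII bytes at offset 0x00, zero-padded.
    title = rom[0x00:0x0C].rstrip(b"\x00").decode("ascii", errors="replace")
    # Four consecutive little-endian u32 fields starting at 0x20.
    rom_offset, entry, load_addr, size = struct.unpack_from("<4I", rom, 0x20)
    return {
        "title": title,
        "rom_offset": rom_offset,  # where the ARM9 binary sits in the ROM
        "entry": entry,            # ARM9 entry point in RAM
        "load_addr": load_addr,    # RAM address the binary is loaded at
        "size": size,              # size of the (still packed) binary
    }


def extract_arm9(rom: bytes) -> bytes:
    """Slice the ARM9 binary out of the ROM image."""
    info = read_arm9_info(rom)
    start = info["rom_offset"]
    end = start + info["size"]
    if end > len(rom):
        raise ValueError("header points past the end of the ROM image")
    return rom[start:end]
```

Loading the result at `load_addr` in the disassembler is what makes the absolute addresses in the code line up.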
Anyway, for the ARM9 binary, I first saw that there was some initialization code, followed by what appeared to be an unpacker. The unpacker works by alternating control and data bytes. The control byte is a sequence of bits, where 0 means to directly copy the next byte, and 1 means to repeat a substring up to 17 times at a 14-bit index within the current decompressed segment (not sure if this is a well-known algorithm; LZ77 seems close but it's a bit different). Applying this unpacking scheme, I was able to get the full binary.
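To show the general shape of the scheme, here is a minimal LZSS-style decoder sketch. It is not the exact routine from the game: the token layout (a big-endian 16-bit value with a 4-bit length and a 12-bit distance), the MSB-first flag order, and the `+3` / `+1` biases are all assumptions taken from the classic LZSS layout, and the name `lzss_decompress` is hypothetical.

```python
def lzss_decompress(data: bytes, out_size: int) -> bytes:
    """Decode a flag-byte LZSS stream (assumed layout, see comments)."""
    out = bytearray()
    pos = 0
    while len(out) < out_size and pos < len(data):
        control = data[pos]
        pos += 1
        # Assumption: flags are consumed from the most significant bit down.
        for bit in range(7, -1, -1):
            if len(out) >= out_size or pos >= len(data):
                break
            if (control >> bit) & 1 == 0:
                # Flag 0: copy the next input byte verbatim.
                out.append(data[pos])
                pos += 1
            else:
                # Flag 1: back-reference into the already decoded output.
                if pos + 1 >= len(data):
                    raise ValueError("truncated back-reference token")
                token = (data[pos] << 8) | data[pos + 1]
                pos += 2
                length = (token >> 12) + 3        # assumed 4-bit length field
                distance = (token & 0x0FFF) + 1   # assumed 12-bit distance field
                start = len(out) - distance
                if start < 0:
                    raise ValueError("back-reference before start of output")
                # Byte-by-byte copy so overlapping references repeat data.
                for i in range(length):
                    out.append(out[start + i])
    return bytes(out[:out_size])
```

For example, `lzss_decompress(bytes([0x20, 0x61, 0x62, 0x10, 0x01]), 6)` emits two literals and then a 4-byte overlapping copy at distance 2. The field widths in the real unpacker differ from these (I described a 14-bit index above), so the masks and biases would need adjusting to match.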
After the unpacking, the program copies some sections of memory: one is a large code segment that I think contains library functions, and the other is a set of offsets. Then some initialization functions are called, and the main program starts.
I've reversed a few of the library functions. For example, there were a few versions of memcpy that I saw, along with what was obviously a vsprintf. I also saw some functions that implement DMA transfers.
The problem is that there are over 2500 subroutines in the binary. I suspected that the developers used an extensive library, which led me to the NITRO ROM SDK. I compiled library signatures and loaded them into my database, only to find ~100 new functions labeled, and some of those did not even appear to be correct (I am using IDA, and the library signature mechanism is not super smart). My guess is that Spike Chunsoft has their own large library that implements things like reading files from the ROM, unpacking / deobfuscating file contents, and loading graphics.
I think it would be infeasible to reverse engineer the entire game by hand. So my question is, what would be the most effective way to proceed? Is it a good idea to try using a debugger and setting breakpoints to see which parts of the code are being visited during the game? How hard would it be to locate the "main" functionality (which I guess is reading in a storyline(?) and associated content, and running the escape room sections)? Is there any hope of completely understanding the ARM code and the file formats, barring getting help from a large group of people who are interested or somehow obtaining the actual game source?
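To make the breakpoint idea concrete, what I have in mind is something like differential coverage: record the executed addresses during a novel section and during an escape room, then see which functions only show up in one of them. The sketch below assumes I can get a list of executed program counter values out of an emulator or debugger and a list of function start addresses out of the disassembler; both inputs and the function names are hypothetical.

```python
from bisect import bisect_right


def pcs_to_functions(pcs, func_starts):
    """Map each executed address to the start of the function containing it."""
    starts = sorted(func_starts)
    hit = set()
    for pc in pcs:
        # Index of the last function start that is <= pc.
        i = bisect_right(starts, pc) - 1
        if i >= 0:
            hit.add(starts[i])
    return hit


def unique_to(trace_a, trace_b, func_starts):
    """Functions executed in trace_a but never in trace_b."""
    return pcs_to_functions(trace_a, func_starts) - pcs_to_functions(trace_b, func_starts)
```

Functions that appear only in the escape room trace would be the first candidates to look at by hand, which should cut the 2500 subroutines down considerably.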