Decided to peel myself away from silly videos to have a look at this. I have the European version here because that is what I grabbed. The general flow should not be any different though, and hopefully it is not one of the ones where Germany or Australia got the fun police involved if you do decide to pivot.
Negative sizes tend to make physics cry so whether a value is signed or not tends not to be a problem there. Minor exception for older systems and compression schemes that do relative jumps and might support signed values there.
File out. I had a look at the filenames.bin and it appears to have the extensions as well as the names, which is very nice of them -- normally I would default to .bin, or if there is a nice magic stamp then maybe that instead.
That is probably getting ahead of things, but might as well start with that as it is a pretty clear example of a file format like this.
Start of the file. Bunch of numbers counting generally upwards after you flip them (see big endian and little endian if you have not met that before).
0000 361C if flipped.
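For anyone who has not done the flipping before, here is a minimal Python sketch of it. The on-disk byte order 1C 36 00 00 is my assumption, worked backwards from the flipped value above, and flip_u32 is just a name made up for the example.

```python
import struct

def flip_u32(raw: bytes) -> int:
    """Read 4 bytes as a little-endian unsigned 32-bit value."""
    (value,) = struct.unpack("<I", raw)
    return value

# Assumed on-disk order for the value shown flipped above.
raw = bytes.fromhex("1C360000")
print(f"{flip_u32(raw):08X}")  # 0000361C
```

Most hex editors will also do this for you in a data inspector pane, so the code is more for when you get to scripting it.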
Going there (not necessary in this as you could have scrolled down until things started making sense but I rarely have an example as clean as this and one was called for so hey)
Technically that lands on a 00, which is possibly odd but also maybe not. Worst case scenario things might need to be shifted down one row in the eventual spreadsheet.
361D however looks like the good stuff. Everything also separated by a 00 (dump the names into another file, find replace 00 with 0d0a aka the Windows way of doing a new line, 0A if doing Unix style) so that is nice for the extraction.
Before getting into that though:
F82C 0100 is just before that and is presumably the last entry in the list.
flipped that is
0001 F82C
Oh look where the file ends/has its last entry
Variations on this are basically ROM hacking in a nutshell, at least the pulling apart of archive formats (which in the end is most things if you want to look at it that way -- most files will have subsections and you want to be able to get at them to start drilling down).
If we were playing big boy ROM hacker you would want to figure out the name location thing and note it down accordingly. As mentioned though, everything is nicely separated by a 00 that should never come up within a normal name (granted I have not checked at time of writing, there might be some subdirectories or something, which is possibly what that 00 at the start indicated). Similarly there is not much call to alter the names of files in something like this, so it is of minimal interest in this particular instance. Everything else though, yeah, you want to know where things are located, so call it a gentle introduction.
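If you did want to script the name side of things rather than find and replace, something like the following Python sketch would do. It assumes (not verified against the whole file) that filenames.bin is a table of little-endian 32-bit pointers running from the start of the file up to wherever the first pointer points, with each pointer landing on a 00-terminated name. read_names is a made up helper name.

```python
import struct

def read_names(blob: bytes) -> list[tuple[int, str]]:
    """Return (pointer, name) pairs from a pointer table plus 00-terminated names."""
    # Assumption: the first pointer also marks the end of the pointer table.
    first = struct.unpack_from("<I", blob, 0)[0]
    count = first // 4
    pointers = struct.unpack_from(f"<{count}I", blob, 0)
    names = []
    for ptr in pointers:
        end = blob.find(b"\x00", ptr)
        if end == -1:  # no terminator found, run to the end of the file
            end = len(blob)
        names.append((ptr, blob[ptr:end].decode("ascii", errors="replace")))
    return names

# Tiny made up example: three pointers, the first landing on a bare 00.
demo = struct.pack("<3I", 12, 13, 17) + b"\x00abc\x00de.f\x00"
for ptr, name in read_names(demo):
    print(f"{ptr:04X} {name!r}")
```

A pointer that lands straight on a 00 comes back as an empty name, which is one way the odd 00 at the start of the names could show up.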
Onto fileinfo.bin
Whole bunch of numbers counting upwards. Usually then a safe bet on it being the locations, or maybe locations and sizes (size is not necessarily the same as next location minus current location, as it is generally easier to read files starting on boundaries, more so if you find yourself on an optical media type device where sectors are a thing*).
*also why your computer says file size and size on disk, and why a few million 100 byte files can eat up hard drive space at a faster rate than you might expect.
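The rounding behind that is simple enough to sketch in Python. The 4096 byte cluster is an assumed example value (it varies by file system and format options, and some file systems tuck tiny files away elsewhere), 2048 is the usual data sector size on CD/DVD, and align_up is a made up helper name.

```python
def align_up(size: int, boundary: int) -> int:
    """Round size up to the next multiple of boundary."""
    return -(-size // boundary) * boundary

CLUSTER = 4096  # assumed cluster size, check your own drive
SECTOR = 2048   # usual data sector size on CD/DVD

print(align_up(100, CLUSTER))              # 4096
print(align_up(100, SECTOR))               # 2048
print(2_000_000 * align_up(100, CLUSTER))  # 8192000000
```

So two million 100 byte files hold 200 MB of actual data but would claim around 8 GB under that assumption.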
Anyway many files will start with maybe some info about the file itself -- how many entries, when the thing ends, when the section ends... and might look a bit different to normal entries that make up the rest of the file.
a25c long so probably not that. If we were sensible we might have grabbed a number of entries in the previous name pondering to see if that is noted (number of entries multiplied by a given amount gives you the file size, which also divides back down so you can get an estimate of length of each entry) but eh.
As this involves file sizes, and you might have tiny palette files mixed in with huge sound files, random jumps in the numbers are fairly expected. Likewise a whole run of files expected to be identically sized (there are a fixed number of tiles on screen, stats tables tend to be the same for each player character, palettes are a fixed size...) can also hide things from you. This can make patterns a bit harder to spot, so I quite often scroll in a bit or find somewhere it is easier to ponder before going out and spreading it file wide.
12 bytes appears to make things line up (really, I make the window larger and smaller until something looks pleasing), albeit with some odd jumps a bit further into the file. That is also a lot of data for a location alone (32 bits aka 4 bytes, 2^32, covers about 4 gigabytes of space), so we then get to start pondering what else is in there: file size, compressed size, file ID number (we have names here, but it is something that can trouble things in other instances)... If it makes sense to store the data for an archive or the file type you are looking at then it gets to be considered, though this is console hacking and not high end data storage so don't let your imagination run too far -- read only, execute permissions, modified and created dates... kind of pointless here where they might be essential to a zip file. Of course 12 bytes might also be a false friend and the real entry length is something smaller that divides into it.
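Resizing the window until it looks pleasing is what I actually did. If you wanted to automate the eyeballing, one rough heuristic (my own sketch here, not anything the game does) is to try a row length and see how often the first 4 bytes of each row keep counting upwards. The sample rows are the six complete ones from the tail of fileinfo.bin shown further down the post, and stride_score is a made up name.

```python
import struct

def stride_score(blob: bytes, stride: int, field_offset: int = 0) -> float:
    """Fraction of consecutive rows whose u32 at field_offset does not decrease."""
    if stride < field_offset + 4:
        raise ValueError("stride too small to hold a 32-bit field there")
    values = [
        struct.unpack_from("<I", blob, pos + field_offset)[0]
        for pos in range(0, len(blob) - stride + 1, stride)
    ]
    if len(values) < 2:
        return 0.0
    rising = sum(b >= a for a, b in zip(values, values[1:]))
    return rising / (len(values) - 1)

# Six complete rows copied from the fileinfo.bin snippet in this post.
rows = bytes.fromhex(
    "0C698D02 24080000 767299FE"
    "30718D02 46010000 E17899FE"
    "78728D02 46010000 1F7999FE"
    "C0738D02 301E0000 E143CE42"
    "F0918D02 201C3801 2E4FE566"
    "10AEC503 80840000 C7D2AFC9"
)
for stride in (4, 8, 12, 16):
    print(stride, round(stride_score(rows, stride), 3))
```

A score of 1.0 only says the guess is consistent with a sorted location column. Runs of identical sizes or sorted data elsewhere can fool it just as they can fool your eyes.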
Quick and dirty find and replace back in the names section says 3463 replacements, so presumably that many files as well. Though looking more carefully:
Code:
animations
collisions
comicpanelscripts
dialogues
fonts
levels
models
nitrochr
simplertas
sounds
int_room
m00_kill
m01_cuba1_a
m01_cuba1_b
m02_cuba2_a
m02_cuba2_b
m03_arctic_a
m03_arctic_b
m03_arctic_c
m04_jungle_a
m04_jungle_b
m05_light_a
m06_river_a
m06_river_b
m06_river_c
m07_afghan1_a
m07_afghan1_b
m08_afghan2_a
m09_afghan3_a
m10_secret_a
m11_esc_int_a
m11_esc_int_b
m12_esc_ext_a
m12_esc_ext_b
m12_esc_ext_c
m13_cuba3_a
m14_paddy_a
m14_paddy_b
m14_paddy_d
m15_mother_a
m15_mother_b
m16_pay_a
m16_pay_b
mp_base
mp_basilica
mp_caves
mp_command
mp_crossfire
mp_graveyard
mp_highrise
mp_lighthouse
mp_paddy
mp_praelium
mp_stronghold
mp_theyard
train
zm_facility
zm_house
zm_overlook
zm_temple
credits - 06-18-09.crdb
templates.crtc
mav.txt
credits_old_052410.crdb
credits.crdb
No extensions where other things have them and those word choices scream directory name to me, or at least further sub archives.
61 of them depending upon whether you include the initial 00 name thing.
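Picking those out by hand is fine for 60-odd names, but as a Python sketch of the same idea (treating "no extension" as "probably a directory" is only a guess, and split_by_extension is a made up helper):

```python
import os

def split_by_extension(names):
    """Split names into (has an extension, has none) using os.path.splitext."""
    with_ext, without_ext = [], []
    for name in names:
        _, ext = os.path.splitext(name)
        (with_ext if ext else without_ext).append(name)
    return with_ext, without_ext

# A few names taken from the list above.
sample = ["levels", "m01_cuba1_a", "mp_base", "mav.txt", "credits.crdb"]
files, maybe_dirs = split_by_extension(sample)
print(files)
print(maybe_dirs)
```

An empty name (like the 00 at the start) also ends up in the second list, as does anything that is nothing but a leading dot and extension, so check the odd ones by eye.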
a25c = 41564 decimal
dividing that by 3463 (the number of files)
12, near enough: 3463 x 12 = 41556, which leaves 8 bytes over (a remainder or fractional result tends to mean you have some extra data like a length of file or length of section, or you miscounted the number of files, possibly because of directory names or something)
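The same sum as a couple of lines of Python, using only the two numbers from this post (the file length and the replacement count), in case you want to check the remainder as well as the quotient:

```python
file_size = 0xA25C   # length of fileinfo.bin
name_count = 3463    # replacements reported in the names file

entry_size, leftover = divmod(file_size, name_count)
print(file_size, entry_size, leftover)  # 41564 12 8
```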
The numbers stop making sense around 64 entries as well and follow a new pattern after that. Whether directory names become a thing I don't know. Could be something else entirely.
Onto the filedata itself. If life was horrible and you did not fancy figuring out how the binary handles it then magic stamps noted above also tend to be followed by file sizes. You could possibly then chain a search through there for magic stamps and sizes to figure things out, at least until the devs make a custom format without it. Fortunately we just spent time looking at the helper files.
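A rough Python sketch of that chained search, for the life-is-horrible case. It assumes the size really is a little-endian 32-bit value sitting 8 bytes in from the magic stamp, as it tends to be for the standard Nintendo DS formats. find_stamped_files is a made up name and the test blob is fabricated purely to show the layout.

```python
import struct

def find_stamped_files(blob: bytes, magic: bytes) -> list[tuple[int, int]]:
    """Return (location, reported size) for every occurrence of magic."""
    hits = []
    pos = blob.find(magic)
    while pos != -1:
        if pos + 12 <= len(blob):
            # Assumption: 32-bit file size stored 8 bytes after the stamp.
            size = struct.unpack_from("<I", blob, pos + 8)[0]
            hits.append((pos, size))
        pos = blob.find(magic, pos + 1)
    return hits

# Fabricated example: 16 bytes of padding, then a fake 0x40 byte "SDAT".
fake = b"\x00" * 16 + b"SDAT" + b"\xff\xfe\x00\x01" + struct.pack("<I", 0x40)
fake += b"\x00" * (0x40 - 12)
print(find_stamped_files(fake, b"SDAT"))
```

Expect false positives on short magic stamps, so sanity check the reported size against what is left of the file before carving anything.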
61 megabytes of file for a 64 megabyte ROM (1 meg of that being the utility.bin in the dwc directory, this usually being the download play) means most of the good stuff is probably in here, though they might have squeaked some into the binaries and overlays.
03C63AA0 is the last byte of the file, so the location field needs to be able to hold that if we are contemplating how to split the 12 bytes back in the info file (6 bytes location and 6 something else, or 8 bytes location and 4 something else, being the starting bets).
Snippet from the end of the fileinfo.bin (the missing few bytes at the end there are cause for a pause)
Code:
Offset(h) 00 02 04 06 08 0A
0000A20C 0C69 8D02 2408 0000 7672 99FE .i..$...vr™þ
0000A218 3071 8D02 4601 0000 E178 99FE 0q..F...áx™þ
0000A224 7872 8D02 4601 0000 1F79 99FE xr..F....y™þ
0000A230 C073 8D02 301E 0000 E143 CE42 Às..0...áCÎB
0000A23C F091 8D02 201C 3801 2E4F E566 ð‘.. .8..Oåf
0000A248 10AE C503 8084 0000 C7D2 AFC9 .®Å.€„..ÇÒ¯É
0000A254 9032 C603 3008 0000 .2Æ.0...
The last entry might not be the end of the file as much as the start of the last file in the data blob but fairly safe bet it is going to be somewhere in the region (other than archives like this and audio files you tend not to get that large), though with file names like
cod7_mpbtn_connect3.ncgr
sound_all.sdat
sound_all.sd
digitalmusic.smb
ending off the files... might not be the ideal case (would have loved that ncgr, nice 2d graphics file there, see
https://www.romhacking.net/documents/[469]nds_formats.htm ).
Searching the ROM for SDAT lands me at 028D 91F0, which you can see in the above snippet as well (again flipped). Its internal size report (most DS formats from Nintendo have this 8 bytes in) is 201C 3801 (note again it is flipped, but matching the file here for hopefully obvious reasons), so it looks like we have file location and file size in fairly plain form there. What the other 4 bytes are remains a bit of a mystery, especially with the bit of randomness there. Whether we can say we are only ripping files, so who cares, just make the script, I do not know.
Said short final entry, rather than a perfect multiple of 12, might also mean the unknown section is actually for the file after rather than this one. Might also be more evidence of directory names (directories don't necessarily need directory indicators of their own, unless they were advanced enough to include subdirectory support, which few would have).
Looking more closely at the names, "global.roo" features many times as well, which makes things a bit trickier, though you could always just include the location in the name to make it unique, and it is also a further indicator of directories. At 2996 I also see .ncgr_ as a name with no file name, which might want its own unique effort (underscores by the way tend to indicate compression, which is fairly expected with graphics formats like that).
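To check the location plus size reading against the snippet above, here is a Python sketch that treats each 12 byte row as location, size and an unknown 4 bytes (all little endian), and lets the short final row through as location and size only. The field names, Entry and parse_fileinfo are mine, not anything official.

```python
import struct
from typing import NamedTuple, Optional

class Entry(NamedTuple):
    offset: int
    size: int
    unknown: Optional[int]  # meaning not worked out; None for the short last row

def parse_fileinfo(blob: bytes) -> list[Entry]:
    entries = []
    for pos in range(0, len(blob), 12):
        chunk = blob[pos:pos + 12]
        if len(chunk) == 12:
            entries.append(Entry(*struct.unpack("<III", chunk)))
        elif len(chunk) == 8:  # the short final row seen in the snippet
            offset, size = struct.unpack("<II", chunk)
            entries.append(Entry(offset, size, None))
    return entries

# Rows copied from the fileinfo.bin snippet above.
tail = bytes.fromhex(
    "0C698D02 24080000 767299FE"
    "30718D02 46010000 E17899FE"
    "78728D02 46010000 1F7999FE"
    "C0738D02 301E0000 E143CE42"
    "F0918D02 201C3801 2E4FE566"
    "10AEC503 80840000 C7D2AFC9"
    "9032C603 30080000"
)
entries = parse_fileinfo(tail)
gaps = [nxt.offset - (cur.offset + cur.size) for cur, nxt in zip(entries, entries[1:])]
for entry, gap in zip(entries, gaps):
    print(f"{entry.offset:08X} size {entry.size:08X} gap to next {gap}")
```

In these rows location plus size either lands exactly on the next location or falls 2 bytes short of it, which fits files being padded out to start on a 4 byte boundary. The SDAT row (028D91F0, size 01381C20) ends exactly where the next file starts.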
Anyway back to things. Got a spreadsheet opened up and names imported. Had to line things up and do manual flipping but in the end got myself a nice little batch file. Attached along with spreadsheet. Bit of a system grinder but that might also be 5 browser windows and who knows how many video tabs. Does allow fun like
Computer is now playing some suitably epic cold war themed music. Technically could have already done that (searching for SDAT and grabbing the size is fairly basic, some programs might even do it automatically/as a first pass rather than hoping names/extensions/directories make sense). The venerable ndssndext
https://gbatemp.net/download/nds-sound-extractor.28818/ doing the deed as vgmtrans was not up to the task it seems.
Also
You might also choose to lose the location # file name part, I mostly included it to make them all unique and dodge the unnamed file mentioned without sorting the presumed directory aspect. Makes the names a bit more amenable and might also allow for easier ripping for programs that only support the same name for the various graphics aspects to be loaded/attached to each other.
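For those that would rather not go the spreadsheet and batch file route, the same carve can be sketched in Python. This is not what the attached batch file does line for line: the name format here (location in hex, underscore, file name), the "unnamed" placeholder, and the helper names are my own choices for the example.

```python
from pathlib import Path

def output_name(offset: int, name: str, with_location: bool = True) -> str:
    """Build an output file name, optionally prefixed with the hex location."""
    base = name if name else "unnamed"  # placeholder for entries with no name
    return f"{offset:08X}_{base}" if with_location else base

def carve(blob: bytes, entries):
    """entries: iterable of (offset, size, name). Returns (file name, data) pairs."""
    return [
        (output_name(offset, name), blob[offset:offset + size])
        for offset, size, name in entries
    ]

def write_all(carved, out_dir: str) -> None:
    """Write carved files into out_dir, creating it if needed."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for filename, data in carved:
        (out / filename).write_bytes(data)

# Made up data: two files sharing a name and one with no name at all.
blob = bytes(range(32))
carved = carve(blob, [(4, 4, "global.roo"), (8, 2, "global.roo"), (10, 3, "")])
print([name for name, _ in carved])
```

Dropping the location prefix (with_location=False) gives the friendlier names, at the cost of the repeated ones overwriting each other unless the directory side gets sorted first.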
Oh, for the curious about the mav.txt that you might have seen in the thing above (full file name list obviously included as part of the spreadsheet)
Code:
Generic_Ally,base_t_idle,base_r,base_strafe_LT,base_strafe_RT,
1911_Aimdown,
NVS,