Noob Question about text editing/translations

Discussion in 'NDS - ROM Hacking and Translations' started by RemyK313, Feb 1, 2008.

  1. RemyK313

    RemyK313 Newbie

    Feb 1, 2008
    United States
    I've scoured the forums a bit, and searched through the ROM hacking 101 pdf.
    I've never done any bit of rom hacking, but I've been programming for close to a decade now.
    That is to say, I'm fairly comfortable with ARM, a bit of hex and all that.....
    In other words, I'm not coming onto the forums, and demanding a port of x game to the DS.

    So, I just wanted to ask for a few clarifications :
    1.) What hex editor do people typically use for translations?
    2.) How can I tell what character encoding a game is using?
    (For instance, how can I tell if it's using Unicode, ASCII, Shift JIS or something else?)
    3.) What steps do people typically take in a translation? I mean, is there any specific way people are doing these things, or do people just throw a rom into a hex editor and just scour through it until they hit a part that looks like a series of text sections?

    I'm interested in trying my hand at a translation, but since the search feature's been disabled, it's a bit hard for me to get any info on this...

    I'm thinking it can't be all that bad. I mean, is it as simple as finding the text sections, and replacing them with English equivalents? Or are there some kind of hardware limits that make that difficult?

    I'm interested in translating Houkago Shonen, which is an adventure game, and from what I've seen, it seems to be a fairly standard one at that. That is to say, seeing as it would have a lot of text in the form of spoken dialogue, it looks like there would just be large data sections corresponding to the game's text strings.

    However, I'm miserably new to this, so I don't know. That's why I'm asking you guys.

    I figure if someone can point me in the right direction, or at least help me get to the point where I could be replacing text, I could pop SOMETHING out. However, if there's a ton of things I completely overlooked, that's fine, too (that's why I'm a noob, right?). I just wanted some info on this, to see if it would be viable (on my schedule) to do something like that.
  2. deufeufeu

    deufeufeu GBAtemp Advanced Fan

    Nov 21, 2005
    1) hexdump to dump the data, and some script to dump using a special encoding scheme (like sjis). I never insert data with a hex editor and consider it a very bad thing to do. Keeping track of everything you've changed is tedious and overall it's a very limited technique for editing nds roms.
    2) By looking through the files used by the game. First thing to do: dump the whole rom as a sjis encoded string and replace everything that is not a sjis character with the empty string; this way you can directly see if you have a good amount of sjis text. Do the same for utf-8 (very unlikely in a ds rom), etc. But the rule of thumb is "Is it sjis?" If it is, it's quite good; otherwise you've got a lot more work to do.
    3) Here is the method I use, which is quite different from the usual one you'll get by reading tutorials.
    Step 1: Extract the data in the rom.
    Step 2: Look through the names of the files to see if there is some obvious naming, like story.pack, etc.
    Step 3: Text will always be packed in an array-like file with some header. So the first thing to do is to write scripts to extract these packs.
    Step 4: When you've found the pack format, use it to reconstruct the rom.
    Step 5: When you have a good set of scripts and have made the translation, you can release a patch.
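
    Steps 3 and 4 can be sketched roughly as follows. This is only an illustration under a made-up layout (a little-endian 32-bit entry count, then one 32-bit offset per entry, then the data); real games each use their own format, and the names unpack_archive and pack_archive are hypothetical.

```python
import struct


def unpack_archive(blob):
    """Split a pack into its entries.

    Assumed (hypothetical) layout: u32 count, `count` u32 offsets from the
    start of the file, then the entry data back to back.
    """
    (count,) = struct.unpack_from("<I", blob, 0)
    offsets = list(struct.unpack_from("<%dI" % count, blob, 4))
    offsets.append(len(blob))  # sentinel so the last entry has an end
    return [blob[offsets[i]:offsets[i + 1]] for i in range(count)]


def pack_archive(entries):
    """Rebuild a pack in the same assumed layout, recomputing every offset."""
    pos = 4 + 4 * len(entries)  # data starts right after the header
    offsets = []
    for entry in entries:
        offsets.append(pos)
        pos += len(entry)
    header = struct.pack("<I%dI" % len(entries), len(entries), *offsets)
    return header + b"".join(entries)
```

    Because the offsets are recomputed on repacking, a translated entry can be longer or shorter than the original without any hand editing of the header.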

    Of course that's the easiest case, and usually there's more work to do. For example, in FFTA2 there is one big archive without direct naming; instead, a hash function converts each filename into an integer that is used to access the file in the archive. Then each table is packed with text that uses a two-table encoding: one table global to the game and one specific to each text table. Of course, I did not work it out in that order. It was more like: extracting unknown chunks of data from the archive, getting the global table by using character names and saves, then locating the text tables. After some time I reversed the hash function in the arm9 binary, and then I could access a lot more data, like the story, which was packed quite differently.
    For Jump Ultimate Stars, it's a lot of packed tables of text in SJIS. So it was simple... except that a lot of text was missing and was in fact located in another archive file format. So I had to have several layers of extraction and reconstruction.
    The rule is: every game is different and defines its own little world of romhacking. But that's the main reason why it's interesting and not boring...
  3. FAST6191

    FAST6191 Techromancer

    pip Reporter
    Nov 21, 2005
    1) I have about 9 hex editors, listed here:

    2) DS roms have a file system (GBA and earlier tend to be one long binary as everything was mapped to the memory (give or take a bank or three)). This means you can usually guess what files are text (see deufeufeu's explanation).
    Failing that it is brute force time. Text is usually small compared to graphics so you can discern what is what that way and then run it in a cart/on an emulator.

    I work differently to deufeufeu in that I (ab)use existing tools (mainly a spreadsheet, a hex editor that can do tables, and my hex editor's search and replace function) to get the job done rather than make my own from scratch. I usually open the file in a hex editor and it is then obvious if it is Unicode, ASCII or Shift JIS (kana are mainly 82XX or 83XX). If that is the case then life is good. If not then we have some work.
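
    If you would rather have a script do the eyeballing, something like the sketch below can list runs of bytes that look like Shift JIS double-byte characters. It is a heuristic only: the byte ranges are the standard Shift JIS lead/trail ranges, but random binary data can match too, and find_sjis_runs and min_chars are names made up for this example.

```python
import re

# One Shift JIS double-byte character: lead byte 0x81-0x9F or 0xE0-0xEF,
# trail byte 0x40-0x7E or 0x80-0xFC. Kana sit under lead bytes 0x82/0x83.
SJIS_PAIR = rb"(?:[\x81-\x9f\xe0-\xef][\x40-\x7e\x80-\xfc])"


def find_sjis_runs(data, min_chars=4):
    """Return (offset, text) for each run of at least min_chars double-byte chars."""
    pattern = re.compile(SJIS_PAIR + b"{%d,}" % min_chars)
    runs = []
    for match in pattern.finditer(data):
        # Some byte pairs in range are unassigned, so decode leniently.
        text = match.group().decode("shift_jis", errors="replace")
        runs.append((match.start(), text))
    return runs
```

    Raising min_chars cuts down on false hits from graphics or code; a file that yields many long, readable runs is a good candidate for text.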

    Relative text: c follows b which follows a, etc., so even with an unknown encoding the gaps between the letters of a known word stay the same. There are tools able to search this way if you feed them a phrase. Most are 8 bit only though, and I have noticed most DS roms use 16 bit encoding (even if the first 8 bits are just 00).
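
    A relative search is simple enough to sketch yourself, including for 16 bit values. The version below assumes little-endian units and an encoding where letters are in alphabetical order; relative_search and its parameters are hypothetical names for this example, not an existing tool.

```python
def relative_search(data, phrase, width=1):
    """Find offsets where consecutive values differ like the letters of phrase.

    width is the size of one character in bytes (1 or 2), read little-endian.
    Returns (offset, shift) pairs, where shift is the stored value of the
    first character minus its ASCII code.
    """
    diffs = [ord(b) - ord(a) for a, b in zip(phrase, phrase[1:])]
    span = len(phrase) * width
    hits = []
    for start in range(len(data) - span + 1):
        values = [
            int.from_bytes(data[start + i * width:start + (i + 1) * width], "little")
            for i in range(len(phrase))
        ]
        if all(values[i + 1] - values[i] == d for i, d in enumerate(diffs)):
            hits.append((start, values[0] - ord(phrase[0])))
    return hits
```

    Use a fairly long phrase with varied letters; short or repetitive phrases match a lot of unrelated data. The shift in each hit gives a starting point for building a table.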

    Some roms (here is Final Fantasy 6 for the SNES aka FF3US: ) are a bit more complex as developers saw fit to tweak their code to make life harder.
    Such things include variables and dual/multi tile encoding (the example above has 02=TERRA). Other times games can have many tables (normally one each for menus, ingame, cutscenes... but do not count on it).
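
    To show what a table with multi-character entries does, here is a small sketch of a table-based decoder. Only the 02=TERRA entry comes from the example above; the other entries in the usage note and the function names parse_table and decode_with_table are invented for illustration.

```python
def parse_table(text):
    """Parse lines of the form HEX=string into a dict of byte value to string."""
    table = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        code, value = line.split("=", 1)
        table[int(code, 16)] = value
    return table


def decode_with_table(data, table):
    """Decode single-byte codes; unknown bytes are shown as <XX> placeholders."""
    return "".join(table.get(byte, "<%02X>" % byte) for byte in data)
```

    For instance, with the table text "02=TERRA\n20= \n41=A", the bytes 02 20 41 FF decode to "TERRA A<FF>". Leaving unknown bytes visible as placeholders makes it easier to spot control codes and variables.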

    Now we have the wonder that is pointers, which tell the game where each line/sentence starts and so stop text running into other lines/sentences. Most DS roms use them (as opposed to fixed length or parsed text) and I wrote up some here:
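
    To give a feel for why pointers matter once a translated string changes length, here is a minimal sketch. It assumes a table of little-endian 32-bit pointers measured from the start of the file, which is only one of many layouts games use; shift_pointers and its parameters are hypothetical.

```python
import struct


def shift_pointers(blob, table_offset, count, edit_offset, delta):
    """Adjust pointers after the string at edit_offset grew or shrank by delta bytes.

    Every pointer that points past edit_offset is moved by delta; pointers at or
    before the edited string are left alone.
    """
    out = bytearray(blob)
    for i in range(count):
        pos = table_offset + 4 * i
        (pointer,) = struct.unpack_from("<I", out, pos)
        if pointer > edit_offset:
            struct.pack_into("<I", out, pos, pointer + delta)
    return bytes(out)
```

    If the first string in a file grows by two bytes, every later pointer has to move up by two as well, otherwise the game starts reading the following lines from the wrong place.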

    Here is a simple text tweak I did for Rune Factory 2:

    As for the search function, GBATemp allows bots to crawl the forum, so adding GBATemp to a web search tends to work.
  4. sususu

    sususu Newbie

    Jan 30, 2008
    United States
    Are the data files encrypted in all NDS games? Do they have any common and/or known decryption methods?