translate DS games

Discussion in 'NDS - ROM Hacking and Translations' started by simonrule, Dec 22, 2009.

  1. simonrule
    OP

    simonrule Advanced Member

    Newcomer
    68
    3
    Dec 18, 2009
    United States
    Can anyone tell me how to extract text from DS games so I can translate it into another language?
     
  2. DarthNemesis

    DarthNemesis GBAtemp Maniac

    Member
    1,211
    41
    Feb 19, 2008
    United States
    1. Extract all the files using dslazy/dsbuff.
    2. Look at different files in a hex editor until you find text (it's usually Shift-JIS or Unicode encoding, so check for both).
    3. Examine the file until you've figured out how the game finds each line.
    4. Write a program to extract the text from the file and insert replacement text, recalculating any lengths/pointers.
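
    As a rough illustration of step 4, here is a minimal sketch in Python. The file layout is purely hypothetical (a little-endian 32-bit string count, then one 32-bit offset per string, then null-terminated Shift-JIS strings); real games each use their own layout, so treat `unpack_strings` and `pack_strings` as made-up names for the idea rather than a working tool.

```python
import struct

# Hypothetical layout (an assumption, not any real game's format):
#   u32 count | count x u32 offsets from file start | null-terminated strings

def unpack_strings(data: bytes, encoding: str = "shift_jis") -> list[str]:
    """Read every string the offset table points at."""
    (count,) = struct.unpack_from("<I", data, 0)
    offsets = struct.unpack_from(f"<{count}I", data, 4)
    strings = []
    for off in offsets:
        end = data.index(b"\x00", off)  # each string ends at the next 00 byte
        strings.append(data[off:end].decode(encoding))
    return strings

def pack_strings(strings: list[str], encoding: str = "shift_jis") -> bytes:
    """Rebuild the file, recalculating every offset from the new lengths."""
    header_size = 4 + 4 * len(strings)
    body = bytearray()
    offsets = []
    for s in strings:
        offsets.append(header_size + len(body))
        body += s.encode(encoding) + b"\x00"
    header = struct.pack(f"<I{len(strings)}I", len(strings), *offsets)
    return header + bytes(body)
```

    The point of the round trip is that a longer translated line pushes every later offset along, which is why the pointers have to be recalculated rather than left alone.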

    People have already done that for a few games, but there is no common file format for text, so there is no magic tool that works with every game. You generally have to figure out the game's file format yourself, so it takes dedication and motivation.
     
  3. simonrule
    OP

    I extracted the files, but how do I translate them? Tell me how I can find the text to translate.
     
  4. simonrule
    OP

    Help me please, how can I find the text to translate?
     
  5. psycoblaster

    psycoblaster Divine

    Member
    2,132
    2
    Jan 26, 2008
    Seoul.. (in Korea)
    Well, there is a magic tool that works with every game: the hex editor.
     
  6. simonrule
    OP

    Can you please tell me how I can find text with a hex editor in any game?
     
  7. luke_c

    luke_c Big Boss

    Member
    3,587
    32
    Jun 16, 2008
    Land of England
    I doubt anything anyone says is going to get through to you, so there's no point explaining anything. You're probably thinking it's just going to be a text file you can edit. It isn't.
     
  8. FAST6191

    FAST6191 Techromancer

    pip Reporter
    23,370
    9,171
    Nov 21, 2005
    I dare say you have already been told but:

    Remember the useless code you used as a kid 1=A 2=B 3=C and so on?

    Same idea here except we use hex.

    In reality it can be anything, although there are many tricks you can use to find the meaning of the text, including but not limited to luck (you may have a predefined/common encoding like Unicode, Shift-JIS or ASCII, or a very minor tweak or merging of them).
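
    For the "luck" case, a small sketch of trying a handful of common encodings on a chunk copied out of a file. The function name `try_encodings` and the list of codecs are assumptions for illustration; the codec names themselves are standard Python ones.

```python
def try_encodings(chunk: bytes,
                  encodings=("ascii", "shift_jis", "utf-16-le", "utf-8")) -> dict[str, str]:
    """Return the decoded text for every encoding that accepts the bytes."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = chunk.decode(enc)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
    return results

if __name__ == "__main__":
    sample = "こんにちは".encode("shift_jis")
    for enc, text in try_encodings(sample).items():
        print(enc, repr(text))
```

    A decode that succeeds is only a hint: random bytes often decode "successfully" as UTF-16 and give gibberish, so you still have to look at the result and judge whether it reads like language.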

    Corruption and brute force: find the area of memory responsible for displaying text on a screen. Mess with it by either randomising it or filling it with the same string and thus change the on screen display. As you control the changes you can often rapidly work out what is what.

    Linguistics: in English the space character is by far the most common character, so it will likely be the most common value in the text data. Similarly, nearly every word has a vowel in it. Two and three letter words will often be things like "the", "and", "it", "if", "to" and a multitude of other "joining" sorts of words, and you should already have the space character to mark them out. Sentences start with a capital letter, but capitals are otherwise somewhat rare. The only reason language works is because of patterns like this. A simple Japanese one is that Japanese, broadly speaking, has three sets of characters called hiragana, katakana (collectively known as kana) and kanji, and they are very much treated as three distinct sets of characters. Kanji are complex looking shapes compared to the others and in simpler games.
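
    The frequency idea can be sketched in a few lines of Python. This assumes an 8 bit encoding (one byte per character) and that you have already cut the data down to something that is mostly text; `byte_frequencies` is just an illustrative name.

```python
from collections import Counter

def byte_frequencies(data: bytes, top: int = 5) -> list[tuple[int, int]]:
    """Return the `top` most common byte values as (value, count) pairs."""
    return Counter(data).most_common(top)

if __name__ == "__main__":
    sample = b"the cat and the dog sat on the mat"
    for value, count in byte_frequencies(sample):
        print(f"{value:02X} appears {count} times")
```

    In English text the top entry is a good candidate for the space character, whatever value the game happens to use for it.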

    Computing knowledge: most character encodings, but by no means all, will encode their characters one after the other (go back to the top: A=1 B=2 C=3 and so on; the same applies in ASCII, Unicode and to some extent Shift-JIS). You can use a relative search tool to scan the file for places where the differences between consecutive hex values match the pattern you give it. crystaltile2 has one, as does "monkey moore".
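
    To show what a relative search does under the hood, here is a minimal sketch. It assumes one byte per character and a word typed in a single case, and it is not how crystaltile2 or Monkey-Moore are actually implemented; `relative_search` is a made-up name.

```python
def relative_search(data: bytes, word: str) -> list[int]:
    """Find offsets where consecutive bytes differ the same way the letters of `word` do."""
    if len(word) < 2:
        raise ValueError("need at least two letters to compare differences")
    # Differences between neighbouring letters, e.g. "kupo" -> [10, -5, -1]
    target = [ord(b) - ord(a) for a, b in zip(word, word[1:])]
    n = len(word)
    hits = []
    for i in range(len(data) - n + 1):
        window = data[i:i + n]
        if all(window[k + 1] - window[k] == target[k] for k in range(n - 1)):
            hits.append(i)
    return hits

if __name__ == "__main__":
    # Text stored with a custom table where a=0x80, b=0x81, ...
    custom = bytes(ord(c) - ord("a") + 0x80 for c in "kupo")
    print(relative_search(b"\x50\x50" + custom + b"\x00", "kupo"))
```

    Once a hit is found, the byte at that offset tells you what value the first letter maps to, and the rest of the alphabet follows from it if the table really is sequential.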
    Work in language knowledge: around 30 base characters multiplied by two for the cases, plus 10 for Arabic numerals. That is far less than the 256 values an 8 bit encoding offers and far less than the 65,536 a 16 bit encoding allows, meaning the character distribution will likely be concentrated on certain numbers, courtesy of the relative encoding thing I was on about a few lines back.
    You can twist this to work out the length of a sentence: as well as the space you also have the full stop, and in computing you will often have end of line or end of string characters (usually 00, but it does change). You can search for 00 and then, assuming you know how many bits encode each character, work out the sentence length and compare it with what you see in the game.
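
    A sketch of the terminator trick, assuming the terminator is all zero bytes and is as wide as one character (`unit` is 1 for an 8 bit encoding, 2 for a 16 bit one). `string_lengths` is an illustrative name, and any text without a terminator at the end of the data is ignored.

```python
def string_lengths(data: bytes, unit: int = 1) -> list[tuple[int, int]]:
    """Return (offset, length in characters) for each zero-terminated run."""
    terminator = b"\x00" * unit
    found = []
    start = 0
    # Step one character at a time so 16 bit data stays aligned
    for pos in range(0, len(data) - unit + 1, unit):
        if data[pos:pos + unit] == terminator:
            if pos > start:  # skip empty runs (padding, doubled terminators)
                found.append((start, (pos - start) // unit))
            start = pos + unit
    return found

if __name__ == "__main__":
    print(string_lengths(b"Hi\x00Hello\x00\x00Bye\x00"))
```

    Matching those lengths against lines you can count on screen is a quick way to confirm both the terminator and the character width.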
    Games/characters will often have phrases which get repeated (see Final Fantasy and "kupo" as a basic example) or have words with nice things to latch onto: "noooooooooooooooooooooooo" looks very obvious in code.
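
    Spotting something like a drawn out "nooooo" can also be automated. This sketch assumes one byte per character and skips runs of 00, since those are usually padding rather than text; `find_long_runs` and the default threshold of 8 are arbitrary choices for illustration.

```python
import re

def find_long_runs(data: bytes, min_len: int = 8) -> list[tuple[int, int, int]]:
    """Return (offset, byte value, run length) for each run of one repeated byte."""
    # One byte followed by at least (min_len - 1) copies of itself
    pattern = re.compile(rb"(.)\1{%d,}" % (min_len - 1), re.DOTALL)
    return [
        (m.start(), m.group(1)[0], m.end() - m.start())
        for m in pattern.finditer(data)
        if m.group(1) != b"\x00"  # ignore zero padding
    ]

if __name__ == "__main__":
    blob = b"\x00" * 16 + b"n" + b"o" * 20 + b"!"
    print(find_long_runs(blob))
```

    If you know which line of dialogue has the long run in it, the offset gives you a foothold in the file and the byte value gives you one letter of the table.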

    I would go in depth and detail the countless other tricks available, but it has been covered before in guides written by everyone who has thus far replied to you.
    You take