How do you handle a game with a custom encoding table?

Discussion in 'NDS - ROM Hacking and Translations' started by jjjewel, Jun 15, 2010.

Jun 15, 2010
  1. jjjewel
    OP

    Member jjjewel GBAtemp Maniac

    Joined:
    Dec 17, 2009
    Messages:
    1,004
    Country:
    United States
    Well, as the topic says, I'm trying to extract the scripts from Vampire Knight DS, but it uses a very custom table. First, the hiragana and katakana don't follow any standard encoding system.

    For example, Unicode, UTF-8 and Shift-JIS all have the characters in this order: ? ? ? ? ? ?, but this game's encoding is like
    9211=? 9212=? 9213=? 9214=? 9215=?
    9216=? 9217=? 9218=? 9219=? 921A=?
    921B=? 921C=? 921D=? 921E=? 921F=?
    and so on.

    Then the kanji are even worse. I believe the developer of Vampire Knight DS just reused the table from Hoshizora no Comic Garden, because the kanji in the table are arranged in the order they appear in the Hoshizora no Comic Garden game. So I couldn't use any standard table to automatically create the kanji table. The only thing I could do was extract a Hoshizora no Comic Garden script, replace the known katakana and hiragana, then play the game to find out which kanji codes it used for that part.

    For example, here's the raw script (extracted from Hoshizora no Comic Garden):
    Code:
    9229-925E-9229-925E-922A-9239-9416-
    
    947C-947C-9213-9263-9416-
    
    9229-925D-923A-9226-9232-9251-9415-
    9245-922F-925B-923D-964E-9526-923A-923F-9252-922F-925B-923D-
    9852-9782-9226-9212-9829-922A-923C-9416-
    Then I replaced the script with known values and got this:
    Code:
    ???????
    
    ......???
    
    ???????
    ????964E9526??????
    98529782??9829???
    Then I'll need to play the game until I reach the part where a character says these lines, and map the values 964E, 9526, 9852, 9782 and 9829 to the kanji that appear on the game screen.
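
    A minimal Python sketch of the substitution step described above. The table entries are placeholders only (the letters stand in for kana and are not the game's real mapping), and the helper names are made up for illustration. It replaces every known code and leaves unknown ones as hex, then lists the codes that still need to be identified on screen.

```python
from collections import Counter

# Placeholder table: the letters stand in for real kana and are NOT the
# game's actual mapping. Fill in real glyphs as they are identified.
KNOWN = {0x9226: "a", 0x9212: "b", 0x922A: "c", 0x923C: "d", 0x9416: "."}

def parse_codes(raw_line):
    """Turn '9852-9782-...-' into a list of integer codes."""
    return [int(tok, 16) for tok in raw_line.strip().split("-") if tok]

def partial_decode(raw_line, table):
    """Replace known codes with glyphs; leave unknown ones as 4-digit hex."""
    return "".join(table.get(c, f"{c:04X}") for c in parse_codes(raw_line))

def unknown_codes(raw_lines, table):
    """Count every code missing from the table, most frequent first."""
    counts = Counter(c for line in raw_lines for c in parse_codes(line)
                     if c not in table)
    return counts.most_common()

line = "9852-9782-9226-9212-9829-922A-923C-9416-"
print(partial_decode(line, KNOWN))
```

    Sorting the unknown codes by frequency is just one way to decide which kanji to hunt for first; it does not remove the need to see each one in the game or in the font.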

    It will take forever to complete this, I guess. The worst thing is that in Vampire Knight the scripts don't seem to be stored in the order they appear in the game. They are stored in order of their length (character count), which makes hacking even more difficult. T_T (Not to mention that you'll need to play Hoshizora no Comic Garden first to make the encoding table before you can get the kanji table for Vampire Knight.)

    So, before I totally give up, I just wonder if there's any easy way to handle table making. The other games I tried to hack always used a standard encoding or only a slightly modified table, so I have no experience with custom tables at all. Is there supposed to be something stored in the game that tells you how the characters are mapped? I'm not sure if I missed something and made the hacking more difficult than necessary, or if the game is really super hard to hack.

    P.S. The whole dialog scripts for both games are stored in arm9.bin.
     


  2. DarthNemesis

    Member DarthNemesis GBAtemp Maniac

    Joined:
    Feb 19, 2008
    Messages:
    1,208
    Country:
    United States
    Generally people find the graphics for the font and dump them into a bitmap so they can all be identified beforehand.
     
  3. jjjewel
    OP

    The font is also in a weird format. I'll post the link here in case anyone wants to take a look.

    The ASCII letters at the top can be viewed as solid 1bpp in CT2 at width x height 8x10, but for the kana and kanji the closest I could get was GB 2bpp at 8x10, where the glyphs are blurry and I couldn't even see whole characters. (I did try to guess some of them and put them in the table, but it was very difficult, especially for the complicated kanji.) Still, I wonder: even if I can view the font, do I really need to make the encoding table manually from it? (I'm trying to find out if there's an easier way.)

    Here's the font file if anyone wants to give it a try.
    http://www.mediafire.com/file/wityyjzmwqq/FDT.bin

    And the font is compressed (.R00 file) with some compression type that CT2 can't extract. I got a program to extract it from a Chinese website, but I have no idea what the compression is called.

    Here's the original file for the font. If anyone knows something about compression and can figure out what this compression is called, please let me know. I'd really appreciate it.

    Compressed font (FDT.R00)
    http://www.mediafire.com/file/wxydwmeuzly/FDT.R00
     
  4. akamepi

    Newcomer akamepi Newbie

    Joined:
    Aug 18, 2010
    Messages:
    7
    Country:
    Indonesia
    jjjewel, does this mean that you are going to translate Vampire Knight DS? If so, that would be reaaaally cooool!!! I can't speak Japanese. I can't hack ROMs (not yet, hopefully in the near future). But I still want to help if this project becomes real, because I really really really want to play VK in English. I even started to learn ROM hacking. It is soooo difficult, but I'm not gonna give up!! Not until VK is fully translated. So, if you are really serious about translating VK, please let me know if I can do something to help.

    Sorry for my bad English.
     
  5. jjjewel
    OP

    No. From the information that I posted, it means that I should give up on this game. T_T

    To hack this game, I'd have to manually identify a few thousand kanji to extract the scripts. The game also uses compressed graphics, and I don't know the compression type, so I can't do a thing with its graphics. Then the scripts are in arm9.bin, which is difficult to modify. And the scripts aren't sorted in the order they appear in the game, but in order of their length. Adding all these together, I could finally give up on this game peacefully. (These are just the preliminary hacking steps, not including translation and editing yet.)

    Anyway, I know there are people on a Chinese site who are currently translating this game into Chinese. So if anyone is interested in hacking this game, you might want to find someone who can communicate in Chinese and ask for information on the Chinese site. (As far as Google Translate could tell me from their conversations, they have already worked out the font table and the compression.)
     
  6. rastsan

    Member rastsan 8 baller, Death Wizard

    Joined:
    May 28, 2008
    Messages:
    963
    Location:
    toronto
    Country:
    Canada
    Time to get yourself some optical character recognition software. (I recommend Nuance software.)
    This is pretty much what I have and am doing for 7th Dragon. The font is literally repeated throughout the game in bin files along with the text, except each bin (that has a font in it) has a slightly different one, so encoding is really difficult: one table for one size of font. I gave up on hunting and pecking and making my own custom table from another table, and have been slowly and patiently trying to do it through OCR.
    The good news is it gets easier, IF you plan on dedicating yourself to it.
    And yes, I have seen that file type before, on PPC (Windows Mobile). Check and see if this game is a direct port from that platform. The DS is compatible with that architecture, and the NDS SDK actually makes it pretty easy. (It says you just have to get rid of certain types of variables.)

    What you have got yourself there is a multicolor font. This isn't the first time I have seen this either. There is a way to view it right; you have to play with it though.
    7th Dragon has a slightly similar thing, which is also the font I was talking about before.
    It has the 8 by 8 regular, then the 10 by 12 regular, then the 12 by 12 multicolor, all stacked up beside each other.

    I can see how you might get confused. Yes, you can see it more clearly in GB, but try solid 1bpp, offset 61A and size 16 by 20. I know, two letters in one tile ???? Yeah, but it works. When I get time later I'll use the OCR software to dump it. Then all you have to do is match the numbers to the letters.
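
    For anyone who wants to look at the tiles without a tile viewer, here is a rough Python sketch of a 1bpp dump using the settings suggested above (offset 61A, 16 by 20). The bit order (most significant bit is the leftmost pixel) and the row-by-row layout are assumptions, and `render_1bpp` is just an illustrative helper name.

```python
def render_1bpp(data, offset, width, height, index):
    """Render tile number `index` as text rows ('#' = set pixel).

    Assumes 1 bit per pixel, rows stored top to bottom, and the most
    significant bit of each byte being the leftmost pixel.
    """
    row_bytes = (width + 7) // 8          # bytes needed for one pixel row
    tile_size = row_bytes * height        # bytes per tile
    start = offset + index * tile_size
    rows = []
    for y in range(height):
        row = data[start + y * row_bytes:start + (y + 1) * row_bytes]
        bits = "".join(f"{b:08b}" for b in row)[:width]
        rows.append(bits.replace("0", ".").replace("1", "#"))
    return rows

# Possible use with the decompressed font file from this thread:
#   with open("FDT.bin", "rb") as f:
#       font = f.read()
#   print("\n".join(render_1bpp(font, 0x61A, 16, 20, 0)))
```

    If the glyphs come out mirrored or sheared, the bit order or the tile width is the first thing to change.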
     
  7. FAST6191

    Reporter FAST6191 Techromancer

    Joined:
    Nov 21, 2005
    Messages:
    21,706
    Country:
    United Kingdom
    "standard table to automatically create the table for Kanji's"

    It seems I am a bit late to the party (and have nothing more to offer really*), but what program/table maker do you have for adding kanji to tables, and what ordering methods (not that there really is one) does it support (number of strokes and the like)?
    I have several programs to handle adding lots of Roman characters, kana and other languages/subsets of languages with small numbers/defined orders of glyphs, but nothing for serious kanji use (it is mainly one by one, or I dump a known encoding in there and play with it from there).

    *Interesting to see reuse like this. Obviously we have the common encodings and things like the Capcom table, but reuse across somewhat different games is new to me. My piece of advice here is that the order can be random, but more likely there is a pattern of sorts. I have seen things like order based on what appears first in the script and on count in the script (most common first to least common, or vice versa).

    Edit: as for OCR, CrystalTile2 does have limited support. I would hope it has a Gothic typeface rather than Kaisho or Gyosho (or worse) though.
     
  8. jjjewel
    OP

    To rastsan;
    This game uses a small font (the LC10 font), which is probably difficult for an OCR program. (This font is quite difficult to read to begin with.) And this game was originally created for the Nintendo DS. I traced back through the company's game history until I found out that their previous games stored text in arm9.bin; otherwise I would never have imagined that all the scripts would be stored there.
    Thank you very much for your information and suggestions. But for now this game seems to be too much for me to handle. T_T


    Well, I used the Shift-JIS table as a base. Some games like Kurayami no hate de kimi wo matsu, Bakumatsu renka shinsengumi or Signal DS use tables that are close to Shift-JIS. I just added or subtracted some numbers to create the encoding tables for these games. But that's not the case for Vampire Knight, where the kana and kanji are all quite random.

    Ex. standard Shift-JIS
    Warning: Spoilers inside!

    Ex. Bakumatsu Renka Shinsengumi DS (Same table as Shakugan no Shana DS)
    Warning: Spoilers inside!

    Ex. Signal DS
    Warning: Spoilers inside!
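
    For the games whose tables are just Shift-JIS shifted by a constant, the "add or subtract some numbers" step can be sketched in Python like this. The function name and the delta used in the example are hypothetical; the real offset has to be worked out per game, and this does not apply to Vampire Knight.

```python
def build_offset_table(first, last, delta):
    """Map game codes to characters, assuming game_code = sjis_code + delta.

    `first` and `last` are two-byte Shift-JIS codes (inclusive range).
    Codes that are not valid Shift-JIS are skipped.
    """
    table = {}
    for sjis in range(first, last + 1):
        try:
            ch = sjis.to_bytes(2, "big").decode("shift_jis")
        except UnicodeDecodeError:
            continue                      # hole in the Shift-JIS layout
        if len(ch) == 1:                  # ignore pairs of single-byte chars
            table[sjis + delta] = ch
    return table

# Hiragana block of standard Shift-JIS, shifted down by a made-up 0x100.
hiragana = build_offset_table(0x829F, 0x82F1, -0x100)
```

    A single delta only works while the game keeps the Shift-JIS ordering; if the offset changes between blocks, build one table per block and merge the dictionaries.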
     
  9. rastsan

    Actually, not a big deal. For difficult cases I use the one-by-one export feature of CT2, then batch enlarge and change the DPI quality at the same time, then use the OCR letter by letter. But that's only for the characters the OCR software doesn't get the first time around with one big OCR dump.

    There are a couple of table sites that talk about custom encodings and those international standards: the small differences, and the weird reasons some countries (and thus companies) have for doing custom tables.
    And efforts to get things unified.

    Okay, if someone else is going to pick up Vampire Knight DS as a project, PM me and I'll do up your dump for the characters.
     
  10. jjjewel
    OP

    A bit more info in case anyone wants to try hacking it.

    Table for English ASCII (half-width) font
    (For Vampire Knight DS and Hoshizora no Comic Garden)

    10-1F = common signs (!, #, ., etc.)
    20-29 = 0-9
    2A-30 = some more signs (:, ;, ?, etc.)
    31-4A = A-Z
    4B-50 = more signs ([, ], ^, etc.)
    51-6A = a-z
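
    The digit and letter ranges above can be turned into a lookup table with a few lines of Python. This sketch assumes the codes inside each range run in the usual 0-9 / A-Z / a-z order, and it leaves the sign ranges unmapped because only a few examples of them are listed. The function names are illustrative.

```python
def build_halfwidth_table():
    """Code -> character table for the digit and letter ranges only."""
    table = {}
    for i in range(10):
        table[0x20 + i] = chr(ord("0") + i)      # 20-29 = 0-9
    for i in range(26):
        table[0x31 + i] = chr(ord("A") + i)      # 31-4A = A-Z
        table[0x51 + i] = chr(ord("a") + i)      # 51-6A = a-z
    return table

def encode(text, table):
    """Encode text to game bytes; raises KeyError on unmapped characters."""
    reverse = {ch: code for code, ch in table.items()}
    return bytes(reverse[ch] for ch in text)

def decode(raw, table):
    """Decode game bytes; unmapped codes are shown as <XX>."""
    return "".join(table.get(b, f"<{b:02X}>") for b in raw)

halfwidth = build_halfwidth_table()
print(encode("Please", halfwidth).hex(" "))
```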

    You can try it with Vampire Knight's arm9.bin.
    In the picture below, I just entered the ASCII codes at hex offset 001B8ED0.
    That is the text "Please input name." at the bottom left of the screen.

    Anyway, there are still big problems with script extraction/insertion
    and graphic compression that you have to deal with if you want to fully hack it.

    [​IMG]


    Edit: Some more info here:

    The text at game start is at 128144 (with its pointer at 0AF2CC).
    So if you read values starting from 000AF2CC, 4 bytes at a time,
    you should be able to extract some text there.

    Ex. For Vampire Knight's arm9.bin, at 000AF2CC you'll get the bytes 44-81-12-02.
    Read the first 3 bytes backward and you'll get 12-81-44. That's the address
    where the text starts.

    The next one is 4 bytes further on, at AF2D0, where you'll get 64-AC-12-02.
    Read it backward to get 12AC64 for the next text address, etc.
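
    The same pointer reading can be sketched in Python. It treats each entry as a 4-byte little-endian value and assumes the trailing 02 is the top byte of a 0x02000000-based RAM address, so subtracting that base gives the offset inside arm9.bin. It also assumes a text ends at the first 00 byte, which could cut a text short if a two-byte code ever contains 00. The function names are made up for illustration.

```python
import struct

RAM_BASE = 0x02000000   # assumed load address of arm9.bin

def read_pointer(data, offset):
    """Read one 4-byte little-endian pointer and return a file offset."""
    (value,) = struct.unpack_from("<I", data, offset)
    return value - RAM_BASE

def read_text(data, offset):
    """Return the raw bytes from `offset` up to (not including) the next 00."""
    end = data.index(b"\x00", offset)
    return data[offset:end]

def dump_texts(data, table_offset, count):
    """Yield (text_offset, raw_bytes) for `count` consecutive pointers."""
    for i in range(count):
        text_offset = read_pointer(data, table_offset + 4 * i)
        yield text_offset, read_text(data, text_offset)

# The two pointer values quoted in the post above:
sample = bytes.fromhex("44811202" "64AC1202")
print(hex(read_pointer(sample, 0)), hex(read_pointer(sample, 4)))
```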

    Some codes in the game:
    01030x = Choices
    (It can be 010301, 010302, 010303, etc., depending on how many choices there are.)

    010A020001 = Your Last Name
    010A020002 = Your First Name

    This game is very fun to hack, but I don't have any plans to translate it.
    (I'm just hacking it for fun.) But if anyone can translate it and just
    needs help with hacking, I might be able to help. (I can't really guarantee it, though.
    I can extract some text now, but I'm not sure how difficult it is to
    insert the text back without messing anything up. Plus, I still have no
    idea about the compression type used for this game's graphics.)
     
  11. CantStrafeRight

    Newcomer CantStrafeRight Newbie

    Joined:
    Jan 5, 2011
    Messages:
    7
    Country:
    United Kingdom
    I've been working on a translation of this game for a friend,
    and I've been speaking to jjjewel (the OP). To cut a long story short, I said I would post a guide on RAM hacking, with emphasis on hacking Vampire Knight DS's memory.



    I'll be using DeSmuME 0.9.6 for this guide. I tried a few other emulators, but I found this the best for RAM hacking (in my opinion, that is). You could also use an emulator that can make uncompressed savestates and just open them in a hex editor, but I don't recommend that.





    So open DeSmuME, load your ROM, and go into Tools > View Memory.
    You should see something like this:

    [​IMG]

    Two things to note:
    1 The DS loads the ARM9 file into RAM when it starts, so what's currently in RAM is mostly the ARM9 file, with all the addresses jjjewel mentioned exactly the same, except....
    2 The first address is 02000000, so you need to add this for jjjewel's addresses to work.

    Start a new game and, in the memory viewer, go to address 022e9058.

    [​IMG]

    Here you will see where your character's name is stored. It's 2 bytes per character, with a 00 byte between the first and last name.
    If you change your character's name, for example to 111 111, you can see the memory viewer update in real time.

    [​IMG]

    This means you can use this to find a character's value within the game.
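
    If you copy the bytes out of the memory viewer, a small Python helper can split them back into character codes. This is only a sketch: it assumes the two bytes of each character are read in the order the viewer shows them, and that the last name is also followed by a 00 byte. The sample bytes use 9372 for '1', the value quoted elsewhere in this thread.

```python
def split_name_codes(buf):
    """Split a name buffer into [first_name_codes, last_name_codes].

    Each character is taken as 2 bytes; a single 00 byte ends a name.
    """
    names, current, i = [], [], 0
    while i < len(buf) and len(names) < 2:
        if buf[i] == 0x00:
            names.append(current)         # end of one name
            current = []
            i += 1
        else:
            current.append(int.from_bytes(buf[i:i + 2], "big"))
            i += 2
    return names

# Name "111 111" as it might appear in memory (assumed layout).
dump = bytes.fromhex("937293729372" "00" "937293729372" "00")
first, last = split_name_codes(dump)
print([f"{c:04X}" for c in first], [f"{c:04X}" for c in last])
```

    Entering one unknown character at a time in the name screen and reading its code this way gives you table entries without needing to find that character in the script.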







    But what if you know the value but not what the character is, or you want to test a line to make sure it looks right?
    As well as viewing the memory, you can change it.

    VERY IMPORTANT: Go back out of the name select screen and go back in, BUT do not select anything more.

    The game normally only updates on-screen text when it's supposed to change, except here. If you go into the change name screen and don't select anything, it updates the text on the bottom left every frame, meaning you can update it in real time. Nowhere else in the game does this.

    So taking the address jjjewel found that stores the "Please input name." text (001B8ED0) and adding it to our starting address (02000000), we get the address 021B8ED0.
    By simply editing the values at that address we can make the game display whatever line we want. (00 is used to tell the game that the chunk of text is over.)
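
    The address arithmetic is simple enough to do by hand, but here it is as a small Python sketch for converting in both directions. The helper names are illustrative, and the conversion only holds for data that really comes from the ARM9 file loaded at 02000000.

```python
ARM9_RAM_BASE = 0x02000000   # first RAM address, per the notes above

def file_to_ram(file_offset):
    """Offset inside arm9.bin -> address to type into the memory viewer."""
    return ARM9_RAM_BASE + file_offset

def ram_to_file(ram_address):
    """Memory viewer address -> offset inside arm9.bin."""
    if ram_address < ARM9_RAM_BASE:
        raise ValueError("address is below the ARM9 load address")
    return ram_address - ARM9_RAM_BASE

# Offset of the "Please input name." text given earlier in the thread.
print(f"{file_to_ram(0x001B8ED0):08X}")
```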

    [​IMG]

    It's also possible to change the in-game text and character name during the game, but you have to force it to redraw the modified text.
    You can do this by moving to the next line of dialog, or by going into Menu > Options > back to game.





    When I get some more time I'll write up a little more about how to find memory addresses, and I'll list some addresses that could be useful to anyone looking into this game.
     
  12. jjjewel
    OP

    Wow, thank you very much, CantStrafeRight. Nice tutorial, and everything is easy to understand with the explanations and screenshots.

    I've never tried this method before. Maybe I can try it on some other games too.
     
  13. FAST6191

    Nice work CantStrafeRight- I have been toying with using saves (the result of character name entry) to figure out table encodings and forgot most of these entry screens are auto updated (I probably should have remembered as I tried to use one to demonstrate OAM handling once and faced auto updates there as well).
     
  14. CantStrafeRight

    Sorry it's taken me a while to get back to this, but some of my other projects have needed all my time.


    So in this part I'm going to explain how to find memory addresses in the first place.

    Before I get into this, it's important to understand that some things stored in memory will be at the same address every time, but other things will not. It all depends on how the programmer chose to do things.

    Things that do move around will have a pointer, but I haven't yet been able to find information on how the DS does pointers, so in this guide I'm only going to get into the stuff that stays at the same address.


    =====================


    So let's say I didn't know where in memory the game stored the player's name and wanted to find it.

    I open up DeSmuME, load the ROM and go to the enter name screen. Now enter a character whose value we know as the first character of the name. For this example I'll use '1'.
    Then go into the Emulation menu > Cheats > Search.

    [​IMG]

    IMHO the byte size stuff is broken and should be left alone. If you want to search for a value that is more than one byte, I would just search for one byte of the value.

    Select exact value search and click on Search.
    We want to look for a memory address that is storing the value that represents '1'.
    We know '1' is represented by 9372, but as I said earlier it's best to search for just one byte, so we will search for the last byte, 72.
    Of course 72 is the hex value, and DeSmuME searches in decimal, so we have to convert it to decimal (114).
    So put it in and search. (This searches for memory addresses that currently store the number 114.)

    [​IMG]

    You may not get the same number of results as me, but it will be close.
    Close the search box, change the '1' to a '2', and repeat the search, but this time search for 115 (the last byte of how '2' is represented).
    There's probably only one result now.
    Unless you reset the search, it will only search the addresses that were left from previous searches. So by repeatedly searching for what is currently in the address you are looking for, you can eliminate all the other addresses.
    Once you're down to a few addresses or just one, you can click on the view addresses button, take a note of the addresses, and look them up in the memory viewer like in my previous post.
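
    What the exact-value search is doing can be shown with a tiny Python model. This is not how the emulator is implemented, just a sketch of the narrowing idea: each search keeps only the addresses from the previous search that now hold the value you asked for. The memory snapshots here are made-up 16-byte arrays.

```python
def exact_search(memory, value, candidates=None):
    """Keep the addresses whose byte equals `value`.

    `candidates` is the result of the previous search; None means
    "start a new search over all of memory".
    """
    if candidates is None:
        candidates = range(len(memory))
    return [addr for addr in candidates if memory[addr] == value]

# Hex 72 is decimal 114, hex 73 is decimal 115.
assert int("72", 16) == 114 and int("73", 16) == 115

# Made-up snapshots: address 9 is the name byte, address 3 is a coincidence.
with_1 = bytearray(16)
with_1[3] = 0x72
with_1[9] = 0x72
with_2 = bytearray(with_1)
with_2[9] = 0x73                 # the name changed from '1' to '2'

hits = exact_search(with_1, 114)          # first search
hits = exact_search(with_2, 115, hits)    # second search, narrowed
print(hits)
```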

    [​IMG]

    ========================

    So let's say you don't know the exact number stored in the address you're looking for.
    This is when you use the comparative search.

    With the comparative search you tell it whether the value in the address has increased, decreased, stayed the same or changed since your last search.

    For example:

    You could enter '1' as the first character of your name and start the comparative search.
    Then do a search for a value that is still the same (it's still '1', as you haven't changed it).
    Change the first character to a '2' and search for a value that has increased (the value that represents '2' is most probably higher than the value that represents '1').
    Then do a search for a value that is still the same.
    Change it back to a '1' and do a search for a value that has decreased.

    Eventually you will find the address you are looking for. If you don't, you either made a mistake or the value is not always at the same address.
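
    The comparative search can be modelled the same way, by comparing two memory snapshots instead of matching a known value. Again this is only a Python sketch of the idea with made-up data, not the emulator's code.

```python
import operator

# How the current byte must relate to the previous one for each mode.
COMPARE = {
    "same": operator.eq,
    "different": operator.ne,
    "increased": operator.gt,
    "decreased": operator.lt,
}

def comparative_search(previous, current, mode, candidates=None):
    """Keep addresses where current[addr] <mode> previous[addr] holds."""
    compare = COMPARE[mode]
    if candidates is None:
        candidates = range(len(current))
    return [a for a in candidates if compare(current[a], previous[a])]

# Made-up 4-byte snapshots; address 1 holds the name byte.
snap0 = bytes([5, 0x72, 9, 0x72])   # name starts with '1'
snap1 = bytes([5, 0x72, 8, 0x72])   # nothing typed, an unrelated byte moved
snap2 = bytes([5, 0x73, 8, 0x72])   # name changed to '2'

left = comparative_search(snap0, snap1, "same")
left = comparative_search(snap1, snap2, "increased", left)
print(left)
```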
     
  15. jjjewel
    OP

    Thank you very much, CantStrafeRight. Yeah, it makes sense to me. I've seen relative search in a few hex editors and emulators, but I never knew how to use it. And I thought the cheat search was only used for cheating, so I never touched it. Now I feel a bit enlightened.
    Thank you.

    Let me know if there's anything I can help with.

    For the text insertion, I've been wondering if it's possible to relocate all the texts based on the order they appear in the game instead of by text length.

    For example, you have texts stored as Text001, Text002, Text003, Text004, ... But from the arm9 pointers, you should be able to find out the order in which the texts appear. Then you can swap all the texts around and update your pointers so the texts are in order. This way, it will be much easier to make a text extractor/inserter. (Since you don't change any text length here, the swapped texts will still take up the same total space.)

    Anyway, the problem is that you have to check whether there is anything else in between these texts. For example, if you have Text001, Text002, Code001, Text003, Code002, Text004, ... you might mess something up when you swap the texts.
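
    A rough Python sketch of the swap idea, under several assumptions: pointers are 4-byte little-endian RAM addresses based at 0x02000000, every text ends with a 00 byte, and the region being rewritten contains nothing except the listed texts. The length check is only a crude guard against the "Code001 in between" problem, and the function name is made up.

```python
import struct

RAM_BASE = 0x02000000   # assumed load address of arm9.bin

def reorder_texts(data, pointer_offsets, region_start, region_end):
    """Rewrite data[region_start:region_end] so texts follow pointer order.

    `data` is a bytearray of arm9.bin; `pointer_offsets` lists the file
    offsets of the pointers in the order the game uses them.
    """
    texts = []
    for p in pointer_offsets:
        (addr,) = struct.unpack_from("<I", data, p)
        start = addr - RAM_BASE
        if not region_start <= start < region_end:
            raise ValueError(f"pointer at {p:#x} targets outside the region")
        end = data.index(b"\x00", start) + 1      # keep the terminator
        texts.append(bytes(data[start:end]))
    if sum(len(t) for t in texts) != region_end - region_start:
        raise ValueError("region holds something besides the listed texts")
    cursor = region_start
    for p, text in zip(pointer_offsets, texts):
        data[cursor:cursor + len(text)] = text
        struct.pack_into("<I", data, p, cursor + RAM_BASE)
        cursor += len(text)

# Tiny made-up image: two pointers, then "BB" and "A" stored longest first.
image = bytearray(struct.pack("<II", RAM_BASE + 11, RAM_BASE + 8) + b"BB\x00A\x00")
reorder_texts(image, [0, 4], 8, 13)
```

    Work on a copy of arm9.bin; if the check fails, the region needs a closer look before anything is moved.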
     
  16. CantStrafeRight

    I'm under the impression that the game's text branches, and because it uses the same lines of dialog in different branches, it's impossible to have the whole script in order. (I could be wrong, but that's how it looks to me.)

    I was thinking about making a program that would extract the script and put it in order, and then put it back in once it's been translated. I could try this on the first section of the game to see if it works, and then try doing bigger chunks of the game.

    Maybe it's a waste of time, but I'd really like to be able to find out which character is saying each line. I would imagine translating without that could be a pain, but so far I've not found any hints as to how it's stored.
     
  17. jjjewel
    OP

    Ah, I missed your post. Didn't notice it until today.

    Yeah, then it will be a bit difficult to rearrange the text. But are you sure the game reuses the same lines? I can see that many texts are repeated many times.

    For example, if you start from 0x0B2CE0 in arm9.bin, you will find some short texts there. Some words like ??, ??, ?? are repeated many times, and each copy has only one pointer pointing to it. If the game reused the texts, there should be just one occurrence of each text with many pointers pointing to it. Anyway, I only did a brief check, so I might have missed something.

    Also, I suspected that the speakers' names shown in the dialog box are around this area too. You'll see some occurrences of words without any punctuation, e.g. ??, ??, ??, ???, ??, etc., around this area. But then, for example, the word ??? without any punctuation (its hex code is 9422745F00) only appears once in the whole arm9.bin and only has one pointer pointing to it. So it doesn't fit as a dialog speaker.

    Then at 0x0B46EC you will find Kaname's name (????, hex code A46DA33A9210A064), but only one pointer in the whole arm9.bin points to this name. (Kaname appears a lot in the game, so there should be at least hundreds of dialog lines with him as the speaker.)
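
    This kind of check (how often a byte string occurs, and how many pointers point at each copy) can be scripted. The Python sketch below assumes pointers are 4-byte little-endian values based at 0x02000000 and stored on 4-byte boundaries; both helper names are made up.

```python
import struct

RAM_BASE = 0x02000000   # assumed load address of arm9.bin

def find_all(data, pattern):
    """Return every offset at which `pattern` occurs in `data`."""
    hits, pos = [], data.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = data.find(pattern, pos + 1)
    return hits

def pointers_to(data, target_offset, align=4):
    """Return offsets of aligned 4-byte values that point at target_offset."""
    needle = struct.pack("<I", target_offset + RAM_BASE)
    return [pos for pos in find_all(data, needle) if pos % align == 0]

# Made-up image: one pointer, padding, then the name bytes quoted above.
name = bytes.fromhex("A46DA33A9210A064")
image = struct.pack("<I", RAM_BASE + 8) + b"\xff" * 4 + name + b"\x00"
for offset in find_all(image, name):
    print(hex(offset), "pointed to by", pointers_to(image, offset))
```

    A byte match on the pointer value can be a coincidence, so treat the hits as candidates rather than proof.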

    So, there might be other pointers that point to the pointer for Kaname's name. But I still don't see anything that supports this theory.

    Anyway, if you keep working on it, you should be able to figure it out one day. Looking at other projects' discussion threads or helping out with other projects will help too. (Just a few days ago, while helping look at someone else's project, I figured out something about a game I've been trying to hack for 14 months.)
     
  18. CantStrafeRight

    It wouldn't store the character that's speaking the way it stores the script. If I were programming it, I would use one number to represent each character and a second number to represent whether they should appear and what facial expression they should have.


    I'll make some time tomorrow to sit down with the game and look into stuff, and I'll get back to you if I find anything interesting.
     
  19. jjjewel
    OP

    You might consider checking Hoshizora no Comic Garden's arm9, too. It uses almost the same font table, except for some kanji. Anyway, I don't know if it will help, but there might be something similar between the 2 games.

    Also, there is an old game called Simple DS Series Vol. 11 - Mou Ichido Kayoeru - The Otona No Shougakkou. It was made by the same company. Some things in it are similar to Vampire Knight and Hoshizora. (Not the same, but it might give you some ideas. It was this game that made me realize that the whole game scripts are stored in arm9.bin.)
     
  20. CantStrafeRight

    I made a program that counted all the pointers and checked whether any of them pointed to the same address, and none do. While there are still pointers that point to other pointers, no two point to exactly the same address. (So it looks like I was wrong about how it may reuse text.)
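
    The program itself isn't posted here, so the following is not it, just a minimal Python sketch of the same kind of check. It assumes the pointers sit in one contiguous table of 4-byte little-endian values; the table offset and entry count are whatever you pass in.

```python
import struct
from collections import Counter

def shared_targets(data, table_offset, count):
    """Return {pointer_value: occurrences} for values used more than once."""
    pointers = struct.unpack_from(f"<{count}I", data, table_offset)
    counts = Counter(pointers)
    return {addr: n for addr, n in counts.items() if n > 1}

# Made-up table in which the first and third entries are identical.
table = struct.pack("<4I", 0x02128144, 0x0212AC64, 0x02128144, 0x02130000)
print(shared_targets(table, 0, 4))
```

    An empty result means no two entries in that table share a target, which matches the finding above.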

    It looks like some may point to stuff before the list of pointers, so it might be worth looking in there for useful stuff.


    I took a look at Hoshizora no Comic Garden, and it looks like wherever it stores who is speaking, it also stores whether they are on the right or the left screen.


    I was thinking about trying to extract all the script up to the point where you get to move around, and then we could build up the encoding table to cover all the symbols in it, as a sort of test run.

    I was also thinking about setting up an online file we could both edit and keeping an encoding table there, so we can both update it and use it.




    BTW, do you know of any sites that have pictures of the Vampire Knight characters along with their names? While I can work some of them out, I have no idea who the rest are.
     
