ROM Hack text extractors/inserters

azerty1

Well-Known Member
OP
Member
Joined
Mar 22, 2009
Messages
160
Trophies
0
Age
29
Website
Visit site
XP
99
Country
Canada
I've been playing around with the Tactical Guild ROM in an attempt to learn how to hack ROMs to translate them. Finding the text and graphics was rather easy, and replacing the text was also simple. However, I'm not quite sure how to make a text extractor and inserter specific to a game. I do have a lot of experience with programming, but all I have ever done is solve math problems on TopCoder.
Could anyone provide a link to a guide to coding these, or give me an explanation of what is needed?
 

Kazetsukai

Member
Newcomer
Joined
Aug 18, 2009
Messages
15
Trophies
0
Location
Hamilton, New Zealand
XP
2
Country
New Zealand
Haha, we just started trying to hack Tacticslayer. I think they are both by Ninjastudio, so chances are they use the same format. It is a stream of codes: each code eats a certain number of bytes (or in some cases terminated strings), then the game reads another code. I can't think of a way to write an extractor/inserter without knowing all the possible codes, or at least how many bytes each code eats.

Have a look at the thread I started Here

What language do you code in?
 

azerty1

I can code in C++ as well as I can in Java, but I started in Java, so that's my main, really...

The thing was that, from what I could see, there were about 350 files in the scenario folder with the script in them, all .dat files, so I just went and extracted the whole thing manually into one file and was wondering if it would be possible to write something to extract and insert text into that... Presumably I'd also have to figure out how it reads the text, but I was hoping there'd be some generalized method of doing so...

I looked at your spreadsheet a bit, but I don't quite see how most of that has to do with the text itself... Oh, by the way, I have little to no experience with ROM hacking, so I'm more or less fumbling around with the tools my sister left around on the computer...
 

Kazetsukai

Yup that sounds familiar, will have a look at Tactical Guild later today.

I am using C#, which is very similar to Java in style. The important thing you need to be able to do to make an extractor is load a file and read it byte by byte.
From there it would be a matter of figuring out how big the header is, where it ends and the codes for controlling the scenario start, then reading the codes sequentially.

So in pseudocode I imagine it would be like:

Code:
Stream s = whatever you do to load a file as a stream in Java

s.seek(end of header);

while (s.position() < s.length())
{
    byte b = s.readByte();   // Might be ubyte or something in Java? Needs to be unsigned

    if the byte =
        0x01, start reading in bytes until you hit 0x00, convert those bytes to a string, repeat three times
        0x04, read next byte as index of face image to use, another byte, a byte for expression, another byte, a byte for the name to use
        etc...
}
and then you could store everything in a new file, in some format that you could easily edit and change back to the original format afterwards.

Disclaimer: I am new and this may be a dumb way of doing it. But it seems to work. Also this may only apply to Tacticslayer. I am not aware of a generalized method.
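
For anyone following along in Java, the loop above could be sketched roughly like this. It is only a sketch under assumptions: the class name `ScenarioDumper`, the `dump` method and the `TEXT`/`FACE` output tags are made up for illustration, only the two codes described in the post (0x01 and 0x04) are handled, the header size has to be supplied by the caller, and the text is assumed to be Shift-JIS. Java has no unsigned byte type, so each byte is masked with `0xFF` after reading.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

class ScenarioDumper {
    // Assumption: script text is Shift-JIS encoded.
    static final Charset SJIS = Charset.forName("Shift_JIS");

    // Reads bytes up to (and consuming) the next 0x00, then decodes them.
    static String readCString(ByteBuffer buf) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (buf.hasRemaining()) {
            byte b = buf.get();
            if (b == 0) break;
            out.write(b);
        }
        return new String(out.toByteArray(), SJIS);
    }

    // Walks the code stream after the header and returns one editable line per item.
    static List<String> dump(byte[] data, int headerSize) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        buf.position(headerSize);
        List<String> lines = new ArrayList<>();
        while (buf.hasRemaining()) {
            int op = buf.get() & 0xFF; // Java bytes are signed; mask to get 0..255
            switch (op) {
                case 0x01: // three terminated strings in a row
                    for (int i = 0; i < 3; i++) {
                        lines.add("TEXT " + readCString(buf));
                    }
                    break;
                case 0x04: { // face, unknown, expression, unknown, name
                    int face = buf.get() & 0xFF;
                    buf.get();
                    int expr = buf.get() & 0xFF;
                    buf.get();
                    int name = buf.get() & 0xFF;
                    lines.add("FACE face=" + face + " expr=" + expr + " name=" + name);
                    break;
                }
                default:
                    // Without knowing how many bytes this code eats we cannot continue safely.
                    lines.add(String.format("UNKNOWN 0x%02X at offset %d", op, buf.position() - 1));
                    return lines;
            }
        }
        return lines;
    }
}
```

Stopping at the first unknown code is deliberate: once one code's length is guessed wrong, everything after it is misread, so the reported offset tells you which code to investigate next. A truncated file will make `buf.get()` throw `BufferUnderflowException`, which this sketch does not catch.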
 

azerty1

Wouldn't it be better to just search for the strings, replace them as needed, and reset the number of textboxes (the number of times you have to press A to continue)?
That way you wouldn't have to care much about the face images and stuff, right?
 

Kazetsukai

We could do that, but we would have to determine where each string started manually. Strings start with 0x01, but 0x01 can also show up as a parameter to various things; 0x01 as a face is Risa's face in Tacticslayer, 0x01 as a name is the player's name, and so on. Because of that, we can't just search through for 0x01 to find strings.

Assuming we could get the locations of all the strings, there is still another elusive detail. I think (but am not sure) that a certain amount of the scenario file is read in, and a certain code is prompting it to read more. This assumption is based on the game crashing every time it reached a certain point in the file, no matter what the contents were. In theory, understanding the code should allow the scenario to be lengthened/shortened.

Also we like the idea of having tags like "Takumi - Normal Face - Sweat Drop" in the script, should help with translation.

Still haven't confirmed that Tactical Guild uses a similar scenario format, will check that now.
 

azerty1

Well, even so, in order to locate the strings, couldn't you just apply the Shift-JIS table and locate all the places where you get, like, 3 or 4 Japanese characters in a row? From what I could see, there aren't really any instances in the whole scenario file of anything with 82xx82xx that's not a string...
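
That kind of scan could look something like the Java sketch below. It is a heuristic under assumptions, not a confirmed method for either game: the class name `SjisScanner` and the `minChars` threshold are made up for illustration, and only double-byte Shift-JIS characters are counted (lead byte 0x81-0x9F or 0xE0-0xEF, trail byte 0x40-0x7E or 0x80-0xFC), so ASCII or half-width katakana inside a string will split a run.

```java
import java.util.ArrayList;
import java.util.List;

class SjisScanner {
    static boolean isLead(int b) {
        return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xEF);
    }

    static boolean isTrail(int b) {
        return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC);
    }

    // Returns {startOffset, lengthInBytes} for every run of at least
    // minChars consecutive double-byte Shift-JIS characters.
    static List<int[]> findRuns(byte[] data, int minChars) {
        List<int[]> runs = new ArrayList<>();
        int i = 0;
        while (i + 1 < data.length) {
            int start = i;
            int chars = 0;
            while (i + 1 < data.length
                    && isLead(data[i] & 0xFF)
                    && isTrail(data[i + 1] & 0xFF)) {
                chars++;
                i += 2;
            }
            if (chars >= minChars) {
                runs.add(new int[] {start, i - start});
            }
            if (chars == 0) {
                i++; // nothing matched here, try the next byte
            }
        }
        return runs;
    }
}
```

Each hit is only a candidate. Parameter bytes can happen to look like a lead/trail pair, so short runs deserve a manual check in a hex editor before they are treated as text.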

Oh, and how did you figure out what each byte meant? The only way I can think of to figure that out is manually screwing with it and testing for results over and over again...

 

Kazetsukai

Yeah that would mostly work, but I think we would get a couple of strange things happening if we changed the length of any of the strings, and it would be nice to have the freedom to do that.

Lol yes, we have been doing a lot of that over the past few days. Using no$gba emulator to see results quickly.
 

azerty1

I'm pretty sure you'd just have to find something in there that denotes the string length... I'm pretty sure that's what other people have done (like the Soma Bringer translation). Then again, finding that would also just be trial and error with corruption... I'll probably download Tacticslayer and see if there are any similarities between the two games.

Oh, and by the way, I finished a quick script to extract all the scenarioxxx.dat files into hex strings, so that should help, I guess... Then all I need to do is read the table, apply it to the text, and get rid of all the junk so that I get some nice Japanese strings... Then once I figure out how the string length is defined, I just need to code a method to change it according to the English text's size, right?
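
A hex-dumping script of that sort might look like the following in Java. This is a guess at the general shape, not the script from the post: the class name `HexDump` and the single folder argument are assumptions, and it simply prints every `.dat` file in the given folder as space-separated hex bytes.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class HexDump {
    // Formats bytes as upper-case hex pairs separated by spaces.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 3);
        for (int i = 0; i < data.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HexDump <scenario-folder>");
            return;
        }
        // Hypothetical layout: all script files sit directly in one folder.
        try (DirectoryStream<Path> files =
                Files.newDirectoryStream(Paths.get(args[0]), "*.dat")) {
            for (Path p : files) {
                System.out.println(p.getFileName() + ": " + toHex(Files.readAllBytes(p)));
            }
        }
    }
}
```

Working on the raw `byte[]` from `Files.readAllBytes` is usually easier than working on the hex text, so the hex form is best kept for eyeballing and diffing files.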

One more thing... how do you get that pink code box to show up like that?
 

Kazetsukai

In Tacticslayer the strings are just null terminated, so it keeps reading them until it sees a 0x00. The only problem is every time we change for example the number of strings being read, it starts having weird effects later in the file.

Yeah that script should help. We have been using a program called frhed, free hex editor. It looks like all the strings in Tactical Guild start with a few codes, and then the string, then a sea of 0's until the next little batch of codes. Strings might just read until they hit 0's?

[ code ][ /code ] without spaces.
 

azerty1

Hmm... Tactical Guild just might work the same way; all I can find are blocks of text followed by some arbitrary number of zeroes... The beginnings of the files, which I thought were the headers, are all very different from each other, which is also very annoying... I thought they would have some information about the number of strings or the number of bytes in each string, but it didn't seem to relate... I'll try screwing around with the text and see its effects on the game, I guess...
 

azerty1

OK... so I tried messing with the text, and apparently the number of zeroes seems to mean something... If I keep the changed text within the borders of what was already there, then everything works as planned.
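
Staying inside the original borders can be enforced in code. Here is a rough Java sketch, assuming (this is not confirmed for Tactical Guild) that a string occupies the run of non-zero bytes starting at a known offset and that the replacement must start on the same byte. The class `SlotPatcher` and its methods are hypothetical names. The sketch refuses replacements longer than the original text and zero-fills whatever is left over, so the file size and every later offset stay unchanged.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class SlotPatcher {
    // Length in bytes of the run of non-zero bytes starting at offset.
    static int textLength(byte[] data, int offset) {
        int end = offset;
        while (end < data.length && data[end] != 0) {
            end++;
        }
        return end - offset;
    }

    // Overwrites the string at offset in place. Returns false (and changes
    // nothing) if the encoded replacement is longer than the original text.
    static boolean patch(byte[] data, int offset, String replacement, Charset cs) {
        // Note: getBytes silently substitutes characters the charset cannot encode.
        byte[] encoded = replacement.getBytes(cs);
        int slot = textLength(data, offset);
        if (encoded.length > slot) {
            return false;
        }
        System.arraycopy(encoded, 0, data, offset, encoded.length);
        // Zero out what remains of the old text so no stale bytes are displayed.
        Arrays.fill(data, offset + encoded.length, offset + slot, (byte) 0);
        return true;
    }
}
```

For real script text the charset would presumably be `Charset.forName("Shift_JIS")`. Whether the zeroes after the original text can also be overwritten is exactly the open question in this thread, so the sketch takes the conservative option and leaves them alone.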

OK... I don't know how to make images appear on boards properly... I'll just put links, then...

This is what I edited versus what it used to be.
http://s756.photobucket.com/albums/xx210/a...p;current=4.jpg
http://s756.photobucket.com/albums/xx210/a...p;current=1.jpg

This works out fine, and all that's needed here is to write a quick script making sure that line breaks occur on spaces...
http://s756.photobucket.com/albums/xx210/a...p;current=3.jpg
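
A quick line-breaking script like that could be a plain greedy word wrap. The Java sketch below is one possible version. The class name `LineWrapper` is made up, and the `width` parameter (characters per textbox line) is an assumption that has to be measured in the game, since the thread does not give it.

```java
import java.util.ArrayList;
import java.util.List;

class LineWrapper {
    // Greedy wrap: put as many whole words on each line as fit in width.
    // A single word longer than width gets a line of its own and overflows.
    static List<String> wrap(String text, int width) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            // Start a new line if adding " word" would exceed the width.
            if (line.length() > 0 && line.length() + 1 + word.length() > width) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }
}
```

For example, `LineWrapper.wrap("the quick brown fox", 10)` gives the two lines `the quick` and `brown fox`. The wrapped lines would still have to be written back using whatever line-break bytes the game expects, which has not been worked out here.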

But this is what happens if I don't start on the same byte...
http://s756.photobucket.com/albums/xx210/a...p;current=2.jpg

Going on after is OK until the end of the textbox, but before, the text just gets ignored. Nothing seems to cause any crashing, so that's good, though...
I think I'm going to need some advice from some experienced people on how to figure out how the length of the string is defined, though... I'm guessing different games have different text output methods, but is there some general method of finding them out?
Oh yeah, I don't really understand why, but even though my edited text started after the original, it still displayed both right from the start of the textbox. So I'm guessing whatever it does, it starts the output from where the text starts.
 

Kazetsukai

That is odd, I had a play around with it and couldn't get anything to display on lines that weren't being used either. My assumption would be a big array somewhere that said what line each string starts on or something...

The one thing I did find out, though, is that the last thing before you start getting text strings, 0x1e 0x05 0x00 0x03, appears to be a delay: change the 03 to something bigger and it spends longer waiting before the first text box. Also, 0x1e 0x21 appears to have to do with the fade, with the next two bytes determining the start/speed of the fade or something.
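
To repeat that delay experiment without hand-editing in a hex editor, something like the Java sketch below could patch the byte. It assumes the delay value is the single byte right after the first `0x1e 0x05 0x00` in the file, which is only what the observation above suggests. `DelayPatcher` and `setFirstDelay` are made-up names.

```java
class DelayPatcher {
    // Returns the index of the first occurrence of pattern at or after from, or -1.
    static int indexOf(byte[] data, byte[] pattern, int from) {
        for (int i = Math.max(from, 0); i + pattern.length <= data.length; i++) {
            int j = 0;
            while (j < pattern.length && data[i + j] == pattern[j]) {
                j++;
            }
            if (j == pattern.length) {
                return i;
            }
        }
        return -1;
    }

    // Overwrites the byte following the first 0x1E 0x05 0x00 with delay.
    // Returns false if the prefix is missing or has no byte after it.
    static boolean setFirstDelay(byte[] data, int delay) {
        byte[] prefix = {0x1E, 0x05, 0x00};
        int at = indexOf(data, prefix, 0);
        if (at < 0 || at + prefix.length >= data.length) {
            return false;
        }
        data[at + prefix.length] = (byte) delay;
        return true;
    }
}
```

The same three bytes could also turn up as parameters of some other code, so it is worth checking the matched offset against the file before trusting the patch.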

Good luck with decoding the text format, and good work so far
 

azerty1

Hmmm... thanks a lot... If that's how they made that part, then it probably means that all they defined was the total number of textboxes for each part rather than the length in bytes of each textbox... I'll play around with it later... That arbitrary number of zeroes is still getting on my nerves, though...

So... line breaks are some number of zeroes between 4 and 10, from what I can see... Either that, or it's filling the rest of the textbox with empty spaces...

Found the bytes that define the picture, but I can't seem to find what the name is...
 
