ROM Hack script dumper - are they pointers?

plasturion · Feb 1, 2017

Hi everyone,
Lately I've been working on a translation project for a figure skating game, Kurukuru * Princess: Tokimeki Figure * Mezase! Vancouver. I'm dumping the text manually because I was forced to omit the pointer sections: they were too difficult to update, and I'm not sure how to read or modify them.
It would be a huge help if someone could help analyse this data, because I can't use existing tools like TED. The strings are mixed, and they are not terminated by a single (NULL) byte; what follows them always looks like a random value. The section where the pointers sit is not a common layout either, and the values don't point to any position in the file. I think they may still be pointers (after some calculation). I compared two script files, the original and a Korean one, and roughly every 55 bytes there is a 4-byte word that may encode the address of a string, but not directly. I guess a debugger is needed here, with breakpoints set on the memory map, plus some asm skill, and I'm bad at low-level code. So it would be a great help if anyone could explain it, write the algorithm (ideally in C), or just write a dumper/updater for those words. There are also a few positions with a 10-byte difference at the beginning. Here's an example screenshot from the comparison of the two files. I put the Korean script into the Japanese ROM and it works, so the method is the same.
I forgot to mention that the Korean version of the game is called Figure Princess.

HinaNaru Cutie · Jul 13, 2020

hey, did you ever figure out how to english patch the game?? if you still need help i could send you a link to a discord group chat that does english patches for games =O

plasturion · Jul 13, 2020

Well, I gave up on that puzzle, so no thanks. I only made a demo translation with all its limitations (intro, a few conversations, some menus), and that's enough for me.

HinaNaru Cutie · Jul 13, 2020

ooh damn that sucks Q_Q...i was really hoping for the full translation..sigh guess it will never happen,, but thank you <3

plasturion · Dec 10, 2020

I spent some time analysing the script files, and it wasn't as hard as I thought.
First I just tried to find all the differences between the Korean and Japanese versions (ay0001.rtz, decompressed with LZSS) and...
let's say there are two sections:
1 - control section - [0x38c0 - 0x8800] - 4-byte differences with irregular steps between them
2 - text block - [0x9000 - 0xD2300 ] - everything differs

Code:

compare differences at control section:
000038C7: 4474958A - 8F1FBE0C, step:0
000038CD: E129FB49 - B0D3EE57, step:6
000038F5: 5D6B472D - 517B9FF9, step:40
000038FB: 37F220B2 - B2CC2F1F, step:6
...
00008749: 36F2D537 - D24E1213, step:139
00008760: 9466618C - C95DD411, step:23
00008779: E972620F - 20827BCD, step:25
000087CC: 6B30036B - A45A1F59, step:83

What all these differences have in common: they are 4 bytes long, and the preceding byte is 0x09.
This section is not a regular pointer table; that's why the steps vary and there are some instructions between the entries.
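
A minimal Python sketch of the comparison described above, assuming both decompressed scripts are already loaded as bytes. The function name `find_signature_diffs` and the tiny sample data are made up for illustration; only the 0x09 marker, the 4-byte width and the step calculation come from the listing. It looks for a 0x09 byte present in both files that is followed by four bytes that differ.

```python
MARKER = 0x09   # byte found in front of every differing word (from the post)
SIG_LEN = 4     # each differing word is 4 bytes long (from the post)

def find_signature_diffs(jp, kr, start, end):
    """Return (offset, jp_word, kr_word, step) for each 4-byte word that
    differs between the two files and follows MARKER in both of them."""
    hits = []
    prev = None
    end = min(end, len(jp), len(kr))
    p = start
    while p + 1 + SIG_LEN <= end:
        a = jp[p + 1:p + 1 + SIG_LEN]
        b = kr[p + 1:p + 1 + SIG_LEN]
        if jp[p] == MARKER and kr[p] == MARKER and a != b:
            off = p + 1
            step = 0 if prev is None else off - prev
            hits.append((off, a, b, step))
            prev = off
            p += 1 + SIG_LEN          # skip past the word just recorded
        else:
            p += 1
    return hits

# Made-up sample: two tiny "files" that differ only in the marked words.
jp = bytes.fromhex("00 09 4474958A 00 09 E129FB49 FF")
kr = bytes.fromhex("00 09 8F1FBE0C 00 09 B0D3EE57 FF")
for off, a, b, step in find_signature_diffs(jp, kr, 0, len(jp)):
    print(f"{off:08X}: {a.hex().upper()} - {b.hex().upper()}, step:{step}")
```

Anchoring on the marker rather than on raw runs of differing bytes keeps a word intact even if one of its bytes happens to be equal in both files.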
Earlier I thought these were some kind of address, but it seems each one is just a unique signature used to find the position of the text in the text block at the end.
So I searched for each signature in the text block using a brute-force (n^2) comparison, and I got:

Code:

Found positions at text block:
000038C7: 0000A209
000038CD: 0000C9ED
000038F5: 0000A836
000038FB: 00009E36
...
00008749: 00009DF8
00008760: 0000B55D
00008779: 0000CB5C
000087CC: 0000ABCF
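
The signature search above could be sketched roughly like this in Python. `locate_signatures` is a hypothetical helper, and the sample bytes are invented; the only facts taken from the post are that each signature is 4 bytes long and is looked up inside the text block.

```python
def locate_signatures(data, sig_offsets, text_start, text_end):
    """For each control-section offset, list where its 4-byte signature
    occurs inside data[text_start:text_end] (absolute file positions)."""
    block = data[text_start:text_end]
    found = {}
    for off in sig_offsets:
        sig = data[off:off + 4]
        positions = []
        i = block.find(sig)
        while i != -1:
            positions.append(text_start + i)
            i = block.find(sig, i + 1)
        found[off] = positions
    return found

# Made-up sample file: a two-entry control section followed by a text block.
sig_a = bytes.fromhex("4474958A")
sig_b = bytes.fromhex("E129FB49")
control = b"\x09" + sig_a + b"\x00\x09" + sig_b   # signatures at offsets 1 and 7
text = sig_b + b"\x02hi" + sig_a + b"\x03abc"
data = control + bytes(5) + text                  # text block starts at 16
found = locate_signatures(data, [1, 7], 16, len(data))
for off, positions in found.items():
    print(f"{off:08X}:", " ".join(f"{p:08X}" for p in positions))
```

Collecting every hit instead of only the first makes it easy to confirm the "appears only once" observation: any list with more or fewer than one position would stand out.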

All of them exist and appear only once, which is what we expected. Some addresses in the "control section" store the same signature (pointing to the same text), which is a common, efficient way of using pointers to save space.
So I sorted all of the signatures:

Code:

Sorted:
00009019 - 021E573C, len: 0
00009054 - 024F813C, len: 59
0000908C - 02CD97AD, len: 56
000090B6 - 0406FEBE, len: 42
...
0000D257 - FEA0A8BE, len: 65
0000D266 - FF50F19D, len: 15
0000D2AB - FF8F3049, len: 69
0000D2F0 - FFD986BD, len: 69

It seems all of the signatures are in increasing order, but as far as I checked the order can be anything; even inverting the bytes of every signature works. I wanted to make them easier to spot in a hex editor, so I used my own ordering and it worked too.
Another thing I found is a single byte placed right after every signature in the text block: it determines the text length.
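
Reading one entry from the text block might then look like the sketch below. Two things here are assumptions, not facts from the thread: that the length byte counts the bytes of text that follow it, and that the text is Shift-JIS. `read_record` is a made-up name.

```python
def read_record(data, pos):
    """Read one text-block record that starts with a signature at pos."""
    sig = data[pos:pos + 4]
    length = data[pos + 4]                    # the byte right after the signature
    payload = data[pos + 5:pos + 5 + length]  # assumed: length counts payload bytes
    return sig, payload

# Made-up record: signature, length byte, then the text itself.
line = "はい".encode("shift_jis")              # encoding is an assumption
block = bytes.fromhex("021E573C") + bytes([len(line)]) + line
sig, payload = read_record(block, 0)
print(sig.hex().upper(), payload.decode("shift_jis"))
```

If the length turns out to be measured in characters or to include a terminator, only the slice that builds `payload` would need to change.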

I guess from this point it's possible to create a text extractor (in text-flow order) and an inserter that updates the text-length bytes, makes the signatures readable and keeps them in the same order as they appear in the control section.

...but does anyone want to translate this game? Are tools needed or not?
---------
The extracted text looks like this:

Code:

---------- 0054 ----------
ほら　足はもっと高く！
さいごまで　しっかりバランス！
---------- 0055 ----------
こんにちは　斉藤コーチ
やってますね
---------- 0056 ----------
あら　こんにちは
練習を見に　いらしたんですか？
---------- 0057 ----------
はい
---------- 0058 ----------
今年は　４年に一度の
冬季国際フィギュア大会が
ひらかれる年でしょう？
---------- 0059 ----------
そろそろ　それに出場する
日本代表選手を決めはじめる
時期ですから
---------- 0060 ----------
選手たちのようすが
気になったもので…
---------- 0061 ----------
みんな　すばらしい選手たち
ばかりですよ
---------- 0062 ----------
とくに今年は…
---------- 0063 ----------
世界のトップスケーター
として活躍している
安野美樹（やすのみき）
---------- 0064 ----------
「リンクのようせい」と呼ばれ
人気を集めている
浅香真子（あさかまこ）

The first chapter starts at the 54th pointer in the control section, and the order holds from that point on...
So far we can't tell who is speaking each line, but maybe we can get that from here if we figure out some of the instructions.
It's quite possible that the 4 bytes before the "09" are important too, because the entries start with sequences like this:

Code:

Saito: 00 06 12 49
BdGuy: 00 06 1B 49
Saito: 01 06 12 49
BdGuy: 02 06 1B 49
BdGuy: 02 06 1B 49
BdGuy: 02 06 1B 49

So we can say Coach Saito is 12 and the blond guy is 1B. Cool.

Ayaka: 65
Riko: 09
Luna: 03
Chihiro: 0B
...so after applying these improvements to our text dumper we can get a format like this:

Code:

---[ Saito Coach ]- 0132 ---
まずは　そのまちになれて
コンディションを　ばんぜんに
ととのえることが　たいせつよ

---[ Ayaka ]------- 0133 ---
はい　わかりました！

---[ Ayaka ]------- 0134 ---
ここは　ひかえ室　だね

---[ Riko ]-------- 0135 ---
ね　ねぇ　あやか＠＠＠＠＠！
あそこにいる子　もしかして…

---[ Ayaka ]------- 0136 ---
え…！？
もしかして　鏡ゆいちゃん！？

---[ Yui ]--------- 0137 ---
あら　こんにちは

---[ Ayaka ]------- 0138 ---
すすす　すご～い！
ホンモノの　ゆいちゃんだ！
かわいい～～

---[ Yui ]--------- 0139 ---
うふふ　ありがとう
えっと…あなたたちも
フィギュアスケートを？

...so great and nice, isn't it?

It seems Yui's code is 02.
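
The speaker lookup could be sketched as follows, using only the codes listed in this post. It assumes every code sits in the third of the four bytes before the 0x09 marker, as in the Saito and BdGuy examples; `speaker_at` and `header` are hypothetical helpers and the sample entry is made up.

```python
# Speaker codes reported in the thread (third of the four bytes before 0x09).
SPEAKERS = {
    0x12: "Saito Coach", 0x1B: "BdGuy", 0x65: "Ayaka", 0x09: "Riko",
    0x03: "Luna", 0x0B: "Chihiro", 0x02: "Yui",
}

def speaker_at(data, sig_off):
    """sig_off is the offset of a signature in the control section; the 0x09
    marker sits at sig_off - 1 and the four bytes of interest just before it."""
    if sig_off < 5:
        return None
    prefix = data[sig_off - 5:sig_off - 1]
    return SPEAKERS.get(prefix[2])

def header(name, index):
    """Format a dump header such as '---[ Ayaka ]------- 0133 ---'."""
    label = f"---[ {name or '?'} ]"
    return f"{label.ljust(19, '-')} {index:04d} ---"

entry = bytes.fromhex("00 06 12 49 09 4474958A")   # made-up control entry
print(header(speaker_at(entry, 5), 132))
```

Unknown codes come back as `None` and are printed as `?`, so new speakers can be added to the table as they are identified.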

I made one last comparison in this script file: I extracted all the signatures from the text block one by one and compared them with the sorted values.
That gave us 4 more strings. Their signatures are identical in the Japanese and Korean files, which is why the diff missed them, and we can find them before 0x38c0.

Code:

000092E1: 130CBDAE new!
00009343: 14B64E73 new!
0000B1F8: 87744906 new!
0000B739: 9A079B65 new!

Let's check what those strings contain and where their signatures are referenced...

Code:

000092E1: str: ") " - 0x2B2D
00009343: str: "Del(" - 0x2B23
0000B1F8: str: "slot unavailable" - 0x2E8E
0000B739: str: "…" - 0x5E52, 0x6648
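
This second method boils down to a set difference plus one more search. A rough Python sketch, assuming the text-block signatures have already been collected into a position-to-signature mapping; `unreferenced` and `find_all` are made-up names, and the `head` bytes are invented sample data.

```python
def unreferenced(text_sigs, control_sigs):
    """text_sigs maps text-block position -> signature; control_sigs holds the
    signatures already found by the diff. Returns the entries not yet seen."""
    used = set(control_sigs)
    return {pos: sig for pos, sig in text_sigs.items() if sig not in used}

def find_all(data, sig, start, end):
    """All absolute offsets of sig inside data[start:end]."""
    hits, i = [], data.find(sig, start, end)
    while i != -1:
        hits.append(i)
        i = data.find(sig, i + 1, end)
    return hits

text_sigs = {0x9019: bytes.fromhex("021E573C"), 0x92E1: bytes.fromhex("130CBDAE")}
control_sigs = [bytes.fromhex("021E573C")]
extra = unreferenced(text_sigs, control_sigs)

head = bytes(3) + b"\x09" + bytes.fromhex("130CBDAE") + bytes(8)  # made-up bytes
for pos, sig in extra.items():
    print(f"{pos:08X}: {sig.hex().upper()} new!", find_all(head, sig, 0, len(head)))
```

Searching the whole file rather than only the diffed range is what lets references like the two for "…" show up.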

...the second method finds more. The last signature is especially important to include in the dialogs.
I have a full dump of all the script files now... but there's no interest in this project, so... my work ends here.
Things to do:
1. add the extra single or multiple appearances of "..." to the dialogs, or none.
2. add the positions of the three additional strings that we found unchanged, in order to place them at the top of the text block (possibly without changing the signature)
3. make a positions database (store the number of pointers, the signature positions of the three unchanged strings, and the signature positions in the control section for every file)
4. create an inserter with a compare function that looks for identical strings (same text for the same signature),
possibly by creating a new ordering system for them.
5. it's possible that the output file has to fit in less than 64KB, but that must be tested first.
----------
Points 1-3 are done; the completed database for all 29 script files is 20KB.
-----
Point 4 completed! The dumper / inserter works perfectly for all the script files!
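
For anyone curious what the inserter from points 4 and 5 might involve, here is a minimal sketch, not the actual tool. It assumes records are packed back to back as signature, length byte, text, and that the length byte counts the text bytes; neither is confirmed in this thread. Because the control section refers to text by signature rather than by address, the sketch leaves it untouched. All names are hypothetical.

```python
LIMIT = 0x10000   # 64KB ceiling from point 5 (the post says this is untested)

def build_text_block(entries):
    """entries: list of (signature, payload) pairs in the desired order.
    Returns the packed block and each signature's offset inside it."""
    block = bytearray()
    offsets = {}
    payloads = {}
    for sig, payload in entries:
        if len(sig) != 4:
            raise ValueError("signature must be 4 bytes")
        if len(payload) > 0xFF:
            raise ValueError("text does not fit a one-byte length")
        if sig in payloads:                      # same signature seen before
            if payloads[sig] != payload:
                raise ValueError("same signature, different text")
            continue                             # store shared text only once
        payloads[sig] = payload
        offsets[sig] = len(block)
        block += sig + bytes([len(payload)]) + payload
    return bytes(block), offsets

def rebuild(head, entries):
    """Append a freshly packed text block to the untouched part of the file."""
    block, _ = build_text_block(entries)
    out = head + block
    if len(out) >= LIMIT:
        raise ValueError("output is not below 64KB")
    return out

s1, s2 = bytes.fromhex("021E573C"), bytes.fromhex("024F813C")
out = rebuild(bytes(16), [(s1, b"Hi"), (s2, b"Yes"), (s1, b"Hi")])
print(len(out), out[16:].hex())
```

The duplicate check mirrors the "same text for same signature" rule: a repeated signature is written once, and a repeat with different text is treated as an error.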

HinaNaru Cutie · Jan 4, 2021

well dang, i do apologize for responding late O_O; i was offline doing other things, but i am glad you figured it out =O, that's pretty neat. i don't know who would translate the game though =O have you tried a discord translation group?? or on here? make a post to see if anyone can help you out, even on twitter.
would really support the translation for this game.

plasturion · Jan 4, 2021

No problem. No, I haven't looked on Discord yet, but I made another topic here, so let's wait for now.
How about you? Would you like to participate: join the project, create a project page, translate, become a project manager or an assistant?
Who knows how big a team we need.

HinaNaru Cutie · Jan 4, 2021

ah okay, that's good.
xD i hope, hehe but i can't really translate sadly, i can only just do the noob stuff x.x". like reshare, do as much as i can to get the word around and that's about it on my end. it would be nice to translate stuff but i am only limited to the languages i know.

thank you though

plasturion · Jan 4, 2021

Well, if you speak English natively I think that's a great advantage.
I believe you could be a good style editor, or add something from time to time to fill in meaning from the earlier context.
It's good to enrich the story with proper English.
I can send you the original script of the first chapter along with a DeepL machine translation.
Maybe you could try it and see how it looks.

HinaNaru Cutie · Jan 5, 2021

yeah i do, would be a good idea.
true, wait you translated with deepl machine? hopefully i am getting this right, but sure let me see how it looks so far =O
