Just a quick update for those actively following this: I've gotten the script dumper fairly close to done. My tool is successfully dumping the text from the script file for the beginning of the game, which is the second-largest script file. The dumped text comes out as a txt file of about 331 KB (which, as you can guess, is fairly large for a plain text file).
What's making it take so long is that the pointers for the text are scattered throughout the file (for those unfamiliar with the term, a pointer is a number, an address, telling the game where to look in the file for a specific text string). The file also isn't just text; there's plenty of other data in it that needs to be left alone. Some text strings appear to have multiple pointers, but usually only one is the actual pointer. And some text strings seemingly have no pointer: either they are dud text that the developer decided not to use but never removed, or they use another format for their pointers, which seems doubtful.
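As a rough illustration of the pointer hunt described above (a sketch, not the actual dumper): given the offset where a text string starts, scan the whole file for any spot that stores that offset as a number. The sketch assumes pointers are 32-bit little-endian absolute file offsets and that strings end with a 0x00 byte; neither is confirmed for this game, and the function names are made up for the example.

```python
import struct


def find_pointer_candidates(data: bytes, target_offset: int) -> list:
    """Return every position in `data` where `target_offset` is stored
    as a 32-bit little-endian value (ASSUMED pointer format)."""
    needle = struct.pack("<I", target_offset)
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(pos)
        # Step forward one byte so overlapping matches are not skipped.
        pos = data.find(needle, pos + 1)
    return hits


def read_c_string(data: bytes, offset: int) -> bytes:
    """Return the raw bytes from `offset` up to (not including) the next
    0x00 byte (ASSUMED terminator), or to end of file if none is found."""
    end = data.find(b"\x00", offset)
    if end == -1:
        end = len(data)
    return data[offset:end]
```

A string with zero hits is one of the "no pointer" cases, and a string with several hits is one of the "multiple pointers" cases, where each hit still has to be checked by hand, since any four bytes of unrelated data can happen to match.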
I'm going to start looking at the general text (items, menu text, etc.) soon, and once I get that stuff dumped, I'll start looking for translators and this project will start for real.
Edit
The items text file seems to use a repeating byte jump pattern that goes:
92-bytes
48-bytes
48-bytes
40-bytes
repeat...
I've got to check the other general text files, but this type of pointer format is quite easy to dump/insert, so that's some good news. If we end up needing more bytes than are allotted by default, I'll have to find where the game calculates the pattern and change it, but it's already a comfy amount of space, so that might not be needed.
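To show why a fixed pattern like this is easy to work with, here is a minimal sketch of walking it. It assumes each number in the pattern is the size of one text slot (so also the jump to the next slot), that the first slot starts at offset 0, and that text inside a slot ends at the first 0x00 byte. Those are guesses for the sake of the example, and the function names are hypothetical.

```python
from itertools import cycle

# The repeating jump pattern noted above, in bytes.
JUMP_PATTERN = (92, 48, 48, 40)


def dump_slots(data: bytes, start: int = 0, pattern=JUMP_PATTERN) -> list:
    """Walk `data` slot by slot, cycling through `pattern`.

    Returns a list of (offset, slot_size, raw_text) tuples, where raw_text
    is the slot's bytes up to the first 0x00 (ASSUMED padding/terminator).
    Stops as soon as a full slot no longer fits in the file.
    """
    results = []
    pos = start
    for size in cycle(pattern):
        if pos + size > len(data):
            break
        slot = data[pos:pos + size]
        raw_text = slot.split(b"\x00", 1)[0]
        results.append((pos, size, raw_text))
        pos += size
    return results


def fits_in_slot(new_text: bytes, slot_size: int) -> bool:
    """True if `new_text` plus one terminating 0x00 byte fits in the slot."""
    return len(new_text) + 1 <= slot_size
```

Inserting translated text would be the reverse: pad each new string with 0x00 bytes up to its slot size and write it back at the same offset, with something like `fits_in_slot` flagging any line that is too long for the default space.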
Edit 2
Alright, I've gone through and made note of (I think) all of the files in the CPK that contain text.
I also went through the EBOOT and made note of where the text is located and I've figured out the pointer system for the EBOOT. It's a little bit weird, but nothing difficult.
And from what I can tell, the text in the script files that appeared to have no pointers also seems to exist in the EBOOT and maybe a few of the other files. So it looks like I was right: the text with no pointers is unused text that the developer moved elsewhere without cleaning up the leftovers.
There is also a good deal of image-based text, located in the .tex files. I'll have to research these more, but I'm assuming they will need a tool programmed to convert them to PNG and back again. I don't really have experience in that area, but one thing at a time. Let's get the plain text part done first.
The next step is to go through all of the files with text and document how their pointers work so that I can get my tool working with them, because there appear to be at least a few different setups for files with text.