ROM Hack Translation Need help with Xenoblade Chronicles 3 and bf3.ard

Jora66 · Jul 25, 2022

Can someone help to extract .bdat from bf3.ard?

imusiyus · Jul 29, 2022

They changed the bdat storage format, so nothing can be edited right now.

imusiyus · Jul 29, 2022

# Xenoblade Chronicles arh/ard unpacker
# Author Lukas Cone in 2018, Version 2
# script for QuickBMS http://quickbms.aluigi.org
# 1st iteration 1:15 hours
# 2nd iteration 40 minutes
endian little
comtype zstd
open FDSE "bf3.ard" 1

idstring 0 "arh1"

get unk long
get numnodes long
get offset1 long
get size1 long
get offset2 long
get size2 long
get tocoffset long
get numfiles long

filexor "0x33 0xB5 0xE2 0x5D"
GoTo offset1 0 SEEK_SET
get chunksize long
for i = 0 < numfiles
savepos OFFSET
get fname string
putarray 0 i fname
math OFFSET n OFFSET
math OFFSET + offset1
putarray 3 i OFFSET
get fid long;
next i

GoTo offset2 0 SEEK_SET
set numusednodes long 0
for i = 0 < numnodes
get id1 SIGNED_LONG
get id2 SIGNED_LONG
putarray 1 i id1
putarray 2 i id2
if id2 > -1
putarray 4 numusednodes i
math numusednodes + 1
endif
next i
filexor ""

GoTo tocoffset 0 SEEK_SET
for i = 0 < numfiles
get coffset longlong
get csize long
get ucsize long
get compressed long
get id long
getarray Filename01 0 i
getarray fileoffset 3 i
strlen filenamelen Filename01
math Fileend = fileoffset
math Fileend - filenamelen
string cstr = ""
callfunction FindFilename 1
strlen NAME_LENGTH cstr
math filenameoffset - fileoffset
math filenameoffset n filenameoffset
# print "%filenameoffset%"
if filenameoffset > 0
string Filename01 < filenameoffset
endif
string cstr + Filename01

if compressed > 0
GoTo coffset 1 SEEK_SET
idstring 1 "xbc1"
get _numfiles long 1
get _ucsize long 1
get _csize long 1
get _hash long 1
getdstring _fname 28 1
savepos OFFSET 1
if NAME_LENGTH < 1
string cstr P "[%id%] %_fname%"
endif
clog cstr OFFSET csize ucsize 1
else
if NAME_LENGTH < 1
string cstr P "[%id%] %Filename01%"
endif
log cstr coffset csize 1
endif
next i

startfunction FindFilename
for g = 0 < numusednodes
getarray f 4 g
getarray id1 1 f
getarray id2 2 f
if id1 <= fileoffset && id1 > Fileend
set filenameoffset SIGNED_LONG id1
# print "%filenameoffset% %id2%"
set currentChar SIGNED_LONG f
callfunction LOOPNODES 1
string cstr r cstr
# print "%cstr%"
break
endif
next g
endfunction

startfunction LOOPNODES
set cci SIGNED_LONG id2
getarray id1 1 id2
getarray id2 2 id2
math currentChar ^ id1
# print "%currentChar% %id1% %id2%"
string _cstr = currentChar
string cstr + _cstr
set currentChar SIGNED_LONG cci
# print "%cstr%"
if id2 > 0
callfunction LOOPNODES 1
endif
endfunction
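
For anyone who would rather poke at the .arh from Python than from QuickBMS, the fixed-size parts the script reads can be sketched as below. This is only a minimal sketch based on the `get` lines in the script above: the field names are copied from it, the names `ArhHeader`, `TocEntry`, `parse_arh_header` and `parse_toc_entry` are made up for illustration, and treating every BMS `long` as an unsigned 32-bit value is an assumption. It does not touch the XORed filename and node tables.

```python
import struct
from typing import NamedTuple


class ArhHeader(NamedTuple):
    # Field names follow the BMS script; their meaning is as unknown as it is there.
    unk: int
    numnodes: int
    offset1: int
    size1: int
    offset2: int
    size2: int
    tocoffset: int
    numfiles: int


class TocEntry(NamedTuple):
    coffset: int     # offset of the file data inside the .ard
    csize: int       # stored size
    ucsize: int      # uncompressed size
    compressed: int  # non-zero when the data is wrapped in an "xbc1" block
    id: int


def parse_arh_header(data: bytes) -> ArhHeader:
    """Read the magic plus eight little-endian 32-bit fields."""
    if data[:4] != b"arh1":
        raise ValueError("not an arh1 file")
    return ArhHeader(*struct.unpack_from("<8I", data, 4))


def parse_toc_entry(data: bytes, offset: int) -> TocEntry:
    """Read one 24-byte TOC entry: longlong + 4 longs, as in the script."""
    return TocEntry(*struct.unpack_from("<QIIII", data, offset))
```

In the script the TOC starts at `tocoffset` and holds `numfiles` entries, so entry `i` would sit at `tocoffset + 24 * i` under these assumptions.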

kirei39 · Jul 31, 2022

Is it still not possible to extract the contents of bf3.ard?

masagrator · Jul 31, 2022

kirei39 said:
Is it still not possible to extract the contents of bf3.ard?

You literally have a BMS script for unpacking it right above your post...

archerboy · Jul 31, 2022

imusiyus said:
They changed the bdat storage format, so nothing can be edited right now.

Anyone smart and/or bored enough to join me trying to crack the new bdat?
I got some of the basics down - positions for file length, number of table rows, the table offset - but beyond that I'm struggling.

masagrator · Jul 31, 2022

archerboy said:
Anyone smart and/or bored enough to join me trying to crack the new bdat?
I got some of the basics down - positions for file length, number of table rows, the table offset - but beyond that I'm struggling.

Upload it here I guess? English and Japanese ones to figure out encoding differences.

kirei39 · Aug 1, 2022

masagrator said:
You literally have a BMS script for unpacking it right above your post...

I tried it and I get this error, could you please help me?

archerboy · Aug 1, 2022

masagrator said:
Upload it here I guess? English and Japanese ones to figure out encoding differences.

Good idea. I've attached EN and JP bdats and then a pair of JP/EN for a dialogue that only has 2 lines.
As for my progress, I was planning on just forking XBTool, so forgive the formatting.

C#:

            // High level
            if (table.ReadUTF8(0, 4) != "BDAT") return;
            int fileLength = file.ReadInt32(12); // Assuming for now ReadUInt/Int32 reads 4 bytes (offset 12, i.e. 0x0C-0x0F)
            // Remaining bytes until a "BDAT.0" contain information for multi-tables

            // Individual table
            // Assuming we're starting on the byte right AFTER the "0" in BDAT.0

            // 0x00-0x01 - First 2 bytes after always empty?
            uint unkType1 = table.ReadUInt32(2); // "03 00" in all Evt + Game bdats, "07 00" in everything else (e.g. tq bdats)
            uint itemSize = table.ReadUInt32(6); // Number of rows in the table aka individual lines

            uint unk10 = table.ReadUInt32(10); // Consistently "01" in Evt
            uint unk14 = table.ReadUInt32(14); // Empty in dialogue, has values in Game bdats
            uint unk18 = table.ReadUInt32(18); // Consistently "30" in Evt

            uint unkType2 = table.ReadUInt32(22); // "39" in all Evt, "45" in everything else
            uint unk26 = table.ReadUInt32(26); // ??
            uint tableOffset = table.ReadUInt32(34); // An offset for some part of the table (unconfirmed)
            // For shorter dialogues, this takes you from the first unknown type field (2) to the byte right before the whole "MÒ}ŒDé..."
            // But not valid for longer dialogues

            string startOfDialogueString = "MÒ}ŒDéNUðC¯Û";
            byte[] startOfDialogueHex = { 0x4D, 0xD2, 0x7D, 0x8C, 0x44, 0xE9, 0x4E, 0x55, 0xF0, 0x43, 0xAF, 0xDB };

Feels like there are a lot more 32-bit properties compared to 2/DE, which were primarily 16-bit, so take my findings with a grain of salt.

masagrator · Aug 1, 2022

kirei39 said:
I tried it and I get this error, could you please help me?

You must copy both the arh and the ard into one folder, then open the arh with quickbms.

masagrator · Aug 1, 2022

archerboy said:

Good idea. I've attached EN and JP bdats and then a pair of JP/EN for a dialogue that only has 2 lines.
As for my progress, I was planning on just forking the XBTool so forgive the formatting

Feels like a lot more 32 bit properties when compared to 2/DE which were primarily 16, so take my findings with a grain of salt.

So are those individual strings in the BDATs read through some offsets, or do they have some size checks? If not, then from what I see they are pretty easy to dump and push back.

botik · Aug 1, 2022

The paladins_akpk.bms script is suitable for unpacking the sound pck files, but the resulting wav files won't play.

kirei39 · Aug 1, 2022

masagrator said:
You must copy both the arh and the ard into one folder, then open the arh with quickbms.

Thank you!

masagrator · Aug 1, 2022

archerboy said:

Good idea. I've attached EN and JP bdats and then a pair of JP/EN for a dialogue that only has 2 lines.
As for my progress, I was planning on just forking the XBTool so forgive the formatting

Feels like a lot more 32 bit properties when compared to 2/DE which were primarily 16, so take my findings with a grain of salt.

Ok, I made a script for dumping the text.
I won't bother turning it into something that can push a new translation back in, since I'm not that interested.
But maybe my scribbles will help somebody.

Python:

import glob
import json
import os
import sys
#import pymmh3

def read8(file, _signed = False):
	return int.from_bytes(file.read(1), byteorder="little", signed=_signed)

def read16(file, _signed = False):
	return int.from_bytes(file.read(2), byteorder="little", signed=_signed)

def read32(file, _signed = False):
	return int.from_bytes(file.read(4), byteorder="little", signed=_signed)

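def readString(myfile):
	chars = []
	while True:
		c = myfile.read(1)
		#stop on the NUL terminator, or on EOF (read returns b"") so a truncated file can't loop forever
		if c == b'\x00' or c == b'':
			return b"".join(chars).decode("UTF-8")
		chars.append(c)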

def CheckMagic(file):
	if (file.read(4) != b"BDAT"):
		print("WRONG MAGIC!")
		print("offset: 0x%x" % (file.tell() - 4))
		sys.exit()

def CheckVersion(file):
	version = read8(file) #read from the handle passed in, not the global BDAT_file
	if (version != 4):
		print("Unsupported version of BDAT: %d" % version)
		sys.exit()

if (len(sys.argv) < 2):
	print("Usage: %s <bdat file or folder>" % sys.argv[0])
	sys.exit()

if (os.path.isdir(os.path.normpath(os.path.abspath(sys.argv[1])))):
	files = glob.glob("%s/**/*.bdat"% sys.argv[1], recursive=True)
elif (os.path.isfile(os.path.abspath(sys.argv[1]))):
	files = [os.path.abspath(sys.argv[1])]
else:
	print("os couldn't detect if it's a file or directory!")
	sys.exit()

for i in range(0, len(files)):
	print(files[i])
	BDAT_file = open(files[i], "rb")
	CheckMagic(BDAT_file)
	CheckVersion(BDAT_file)
	header_size = read16(BDAT_file)
	isArchive = bool(read8(BDAT_file))
	assert(isArchive == True)
	subfile_count = read32(BDAT_file)
	file_size = read32(BDAT_file)
	assert(header_size == BDAT_file.tell())

	offset_table = []
	for x in range(0, subfile_count):
		offset_table.append(read32(BDAT_file))
	assert(offset_table[0] == BDAT_file.tell())

	DUMP = []
	for x in range(0, subfile_count):
		BDAT_file.seek(offset_table[x])
		start_subfile_offset = BDAT_file.tell()
		CheckMagic(BDAT_file)
		CheckVersion(BDAT_file)
		header_size = read16(BDAT_file)
		isArchive = bool(read8(BDAT_file))
		assert(isArchive == False)
		TypeInfo = read32(BDAT_file)
		match(TypeInfo):
			case 3 | 7:
				pass
			case _:
				print("detected unknown infotype: %d" % TypeInfo)
				sys.exit()
		entry_count = read32(BDAT_file) #This doesn't always match the string count
		if (entry_count == 0):
			continue
		unk = read32(BDAT_file)
		unk = read32(BDAT_file)
		table1_offset = read32(BDAT_file) #unk
		table2_offset = read32(BDAT_file) #Table consists of [int32 hash, int32 ID]
		table3_offset = read32(BDAT_file)
		sizeof_table3_entry = read32(BDAT_file)
		string_block_offset = read32(BDAT_file)
		string_block_size = read32(BDAT_file)
		BDAT_file.seek(start_subfile_offset + table2_offset)
		table2 = []
		for y in range(0, entry_count):
			entry = {}
			entry["Hash"] = BDAT_file.read(4).hex().upper()
			entry["ID"] = read32(BDAT_file)
			table2.append(entry)
		BDAT_file.seek(start_subfile_offset + table3_offset)
		table3 = []
		match(TypeInfo):
			case 3:
				assert (sizeof_table3_entry == 0xA)
				for y in range(0, entry_count):
					entry = {}
					entry["Hash"] = BDAT_file.read(4).hex().upper()
					entry["unk"] = read16(BDAT_file)
					entry["StringOffset"] = read32(BDAT_file) #relative to string_block_offset
					table3.append(entry)
			case 7:
				assert (sizeof_table3_entry == 0x18)
				for y in range(0, entry_count):
					entry = {}
					entry["Hash"] = BDAT_file.read(4).hex().upper()
					entry["ControlStringOffset"] = read32(BDAT_file) #relative to string_block_offset
					entry["unk"] = BDAT_file.read(0xC).hex().upper()
					entry["StringOffset"] = read32(BDAT_file) #relative to string_block_offset
					table3.append(entry)
		BDAT_file.seek(start_subfile_offset + string_block_offset)
		BDAT_file.seek(1, 1)
		#it's murmur3 hash related to original filename without type, we can assume all hashes are murmur3
		hash = read32(BDAT_file, _signed=True)
		#assert doesn't work for "game" folder, guess those archives contain files that were originally named differently.
		#assert (hash == pymmh3.hash(os.path.basename(files[i])[:-5]))
		BDAT_file.seek(4, 1)
		STRINGS = []
		for y in range(0, entry_count):
			match(TypeInfo):
				case 3:
					BDAT_file.seek(start_subfile_offset + string_block_offset + table3[y]["StringOffset"])
					STRINGS.append(readString(BDAT_file))
				case 7:
					entry = []
					BDAT_file.seek(start_subfile_offset + string_block_offset + table3[y]["ControlStringOffset"])
					entry.append(readString(BDAT_file))
					BDAT_file.seek(start_subfile_offset + string_block_offset + table3[y]["StringOffset"])
					entry.append(readString(BDAT_file))
					STRINGS.append(entry)
		DUMP.append(STRINGS)
	if (len(DUMP) != 0):
		os.makedirs("DUMP/%s" % os.path.relpath(os.path.dirname(files[i]), os.getcwd()), exist_ok=True)
		JSON_file = open("DUMP/%s.json" % os.path.relpath(files[i], os.getcwd())[:-5], "w", encoding="UTF-8")
		json.dump(DUMP, JSON_file, indent="\t", ensure_ascii=False)
		JSON_file.close()
	else:
		print("Strings not detected in this file!")

It uses table3 to determine the offsets of strings. There is no size check for individual strings.
Entries can reuse existing strings, so you can get repeated lines in the dump even though the string block stores each one only once (an empty line is also a valid line).
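
To make the offset lookup and the reuse of strings concrete, here is a minimal sketch on a made-up string block. The helper name `read_cstring` and the sample bytes are invented for illustration; the only thing taken from the script is that each table3 entry holds an offset relative to the start of the string block and that strings are NUL-terminated UTF-8.

```python
def read_cstring(block: bytes, offset: int) -> str:
    """Return the NUL-terminated UTF-8 string starting at `offset` in `block`."""
    end = block.find(b"\x00", offset)
    if end == -1:          # no terminator: read to the end of the block
        end = len(block)
    return block[offset:end].decode("utf-8")


# Made-up block: an empty string at 0, "Hello" at 1, "World" at 7.
block = b"\x00Hello\x00World\x00"

# Four entries, but only three distinct strings: offset 1 is used twice.
offsets = [1, 7, 1, 0]
lines = [read_cstring(block, o) for o in offsets]
print(lines)  # ['Hello', 'World', 'Hello', '']
```

Nothing in the entry says how long the string is, which is why the terminator is the only thing bounding a read.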

Pass a file, or a folder containing files, as the argument; the script creates a DUMP folder with the json dumps in the current working directory (the folder you run it from).
No file is created if there are no strings to dump.
TypeInfo 7 seems to contain information about which character speaks which line, but the characters are not identified by name, only by short strings like "3" or "tV". I named them "ControlString".

Example:
Type 7:

JSON:

[
	[
		[
			"3",
			"Anyone else get the feeling\nColony 9 folks might be eating\na bit too many Spongy Spuds...?"
		],
		[
			"6",
			"Seemed to me like they really\nfell in love with them, yeah.\nThey seem to be always eating them."
		],
		[
			"5",
			"Eating 'em's one thing, but\ndid you hear 'em talkin' about\n\"saving Aionios\" with 'em?"
		],
		[
			"4",
			"Isn't that what we're doing too,\ncuriously enough?"
		],
		[
			"3",
			"'Course it's not! ...Uh, or is it?"
		],
		[
			"2",
			"Well, if nothing else, Spongy Spuds are\nhelping ease the food shortages, so people\ndon't have to fight so hard over resources."
		],
		[
			"2",
			"And if they don't have to fight,\nisn't that saving the world...?\nIn some sense, at least?"
		],
		[
			"3",
			"Don't get me wrong, I'm not arguing with\nthe logic, I'm just trying to point out that\nthere's an important point you're missing."
		],
		[
			"4",
			"And that is...?"
		],
		[
			"3",
			"It's this: eating nothing but potatoes...\nis *boring* as all get-out!"
		],
		[
			"tV",
			"Manana cannot take that lying down!"
		],
		[
			"tV",
			"Potato cuisine contain hidden depth!\nAmount of possible variation so high,\nis practically endless!"
		],
		[
			"tV",
			"With wings of Manana at spatula,\ncan eat potatoes all life and never\nnot be satisfied!"
		],
		[
			"2",
			"Well, I daresay this might just be\nManana's time to shine, then."
		],
		[
			"5",
			"Seeing as how that lot always basically\njust steam their spuds, I think that might\nbe a mercy."
		],
		[
			"tV",
			"Manana will get on case immediately!\nTime to save world with potatoes is *now*!"
		]
	]
]

Type 3:

JSON:

[
	[
		"[ML:undisp ](background audio of playing with boy B in fountain)",
		"[ML:undisp ](background audio of playing with boy A in fountain)",
		"C'mon, hurry up!",
		"The Queen's Anniversary's gonna start without us, guys!",
		"Yeah, move your feet!",
		"Hup, hup. Run like you mean it!",
		"[ML:undisp ](adlib breathing hard while running)",
		"(pant) Slow down, guys!",
		"Is it true, though? There's gonna be fireworks?",
		"Yeah! Saw them setting it up yesterday.",
		"There were loads of them.\nIt'll be worth it, promise!",
		"[ML:undisp ](adlib astonished gasp)"
	]
]
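
The two dump shapes above can be walked with the same loop. This is a small sketch, assuming only the json layout shown in the examples (a list of tables, each holding either bare strings for type 3 or `[control string, text]` pairs for type 7); `iter_lines` is a made-up helper name, not part of the dump script.

```python
def iter_lines(dump):
    """Yield (table_index, row_index, control_string, text) for every dumped line."""
    for table_index, table in enumerate(dump):
        for row_index, row in enumerate(table):
            if isinstance(row, list):
                # Type 7 shape: [control string, text]
                control, text = row
            else:
                # Type 3 shape: a bare string, no control string
                control, text = None, row
            yield table_index, row_index, control, text


# Shortened samples in the same shape as the examples above.
type7_dump = [[["3", "'Course it's not! ...Uh, or is it?"], ["4", "And that is...?"]]]
type3_dump = [["C'mon, hurry up!", "Yeah, move your feet!"]]

for entry in iter_lines(type7_dump):
    print(entry)
for entry in iter_lines(type3_dump):
    print(entry)
```

Keeping the table and row indexes next to each line matters if the text is ever to be pushed back, since the dump itself carries no hashes or IDs.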

So if you want to edit those BDATs, these are the things you must watch out for when making a tool for applying a translation:
- file_size (main BDAT)
- offset_table (main BDAT, if it contains more than 1 sub BDAT - like in quest.bdat)
- string_block_size (sub BDAT)
- table3 (sub BDAT, since it contains entries with offsets of used strings)
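
The bookkeeping in that list can be sketched roughly as follows. This is not a repacker, only an illustration under stated assumptions: `build_string_block` and `shift_offsets` are made-up names, the bytes in front of the first string (the dump script skips 9 of them) are carried over as an opaque `prefix`, and any padding or alignment rules the game may expect are unknown and ignored here.

```python
def build_string_block(prefix: bytes, strings: list[str]) -> tuple[bytes, list[int]]:
    """Rebuild a string block and return it with one offset per input string.

    Offsets are relative to the start of the block, like the StringOffset
    values in table3. Identical strings are stored once and share an offset.
    """
    block = bytearray(prefix)
    seen: dict[str, int] = {}
    offsets = []
    for s in strings:
        if s not in seen:
            seen[s] = len(block)
            block += s.encode("utf-8") + b"\x00"
        offsets.append(seen[s])
    return bytes(block), offsets


def shift_offsets(offset_table: list[int], changed_index: int, delta: int) -> list[int]:
    """Move every sub BDAT that comes after the resized one by `delta` bytes."""
    return [off + delta if i > changed_index else off
            for i, off in enumerate(offset_table)]


prefix = bytes(9)  # placeholder for the 9 bytes the dump script skips
block, offsets = build_string_block(prefix, ["A", "", "A", "BC"])
new_size = len(block)  # candidate value for string_block_size

# If sub BDAT 0 grew by 0x10 bytes, the later entries of offset_table move too,
# and file_size in the main header grows by the same amount.
new_table = shift_offsets([0x20, 0x100, 0x200], 0, 0x10)
```

For type 7 the control strings live in the same block, so they would have to go through the same builder and get their ControlStringOffset rewritten as well.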

Idaten · Aug 1, 2022

masagrator said:
if you want to edit those BDATs, these are the things you must watch out for when making a tool for applying a translation:
- file_size (main BDAT)
- offset_table (main BDAT, if it contains more than 1 sub BDAT - like in quest.bdat)
- string_block_size (sub BDAT)
- table3 (sub BDAT, since it contains entries with offsets of used strings)

Hi. I was searching Google for a tool to edit bdat files (the text, translation etc.), so I found this thread and registered an account to ask you about something.
You look like someone who has a pretty good understanding of this kind of thing. I tried your unpacking script and it worked as it should.
Now tell me, would it be possible for you to make a script that packs these edited json files back into bdat? That would be extremely helpful for many people (including me, and I know some people from Croatia who were interested in translating this game).
I don't have much money but I could donate a little in crypto. It looks like you're the only person who could actually make it without spending days figuring out how to get it working. Thanks in advance!

imusiyus · Aug 2, 2022

deleted

masagrator · Aug 2, 2022

imusiyus said:
So far, there are no tools available for converting Json back to bdat

Because the json here is a custom storage format, and a script must be adjusted to the specific way its author stores the data in the json.

imusiyus said:
And if you use some very simple scripts, it is easy to convert excel data back to bdat
Although one more step, it doesn't consume too much time

Then provide something that won't ruin offsets and size checks. Saying "it's very simple" without actually providing anything helps nobody.

Idaten said:
Hi. I was searching Google for a tool to edit bdat files (the text, translation etc.), so I found this thread and registered an account to ask you about something.
You look like someone who has a pretty good understanding of this kind of thing. I tried your unpacking script and it worked as it should.
Now tell me, would it be possible for you to make a script that packs these edited json files back into bdat? That would be extremely helpful for many people (including me, and I know some people from Croatia who were interested in translating this game).
I don't have much money but I could donate a little in crypto. It looks like you're the only person who could actually make it without spending days figuring out how to get it working. Thanks in advance!

As I said - I'm not interested.

imusiyus · Aug 2, 2022

masagrator said:
Because the json here is a custom storage format, and a script must be adjusted to the specific way its author stores the data in the json.

Then provide something that won't ruin offsets and size checks. Saying "it's very simple" without actually providing anything helps nobody.

As I said - I'm not interested.

I do work on it and have just finished my edits - I just need some time ......

Diran · Aug 4, 2022

Anyone got progress on texture and bone extraction? '3'

Tried the modified XC2 dumper from BlockBuilder's github but only the meshes import. Save for the first time where I got Noah's jacket texture but that's it, idk why nothing else comes out.

Might try to contact whoever made the 3DSMax script, just wanna use these puppies already. ; ;

its_pencil · Aug 5, 2022

Diran said:
Anyone got progress on texture and bone extraction? '3'

Tried the modified XC2 dumper from BlockBuilder's github but only the meshes import. Save for the first time where I got Noah's jacket texture but that's it, idk why nothing else comes out.

Might try to contact whoever made the 3DSMax script, just wanna use these puppies already. ; ;

the textures are separate from the model files and are in the chr/tex/nx/h folder
u could figure out which texture belongs to which model by opening the .wismt file of the model u want in a hex editor and seeing the texture names in there but the BMS script exports the .wismt names in that texture folder incorrectly so :/
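
If opening every .wismt in a hex editor gets tedious, the same eyeballing can be approximated with a printable-string scan. This is only a generic sketch, assuming the texture names are stored as plain ASCII runs, which the thread does not confirm; `ascii_strings` and the sample bytes are made up for illustration.

```python
import re


def ascii_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of at least `min_len` printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Made-up bytes standing in for part of a model file.
sample = b"\x00\x01tex_body_01\x00\xff\xffab\x00eye_col\x00"
print(ascii_strings(sample))  # ['tex_body_01', 'eye_col']

# For a real file (the path is hypothetical):
# with open("model.wismt", "rb") as f:
#     for name in ascii_strings(f.read()):
#         print(name)
```

It will also pick up plenty of non-name noise, so expect to filter the output by hand.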
