OK then, starting on s03e12.e.
An FPK file, as you say.
<div class='codetop'>CODE</div><div class='codemain' style='height:200px;white-space:pre;overflow:auto'>
00000000 | 7330 3365 3132 2E65 7A4A 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 4801 0000 AF72 0000 | s03e12.ezJ......................H....r..
00000028 | 7330 3365 3132 2E65 7A4A 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 F873 0000 AF72 0000 | s03e12.ezJ.......................s...r..
00000050 | 7330 3365 3132 2E65 7A4A 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 A8E6 0000 AF72 0000 | s03e12.ezJ...........................r..
00000078 | 7330 3365 3132 2E65 7A4A 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 5859 0100 AF72 0000 | s03e12.ezJ......................XY...r..
000000A0 | 7330 3365 3132 2E65 7A4A 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 08CC 0100 AF72 0000 | s03e12.ezJ...........................r..
000000C8 | 7330 3365 3132 2E65 7A4A 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 B83E 0200 AF72 0000 | s03e12.ezJ.......................>...r..
000000F0 | 7330 3365 3132 2E65 7A4A 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 68B1 0200 AF72 0000 | s03e12.ezJ......................h....r..
00000118 | 7330 3365 3132 2E65 7A4A 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 1824 0300 AF72 0000 | s03e12.ezJ.......................$...r..
</div>
The first ASH does indeed start at 0148 and the one after at 73F8.
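For anyone wanting to script the table read: going by the dump above, each entry looks to be 0x28 bytes, which I am guessing splits into a 32 byte null padded name, a 4 byte little endian offset and a 4 byte little endian length. That layout is my assumption from eyeballing the hex, not anything documented, and `parse_entry` is just a name I made up.

```python
import struct

# Assumed layout: 32-byte name, then offset and size as little-endian uint32.
ENTRY = struct.Struct("<32sII")  # 0x28 bytes in total


def parse_entry(raw):
    """Split one 0x28-byte table entry into (name, offset, size)."""
    name, offset, size = ENTRY.unpack(raw)
    # The name is padded with zero bytes, so cut at the first one.
    return name.split(b"\x00", 1)[0].decode("ascii"), offset, size


# First row of the dump above, typed in by hand.
FIRST_ROW = bytes.fromhex(
    "7330 3365 3132 2E65 7A4A" + " 0000" * 11 + " 4801 0000 AF72 0000"
)

name, offset, size = parse_entry(FIRST_ROW)
print(name, hex(offset), hex(size))
```

Run over the other rows it should give the 73F8, E6A8 and so on offsets with the same 72AF length each time.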
Extracted the first, second and third ASH files (I probably should call them s03e12.ezJ but hey), which are identical.
Even a basic zip of the whole s03e12.e ends up at around 26KB, whereas a single ASH file zips to just under 24KB, which makes it a fairly safe bet the rest will be the same (the ASH files themselves are about 29KB each).
It is cheesy film time shortly so I will leave the checks and assume for now the rest are the same.
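If you want to do the check I skipped, something like the following would do it. It is only a sketch: the offsets and length are the ones read out of the table dump, the function names are mine, and it assumes you have the whole s03e12.e loaded as a bytes object.

```python
import hashlib

# Offsets and length as read from the table dump (little endian).
OFFSETS = [0x148, 0x73F8, 0xE6A8, 0x15958, 0x1CC08, 0x23EB8, 0x2B168, 0x32418]
SIZE = 0x72AF


def extract(blob, offsets, size):
    """Cut one chunk of `size` bytes out of `blob` at each offset."""
    return [blob[off:off + size] for off in offsets]


def all_identical(chunks):
    """True if every chunk hashes to the same value."""
    digests = {hashlib.sha1(chunk).hexdigest() for chunk in chunks}
    return len(digests) <= 1


# Usage, with the file path being whatever yours is:
# blob = open("s03e12.e", "rb").read()
# print(all_identical(extract(blob, OFFSETS, SIZE)))
```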
The padding with odd bytes and such are classic indicators of LZ compression ( <a href="http://www.romhacking.net/docs/281/" target="_blank">http://www.romhacking.net/docs/281/</a> ), although I do not see a flag at the start of the file (on the DS at least, if a file starts with 10, 11 or sometimes 40 it tends to mean LZ compression, especially if that value comes before a magic stamp) or anything else that might indicate it. That is not so much an indicator of no compression as an indicator that, if it is compressed, it does not use the SDK stuff (it still might be BIOS compatible, but some of the LZ tools might stumble a bit). Not to mention I shaved nearly 20% off with a basic zip, which I should not have been able to do if it was already compressed. To this end, what you just described might also be a form of pointer compression (if you can reference something), markup, in-section pointers or possibly even a measure of programming ability.
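Both of those checks are easy enough to script. A rough sketch, with the caveat that the 10/11/40 test is only the rule of thumb from above rather than a proper format check, and that `zip_saving` uses zlib (DEFLATE, the same method a basic zip uses) so the number will be close to but not exactly what a zip tool reports. Function names are my own.

```python
import zlib

# First-byte values that tend to mean LZ-style compression on the DS (rule of thumb).
DS_LZ_FLAGS = (0x10, 0x11, 0x40)


def looks_like_ds_lz(data):
    """True if the first byte is one of the usual DS compression flags."""
    return len(data) > 0 and data[0] in DS_LZ_FLAGS


def zip_saving(data):
    """Fraction of the size saved by DEFLATE, e.g. 0.2 means 20% smaller."""
    if not data:
        return 0.0
    return 1.0 - len(zlib.compress(data, 9)) / len(data)


# Usage on an extracted ASH file:
# ash = open("s03e12.ezJ", "rb").read()
# print(looks_like_ds_lz(ash), round(zip_saving(ash), 2))
```

A saving well above zero suggests the data is not already tightly compressed; already compressed data usually comes out at roughly zero or slightly negative.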
Each file is 72AF long and I still cannot see a length indicator inside the file itself, although if it uses section-by-section pointers it may not need one.
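As a quick sanity check on those numbers, the offsets in the table are 72B0 apart while the length is 72AF, so there looks to be a single byte of padding after each file. That fits rounding up to a 4 byte boundary, though 8 or 16 would give the same result here so I would not read too much into it. A small sketch of the arithmetic, using the values from the dump:

```python
# Offsets and length as read from the table dump (little endian).
OFFSETS = [0x148, 0x73F8, 0xE6A8, 0x15958, 0x1CC08, 0x23EB8, 0x2B168, 0x32418]
SIZE = 0x72AF


def gaps(offsets, size):
    """Bytes left over between the end of each file and the start of the next."""
    return [nxt - (cur + size) for cur, nxt in zip(offsets, offsets[1:])]


def align_up(value, boundary):
    """Round value up to the next multiple of boundary (a power of two)."""
    return (value + boundary - 1) & ~(boundary - 1)


print(gaps(OFFSETS, SIZE))
print(hex(align_up(SIZE, 4)))
```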
Getting back on topic: the first bit of text you say is there in the file, and the next you grabbed from RAM, I presume. Sadly it seems to be nearly 2am, and figuring out a properly custom compression format is not something I should really be attempting, but if I leave you with the following (I tried to line things up) it might shine a light on some of the happenings. Those little sections are probably pointers and/or indicators for the repeated section. I have made no effort to decode any of it and I have not tried to figure out the flags.
<div class='codetop'>CODE</div><div class='codemain' style='height:200px;white-space:pre;overflow:auto'>Start of text (the basic section)
<!--coloro:#FF8C00--><span style="color:#FF8C00"><!--/coloro-->E3 08 E1 48<!--colorc--></span><!--/colorc--> 71 93 8E 0C B7 B0 AA BF 1D E0 30 D9 8D D8 E0 27 D7 E3 22 E1 47 <!--coloro:#9932CC--><span style="color:#9932CC"><!--/coloro-->E3 15 0B<!--colorc--></span><!--/colorc--> B2 AC B7 BF B2 D6 A6 8D E0 07 9D 73 7C A5 8E E1 5B E1 5B E3 0C E3 00
The second section as it appears in ram
<!--coloro:#FF8C00--><span style="color:#FF8C00"><!--/coloro-->E3 08 E1 48<!--colorc--></span><!--/colorc--> <!--coloro:#00BFFF--><span style="color:#00BFFF"><!--/coloro-->E4 7D E4 57 E1 4A E0 14 98 7A E1 4B A4 97 85<!--colorc--></span><!--/colorc--> <!--coloro:#9932CC--><span style="color:#9932CC"><!--/coloro-->E3 15 0B<!--colorc--></span><!--/colorc--> <!--coloro:#FF0000--><span style="color:#FF0000"><!--/coloro-->CE DC B7 BA E1 52 A4 0C 85 A1 87 7B A0 7D 88 E1 4F<!--colorc--></span><!--/colorc--> E3 0C 00
07 1E 1E (Random flag or something)
<!--coloro:#00BFFF--><span style="color:#00BFFF"><!--/coloro-->E4 7D E4 57 E1 4A E0 14 98 7A E1 4B A4 97 85<!--colorc--></span><!--/colorc--> (next part of file)
13 0E 22 * (another flag)
<!--coloro:#FF0000--><span style="color:#FF0000"><!--/coloro-->CE DC B7 BA E1 52 A4 0C 85 A1 87 7B A0 7D 88 E1 4F<!--colorc--></span><!--/colorc--> (more file)</div>
*An utterly redundant piece of compression but the first rule of this sort of thing is never assume that it is worthwhile when it comes to automated lossless compression.
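The lining up I did by eye above can be done mechanically. The sketch below takes the two byte strings from the box (the basic section and the RAM version, typed in by hand) and asks Python's difflib for the runs they share. It does not decode anything, it only finds the shared anchors the differing chunks sit between; `common_runs` is my own helper name.

```python
from difflib import SequenceMatcher

# The basic section as it sits in the file.
BASE = bytes.fromhex(
    "E3 08 E1 48 71 93 8E 0C B7 B0 AA BF 1D E0 30 D9 8D D8 E0 27 D7 E3 22 E1 47 "
    "E3 15 0B B2 AC B7 BF B2 D6 A6 8D E0 07 9D 73 7C A5 8E E1 5B E1 5B E3 0C E3 00"
)
# The second section as it appears in RAM.
RAM = bytes.fromhex(
    "E3 08 E1 48 E4 7D E4 57 E1 4A E0 14 98 7A E1 4B A4 97 85 "
    "E3 15 0B CE DC B7 BA E1 52 A4 0C 85 A1 87 7B A0 7D 88 E1 4F E3 0C 00"
)


def common_runs(a, b, min_len=3):
    """Runs of at least min_len bytes that difflib aligns between a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        a[m.a:m.a + m.size]
        for m in matcher.get_matching_blocks()
        if m.size >= min_len
    ]


for run in common_runs(BASE, RAM):
    print(run.hex(" ").upper())
```

The two runs it reports are the orange and purple bits, which is at least consistent with those being fixed markers and everything between them being swapped in.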
Looking at the new ASH files.
It mostly seems to be gibberish, but at offset 872 hex there is some ASCII.
The most common byte is 02, by almost double the next (0F), but I will leave that for later.
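For reference, a byte count like that takes a couple of lines to reproduce. A minimal sketch, with `top_bytes` being my own name for it:

```python
from collections import Counter


def top_bytes(data, n=2):
    """The n most common byte values in data as (hex string, count) pairs."""
    return [(f"{value:02X}", count) for value, count in Counter(data).most_common(n)]


# Usage on an extracted ASH file:
# ash = open("s03e12.ezJ", "rb").read()
# print(top_bytes(ash, 5))
```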
"Basically, the text letters are 1 byte each up to E0-E4 which then have 2 bytes (E3-15 for <enter>, E3-0c for !, etc.). After that, E5-FF don't seems to be used for text.
So I tracked down the dialogs bytes. "
Not quite sure, but if I understood it the table runs something like this:
01 to E0 or so will only be decoded as a single byte, whereas something starting with E0 through E4 will decode as 2 bytes. That is remarkably similar to properly implemented Unicode ( <a href="http://www.joelonsoftware.com/articles/Unicode.html" target="_blank">http://www.joelonsoftware.com/articles/Unicode.html</a> ) and, more to the point, chimes quite well with the NFTR character map.
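If that reading is right then splitting a string into its codes is simple enough. The sketch below assumes the rule exactly as quoted (a byte from E0 to E4 takes the following byte with it, everything else stands alone); the range is a parameter since I am not certain of the boundaries, and `split_codes` is just my name for it.

```python
def split_codes(data, lead_lo=0xE0, lead_hi=0xE4):
    """Split a byte string into 1-byte and 2-byte codes using the assumed lead range."""
    codes = []
    i = 0
    while i < len(data):
        # A lead byte pairs up with the next byte, provided there is one.
        if lead_lo <= data[i] <= lead_hi and i + 1 < len(data):
            codes.append(data[i:i + 2])
            i += 2
        else:
            codes.append(data[i:i + 1])
            i += 1
    return codes


sample = bytes.fromhex("E3 08 E1 48 71 93 E3 15 0B")
print([code.hex().upper() for code in split_codes(sample)])
```

Each resulting code can then be looked up in whatever table you end up building.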
crystaltile2 spat this at me
<div class='codetop'>CODE</div><div class='codemain' style='height:200px;white-space:pre;overflow:auto'>20=
21=!
22="
25=%
26=&
27='
28=(
29=)
2A=*
2B=+
2C=,
2D=-
2E=.
2F=/
30=0
31=1
32=2
33=3
34=4
35=5
36=6
37=7
38=8
39=9
3A=:
3B=;
3F=?
41=A
42=B
43=C
44=D
45=E
46=F
47=G
48=H
49=I
4A=J
4B=K
4C=L
4D=M
4E=N
4F=O
50=P
51=Q
52=R
53=S
54=T
55=U
56=V
57=W
58=X
59=Y
5A=Z
5C=\
5F=_
60=`
61=a
62=b
63=c
64=d
65=e
66=f
67=g
68=h
69=i
6A=j
6B=k
6C=l
6D=m
6E=n
6F=o
70=p
71=q
72=r
73=s
74=t
75=u
76=v
77=w
78=x
79=y
7A=z
00A1=¡
00B0=°
00BF=¿
00C0=À
00C1=Á
00C2=Â
00C4=Ä
00C7=Ç
00C8=È
00C9=É
00CA=Ê
00CB=Ë
00CC=Ì
00CD=Í
00CE=Î
00CF=Ï
00D1=Ñ
00D2=Ò
00D3=Ó
00D4=Ô
00D6=Ö
00D7=×
00D9=Ù
00DA=Ú
00DB=Û
00DC=Ü
00DF=ß
00E0=à
00E1=á
00E2=â
00E4=ä
00E7=ç
00E8=è
00E9=é
00EA=ê
00EB=ë
00EC=ì
00ED=í
00EE=î
00EF=ï
00F1=ñ
00F2=ò
00F3=ó
00F4=ô
00F6=ö
00F9=ù
00FA=ú
00FB=û
00FC=ü
0152=Œ
0153=œ
201C=“
201D=”
201E=„
2026=…
203B=※
2160=Ⅰ
2161=Ⅱ
2162=Ⅲ
2164=Ⅴ
2169=Ⅹ
2190=←
2191=↑
2192=→
2193=↓
21D2=⇒
21D4=⇔
25CB=○
2606=☆
266A=♪
3000=
3001=、
3002=。
300C=「
300D=」
300E=『
300F=』
3041=ぁ
3042=あ
3043=ぃ
3044=い
3045=ぅ
3046=う
3047=ぇ
3048=え
3049=ぉ
304A=お
304B=か
304C=が
304D=き
304E=ぎ
304F=く
3050=ぐ
3051=け
3052=げ
3053=こ
3054=ご
3055=さ
3056=ざ
3057=し
3058=じ
3059=す
305A=ず
305B=せ
305C=ぜ
305D=そ
305E=ぞ
305F=た
3060=だ
3061=ち
3062=ぢ
3063=っ
3064=つ
3065=づ
3066=て
3067=で
3068=と
3069=ど
306A=な
306B=に
306C=ぬ
306D=ね
306E=の
306F=は
3070=ば
3071=ぱ
3072=ひ
3073=び
3074=ぴ
3075=ふ
3076=ぶ
3077=ぷ
3078=へ
3079=べ
307A=ぺ
307B=ほ
307C=ぼ
307D=ぽ
307E=ま
307F=み
3080=む
3081=め
3082=も
3083=ゃ
3084=や
3085=ゅ
3086=ゆ
3087=ょ
3088=よ
3089=ら
308A=り
308B=る
308C=れ
308D=ろ
308F=わ
3092=を
3093=ん
309B=゛
309C=゜
30A1=ァ
30A2=ア
30A3=ィ
30A4=イ
30A5=ゥ
30A6=ウ
30A7=ェ
30A8=エ
30A9=ォ
30AA=オ
30AB=カ
30AC=ガ
30AD=キ
30AE=ギ
30AF=ク
30B0=グ
30B1=ケ
30B2=ゲ
30B3=コ
30B4=ゴ
30B5=サ
30B6=ザ
30B7=シ
30B8=ジ
30B9=ス
30BA=ズ
30BB=セ
30BC=ゼ
30BD=ソ
30BE=ゾ
30BF=タ
30C0=ダ
30C1=チ
30C2=ヂ
30C3=ッ
30C4=ツ
30C5=ヅ
30C6=テ
30C7=デ
30C8=ト
30C9=ド
30CA=ナ
30CB=ニ
30CC=ヌ
30CD=ネ
30CE=ノ
30CF=ハ
30D0=バ
30D1=パ
30D2=ヒ
30D3=ビ
30D4=ピ
30D5=フ
30D6=ブ
30D7=プ
30D8=ヘ
30D9=ベ
30DA=ペ
30DB=ホ
30DC=ボ
30DD=ポ
30DE=マ
30DF=ミ
30E0=ム
30E1=メ
30E2=モ
30E3=ャ
30E4=ヤ
30E5=ュ
30E6=ユ
30E7=ョ
30E8=ヨ
30E9=ラ
30EA=リ
30EB=ル
30EC=レ
30ED=ロ
30EF=ワ
30F2=ヲ
30F3=ン
30FB=・
30FC=ー
4E0A=上
4E0B=下
4E0D=不
4E16=世
4E2D=中
4E57=乗
4E88=予
4E8B=事
4EA4=交
4EBA=人
4ED5=仕
4EE3=代
4EE4=令
4EF2=仲
4F11=休
4F1A=会
4F1D=伝
4F4D=位
4F53=体
4F5C=作
4F7F=使
4FE1=信
4FEE=修
5024=値
5099=備
512A=優
5149=光
5165=入
5168=全
5175=兵
5177=具
51A5=冥
51FA=出
5206=分
5229=利
524D=前
5263=剣
529B=力
529F=功
52A0=加
52B9=効
52C7=勇
52D5=動
52DD=勝
5316=化
5317=北
534A=半
539F=原
53C2=参
53CB=友
53E4=古
5408=合
540D=名
5439=吹
546A=呪
5473=味
547D=命
54C1=品
54E1=員
5668=器
56DE=回
56F3=図
5730=地
57CE=城
5834=場
5897=増
58C1=壁
58EB=士
58F2=売
5927=大
5929=天
5931=失
5B88=守
5B9A=定
5B9D=宝
5BA2=客
5BA4=室
5BC6=密
5BFE=対
5C01=封
5C06=将
5C11=少
5C71=山
5CB8=岸
5CF6=島
5D16=崖
5DE8=巨
5DEE=差
5E1D=帝
5E2B=師
5E73=平
5E74=年
5F37=強
5F8C=後
5FA9=復
5FC3=心
601D=思
6027=性
606F=息
614B=態
6210=成
6211=我
6226=戦
6240=所
624B=手
6280=技
63A2=探
63DB=換
6483=撃
653B=攻
6557=敗
6570=数
6574=整
6575=敵
6587=文
65AC=斬
65AD=断
65B0=新
65B9=方
65C5=旅
65CF=族
65E5=日
660E=明
6642=時
6697=暗
6700=最
6708=月
6728=木
672C=本
6797=林
679C=果
6A29=権
6A5F=機
6B66=武
6B7B=死
6C17=気
6C34=水
6C5D=汝
6C7A=決
6D3B=活
6D77=海
6E08=済
6E1B=減
706B=火
708E=炎
70B9=点
7121=無
7279=特
7363=獣
738B=王
7406=理
751F=生
7528=用
7537=男
753A=町
753B=画
754C=界
75C5=病
767A=発
7687=皇
76EE=目
76F8=相
771F=真
7740=着
77E5=知
7834=破
795E=神
79C1=私
7A2E=種
7A7A=空
7ADC=竜
7B2C=第
7BB1=箱
7CFB=系
7D1A=級
7D4C=経
7D50=結
7DE8=編
7DF4=練
8001=老
8005=者
8010=耐
8056=聖
81EA=自
8239=船
884C=行
88C2=裂
88C5=装
89AA=親
8A00=言
8A08=計
8A3C=証
8A66=試
8A71=話
8AAC=説
8ABF=調
8B0E=謎
8CA0=負
8CA9=販
8CB7=買
8CDE=賞
8DE1=跡
8E0A=踊
8EAB=身
8ECA=車
8ECD=軍
901A=通
9023=連
9053=道
9078=選
907A=遺
90AA=邪
914D=配
91D1=金
9577=長
9593=間
95A2=関
95D8=闘
968E=階
96E8=雨
96EA=雪
9727=霧
9810=預
98A8=風
98DB=飛
99AC=馬
9A0E=騎
9A13=験
9AD8=高
9B54=魔
9CE5=鳥
9ED2=黒
FF01=!
FF05=%
FF06=&
FF1A=:
FF1D==
FF1F=?
FF21=A
FF22=B
FF23=C
FF5E=~
FF74=エ
FF75=オ
FF76=カ
FF77=キ
FF78=ク
FF79=ケ
FF7A=コ
FF7B=サ
FF7C=シ
FF7D=ス
FF7E=セ
FF7F=ソ
FF80=タ
FF81=チ
FF82=ツ
FF83=テ
FF84=ト
FF85=ナ
FF86=ニ
FF87=ヌ
FF88=ネ
FF89=ノ
FF8A=ハ
FF8B=ヒ
FF8C=フ
FF93=モ
FF94=ヤ
FF95=ユ
FF96=ヨ
FF97=ラ
FF98=リ
FF99=ル
FF9A=レ
FF9B=ロ
FF9C=ワ</div>
A slightly tweaked NFTR picture (tweaked for clarity)
<img src="http://pix.gbatemp.net/32303/dqmj21616font.jpg" border="0" class="linked-image" />
ü is 00FC. Some of the more exotic characters got broken, but it should not be a problem.
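Since the crystaltile2 list is just HEX=character lines with the hex being the Unicode code point, it can be read into a lookup and the broken entries picked out automatically. A sketch, assuming the list has been saved as plain text; the function names are mine and the sample below is a handful of lines in the same shape as the list, written with escapes so the forum does not mangle them.

```python
def parse_table(text):
    """Read HEX=character lines into a dict of code point to character."""
    table = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        # Split on the first '=' only, so an entry whose character is '=' survives.
        key, glyph = line.split("=", 1)
        code = int(key.strip(), 16)
        # An empty right-hand side (e.g. a stripped space) falls back to the code point.
        table[code] = glyph if glyph else chr(code)
    return table


def mismatches(table):
    """Code points whose stored character is not the character at that code point."""
    return [
        code for code, glyph in table.items()
        if len(glyph) != 1 or ord(glyph) != code
    ]


SAMPLE = "20=\n41=A\n00FC=\u00fc\nFF1D==\nFF74=\u30a8"
table = parse_table(SAMPLE)
print([f"{code:04X}" for code in mismatches(table)])
```

Anything it flags is an entry where the character got swapped for a lookalike on the way through, such as the halfwidth katakana row coming out as fullwidth.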