ChepChep sent me some 16 color IMYs, so I've got the IMY decoding totally sorted now. I just need to take care of the re-encoding.
I'll make the code for IMY -> PNG -> IMY available once I get the FakeCompression and re-encoding sorted.
For those interested, 0x00 - 0x1F is the header.
0x00 - 0x02 is the file type signature: IMY in ASCII
0x04 - 0x06 is unknown data that seems to be related to the content (0x03 and 0x07 may be included in this, but I haven't seen any files where they are non-00 yet)
0x08 - 0x09 is Width, stored Little Endian (for a 16 color image the final width is twice this value, but that's done after pixel deinterlacing)
0x0A - 0x0B is unknown data (A4 04 appears to indicate that an IMY has multiple sections contained in it, e.g. icons, fonts, etc.; A4 08 seems to indicate that an IMY is a single image)
0x0C - 0x0D is Height, stored Little Endian
0x0E - 0x0F is the number of Palette colors, stored Little Endian (16 or 256)
0x10 - 0x1F appears to be all 00s
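For anyone who wants to poke at the header themselves, here's a rough sketch of reading the fields listed above in Python. The names (ImyHeader, parse_imy_header, unknown_03_07, unknown_0a_0b) are just ones I made up for illustration, and the unknown bytes are kept raw rather than interpreted.

```python
import struct
from typing import NamedTuple


class ImyHeader(NamedTuple):
    unknown_03_07: bytes   # 0x03 - 0x07, meaning unknown
    width: int             # 0x08 - 0x09, little endian
    unknown_0a_0b: bytes   # 0x0A - 0x0B, e.g. A4 04 or A4 08
    height: int            # 0x0C - 0x0D, little endian
    palette_colors: int    # 0x0E - 0x0F, little endian (16 or 256)


def parse_imy_header(data: bytes) -> ImyHeader:
    """Read the 0x20-byte header using the offsets from the post."""
    if len(data) < 0x20:
        raise ValueError("file is shorter than the 0x20-byte header")
    if data[0x00:0x03] != b"IMY":
        raise ValueError("missing IMY signature")
    (width,) = struct.unpack_from("<H", data, 0x08)
    (height,) = struct.unpack_from("<H", data, 0x0C)
    (colors,) = struct.unpack_from("<H", data, 0x0E)
    return ImyHeader(data[0x03:0x08], width, data[0x0A:0x0C], height, colors)
```

For a 16 color file, remember that the width returned here is still half the final pixel width.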
For 16 color palette
0x20 - 0x5F is Palette
For 256 color palette
0x20 - 0x041F is Palette (256 colors x 4 bytes = 0x400 bytes)
In either case these are RGBA8888 colors, stored Big Endian
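A minimal sketch of pulling the palette out, assuming "RGBA8888, stored Big Endian" means the bytes sit in the file in R, G, B, A order. read_palette is a hypothetical helper name, not something from my actual code.

```python
from typing import List, Tuple

PALETTE_START = 0x20  # palette follows directly after the header


def read_palette(data: bytes, num_colors: int) -> List[Tuple[int, ...]]:
    """Return num_colors (R, G, B, A) tuples starting at offset 0x20."""
    end = PALETTE_START + 4 * num_colors  # 4 bytes per RGBA8888 entry
    if len(data) < end:
        raise ValueError("file is too short to hold the palette")
    # Assumption: byte order in the file is R, G, B, A.
    return [tuple(data[i:i + 4]) for i in range(PALETTE_START, end, 4)]
```

With 16 colors the palette ends at 0x5F, and with 256 colors it ends at 0x041F, so whatever comes next starts at 0x60 or 0x0420.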
After the palette, the next 2 bytes are either the jump offset from the end of those 2 bytes to the start of the compressed data, or 00 00, indicating that the following 4 bytes contain the jump offset, measured from the end of those 4 bytes. The offset is stored Little Endian in either case. I may be wrong on the second part, since the assembly code there seems to be a very complicated way of doing that, so my byte ordering may be a little out.
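Here's how I'd sketch the jump offset lookup in Python. The 2-byte case follows what's described above; the 4-byte case is my best reading and the byte order there is an assumption, so treat it with suspicion. jump_target is a made-up name.

```python
import struct


def jump_target(data: bytes, palette_end: int) -> int:
    """Return the absolute file offset that the jump offset points to.

    palette_end is the offset of the first byte after the palette
    (0x60 for 16 colors, 0x420 for 256 colors).
    """
    (short_offset,) = struct.unpack_from("<H", data, palette_end)
    if short_offset != 0:
        # Offset is measured from the end of the 2-byte field.
        return palette_end + 2 + short_offset
    # Assumption: the next 4 bytes are a little endian offset measured
    # from the end of those 4 bytes. This part is unconfirmed.
    (long_offset,) = struct.unpack_from("<I", data, palette_end + 2)
    return palette_end + 6 + long_offset
```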
After that is a sequence of 1 byte command blocks. Each command does one of three things: use 1-16 half words of uncompressed data; reference an already decoded half word neighbor, either on the previous row or to the left (since the other neighbors aren't known yet); or reference 1-16 key blocks. Key blocks are either stored after the instructions and before the uncompressed data, or refer to previous nearby uncompressed data. There's a limit on how far back you can go, since the command block also sets the negative offset. Commands are processed until the output is equal to height x width, so there's no need for a terminating command.
After that come the key blocks (optionally)*, followed by the uncompressed data.
*I'm not sure if key blocks are ever actually used or if that command only ever refers to nearby uncompressed data, but it's theoretically possible. My gut feeling is that it's probably not used, since storing the first instance as uncompressed data and then referencing that seems to always be at least as efficient.
After the data is decompressed, the pixels have to be deinterlaced (Skyblade describes this on page 5 and his code is exactly correct). The color palette doesn't seem to be interlaced though; it seems to be stored in standard PNG palette format, so I think he made a mistake somewhere on that one.
In the case of 16 color / 4 bitDepth images, there's one more step. Each "byte" is actually 2 pixels of 4 bits each, stored "Little Endian", and the real width is double that given in the header. So the width needs to be doubled and each byte translated into 2 bytes, each containing the equivalent of a 4-bit value (since palette indexes in a PNG are always 8 bits even if the palette is only 16 colors).
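The 4-bit expansion can be sketched like this. I'm assuming "Little Endian" here means the low nibble is the first (left) pixel and the high nibble is the second; swap the two appends if your images come out with pixel pairs flipped. unpack_4bit is a hypothetical name.

```python
def unpack_4bit(packed: bytes) -> bytes:
    """Expand each byte into two 8-bit palette indexes (values 0-15)."""
    out = bytearray()
    for b in packed:
        out.append(b & 0x0F)  # assumed first pixel: low nibble
        out.append(b >> 4)    # assumed second pixel: high nibble
    return bytes(out)
```

Run this on the deinterlaced data, then treat the result as rows of 2 x header width pixels.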