Lots more info, especially about DATA32 mode. This should make it possible to get DATA32_SUPPORT working (at least for reading; I haven't tested writing yet).
First off, selecting DATA32 mode requires setting both 40048D8h.bit1 and 4004900h.bit1. For DATA16 mode, both bits should probably be cleared, but DATA16 also seems to work when only one of the bits is zero (I haven't found any difference there; it doesn't seem to matter which bit is cleared, or whether both are cleared).
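A minimal sketch of that mode selection in C. The register pointers are passed in rather than hard-coded so the logic can be tried on a host. The function name, the `DATA32_SEL` constant, and the 16-bit read-modify-write access are assumptions, not something the tests above confirm.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit 1 of both 40048D8h and 4004900h (constant name is hypothetical). */
#define DATA32_SEL (1u << 1)

/* Select DATA32 (enable=true) or DATA16 (enable=false).
 * On hardware: ctl -> 40048D8h, irq32 -> 4004900h.
 * Assumption: a 16-bit read-modify-write of both registers is harmless. */
static void sd_select_data32(volatile uint16_t *ctl,
                             volatile uint16_t *irq32,
                             bool enable)
{
    if (enable) {
        /* DATA32 needs the bit set in BOTH registers. */
        *ctl   = (uint16_t)(*ctl   | DATA32_SEL);
        *irq32 = (uint16_t)(*irq32 | DATA32_SEL);
    } else {
        /* Clearing one bit seemed to be enough for DATA16;
         * clearing both is the conservative choice. */
        *ctl   = (uint16_t)(*ctl   & ~DATA32_SEL);
        *irq32 = (uint16_t)(*irq32 & ~DATA32_SEL);
    }
}
```

On hardware this would be called as `sd_select_data32((volatile uint16_t *)0x040048D8, (volatile uint16_t *)0x04004900, true);`.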
Next, in DATA32 mode, one should use the two IRQ flags in 4004900h instead of the RXRDY, TXRQ, DATAEND, CMD_BUSY bits in 400481Ch. Those bits do still exist, but their meaning is somewhat changed. For example, DATAEND and CMD_BUSY get toggled before the last block has been read.
Below are 400481Ch (8 digits) and 4004900h (4 digits) values, logged before sending the command, and before/during/after reading data, showing the differences between DATA32 and DATA16 mode.
When BLK_COUNT=1 (single block):
Code:
;data32 mode:                              ;data16 mode:
; 20800421 1802                            ; 20800421 1800
;---
; 20800425 1B02 DATAEND + FLG32's          ; 41800421 1800 RXRDY+CMDBUSY
; 20800425 1B02                            ; 41800421 1800
; 20800425 1A02 bit8 cleared               ; 41800421 1800
; 20800425 1802 bit9 cleared               ; 21800425 1800 DATAEND + CMDrdy
;---
; 20800421 1802 DATAEND acked              ; 20800421 1800
When BLK_COUNT=2 (two blocks):
Code:
;data32 mode:                              ;data16 mode:
; 20800421 1802                            ; 20800421 1800
;---
; 41800421 1B02 RXRDY+FLG32's              ; 41800421 1800 RXRDY+CMDBUSY
; 41800421 1B02                            ; 41800421 1800
; 41800421 1A02 bit8 cleared               ; 41800421 1800
; 20800425 1B02 bit8 set, DATAEND, NUM=1   ; 41800421 1800 NUM=2(not1)
;---
; 20800425 1B02                            ; 41800421 1800
; 20800425 1B02                            ; 41800421 1800
; 20800425 1A02 bit8 cleared               ; 41800421 1800
; 20800425 1802 bit9 cleared, NUM=1(not0)  ; 21800425 1800 DATAEND + CMDrdy, NUM=2(not0)
;---
; 20800421 1802 DATAEND acked
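To make the logged values easier to read, here is a small C decoder for one 400481Ch/4004900h pair. The positions of RXRDY (bit24) and DATAEND (bit2) are inferred from which digits change in the logs; all macro, struct and function names are hypothetical labels.

```c
#include <stdint.h>
#include <stdbool.h>

/* 400481Ch bits (RXRDY/DATAEND positions inferred from the logs) */
#define ST_DATAEND   (1u << 2)
#define ST_RXRDY     (1u << 24)
#define ST_CMD_READY (1u << 29)  /* tentative name */
#define ST_CMD_BUSY  (1u << 30)
/* 4004900h bits */
#define F32_SELECT   (1u << 1)   /* DATA32 mode selected */
#define F32_RX       (1u << 8)   /* use instead of RXRDY */
#define F32_BUSY     (1u << 9)   /* use instead of DATAEND/CMD_BUSY */

typedef struct {
    bool rxrdy, dataend, cmd_ready, cmd_busy;  /* from 400481Ch */
    bool data32, flag8, flag9;                 /* from 4004900h */
} sd_flags;

/* Split one logged value pair into named flags. */
static sd_flags sd_decode(uint32_t status, uint32_t irq32)
{
    sd_flags f;
    f.rxrdy     = (status & ST_RXRDY)     != 0;
    f.dataend   = (status & ST_DATAEND)   != 0;
    f.cmd_ready = (status & ST_CMD_READY) != 0;
    f.cmd_busy  = (status & ST_CMD_BUSY)  != 0;
    f.data32    = (irq32  & F32_SELECT)   != 0;
    f.flag8     = (irq32  & F32_RX)       != 0;
    f.flag9     = (irq32  & F32_BUSY)     != 0;
    return f;
}
```

For example, `sd_decode(0x41800421, 0x1B02)` gives RXRDY and CMD_BUSY set, DATAEND clear, and both DATA32 flags set, matching the "RXRDY+FLG32's" row.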
The bottom line: use 4004900h.bit8 instead of RXRDY for reading (and probably also instead of TXRQ for writing, but I haven't tested that yet), and use 4004900h.bit9 instead of DATAEND/CMD_BUSY. Or you could combine them: check DATAEND and CMD_BUSY as usual, but also ensure 4004900h.bit9=0 before treating the transfer as completed.
Note that 4004900h.bit8/9 are cleared automatically by hardware (unlike RXRDY, TXRQ, DATAEND, which must be acknowledged manually).
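The combined check can be sketched in C as two predicates over the raw register values. The DATAEND position (bit2) is inferred from the logs, and the macro and function names are hypothetical. This covers reading only, since writing is untested.

```c
#include <stdint.h>
#include <stdbool.h>

#define ST_DATAEND   (1u << 2)   /* 400481Ch, position inferred from the logs */
#define ST_CMD_BUSY  (1u << 30)  /* 400481Ch */
#define F32_RX       (1u << 8)   /* 4004900h.bit8 */
#define F32_BUSY     (1u << 9)   /* 4004900h.bit9 */

/* DATA32 read: data is ready to fetch while 4004900h.bit8 is set. */
static bool data32_can_read(uint32_t irq32)
{
    return (irq32 & F32_RX) != 0;
}

/* Usual DATAEND/CMD_BUSY test, plus 4004900h.bit9=0.
 * In DATA16 mode bit9 stays 0, so the same check works there too. */
static bool sd_transfer_done(uint32_t status, uint32_t irq32)
{
    bool dataend = (status & ST_DATAEND)  != 0;
    bool cmdidle = (status & ST_CMD_BUSY) == 0;
    bool drained = (irq32  & F32_BUSY)    == 0;
    return dataend && cmdidle && drained;
}
```

With the logged pair `20800425 1B02` (DATAEND already set, but data still unread) `sd_transfer_done` returns false; it only returns true once the pair reads `20800425 1802`.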
Another odd effect is that DATA32_BLK_COUNT is decremented during a DATA32 transfer, except after the last block: at that point the transfer is completed, but without decreasing the counter from 0001h to 0000h (i.e. it stays set to 0001h). By contrast, DATA16_BLK_COUNT isn't decremented at all (DATA16 mode apparently uses an internal counter register that isn't visible via I/O ports).
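As a tiny C model of that counter behaviour: the expected DATA32_BLK_COUNT readback after some number of completed blocks. This only reflects the 1-block and 2-block observations; larger counts are an extrapolation, a programmed count of 0 is not covered at all, and the function name is hypothetical.

```c
#include <stdint.h>

/* Expected DATA32_BLK_COUNT value after `done` blocks of a
 * `count`-block transfer have been read. */
static uint16_t data32_blk_count_after(uint16_t count, uint16_t done)
{
    if (count == 0)
        return 0;                     /* untested case, placeholder only */
    if (done >= count)
        return 1;                     /* final 0001h -> 0000h step never happens */
    return (uint16_t)(count - done);  /* one decrement per block before that */
}
```

The practical consequence is that code should not wait for the counter to reach 0000h as an end-of-transfer condition.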
And one small detail: 400481Ch.bit29 appears to be just the inverse of 400481Ch.bit30, i.e.
400481Ch.bit29 = CMD_READY (?)
400481Ch.bit30 = CMD_BUSY
I don't know whether that rule always applies, or whether the two bits are toggled at exactly the same time.
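One way to test the rule against more logged samples is a check like the following (function name hypothetical). It flags any 400481Ch value where bit29 and bit30 are equal; it can't show whether the two bits toggle at exactly the same time, only whether a sample ever catches them in the same state.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if 400481Ch.bit29 is the inverse of 400481Ch.bit30 in this sample. */
static bool cmd_bits_are_inverse(uint32_t status)
{
    uint32_t ready = (status >> 29) & 1u;  /* CMD_READY (?) */
    uint32_t busy  = (status >> 30) & 1u;  /* CMD_BUSY */
    return ready != busy;
}
```

All three distinct 400481Ch values in the logs above (20800421h, 41800421h, 21800425h) pass this check.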
Looking at the behaviour of 4004900h.bit8/9 versus RXRDY, TXRQ, DATAEND, CMD_BUSY in DATA32 mode, it looks rather as if DATA32 mode transfers the incoming data to a FIFO (and toggles DATAEND/CMD_BUSY once writing to the FIFO has completed, but before the CPU starts reading the FIFO).
If that's right, then DATA16 mode might work without a FIFO, i.e. by directly reading halfwords from the SD/MMC serial bus as they arrive. That could work if the hardware pauses the CLK signal when the CPU is reading too slowly, or applies WAITs when the CPU is reading too fast; though that could mean really huge WAITs, as the SD/MMC clock can be configured to very slow settings of only some kilobits/second.
Checking a hardware timer before/after reading could be used to confirm whether WAITs are occurring. Measuring the CLK signal would also be interesting, to see if CLK gets paused between separate halfwords.
Anybody ready to do some scope tests (or donate the loads of spare microSD adaptors that you've hoarded in your stash)?