If the ZLT feature is enabled and the size passed to usbDsEndpoint_PostBufferAsync() is aligned to the endpoint packet size, a ZLT packet will be issued.
Otherwise, no ZLT packet is sent, even if the feature is enabled on the target endpoint. According to the USB bulk transfer specs, ZLT packets are not needed for unaligned block sizes, because such a transfer already ends with a short packet.
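To make the rule above concrete, here is a minimal sketch in C. `zlt_will_be_sent()` is a hypothetical helper, not a libnx function, and it assumes the caller already knows the packet size of the endpoint that is actually in use. Zero-sized transfers are deliberately not covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of libnx): predicts whether a ZLT packet
 * would follow a transfer of `size` bytes, based on the rule described in
 * this comment. `max_packet_size` is assumed to be the packet size of the
 * endpoint descriptor selected by the host. */
bool zlt_will_be_sent(bool zlt_enabled, size_t size, size_t max_packet_size)
{
    assert(max_packet_size != 0);
    assert(size != 0); /* zero-sized transfers are out of scope here */

    /* A ZLT is only issued when the feature is on AND the size is aligned. */
    return zlt_enabled && (size % max_packet_size) == 0;
}
```

For example, with a 512-byte packet size, a 1024-byte transfer would be followed by a ZLT if the feature is enabled, while a 1000-byte transfer would not.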
I just wanted to clear that up for anyone else reading this thread.
Use IDsEndpoint command 5 (SetZeroLengthTerminate) to configure the endpoint to send zero-length packets appropriately.
This may not always be desirable behaviour: if the host doesn't account for a ZLT packet being sent by the Switch, a timeout error will be triggered on the Switch. This is because most USB backends require you to use a bigger transfer length (at least +1 byte) if a ZLT packet is to be expected.
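On the host side, the "+1 byte" rule of thumb could be sketched like this. `host_read_length()` is a hypothetical helper, not an API from any USB backend, and it assumes the host knows both the expected payload size and whether the Switch has ZLT enabled on that endpoint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side helper: returns the length to request for a bulk
 * read so that a trailing ZLT packet, if one is expected, is consumed as
 * part of the same transfer. The "+1" follows the rule of thumb mentioned
 * above; it is an assumption, not a documented requirement of a specific
 * USB backend. */
size_t host_read_length(size_t expected_size, size_t max_packet_size, bool zlt_enabled)
{
    assert(max_packet_size != 0);

    /* A ZLT only follows non-empty, packet-aligned transfers. */
    bool zlt_expected = zlt_enabled
                     && expected_size != 0
                     && (expected_size % max_packet_size) == 0;

    return zlt_expected ? expected_size + 1 : expected_size;
}
```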
If you're dealing with file transfers and your transfer block size is aligned to the endpoint packet size, then you have to ask yourself if sending a ZLT packet in each loop iteration (except, probably, for the last one) is really the best way to go.
I faced this very same issue while developing the USB transfer protocol for the ongoing nxdumptool rewrite. The best solution I could come up with to avoid sending ZLT packets right after each and every aligned data block was making the host script report back the endpoint packet size from the selected device descriptor during the initial handshake.
This way, it is possible to accurately know when and where to send a ZLT. I agree there should be a way to know which device descriptor / speed was selected by the USB host, though.
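As a rough illustration of the idea, the per-block decision could look like the sketch below. `block_needs_zlt()` is a hypothetical helper, and the policy it encodes (a ZLT only after the final block, and only when that block is aligned) is an assumption for this example, not necessarily what nxdumptool does. `max_packet_size` stands for the value the host reported during the handshake.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: decides whether ZLT should be enabled before posting
 * one block of a file transfer.
 *   total_size      - size of the whole file
 *   offset          - bytes already sent before this block
 *   block_size      - size of the block about to be sent
 *   max_packet_size - endpoint packet size reported back by the host
 * Assumed policy: only the last block may be followed by a ZLT, and only
 * if its size is a multiple of the packet size. */
bool block_needs_zlt(size_t total_size, size_t offset, size_t block_size,
                     size_t max_packet_size)
{
    assert(max_packet_size != 0);
    assert(offset <= total_size);
    assert(block_size != 0 && block_size <= total_size - offset);

    bool is_last_block = (offset + block_size == total_size);
    return is_last_block && (block_size % max_packet_size) == 0;
}
```

The result would then be used to toggle the endpoint's ZLT setting (SetZeroLengthTerminate) before each call to usbDsEndpoint_PostBufferAsync(), so intermediate aligned blocks don't trigger a ZLT.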