So, as said before, I will point you to the functions you have to call, in order, to get sound working.
The first thing to call, of course, is the initialisation of AX:
https://github.com/dimok789/loadiin...htly-8f0f7a8/src/sounds/SoundHandler.cpp#L225
Use the init function that takes parameters; the other one is old and should not be used anymore with sound lib 2.
After that you have to register a callback function in the AX frame cycle.
https://github.com/dimok789/loadiin...htly-8f0f7a8/src/sounds/SoundHandler.cpp#L242
Just some C function that will be called every 3 ms. In this callback you have to handle your voices, which you still have to create (you can, of course, create the voices before registering the AX frame callback).
Here is my frame callback:
https://github.com/dimok789/loadiin...htly-8f0f7a8/src/sounds/SoundHandler.cpp#L279
(Don't worry about it being C++; it's a static function in a class, which works very much like a C function, just with a namespace in front.)
So, as you can see, you will have to build a state machine in that function that handles all your voices: check each voice's current status and process it accordingly. Don't forget to protect the voices with a mutex if you access them from another thread; the AX callback runs on its own thread.
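The state machine above can be sketched roughly like this. The Voice struct, the states and the callback name here are stand-ins I made up for illustration, not the real AX types; the comments point out where the real AX calls would go:

```c
/* Sketch of a per-frame voice state machine, as described above.
   Voice, VoiceState and axFrameCallback are made-up stand-ins. */
#include <stdio.h>

typedef enum { VOICE_FREE, VOICE_PLAYING, VOICE_FINISHED } VoiceState;

typedef struct {
    VoiceState state;
    int loopCount;      /* mirrors what AXGetVoiceLoopCount() would return */
    int lastLoopCount;  /* last value we saw, used to detect buffer switches */
} Voice;

#define MAX_VOICES 4
static Voice voices[MAX_VOICES];

/* Called every AX frame (~3 ms). In real code, guard `voices` with a
   mutex, because the AX callback runs on its own thread. */
static void axFrameCallback(void)
{
    for (int i = 0; i < MAX_VOICES; i++) {
        Voice *v = &voices[i];
        switch (v->state) {
        case VOICE_PLAYING:
            if (v->loopCount != v->lastLoopCount) {
                /* the voice jumped to the other buffer: refill the old
                   one and update the end/loop offsets here */
                v->lastLoopCount = v->loopCount;
            }
            break;
        case VOICE_FINISHED:
            /* stop and release: AXSetVoiceState(v, 0); AXFreeVoice(v); */
            v->state = VOICE_FREE;
            break;
        case VOICE_FREE:
        default:
            break;
        }
    }
}
```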
OK, now to the point of how to acquire a voice.
https://github.com/dimok789/loadiine_gx2/blob/Loadiine-nightly-8f0f7a8/src/sounds/Voice.h#L47
This is just a struct with a few internal states. I treat it as a void* because I don't care about its internal variables (which isn't very "clean"). Voice priorities go from 1 to 31, where 1 is the highest priority.
Once you have acquired a voice, you have to define its type: either a streaming voice or a normal voice. I always use normal voices, as I can do streaming with them as well.
Once you have defined the type of the voice, you can set its volume (AXSetVoiceVe) and mix (AXSetVoiceDeviceMix). The volume is held in the upper 16 bits of the uint you see in that code; the maximum is 0x8000. As for mixing, you can do some crazy mixing of up to 6 channels. I just use the two channels of the stereo output in one voice at full volume, which is the simplest form of mixing. That also means both channels play exactly the same samples, so it is only "mono" in my code. For real stereo you would need to acquire two voices and feed each one the samples of its channel. I also set up the same mix for the TV (device = 0) and the DRC (device = 1), so one voice actually drives 4 channels. You can of course assign each channel its own voice and play different sounds on them.
That's all you actually need to set up the voices.
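Just to make the volume packing concrete: the 16-bit volume value (0x8000 = full volume) sits in the upper half of a 32-bit word. The helper name below is made up for illustration:

```c
/* Packs a 16-bit volume (0x8000 = full volume) into the upper 16 bits
   of a 32-bit word, as described above. packVolume is a made-up name. */
#include <stdint.h>

static uint32_t packVolume(uint16_t vol)   /* vol: 0 .. 0x8000 */
{
    return (uint32_t)vol << 16;            /* upper 16 bits carry the volume */
}
```

So full volume packs to 0x80000000 and half volume (0x4000) packs to 0x40000000.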
Now to start playing the voice with your samples. The first step is to set the buffer you want to play. This is set by the AXSetVoiceOffsets function in my code.
https://github.com/dimok789/loadiine_gx2/blob/Loadiine-nightly-8f0f7a8/src/sounds/Voice.h#L79
It takes a struct as an argument that defines the sample buffer, its format, the offsets in the buffer to play, and whether it should loop. Here is the struct:
https://github.com/dimok789/loadiine_gx2/blob/Loadiine-nightly-8f0f7a8/src/sounds/Voice.h#L155
The sample format should be linear PCM (big-endian s8 or s16 samples). The format field is 0x0A for 16-bit samples or 0x19 for 8-bit samples. If you just want the samples to be played once, set the loop parameter to 0, otherwise to 1. Then you define the current position and end position in sample counts (so in 8-bit or 16-bit steps; I always use 16-bit samples, i.e. 2 bytes per sample).

You can also define a loop offset where playback should continue once it has reached the end sample. The offset is counted in samples from the start, but it can point to any location in memory and does not have to sit directly behind the first buffer. With this you can allocate a 2nd buffer, fill it with samples, and make the voice jump to the start of the 2nd buffer as soon as the end sample of the first buffer is reached. Once it switches to the 2nd buffer, you change the end sample to the end of the 2nd buffer and the loop offset back to the start of the 1st buffer. While the 2nd buffer is playing, you refill the samples of the first buffer. When the 2nd buffer finishes and its end sample is reached, playback jumps back to the start of the 1st buffer. This can go on as long as you want, giving you an endless, gapless stream, and that is exactly what I do there: in each AX frame callback I check the loop counter of the voice with AXGetVoiceLoopCount(voice), and if it has changed I know the voice switched to a new buffer. In that case I change the end sample of the voice with AXSetVoiceEndOffset() and then the loop offset with AXSetVoiceLoopOffset().
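The double-buffer bookkeeping above can be sketched like this. FakeVoice and updateStream are stand-ins I invented for illustration; the comments mark where AXSetVoiceEndOffset(), AXSetVoiceLoopOffset() and AXGetVoiceLoopCount() would be used. This sketch also assumes the two buffers sit back-to-back in memory, which (as said above) is not actually required:

```c
/* Sketch of the double-buffer streaming logic described above.
   Offsets are in samples; FakeVoice is a made-up stand-in type. */
#include <stdint.h>

#define BUF_SAMPLES 1024

typedef struct {
    uint32_t endOffset;   /* what AXSetVoiceEndOffset()  would set */
    uint32_t loopOffset;  /* what AXSetVoiceLoopOffset() would set */
    uint32_t loopCount;   /* what AXGetVoiceLoopCount()  would read */
} FakeVoice;

static uint32_t lastLoopCount = 0;
static int currentBuffer = 0;  /* which half the voice is playing right now */

/* Call once per AX frame: if the loop counter advanced, the voice jumped
   to the other buffer, so move end/loop offsets and refill the old half. */
static void updateStream(FakeVoice *v)
{
    if (v->loopCount == lastLoopCount)
        return;                         /* still inside the same buffer */
    lastLoopCount = v->loopCount;
    currentBuffer ^= 1;                 /* the voice switched halves */

    if (currentBuffer == 1) {
        v->endOffset  = 2 * BUF_SAMPLES - 1;  /* end of 2nd buffer */
        v->loopOffset = 0;                    /* loop back to 1st buffer */
        /* refill buffer 0 with fresh samples here */
    } else {
        v->endOffset  = BUF_SAMPLES - 1;      /* end of 1st buffer */
        v->loopOffset = BUF_SAMPLES;          /* loop on to 2nd buffer */
        /* refill buffer 1 with fresh samples here */
    }
}
```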
So that was a little detour into how to make a stream. Now back to how to start the voice playing. The first initial buffer is set with AXSetVoiceOffsets(), which takes a pointer to the struct I described before, together with the voice to use.
After you set the initial sample buffer, what is left is to define the pitch of the voice by calling AXSetVoiceSrc. It takes another struct that holds the pitch as a fixed-point value. See here for the calculation:
https://github.com/dimok789/loadiine_gx2/blob/Loadiine-nightly-8f0f7a8/src/sounds/Voice.h#L91
I don't remember what the last 3 unsigned int values were, but they always seemed to be 0. The next function, AXSetVoiceSrcType, sets the voice interpolation type, i.e. which kind of sample rate converter should be used. I chose 1, which is linear interpolation and fine for most use cases.
The last thing to do is to actually start the voice by calling AXSetVoiceState() with the voice and 1 as parameters (or 0 to stop).
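The fixed-point pitch calculation boils down to this: the value is a 16.16 fixed-point ratio of the input sample rate to the output rate, so 0x00010000 means "play at original speed". The helper name is made up, and the 48 kHz output rate is an assumption here; the real output rate depends on how AX was initialised:

```c
/* 16.16 fixed-point pitch ratio, as used for AXSetVoiceSrc.
   pitchRatio is a made-up helper name for illustration. */
#include <stdint.h>

static uint32_t pitchRatio(uint32_t inputRate, uint32_t outputRate)
{
    /* 0x00010000 == 1.0 in 16.16 fixed point */
    return (uint32_t)(((uint64_t)inputRate << 16) / outputRate);
}
```

For example, a 48 kHz source on a 48 kHz output gives 0x10000 (unchanged speed), and a 24 kHz source gives 0x8000 (the converter reads input samples at half the output rate).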
Well, that's all there is to it. I hope I didn't leave anything out. To stop playing you just call AXSetVoiceState() with 0 and then free the voice with AXFreeVoice().
If you still have trouble getting this to work, let me know. I will write up a simple example application on how to use it, which will probably take me less time than writing this huge post (sorry about that).