A quick update for anyone who cares.
I ended up going with an HDMI-to-USB capture device, since there seemed to be no point in getting an HDMI-to-optical converter: they were more expensive, more restrictive, and harder to find.
I bought a VIVITAR HDMI-to-USB Capture Card from Walmart for about $20. It looks very unremarkable: it's literally a little flash-drive-sized thing with a male USB end (output) and a female HDMI port (input).
This capture card behaves oddly in that the audio is captured as a single channel. The 48 kHz stereo signal is folded (interleaved, as far as I can tell) into a single channel at 96 kHz (so 2x 48 kHz channels become 1x 96 kHz channel), with what looks like one stereo channel's data showing up in the inaudible range near the top of the spectrum, up toward 48 kHz. In order to record any audio properly, you have to remap that data into the adjacent channel, which I did using "Virtual Audio Cable" together with the "mono-to-stereo" tool by ToadKing (more info here:
https://github.com/ToadKing/mono-to-stereo/releases and here
https://www.reddit.com/r/obs/comments/mm0saj/psa_if_youre_using_one_of_those_cheapo_1020ish/ )
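For anyone curious what that remapping amounts to, here is a minimal Python sketch. It assumes the "mono" 96 kHz stream is simply the stereo samples alternating L, R, L, R, with the left channel first; that ordering is my assumption, and the function name is made up for illustration. It is not how mono-to-stereo itself is implemented, just the idea behind it.

```python
from array import array

def deinterleave_fake_mono(samples):
    """Split a "fake mono" stream into (left, right) 16-bit sample arrays.

    Assumes samples alternate L, R, L, R, ... with left first.
    Swap the two results if your device orders them the other way.
    """
    # Drop a trailing unpaired sample, if any, so both channels match in length.
    usable = len(samples) - (len(samples) % 2)
    left = array("h", samples[0:usable:2])   # even positions
    right = array("h", samples[1:usable:2])  # odd positions
    return left, right

# Six "mono" samples at 96 kHz become three stereo frames at 48 kHz.
mono_96k = array("h", [100, -100, 200, -200, 300, -300])
left, right = deinterleave_fake_mono(mono_96k)
print(list(left))   # [100, 200, 300]
print(list(right))  # [-100, -200, -300]
```

Each output channel has half as many samples as the input, which is why the result should be played back or saved at 48 kHz rather than 96 kHz.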
I verified that the stereo mapping was correct (and not just a single mono signal doubled) by playing a game and recording the audio: I made something happen on the left-hand side and confirmed that the sound showed up on the left channel of the recording rather than the right.
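The same check can be done with numbers instead of by ear. This is a rough sketch that assumes you already have the two channels of a recording as lists of sample values; the helper names are hypothetical. It compares the level of each channel and also flags the "doubled mono" case where both channels are identical.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def louder_side(left, right):
    """Return "left", "right", or "equal" depending on which channel is louder."""
    l, r = rms(left), rms(right)
    if l > r:
        return "left"
    if r > l:
        return "right"
    return "equal"

def is_doubled_mono(left, right):
    """True if both channels carry exactly the same samples."""
    return list(left) == list(right)

# A sound that happens on the left should be much stronger in the left channel.
left = [1000, -1000, 1000, -1000]
right = [10, -10, 10, -10]
print(louder_side(left, right))      # left
print(is_doubled_mono(left, right))  # False
```

If a left-side sound reports "right", the channels are most likely swapped; if is_doubled_mono returns True, the recording is just one signal copied to both sides.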
In other words, the Vivitar presents a FAKE mono signal for audio capture: it's really a 48 kHz stereo signal carried over a single channel by doubling the sample rate. I have no idea what the technical reasons are for the device behaving that way; all I can comment on is how I fixed it and how I verified that the fix is accurate.
FWIW, hope that's helpful to someone.