Most soundtracks are ripped from the games themselves, or from CDs if the game had one, be it via a sound test, from files stored within the game or from fun with emulators (dropping channels, blanking sound effects in various manners*). Some will also seek out better quality hardware (see the discussions of various DACs and other chips/output methods on different models of SNES or Mega Drive), maybe even improving things themselves with a visit to an electronics components shop or two.
*The main three being: make the effect play muted, make it never play at all, and make it play but with the sound effect it uses swapped for a blank one.
There was a spell a while back of people remaking the Donkey Kong soundtrack when they discovered the synth. You get into questionable territory there though, as the dev presumably knew what the chip sounded like and worked accordingly.
As you asked about it though: if I had to reconstruct a track from footage that is laced with sound effects, there are two main techniques that people would be using.
1) Game music is kind of repetitive by design. Someone more musically inclined than I* would probably spot the core loop of the song on first listen; I might need active consideration for that one.
Anyway, isolate the progressions and use them to backfill the areas that are suffering from overlay. Hopefully you get a nice clean example of the main loop, but you might have to go note by note.
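For what it is worth, the backfill idea can be sketched in a few lines of plain Python. This is only a toy under heavy assumptions: the recording is a list of numbers that repeats exactly, you have already marked by hand which samples have a sound effect over them, and the names (estimate_loop_length, backfill, damaged) are mine rather than anything from a real tool. Real footage will drift, fade and be compressed, so treat it as an illustration of the logic and nothing more.

```python
def estimate_loop_length(samples, min_lag, max_lag):
    """Return the lag whose shifted copy differs least from the original."""
    best_lag, best_score = min_lag, float("inf")
    for lag in range(min_lag, max_lag + 1):
        n = len(samples) - lag
        # Mean absolute difference between the signal and itself shifted by lag.
        score = sum(abs(samples[i] - samples[i + lag]) for i in range(n)) / n
        if score < best_score:
            best_lag, best_score = lag, score
    return best_lag


def backfill(samples, damaged, loop_length):
    """Swap each damaged sample for the same spot in a clean pass of the loop."""
    out = list(samples)
    for i, bad in enumerate(damaged):
        if not bad:
            continue
        # Visit every position that sits at the same point of the loop.
        j = i % loop_length
        while j < len(samples):
            if not damaged[j]:
                out[i] = samples[j]
                break
            j += loop_length
        # If no clean copy exists the damaged sample is left as it was.
    return out


# Toy data (hypothetical): an 8-sample loop played four times,
# with a "sound effect" dumped over samples 10 to 13.
loop = [0, 50, -30, 80, -60, 20, -90, 40]
clean = loop * 4
damaged = [10 <= i <= 13 for i in range(len(clean))]
noisy = [s + 25 if bad else s for s, bad in zip(clean, damaged)]

loop_length = estimate_loop_length(noisy, 2, 12)
repaired = backfill(noisy, damaged, loop_length)
```

In practice you would do the same thing by ear and by eye in an audio editor, copying a clean pass of the loop over the dirty one.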
You might have to get better still.
1a) As noted in the Donkey Kong thing above, if you can figure out what instrument library they are using, or indeed whether other instruments are just a pitch tweak of something else (some things might be a simple piano tone, say, pitch bent to the different notes of the scale used), then you can isolate that (in Audacity: Analyze > Plot Spectrum) and recreate it. With this you can recreate songs from scratch if you really needed to.
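If you want to see what the spectrum plot is telling you, here is a rough sketch in plain Python. It is not how Audacity does it internally (that is an assumption-free statement only in the sense that I am not claiming anything about its code); it is just a naive DFT that picks the strongest frequency in a short clip, plus the standard equal temperament sum for how many semitones apart two pitches are. The function names and the 8000 Hz test tones are made up for the example.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest bin of a naive DFT."""
    n = len(samples)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n


def semitones_between(f_ref, f_other):
    """Equal temperament: each semitone multiplies frequency by 2 ** (1 / 12)."""
    return 12 * math.log2(f_other / f_ref)


# Hypothetical test tones: the same "instrument" at two pitches.
SAMPLE_RATE = 8000
N = 400  # gives 20 Hz per bin


def make_tone(freq):
    return [math.sin(2 * math.pi * freq * t / SAMPLE_RATE) for t in range(N)]


base = dominant_frequency(make_tone(440.0), SAMPLE_RATE)
shifted = dominant_frequency(make_tone(660.0), SAMPLE_RATE)
shift = semitones_between(base, shifted)  # about 7, i.e. a fifth up
```

If two instruments in the track have the same shaped spectrum and sit a whole number of semitones apart, that is a decent hint one is just the other pitch bent.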
2) If you are exceptionally lucky the sounds can simply be filtered out. High pass, low pass and all that goodness. This tends to be more for things like vocals, which sit in a narrower band than the full range of human hearing while the instruments you want sit outside it. However, it could happen that an effect is way outside the frequency range of the song, so you have that as an option.
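As a sketch of what high pass and low pass mean in practice, here is about the simplest filter going (a one-pole RC style low pass, with the high pass being whatever the low pass throws away). Any real audio editor will have far steeper filters than this, and the cutoff, sample rate and test tones below are numbers I picked for illustration, so take it as a demonstration of the principle only.

```python
import math


def low_pass(samples, cutoff_hz, sample_rate):
    """One-pole low pass: keeps what is below the cutoff, dulls what is above."""
    dt = 1.0 / sample_rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out, prev = [], 0.0
    for x in samples:
        prev += alpha * (x - prev)
        out.append(prev)
    return out


def high_pass(samples, cutoff_hz, sample_rate):
    """Complementary high pass: the input minus its low-passed version."""
    low = low_pass(samples, cutoff_hz, sample_rate)
    return [x - l for x, l in zip(samples, low)]


def rms(samples):
    """Root mean square level, a rough measure of loudness."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


# Hypothetical case: low "music" at 100 Hz with a shrill effect at 6000 Hz.
SAMPLE_RATE = 44100
COUNT = 4410  # a tenth of a second
music = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(COUNT)]
effect = [math.sin(2 * math.pi * 6000 * t / SAMPLE_RATE) for t in range(COUNT)]
mix = [m + e for m, e in zip(music, effect)]

# Cutting at 500 Hz keeps most of the music and dulls most of the effect.
filtered = low_pass(mix, 500, SAMPLE_RATE)
```

A gentle filter like this only turns the effect down rather than removing it, which is why you need the effect to be well away from the music for filtering alone to do the job.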
2a) This one I will often employ alongside some of the hardware methods.
As any schoolboy will tell you, waves can cancel out. In this case you find the sound you don't want (similar to above), invert it, align it with the recording and mix it in, and it will cancel itself out of the sound. Now you are not going to be left with clean audio in 99.99% of cases, but it will be enough to do something with that is more than silence.
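The invert and align trick looks like this as a toy in plain Python. It assumes the best case: you have a clean, sample-exact copy of the sound effect and it was mixed in at the same volume, which is exactly the case you almost never get from video footage. The function names and numbers are made up for the example.

```python
import math


def find_offset(mix, effect):
    """Slide the effect along the mix and return where it lines up best."""
    best_offset, best_score = 0, float("-inf")
    for offset in range(len(mix) - len(effect) + 1):
        # Dot product of the effect with this stretch of the mix.
        score = sum(mix[offset + i] * effect[i] for i in range(len(effect)))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


def cancel(mix, effect, offset):
    """Add an inverted copy of the effect at the given offset."""
    out = list(mix)
    for i, e in enumerate(effect):
        out[offset + i] += -e
    return out


# Hypothetical data: quiet music with a loud effect dropped in at sample 20.
music = [round(10 * math.sin(2 * math.pi * i / 16)) for i in range(64)]
effect = [0, 90, -70, 50, -80, 30, -20, 0]
mix = list(music)
for i, e in enumerate(effect):
    mix[20 + i] += e

offset = find_offset(mix, effect)
cleaned = cancel(mix, effect, offset)
```

Here the cancellation is perfect because the copy is perfect. With a real recording any difference in level, timing or compression leaves residue behind, hence the 99.99% above.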
*you need not be anywhere near as good as this guy but
Hopefully you do know your trackers though, even if only to understand the options and limitations that the original composer was likely working to. Failing anything better, read up on the history thereof. The main open source one is OpenMPT if you need one.
For the most part there is not going to be much in the way of getting the computer to do it for you. How do you eat an elephant? One bite at a time. Do also do yourself a favour and check whether the videos have been released in different forms (other languages, footage on a livestream, condensed footage for a clip segment), for while they might not be completely clean, a better selection of samples is good to have.