What’s the process for decompiling?
Complicated, and it varies by language, processor used and a few other things.
Generally though:
Most people vaguely involved in hacking, cheat making or whatever will have heard of disassembly.
This is where the binary code, as it would be run on the CPU, is fired through a program that turns it into the human readable form of the operations (instructions in computer parlance) that the CPU does, maybe with a couple of concessions for obvious things. In other words disassembly turns the binary into assembly code, assembly being a (CPU and in many ways system and even OS specific) means of programming a computer.
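To make that a bit more concrete, here is a minimal sketch of the idea in Python. The instruction set is entirely made up for the example (two bytes per instruction, four opcodes) and is not any real CPU; real disassemblers deal with variable length instructions, addressing modes and much more.

```python
# Toy instruction set, invented for this example (not a real CPU):
# each instruction is two bytes, an opcode followed by one operand.
OPCODES = {
    0x01: "LOAD",   # load an immediate value into the accumulator
    0x02: "ADD",    # add an immediate value to the accumulator
    0x03: "STORE",  # store the accumulator at an address
    0x04: "JMP",    # jump to an address
}

def disassemble(binary: bytes) -> list[str]:
    """Turn raw bytes into one line of readable assembly per instruction."""
    lines = []
    # Step two bytes at a time; a trailing odd byte is simply ignored here.
    for offset in range(0, len(binary) - 1, 2):
        opcode, operand = binary[offset], binary[offset + 1]
        mnemonic = OPCODES.get(opcode)
        if mnemonic is None:
            # Unknown bytes are shown as data rather than guessed at.
            lines.append(f"{offset:04X}: .byte 0x{opcode:02X}, 0x{operand:02X}")
        else:
            lines.append(f"{offset:04X}: {mnemonic} 0x{operand:02X}")
    return lines

program = bytes([0x01, 0x05, 0x02, 0x03, 0x03, 0x10, 0x04, 0x00])
listing = disassemble(program)
print("\n".join(listing))
```

The point is only that the mapping from bytes to mnemonics is a lookup, which is why disassembly is the comparatively easy part.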
However assembly is very hard and tedious to use, so we have higher level languages to abstract some of the complexity away, make it easier to port between systems and also make things more secure a bit later down the line (it is very hard to do the ASLR thing that is bothering Switch cheat makers when you code in assembly, not to mention that if you are concerned with every little thing then security can slip away from you). A bonus is that going from a high level language to a binary was historically considered essentially one way, for reasons we will cover shortly.
Indeed one of the big draws of the PS1 and N64 was the ability to program in C (thousands of new kids taught per year) rather than assembly (only highly paid nerds and greybeards by that point in history, today it is even worse, and generally even then horrendous to work with, port around to other systems and debug) or Basic (slow as sin).
What each of these languages brings to the table, and what it makes more complicated for decompilation efforts, varies. Traditionally it got broken down into low level languages and runtime based languages (think Java, Python, Perl and the like, where you essentially have a script that a program runs). The runtime based ones you can decompile more easily, though the lines got blurry, and get blurry even within runtime based stuff.
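As a small illustration of why the runtime based languages are easier, Python's own standard library `dis` module will show what the compiled form of a function still carries around. The function `apply_discount` is just something made up for the example.

```python
import dis

def apply_discount(price, percent):
    reduction = price * percent / 100
    return price - reduction

code = apply_discount.__code__
# The compiled code object still carries the original identifiers.
print(code.co_name)       # the function name
print(code.co_varnames)   # argument and local variable names

# dis.get_instructions yields the bytecode operations the runtime executes.
# The exact opcode names vary between Python versions.
opnames = [ins.opname for ins in dis.get_instructions(apply_discount)]
print(opnames)
```

Names, argument counts and structure all survive, which a stripped binary compiled from C will generally not give you.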
Back to assembly though. In the fundamental basics of computing/programming (which, as with a lot of things, we get to largely credit one Alan Turing) there is something known as the halting problem. Strictly it says there is no general method to tell whether an arbitrary program will ever finish. The knock-on effect that matters here is that you can't in general work out, just by looking at a program, which choices it will make when run. And programs make choices all the time, even an utterly basic conversion program will probably have something here. See loops in any intro to programming course (
https://www.tutorialspoint.com/cprogramming/c_loops.htm ) for the overview of that one, though for the sake of debate we can consider the IF ELSE, which is to say IF this happens do that, ELSE do something else. Quite often such things run IF IF IF IF IF IF... IF ELSE, and might well exist within another loop, or the computer may have options to break out of it on its own terms and do something else (see interrupts).
When pulling apart the program by itself, with just the binary, you have no idea what the state is supposed to be, as the program is not running to generate the state to compare things against. It is why we had dynamic recompilation emulators years ago (in a technical sense prior to that first N64 emulator, though that was probably the big introduction to the world), while general decompilation has taken until now really. It still has major caveats and will probably only work for straight C for a while yet, with C++ being a ways out.
Decompilers, or the people making them, however realise that most of the time one particular branch in the path of the code is the one taken, so they don't have to keep tabs on potentially thousands of possibilities expanding as the program goes on.
They can also note that certain constructions within assembly are the result of standard libraries (programming is all about reusing stuff already done rather than having to do it yourself) or common programming approaches (there are limited ways to do anything, certainly limited ways to sensibly do anything when you have limited computational resources), which is something the dynamic recompiler also makes use of.
It is quite possible to scan a program if you can get it into plain form (no compression, no encryption, possibly optimise things or indeed deoptimise things) to find where all the particular subroutines lie, name them something and watch how the main program builds things up.
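A rough sketch of that scanning idea follows. The signature table, the routine names (`memcpy_like`, `strlen_like`) and the byte values are all illustrative assumptions rather than anything lifted from a real library or tool; real signature databases are far larger and smarter about it.

```python
# Hypothetical signatures: byte patterns that a known library routine
# might compile to, with None marking bytes that vary (addresses, offsets).
SIGNATURES = {
    "memcpy_like": [0x55, 0x89, 0xE5, None, None, 0xF3, 0xA4],
    "strlen_like": [0x31, 0xC0, 0x80, 0x3C, None, 0x00],
}

def matches_at(data, offset, pattern):
    """True if pattern matches data at offset, treating None as a wildcard."""
    if offset + len(pattern) > len(data):
        return False
    return all(p is None or data[offset + i] == p for i, p in enumerate(pattern))

def scan(data):
    """Return (offset, name) for every place a known signature appears."""
    found = []
    for offset in range(len(data)):
        for name, pattern in SIGNATURES.items():
            if matches_at(data, offset, pattern):
                found.append((offset, name))
    return found

blob = (
    bytes([0x90, 0x90])
    + bytes([0x55, 0x89, 0xE5, 0x12, 0x34, 0xF3, 0xA4])
    + bytes([0xC3])
    + bytes([0x31, 0xC0, 0x80, 0x3C, 0x08, 0x00])
    + bytes([0xC3])
)
hits = scan(blob)
print(hits)
```

Every routine you can name this way is one less blob of assembly somebody has to read by hand.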
If you have some nice debug options left over, or some leaks from other parts of the program for function names or something, that can also help (Diablo saw similar treatment with this as a basis --
https://www.gamasutra.com/view/news...red_Diablo_source_code_released_on_GitHub.php and
video).
It can also help to run the program a few times to generate info -- ROM hackers have done lesser versions of this for years (FCEUX for the NES has a nice feature,
https://tasvideos.github.io/fceux/web/help/fceux.html?TraceLogger.html , where you do everything but what you want, and then what you want, and whatever ran that had not run before is what it flags as worth looking at. GBA hackers will often run things, note all calls to the ROM from compression functions in the BIOS and thus have a short list of places to look for good stuff, and know what it is supposed to be once they get there on the compression side of things. Indeed compression tools may even be able to accept these logs).
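The "everything but, then the thing you want" trick boils down to a set difference. Here is a minimal sketch, assuming each trace is just a list of executed addresses; that is an assumption for the example and not FCEUX's actual log format or how it implements the feature.

```python
def newly_executed(baseline_trace, target_trace):
    """Addresses that ran in the target trace but never in the baseline.

    Each address is reported once, in the order it first appeared.
    """
    seen = set(baseline_trace)
    fresh = []
    for address in target_trace:
        if address not in seen:
            seen.add(address)
            fresh.append(address)
    return fresh

# Baseline: walk around, open menus, do everything except the action of interest.
baseline = [0x8000, 0x8003, 0x8010, 0x8003, 0x8020]
# Target: the same sort of session, but this time do the action.
target = [0x8000, 0x8003, 0x9A40, 0x9A43, 0x8010, 0x9A40]

candidates = newly_executed(baseline, target)
print([hex(a) for a in candidates])
```

What comes out is the short list of places worth opening up in a disassembler first.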
With one or more of those you can then start to abstract the basics away back into the functional (maybe even actual) equivalent of the language it came from (or was translated into in some cases) and be left with fewer things to have to manually understand, and an easier job with the high level stuff.
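For a flavour of what "abstracting back" means, here is a toy sketch that spots one common shape (compare, jump past a block if not equal) and rewrites it as a C style if. The instruction tuples, mnemonics and names like `game_over` are invented for the example, and real decompilers work on control flow graphs rather than this sort of naive pattern match.

```python
def lift_if(instructions):
    """Rewrite CMP / JNE label / ...body... / LABEL runs as a C-style if block."""
    out = []
    i = 0
    while i < len(instructions):
        ins = instructions[i]
        nxt = instructions[i + 1] if i + 1 < len(instructions) else None
        if ins[0] == "CMP" and nxt is not None and nxt[0] == "JNE":
            target = ("LABEL", nxt[1])
            if target in instructions[i + 2:]:
                end = instructions.index(target, i + 2)
                body = instructions[i + 2:end]
                # JNE skips the body, so the body runs when the operands are equal.
                out.append(f"if ({ins[1]} == {ins[2]}) {{")
                out.extend("    " + " ".join(b) + ";" for b in body)
                out.append("}")
                i = end + 1
                continue
        # Anything not recognised is passed through as plain assembly.
        out.append(" ".join(ins))
        i += 1
    return out

listing = [
    ("MOV", "r0", "lives"),
    ("CMP", "r0", "0"),
    ("JNE", "skip"),
    ("CALL", "game_over"),
    ("LABEL", "skip"),
    ("RET",),
]
lifted = lift_if(listing)
print("\n".join(lifted))
```

Anything the pattern does not recognise is left as assembly, which is about how real output looks too: tidy high level code with lumps still to be understood by hand.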
It should also be noted that inline assembly is a thing -- C is great and all but occasionally you do need to get down and dirty with the CPU. To do this a dev might drop into some assembly for a little bit to make it as tight (or as complicated if they are trying to prevent hackers -- see
deobfuscation by optimisation for a somewhat related concept, and another thing to feed into your decompilation tools) as they need it to be. With no C or whatever to get back to, this can frustrate decompilation efforts. Today this is not so commonly seen (Microsoft yanked it from their stuff in favour of something called compiler intrinsics a while back, and as mentioned certain types of security don't play well with it) but back in the day it was common for a lot of things.
It gets an awful lot more complicated, as you can then start to get into really fun areas of maths, probability, practical computation, compiler theory (I barely even scratched the surface of that one) and computation in general to hone this method down. That is to say nothing of it helping if you are already a competent programmer in the language and possibly device in question (and maybe know the history of it all -- compilers today are a lot better than they were back then, and that also makes things harder if you only have things compiled on new compilers, where they optimise things a lot more).
To that end it is not a magic bullet. The resulting code has serious legal concerns (I can see some arguments in favour, such as interoperability, that might hold up, however it is quite literally the opposite of independent recreation or clean room reverse engineering. Though at the same time, outside of Nintendo sending in a lawyer to do the equivalent of
this they can't use the code either). It only works for a limited set of cases (so mostly C or scripting languages, and C# for reasons I am not going to get into here. Most games from the PS2 on up are C++ or worse. Most things older than the PS1 and N64 are assembly, other than the PC which might well have been C for a while -- 1978 was the first and by 1989/1990 it was more or less dominant for most things). It probably has to be trained on a given system (fortunately most things people care about are for X86 Windows and there are only half a dozen popular compilers with only a few major versions each, though that does not cover consoles of the mostly still programmed in C era). And it still requires some fairly expert abilities to pick through once you start making headway (you probably don't have comments, design docs, or nice names for things to give a hint as to their function).
Edit. Probably should have noted some actual decompilers and projects. People do take all that and put it into programs for people to use. The people behind the famous IDA disassembly/debug/reverse engineering tool, which you likely see in almost any hacker talk using assembly, have a decompiler for a few processor types
https://www.hex-rays.com/products/decompiler/
A few universities and PhD students also have some good stuff there (it really is cutting edge stuff). Not sure what we have in the open source world of particularly notable quality, though
https://rada.re/n/radare2.html has some stuff and
https://www.nsa.gov/resources/everyone/ghidra/ also is worth mentioning in this.