Super Mario 64 has been decompiled

halo5307

Well-Known Member
Newcomer
Joined
Dec 29, 2015
Messages
84
Trophies
0
Age
34
XP
215
Country
United States
People should be doing things with it right now.
Is anyone going to talk about the contents besides me? This is the entire code of Super Mario 64 we have here.

I want to see people analyzing it.

Nobody is going to spend quality time with a leaked project simply out of respect for the developers. The fact that you have the audacity to supplement the leak, and then demand people look into it (when it's very clear you know absolutely nothing about the code) is ridiculous.

That being said, I would like to thank the original developers behind the project. I've always been interested in learning how to decompile projects, so this is quite cool! If any of you are in the thread, where/how did you learn to decompile programs?
 

nim-ka

Member
Newcomer
Joined
Sep 5, 2018
Messages
21
Trophies
0
Age
25
XP
92
Country
United States
I've always been interested in learning how to decompile projects, so this is quite cool! If any of you are in the thread, where/how did you learn to decompile programs?

Most of my work was on cleaning up the C, but I've started learning how to decompile, simply through asking questions to people in the community about MIPS assembly, how the ROM is laid out, etc. It's not very hard to decompile a given bit of ASM if you know MIPS and C; it also requires experience with the compiler's output to know how to make optimized ASM match the compiled C.
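
To give a rough idea of what "decompiling a given bit of ASM" looks like, here is a minimal sketch. The assembly in the comment is a made-up example (not taken from SM64), and the function names are hypothetical. The first function transcribes the instructions one by one; the second is the cleaned-up C a person would actually want to read. In a matching decompilation, the goal is for the cleaned-up version to compile back to the same instructions.

```c
#include <stdint.h>

/*
 * Hypothetical disassembly (illustrative only, not from the game):
 *
 *     sll   $t6, $a1, 2      # t6 = a1 << 2
 *     jr    $ra              # return to caller...
 *     addu  $v0, $a0, $t6    # ...after the delay slot sets v0 = a0 + t6
 */

/* Step 1: literal transcription, one C statement per instruction. */
uint32_t func_literal(uint32_t a0, uint32_t a1)
{
    uint32_t t6 = a1 << 2;
    uint32_t v0 = a0 + t6;   /* executed in the jr delay slot */
    return v0;
}

/* Step 2: cleaned-up C with meaningful names; the shift is just "* 4". */
uint32_t add_scaled(uint32_t base, uint32_t index)
{
    return base + index * 4;
}
```

Both functions compute the same result; the experience mentioned above comes in when choosing which of several equivalent C spellings makes a particular compiler emit the original instruction sequence.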
 

halo5307

Most of my work was on cleaning up the C, but I've started learning how to decompile, simply through asking questions to people in the community about MIPS assembly, how the ROM is laid out, etc. It's not very hard to decompile a given bit of ASM if you know MIPS and C; it also requires experience with the compiler's output to know how to make optimized ASM match the compiled C.

Very interesting, thanks! Until now I hadn't even heard of MIPS assembly. I'll make sure to take a look at it.

My experience with programming is really limited to just Java, but C is on my radar. I hear projects are a good way to learn -- while something like this is out of my scope, it does give me incentive to start something smaller.
 

mountainflaw

Member
Newcomer
Joined
Jul 11, 2018
Messages
22
Trophies
0
Age
34
XP
107
Country
Australia
Basically, this started off as an n64split disassembly with almost no symbols named. People would just pick a file to convert to C in the manner that nim described.
 

halo5307

Basically, this started off as an n64split disassembly with almost no symbols named. People would just pick a file to convert to C in the manner that nim described.
So you guys must've had to use the code itself to infer what name to give each symbol. Makes sense to me. Must've been a pain dealing with 20+ year old code. Hope there weren't too many roadbumps on your side.

I'm at work currently, but I'd love to read the documentation :P

--------------------- MERGED ---------------------------

Excuse me. I've been looking at the code on and off since I posted this thread. I have a different vision for porting it than total accuracy.
Anyone can read code. I'm willing to bet that if I pointed to a line of code and asked you to explain exactly what it did, you would have no idea. This is why programmers are paid so much: it's hard work! It's even rarer for passion projects like this to come to fruition, because you are essentially working for free. So please, treat the devs with respect.

I can infer this because of how excited you are for the project, and how angsty you got when you realized that nobody was researching the code alongside you.
If it's publicly available, it's public domain. It's what I live by.
There are... so many things wrong with this statement. Best of luck with your future, mate; thanks for the interesting find.
 

nim-ka

Just wanna reiterate, lots of people *are* investigating SM64's code with this, it's just a relatively obscure community
 

FAST6191

Techromancer
Editorial Team
Joined
Nov 21, 2005
Messages
36,798
Trophies
3
XP
28,321
Country
United Kingdom
So does this mean its graphics can be drastically upgraded now?
Others have already answered, but that will depend upon what you mean by that.

It is an early effort at N64 3D, and later devs (homebrew, demoscene and commercial) will likely have discovered more ways to push the hardware to the limit (don't know if we will be seeing an Expansion Pak edition any time soon). Such things will be far easier to add at source level than by playing ROM hacker -- most ROM hacking is limited by practicality to fun things like big head, simple texture or colour swaps, simple alignment tweaks to more closely match hit boxes, and simple changes to the sizes of items that keep them essentially the same, rather than implementing a trick found by some dev 5 years into the system's life to effect slightly better anti-aliasing or whatever. I don't know how far the code leans into the hardware at this point (sometimes such things make it harder to adapt, even within the hardware), but if it is reasonable within the hardware then it should be doable.

If source is available then realistically it could also be ported to another system, like the PC, and then all bets are off. Others have mentioned how tricky such things will likely be -- early 3D does not translate that well to modern OpenGL or DirectX approaches to the world owing to some quirks, so it is a bit more work in some regards than it would be to convert a 360 game to a modern PC, though still far easier than a mechanics recreation followed by an asset conversion/recreation. Or, to go another way, notice the general lack of widescreen hacks of PS1-N64 era games when later stuff is littered with them (even within what is still almost hardware backwards compatibility) -- there is a reason for that one.

The N64, however, is something of a poster child for the texture replacement at emulator level approach (usually referred to by names like HD texture replacement) -- it was around before, and has been taken to greater heights on newer systems, but the N64 is where most of the world was introduced to it all. Initially I am not expecting too much here (it has already been mapped fairly well and replacements are already achievable), but it does open the door to sticking a whole bunch of debug flags in a game that an emulator might be more carefully crafted around, that future emulators could make use of (part of the problem with texture hacks is how not portable they actually are, and how much effort over baseline emulation they require), or that would let non-hacker/coder end users swap out their own textures at file level to create a nice mix and match approach.
As such things have been possible for years, technically speaking this is not going to bring drastic upgrades in the short term beyond what you can probably do over the next few hours with a special PC emulator and a working download of a texture pack, if you were so inclined.
 

Ecchi95

Well-Known Member
OP
Member
Joined
Jul 7, 2019
Messages
121
Trophies
0
Age
29
XP
891
Country
United States
I'm willing to bet that if I pointed to a line of code and asked you to explain exactly what it did, you would have no idea. This is why programmers are paid so much: it's hard work! It's even rarer for passion projects like this to come to fruition, because you are essentially working for free. So please, treat the devs with respect.

I can infer this because of how excited you are for the project, and how angsty you got when you realized that nobody was researching the code alongside you.
I am a programmer. I'm making my own Nintendo 64 emulator.

Code:
#include <badn64.h>
#include <stdio.h>
#include <stdlib.h> /* calloc, free, system */

/*
RAM: rambus dynamic Random-Access Memory
REG: rdram REGisters
RCP: Reality CoProcessor
LEO: LEO registers (64DD)
IPL: Initial Program Load (64DD)
STA: STAtic ram
ROM: ROM (cartridge)
PIF: Peripheral InterFace (boot ROM and RAM)
*/

/* enumerated memory segments */
enum SEGMENT{RAM, REG, RCP, LEO, IPL, STA, ROM, PIF};

/* lookup table for the purpose of translating n64 memory segments to our segments */
BYTE LUT[0x20]=
{/* 0x00           0x03 0x04 0x05 0x06      0x08                                    */
    RAM,   0,   0, REG, RCP, LEO, IPL,   0, STA,   0,   0,   0,   0,   0,   0,   0,
   
 /* 0x10                                                                       0x1F */
    ROM,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0, PIF
};

WORD loop;

/* function for sign extension of immediate values */
WORD extend(HALF imm);
WORD extend(HALF imm)
{
    WORD extended;
    if(sign(imm)==0x8000)
    {
         extended=0xffff0000|imm;
    }
    else
    {
         extended=imm;
    }
    return extended;
}

/* function for reading 32 bits */
WORD read32(BYTE *mem[8], WORD ptr);
WORD read32(BYTE *mem[8], WORD ptr)
{
    BYTE temp=(ptr>>24)&0x1f;
    switch(LUT[temp])
    {
         case REG:
              ptr&=0x0fffff;
              break;
         case PIF:
              ptr&=0x3fffff;
              break;
         default:
              ptr&=0xffffff;
              break;
    }
    /* widen each byte before shifting so the top byte cannot overflow a signed int */
    return ((WORD)mem[LUT[temp]][ptr]<<24)|((WORD)mem[LUT[temp]][ptr+1]<<16)|((WORD)mem[LUT[temp]][ptr+2]<<8)|(WORD)mem[LUT[temp]][ptr+3];
}

void handle_pi(BYTE *mem[8], WORD ptr);
void handle_pi(BYTE *mem[8], WORD ptr)
{
    WORD dram_addr, cart_addr, rd_len, wr_len;
    dram_addr=read32(mem,0xa4600000);
    cart_addr=read32(mem,0xa4600004)&0xffffff;
    rd_len=1+read32(mem,0xa4600008);
    wr_len=1+read32(mem,0xa460000c);
    if(ptr==0x600008)
    {
    }
    else if(ptr==0x60000c)
    {
         for(loop=0;loop<wr_len;loop+=1)
         {
              mem[RAM][dram_addr+loop]=mem[ROM][cart_addr+loop];
         }
    }
}

/* function for writing 32 bits */
void write32(BYTE *mem[8], WORD ptr, WORD val);
void write32(BYTE *mem[8], WORD ptr, WORD val)
{
    BYTE handle=9;
    BYTE temp=(ptr>>24)&0x1f;
    switch(LUT[temp])
    {
         case REG:
              ptr&=0x0fffff;
              break;
         case RCP:
              handle=(ptr&0xf00000)>>20;
              ptr&=0xffffff;
              break;
         case PIF:
              ptr&=0x3fffff;
              break;
         default:
              ptr&=0xffffff;
              break;
    }
    mem[LUT[temp]][ptr]=(val>>24)&0xff;
    mem[LUT[temp]][ptr+1]=(val>>16)&0xff;
    mem[LUT[temp]][ptr+2]=(val>>8)&0xff;
    mem[LUT[temp]][ptr+3]=val&0xff;
    switch(handle)
    {
         case 6:
              handle_pi(mem,ptr);
              break;
    }
}

int main(int argc, char **argv)
{
    const char names[32][3]=
    {
         "r0", "at", "v0", "v1",
         "a0", "a1", "a2", "a3",
         "t0", "t1", "t2", "t3",
         "t4", "t5", "t6", "t7",
         "s0", "s1", "s2", "s3",
         "s4", "s5", "s6", "s7",
         "t8", "t9", "k0", "k1",
         "gp", "sp", "fp", "ra"
    };
    BYTE *mem[8], delay=0, depth=0;
    WORD CPU[32]={0},CP0[32]={0},pc=0xa4000040,tick=0,ticks,inst;
    unsigned long long sixtyfour;
    WORD hi, lo;
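    FILE *cart;
    /* bail out early if no ROM path was given or the file cannot be opened */
    if(argc<2||(cart=fopen(argv[1],"rb"))==NULL)
    {
         fprintf(stderr,"usage: %s <rom file>\n",argv[0]);
         return 1;
    }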
    fseek(cart,0,SEEK_END);
    loop=ftell(cart);
    WORD sizes[8]=
    {
         0x800000, /* RAM */
         0x100000, /* REG */
         0x80001c, /* RCP */
         0x0005c0, /* LEO */
         0x400000, /* IPL */
         0x008000, /* STA */
         loop,     /* ROM */
         0x000800  /* PIF */
    };
    for(loop=0;loop<8;loop+=1)
    {
         mem[loop]=(BYTE*)calloc(sizes[loop],1); /* allocate our memory segments and init to zero */
    }
    fseek(cart,0,SEEK_SET);
    fread(mem[ROM],1,sizes[ROM],cart);
    fclose(cart);
    for(loop=0;loop<0xfc0;loop+=1)
    {
         mem[RCP][0x40+loop]=mem[ROM][0x40+loop];
    }
    write32(mem,0xa4001000,0x3c0dbfc0); /* SP IMEM +0 */
    write32(mem,0xa4001004,0x8da807fc); /* SP IMEM +4 */
    write32(mem,0xa4001008,0x25ad07c0); /* SP IMEM +8 */
    write32(mem,0xa400100c,0x31080080); /* SP IMEM +0xC */
    write32(mem,0xa4001010,0x5500fffc); /* SP IMEM +0x10*/
    write32(mem,0xa4001014,0x3c0dbfc0); /* SP IMEM +0x14*/
    write32(mem,0xa4001018,0x8da80024); /* SP IMEM +0x18*/
    write32(mem,0xa400101c,0x3c0bb000); /* SP IMEM +0x1C*/
    write32(mem,0xa4040010,1);          /* SP status */
    write32(mem,0xa4300004,0x02020102); /* MI version*/
    write32(mem,0xa4600014,0x40);       /* PI dom1 latency */
    write32(mem,0xa4600018,0x12);       /* PI dom1 pulse width */
    write32(mem,0xa460001c,7);          /* PI dom1 page size */
    write32(mem,0xa4600020,3);          /* PI dom1 release */
    CPU[S4]=1;                          /* tv type */
    CPU[S6]=0x3f;                       /* seed */
    CPU[T3]=pc;
    CPU[SP]=0xa4001ff0;                 /* Stack Pointer */
    CPU[RA]=0xa4001550;                 /* Return Address*/
    /*printf("How many ticks? ");*/
    /*scanf("%d",&ticks);*/
    while(tick<0xffffffff)
    {
         if(pc==read32(mem,0xb0000008))
         {
              printf("%10d\n",tick);
              break;
         }
         inst=read32(mem,pc);
         switch(op(inst))
         {
              case SPECIAL:
                   switch(funct(inst))
                   {
                        case SLL:
                             CPU[rd(inst)]=CPU[rt(inst)]<<sa(inst);
                             break;
                        case SRL:
                             CPU[rd(inst)]=CPU[rt(inst)]>>sa(inst);
                             break;
                        case SLLV:
                             CPU[rd(inst)]=CPU[rt(inst)]<<(CPU[rs(inst)]&0x1f);
                             break;
                        case SRLV:
                             CPU[rd(inst)]=CPU[rt(inst)]>>(CPU[rs(inst)]&0x1f);
                             break;
                        case JR:
                             if(delay==0)
                             {
                                  delay=1;
                             }
                             else if(delay==3)
                             {
                                  pc=CPU[rs(inst)]-4;
                                  if(pc==CPU[RA]-4)
                                  {
                                       depth-=1;
                                       printf("%10d %08x: ",tick+1,pc+4);
                                       for(loop=0;loop<depth;loop+=1)
                                       {
                                            printf(" ");
                                       }
                                       printf("returned from function\n");
                                  }
                                  delay=0;
                             }
                             break;
                        case JALR:
                             if(delay==0)
                             {
                                  CPU[rd(inst)]=pc+8;
                                  delay=1;
                             }
                             else if(delay==3)
                             {
                                  pc=CPU[rs(inst)]-4;
                                  delay=0;
                             }
                             break;
                        case MFLO:
                             CPU[rd(inst)]=lo;
                             break;
                        case MULTU:
                             /* widen before multiplying so the upper 32 bits are kept */
                             sixtyfour=(unsigned long long)CPU[rs(inst)]*CPU[rt(inst)];
                             hi=sixtyfour>>32;
                             lo=sixtyfour&0xffffffff;
                             break;
                        case ADD:
                             CPU[rd(inst)]=CPU[rs(inst)]+CPU[rt(inst)];
                             break;
                        case ADDU:
                             CPU[rd(inst)]=CPU[rs(inst)]+CPU[rt(inst)];
                             break;
                        case SUBU:
                             CPU[rd(inst)]=CPU[rs(inst)]-CPU[rt(inst)];
                             break;
                        case AND:
                             CPU[rd(inst)]=CPU[rs(inst)]&CPU[rt(inst)];
                             break;
                        case OR:
                             CPU[rd(inst)]=CPU[rs(inst)]|CPU[rt(inst)];
                             break;
                        case XOR:
                             CPU[rd(inst)]=CPU[rs(inst)]^CPU[rt(inst)];
                             break;
                        case SLT:
                             CPU[rd(inst)]=0;
                             if((int)(CPU[rs(inst)])<(int)(CPU[rt(inst)]))
                             {
                                  CPU[rd(inst)]=1;
                             }
                             break;
                        case SLTU:
                             CPU[rd(inst)]=0;
                             if(CPU[rs(inst)]<CPU[rt(inst)])
                             {
                                  CPU[rd(inst)]=1;
                             }
                             break;
                   }
                   break;
              case REGIMM:
                   switch(rt(inst))
                   {
                        case BLTZ:
                             if(delay==0)
                             {
                                  delay=1;
                             }
                             else if(delay==3)
                             {
                                  if((int)CPU[rs(inst)]<0) /* signed compare */
                                  {
                                       pc+=extend(imm(inst)<<2)-4;
                                  }
                                  pc+=4;
                                  delay=0;
                             }
                             break;
                        case BGEZ:
                             if(delay==0)
                             {
                                  delay=1;
                             }
                             else if(delay==3)
                             {
                                  if((int)CPU[rs(inst)]>=0) /* signed compare */
                                  {
                                       pc+=extend(imm(inst)<<2)-4;
                                  }
                                  pc+=4;
                                  delay=0;
                             }
                             break;
                        case BLTZL:
                             if(delay==0)
                             {
                                  if((int)CPU[rs(inst)]<0) /* signed compare */
                                  {
                                       delay=1;
                                  }
                                  else
                                  {
                                       pc+=4;
                                  }
                             }
                             else if(delay==3)
                             {
                                  pc+=extend(imm(inst)<<2);
                                  delay=0;
                             }
                             break;
                        case BGEZL:
                             if(delay==0)
                             {
                                  if((int)CPU[rs(inst)]>=0) /* signed compare */
                                  {
                                       delay=1;
                                  }
                                  else
                                  {
                                       pc+=4;
                                  }
                             }
                             else if(delay==3)
                             {
                                  pc+=extend(imm(inst)<<2);
                                  delay=0;
                             }
                             break;
                        case BLTZAL:
                             if(delay==0)
                             {
                                  CPU[RA]=pc+8;
                                  delay=1;
                             }
                             else if(delay==3)
                             {
                                  if((int)CPU[rs(inst)]<0) /* signed compare */
                                  {
                                       pc+=extend(imm(inst)<<2)-4;
                                  }
                                  pc+=4;
                                  delay=0;
                             }
                             break;
                        case BGEZAL:
                             if(delay==0)
                             {
                                  CPU[RA]=pc+8;
                                  delay=1;
                             }
                             else if(delay==3)
                             {
                                  if((int)CPU[rs(inst)]>=0) /* signed compare */
                                  {
                                       pc+=extend(imm(inst)<<2)-4;
                                  }
                                  pc+=4;
                                  delay=0;
                             }
                             break;
                        case BLTZALL:
                             if(delay==0)
                             {
                                  CPU[RA]=pc+8;
                                  if((int)CPU[rs(inst)]<0) /* signed compare */
                                  {
                                       delay=1;
                                  }
                                  else
                                  {
                                       pc+=4;
                                  }
                             }
                             else if(delay==3)
                             {
                                  pc+=extend(imm(inst)<<2);
                                  delay=0;
                             }
                             break;
                        case BGEZALL:
                             if(delay==0)
                             {
                                  CPU[RA]=pc+8;
                                  if((int)CPU[rs(inst)]>=0) /* signed compare */
                                  {
                                       delay=1;
                                  }
                                  else
                                  {
                                       pc+=4;
                                  }
                             }
                             else if(delay==3)
                             {
                                  pc+=extend(imm(inst)<<2);
                                  delay=0;
                             }
                             break;
                   }
                   break;
              case J:
                   if(delay==0)
                   {
                        delay=1;
                   }
                   else if(delay==3)
                   {
                        pc=((pc&0xf0000000)|target(inst))-4;
                        delay=0;
                   }
                   break;
              case JAL:
                   if(delay==0)
                   {
                        CPU[RA]=pc+8;
                        delay=1;
                   }
                   else if(delay==3)
                   {
                        printf("%10d %08x: ",tick+1,pc);
                        for(loop=0;loop<depth;loop+=1)
                        {
                             printf(" ");
                        }
                        pc=((pc&0xf0000000)|target(inst))-4;
                        printf("jumped inside function (%08x)\n",pc+4);
                        depth+=1;
                        delay=0;
                   }
                   break;
              case BEQ:
                   if(delay==0)
                   {
                        delay=1;
                   }
                   else if(delay==3)
                   {
                        if(CPU[rs(inst)]==CPU[rt(inst)])
                        {
                             pc+=extend(imm(inst)<<2)-4;
                        }
                        pc+=4;
                        delay=0;
                   }
                   break;
              case BNE:
                   if(delay==0)
                   {
                        delay=1;
                   }
                   else if(delay==3)
                   {
                        if(CPU[rs(inst)]!=CPU[rt(inst)])
                        {
                             pc+=extend(imm(inst)<<2)-4;
                        }
                        pc+=4;
                        delay=0;
                   }
                   break;
              case BLEZ:
                   if(delay==0)
                   {
                        delay=1;
                   }
                   else if(delay==3)
                   {
                        if((int)CPU[rs(inst)]<=0) /* signed compare */
                        {
                             pc+=extend(imm(inst)<<2)-4;
                        }
                        pc+=4;
                        delay=0;
                   }
                   break;
              case BGTZ:
                   if(delay==0)
                   {
                        delay=1;
                   }
                   else if(delay==3)
                   {
                        if((int)CPU[rs(inst)]>0) /* signed compare */
                        {
                             pc+=extend(imm(inst)<<2)-4;
                        }
                        pc+=4;
                        delay=0;
                   }
                   break;
              case ADDI:
                   CPU[rt(inst)]=CPU[rs(inst)]+extend(imm(inst));
                   break;
              case ADDIU:
                   CPU[rt(inst)]=CPU[rs(inst)]+extend(imm(inst));
                   break;
              case SLTI:
                   CPU[rt(inst)]=0;
                   if((int)(CPU[rs(inst)])<(int)(extend(imm(inst))))
                   {
                        CPU[rt(inst)]=1;
                   }
                   break;
              case SLTIU:
                   CPU[rt(inst)]=0;
                   if(CPU[rs(inst)]<extend(imm(inst)))
                   {
                        CPU[rt(inst)]=1;
                   }
                   break;
              case ANDI:
                   CPU[rt(inst)]=CPU[rs(inst)]&imm(inst);
                   break;
              case ORI:
                   CPU[rt(inst)]=CPU[rs(inst)]|imm(inst);
                   break;
              case XORI:
                   CPU[rt(inst)]=CPU[rs(inst)]^imm(inst);
                   break;
              case LUI:
                   CPU[rt(inst)]=imm(inst)<<16;
                   break;
              case COP0:
                   switch(rs(inst))
                   {
                        case MT:
                             CP0[rd(inst)]=CPU[rt(inst)];
                             break;
                   }
                   break;
              case BEQL:
                   if(delay==0)
                   {
                        if(CPU[rs(inst)]==CPU[rt(inst)])
                        {
                             delay=1;
                        }
                        else
                        {
                             pc+=4;
                        }
                   }
                   else if(delay==3)
                   {
                        pc+=extend(imm(inst)<<2);
                        delay=0;
                   }
                   break;
              case BNEL:
                   if(delay==0)
                   {
                        if(CPU[rs(inst)]!=CPU[rt(inst)])
                        {
                             delay=1;
                        }
                        else
                        {
                             pc+=4;
                        }
                   }
                   else if(delay==3)
                   {
                        pc+=extend(imm(inst)<<2);
                        delay=0;
                   }
                   break;
              case BLEZL:
                   if(delay==0)
                   {
                        if((int)CPU[rs(inst)]<=0) /* signed compare */
                        {
                             delay=1;
                        }
                        else
                        {
                             pc+=4;
                        }
                   }
                   else if(delay==3)
                   {
                        pc+=extend(imm(inst)<<2);
                        delay=0;
                   }
                   break;
              case BGTZL:
                   if(delay==0)
                   {
                        if((int)CPU[rs(inst)]>0) /* signed compare */
                        {
                             delay=1;
                        }
                        else
                        {
                             pc+=4;
                        }
                   }
                   else if(delay==3)
                   {
                        pc+=extend(imm(inst)<<2);
                        delay=0;
                   }
                   break;
              case LW:
                   loop=CPU[rs(inst)]+extend(imm(inst));
                   CPU[rt(inst)]=read32(mem,loop);
                   break;
              case LBU:
                   loop=CPU[rs(inst)]+extend(imm(inst));
                   CPU[rt(inst)]=read32(mem,loop)>>24;
                   break;
              case SB:
                   loop=CPU[rs(inst)]+extend(imm(inst));
                   switch(LUT[(loop>>24)&0x1f])
                   {
                        case REG:
                             mem[REG][loop&0x0fffff]=CPU[rt(inst)]&0xff;
                             break;
                        case PIF:
                             mem[PIF][loop&0x3fffff]=CPU[rt(inst)]&0xff;
                             break;
                        default:
                             mem[LUT[(loop>>24)&0x1f]][loop&0xffffff]=CPU[rt(inst)]&0xff;
                             break;
                   }
                   break;
              case SW:
                   loop=CPU[rs(inst)]+extend(imm(inst));
                   write32(mem,loop,CPU[rt(inst)]);
                   break;
              case CACHE:
                   break;
         }
         CPU[0]=0; /* r0 is hard-wired to zero on MIPS */
         pc+=4;
         tick+=1;
         if(delay>0)
         {
              delay+=1;
              tick-=1;
              if(delay==3)
              {
                   pc-=8;
                   tick+=1;
              }
         }
    }
    for(loop=0;loop<32;loop+=1)
    {
         printf("%s: %08x ",names[loop],CPU[loop]);
         if(loop%4==3)
         {
              printf("\n");
         }
    }
    printf("pc: %08x\n",pc);
    system("pause");
    for(loop=0;loop<8;loop+=1)
    {
         free(mem[loop]);
    }
    return 0;
}
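
For comparison, a common alternative to the `delay` counter used in the listing above is to keep both a `pc` and a `next_pc`, so that the instruction after a branch (the delay slot) always runs before the branch target. The following is only a minimal sketch under stated assumptions: the names `Cpu`, `step`, `run`, `sign_extend16` and `demo_program` are hypothetical, the program lives in a plain word array starting at address 0 rather than the N64 memory map, and only ADDIU and BNE are decoded. Note that the branch offset is sign-extended first and shifted afterwards, which keeps the full 18-bit branch range.

```c
#include <stdint.h>

/* Sign-extend a 16-bit immediate to 32 bits. */
static uint32_t sign_extend16(uint16_t imm)
{
    return (uint32_t)(int32_t)(int16_t)imm;
}

/* Hypothetical minimal CPU state. */
typedef struct {
    uint32_t reg[32];
    uint32_t pc;       /* instruction being executed */
    uint32_t next_pc;  /* instruction that runs after it */
} Cpu;

/* Execute one instruction from a word array whose first word is address 0. */
static void step(Cpu *cpu, const uint32_t *program)
{
    uint32_t inst = program[cpu->pc >> 2];
    uint32_t op = inst >> 26;
    uint32_t rs = (inst >> 21) & 0x1f;
    uint32_t rt = (inst >> 16) & 0x1f;
    uint16_t imm = (uint16_t)(inst & 0xffff);
    uint32_t slot = cpu->next_pc;   /* always executed next (delay slot) */
    uint32_t after = slot + 4;      /* default: fall through */

    switch (op) {
    case 0x09: /* ADDIU rt, rs, imm */
        cpu->reg[rt] = cpu->reg[rs] + sign_extend16(imm);
        break;
    case 0x05: /* BNE rs, rt, offset: target is relative to the delay slot */
        if (cpu->reg[rs] != cpu->reg[rt])
            after = slot + (sign_extend16(imm) << 2);
        break;
    default:   /* everything else is ignored in this sketch */
        break;
    }
    cpu->reg[0] = 0;   /* r0 always reads as zero */
    cpu->pc = slot;
    cpu->next_pc = after;
}

/* Run until pc reaches end_pc; returns the number of instructions executed. */
static unsigned run(Cpu *cpu, const uint32_t *program, uint32_t end_pc)
{
    unsigned steps = 0;
    while (cpu->pc != end_pc) {
        step(cpu, program);
        steps++;
    }
    return steps;
}

/* Countdown loop; the delay slot increments t1 on every pass. */
static const uint32_t demo_program[4] = {
    0x24080003, /* 0x00: addiu t0, r0, 3          */
    0x2508ffff, /* 0x04: addiu t0, t0, -1         */
    0x1500fffe, /* 0x08: bne   t0, r0, back to 4  */
    0x25290001  /* 0x0c: addiu t1, t1, 1  (slot)  */
};
```

To try it, zero a `Cpu`, set `pc` to 0 and `next_pc` to 4, and call `run(&cpu, demo_program, 16)`. With this approach no instruction is ever visited twice, so the tick counter does not need correcting around branches.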
 

TheMrIron2

Well-Known Member
Member
Joined
Aug 5, 2017
Messages
218
Trophies
0
XP
978
Country
Ireland
It would be such a colossal pain in the ass to rework graphics code alone for PC. I'll explain in a little more detail, but my intention here is not to steal the spotlight from the actual developers who have done a ton of work.

Super Mario 64 was an early N64 game that was not among the platform's most extreme efforts, from a programming or technical perspective. But even despite this, it's worth noting a few things about the game - or the N64 hardware in general. The N64 had the main CPU, the function of which can be found by googling what a processor is, but it also had a separate graphics chip - though I'd say it was too primitive to be called a "GPU" by today's standards. This second chip was called the Reality Coprocessor, or RCP.

The RCP was further subdivided into the Reality Signal Processor (RSP), which handled the geometry and some audio, and the Reality Display Processor (RDP), which handled rasterisation and texturing, along with other functionality which I won't delve into in too much detail. The RSP ran "microcode" at runtime, which could be loosely compared to modern-day shader languages (e.g. OpenGL's GLSL) and which is what actually processed the game's display lists. Mario 64 used the very first iteration of Nintendo's microcode, Fast3D (meaning, on a side note, it was never going to be able to push quite as many polygons as later N64 games - but microcode on N64 is a whole other discussion!). As FAST mentioned above, despite passing resemblances to modern shader languages, a lot of this graphics code is very old and is just not conducive to new graphics cards, which are geared for use with modern rendering techniques. Even the Xbox 360 and PS3 were not particularly suited to the PS2's rendering techniques, let alone an even older console vs. the latest PCs.
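
As a rough illustration of what "walking a display list" means, here is a toy sketch in C. It assumes a simplified model in which the list is a buffer of fixed-size 64-bit commands with the opcode in the top byte; the opcode values and the function name are made up for illustration and are not the real Fast3D ones.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for illustration only; not the real Fast3D values. */
enum { TOY_OP_TRIANGLE = 0x01, TOY_OP_END = 0xff };

/* Walk a list of 64-bit commands and count the triangle commands seen
 * before the end command (or before the buffer runs out). */
size_t toy_count_triangles(const uint64_t *list, size_t len)
{
    size_t triangles = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t op = (uint8_t)(list[i] >> 56); /* opcode lives in the top byte */
        if (op == TOY_OP_END)
            break;
        if (op == TOY_OP_TRIANGLE)
            triangles++;
    }
    return triangles;
}
```

A PC port has to replace whatever sits on the other side of that loop: instead of the RSP turning each command into work for the RDP, something has to translate the same commands into calls for a modern graphics API.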

So consider this before considering a PC port: are you willing to rework all of the microcode-related code, all of the RSP/RDP code for graphics and audio, and any MIPS code (if any still exists - I haven't looked at the code out of respect until it's finished) to port this game? Such an undertaking is massive even with knowledge of the system and the game. I really want everyone to temper their expectations of what can be done with this project.

More pragmatically, this could be used to attain better performance on real N64 hardware. That's nice, given that the game does not always maintain its 30FPS target, and many mods suffer from performance problems (mostly due to loading textures into the paltry 4KB cache and the high latency of the RAM, despite the system's advantage of using high-bandwidth cartridges...) - so this opens up new avenues for the incredibly developed modding scene. The microcode could be updated, but even more simply, you could just compile the game with newer tools and with actual optimisation flags this time (Thanks, Nintendo!) to see some gains.
And while I am not ruling out a port of the game to other platforms (primarily PC, but not limited to it!), it would require a lot of work, even if you know exactly what you're doing and understand the original code, the hardware and your target platform.
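
To put the 4KB texture cache mentioned above into numbers, here is a small sketch in C. It is plain arithmetic on the 4KB figure from this post; the function name is made up, and it ignores any extra hardware restrictions on how textures are laid out.

```c
#include <stdint.h>

#define TEXTURE_CACHE_BYTES 4096u /* the 4KB figure quoted in the post */

/* Number of texels that fit in the cache at a given bit depth. */
uint32_t texels_that_fit(uint32_t bits_per_texel)
{
    return (TEXTURE_CACHE_BYTES * 8u) / bits_per_texel;
}
```

At 16 bits per texel that is 2048 texels (for example 32x64), and at 32 bits per texel only 1024 (32x32). Anything bigger has to be split up and loaded in pieces, which is where the RAM latency starts to hurt.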

TLDR: Porting is complicated. If you want to understand why, you can read the whole post, but at least N64 owners can benefit from better performance if someone wants to work on that.
 

nim-ka

Some little comments:

>any MIPS code (if any still exists - I haven't looked at the code out of respect until it's finished)
No assembly is left except that of a few audio functions whose C has not yet been made to match the assembly 1-1; for hacks and ports, this obviously doesn't matter.

>The microcode could be updated, but even more simply, you could just compile the game with newer tools and with actual optimisation flags this time (Thanks, Nintendo!) to see some gains.
Compiling SM64 with optimizations and using F3DEX/F3DEX2 microcode has been done and does allow for huge performance improvements. These have been implemented in the SM64 Randomizer.
 

TheMrIron2

>No assembly is left except that of a few audio functions whose C has not yet been made to match the assembly 1-1; for hacks and ports, this obviously doesn't matter.

>Compiling SM64 with optimizations and using F3DEX/F3DEX2 microcode has been done and does allow for huge performance improvements. These have been implemented in the SM64 Randomizer.

Thanks for the clarification, I appreciate it. I imagined that for a decompilation project, no MIPS code would remain as the end goal, but I couldn't be sure.

No surprise that newer microcode allows for massive performance improvements. I'm not sure how "huge" we're talking here - I could hardly imagine a 60FPS version of Mario 64 on real hardware - but newer microcode is stupidly more efficient than the original Fast3D, for sure. Factor 5 went ballistic with their own microcode instead of SGI's and fair play to them, they really pulled off some cool stuff.

I'm fairly sure the game gets penalised harshly based on fillrate and bandwidth rather than geometry (perhaps that's more of a general N64 issue); to my understanding, the game uses some Gouraud shading just to avoid loading new textures and wasting time with memory latency and the texture cache, and mod performance is heavily affected. Glad to see this improved at least.

Has anyone tried to make a "high resolution" patch for Mario 64, in the vein of GoldenEye 64's 640x480 hack? I'm not aware of one, but if one exists, pardon my ignorance - while I've contributed to an N64 emulator myself and I understand the hardware, I'm quite out of touch with the scene.
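
For a sense of scale on the fillrate point, here is a quick sketch in C comparing a 320x240 framebuffer with the 640x480 mode mentioned above. It assumes a 320x240 base resolution and 16-bit colour; the function names are made up for illustration.

```c
#include <stdint.h>

/* Bytes needed for one framebuffer at the given size and colour depth. */
uint32_t framebuffer_bytes(uint32_t width, uint32_t height, uint32_t bits_per_pixel)
{
    return width * height * (bits_per_pixel / 8u);
}

/* How many times more pixels the target mode has than the base mode. */
uint32_t pixel_ratio(uint32_t base_w, uint32_t base_h,
                     uint32_t target_w, uint32_t target_h)
{
    return (target_w * target_h) / (base_w * base_h);
}
```

`framebuffer_bytes(320, 240, 16)` gives 153600 bytes and `framebuffer_bytes(640, 480, 16)` gives 614400, so a 640x480 mode means four times as many pixels to fill every frame, before counting the depth buffer.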
 

CrashOveride

Well-Known Member
Newcomer
Joined
May 29, 2017
Messages
50
Trophies
0
XP
127
Country
United States
>Has anyone tried to make a "high resolution" patch for Mario 64, in the vein of GoldenEye 64's 640x480 hack? I'm not aware of one, but if one exists, pardon my ignorance - while I've contributed to an N64 emulator myself and I understand the hardware, I'm quite out of touch with the scene.

SubDrag did a patch... it lagged like hell.

>If it's available, it's public domain. It's what I live by.

This makes absolutely no sense from a legal perspective. That's like if I hack Valve and steal the source to GoldSrc, then spread it around because the binary exists.

Another example is including the SM64 ROM with the Rom Manager. I mean, it's publicly available, right?
 
