SGI O2 / IP32 PROM rev4.3 firmware loop after SHDR handoff, looking for hardware-accurate guidance

theck42

Mar 24, 2026
Hello everyone,


I am working on a hardware-oriented SGI O2 / IP32 simulation-emulation project and I am trying to stay as faithful as possible to the real machine, not just patch around boot loops.


Current state:


  • PROM handoff works
  • SHDR is detected correctly
  • post1 runs
  • firmware is copied to RAM at 0x81000000
  • firmware execution begins successfully
  • storage bridge is visible
  • CD-ROM and HDD #1 are both attached and probed

I tested two PROMs:


  • ip32prom.rev4.18.bin
  • ip32prom.rev4.3.bin

Interesting result:


  • rev4.18 tends to get stuck in a firmware loop that looks related to CP0 / TLB / cache / low-level init
  • rev4.3 does not end in the same loop; instead it appears to remain in a repeated firmware memory-clear / zeroing routine
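One way to tell these two loops apart in a trace is to summarize a window of executed instructions: which PCs dominate, and whether the stores still reach new addresses. The following is only a minimal sketch, assuming the emulator can emit `(pc, store_address)` pairs; `summarize_window` and its field names are hypothetical and not taken from any existing tool.

```python
from collections import Counter


def summarize_window(trace, hot_fraction=0.9):
    """Summarize a window of (pc, store_addr_or_None) pairs.

    Hypothetical helper: reports the PC span of the hottest instructions
    and whether every store in the window touched a new address.
    """
    pcs = Counter()
    stores = []
    for pc, store in trace:
        pcs[pc] += 1
        if store is not None:
            stores.append(store)
    if not pcs:
        return None

    # Collect the most frequent PCs until they cover hot_fraction of the window.
    total = sum(pcs.values())
    hot, covered = [], 0
    for pc, count in pcs.most_common():
        hot.append(pc)
        covered += count
        if covered >= hot_fraction * total:
            break

    return {
        "loop_pcs": (min(hot), max(hot)),
        "distinct_pcs": len(pcs),
        "store_range": (min(stores), max(stores)) if stores else None,
        # True when no address is written twice, i.e. the clear makes progress.
        "stores_advancing": bool(stores) and len(set(stores)) == len(stores),
    }
```

A memory clear that is merely slow shows `stores_advancing` as true with a growing `store_range`; a clear that keeps restarting rewrites the same addresses, and a CP0/TLB polling loop typically shows few or no stores at all.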

From my traces, the system is already past the simple PROM stage and sees storage, so I suspect the next real problem is one of these:


  1. incomplete CP0 / TLB / MMU behavior
  2. inaccurate CRIME / MACE interaction
  3. missing hardware-level expectations during low-level firmware memory initialization
  4. incorrect assumptions about O2 onboard SCSI / PCI topology behind MACE

What I am trying to do:


  • stay on PROM rev4.3
  • model the O2 as a real IP32 system
  • eventually reach true ARCS boot behavior
  • then load sashARCS, fx.ARCS, and finally perform a real IRIX installation path using CD-ROM + HDD #1

Questions:


  1. On a real O2, during early firmware RAM execution after post1, what hardware blocks are most critical to get right first: CP0/TLB, CRIME memory behavior, MACE PCI bridge behavior, or onboard dual SCSI?
  2. Are there known PROM rev4.3 behaviors involving long memory-clear loops that usually indicate a missing hardware acknowledgement or MMU/TLB issue?
  3. Does anyone have notes about which CRIME/MACE registers the O2 PROM depends on most heavily before ARCS media loading begins?
  4. Are there known differences between rev4.3 and rev4.18 in early firmware behavior that could explain why one falls into a TLB-style loop and the other into a memory-clear loop?
  5. If anyone has logic traces, register notes, PROM reverse engineering notes, or hardware bring-up observations for early O2 firmware stages, I would be very grateful.

My goal is not just to “boot somehow,” but to reproduce the real machine behavior as closely as possible.


Thank you very much.
 
Regarding question 5, maybe these help:


 
thx

I’m currently working on building an SGI O2 emulator/simulator to install the OS and actually get it running. I’ll be back soon with updates on my progress!

[UARTINJECT] pc=0xffffffff81009824 injected=0x0d into serial1 RX
[UARTLINE] ds2502_get_eaddr: ds2502_read_rom failed!
[UARTLINE] ds2502_get_eaddr: ds2502_read_rom failed!
[UARTLINE] Cannot connect to keyboard -- check the cable.
[UARTLINE] Cannot open keyboard() for input
[UARTLINE] Cannot connect to keyboard -- check the cable.
[UARTLINE] Cannot open keyboard() for input
[UARTLINE] Initialized tod clock.
[UARTLINE] Initialized tod clock.
[UARTLINE] Initialized tod clock.
[UARTLINE] Initialized tod clock.
[UARTLINE] Initialized tod clock.
[UARTLINE] Initialized tod clock.
[UARTLINE] Running power-on diagnostics...
[UARTINJECT] menu detected, injected '2\r' into serial1 RX
[UARTLINE] System Maintenance Menu
[UARTLINE] 1) Start System
[UARTLINE] 2) Install System Software
[UARTLINE] 3) Run Diagnostics
[UARTLINE] 4) Recover System
[UARTLINE] 5) Enter Command Monitor
[UARTLINE] Option? 2
[UARTLINE] Installing System Software...
[UARTLINE] Press <Esc> to return to the menu.
[UARTLINE] 1) Remote Tape 2) Remote Directory X) Local CD-ROM X) Local Tape
[UARTLINE] Enter 1-4 to select source type, <esc> to quit,
PC final = 0x810090f8
Steps executed = 8000000
 
Running power-on diagnostics...



System Maintenance Menu

1) Start System
2) Install System Software
3) Run Diagnostics
4) Recover System
5) Enter Command Monitor

Option? 4


System Recovery...

Press <Esc> to return to the menu.



1) Remote Tape 2) Remote Directory X) Local CD-ROM X) Local Tape

Enter 1-4 to select source type, <esc> to quit,
or <enter> to start: 1
 
Terminal ready


ds2502_get_eaddr: ds2502_read_rom failed!
ds2502_get_eaddr: ds2502_read_rom failed!
adp78edtinit: init_adapter failedadp78edtinit: init_adapter failedCannot connect to keyboard -- check the cable.
Cannot open keyboard() for input
Cannot connect to keyboard -- check the cable.
Cannot open keyboard() for input

Initialized tod clock.

Initialized tod clock.

Initialized tod clock.

Initialized tod clock.

Initialized tod clock.

Initialized tod clock.



Running power-on diagnostics...



System Maintenance Menu

1) Start System
2) Install System Software
3) Run Diagnostics
4) Recover System
5) Enter Command Monitor

Option? 5
5
Command Monitor. Type "exit" to return to the menu.
> boot -f scsi(0)cdrom(4)partition(8)sashARCS

[ARCS-SHIM] sashARCS shim -> partition(7)/stand/fx.ARCS --x

[ARCS-SHIM] boot file not found in EFS partition 7: stand/fx.ARCS
>
 
Well, FWIW, the IP22 PROM maps every memory bank for discovery using multiple 16 MB TLB entries. The whole memory clear is then done after the final bank map, using VDMA.
But you really want to add a lot of introspection to your simulator to be able to figure out the problems.
Also, what is most important? Correct CPU emulation, before you move on to anything else. Get your TLB, address translation, and even the FPU in order: the PROM will use FPU registers as storage for extra variables in very early init. Then MC and memory emulation.
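To make the 16 MB TLB mapping concrete, here is a minimal sketch of an R4000-style TLB lookup with variable page sizes. It assumes 32-bit addresses and the R4000-family layout, in which each entry maps an even/odd page pair and a PageMask of 0x01FFE000 selects 16 MB pages. The class and function names are hypothetical, and cache attributes, the dirty bit, and 64-bit regions are deliberately left out.

```python
from dataclasses import dataclass

PAGEMASK_16MB = 0x01FFE000  # R4000-family PageMask value for 16 MB pages


class TlbRefill(Exception):
    """No entry matched the virtual address."""


class TlbInvalid(Exception):
    """An entry matched, but the selected page is not valid."""


@dataclass
class TlbEntry:
    vpn2: int        # virtual base address of the even/odd page pair
    pagemask: int    # PageMask register value for this entry
    asid: int
    is_global: bool
    pfn0: int        # even page: physical address >> 12
    valid0: bool
    pfn1: int        # odd page: physical address >> 12
    valid1: bool


def translate(entries, vaddr, asid):
    """Translate a 32-bit virtual address, or raise TlbRefill/TlbInvalid."""
    for e in entries:
        pair_mask = e.pagemask | 0x1FFF          # bits inside the page pair
        if (vaddr & ~pair_mask & 0xFFFFFFFF) != (e.vpn2 & ~pair_mask & 0xFFFFFFFF):
            continue
        if not e.is_global and e.asid != asid:
            continue
        page_size = (pair_mask + 1) >> 1         # 16 MB for PAGEMASK_16MB
        odd = bool(vaddr & page_size)            # bit just above the page offset
        pfn, valid = (e.pfn1, e.valid1) if odd else (e.pfn0, e.valid0)
        if not valid:
            raise TlbInvalid(hex(vaddr))
        base = (pfn << 12) & ~(page_size - 1)
        return base | (vaddr & (page_size - 1))
    raise TlbRefill(hex(vaddr))
```

A common mistake in emulators is to hard-code the 4 KB case, so the even/odd selection uses bit 12 instead of the bit just above the page offset (bit 24 for 16 MB pages). That kind of bug can look like a memory test or clear that never finishes.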
 
The goal is to achieve a fully functional and usable emulation.

Most importantly, we have already overcome the most complex part: the program loads the PROM correctly and brings up the CPU and the bus, and the project already has a stable MMIO infrastructure with CRIME, MACE, the PCI host, the timer, the UART, and a minimal PS/2 interface. The CPU core also includes several robust accelerations for resource-intensive PROM loops.

My assessment by critical point:

Overall execution infrastructure (CPU/PROM): 80%

The core boots, runs fully, and we have already overcome several performance hurdles thanks to the PROM accelerations.

Critical MMIO (CRIME, MACE, PCI, timer, UART): 75%

The infrastructure is in place and consistent. The bus already exposes the core blocks, the CRIME timer is modeled, the MACE startup polling is also implemented, and the local 16550 UART exists with a feedback loop.

Console/UART useful for bring-up: 80%

We have already observed that the PROM communicates via the UART and performs keyboard/TOD tests. In terms of visibility for debugging, we are therefore much better off than at the beginning. UART and console auto-injection are already implemented in the bus.
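For readers who want to reproduce the auto-injection idea, this is a minimal sketch of a 16550-style UART model that logs transmitted lines and feeds a canned reply into the RX queue when a prompt appears. It is not the project's actual code: `Uart16550Sketch` and `auto_replies` are hypothetical names, only the data register and the Line Status Register are modeled, and registers are addressed by index rather than by their real MACE offsets.

```python
from collections import deque


class Uart16550Sketch:
    RBR_THR = 0            # receive buffer (read) / transmit holding (write)
    LSR = 5                # line status register
    LSR_DATA_READY = 0x01
    LSR_THR_EMPTY = 0x20
    LSR_TX_EMPTY = 0x40

    def __init__(self, auto_replies=None):
        self.rx = deque()                        # bytes waiting for the guest
        self.line = bytearray()                  # current partial TX line
        self.lines = []                          # completed TX lines
        self.auto_replies = dict(auto_replies or {})

    def inject(self, data: bytes):
        """Queue bytes as if they had arrived on the serial line."""
        self.rx.extend(data)

    def read(self, reg):
        if reg == self.RBR_THR:
            return self.rx.popleft() if self.rx else 0
        if reg == self.LSR:
            status = self.LSR_THR_EMPTY | self.LSR_TX_EMPTY  # TX never stalls
            if self.rx:
                status |= self.LSR_DATA_READY
            return status
        return 0                                 # other registers not modeled

    def write(self, reg, value):
        if reg != self.RBR_THR:
            return
        byte = value & 0xFF
        self.line.append(byte)
        text = self.line.decode("ascii", errors="replace")
        for prompt, reply in list(self.auto_replies.items()):
            if text.endswith(prompt):
                self.inject(reply)
                del self.auto_replies[prompt]    # answer each prompt once
        if byte == 0x0A:
            self.lines.append(text.rstrip("\r\n"))
            self.line.clear()
```

Constructing it as `Uart16550Sketch({"Option? ": b"2\r"})` would answer the maintenance menu prompt once with option 2, much like the injection shown in the log earlier in the thread.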

Keyboard/PS/2: 65%
Sufficient to overcome the initial hurdle, but it is not yet "perfectly finished." The basis for PS/2 exists in mmio_boot.py, but it is still a minimal model.

Video/GBE/Physical Display: 35%
We have MMIO presence and tracing around the console/video area, but not yet a video pipeline that I would consider "fully unlocked." The runtime environment can trace this area, but it's not yet a fully functional SGI O2 display.

ARC Environment/Boot Firmware Logic: 55%
This is where we are currently working. The good news is that we are no longer blocked by a hardware issue: we are now working on the firmware logic, specifically the ARC splitter and variables.

Functional Boot from CD/Disc: 20-25%
We have done a significant amount of preliminary cleanup, but we cannot yet say that "the road to a bootable IRIX system is almost ready." The firmware/ARC logic still needs to be implemented before we can expect a clean boot.

In summary:

Minimal emulated hardware: very advanced
PROM firmware: very advanced
Splitter/ARC: in progress
Video: still far from perfect
OS boot: still far from perfect

To sum it up honestly:

We've completed about 75% of the low-level work, but only 25% of the work needed for a fully functional boot.

And the good news is that the current bottleneck is more firmware-related than hardware-related. This is great news for the future, as it means we've already accomplished a significant portion of the foundational work.
 
  • Out of the PROM stage: about 100%
  • Overall emulator boot progress: about 45%
What that means:

  • We are no longer stuck in PROM execution.
  • The emulator is now reaching later boot code in RAM / kseg0, not just PROM addresses.
  • But we are not yet near a full IRIX boot, because the current run still stops on a later CPU instruction issue at 0xffffffff81053160.
So the short English version is:

“PROM exit progress: 100%. Full boot progress: roughly 45%.”
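For anyone reading the addresses in these logs, the sketch below classifies a MIPS virtual address into its 32-bit compatibility segment and, for the unmapped segments kseg0 and kseg1, returns the physical address. It is a rough helper under stated assumptions: `classify` is a hypothetical name, and a value with all-zero upper 32 bits is treated as a 32-bit address printed without sign extension.

```python
def classify(vaddr):
    """Return (segment_name, physical_address_or_None) for a MIPS address."""
    upper, low = (vaddr >> 32) & 0xFFFFFFFF, vaddr & 0xFFFFFFFF
    expected_upper = 0xFFFFFFFF if low & 0x80000000 else 0
    # Assumption: upper == 0 means a 32-bit address printed unextended.
    if upper not in (0, expected_upper):
        return ("64-bit segment", None)
    if low < 0x80000000:
        return ("kuseg", None)                 # mapped through the TLB
    if low < 0xA0000000:
        return ("kseg0", low - 0x80000000)     # unmapped, cached
    if low < 0xC0000000:
        return ("kseg1", low - 0xA0000000)     # unmapped, uncached
    if low < 0xE0000000:
        return ("ksseg", None)                 # mapped through the TLB
    return ("kseg3", None)                     # mapped through the TLB
```

With this, the failing PC 0xffffffff81053160 comes out as kseg0 at physical 0x01053160, that is, code running from RAM, while the MIPS reset vector 0xffffffffbfc00000 comes out as kseg1 at physical 0x1fc00000.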
 
