O2 motherboard repairs

sdz

Member
Dec 10, 2025
Got two broken O2 motherboards a little while back.

Board #1:

S1.jpg

This board gave a solid red light when powering up, regardless of whether RAM or a CPU was installed. No signs of life via serial or VGA. It also looks brand new, basically unused (except for a dent on the top serial port).

Since it behaved the same with or without a CPU inserted, I inspected the area near the CPU and found this:

S2.jpg

S3.jpg

This is a factory defect: the inductor was never soldered on one side (the pad still has the ENIG finish, while the other side is soldered). This inductor powers the 133MHz oscillator on the other side of the board:

S4.jpg

Soldered the inductor and now the board is fully functional.
 
Board #2:

S1A.jpg

This board gave a solid amber light when powering up. No signs of life via serial or VGA. With no RAM installed, still solid amber light (it should be blinking amber if it got to the point of checking the RAM).
However, without a CPU module installed, it gave a solid red light. So at least it tries to do something when a CPU is installed.

Checked for shorts, broken components, broken traces, broken solder joints (CPU module connectors), power rails were all good, oscillators were all good.

So maybe the board is trying to start up, but something is wrong: either a broken ASIC or a corrupted PROM.

The PROM resides in the flash underneath the Dallas IC:

S2A.jpg

Now, I really don't want to use hot air to remove the flash IC. Besides the fact that I'd melt the socket, there is a chance that the flash IC itself is fine and only the data is corrupted for whatever reason. And in my experience, these old flash ICs do not much appreciate being heated.

So, with a desoldering gun, I proceed to remove the Dallas socket:

S3A.jpg

Then, with the soldering iron, I heat up all the pins on one side, while gently lifting the IC up with tweezers, and then do the same for the other side:

S4A.jpg


Now, using my not at all cursed TSOP32 to DIP adapter, I proceed to dump the IC with a TL866 programmer:

S5A.jpg

S6A.png

And it worked, at least the flash IC isn't completely dead. Now, to get a PROM image and write to the flash.
I could desolder and dump the flash from the two other working O2 motherboards I have, but I'd like to avoid that.


My working O2 has 6.5.30 installed, and under /usr/cpu/firmware/ there's ip32prom.image (4.18) that can be used to flash the PROM with flashinst (if the system is working of course).

I scp-ed this file to a Linux box. Now, this may be a raw file, to be written byte by byte to the flash, or something else. Always good to make a diff first, even if what I dumped from the flash IC is maybe corrupted and maybe a different version.

S7A.png

And indeed, there is an extra header from 0x00 to 0xFF compared to the raw flash dump. Besides that, they are quite similar. No extra stuff at the end of the file.

So, just write ip32prom.image to the flash IC, skipping the header, i.e. starting from offset 0x100 in the file.
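The header-stripping and comparison step can be sketched in a few lines of Python. This is only an illustration, assuming the header is exactly the 0x100 bytes seen in the diff above; the function names and the synthetic byte strings are made up for the example, and with real files you would read ip32prom.image and your own flash dump instead.

```python
HEADER_SIZE = 0x100  # header occupies 0x00-0xFF of ip32prom.image, per the diff above


def strip_header(image: bytes, header_size: int = HEADER_SIZE) -> bytes:
    """Return the part of the image that goes to flash address 0."""
    if len(image) < header_size:
        raise ValueError("image is shorter than the header")
    return image[header_size:]


def first_differences(a: bytes, b: bytes, limit: int = 16) -> list[tuple[int, int, int]]:
    """List up to `limit` (offset, byte_a, byte_b) tuples where a and b differ."""
    diffs = []
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            diffs.append((offset, x, y))
            if len(diffs) == limit:
                break
    return diffs


# Synthetic stand-ins, NOT real PROM contents.
fake_image = bytes(HEADER_SIZE) + b"\x01\x02\x03\x04"
fake_dump = b"\x01\x02\xff\x04"

payload = strip_header(fake_image)
print(f"payload {len(payload):#x} bytes, dump {len(fake_dump):#x} bytes")
for offset, x, y in first_differences(payload, fake_dump):
    print(f"{offset:#06x}: image {x:02x}  dump {y:02x}")
```

The stripped payload is what would be handed to the programmer software as a raw file. Comparing lengths first is a cheap check that nothing extra sits at the end of the image.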

S8A.png

S9A.png

And good news, it verified successfully, so the flash IC is OK (well, maybe it should be replaced in the future if something similar happens).

Soldered the flash back on the MB, soldered the socket, plugged in the Dallas IC and then plugged the MB into the chassis with only the CPU installed, and, blinking amber light!

Good, it should be blinking as I haven't installed RAM yet. The fact that it's blinking means that the PROM was executed!

Inserted some RAM, and:

S10A.jpg
 
S11A.jpg

(yes, I don't have a SoG VGA monitor).

Also, if anyone is curious what the ASICs look like under the heatsinks (I wasn't able to find any photos), at least for this specific MB (030-1038-004 REV B):

S12A.jpg

The GBE is EVIL!
 
That missing solder on the inductor supposed to feed the oscillator is a really good catch.

Help clarify something for me - I've been running the ip32prom.rev4.18.bin through Gxemul in recent days, knowing that it's missing something at the beginning.

From what I know, the ip32prom.rev4.18.bin that is floating around is usually obtained with the dump command, which clearly doesn't capture the whole contents that the PROM chip needs in order to boot, whereas ip32prom.image does have the code necessary to boot successfully.

Speculating, but that second non-booting main board may have been accidentally flashed with ip32prom.rev4.18.bin by someone thinking that they were upgrading the PROM to 4.18.
 
Those extra bytes at the start of ip32prom.image don't get written to the flash IC. Most likely they are there so that the flashing tool can identify what it's actually flashing.
I don't think the flashing tool will let you flash a PROM image without that header. I can make more tests after I add a test socket (for the flash IC) to an O2 MB, so that I can easily recover it.
 
I think you're correct about them being instructions for the flashing tool: a sanity check, a version check and perhaps a functional check of various pin states.

It's interesting, because I was looking specifically at the opening ~32 bytes of ip32prom.rev4.18.bin using Gxemul, stepping through from 0x00000000.

First output Gxemul gives is a TLBL exception - this could be incorrect BTW, or my misinterpretation of what Gxemul is actually doing.

Code:
[ exception TLBL vaddr=0x0000000000000000 pc=0x0000000000000000 ]
ffffffff80000180: 3c1abfc8        lui    k0,0xbfc8
GXemul>
ffffffff80000184: 375a8888        ori    k0,k0,0x8888
GXemul>
ffffffff80000188: 03400008        jr    k0    <[ARCBIOS entry]+0x888>

The arcbios entry (not implemented BTW) I thought was also interesting.
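For anyone following the trace, the `lui`/`ori` pair simply builds a 32-bit jump target in `k0`. A small sketch of the arithmetic follows; `PROM_BASE` is the standard MIPS reset vector in kseg1, where a boot PROM is normally mapped, and applying it to this particular image is an assumption on my part. The helper name is made up.

```python
def lui_ori(upper: int, lower: int) -> int:
    """Combine a lui immediate (upper 16 bits) with an ori immediate (lower 16 bits)."""
    return ((upper & 0xFFFF) << 16) | (lower & 0xFFFF)


PROM_BASE = 0xBFC00000  # MIPS reset vector (kseg1); assumed mapping of the PROM

target = lui_ori(0xBFC8, 0x8888)
print(hex(target))              # 0xbfc88888
print(hex(target - PROM_BASE))  # 0x88888
```

GXemul prints sign-extended 64-bit addresses, so the same target would show up there as 0xffffffffbfc88888. The sketch stays in 32 bits for readability.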

There are also two 16K boot blocks on the Atmel chip.

I think some interesting stuff happens there.

BTW, I'm also in the process of designing a test socket to read and flash the PROM IC in situ, related to the RM7900 upgrade. It's an idea I had several years ago (the NVMe one), and your work has inspired me to get it done.
 
RM7900 upgrade sure would be nice :) . There is that 4.13 leaked PROM that might help.
The (other) PROM on the CPU module side should be pretty easy to figure out.
 
Patch SG0003530: O2 PROM rollup 4.11 1259999908 IRIX 6.3 Patches for RM5200 Support March 1999

That one interests me the most as it added another CPU to the PROM

There's a full suite of patches on the discord that Jan-jaap and jrra kindly provided - it's in there

https://discord.com/channels/524976640468713497/531292086125854720/1315661955117355118

https://discord.com/channels/524976640468713497/531292086125854720/1315738205584228393

It took me some time to correctly read the 32-bit boot-time mode setup string from Chicago Joe's 600MHz mod. It's read out little endian, because the CPU isn't yet set to big endian. That caught me out; byte flipping made the values sensible and legal.

Now that I've got the RM7965 datasheet, with luck the Xilinx PROM part is doable, as I perhaps know what I am doing as far as the RM7900 boot-time mode setup string goes.

I need to cross-compare the RM7000C boot-time mode setup against the RM7900 values; I don't think the 600MHz one would work for the RM7900.
 
Thanks for linking those patches!

Good observation about the endianness! I would likely have been bitten by that too.

I will be making a PROM emulator of sorts that replaces the Xilinx part and is reprogrammable (one can change values live with a USB cable connected).
 

With all of the boot-time mode strings at cold boot = 0, MIPS boots in little endian, it's actually right there in the datasheet, because bit 8 hasn't been set to 1 for big endian.

https://picture.iczhiku.com/resource/eetop/ShkdaOJfHhEFimMM.pdf

Chicago Joe's set:
Hex: 0x80550102 -> Binary: 10000000 01010101 00000001 00000010

I used online byte swapping, or what I thought were byte swapping tools, on the hex, and on the 32 bit binary, they swapped by nibble, not byte - not helpful.

Seems there's no strict rule for this as it depends on the context, it can be by nibble, byte, double-byte or 32 bit word.
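To make the difference concrete, here is a small Python sketch of a byte swap versus a nibble reversal on the 32-bit word above, plus a helper for pulling out a bit field such as 17:16. The function names are made up for illustration, and the sketch does not say which form is the as-stored one or what any field means; that still has to come from the datasheet.

```python
def swap_bytes32(value: int) -> int:
    """Reverse the order of the four bytes in a 32-bit word."""
    return int.from_bytes(value.to_bytes(4, "big"), "little")


def swap_nibbles32(value: int) -> int:
    """Reverse the order of the eight 4-bit nibbles (what some online tools do)."""
    result = 0
    for _ in range(8):
        result = (result << 4) | (value & 0xF)
        value >>= 4
    return result


def field(value: int, hi: int, lo: int) -> int:
    """Extract bits hi..lo (inclusive, bit 0 = least significant)."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


word = 0x80550102
print(f"{swap_bytes32(word):#010x}")    # 0x02015580
print(f"{swap_nibbles32(word):#010x}")  # 0x20105508
print(field(swap_bytes32(word), 17, 16))
```

Doing the swap locally avoids guessing what granularity an online tool uses.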

It really did my head in.

If I couldn't get Chicago Joe's known-good working set to correspond correctly with the RM7000 datasheet...

I knew that I had no chance of getting the RM7900 boot-time mode set string correct (it's longer than 32 bits), from the RM7965 datasheet:

https://www.datasheets360.com/pdf/-6475857418499999415 - pages 48 -> 52, significantly different.

BTW, if you are able to read an R5K @ 180MHz or 200MHz boot-time string and/or an RM5200 @ 300MHz mode string, I would love to compare bits 17:16 of Chicago Joe's set as I've interpreted them according to the RM7000 datasheet. Even though it's ignored, it's only one out of place, so to speak.

I just bought a TL866/T48, it's en route.

A Xilinx PROM emulator, that would be a thing of beauty.
 
Thanks for Chicago Joe's set. Wasn't able to find that one. I will compare it at some point against the datasheet.

I will dump all the PROMs I have, but waiting to get a few more CPU modules first. Currently I only have an R5k 180MHz and an R10k 150MHz.
 
I got the AT17LV65A.bin from discord - I mistakenly thought it was hosted on the sgidepot site - seems not - it won't allow me to upload the binary file, will forward to you on discord

Good thread here


The R5K boot time mode strings interest me from the perspective of verifying them against a data sheet to verify the RM7000C byte flipping
 
I will dump the R5K PROM (180MHz, 512kB) really soon (maybe in a day or two). I've almost finished designing the PROM emulator, and I will need to validate some things on the O2 before ordering the boards (which will involve sniffing the PROM while the system starts).
 
The actual pathway of that boot-time mode string is a bit mysterious. I'm not sure if it travels from the Xilinx PROM chip to the CRIME chip and then to the CPU itself. I thought I needed to know that to decode it, until I figured out the byte flipping (swizzling).
 
CRIME reads the Xilinx PROM, straps the SysAd bus accordingly and pulls the CPU out of reset.
On the PROM-less modules there are resistor straps on the bus.

Edit: Just so I don't spread misinformation. CPU mode bits are set like this:

On R5k/R7k modules, the CPU clocks a serial PROM (present on the CPU module) and reads it to configure itself.
On R10k/12k, the CPU can't configure itself via serial PROM. It needs to have the SysAD bus strapped externally. There are two main revisions of the carrier board:
-older revision with an FPGA. There is a PROM IC on this board, attached to the FPGA. This PROM holds the FPGA bitstream. Inside the FPGA bitstream there's also the SysAD (mode bits) config, that the FPGA sets on the bus.
-newer revision with ASIC. No PROM on this board. SysAD (mode bits) configuration is set by resistor straps.

The CRIME 1.5 spec says that CRIME is responsible for strapping the bus. That is incorrect. CRIME never configures mode bits on any of the CPU modules (R5k/7k/10k/12k), and it also doesn't read any serial PROM on the CPU module.
 

R5K 180MHz 512kB PROM dump available here: https://forums.sgi.sh/index.php?threads/o2-cpu-modules-prom-dumps-and-analysis.1502/
 
Board #3:
S1.jpg

Got this board a few days ago as part of a very sorry looking system, supposedly broken (blinking amber light). After reseating everything, it worked, there was nothing wrong with it. Until I killed it just now.

I was using it to probe some signals and, being a bit lazy, I did not disconnect the logic analyzer probes while trying to probe something with an oscilloscope. The result was that the 3.3V rail shorted to ground. The system powered off by itself. Removed power, waited a bit, applied power and the system powered on.
However, all that I got was a solid amber light and exactly one newline over serial.

Checked all the power rails and they were fine, so were all the inductors on the board. CPU module was functioning OK in another system.

Removed the RAM, and I got a blinking amber light, which means that the PROM is executed and it actually detects that there is no RAM.
Did exactly what I did to MB#2, removed the flash IC and flashed it externally, and now the system is working again.

Thought this might be interesting info, as at the moment a corrupt PROM is known to:
-give a solid amber light, regardless if RAM is present or not
-give a solid amber light with RAM installed, blinking amber light without RAM installed, otherwise dead system.
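As a rough summary, the LED observations from boards #1 to #3 can be written down as a small lookup. This is only a sketch of what was seen in this thread, not an official diagnostic table; the function name, argument names and the wording of the hints are made up for the example.

```python
def triage(led: str, cpu: bool, ram: bool) -> str:
    """Map an observed LED state to the hint suggested by boards #1-#3 above."""
    if led == "blinking amber":
        if not ram:
            return "PROM is executing and sees no RAM; install RAM and retest"
        return "reseat RAM and everything else (fixed board #3 at first)"
    if led == "solid red":
        if not cpu:
            return "seen with no CPU module on board #2; install the CPU and retest"
        return "check power and clock parts near the CPU (board #1 inductor)"
    if led == "solid amber":
        return "suspect a corrupt PROM; consider reflashing externally (boards #2, #3)"
    return "not covered in this thread"


print(triage("solid amber", cpu=True, ram=True))
print(triage("blinking amber", cpu=True, ram=False))
```

The useful split is whether the light changes when RAM is removed: blinking amber without RAM means at least part of the PROM runs, which still does not rule out corruption.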
 
Board #4:

S1.jpg

This one showed no signs of life whatsoever. It wouldn't even turn on the PSU, even though the bypass jumper was installed.

On the O2 MB, the power button on the frontplane is connected to the Dallas IC. If the Dallas IC is actually programmed and the power button is pressed, the Dallas IC will turn on the power supply (it pulls a signal low that goes from the Dallas IC, through the frontplane, to the PSU). The jumper, when installed, connects that signal directly to ground, thus forcing the PSU on regardless of whether the Dallas IC is programmed or even present.

There is nothing else going on with that signal, so it was weird that the system would not turn on with the jumper installed. Here is the jumper next to the Dallas IC:

S2.jpg

I proceeded to remove the Dallas IC, and after that, with the jumper installed, the system powered on. This made absolutely no sense, as there was nothing that the Dallas IC could do to prevent the system powering on when the jumper is installed. After installing the IC back, the system still powered on.

Inspected the other side of the board, and this is what the jumper solder joints looked like:
S3.jpg

One pin was totally floating. When I removed the Dallas IC, I probably moved it just enough so that the jumper started working. Soldered this back on.

Now, the system powered on, but not much else, just a solid amber light. Without the RAM installed it gave a blinking amber light, again meaning that the PROM is executed.
Without the PCI riser installed it did this:

S4.jpg

Which is normal, as it does not detect the one-wire IC present on the riser (which holds the MAC address). What is not normal is that it doesn't print anything beyond that.

This can indicate that the system has a problem initializing or talking to some parts of the hardware (exact same behaviour with no Dallas IC installed or with the GBE not present on the board, and likely with other faults, but I haven't confirmed that yet). It would have been nice if the PROM had more debug prints; there was space for that, since the flash is about 20% empty. Another thing it can indicate is that the PROM is corrupted, even though it kind of works.

Inspected traces, components and so on, really could not spot anything that would indicate a fault. So I proceeded to remove the flash (same as board #2 and #3), rewrite the PROM, and install it back in. After doing this the MB fired up:

S5.jpg

It also boots into Irix with no issues.
 
Besides the dysfunctional behaviour of that Dallas chip, Atmel PROM data corruption is emerging as a prime suspect in the non-functioning Mobo category.

I've got a solid red Mobo here. I think the Atmel PROM is going to have to be my entry point for diagnosis.
 