SGI Fuel does not power up

jvakon

New member
Sep 2, 2020
7
0
1
Hi,

I recently salvaged an SGI Fuel that was about to be disposed of. The system does not power up, and I get a whole bunch of ALERT messages when connected to the L1 port on the system board:

ALERT: NODE EEPROM read error, no acknowledge
ALERT: Error initializing the NODE 0 monitor, no acknowledge
ALERT: Error initializing the NODE 1 monitor, no acknowledge
ALERT: Error initializing the NODE 2 monitor, no acknowledge
ALERT: Error initializing the PIMM monitor, no acknowledge
ALERT: Error initializing the ODYSSEY monitor, no acknowledge
ALERT: Error initializing the BEDROCK monitor, no acknowledge

and so on, complaining about voltages, fan speeds, temperature sensors etc. Then, after many lines:

****************************************
controller firmware panic! resetting...
****************************************

IMAGE B: Rev. 1.30.16
[thread ID 300013cc stack]
TR: ffecb6e4 ffec85dc ffedc4c4 ffedc710 ffedc87e ffedaaa0 ffedb41a
TR: ffedb604 ffeb8918 ffeb8a5e ffec41ec ffe805cc ffeb1728 00000000

(if you see this, please email ssh@sgi.com and include
the output from the 'log' command and a description of
what caused the problem)
ALERT: Running Flash image A because image B failed during boot and is probably corrupt.

The same ALERT messages then repeat in semi-random order for flash image A. It does not drop to a prompt at any point.

Does anyone know what might be causing this? I think it's unlikely both flash images are corrupted. Is it a PSU problem?

Thanks for any hints!

John
 

Elf

Storybook
Feb 4, 2019
338
77
28
I am not sure whether a PSU issue would be my first guess, but, at least the power supply is easy to test? If you have a multimeter, try measuring the output of the standby power rail while the L1 controller is booting.

Measure it in a DC mode to get an idea of the level, and also in an AC mode for a very rough estimation of ripple.

Also, welcome to the user group :)
 

jvakon

Hi Elf,

Thanks for the suggestion. I measured all lines of the PSU and, with the P1 molex disconnected from the main board, I found two lines at +5.15V. With the molex in place on the main board, these lines read +4.72V. No significant ripple in either case. Is it a case of a slight undervolt?

Thanks again.
 

jvakon

OK, so I remeasured the voltage of the purple line, which should be 5VSB from what pinouts I can find, and it is +5.15V with the L1 booting. So, probably not an undervolt after all!

John
 

Elf

Hm, I would guess something has gone wrong with the L1 controller or its connection to the rest of the system. Might be worth trying to disassemble things and see if there are any loose connectors or things that need cleaning?

Unfortunately the troubleshooting stage is pretty wide open from here...
 

jvakon

A further update on this misbehaving Fuel. I managed to get to the L1 prompt (actually, the prompt was always there but was being overwritten by ALERT messages) so now I can get some more info on the system. env produces the following output for the power supply:

Description    State      Warning Limits    Fault Limits      Current
-------------- ---------- ----------------- ----------------- -------
12V            Wait Pwr   10% 10.80/ 13.20  20%  9.60/ 14.40     0.00
12V IO         Wait Pwr   10% 10.80/ 13.20  20%  9.60/ 14.40     0.00
5V             Wait Pwr   10%  4.50/  5.50  20%  4.00/  6.00     0.00
3.3V           Wait Pwr   10%  2.97/  3.63  20%  2.64/  3.96     0.55
2.5V           Wait Pwr   10%  2.25/  2.75  20%  2.00/  3.00     0.05
1.5V           Wait Pwr   10%  1.35/  1.65  20%  1.20/  1.80     0.23
5V AUX         Wait Pwr   10%  4.50/  5.50  20%  4.00/  6.00     3.95
3.3V AUX       Wait Pwr   10%  2.97/  3.63  20%  2.64/  3.96     0.00
PIMM 12V BIAS  Wait Pwr   10% 10.80/ 13.20  20%  9.60/ 14.40     0.00
SRAM           Wait Pwr   10%  2.25/  2.75  20%  2.00/  3.00     0.00
VCPU           Wait Pwr   10%  1.44/  1.76  20%  1.28/  1.92     0.00
PIMM 1.5V      Wait Pwr   10%  1.35/  1.65  20%  1.20/  1.80     0.00
PIMM 3.3V AUX  Wait Pwr   10%  2.97/  3.63  20%  2.64/  3.96     0.00
PIMM 5V AUX    Wait Pwr   10%  4.50/  5.50  20%  4.00/  6.00     0.00

5V AUX is reported as 3.95V, and there are no other live lines. Can anyone comment on whether this may cause the ALERT messages (and lack of power-up) in this Fuel?
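As a rough aid for reading the env table, here is a minimal Python sketch that checks a reading against the warning and fault limits printed for the 5V rails (4.50/5.50 and 4.00/6.00). The function name and the classification labels are illustrative assumptions; this is not how the L1 firmware itself decides anything. It only makes explicit that a reported 3.95V sits just below the 4.00V fault floor.

```python
# Limits copied from the 5V AUX row of the env output (volts).
WARN_LIMITS = (4.50, 5.50)   # 10% band
FAULT_LIMITS = (4.00, 6.00)  # 20% band


def classify(reading, warn=WARN_LIMITS, fault=FAULT_LIMITS):
    """Label a reading as 'fault', 'warning' or 'ok' (hypothetical helper)."""
    if not fault[0] <= reading <= fault[1]:
        return "fault"
    if not warn[0] <= reading <= warn[1]:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # 3.95 is the value env reports; 5.15 is the multimeter reading on 5VSB.
    for label, volts in (("env 5V AUX", 3.95), ("multimeter 5VSB", 5.15)):
        print(f"{label}: {volts:.2f} V -> {classify(volts)}")
```

The two readings land in different bands, which is the discrepancy the rest of the thread tries to explain.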

Thanks!
John
 

indigofan

New member
Jun 8, 2020
6
1
3
I had an issue with mine. I reseated the fans and voila, it worked. Not sure if it will solve your issues, but maybe worth a try...?
 

jvakon

Thanks for the suggestion, indigofan. No joy on reseating the fan connections, I'm afraid. I also disconnected the HDD and CD-ROM, and even took out the graphics module. No change :(
 

Elf

I would think 5V AUX being that low could cause instability in the L1 controller. It might be worth trying a PSU replacement!

I'd also try testing the PSU's 5V AUX line with it out of the case, under a bit of load (e.g. resistive load). That might let you know whether something is dragging the PSU's standby power low because it is almost shorting it out, or whether it is the PSU's fault. I would suspect the PSU first, but always good to check.
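For anyone sizing a resistive load for that test, Ohm's law gives the resistor value and the power it has to dissipate. The sketch below is only an illustration: the 0.5 A and 1 A load currents are assumed example values, not the rated standby current of the Fuel's PSU, which is not stated in this thread (check the PSU label before picking a load).

```python
def load_resistor(voltage, current):
    """Return (resistance in ohms, dissipated power in watts) for a test load.

    Hypothetical helper: R = V / I and P = V * I.
    """
    if current <= 0:
        raise ValueError("load current must be positive")
    return voltage / current, voltage * current


if __name__ == "__main__":
    # Assumed example currents; the real standby rating is on the PSU label.
    for amps in (0.5, 1.0):
        ohms, watts = load_resistor(5.0, amps)
        print(f"{amps:.1f} A load on 5 V: {ohms:.1f} ohm, {watts:.1f} W")
```

Pick a resistor with a power rating comfortably above the computed figure, since it will get hot during the test.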
 

jvakon

OK, I did a bit more investigation with the help of a friend more gifted in the electronics department...

When unplugged from the motherboard, the PSU's 5V AUX line gives 5.15V under no load and 5.05V under resistive load. We also measured the voltage again while the PSU was plugged in and the L1 was working (by sticking wires into the top of the connector), and that came to 5V too. So we're fairly sure the PSU 5V AUX line works as it should, and that no load on the motherboard causes it to drop. We also tested the short breaker of the PSU, and that works fine.

Despite this, the env command still reports 3.7-3.9V on the 5V AUX line. I'm at a loss as to what might be happening electrically!
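To put the numbers above side by side, here is a small Python sketch comparing the PSU's sag under load with the gap between the measured voltage and what env reports. The helper name is made up for illustration, and 3.9V is simply the best-case end of the 3.7-3.9V range quoted above.

```python
def droop(v_no_load, v_loaded):
    """Return (volts dropped, drop as a percentage of the no-load voltage)."""
    delta = v_no_load - v_loaded
    return delta, 100.0 * delta / v_no_load


# Multimeter readings from the bench test.
sag_v, sag_pct = droop(5.15, 5.05)

# Difference between the loaded measurement and the highest env reading.
gap_v = 5.05 - 3.9

if __name__ == "__main__":
    print(f"PSU sag under load: {sag_v:.2f} V ({sag_pct:.1f}%)")
    print(f"Gap between multimeter and env reading: {gap_v:.2f} V")
```

The sag is roughly a tenth of the gap, so ordinary load droop on the PSU side does not account for the low env figure; that points towards the measurement path on the board rather than the supply.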
 

Elf

Hm, there's a Dallas DS1780 that is used to monitor the voltage rails, and it sometimes fails. That could cause the low voltage reading. I'm not sure it would cause the L1 controller to panic, though I suppose anything is possible when an I2C device sends back data that may not be expected :)

You might try to locate the DS1780, see where standby power is fed to it, and see if that still reads as 5V. You can also try turning the environmental monitoring off (env off, I believe) and see if it boots, though it may not fix the issue; it's hard to say whether the L1 controller still tries to talk to the DS1780 when environmental monitoring is off, without scoping it out.

For reference: https://gainos.org/~elf/sgi/nekonomicon/forum/3/16206/1.html
 

jvakon

Hi Elf,

Thanks for linking to the old post. Doesn't look like a quick fix, so this will probably go on the back burner. env off does not fix things, BTW.

Thanks again!

John
 

Elf

Probably not a quick fix! Unfortunately that is how things go with these machines. Good luck though :)
 
