Origin 300 cluster power cycle issues.

nondem

New member
Aug 5, 2020
16
2
3
Just got an L1 prompt and entered * pwr up
came back w/o saying anything:
001c02-L1>* pwr up
001c02-L1>


All of the bricks power lights are still off except the one.
Dunno if I should try to power them w/the buttons on them - and if so - at what point in the process?
 

nondem

New member
Aug 5, 2020
16
2
3
I'd like to take just a minute to say I appreciate the help...I'm really in the hotseat here.
 

nondem

New member
Aug 5, 2020
16
2
3
Just got an L1 prompt and entered * pwr up
came back w/o saying anything:
001c02-L1>* pwr up
001c02-L1>


All of the bricks power lights are still off.
Dunno if I should try to power them w/the buttons on them - and if so - at what point in the process?
Previously when it was improperly started it did at least say something at the * pwr up prompt.
 

nondem

New member
Aug 5, 2020
16
2
3
Here is the most recent output. I also tried a plain "pwr" command to get this status:


001c02-L1>* pwr up
001c02-L1>pwr u
001c02-L1>pwr
Supply          State  Voltage    Margin   Value
--------------  -----  ---------  -------  -----
1.8V            on     1.777V     normal   0
12V             <not present>
12V #2          on     12.000V    N/A
3.3V            NC     3.320V     normal   0
12V IO          NC     12.063V    N/A
5V AUX          NC     5.044V     N/A
3.3V AUX        NC     3.268V     N/A
PCI 5V AUX      NC     5.044V     N/A
PCI 3.3V        NC     3.320V     N/A
PCI 2.5V        on     2.509V     normal   0
PCI 5V          on     4.940V     normal   0
XIO 12V BIAS    <not present>
XIO 5V          <not present>
XIO 2.5V        <not present>
XIO 3.3V AUX    <not present>
IP59 3.3V AUX   NC     3.268V     N/A
IP59 5V AUX     NC     5.018V     N/A
IP59 12V        NC     11.938V    N/A
IP59 VCPU       on     1.283V     normal   11
IP59 SRAM       on     2.457V     normal   0
IP59 1.5V       on     1.480V     normal   0
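
For anyone who wants to compare `pwr` status across bricks, here is a minimal Python sketch that turns a captured copy of the table above into rows grouped by state. The function names `parse_pwr` and `group_by_state` are made up for illustration, the row layout is inferred only from the output pasted in this thread, and the thread does not say what the `NC` state means, so the code just reports it as-is.

```python
import re

# A few rows copied from the `pwr` output in this thread.
SAMPLE = """\
1.8V on 1.777V normal 0
12V <not present>
12V #2 on 12.000V N/A
PCI 5V AUX NC 5.044V N/A
XIO 5V <not present>
IP59 VCPU on 1.283V normal 11
"""

# Assumed layout: supply name (may contain spaces), state, voltage, margin,
# and an optional integer value. This is inferred from the paste, not from
# any L1 documentation.
ROW = re.compile(
    r"^(?P<supply>.+?)\s+(?P<state>\S+)\s+(?P<volts>\d+\.\d+)V"
    r"\s+(?P<margin>\S+)(?:\s+(?P<value>-?\d+))?$"
)
ABSENT = re.compile(r"^(?P<supply>.+?)\s+<not present>$")


def parse_pwr(text):
    """Return one dict per supply row; header and prompt lines are skipped."""
    rails = []
    for line in text.splitlines():
        line = line.strip()
        m = ABSENT.match(line)
        if m:
            rails.append({"supply": m["supply"], "state": "not present", "volts": None})
            continue
        m = ROW.match(line)
        if m:
            rails.append({
                "supply": m["supply"],
                "state": m["state"],
                "volts": float(m["volts"]),
            })
    return rails


def group_by_state(rails):
    """Map each state string to the list of supply names reporting it."""
    groups = {}
    for rail in rails:
        groups.setdefault(rail["state"], []).append(rail["supply"])
    return groups


if __name__ == "__main__":
    for state, supplies in group_by_state(parse_pwr(SAMPLE)).items():
        print(state, "->", ", ".join(supplies))
```

The parser tolerates either single spaces or aligned columns between fields, so it should work on output captured from a serial console log as well as on a copy from a web page.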
 

HAL

Administrator
Oct 22, 2019
33
24
8
You should be able to power up the other 3 bricks manually and, after say 2 minutes, restart the boot-brick (either from IRIX or the L1); then it should discover the 3 bricks automatically.
If I were you, I'd first check the thick NUMAlink cable from the boot-brick to the 1st compute-brick thoroughly, since if this connection is not up, none of the other bricks can be discovered.
 

nondem

New member
Aug 5, 2020
16
2
3
Ok - I followed the steps you mentioned...rebooted the one main brick after manually hitting the power on the other ones.
Looks like it may be fixed!!!!!

Here is the new hinv output:

vortex 1# hinv -vm
Location: /hw/module/001c02/node
IP59_4CPU Board: barcode NSG072 part 030-1989-003 rev -B
Location: /hw/module/001c02/IXbrick/xtalk/15
2U_INT_53 Board: barcode NST935 part 030-1809-006 rev -B
Location: /hw/module/001c02/IXbrick/xtalk/15/pci-x/0/1/ioc4
IO9 Board: barcode NSW748 part 030-1771-005 rev -B
Location: /hw/module/001c04/node
IP59_4CPU Board: barcode RBS343 part 030-1989-003 rev -C
Location: /hw/module/001c04/IXbrick/xtalk/15
2U_INT_53 Board: barcode RBH247 part 030-1809-006 rev -D
Location: /hw/module/001c08/node
IP59_4CPU Board: barcode RBT018 part 030-1989-003 rev -C
Location: /hw/module/001c08/IXbrick/xtalk/15
2U_INT_53 Board: barcode RBX752 part 030-1809-006 rev -D
Location: /hw/module/001c10/node
IP59_4CPU Board: barcode RBT012 part 030-1989-003 rev -C
Location: /hw/module/001c10/IXbrick/xtalk/15
2U_INT_53 Board: barcode RBX743 part 030-1809-006 rev -D
Location: /hw/module/001c10/IXbrick/xtalk/15/pci-x/0/1/ioc4
IO9 Board: barcode NJJ725 part 030-1771-005 rev -A
Location: /hw/module/001r06/router
ROUTER Board: barcode NCT816 part 030-1634-003 rev -A
16 1.0 GHZ IP35 Processors
CPU: MIPS R16000 Processor Chip Revision: 3.0
FPU: MIPS R16010 Floating Point Chip Revision: 3.0
CPU 0 at Module 001c02/Slot 0/Slice A: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 1 at Module 001c02/Slot 0/Slice B: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 2 at Module 001c02/Slot 0/Slice C: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 3 at Module 001c02/Slot 0/Slice D: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 4 at Module 001c04/Slot 0/Slice A: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 5 at Module 001c04/Slot 0/Slice B: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 6 at Module 001c04/Slot 0/Slice C: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 7 at Module 001c04/Slot 0/Slice D: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 8 at Module 001c08/Slot 0/Slice A: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 9 at Module 001c08/Slot 0/Slice B: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 10 at Module 001c08/Slot 0/Slice C: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 11 at Module 001c08/Slot 0/Slice D: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 12 at Module 001c10/Slot 0/Slice A: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 13 at Module 001c10/Slot 0/Slice B: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 14 at Module 001c10/Slot 0/Slice C: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
CPU 15 at Module 001c10/Slot 0/Slice D: 1.0 Ghz MIPS R16000 Processor Chip (enabled)
Processor revision: 3.0. Scache: Size 16 MB Speed 333 Mhz Tap 0x15
Main memory size: 16384 Mbytes
Instruction cache size: 32 Kbytes
Data cache size: 32 Kbytes
Secondary unified instruction/data cache size: 16 Mbytes
Memory at Module 001c02/Slot 0: 4096 MB (enabled)
Bank 0 contains 512 MB (Premium) DIMMS (enabled)
Bank 1 contains 512 MB (Premium) DIMMS (enabled)
Bank 2 contains 512 MB (Premium) DIMMS (enabled)
Bank 3 contains 512 MB (Premium) DIMMS (enabled)
Bank 4 contains 512 MB (Premium) DIMMS (enabled)
Bank 5 contains 512 MB (Premium) DIMMS (enabled)
Bank 6 contains 512 MB (Premium) DIMMS (enabled)
Bank 7 contains 512 MB (Premium) DIMMS (enabled)
Memory at Module 001c04/Slot 0: 4096 MB (enabled)
Bank 0 contains 512 MB (Standard) DIMMS (enabled)
Bank 1 contains 512 MB (Standard) DIMMS (enabled)
Bank 2 contains 512 MB (Standard) DIMMS (enabled)
Bank 3 contains 512 MB (Standard) DIMMS (enabled)
Bank 4 contains 512 MB (Standard) DIMMS (enabled)
Bank 5 contains 512 MB (Standard) DIMMS (enabled)
Bank 6 contains 512 MB (Standard) DIMMS (enabled)
Bank 7 contains 512 MB (Standard) DIMMS (enabled)
Memory at Module 001c08/Slot 0: 4096 MB (enabled)
Bank 0 contains 512 MB (Standard) DIMMS (enabled)
Bank 1 contains 512 MB (Standard) DIMMS (enabled)
Bank 2 contains 512 MB (Standard) DIMMS (enabled)
Bank 3 contains 512 MB (Standard) DIMMS (enabled)
Bank 4 contains 512 MB (Standard) DIMMS (enabled)
Bank 5 contains 512 MB (Standard) DIMMS (enabled)
Bank 6 contains 512 MB (Standard) DIMMS (enabled)
Bank 7 contains 512 MB (Standard) DIMMS (enabled)
Memory at Module 001c10/Slot 0: 4096 MB (enabled)
Bank 0 contains 512 MB (Standard) DIMMS (enabled)
Bank 1 contains 512 MB (Standard) DIMMS (enabled)
Bank 2 contains 512 MB (Standard) DIMMS (enabled)
Bank 3 contains 512 MB (Standard) DIMMS (enabled)
Bank 4 contains 512 MB (Standard) DIMMS (enabled)
Bank 5 contains 512 MB (Standard) DIMMS (enabled)
Bank 6 contains 512 MB (Standard) DIMMS (enabled)
Bank 7 contains 512 MB (Standard) DIMMS (enabled)
ROUTER in Module 001c02/Slot 0: Revision 1: Active Ports [1,6,7,8] (enabled)
Integral SCSI controller 3: Version IDE (ATA/ATAPI) IOC4
Integral SCSI controller 2: Version IDE (ATA/ATAPI) IOC4
CDROM: unit 0 on SCSI controller 2
Integral SCSI controller 4: Version QL12160, low voltage differential
Disk drive: unit 1 on SCSI controller 4 (unit 1)
Integral SCSI controller 5: Version QL12160, low voltage differential
Integral SCSI controller 0: Version QL12160, low voltage differential
Disk drive: unit 1 on SCSI controller 0 (unit 1)
Integral SCSI controller 1: Version QL12160, low voltage differential
IOC3/IOC4 serial port: tty7
IOC3/IOC4 serial port: tty8
IOC3/IOC4 serial port: tty9
IOC3/IOC4 serial port: tty10
IOC3/IOC4 serial port: tty3
IOC3/IOC4 serial port: tty4
IOC3/IOC4 serial port: tty5
IOC3/IOC4 serial port: tty6
Gigabit Ethernet: tg1, module 001c10, PCI bus 1 slot 4
Integral Gigabit Ethernet: tg0, module 001c02, PCI bus 1 slot 4
PCI Adapter ID (vendor 0x10a9, device 0x100a) PCI slot 1
PCI Adapter ID (vendor 0x1077, device 0x1216) PCI slot 3
PCI Adapter ID (vendor 0x14e4, device 0x1645) PCI slot 4
PCI Adapter ID (vendor 0x10a9, device 0x100a) PCI slot 1
PCI Adapter ID (vendor 0x1077, device 0x1216) PCI slot 3
PCI Adapter ID (vendor 0x14e4, device 0x1645) PCI slot 4
IOC4 firmware revision 83
IOC4 firmware revision 83
IOC3/IOC4 external interrupts: 2
IOC3/IOC4 external interrupts: 1
HUB in Module 001c02/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)
HUB in Module 001c04/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)
HUB in Module 001c08/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)
HUB in Module 001c10/Slot 0: Revision 2 Speed 200.00 Mhz (enabled)
IP35prom in Module 001c02/Slot n0: Revision 6.210
IP35prom in Module 001c04/Slot n0: Revision 6.210
IP35prom in Module 001c08/Slot n0: Revision 6.210
IP35prom in Module 001c10/Slot n0: Revision 6.210
vortex 2#
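
A quick way to confirm that every brick was discovered after a restart is to count CPUs and memory per module in a saved copy of the `hinv -vm` output. The Python sketch below does that. The names `summarize_hinv` and `missing_bricks` are hypothetical helpers, and the patterns are based only on the line formats visible in the output above, so treat it as a starting point rather than a complete `hinv` parser.

```python
import re
from collections import defaultdict

# Line formats assumed from the `hinv -vm` output pasted in this thread.
CPU = re.compile(r"^CPU (\d+) at Module (\w+)/Slot")
MEM = re.compile(r"^Memory at Module (\w+)/Slot \d+: (\d+) MB")
TOTAL = re.compile(r"^Main memory size: (\d+) Mbytes")


def summarize_hinv(text):
    """Summarize CPUs and memory per module from hinv -vm text."""
    cpu_ids = defaultdict(set)   # module -> set of CPU numbers
    memory_mb = {}               # module -> MB reported for that node
    reported_total = None
    for line in text.splitlines():
        line = line.strip()
        m = CPU.match(line)
        if m:
            # A set ignores repeated lines, e.g. from a pager redraw.
            cpu_ids[m.group(2)].add(int(m.group(1)))
            continue
        m = MEM.match(line)
        if m:
            memory_mb[m.group(1)] = int(m.group(2))
            continue
        m = TOTAL.match(line)
        if m:
            reported_total = int(m.group(1))
    return {
        "cpus": {module: len(ids) for module, ids in cpu_ids.items()},
        "memory_mb": memory_mb,
        "reported_total_mb": reported_total,
    }


def missing_bricks(summary, expected_modules):
    """Return expected module IDs that reported no CPUs, sorted."""
    return sorted(set(expected_modules) - set(summary["cpus"]))
```

For the system in this thread, one would pass the saved output to `summarize_hinv` and then call `missing_bricks(summary, ["001c02", "001c04", "001c08", "001c10"])`. An empty list means all four compute bricks show up, and the per-module memory figures can be summed and compared with `reported_total_mb`.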
 

nondem

New member
Aug 5, 2020
16
2
3
The weather model runs at midnight and can't be run manually for several reasons, so I won't know for sure till tomorrow AM...BUT
afaict it looks fixed. If you guys can think of anything else I should do please let me know.

AND THANKS IN A HUGE WAY! You guys may have saved my bacon.
 

CiaoTime

Public Enemy Number One
Jan 15, 2020
45
27
18
Vancouver, BC, Canada
Looks like you're up and running! Looks great. And if that system ever gets decommissioned, do try to hold onto it! That thing's a beauty, and it wouldn't be difficult to add a graphics option to it if you'd ever wanted one of the fastest IRIX desktops left in the world to play with.

Sixteen 1 GHz processors. Wow. And here I thought I had a hot machine with only eight 800s!
 

Unxmaal

Administrator
Feb 8, 2019
98
60
18
I'm calling dibs on one of those 4x1GHz nodes.

Also @nondem take a look at the packages we have available for sgug-rse. There's a lot of very fresh ports for IRIX available -- new gcc, new python, so forth -- that can make your life as a sysadmin easier.
 

nondem

New member
Aug 5, 2020
16
2
3
So, I'd like to take a minute to defend my obvious ignorance here :)

I became responsible for this box about a year and a half ago. There is no backup/failover system, no docs beyond the ones for the original system from SGI, and no one to ask anything till I found you guys.
The box is used for smoke plume prediction and runs the model every night to base the next day's plumes on, as I mentioned earlier. Burn permits for various purposes are approved based on the predicted plume that this box returns from the weather model it creates every night.
It runs an old "MM5/Hysplit" model on the SGI system and it needs to be retired, obviously. We are moving to the newer "WRF" model, which has source code that is supported under Linux.
I couldn't dig into the box because I didn't have any docs from the previous admins to work with and couldn't risk causing it to go down even for a minute. It is like trying to work on an ambulance while it is driving down the road w/someone heading to the ER. I just left it alone and put my effort into a replacement.
About a year ago, I bought two identical Dell x86 boxes using the recommendations of people experienced with the new WRF model. Lots of processor cores and memory. The second one will be a failover when we get it in prod.
Our meteorologist built the model on one of the new boxes and we put it in test shortly after. Months have gone by in testing with dozens of other hot issues, but it's very close to being ready. But not quite...I tried moving it to prod today since the plume wasn't working anyway, and that uncovered a couple of problems in the windoze desktop app that renders the plumes. So I guess that is a couple of good things that came out of this crisis.

This is why I'm the sysadmin for a box I have no business being admin on.

I got handed two Sun Sparc systems at the same time as this one and have already retired them...this is the last thorn in my side.
 

massiverobot

irix detailer
Feb 8, 2019
121
108
43
Philly
nondem said: "I got handed two Sun Sparc systems at the same time as this one and have already retired them...this is the last thorn in my side."
Contact me about removing those systems from your building when you are ready. I only charge $1000 for SGI hardware removal! I will beat any other scrap dealer's offer.

Good luck...
 

weblacky

Active member
Jan 13, 2020
181
45
28
Seattle, WA
Yeah, seriously. SGI collectors are a dedicated bunch. I only collect workstations (not big iron like these), but if you know of any other stations or businesses using these that are "retiring" them, please let us know. Many companies are either trashing these great systems or parting them out, hoping to screw over other businesses needing last-minute parts. I'm missing several SGI workstation models, so I'm still on the prowl. I've been collecting SGIs since 1998! I still haven't collected them all (3 remaining). So please let groups know if stuff is available, to keep these machines out of the hands of recyclers who ruin them.
 
