OpenBSD/sgi

johnnym

Member
Aug 7, 2022
39
31
18
New lead

As I already wrote, the other (non-IP28) systems I have available for testing are unaffected by the bad commits, so there must be some difference between them and IP28. This is especially irritating because the Indy and the Indigo² are actually not that different. Comparing the kernel configuration files for IP22 and IP28, I found the difference: like all my other systems except IP28, the IP22 kernel uses a "clock0" for the clock interrupts. (The "and scheduling clock" in the "int0" line might be a remnant of an older code state, or it might reflect that things are different for IP20, which is also handled by the IP22 kernel; see further below.) The IP22 configuration:

Code:
[...]
#
# Definition of system
#
mainbus0 at root
cpu* at mainbus0
clock0 at mainbus0 # scheduling clock on Indy

int0 at mainbus0 # Interrupt Controller and scheduling clock
imc0 at mainbus0 # Memory Controller
gio0 at imc0
eisa0 at imc0
[...]
from https://github.com/the-machine-hall/open...NERIC-IP22

...but the R10000 Indigo² does not:

Code:
[...]
#
# Definition of system
#
mainbus0 at root
cpu* at mainbus0

int0 at mainbus0 # Interrupt Controller and scheduling clock
imc0 at mainbus0 # Memory Controller
gio0 at imc0
eisa0 at imc0
[...]
from https://github.com/the-machine-hall/open...NERIC-IP28

...and instead uses "int0" for that. Originally both the Indy and the R10000 Indigo² used a "clock0", but this was changed with:

Code:
commit 64ac1c5a7e13fbe4130b9b53f956c4ebff13c665
Author: miod <miod@openbsd.org>
Date: Sat Jul 14 19:53:27 2012 +0000

A known errata of R4000 and R4400 processors, is that reading the internal
counter register close to a trigger of the counter interrupt, may cause the
interrupt not to be generated. This makes it a bad idea to use the internal
counter both for the scheduling clock and for delay().

Therefore, on IP22 systems (and IP28 because it makes my life easier), use
one of the two 8254 timers connected to the onboard interrupt controller as
the scheduling clock source.

Adapted from NetBSD.
from https://github.com/the-machine-hall/open...ebff13c665

...but soon after changed back for the Indy with:

Code:
commit 833ab59f79f5195f7dcd0b5b888b8d2f3335eac5
Author: miod <miod@openbsd.org>
Date: Wed Jul 18 19:56:02 2012 +0000

According to Linux, and just verified the hard way, the 8254 timer does not
interrupt on Indy; do not use it on such systems. Then, bring back a clock0 at
mainbus attachment to IP22 kernels, and attach it late in the autoconf process
if no other device has claimed the clock yet.

This means R4000 and R4400 based Indy may experience the lost clock interrupt
processor errata again, until a better way to skirt it is found.
from https://github.com/the-machine-hall/open...2f3335eac5

Hence I will "bring back a clock0" to IP28 as well and adapt the code as needed, because I am unsure how to fix the 8254-related code so that it works correctly with the two commits from my last post. Afterwards only IP20 systems will use those timers instead of a "clock0", but as my R4000 Indigo is incomplete I can't test that anyway.
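
The attachment order described in the second commit message ("attach it late in the autoconf process if no other device has claimed the clock yet") can be pictured with a small model. This is only a sketch under that reading of the commit message; the names clock_owner, claim_clock(), int_attach_model(), clock0_attach_model() and autoconf_model() are hypothetical and not taken from the OpenBSD sources.

```c
/* Hypothetical model: which device provides the scheduling clock. */
enum clock_owner { CLOCK_UNCLAIMED, CLOCK_INT0, CLOCK_CLOCK0 };

static enum clock_owner owner = CLOCK_UNCLAIMED;

/* Claim the scheduling clock unless a device already did; 1 on success. */
int
claim_clock(enum clock_owner dev)
{
	if (owner != CLOCK_UNCLAIMED)
		return 0;
	owner = dev;
	return 1;
}

/* int0 attaches early and claims the clock only if its 8254 timer is usable. */
void
int_attach_model(int timer_8254_usable)
{
	if (timer_8254_usable)
		claim_clock(CLOCK_INT0);
}

/* clock0 attaches late and only takes over if nobody claimed the clock. */
void
clock0_attach_model(void)
{
	claim_clock(CLOCK_CLOCK0);
}

/* Run one simulated autoconf pass and report who ended up with the clock. */
enum clock_owner
autoconf_model(int timer_8254_usable)
{
	owner = CLOCK_UNCLAIMED;
	int_attach_model(timer_8254_usable);
	clock0_attach_model();
	return owner;
}
```

In this model a system whose 8254 timer is usable keeps "int0" as clock source, while a system where it is not usable (the Indy case from the commit message) ends up with "clock0".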
 

johnnym

IP28 kernel fixed for OpenBSD/sgi 7.3

So bringing back a "clock0" to IP28 did indeed work around the breakage introduced by the two commits mentioned earlier. Together with the workaround for the IP28 problem in OpenBSD/sgi 7.2, things are working again for IP28 in 7.3; see this log from yesterday. My patch is simple:

Diff:
diff --git a/sys/arch/sgi/conf/GENERIC-IP28 b/sys/arch/sgi/conf/GENERIC-IP28
index 9918a08414c..afcae927626 100644
--- a/sys/arch/sgi/conf/GENERIC-IP28
+++ b/sys/arch/sgi/conf/GENERIC-IP28
@@ -37,6 +37,7 @@ config        bsd    swap generic
 #
 mainbus0    at root
 cpu*        at mainbus0
+clock0        at mainbus0
 
 int0        at mainbus0    # Interrupt Controller and scheduling clock
 imc0        at mainbus0    # Memory Controller
diff --git a/sys/arch/sgi/conf/RAMDISK-IP28 b/sys/arch/sgi/conf/RAMDISK-IP28
index e07ea14fbe7..389b0d3655d 100644
--- a/sys/arch/sgi/conf/RAMDISK-IP28
+++ b/sys/arch/sgi/conf/RAMDISK-IP28
@@ -31,6 +31,7 @@ config        bsd root on rd0a swap on rd0b
 
 mainbus0    at root
 cpu*        at mainbus0
+clock0        at mainbus0
 
 int0        at mainbus0        # Interrupt Controller and scheduling clock
 imc0        at mainbus0        # Memory Controller
diff --git a/sys/arch/sgi/localbus/int.c b/sys/arch/sgi/localbus/int.c
index c76df00762d..09c06291ce5 100644
--- a/sys/arch/sgi/localbus/int.c
+++ b/sys/arch/sgi/localbus/int.c
@@ -375,8 +375,7 @@ int2_attach(struct device *parent, struct device *self, void *aux)
     /*
      * The 8254 timer does not interrupt on (some?) IP24 systems.
      */
-    if (sys_config.system_type == SGI_IP20 ||
-        sys_config.system_subtype == IP22_INDIGO2)
+    if (sys_config.system_type == SGI_IP20)
         int_8254_cal();
 }
...and the condition of the if clause could be refined a little: "IP22_INDIGO2" actually covers IP22 (R4000, R4400, R4600 Indigo²), IP26 (R8000 Indigo²) and IP28 (R10000 Indigo²), so removing it from the condition also affects IP22 and IP26. But as I only rebuilt the IP28 kernel and the patch is not yet included in any branch, it doesn't really matter.
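
One possible refinement of that condition, as a sketch only: keep the 8254 timer for IP20 and for every Indigo² except the R10000 one. The enum values below are placeholders, and the sketch assumes that an IP28 reports a system type of SGI_IP28 together with the IP22_INDIGO2 subtype; use_8254_timer() is a hypothetical helper, not a function from int.c.

```c
#include <stdbool.h>

/* Placeholder values for illustration; the real ones live in kernel headers. */
enum system_type { SGI_IP20, SGI_IP22, SGI_IP26, SGI_IP28 };
enum system_subtype { IP22_INDY, IP22_INDIGO2 };

/*
 * Decide whether the 8254 timer should be calibrated and used as the
 * scheduling clock (hypothetical helper, assumptions as stated above).
 */
bool
use_8254_timer(enum system_type type, enum system_subtype subtype)
{
	if (type == SGI_IP20)
		return true;
	/* Indigo2 keeps the 8254, except IP28, which falls back to clock0. */
	return subtype == IP22_INDIGO2 && type != SGI_IP28;
}
```

With such a check IP22 and IP26 Indigo² systems would keep their current behaviour and only IP28 would change, which is untested here just like the patch above is untested on anything but IP28.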

****

So with the kernel fixed now for IP28, I consider the OpenBSD/sgi 7.3 release complete. The next release is still half a year away or so, so plenty of time for other things now...
 

stormy

Active member
Jun 23, 2019
133
55
28
@johnnym Is it possible to run a GUI desktop environment? Is there any hardware acceleration for mardigras/odyssey graphics? thx
 

johnnym

stormy said: "Is it possible to run a GUI desktop environment?"
According to https://www.openbsd.org/sgi.html, only on the O2 (IP32). I didn't build it for OpenBSD/sgi 7.0 and have never tried it either.

stormy said: "Is there any hardware acceleration for mardigras/odyssey graphics?"
Not that I know of, but glass console support is available, again according to https://www.openbsd.org/sgi.html, except for "the IP27 Kona frame buffer".

Edited because the post was sent before it was complete.

johnnym

Had some time to also try other machines/configurations not yet tested with OpenBSD/sgi 7.0 to 7.3, namely:

  • R4600 Indy
  • dual-node R10000 Origin200

Both worked (Indy: tested booting and operation with 7z and openssl; Origin200: only tested booting with the mentioned versions). On the Origin200 the 7.2 kernel trapped once when booting, but two subsequent boots went through w/o an issue.

The R4600 Indy seems to be unaffected by the issue I mentioned in https://forums.sgi.sh/index.php?threads/openbsd-sgi.928/#post-5939 (and later confirmed as already present in 6.9), i.e. it can run 7z w/o making the kernel panic. The only thing making it unresponsive is swapping over NFS. :p

I uploaded the boot logs to dmesgd.nycbug.org and they are also linked from the respective release pages on GitHub.

The astute reader will not only notice that OpenBSD/sgi can access the hardware of the second node but also the information hidden in this portion:

Code:
origin200-2# sysctl hw
hw.machine=sgi
hw.model=IP27
hw.ncpu=2
hw.byteorder=4321
hw.pagesize=16384
hw.disknames=
hw.diskcount=0
hw.cpuspeed=180
hw.vendor=SGI
hw.product=Origin 200
hw.physmem=1207959552
hw.usermem=1207926784
hw.ncpufound=4
hw.allowpowerdown=1
hw.ncpuonline=2
hw.power=1
So it found four CPUs but only uses the two of the master node. :cool: Looking around in the sgi code I came across these two parts here in sys/arch/sgi/sgi/ip27_machdep.c:

C:
int
ip27_kl_launch_cpu(klinfo_t *comp, void *arg)
{
[...]
    /* XXX Skip CPUs on other nodes. */
    if (comp->nasid != ci->ci_nasid)
        return 0;
[...]
}
from https://github.com/the-machine-hall/open...066..L1089

C:
int
ip27_kl_attach_cpu(klinfo_t *comp, void *arg)
{
[...]
    /* XXX Skip CPUs on other nodes. */
    if (comp->nasid != ci->ci_nasid)
        return 0;
[...]
}
from https://github.com/the-machine-hall/open...098..L1135
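
To illustrate how such a check leads to "four found, two online", here is a simplified model. It is a sketch, not the kernel code: struct cpu_entry_model, count_cpus() and the nasid values 0 and 1 are assumptions made for illustration, and how the kernel actually arrives at hw.ncpufound is not visible in the excerpts above.

```c
#include <stddef.h>

/* Simplified stand-in for a CPU entry; only the node id (nasid) matters. */
struct cpu_entry_model {
	int nasid;
};

struct cpu_counts {
	int found;	/* every CPU seen, comparable to hw.ncpufound */
	int online;	/* CPUs actually started, comparable to hw.ncpuonline */
};

/* Walk all CPU entries and optionally skip those on other nodes. */
struct cpu_counts
count_cpus(const struct cpu_entry_model *cpus, size_t n, int boot_nasid,
    int skip_other_nodes)
{
	struct cpu_counts c = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		c.found++;
		/* Models the "XXX Skip CPUs on other nodes" check. */
		if (skip_other_nodes && cpus[i].nasid != boot_nasid)
			continue;
		c.online++;
	}
	return c;
}

/* Dual-node Origin200 with two CPUs per node (nasid values assumed). */
const struct cpu_entry_model origin200_cpus[] = { {0}, {0}, {1}, {1} };
```

With the skip enabled the model yields 4 found and 2 online, matching the sysctl output; with it disabled all four CPUs would be started. Whether the real hardware and the rest of the kernel cope with that is a different question.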

Unfortunately I wasn't equipped with the required hardware to produce a kernel when I tested the machine yesterday. I wonder what will happen with these two parts disabled - Here be dragons! :D My Octane is already compiling an IP27.MP kernel modified in that regard as I write. I can't wait to test it on that Origin200, maybe on the coming weekend. We'll see.
 

johnnym

Didn't have time to test during the weekend, but will have now. After re-reading the IP27-related code portions and comparing them with what Linux has included for IP27* in the meantime, I'm no longer so sure it will work out of the box, if at all.

*) Yes, you heard right, Linux had working support for IP27 in the past. There's still a dmesg log available showing a 128-processor Origin2000 booting.

For example, setting the interrupt masks with ip27_hub_setintrmask() seems to operate on two processors only. It always selects the current processor and sets the respective interrupt mask based on that, so it could theoretically work for nodes other than the master node, too, but I'm not sure about that.

What looks like the equivalent function for Linux - setup_hub_mask() - incorporates the "nasid" (I assume each processor board in an Origin2000 has its own nasid and each node of an Origin200 has its own nasid - am I correct with that?) which it determines from the current processor but otherwise looks like it does pretty much the same as OpenBSD's version.

So actually the interrupt masks don't differ between nodes but only between the two processors on each node, and are then the same for every node. As Linux's version did work for multi-node systems (see the dmesg above) and OpenBSD's version looks like it would do the same, it might actually work when not skipping non-master node processors on an Origin200. But if it works, why did Miod never activate it? I know he had or still has a dual-node Origin200, so he could have tested it. Well, I'll find out soon enough...
 

johnnym

I did indeed test my modifications on a dual-node Origin200 back in May. They really made a difference this time, though the result was not what I had hoped for: the machine crashes when it starts to activate the hardware of the second node. As this works with unmodified sources, it looks like even the CPUs for the second node are not activated before the kernel has activated all the hardware of the first node. So it doesn't work yet, but I know I was tinkering at the "right" spot.

****

A general sorry for no news since May, but I got heavily involved in trying to rescue another architecture but this time in the Linux world, leaving me not much time for OpenBSD/sgi. In the end we didn't prevail and ia64 got removed from Linux with 6.7-rc1 despite our efforts, but we're trying to maintain it outside of the kernel in the hope that it will make it back in in the future. You can read part of the story on LWN and El Reg, but make sure to also read the primary sources. ;)

Despite all that I did manage to do something for OpenBSD/sgi, too:
  • This includes the 7.4 release (currently only the kernels, but you already know that you can use them with octeon file systems). As you can check in the dmesg/boot logs I posted on NYC*BUG's dmesgd these work well on all my compatible machines. This time no issue with the R10000 Indigo². :D I could also get @chulofiasco to test out OpenBSD/sgi on an Octane which seemed to have worked out pretty well.


For the future I plan to also build the release files for 7.3 and 7.4, and there's also the next release due sometime during the first half of 2024.

Happy holidays and have some fun in 2024 - maybe with OpenBSD/sgi. :D
 