Some people on Discord were surprised at how "poorly" the SGIs of that era did. I clarified some of the reasons there, but figured I'd call it out here as well.
First: note the resolution. This is a game that came out in 1999. Most PCs of the time struggled to play the game at 20fps, and those that did were typically playing at 640x480. This is important to note, as the FPS numbers above are all taken at 1280x1024, over 4x the pixels. Today anything under 100fps is frowned upon, but in the late 90s, if your game ran at >15fps it was called "smooth" - my how times have changed!
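To put a number on that resolution gap, here's a quick back-of-the-envelope check:

```python
# Pixel counts at the two resolutions mentioned above.
benchmark_res = 1280 * 1024   # resolution the FPS numbers were taken at
typical_res = 640 * 480       # what most PCs of the era actually ran

ratio = benchmark_res / typical_res
print(f"{benchmark_res} vs {typical_res} pixels: {ratio:.2f}x")
# 1310720 vs 307200 pixels: 4.27x
```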
Second: lightmapping puts SGIs at a major disadvantage relative to every graphics card from the Riva TNT (released in 1998) onward. None of the SGI graphics architectures supported multitexturing. This means the SGI has to do a multiple of the work that a PC card does for a similar effect.
This is super simplified, but...
For a graphics card that supports multitexturing, rendering a poly after the transform phase involves sampling the albedo/base texture, sampling the lightmap texture, blending those values, and drawing to the screen. We'll assume bilinear filtering here, as I don't have a machine that runs Q3A up right now and can't remember if it does trilinear at max settings. That means 1 z-buffer read, 4 samples (reads) from the base texture, 4 samples from the lightmap texture, some math, a single write to the framebuffer, and a write to the z-buffer.
For a card that DOESN'T support multitexturing, you get a second pass over the entire scene. This means twice as much geometry work, as hardware wasn't caching transform data in those days. But ignoring the geometry load, let's focus on fillrate / pixel-pushing issues. The first pass is 1 z-buffer read, 4 samples from the base texture, 1 z-buffer write, and 1 framebuffer write. Then for the second (lightmap) pass, you have 1 z-buffer read, 4 samples from the lightmap texture, 1 framebuffer read, a blend operation, and 1 framebuffer write.
Or in short:
PC has 1 z-buffer read, 1 z-buffer write, 8 texture reads, 1 framebuffer write
SGI has 2 z-buffer reads, 1 z-buffer write, 8 texture reads, 1 framebuffer read, 2 framebuffer writes
So not only is it doing double the geometry work, but also twice the z-buffer reads, twice the framebuffer writes, and additional framebuffer reads that the PC isn't doing, all at over 4x the pixel count.
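The per-pixel tallies above can be sketched as a quick back-of-the-envelope comparison. This ignores geometry work and overdraw, and treating every read/write/sample as one equal-cost "op" is a big simplification (texture caches, burst writes, etc. all differ in reality), but it gives a feel for the gap:

```python
# Per-pixel memory operations for each approach, as tallied above.
# Counting each read/write/sample as one "op" is a simplification.
multitex = {"z_reads": 1, "z_writes": 1, "tex_reads": 8,
            "fb_reads": 0, "fb_writes": 1}
two_pass = {"z_reads": 2, "z_writes": 1, "tex_reads": 8,
            "fb_reads": 1, "fb_writes": 2}

per_pixel_pc = sum(multitex.values())    # 11 ops
per_pixel_sgi = sum(two_pass.values())   # 14 ops

# Scale by resolution: the SGI numbers were taken at 1280x1024,
# while typical PC play was at 640x480 (ignoring overdraw).
pc_ops = per_pixel_pc * 640 * 480
sgi_ops = per_pixel_sgi * 1280 * 1024
print(f"SGI pushes ~{sgi_ops / pc_ops:.1f}x the per-frame memory ops")
# ~5.4x
```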
Oh, and also note that PC cards of the time focused on performance, not accuracy, and typically used a lot of cheats along the way, while SGI, being the standard for graphics, aimed for reliability and accuracy in the produced image.
Now as for why SGI didn't support multitexturing? My guess is it wasn't needed by their actual markets. CAD typically doesn't need texturing at all. 3D modeling for games / movies would use just a single texture, as all the heavy effects / lighting were done at render time, not at modeling time. And realtime simulations were focused on pushing more pixels, not necessarily more detail in each of those pixels.