Savage3D review

S3 finally shows some muscle

While the Virge was the first affordable 3D accelerator, it did not take long for the break-neck pace of 3D graphics development to leave the whole architecture in the dust. The company was of course working on a new product, one that would meet contemporary demands. I even believe S3 wanted to exceed expectations and come back with an accelerator that was still affordable yet top-class. That would be a product to savage the competition with. Scheduled for the middle of 1998, it would check almost all the desired boxes: the newest manufacturing technology, full AGP support, strong triangle setup, a 125 megapixel-per-second fillrate, full-speed trilinear texture filtering, and a secret-sauce technology to leap ahead of the competition. And all of that for $200. The term "Voodoo2 killer" began to float around the web, but, as happens every so often, when the Savage3D finally hit the streets, expectations were not met.
The first reason was common to all the new chips that tried to exploit the new 250 nm manufacturing process. Because the process was riddled with problems, it took a while before chips reached their originally planned frequencies. Some companies accepted lower clocks, some even went back to 350 nm. S3 kept claiming a 125 MHz clock, but such a frequency was never reached. And that was only the first sign of the rough landing the chip was headed for.
And then there are the S3-specific reasons...

The card (and the rig)

AGPhantom MS4426. All Savage3D cards have that basic look.

The card in this test was running at 100 MHz, at least before I started fighting compatibility issues. S3 claimed full AGP 1.0 support, but it wouldn't work in my "new generation" test rig with the i815 chipset. The much bigger surprise was that the same failure happened with a popular and period-correct 440BX motherboard. During troubleshooting I flashed the latest BIOS. It did not help, but it did set the card's clocks to 110 MHz for both core and memory; the Savage3D supports only synchronized clocks. That is the clock the benchmarks were run at, so we have a Savage3D in its best shape. Graphics card vendors used clocks in the 90-110 MHz range; anything higher caused stability issues. The chip does not run cool, and the minimalist heatsinks of the 90s might not be enough without proper airflow.
Other reasons why this card should be a good choice are its SGRAM memory and the later revision B of the chip. Having had no luck enabling AGP on Intel chipsets, I tried a Socket 7 motherboard with the ALi Aladdin V chipset, and voilà: 2x AGP speed, DMA texturing, and sideband addressing all work properly. So, probably just for this one card, the system uses a K6-III+ at 500 MHz on a GA-5AA with a 100 MHz bus, tuned for performance to the best of my abilities.
And since the chip is the 86C391 version, it features certified Macrovision 7.1 support for DVD playback and a great movie experience. But that is not something to be covered here.

Architecture

Just like most of its contemporaries, the Savage3D is largely defined by its single-cycle pipeline. For S3 this was a bigger leap than for most, since even the later Virge chips could not render a full-featured pixel in two clocks. The Savage3D can, while putting more work into each pixel than most competitors. But let's start at the beginning of the pipeline. The AGP controller should have supported everything the 1.0 specification offered, including prefetching of texture tiles, but because of the troubles mentioned above, it likely did not turn out the way the designers hoped. The data is then received by a full triangle setup engine. Despite being S3's first, it needs quite a respectable 25 clocks to prepare one vertex.
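To put that figure in perspective, here is a rough back-of-the-envelope calculation. This is my own arithmetic, assuming the 25 clocks per vertex quoted above and the 110 MHz clock of the test card, not an official S3 rating:

/* Rough, hypothetical estimate of peak geometry throughput for the
 * Savage3D setup engine. Assumptions (mine): 25 clocks per vertex,
 * 110 MHz core clock; real-world rates will be lower. */
#include <stdio.h>

int main(void)
{
    const double core_hz           = 110e6; /* clock of the test card    */
    const double clocks_per_vertex = 25.0;  /* figure quoted in the text */

    double vertices_per_sec = core_hz / clocks_per_vertex;
    /* Independent triangles need three new vertices each; strips and
     * fans reuse vertices, so the real triangle rate lands in between. */
    printf("~%.1f M vertices/s, ~%.2f M independent triangles/s\n",
           vertices_per_sec / 1e6, vertices_per_sec / 3.0 / 1e6);
    return 0;
}

That works out to roughly 4.4 million vertices per second, or somewhere between 1.5 and 4 million triangles per second depending on vertex reuse.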
The texturing stage was beefed up mightily. Various formats up to 32-bit RGBA at 2048x2048 resolution are supported, and mip-mapping never let me down. Not only can the Savage3D output a bilinearly filtered texel in a single clock, it can do a trilinearly filtered one as well. To maintain this rate at the proclaimed full speed, an 8 kB texture cache was added. Of course, there will be a performance hit when the needed data isn't in the cache. But let me assure you right away: for a chip with a 64-bit memory bus, the Savage3D handles trilinear gracefully. There was even a promise of an anisotropic filter, but it did not seem to be supported. Another feature that failed to materialize in testing was table fog.
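That cache is doing real work. A crude worked example shows why, under my own assumptions: 16-bit texels, the 110 MHz synchronized memory clock of the test card, zero texel reuse, and no framebuffer or Z traffic (which would only make things worse):

/* Illustrative worst-case math: trilinear filtering reads 8 texels per
 * pixel (a 2x2 quad from each of two mip levels). Without any reuse from
 * the texture cache, that traffic alone outruns a 64-bit memory bus. */
#include <stdio.h>

int main(void)
{
    const double mem_clock_hz   = 110e6; /* memory clock of the test card */
    const double bus_bytes      = 8.0;   /* 64-bit memory interface       */
    const double texel_bytes    = 2.0;   /* assumed 16-bit texture format */
    const double texels_per_pix = 8.0;   /* trilinear sample count        */

    double peak_bw = mem_clock_hz * bus_bytes;                 /* bytes/s */
    double fill    = peak_bw / (texel_bytes * texels_per_pix); /* pix/s   */
    printf("Peak memory bandwidth: %.0f MB/s\n", peak_bw / 1e6);
    printf("Trilinear fill with no texel reuse: ~%.0f Mpixels/s\n",
           fill / 1e6);
    return 0;
}

In other words, with no texel reuse the bus alone would cap trilinear fill at roughly half of what the clock otherwise allows, before a single framebuffer or Z access; the cache is what closes that gap.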
All the common blending combinations are supported. The pipeline is built for true-color operation through and through; even the Z-buffer supports 24-bit precision besides the usual 16 bits. Since most games at the time were still using only 16-bit color, S3 prepared a proprietary dithering algorithm to stay closer to the original colors. Overall, this list looks quite impressive for a regular-sized chip from 1998. But as is well known, the final performance wasn't breathtaking.

How Savage is it?

No matter what mishaps befell the chip, S3 made a big leap compared to their previous performance peak, the Virge/GX2. The Savage3D delivered more than triple the average framerates, and minimums even quadrupled. Detailed results are here.

As usual, click to show minimum framerates.

While the leap is one of the biggest generational differences, it was made easier by the quite low performance of the predecessor. The more pressing question is whether the performance is enough for the market of 1998-1999. A single texel pipeline, albeit a "full speed" one, 8 MB of memory, and a 64-bit memory interface had become the new baseline. S3, however, pulled an important new technology out of their hat: hardware texture compression.

S3TC

It is to S3's credit that they did not cut corners when it came to texturing. Even the first Virge, slow as it was, rendered textures up to the quality standards of the time. And the same goes for the Savage. But how could it hope to challenge high-end competitors able to process two texels per clock? The answer lies in compressing the texture data. Take a 24-bit RGB source: the algorithm reduces it to four bits per texel, a 6:1 ratio. In one 64-bit S3TC data block it is then possible to pack data for sixteen texture samples instead of two! Should the source be RGBA, the ratio is lower, 4:1, resulting in "only" four times the regular data density and eight samples transferred in one clock. Having a hardware block in the chip able to decompress this format in real time enables a big leap in performance. Our minds today are well acquainted with inventive marketing numbers, but it is important to put ourselves in 1998 shoes and imagine the excitement that claims built on texture compression could generate. Multiply the texture quality. Deliver more than 4x AGP bandwidth. Triple the effective local video memory capacity. These are real possibilities, but such high compression ratios can only be achieved with lossy algorithms. The color data is translated into a reduced palette. How the values of that palette are determined is where the human ingenuity goes; it is crucial to the viability of the compression. Sometimes, if just one value drifts too far from the original, the observer will not be satisfied with the image presented. And that happened to me as well.
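To make the arithmetic concrete, here is a minimal sketch of decoding one 64-bit block of the opaque S3TC layout (the variant later known as DXT1/BC1): two RGB565 endpoint colors define a tiny four-entry palette, and sixteen 2-bit indices pick from it. The structure and function names are mine, not from any S3 SDK, and alpha handling is omitted:

/* Minimal sketch of decoding one 64-bit S3TC block (the opaque DXT1/BC1
 * layout): two RGB565 endpoint colors plus sixteen 2-bit indices into a
 * 4-entry palette interpolated from them. */
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb8;

static rgb8 rgb565_to_rgb8(uint16_t c)
{
    rgb8 out;
    out.r = (uint8_t)(((c >> 11) & 0x1F) * 255 / 31);
    out.g = (uint8_t)(((c >>  5) & 0x3F) * 255 / 63);
    out.b = (uint8_t)(( c        & 0x1F) * 255 / 31);
    return out;
}

/* block: 8 bytes as stored in memory; out: 4x4 texels, row-major */
void s3tc_decode_block(const uint8_t block[8], rgb8 out[16])
{
    uint16_t c0 = (uint16_t)(block[0] | block[1] << 8);
    uint16_t c1 = (uint16_t)(block[2] | block[3] << 8);
    rgb8 p[4];
    p[0] = rgb565_to_rgb8(c0);
    p[1] = rgb565_to_rgb8(c1);
    if (c0 > c1) {             /* 4-color mode: two interpolated entries */
        p[2].r = (uint8_t)((2 * p[0].r + p[1].r) / 3);
        p[2].g = (uint8_t)((2 * p[0].g + p[1].g) / 3);
        p[2].b = (uint8_t)((2 * p[0].b + p[1].b) / 3);
        p[3].r = (uint8_t)((p[0].r + 2 * p[1].r) / 3);
        p[3].g = (uint8_t)((p[0].g + 2 * p[1].g) / 3);
        p[3].b = (uint8_t)((p[0].b + 2 * p[1].b) / 3);
    } else {                   /* 3-color mode: midpoint + "transparent" */
        p[2].r = (uint8_t)((p[0].r + p[1].r) / 2);
        p[2].g = (uint8_t)((p[0].g + p[1].g) / 2);
        p[2].b = (uint8_t)((p[0].b + p[1].b) / 2);
        p[3].r = p[3].g = p[3].b = 0;
    }
    for (int i = 0; i < 16; ++i) {
        int idx = (block[4 + i / 4] >> (2 * (i % 4))) & 0x3;
        out[i] = p[idx];
    }
}

Every 4x4 tile therefore carries only four distinct colors, all lying on a line between the two endpoints, which is the kind of limitation behind the banding visible in the System Shock 2 comparison below.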

System Shock 2 no compression
System Shock 2 S3TC
Human vision and its sensitivity to the color green.

Far be it from me to discourage readers from enabling the compression. Most of the time S3TC will not hit your eyes like this. It is up to each person to consider in which games and at what settings the compression is useful. But since I found such a disparity the very first moment I took screenshots, I decided to test without compression. Exceptions exist, like Quake 3, which was made with S3TC in mind.

But how exactly does S3TC affect performance? Let me show a game that positively flies on the Savage3D. It is Forsaken, which very obviously hit the vsync limit, so keep that in mind when looking at the results below.

It turns out the compression does not perform miracles, merely providing a sizeable uplift. Of course, I tried S3TC in other cases where performance was more dire. Take Falcon 4.0, a far more texture-heavy game: S3TC did not save the day; in fact, it did not measurably improve framerates. The technology did enable a much-needed leap in possible texture resolution, but it won't propel the Savage3D into a faster class of 3D accelerators on its own. Another letdown came in a newer game that explicitly supports the compression: Expendable. Not only did it make no difference, but despite Rage, the game's developer, endorsing the chip, the Savage3D also accumulated artifacts during play:

Expendable Savage3D

And there are more problems to report...

Experience

Having all the features one could want in 1998 also means proper texture mip-map handling on multiple levels. The drivers allow mip-mapping to be forced, which is handy for older games that were not made with it in mind. Take Mechwarrior 2 Titanium Edition as an example. The ground textures tend to be extremely grainy at longer distances. In such cases, the ability to reduce the resolution of distant textures is a blessing (even speed-wise), and the Savage3D did it just fine. Hence I allow framerate results with auto mip-mapping in that game, but it is the only one. The results aren't always a win-win between speed and quality. Let me turn to System Shock 2 again:

Savage3D automipmap
Forced mip-mapping can put the levels too close together. That is why I cannot recommend leaving the feature always on.
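Why does forcing it do that? The following is a generic illustration of mip level selection with a level-of-detail bias, my own simplification rather than S3's actual hardware formula: the level is roughly the base-2 logarithm of how many texels fall onto one screen pixel, and any bias added on top pushes blurrier levels closer to the viewer.

/* Generic mip selection sketch: level ~ log2(texels per pixel) + bias.
 * A forced, aggressive bias shifts every transition toward the camera,
 * which is what "levels too close together" looks like in practice. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double footprints[] = { 1.0, 2.0, 4.0, 8.0 }; /* texels/pixel */
    const double bias = 1.0;   /* hypothetical forced bias */

    for (int i = 0; i < 4; ++i) {
        double lod = log2(footprints[i]);
        printf("footprint %.0f texel(s)/pixel -> level %d, biased -> level %d\n",
               footprints[i], (int)floor(lod), (int)floor(lod + bias));
    }
    return 0;
}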

The available settings should be tweaked on a per-game basis, and Fachman's drivers with support for profiles are what I recommend. For more screenshots, see the Savage3D gallery.

I have to report two total losses, both among the oldest games tested. First, Heavy Gear: the game would always lock up at the very first screen. The second is CartX Racing, which was hardly stable enough to finish my demo, and the texture corruption was unbearable. It wasn't the only game suffering from texture corruption. Lands of Lore II looked hopeless at first, but after playing for a while the artifacts went away, and in the end everything looked correct during the benchmarked segment. Also, the bilinear filter in Formula 1 does not work. Then there are small errors, like glitching texture transparency in Homeworld or a black ground texture in the Insanity benchmark.

It is, however, the OpenGL drivers that gave me more work. The latest Fachman driver packs an ICD made for the Savage MX, which isn't completely compatible and produced too many annoying artifacts for me to use it. Still, it is an understandable move considering how little support S3 gave the first Savage. This is one of those cards that came out with miniGL drivers, with a full ICD delivered only later. Both handle the early Quake games decently, and the later version does alright in Half-Life as well. But Quake III was a tough nut to crack. You may find YouTube videos showing how to render it quite correctly with a miniGL based on S3's MeTaL API. I decided to stick with the ICD, which suffers a visual and performance meltdown when rendering some of Quake 3's animated textures. There is a patch for the game that excludes those textures from light mapping. Not an ideal solution, but I did not find any ideal way to render Quake 3 on the Savage3D. This is a testament to the extremely short driver support: the most benchmarked game of 1999 was never fixed. And the ICD still fails completely with some more exotic OpenGL renderers.

GLQUAKE Savage3D via SavageMX ICD
Besides some random lines, there are even more artifacts created by the ICD made for the Savage MX.
For more screenshots see the Savage3D gallery.

On the positive side, bugs aside, almost all the games tested rendered properly and up to the standards of 1998. There are some other shortcomings of the drivers worth mentioning. The range of supported resolutions is quite small. For example, 1280x960 could not be enabled, a resolution I wanted to test Tomb Raider II with. But since the Savage3D handled 1280x1024 at full speed anyway, that is the result recorded in the benchmarks. Well, how fast is it?

Performance

It won't surprise anyone that the Savage3D runs circles around the Virge. The tougher question is how it stands against its peers. Once you see the choice of opposing card, it is clear the numbers aren't high.

Against the ATI Rage XL, a younger card but one rooted in 1997, the Savage3D musters only a 7% advantage in average framerates and barely any in minimums. This is underwhelming for a chip that was supposed to take on the high end of 1998. But there is a catch. In more demanding games the Savage3D often barely loses any steam when going from 640x480 up to 800x600. That prompted me to exclude the lower-resolution variants of the tests, but because the card is not consistent even in this regard, the overall numbers remained the same. Also, the tragic results of the multiplayer Quake demos suggest quite a CPU bottleneck, at least in OpenGL. This is partially down to the drivers, since S3 did not implement 3DNow! optimizations. That raises hopes for the card on faster systems, but let me reiterate that the platform used here was already above period-correct and should not be an excuse.
Overall, this Savage3D placed itself between the Rage XL and the G200 in terms of speed. A Voodoo2 killer this is not.

Last words

The Savage3D wasn't a conqueror, not even the savior S3 needed. Launch prices had to be pushed down to $150 and continued falling as people learned about all the troubles. Only at the beginning of 1999 did Hercules release their Supercharged Savage3D card, which at 120 MHz almost met that elusive target clock. And a few months later a whole group of new chips made the Savage3D obsolete. What's more, S3 itself effectively obsoleted the Savage3D by stopping driver development within a year of release!
Nonetheless, the hardware was on the right path. The MX/IX versions, with memory in a single package with the chip, reduced power draw and cost, helping integration onto motherboards. And among the new chips of 1999 was, of course, the Savage4: a more mature, optimized evolution of the first Savage. Considering what it was, it was a respectable solution from day one and actually sold well. For the second half of the year, S3 aimed even higher with their first dual-pipeline architecture featuring a Transform and Lighting unit. But the new hope quickly vanished when the Savage 2000 missed performance expectations and the T&L unit turned out to be defective.

At this low point VIA, a big partner still in need of integrated graphics, moved in for an acquisition. S3 went through restructuring, which probably delayed the following Savage XP chip too much. It was essentially just an updated Savage 2000 done right, and it was recalled at the last moment in April 2002. The next performance architecture, the DeltaChrome S8, did arrive on schedule. However, despite strong specifications, the gap in efficiency between S3 and the two biggest competitors kept widening. The Chrome series continued with less ambitious chips, their few highlights being low power consumption and HTPC value. While S3 lives on to this day, owned by HTC, they have remained silent since 2010, and nobody really believes they could ever make it back.