Chromatic Mpact review STB

A different kind of hardware

Nowadays we have only one way of making 3d accelerators, but in the middle of the nineties all options were open. Personal computers were just becoming multimedia machines and a strong wave of digital signal processors was coming to handle the new workloads. The biggest names of the industry like IBM, Motorola, Analog Devices, TI and AT&T were developing their own DSPs. And then there were also fresh startups like Xenon Microsystems. Founded with Kubota (yes, those bulldozer makers had a graphics division, which also spawned Gary Tarolli and ATi engineers), Convex (a vector supercomputer maker sold to HP in 1995), Showgraphics and Rambus, Xenon was to develop a flexible DSP for a multi function multimedia solution. Before release the company was renamed to Chromatic Research. In parallel they worked on a media oriented x86 compatible CPU, which never saw the light of day. What made Chromatic special was support not only for the usual audio and video processing but also for 3D and other functions. 3D graphics relies heavily on vector computing, which is a strong side of media DSPs. However, the computational power of Mpact! comes from the various logic units in it, and all the higher level functionality has to be programmed. More specialized designs with minimized programmability are of course faster. Chromatic hoped to overcome this by exploiting the flexibility of Mpact! and supporting a lot more features with their software driven chip. Parallel hardware and software development should also have sped up time to market.

First Mpact!

The first chip was ready before the end of 1995 and named Mpact! /3000. It was rated at an incredible peak of 3 billion operations per second at 62.5 MHz. While Mpacts had lower clocks than contemporary CPUs, their SIMD instructions made such awesome numbers possible. However, two more things were needed to bring Mpact to life: volume manufacturing partners and software. LG and Toshiba took advantage of Mpact! and built their derivatives, but software development took longer and delayed the launch until September 1996. Meanwhile a 135 MHz 24 bit Ramdac and a clock synthesizer were integrated into the chip. These cost efficient chips carry an R in the name, i.e. R/3000. A new higher frequency design was also unveiled at the launch: the R/3600, ticking at 75 MHz thanks to a transition from a 0.5 to a 0.35 micron process. The price of the card was at the level of a midrange 3d accelerator, but Mpact! also handled high quality playback of MPEG2 and other video codecs, a rich spectrum of audio codecs, 3D sound and later also AC3, and a 33k modem with DSVD. Together with telephony and videophone, Chromatic was boasting support for seven demanding multimedia techniques. But board manufacturers were not very interested, and it took a while until STB, Gateway and Micron saved Mpact and used the /3600 largely as a DVD decoder. This is a testament to the flexibility of the design: DVD and 56k modem standards were implemented after the chip was done. Similarly, Sound Retrieval System extensions were implemented after Chromatic Research entered into an agreement with SRS Labs in 1996. Fixed function hardware rarely gains new features with time, but the Mpact could emulate legacy hardware and adopt new multimedia standards.
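The model numbers line up neatly with the quoted peak ratings. As a back-of-the-envelope check (my own arithmetic, not a Chromatic formula), the sketch below derives the operations per clock from the /3000 figures and projects them onto the 75 MHz part. It assumes peak throughput scales linearly with core clock; the function names are made up for illustration.

```python
# Back-of-the-envelope check of the peak ratings quoted above.
# Assumption: peak throughput scales linearly with the core clock.

def ops_per_clock(peak_ops_per_s, clock_hz):
    """Peak operations issued per clock cycle."""
    return peak_ops_per_s / clock_hz

def peak_mops(clock_mhz, per_clock):
    """Peak rating in millions of operations per second."""
    return clock_mhz * per_clock

# Mpact! /3000: 3 billion operations per second at 62.5 MHz.
per_clock = ops_per_clock(3_000_000_000, 62_500_000)

print(per_clock)                   # operations per clock of the /3000
print(peak_mops(75.0, per_clock))  # projected rating of the 75 MHz part
```

This works out to 48 operations per clock, and 48 operations at 75 MHz gives 3600 million operations per second, which is exactly what the R/3600 name advertises.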

Mpact has its own real time kernel separating it from system interrupts and the latencies of memory and the PCI bus. The MRK can choose between real time scheduling and preemptive multitasking. The dirty part was programming the core itself, as VLIW requires a lot of assembly magic to extract performance. SIMD techniques are hard to program as well. Chromatic put so much effort into Mediaware that they were very protective of it and never published it on the web. Users were pissed off, as the only chance for an update was getting a CD from computer vendors. Even today I have failed to find a Direct3d driver for my Mpact R/3600. Thanks to Slaventus I got a driver with a d3d library, but still failed to install my card as a 3d accelerator. Perhaps only Mpacts with their own display output can function in such a way. A test of this one will have to wait until some hero appears to save me.

Here is STB's add-on board used for ultimate DVD playback on PC. Whenever you see that fancy letter R on a graphics chip you know there is a Rambus controller inside. Rambus is a high frequency, low pin count memory solution. While only one byte of data is transmitted per clock, RDRAM uses an extreme frequency of 500 or 600 MHz for Mpact /3000 or /3600 respectively. The memory has higher latencies than other synchronous RAM, but graphics chips are good at hiding them. The first released card had two megabytes of memory, this one comes with four, and some carried even six.
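To put those clocks into perspective, here is a quick sketch of the peak bandwidth of a single channel. It assumes exactly one transfer per quoted MHz and ignores all protocol overhead, so sustained numbers will be lower; `channel_peak_mb_s` is only an illustrative name.

```python
def channel_peak_mb_s(rate_mhz, bits_per_transfer=8):
    """Peak bandwidth of one channel in MB/s, counted in 8-bit bytes.

    Assumption: one transfer per quoted MHz, no protocol overhead.
    """
    return rate_mhz * bits_per_transfer / 8

for chip, rate_mhz in (("/3000", 500), ("/3600", 600)):
    print(chip, channel_peak_mb_s(rate_mhz), "MB/s")
```

With one plain byte per clock that is 500 and 600 MB/s. If the transfer is counted as 9 bits wide, `channel_peak_mb_s(500, 9)` gives 562.5 MB/s.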

Architecture


Grey parts are programmable. Datapath contains all the ALUs.

The design philosophy was derived from a vision of an unpredictably developing, multimedia heavy future. Dedicated hardware for each function would be inefficient and underutilized. Forget about a graphics pipeline, this is a special purpose CPU. Mpact does not have a data cache, since it is pretty much useless for streaming data. The architecture revolves around multiport addressable SRAM instead of registers, but also has four special purpose registers for indirect addressing of its 72 bit SRAM entries. All ALU I/O traffic goes through the relatively large SRAM with more than 4 GB/s of bandwidth. Since graphics is computed in such ALUs and not in a pipeline of specialized stages, it takes Mpact four clocks to render one 3d pixel. Unlike in fixed function graphics hardware, program jumps are supported, and so are hardware loop counters, all without branch overhead. The key unit for DVD playback, like in other media processors, is an MPEG2 system bit-stream variable length decoder. The Mpact can feed contiguous SRAM entries with DMA transfers asynchronously with the ALUs. Multimedia loves such high parallelism. Four read and four write ports to the SRAM are available to five ALU groups. Data crossbar bandwidth is an awesome 11 GB/s for Mpact /3000. The design is based on 9 bit bytes instead of 8 bit bytes, which is the equivalent of adding a sign bit to each byte and is very useful for MPEG decoding. One of the founders was also a founder of Rambus, so not surprisingly Chromatic used RDRAM as its memory technology. Rambus specifies 9 bit memory devices for parity purposes, but the Mpact uses the ninth bit for additional precision, giving data sizes in multiples of 9. This is why Mpact has some unique precisions such as 18 bit Z-buffering and audio sampling. 24 bit 2d colors are supported, 3d is limited to 16 bits.
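Why would an extra bit per byte matter for MPEG? My reading is that the difference between two 8 bit pixels ranges from -255 to 255, which does not fit into 8 bits but fits a signed 9 bit value. The sketch below only illustrates that idea: it packs eight signed 9 bit values into one 72 bit word, the width of an SRAM entry. The lane order and two's complement encoding are my assumptions, not the documented Mpact layout, and `pack72` / `unpack72` are hypothetical helper names.

```python
BITS = 9                  # width of one 9 bit "byte"
LANES = 8                 # 8 x 9 = 72 bits, the width of one SRAM entry
MASK = (1 << BITS) - 1    # 0x1FF

def pack72(values):
    """Pack eight signed 9 bit integers into one 72 bit word.

    Assumed layout: lane 0 in the lowest bits, two's complement lanes.
    """
    if len(values) != LANES:
        raise ValueError("expected exactly 8 values")
    word = 0
    for lane, v in enumerate(values):
        if not -256 <= v <= 255:
            raise ValueError("value does not fit in 9 signed bits")
        word |= (v & MASK) << (lane * BITS)
    return word

def unpack72(word):
    """Split a 72 bit word back into eight signed 9 bit integers."""
    out = []
    for lane in range(LANES):
        raw = (word >> (lane * BITS)) & MASK
        # Sign-extend: a set ninth bit marks a negative value.
        out.append(raw - (1 << BITS) if raw & (1 << (BITS - 1)) else raw)
    return out

# Differences of 8 bit pixels, including both extremes.
diffs = [255 - 0, 0 - 255, 10 - 200, 0, 1, -1, 128, -128]
word = pack72(diffs)
print(unpack72(word) == diffs)   # round trip keeps every sign intact
print(word.bit_length() <= 72)   # still fits one SRAM entry
```

With ordinary 8 bit bytes the same eight differences would need 16 bit lanes, halving the number of values processed per SRAM entry.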

792 bit internal bus for massive sustainable throughput.

Very Long Instruction Word (VLIW)
Early DSPs had two or three functional units and reduced code size thanks to CISC instruction sets, which supported commonly used parallel issues as single instructions with a tag determining sequential or concurrent execution. Newer DSPs have more ALUs, and the first Mpact features five of them. Extracting high performance from such a processor requires stronger instruction feeding. Mpact issues two instructions at once and each can control multiple ALUs. The alternative super-scalar approach with run-time instruction parallelizing has higher overhead. Defining software compatibility at the source code level is not a problem for media processors, whose software life cycle is shorter than that of a mainstream CPU. The bigger code size of a VLIW architecture can be tamed with instruction stream compression. Mpact's code density is good enough to do without it, as its instructions are of a RISC SIMD nature. VLIW is highly flexible and utilizes resources efficiently when given a good compiler, but only up to a certain amount of parallelism. Beyond that, compiler complexity increases and the benefits diminish because of less adaptable algorithms, wiring delays, and a widening gap between on-chip processing speed and available bandwidth to external memory. Demand for higher processing power continues, and modern GPUs often combine VLIW parallel processing with super-scalar designs and multithreading.

Second Mpact


In year 2000 the earth was hit by a giant meteo... just kidding.

One year after the launch of Mpact /3000, Chromatic had the second generation ready, this time designed with 3d in mind. SGS-Thomson joined the band as another manufacturer. The planned clock of 150 MHz was not reached, 125 MHz was used instead. Thus the second Mpact exactly doubled the peak BOPS of the first Mpact, hence the name Mpact R/6000. Chromatic claimed to have made the first device capable of sustaining a billion operations per second. Those enhancements were made possible by a much improved 0.35-micron manufacturing process. The number of transistors in the chip increased to 3.5 million, and it consumes 4.5 watts under full load. Mpact2 includes a small rasterization pipeline, support for AGP texturing and new instructions.

Here is a Nitro DVD by STB, again with a chip from LG. Don't mind those heatsinks. Originally STB used quite a big one, but my card arrived without it, so at least I put these little buggers on. Mpact2 can get a bit hot. Also, the ugly crystal on the right is my replacement, the original one fell off in the mail. RDRAM runs at 500 MHz just like with the old /3000. I am not aware of any overclocking possibilities. Ramdac speed was increased to 230 MHz. Nitro DVD is very poor when it comes to connectors; if you want to see full multimedia insanity, look for the Xenon Microsystems card.

Inside Mpact2

The design revolves around a similar VLIW core capable of issuing one or two instructions packed into a 72 bit word per cycle. The new multimedia processor still features a tight co-design of software and hardware and integrates a second RDRAM channel, enabling a separate 8 kb instruction cache. With the memory clock unchanged, the two channels together give 1125 MB/s of peak bandwidth. The chip, now running at double the clock, also doubles memory requests, and the RAM arbiter combines results from the two channels into double sized data through interleaving. Mpact2 has more on-chip memory bandwidth as well. Internal SRAM is expanded to 8 kb with 6 read and 6 write ports. This central memory still has 11 access ports servicing the six functional units, PCI interface, video I/O interface, random peripherals, and Rambus channels. Most important for me is a specialized polygon rendering pipeline for 3D graphics, the sixth ALU group of Mpact2. Together with the specialized motion estimation unit, those two are the only integer ALUs. The graphics unit is a 35-stage pipeline with a 2 kb texture cache. Excluding the cache, there are only 83,000 gates in the pipeline, and it does full speed perspective correction, texture transparency testing, texture and fog blending, pixel dithering and alpha threshold. The feature set is complete, but there are other quirks instead. The quality of the bilinear filter is among the worst, with some color banding left in images, caused by a limited number of interpolation steps between vertices. Mpact2 needs "only" 3 clocks per 3d pixel and thus has a theoretical fillrate of 41.6 megapixels/s. This value holds for simplistic scenes; in the real world, with filtered textures and blending, Mpact2 is lucky to reach half of that.
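The fillrate figure is simply the core clock divided by the clocks spent per pixel. Here is a small sketch of that arithmetic, using the 125 MHz and 3 clocks quoted above and, for comparison, the 62.5 MHz and 4 clocks of the first Mpact. These are theoretical peaks only, and `fillrate_mpix_s` is a name I made up.

```python
def fillrate_mpix_s(clock_mhz, clocks_per_pixel):
    """Theoretical peak fillrate in megapixels per second."""
    return clock_mhz / clocks_per_pixel

mpact2 = fillrate_mpix_s(125, 3)    # Mpact2: 3 clocks per 3d pixel
mpact1 = fillrate_mpix_s(62.5, 4)   # first Mpact: 4 clocks per 3d pixel

print(f"Mpact2 peak:      {mpact2:.2f} Mpix/s")
print(f"Mpact2 realistic: {mpact2 / 2:.2f} Mpix/s (about half of peak)")
print(f"Mpact /3000 peak: {mpact1:.2f} Mpix/s")
```

So doubling the clock and shaving one clock off each pixel gives Mpact2 roughly 2.7 times the peak fillrate of the /3000.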

Mpact2 3DVD 8MB
More classy 3DVD card with extensive I/O and 8MB of memory.

Experience

Mpact2 surprised me with its compatibility, but for a multi function DSP board I had low expectations to begin with. On the first driver runs the same bug was seen in a few games: random lines and triangles "shooting" onto the screen from the top left corner. This is seen for example in Forsaken and TNP. Broken transparency in Expendable covers the whole screen with gray color. The HUD and sky in Hellbender are corrupted. Incoming shows heavy color dithering. The sky in Monster Truck Madness 1 is not rendered, leaving garbage on screen. Enabling texture filtering in Motoracer laid down disruptive black lines on the track. The second Motoracer was broken completely. Backgrounds of Resident Evil are covered in black. Big textures of Warbirds are not to be seen. Unreal is rendered as lines of random colors, and somehow 3dMark thinks Mpact2 does not support textures. Interestingly enough, most of these big bugs do not appear with an older driver, and after combining results from two driver versions I could successfully complete almost all game tests. Recently I found an even newer driver set, which finally fixed the remaining compatibility problems. I am afraid Chromatic's driver policy made this achievement a well kept secret. Mip-mapping and/or memory management was a major source of problems until this last driver. MDK and Ultimate Race Pro have a similar problem with a low resolution ground/track texture. Populous refused to run in resolutions above 320x240. I tried to play with the quality slider in the Mpact control panel, and it helped with LOD only in Wing Commander Prophecy, lowering fps by some 10%:


Click on the image to see the difference after setting highest quality. Visit whole gallery.

The 8MB card avoided such problems except in two games. The ground texture in Mechwarrior 2 switches to the lowest mip level at the first transition. Viper Racing suffers a lot: the LOD level is completely off, blurring track objects and the fence into oblivion. Turning mip-mapping off makes Viper Racing look a lot better. And then there is a wrapper for GlQuake and Quake 2, but it is a beta with issues. Games based on the first Quake engine run quite slowly, and sometimes random lines and dots appear. Quake 2 was up to expectations, Quake 3 worked only via the d3d wrapper. The last driver brought an updated version that sped up Quake 2 nicely, but broke compatibility with Quake 3 and Sin. What's more, it is not stable enough with the 4MB card.

Performance

Here are average framerates, click on image to see minimal.

PowerVR's PCX2 is a worthy contender. It exceeds the Mpact2 4MB in average framerates by 20% and matches it in minimal ones. The board with 8 MB of memory sits in between. The feature rich Mpact2, however, produces nicer images in almost every game. Overall the speed is not tragic by far, but it is also a significant distance from the top dogs of 1997.

Summary

Mpact failed to sell well as a graphics card, but was it a faulty concept, wrong execution or bad luck? Many "what if" thoughts complicate such judgements. The whole business model of selling IC designs to semiconductor manufacturers did not guarantee much profit for Chromatic. As they could not sign a foundry partner, things were looking even gloomier for the planned CPU. Another difficulty was the burden of coding all the Mediaware modules required by the different market targets. The company was not able to interest its customers in cooperating on software development, nor to create enough revenue to cover its development costs.

The whole software layer of Mpact was reaching operating system complexity, but it also shielded all multimedia features behind a single IRQ, greatly enhancing stability. You really did not want to have all the drivers of special devices fighting at Ring 0 of Windows 9x and for the PCI bus. The first Mpact had little to offer as a 3d card, and some dedicated hardware clearly became necessary to cover new trends. There were many reasons for the demise of the company, but it was primarily the lack of interest from PC makers that made it impossible for Chromatic to sustain its overhead. Mpact delivered great 2d graphics performance, state of the art audio capabilities and ultimate DVD playback, so it was the inferior 3d experience that proved fatal. Had Chromatic fed their gamers via the web with actually useful driver updates, maybe they would have been more patient and spread some good word about Mpact2. After all, as you can see, it can perform reasonably well in 3d. Being best in DVD alone did not sell enough. Consumers accepted the cost of special purpose 3d accelerators, CPU vendors boosted their multimedia performance by adding SIMD support, and other graphics vendors assisted DVD playback well enough to compete with special purpose decoders. Chromatic Research was known for a relatively large number of employees, peaking at around 350. Of note is the fact that it went through more than 5 rounds of funding. In July 1998 Chromatic announced the end of Mpact 3 development and a change of focus to a new, vaguely described product. The company laid off half of its employees and was heading for bankruptcy. In October ATi decided to buy the struggling Chromatic for $67 million and absorbed it by the end of the year. This meant a premature end of driver support, because ATi could not be bothered to offer any help. Around the same time heavyweights like Samsung and Philips also gave up on their PC media processors. 3d acceleration was to be done by specialized hardware only.