In late May, AMD released its new K6-2 processor. One of the main advances AMD made with the K6-2 was the addition of the 3Dnow! instruction set. Many people have asked how big a performance difference an instruction set can really make. Well, AMD believes that 3Dnow! is all the K6-2 needs; without 3Dnow!, the K6-2 performs almost identically to the older K6. But 3Dnow! is very powerful, and performance increases of up to 87% have been measured so far. For those wondering exactly what 3Dnow! is, this month's column is a basic primer on 3Dnow! and the K6-2.
3Dnow! is a set of extra instructions similar to MMX, but where MMX accelerates integer operations, 3Dnow! is designed to accelerate 3D graphics through faster floating-point performance. However, 3Dnow! is much more beneficial than MMX, and it's hardly fair to compare the two. Let's face it, MMX was a joke. It's as if Intel wanted to test its marketing division. When it was released, MMX had almost no support. Only now is it beginning to offer meager performance gains through more widespread support, because all new processors include MMX instructions.
3Dnow! gets much of its performance increase from SIMD (Single Instruction Multiple Data) floating-point instructions. Basically, SIMD allows a single instruction to perform the same operation on several values at once. Because of this, the K6-2 can perform 4 floating-point operations per clock cycle, as opposed to 1 per clock cycle with a Pentium-II. Now, this doesn't translate directly into games running 4 times faster; you won't be seeing Quake 2 running at 200fps anytime soon. Just because the K6-2 can perform 4 floating-point operations per clock cycle doesn't mean it always does. As a simple example, picture a game sending all of its FPU calculations through one data pipe. The K6-2 adds three more pipes, greatly increasing the bandwidth. But unless the game is programmed to use the extra three pipes, performance will be the same. Once the game is optimized for the extra bandwidth of the K6-2, it can spend that bandwidth either on a higher frame rate or on much more detailed and complex environments at the same frame rate.
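To make the "one pipe versus four" picture concrete, here is a minimal sketch in Python. It is only a toy model: the function names and the idea of counting "instructions" are illustrative assumptions, not real 3Dnow! code, and the width of 4 simply mirrors the figure quoted above rather than modeling how the hardware reaches it.

```python
def scalar_add(a, b):
    """Add two lists one element per 'instruction' (toy model of a single pipe)."""
    out = []
    instructions = 0
    for x, y in zip(a, b):
        out.append(x + y)
        instructions += 1
    return out, instructions


def packed_add(a, b, width=4):
    """Add two lists `width` elements per 'instruction' (toy model of SIMD).

    `width` is an assumption taken from the article's 4-operations figure.
    """
    out = []
    instructions = 0
    for i in range(0, len(a), width):
        # One 'instruction' handles a whole group of values at once.
        out.extend(x + y for x, y in zip(a[i:i + width], b[i:i + width]))
        instructions += 1
    return out, instructions


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [0.5] * 8

scalar_result, scalar_count = scalar_add(a, b)
packed_result, packed_count = packed_add(a, b)

print("scalar instructions:", scalar_count)
print("packed instructions:", packed_count)
```

Both functions return the same sums; only the number of "instructions" differs. A program written around `scalar_add` gets no benefit from the wider hardware, which is the point of the pipe analogy: the code has to be rewritten to hand over several values at a time.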
So unless a game is fully optimized from the start, the full capability of 3Dnow! is not always used. Since 3Dnow! optimized software is required to see any real performance gains, are we going to see yet another MMX support fiasco? Not so.
For one, AMD has the support of many developers and claims "100s more native applications to follow in '98 and '99". But the real bonus of 3Dnow! technology is that it can be used to increase performance in three different areas. Before I get into the optimization, let me explain how the 3D graphics pipeline works. There are four main parts in creating 3D graphics: physics, geometry, triangle setup, and pixel rendering. The CPU handles the physics, the geometry, and some of the triangle setup. While the CPU is processing this data, the 3D card has to wait for it before it can start rendering. This is why the performance of many 3D cards scales up with faster processors.
With native 3Dnow! support directly in games, the physics and modeling stage (first stage) can be accelerated. Because of the increased FPU throughput, the code in this stage can see performance increases of up to 400%. But why don't we see increases that large in the application as a whole? In the words of Lance Smith, Director of Technical Marketing at AMD, "...the overall impact on the application is determined by the amount of time spent in this code. For instance, if the physics and modeling (using a single pipelined FPU) takes 15-20% of the execution time which is typical for today's applications (more for complex applications, more objects in a scene), than overall performance will increase to 11-15%."
The 3D rendering pipeline (second stage) is also an area that 3Dnow! can accelerate. The optimized code can exist either in a 3D API such as Direct3D or OpenGL, or natively in the application in the form of a custom renderer. So a game that is not itself optimized for 3Dnow! but uses DirectX 6 (which is optimized for 3Dnow!) will still see a performance increase from the optimization in DirectX 6. How much of an increase? Again in the words of Lance Smith, "Depending on the complexity of the scene (i.e. lights, objects, filters and resolution), the time spent can range from 50-70% of the execution time which is again typical for today's applications (more for complex scenes), than the overall performance increase for this example can range from 37-50%."
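The arithmetic behind both of Smith's examples can be sketched in a few lines of Python. This is an illustration only: it assumes the accelerated code runs a full 4 times faster (the peak figure quoted earlier, which real code will not always reach), and the function names are made up for this example.

```python
def time_saved(fraction, speedup):
    """Share of total execution time removed when `fraction` of the
    run time is made `speedup` times faster."""
    return fraction * (1 - 1 / speedup)


def overall_speedup(fraction, speedup):
    """Whole-application speedup for the same change (Amdahl's law)."""
    return 1 / ((1 - fraction) + fraction / speedup)


SPEEDUP = 4  # assumption: the article's peak 4x figure

stages = [
    ("physics and modeling", 0.15, 0.20),
    ("rendering pipeline", 0.50, 0.70),
]

for name, low, high in stages:
    print(name)
    for fraction in (low, high):
        saved = time_saved(fraction, SPEEDUP) * 100
        faster = overall_speedup(fraction, SPEEDUP)
        print(f"  {fraction:.0%} of run time: {saved:.1f}% time saved, "
              f"{faster:.2f}x overall")
```

Under these assumptions the time saved comes out at roughly 11 to 15% for the physics and modeling case and roughly 37.5 to 52.5% for the rendering case, which is close to the quoted ranges. That suggests the quoted figures describe execution time saved; expressed as a speedup, the same cases would be about 1.13x to 1.18x and 1.6x to 2.1x.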
Finally, 3Dnow! can be supported in the video card driver itself, which handles the final steps of triangle setup and pixel rendering. Since these steps use a mix of integer and floating-point operations, 3Dnow! support can be programmed into the video drivers to help speed up the floating-point operations. Currently NVIDIA, Matrox, ATI, and 3Dfx are developing 3Dnow! optimized video drivers. The performance increase from video driver optimization ranges from 10 to 15%.
So even if 3Dnow! is not natively supported by a game, it can still offer a performance increase if the game uses optimized APIs and/or video card drivers. One added benefit of the 3Dnow! instruction set is that it isn't restricted to 3D graphics. Since it accelerates FPU performance, any application that uses the FPU can benefit once it is optimized. For example, 3D sound, which can take a toll on the CPU, can be improved with 3Dnow!. In fact, Aureal Interactive is currently working to optimize A3D for the 3Dnow! instruction set.
Now that you've heard the skinny on 3Dnow!, you're probably wondering why AMD decided to add 3Dnow! instead of improving the chip architecture. A few weeks ago, an AMD engineer made an interesting point in an online message post. Most of the software on the market today is optimized for Intel CPUs. Since each CPU has its own way of processing data, different optimizations can be made for each one. Because Intel has the largest market share by far, many software designers choose to optimize for Intel hardware. With 3Dnow!, developers can effectively optimize software for the AMD K6-2, and with that optimization it can compete head-to-head with the Pentium-II.