Interestingly, the graphics card of yore, that dedicated pixel-pushing workhorse of our entertainment age, is undergoing a very similar evolution. What was once a slab of silicon meant for very specific tasks is fast evolving into a vastly parallel silicon factory, populated by a transistor task force half a billion strong, and powering a multi-billion dollar gaming industry. Tellingly, the graphics chip is increasingly being referred to as the graphics processing unit (GPU) because, much like the CPU from which it derives its name, its work is becoming less specialised and more general.
Before we take a look at how this is happening, it is important to identify the three major forces that make up the industry. These are the independent hardware vendors (IHVs), who design and produce the graphics chipsets and graphics cards; the independent software vendors (ISVs, or game developers), who design and produce the games; and finally, the people whose job it is to talk to both the ISVs and the IHVs to create the software rules and directions that enable the games to talk to the graphics cards. This final piece of the puzzle is the application programming interface (API). The three players are constantly involved in discussions on how to take the industry forward; these discussions are translated into the API, which then serves as the template upon which a graphics chipset is designed, and upon which, in turn, a game builds its eye-candy.
Here, we take a look at one such API: Microsoft's upcoming DirectX 10 (DX10), more specifically the Direct3D element of DX10. Why was this step taken, and where will it lead us?
Why DirectX 10?
With the introduction of NVIDIA's very first GeForce chipset, the graphics industry took its first tentative step towards the GPU. The path taken then leads today to a place where the graphics chip does more than push pixels. This resemblance to a CPU stems from the identified need for the graphics processor to gain independence from the CPU. With DX10, engineers hope to break the shackles that tie graphics processing to the CPU, not only speeding up rendering but also granting the GPU more power to push floating-point and integer data: power that will enable it to accelerate sundry tasks, from 3D graphics to physics to audio to DVD playback.
DirectX 10 is thus a very important step in taking the GPU forward into unknown regions. The API is primarily designed to interface with a more generic graphics processor. With the power of DX10 and the hardware supporting it, games of tomorrow will be richer, faster, more interactive, and more detailed. DX10 will take the gaming industry another step closer to the holy grail of truly realistic visuals.
The sameness you see in today's game worlds, where the same tree or the same foe greets you around every corner, is an unfortunate side-effect of the current API structure. Today's game needs to run to the API, which in turn runs to the driver, which then talks to the hardware, which finally complies and renders a tree. At each step, the API adds overhead; as much as 40 per cent of the entire cycle is taken up by it. Add more than one unique tree to a scene and the overhead adds up as well. Pretty soon, you would be staring at a slideshow of trees instead of a smoothly-flowing game world. This is why game developers adopt the copy-paste method: render one or two different types of tree, and then use the same models throughout the land, more or less. The same method is adopted for enemies, blades of grass, and so on.
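A back-of-the-envelope model makes the point. The C++ sketch below is purely illustrative; the cost constants, the CostModel struct, and the tree counts are all assumptions chosen so that per-call overhead works out to the 40 per cent figure quoted above, not measured values:

```cpp
#include <cstdio>

// Toy cost model: every draw call pays a fixed API/driver overhead on
// top of the actual GPU work. The constants are illustrative, not measured.
struct CostModel {
    double apiOverheadPerCall = 0.04;  // ms spent in API + driver per call
    double gpuWorkPerTree     = 0.06;  // ms of real rendering work per tree

    double frameTime(int drawCalls, int treesDrawn) const {
        return drawCalls * apiOverheadPerCall + treesDrawn * gpuWorkPerTree;
    }
};

int main() {
    CostModel m;
    const int trees = 500;

    // One draw call per unique tree: overhead scales with scene variety.
    double unique = m.frameTime(/*drawCalls=*/trees, trees);

    // The "copy-paste" approach: a couple of models reused everywhere,
    // so far fewer calls cross the API boundary.
    double batched = m.frameTime(/*drawCalls=*/2, trees);

    std::printf("500 unique trees : %.1f ms/frame (%.0f%% API overhead)\n",
                unique, 100.0 * trees * m.apiOverheadPerCall / unique);
    std::printf("2 reused models  : %.1f ms/frame (%.0f%% API overhead)\n",
                batched, 100.0 * 2 * m.apiOverheadPerCall / batched);
}
```

With these made-up numbers, 500 unique trees cost 50 ms a frame, 40 per cent of it pure API overhead; reusing two models cuts the same scene to about 30 ms with negligible overhead.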
Reduced API overhead will allow a game developer to add more detail to a title. Seen here: a screenshot from the game Crysis
Doing A 360
DX10 will reduce the API overhead by half. This will give the game more time to talk with the GPU and the rest of the system, which means a game can now pack in more content.
Traditionally, a graphics chipset carries banks of both pixel and vertex shaders. Depending on the scene, the API puts each of these banks to work. Herein lies the problem. Imagine a scene that has a character standing against the sky: the pixel and vertex shaders would be needed equally to render it. Now the character steps inside a car, increasing the workload on the vertex shaders while the pixel shaders enjoy some time off. Let's say the car blows up in the next scene: lots of explosive effects, smoke, fire, debris, and so on. The pixel shaders are now needed more than the vertex shaders, which can sit idle.
What if a card has only four pixel shaders and 12 vertex shaders? What if a scene then requires processing beyond the 12 vertex shaders' capacity? What if a scene is instead unusually heavy on the number of pixels pushed? All these scenarios would of course lead to game slowdown: almost all of us have watched an otherwise smoothly-running game chug to below 10 frames per second at the trigger of an explosion. Here's why: graphics cards have resources dedicated to pixel and vertex tasks, and if the game exceeds the allocated capacity, everything comes down to a terrible frame rate. More damning is the wasted silicon: why are the pixel shaders twiddling their thumbs when the vertex shaders could use some help?
A unified approach to shaders hopes to solve these problems. On a graphics card with a unified shader bank, each shader unit can act as either a pixel shader or a vertex shader, and the card allocates shader resources to the game according to its requirements. In theory, this should make games much faster, as the sketch below illustrates.
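Here is a minimal C++ model of that claim. The unit counts echo the hypothetical 4-pixel/12-vertex card above, and the workload figures are invented for illustration; real GPU scheduling is, of course, far more involved:

```cpp
#include <algorithm>
#include <cstdio>

// Frame time under a fixed split: the two pools work in parallel, so the
// more overloaded pool sets the pace for the whole frame.
double fixedSplit(double vertexWork, double pixelWork,
                  int vertexUnits, int pixelUnits) {
    return std::max(vertexWork / vertexUnits, pixelWork / pixelUnits);
}

// Frame time with unified shaders: every unit can take either kind of
// work, so the total load spreads evenly across the whole bank.
double unified(double vertexWork, double pixelWork, int totalUnits) {
    return (vertexWork + pixelWork) / totalUnits;
}

int main() {
    const int vertexUnits = 12, pixelUnits = 4;

    // Arbitrary work units: a balanced frame, then a pixel-heavy
    // explosion frame full of smoke and fire overdraw.
    struct Frame { const char* name; double vertex, pixel; };
    const Frame frames[] = {
        {"character vs. sky", 24.0, 24.0},
        {"explosion",          8.0, 64.0},
    };

    for (const Frame& f : frames) {
        std::printf("%-18s fixed: %5.2f  unified: %5.2f\n", f.name,
                    fixedSplit(f.vertex, f.pixel, vertexUnits, pixelUnits),
                    unified(f.vertex, f.pixel, vertexUnits + pixelUnits));
    }
}
```

In this toy model the explosion frame finishes more than three times faster on the unified bank, simply because the idle vertex units pick up pixel work instead of twiddling their thumbs.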
In practice, programmers of today will have to unlearn their habits of programming for contemporary architectures before they can truly make use of a unified architecture, a process that will certainly take time. Note that just because DX10 specifies a unified shader model, all DX10 cards need not have unified shaders. In fact, at least for the foreseeable future, NVIDIA will stick to a contemporary architecture, whereas ATI will go the unified route. It will be interesting to see which of these two philosophies has greater merit.
Apart from unified shaders, a GPU under DX10 borrows another trick from the Xbox 360: it can stream data out to the card's memory and then call it back in for further use within the GPU, without having to write to system memory or disk. This greatly speeds things up and will allow for some interesting effects, such as realistic shadows.
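In Direct3D 10 this facility is exposed through the stream-output stage. The C++ fragment below sketches only the call sequence, not a complete program; it assumes a device, suitable shaders, and a buffer created with the stream-output and vertex-buffer bind flags already exist, and the function name is our own invention:

```cpp
#include <d3d10.h>

// Sketch of a Direct3D 10 stream-output round trip. Assumes 'soBuffer'
// was created with D3D10_BIND_STREAM_OUTPUT | D3D10_BIND_VERTEX_BUFFER,
// and that shaders and input layout are already bound.
void streamOutRoundTrip(ID3D10Device* device, ID3D10Buffer* soBuffer,
                        UINT vertexCount, UINT stride) {
    // Pass 1: render as usual, but capture the processed vertices into
    // soBuffer on the card instead of discarding them after the frame.
    UINT soOffset = 0;
    device->SOSetTargets(1, &soBuffer, &soOffset);
    device->Draw(vertexCount, 0);

    // Unbind the stream-output target so the buffer can be read again.
    ID3D10Buffer* nullBuffer = nullptr;
    UINT zero = 0;
    device->SOSetTargets(1, &nullBuffer, &zero);

    // Pass 2: feed the captured vertices straight back into the pipeline.
    UINT vbOffset = 0;
    device->IASetVertexBuffers(0, 1, &soBuffer, &stride, &vbOffset);
    device->DrawAuto();
}
```

The key point is the second pass: DrawAuto() consumes however many vertices the first pass wrote, so the intermediate geometry never leaves the card and the CPU never touches it.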
Under a traditional architecture, some scenes might favour vertex shaders while others favour pixel shaders, leading to inefficient load-sharing.
Sundry Bits
All this discussion has technically been limited to the Direct3D element of DirectX 10. What about the other bits that constitute DirectX? Can we expect major changes there? In a word, no!
One interesting feature does deserve mention, though. With the next DirectInput, Microsoft has decided to blur the lines between its console offerings and the PC: you can take any Xbox 360 input device, be it a gamepad or a racing wheel, and use it under Windows. This is currently possible under Windows XP as well, and is thus not a feature exclusive to DX10; it is, however, an evolution of the DirectInput constituent. DirectSound does not see any major changes.
The thread arbiter under a unified architecture ensures that the unified shader bank is well-used by all shader operations.
As you might know, under Vista, each window is a 3D surface. A graphics card under Vista is thus always called for, whether or not you're playing a game. Moreover, with the advent of high-definition video content, more and more 2D acceleration tasks will be offloaded to the GPU. The per-pixel shading capabilities of a GPU will be utilised to present everything from fancy transparent windows to smooth 720p video playback. Similarly, animations such as windows flipping to the foreground or cascading behind each other will be accelerated by the GPU. DX10 will thus expose the hardware to more and more applications, much as happened with the CPU as it evolved.
Having said that, the user interface of Vista does not require DX10: the Aero Glass interface, as it's called, runs under a variant of DirectX 9. This was done to ensure that the development of Vista was not slowed down by the development of DX10 (the two were concurrent efforts). Elements of DX9 used in Vista are also seen under DX10, and in the future, the entire interface might move to a DX10 environment, once DX10 reaches maturity and is deemed stable enough.
Windows Vista will thus ship with both DirectX 9.0 and 10. DirectX 10 is currently slated to be a feature exclusive to Windows Vista.
By all indications, DX10 video cards will hit the half-a-billion transistor count, and will be so power-hungry that they will require dedicated power supplies. On the positive side, DX10 video cards will be virtually identical in the features they offer; a DX10 card will have little to differentiate itself from a similar offering. This has been done largely at the behest of game developers, who are tired of programming for esoteric feature sets (remember ATI's TruForm?). It is a huge plus for gamers as well: with DX9-class cards, one had to worry whether the card supported Shader Model 2.0 or 3.0; concerns of that nature need no longer send us scurrying to the nearest search engine, or to our friendly neighbourhood 3D guru.
DirectX 10 might not bring a real-time Toy Story to our monitors, and it will certainly not bring about graphics comparable to the real world, but it is a vital step towards those horizons.