At NVIDIA’s SIGGRAPH 2018 keynote, CEO Jensen Huang officially announced the upcoming Turing GPU architecture, which will power NVIDIA’s Quadro line-up of professional graphics cards. The GeForce launch is rumoured to take place a week later at Gamescom 2018. Today’s announcement covers the new features coming to the professional Turing line-up, i.e. the Quadro cards. Hybrid rendering, ray tracing and neural networking, driven by the new RT Cores and Tensor Cores, were the highlights of Huang’s keynote presentation.
Today’s announcement is centred on three Quadro SKUs: the NVIDIA Quadro RTX 8000, the Quadro RTX 6000 and the Quadro RTX 5000. All three carry the new RTX branding, which signals how hard NVIDIA intends to push the ray tracing capabilities of these new Turing GPUs. The cards bring hardware-accelerated ray tracing, AI, advanced shading and simulation to the table.
At the top of the stack is the NVIDIA Quadro RTX 8000, which has a whopping 48 GB of GDDR6 memory and packs 4,608 CUDA cores along with 576 Tensor Cores. On the face of it, the NVIDIA Quadro RTX 6000 is a minor downgrade, with only the memory brought down to 24 GB. The Quadro RTX 5000, however, seems to be a different GPU altogether, with different core counts from the RTX 8000 and RTX 6000 and an even smaller memory buffer of 16 GB.
The new Turing GPU architecture from NVIDIA includes three distinct types of cores: RT (Ray Tracing) Cores, CUDA Cores and Tensor Cores. The RT Cores enable real-time ray tracing of the kind demonstrated earlier this year with the Star Wars technology demo, shown prior to the release of the movie. They can render objects and environments with physically accurate shadows, reflections and global illumination.
Then there are the Tensor Cores, which have been around since Volta and help with deep learning, covering both AI training and inferencing. The 4,608 CUDA cores in the Turing GPU that powers the Quadro RTX 8000 and 6000 can churn out 16 trillion floating point operations per second in parallel with 16 trillion integer operations per second.
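As a rough sanity check on those figures, the quoted throughput can be turned into an implied clock speed. The sketch below assumes each CUDA core completes one fused multiply-add per clock, counted as two floating point operations; that assumption is not stated in the announcement, and the function name is purely illustrative.

```python
# Back-of-envelope: what clock speed do "16 TFLOPS from 4,608 CUDA cores" imply?
# Assumption (not from the announcement): one fused multiply-add per core per
# clock, counted as 2 floating point operations.

CUDA_CORES = 4608              # quoted for the Quadro RTX 8000 / 6000 GPU
PEAK_FLOPS = 16e12             # 16 trillion FP32 operations per second
FLOPS_PER_CORE_PER_CLOCK = 2   # assumed: 1 FMA = 2 FLOPs


def implied_clock_hz(peak_flops, cores, flops_per_core_per_clock):
    """Clock frequency needed to reach peak_flops with the given core count."""
    return peak_flops / (cores * flops_per_core_per_clock)


clock_ghz = implied_clock_hz(PEAK_FLOPS, CUDA_CORES, FLOPS_PER_CORE_PER_CLOCK) / 1e9
print(f"Implied clock: {clock_ghz:.2f} GHz")
```

Under that assumption the quoted numbers work out to a clock of a little over 1.7 GHz, which should be read as an estimate rather than an official specification.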
The Quadro RTX 8000, 6000 and 5000 are also the first NVIDIA GPUs to use Samsung’s new ultra-fast 16Gb GDDR6 memory chips. Using the NVLink connector, two Quadro RTX graphics cards can be combined to double the memory capacity and deliver up to 100 GB/s of data transfer between the two cards.
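The memory figures can be checked with simple arithmetic. The sketch below assumes each card's entire frame buffer is built from 16Gb (gigabit) chips and that an NVLink pair simply pools the memory of two identical cards; the helper names are illustrative, and the chip counts are derived values, not numbers NVIDIA announced.

```python
# Rough memory arithmetic for the three Quadro RTX cards.
# Assumptions: every chip is 16 Gb (gigabit), and an NVLink pair pools the
# memory of two identical cards.

CHIP_DENSITY_GBIT = 16
BITS_PER_BYTE = 8

CARD_MEMORY_GB = {
    "Quadro RTX 8000": 48,
    "Quadro RTX 6000": 24,
    "Quadro RTX 5000": 16,
}


def chips_needed(capacity_gb, chip_density_gbit=CHIP_DENSITY_GBIT):
    """Number of memory chips needed to reach capacity_gb gigabytes."""
    return capacity_gb * BITS_PER_BYTE // chip_density_gbit


def nvlink_pair_capacity_gb(capacity_gb):
    """Combined memory of two identical cards joined over NVLink."""
    return 2 * capacity_gb


for card, capacity in CARD_MEMORY_GB.items():
    print(f"{card}: {chips_needed(capacity)} chips, "
          f"{nvlink_pair_capacity_gb(capacity)} GB as an NVLink pair")
```

On these assumptions a pair of Quadro RTX 8000 cards would present 96 GB of combined memory.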
NVIDIA also announced software improvements with new and existing technologies such as Variable Rate Shading, Multi-View Rendering and VRWorks Audio.
At the SIGGRAPH announcement, NVIDIA did mention some performance numbers that put the new Quadro RTX cards into perspective. Compared to the Pascal-based Quadro cards, the new Turing-based Quadro RTX cards are capable of a lot more, owing to the new RT and Tensor Cores. Looking at the CUDA Cores, which handle shading and compute, Turing is capable of 16 TFLOPS + 16 TIPS (trillion integer operations per second) as opposed to Pascal’s 13 TFLOPS of FP32. The Tensor Cores enable 125 TFLOPS of FP16, 250 TOPS of INT8 and 500 TOPS of INT4. Lastly, the new RT Cores are capable of tracing 10 Giga Rays per second. These figures are for the GPU powering the Quadro RTX 8000, which was compared against the Pascal-based P6000. The Volta-based GV100 sits between the two, and there doesn’t appear to be that big an improvement if you consider just the half-precision calculations. Throw in the massive bandwidth enabled by GDDR6 and the two start to differ significantly.
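To put the 10 Giga Rays per second figure into perspective, it can be converted into a ray budget per pixel per frame. The resolutions and the 60 frames per second target below are assumptions chosen for illustration, not figures from the keynote, and the calculation treats the quoted number as a sustained rate, which real workloads may not reach.

```python
# How many rays per pixel per frame does 10 Giga Rays/second allow?
# Assumptions: the quoted rate is sustained, and the resolutions and frame
# rate below are illustrative targets, not figures from NVIDIA.

RAYS_PER_SECOND = 10e9  # quoted peak for the Quadro RTX 8000's GPU


def rays_per_pixel(width, height, fps, rays_per_second=RAYS_PER_SECOND):
    """Average number of rays available for each pixel in each frame."""
    return rays_per_second / (width * height * fps)


budget_4k = rays_per_pixel(3840, 2160, 60)
budget_1080p = rays_per_pixel(1920, 1080, 60)

print(f"4K at 60 fps:    {budget_4k:.1f} rays per pixel per frame")
print(f"1080p at 60 fps: {budget_1080p:.1f} rays per pixel per frame")
```

At 4K and 60 frames per second this comes to roughly 20 rays per pixel per frame, which helps explain why the keynote paired ray tracing with rasterisation and AI in a hybrid rendering approach.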
The initial set of three Quadro RTX graphics cards based on the newly confirmed NVIDIA Turing architecture will be commercially available in Q4 2018. The top-of-the-line Quadro RTX 8000 is priced at $10,000. That’s quite a premium to pay for the new RT Cores.