# Difference between GPU & CPU?



## Gauravs90 (Nov 14, 2009)

Hi guys

I have heard that GPUs are more powerful than CPUs, and also that GPUs consume more power than CPUs.

But why do they have lower clock speeds than CPUs?


----------



## r4gs (Nov 15, 2009)

I'm not really an expert, but clock speed has nothing to do with performance.
A Pentium 4 @ 3GHz is much less powerful than a Core2Duo @ 1.4GHz.

Multiple operations running in parallel are more efficient than a single thread.

The GPU is more powerful, but it can never be used in the same way a CPU is used. I think it has something to do with it not having cache memory comparable to the CPU's.

If you have seen the PS3, its 'Cell' processor is extremely powerful, with 9 cores running at 3.2GHz, but because of its architecture it can only deliver that performance with correctly coded applications. It is similar to why you cannot just install Windows and run games on a supercomputer.

The XBOX 360 has a 3-core processor, also at 3.2GHz, but because it is easier to code games for it, a lot of them look better on the XBOX 360 than on the PS3.

So basically, the performance of any processor does not depend only on its clock speed. It depends heavily on its design architecture and on how and what applications it is designed to run.

My memory is a bit hazy, but if you go through some of the old Digit archives, I believe this same question was covered in an old issue of the magazine.


----------



## hell_storm2006 (Nov 16, 2009)

Google for this basic information!


----------



## Gauravs90 (Nov 16, 2009)

^^ I googled but didn't find any information.

At least, I didn't know the right keyword.

Thanks r4gs for the information. So it's mainly a difference in architecture.


----------



## hell_storm2006 (Nov 16, 2009)

Just understand the architecture. Although they are both processors, they are meant for completely different sets of tasks and hence behave differently. All the things you mentioned, like clock speed, are a result of the tasks they perform and their architecture.


----------



## r4gs (Nov 16, 2009)

Each type of processor is designed for a particular task, so yes, it depends a lot on architecture.

The Cell processor on the PS3, for example, is designed to do tasks in a sequence. If it misses a step, it messes up the output. A normal CPU will multi-task, assigning a certain number of cycles to each operation.

The efficiency with which a processor does its assigned task affects its performance, not just the clock speed.

A 5GHz quad core will still not give the same graphics capability as a 1.2GHz GPU from NVIDIA, while the same GPU cannot do what a CPU does with the same efficiency.


----------



## thetechfreak (Nov 22, 2009)

A CPU is a processor, for example Intel Core i7, AMD X4 BE, etc.
A GPU is also known as a graphics card, e.g. GeForce GTX 295, 9800 GT, 9400 GT, etc.


----------



## Gauravs90 (Nov 22, 2009)

^^^^^ I'm not a total noob.

I also know that and only want to know the technical differences.

By the way, I'm eager to know more.


----------



## infra_red_dude (Nov 23, 2009)

r4gs said:


> I'm not really an expert, but clock speed has nothing to do with performance.
> A Pentium 4 @ 3GHz is much less powerful than a Core2Duo @ 1.4GHz.
> 
> Multiple operations running in parallel are more efficient than a single thread.
> ...


Thumbs up!  Very correct!

The only correction is that the GPU does have a cache, but it's not entirely accessible at the system level. Also, it is more of a SIMT processor, where multiple threads (streams of data/instructions) are operated upon by a single instruction. For example: add 100 vectors. The instruction is the same (add), while the data operands are many (100 vectors). On a conventional CPU, there has to be a separate add instruction for each data operand, 100 in all. So in a way GPUs are like vector processors (like the early Cray machines).
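
A rough Python sketch of the "add 100 vectors" point, for illustration only. The function names and the idea of counting "issued" instructions are assumptions made up for this sketch; Python itself runs everything one step at a time, so this only models the bookkeeping, not real hardware.

```python
# Toy model: count how many add instructions are issued for 100 operand pairs.
# scalar_add and simt_add are hypothetical names, not any real API.

def scalar_add(a, b):
    """Conventional scalar CPU model: one add instruction per operand pair."""
    issued = 0
    out = []
    for x, y in zip(a, b):
        out.append(x + y)
        issued += 1          # a separate add is issued for every pair
    return out, issued

def simt_add(a, b):
    """SIMT model: the add is issued once and applied across all lanes."""
    issued = 1               # a single instruction covers every lane
    out = [x + y for x, y in zip(a, b)]  # conceptually, all lanes run together
    return out, issued

a = list(range(100))
b = list(range(100, 200))
scalar_out, scalar_issued = scalar_add(a, b)
simt_out, simt_issued = simt_add(a, b)
print(scalar_issued, simt_issued)
```

Both versions produce the same sums; only the number of instructions issued differs, which is the whole point of SIMT.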



hell_storm2006 said:


> Just understand the architecture. Although they are both processors, they are meant for completely different sets of tasks and hence behave differently. All the things you mentioned, like clock speed, are a result of the tasks they perform and their architecture.


Yes, the two are architected differently. While a CPU is mainly for scalar computation with some vector enhancements (like MMX, 3DNow!, SSE, AltiVec etc.), a GPU is inherently a vector processor because 3D graphics is all vectors, dot products, cross products, matrix multiplications etc.



r4gs said:


> The Cell processor on the PS3, for example, is designed to do tasks in a sequence. If it misses a step, it messes up the output. A normal CPU will multi-task, assigning a certain number of cycles to each operation.


This is not true. The Cell has 1 scalar unit (the PPE) and 8 vector units (the SPEs, 2 of which are unavailable to PS3 games: one is disabled and one is reserved for the system). The Cell can efficiently do the same things a normal CPU can. IBM sells blade servers which are built with Cell processors.



thetechfreak said:


> A CPU is a processor, for example Intel Core i7, AMD X4 BE, etc.
> A GPU is also known as a graphics card, e.g. GeForce GTX 295, 9800 GT, 9400 GT, etc.


A GPU is not a graphics card; it is the processor on the card. Anything which performs computation is a processor. Both the GPU and the CPU ARE processors!



Gauravs90 said:


> ^^^^^ I'm not a total noob.
> 
> I also know that and only want to know the technical differences.


The technical difference is that on a GPU more emphasis is placed on the actual computation (execution) units than on the control logic (cache controller, memory controller, branch predictors etc.). While a CPU may have 4 or 8 cores, a desktop-class GPU has 100s of cores for computation. Since GPUs are SIMT processors (single instruction, multiple threads), the control logic occupies a smaller percentage of the chip area than on a CPU. A CPU is more of a scalar unit, whereas a GPU has 10 or many more pipelines which are all executing the same instruction on different streams of data.

Let's say you want to increase the alpha (transparency) of 10,000 pixels. On a CPU (assuming no SIMD support) you will need to execute the Change_Alpha instruction 10,000 times, fetching each pixel and instruction in a different clock cycle. Assuming it takes 1 cycle to execute the instruction, the whole operation will take 10,000 cycles to complete. On a SIMD CPU (Intel SSE etc.), you can apply 1 instruction to, say, 10 pixels. So you will need to fetch the instruction and the pixels 10,000/10 = 1,000 times, taking 1,000 cycles to complete the operation. This is because the data width is small (say, only 10 pixels).

However, on a GPU with 5,000 execution units, the instruction is fetched only once, and to keep the execution units busy the GPU has huge memory bandwidth (look at the GTX 295, it has a freaking memory bandwidth of 224 GB/s!!!). So 5,000 pixels are fetched in the first cycle and the next 5,000 pixels in the next cycle. So theoretically the operation would take only 2 cycles! See the difference???!!!
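
The cycle counts in the pixel example can be checked with a few lines of Python. This is only the idealised arithmetic from the example (1 cycle per instruction, no memory stalls); `cycles_needed` is a made-up helper name, and the lane counts 1, 10 and 5,000 are the ones assumed above, not figures for any real chip.

```python
def cycles_needed(items, lanes, cycles_per_instruction=1):
    """Idealised cycle count: each pass handles `lanes` items at once."""
    passes = -(-items // lanes)   # integer ceiling division
    return passes * cycles_per_instruction

pixels = 10_000
scalar_cycles = cycles_needed(pixels, lanes=1)      # plain CPU, no SIMD
simd_cycles = cycles_needed(pixels, lanes=10)       # SIMD CPU, 10 pixels wide
gpu_cycles = cycles_needed(pixels, lanes=5_000)     # GPU with 5,000 units

print(scalar_cycles, simd_cycles, gpu_cycles)  # prints: 10000 1000 2
```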

The problem, however, is that for programs which are not data-parallel (same instruction for multiple data/threads), a GPU will need to flush its pipeline and load a new instruction every cycle. Also, since the data set is smaller, only 10 execution units may be executing the instruction while the remaining 4,990 sit idle. Since most GPUs don't have speculative execution (branch predictors etc. for your 'if' then 'else' statements), a GPU is slower than a CPU in such cases.
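
To put a number on the idle units, here is a small sketch using the same hypothetical 5,000-unit GPU. The `utilization` helper is an assumption for illustration; it just computes what fraction of the units has work in a single pass.

```python
def utilization(work_items, execution_units):
    """Fraction of execution units that have something to do in one pass."""
    busy = min(work_items, execution_units)
    return busy / execution_units

UNITS = 5_000                      # hypothetical GPU from the example above
small_job = utilization(10, UNITS)      # only 10 independent items available
big_job = utilization(10_000, UNITS)    # plenty of data-parallel work

print(f"small job: {small_job:.1%} busy, {UNITS - 10} units idle")
print(f"big job:   {big_job:.1%} busy")
```

With only 10 items of work, 99.8% of the chip is doing nothing, so the GPU's width buys almost nothing on that kind of workload.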

A GPU may have fewer pipeline stages than a CPU, so each stage does more useful work and each cycle is longer than a CPU's; hence the lower clock speed.
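
The link between pipeline depth and clock speed can be sketched with a first-order textbook model: one clock cycle has to fit one stage's worth of logic plus a fixed latch overhead. Every number below is made up for illustration and does not describe any real CPU or GPU.

```python
# First-order model with assumed numbers; not measurements of real hardware.
TOTAL_LOGIC_NS = 10.0     # assumed delay of all the logic an instruction passes through
LATCH_OVERHEAD_NS = 0.1   # assumed fixed cost added by each pipeline register

def stage_logic_ns(stages):
    """Useful logic delay packed into one pipeline stage."""
    return TOTAL_LOGIC_NS / stages

def clock_ghz(stages):
    """Highest clock the model allows: a cycle must fit one stage plus its latch."""
    period_ns = stage_logic_ns(stages) + LATCH_OVERHEAD_NS
    return 1.0 / period_ns

for stages in (10, 30):
    print(f"{stages} stages: {stage_logic_ns(stages):.2f} ns of logic per cycle, "
          f"about {clock_ghz(stages):.2f} GHz")
```

In this model the shallower pipeline does more work per cycle but tops out at a lower clock, which is the trade-off described above.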

But with the advent of nVidia Fermi, ATi 5950 and with GPGPU layers like OpenCL and CUDA this thin line is blurring. We are seeing more CPU-like GPUs these days and more GPU-like CPUs!


----------



## Gauravs90 (Nov 23, 2009)

Thanks man!!!!!

According to your post, does a GPU with higher memory bandwidth and a higher clock speed perform better?


----------



## infra_red_dude (Nov 24, 2009)

Gauravs90 said:


> Thanks man!!!!!
> 
> According to your post, does a GPU with higher memory bandwidth and a higher clock speed perform better?


What are you comparing them against?


----------



## Gauravs90 (Nov 24, 2009)

^^^^^^^^ yes


----------



## infra_red_dude (Nov 24, 2009)

Gauravs90 said:


> ^^^^^^^^ yes


Yes... what? What are you comparing the GPU with when you say they are faster?


----------



## Gauravs90 (Nov 24, 2009)

I mean, say card A has a higher clock speed and higher memory bandwidth than card B. Is card A then necessarily better than card B?


----------



## Krow (Nov 24, 2009)

In terms of FPS, yes.


----------



## infra_red_dude (Nov 25, 2009)

Gauravs90 said:


> I mean, say card A has a higher clock speed and higher memory bandwidth than card B. Is card A then necessarily better than card B?


NO!

As in the case of CPUs, it also depends on the number of processing elements (or shaders). For example, have a look at this:

http://www.anandtech.com/video/showdoc.aspx?i=3658

You can see that the clock speed of the ATi 4870 is 750MHz while that of the ATi 5850 is only 725MHz. But observe that the number of stream processors on the latter is almost 1.75 times that of the former. So you cannot simply say that a higher clock speed and higher bandwidth will give better performance. The same is true of CPUs (e.g., a 2.66GHz Core i7 performs better than a 2.8GHz Core 2 Quad or a 3.06GHz Core 2 Duo).

If you look at the test results, only when you pair up 2 ATi 4870s (x2) are they able to beat a single-GPU ATi 5850 card. A lot depends on the architecture and related factors.

But if you compare 2 GPUs with the same architecture and the same number of shader cores, and keep everything the same except the clock speed and the memory bandwidth, then yes... the one with the higher clock and memory bandwidth is faster.
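
As a back-of-the-envelope check, peak shader throughput scales roughly with (number of stream processors) x (clock speed). The sketch below uses only the figures quoted in this post (about 1.75 times the stream processors, 725MHz vs 750MHz). `relative_throughput` is a made-up helper, and real game performance also depends on architecture, memory bandwidth and drivers, so treat the result as a rough upper-level estimate only.

```python
def relative_throughput(units_ratio, clock_a_mhz, clock_b_mhz):
    """Rough peak-throughput ratio of card A over card B: units x clock."""
    return units_ratio * (clock_a_mhz / clock_b_mhz)

# ATi 5850 vs ATi 4870, using the approximate figures from the post
hd5850_vs_hd4870 = relative_throughput(1.75, 725, 750)

# Same architecture and shader count, only the clock differs (hypothetical clocks)
same_chip_faster_clock = relative_throughput(1.0, 800, 750)

print(f"5850 vs 4870: about {hd5850_vs_hd4870:.2f}x")
print(f"same chip, higher clock: about {same_chip_faster_clock:.2f}x")
```

The lower-clocked card still comes out well ahead because the extra stream processors outweigh the small clock deficit, while with everything else equal the higher clock wins.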


----------



## r4gs (Nov 25, 2009)

@infra_red_dude:- Succinctly put. I think that about sums up the difference.
I remember reading somewhere, probably in Digit, that the PS3's Cell processor processes data sequentially, i.e. that it doesn't proceed until the previous instruction is over. I'll check up on it though. Thanks for clearing it up.


----------



## infra_red_dude (Nov 25, 2009)

r4gs said:


> @infra_red_dude:- Succinctly put. I think that about sums up the difference.
> I remember reading somewhere, probably in Digit, that the PS3's Cell processor processes data sequentially, i.e. that it doesn't proceed until the previous instruction is over. I'll check up on it though. Thanks for clearing it up.


I think you are confusing sequential (vs. parallel) processing with in-order (vs. out-of-order) execution. The Cell's PPE is an in-order, simultaneous multi-threaded (SMT, same as Intel's Hyper-Threading) processor. In fact, even the 8 SPEs are in-order but parallel processors (more like vector processors, while the PPE is more like a scalar processor).
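
A toy Python model of the in-order vs. out-of-order distinction. Both "cores" below run a single instruction stream and start at most one instruction per cycle; the only difference is whether a later, independent instruction may start while an earlier one is still waiting for its input. The four-instruction program and its latencies are invented for this sketch and do not model the Cell's PPE or any real core.

```python
# Each instruction: (name, latency in cycles, names it depends on). Hypothetical program.
PROGRAM = [
    ("load_a", 4, []),
    ("use_a", 1, ["load_a"]),
    ("load_b", 4, []),
    ("use_b", 1, ["load_b"]),
]

def run(program, in_order):
    """Single-issue core model. Returns the cycle when the last result is ready."""
    done_at = {}                 # name -> cycle at which its result is ready
    waiting = list(program)
    cycle = 0
    while waiting:
        # In-order: only the oldest waiting instruction may start.
        # Out-of-order: the oldest waiting instruction whose inputs are ready may start.
        candidates = waiting[:1] if in_order else list(waiting)
        for instr in candidates:
            name, latency, deps = instr
            if all(d in done_at and done_at[d] <= cycle for d in deps):
                done_at[name] = cycle + latency
                waiting.remove(instr)
                break            # at most one instruction starts per cycle
        cycle += 1
    return max(done_at.values())

in_order_cycles = run(PROGRAM, in_order=True)
out_of_order_cycles = run(PROGRAM, in_order=False)
print(in_order_cycles, out_of_order_cycles)  # prints: 10 6
```

The in-order core stalls behind `use_a` while `load_a` is in flight, whereas the out-of-order core starts `load_b` in the meantime. Neither one is "parallel" in the multi-core sense.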


----------



## r4gs (Nov 25, 2009)

You seem to have hit the nail on the head. Thanks.


----------



## hell_storm2006 (Nov 25, 2009)

The difference between CPU and GPU will narrow when Larrabee is released. If it is released at all!


----------

