I did some timing tests and also read some articles like this one (last comment), and it looks like in Release build, float and double values take the same amount of processing time.
How is this possible? When float is less precise and smaller compared to double values, how can the CLR get doubles into the same processing time?
On x86 processors, at least,
double will each be converted to a 10-byte real by the FPU for processing. The FPU doesn't have separate processing units for the different floating-point types it supports.
The age-old advice that
float is faster than
double applied 100 years ago when most CPUs didn't have built-in FPUs (and few people had separate FPU chips), so most floating-point manipulation was done in software. On these machines (which were powered by steam generated by the lava pits), it was faster to use
floats. Now the only real benefit to
floats is that they take up less space (which only matters if you have millions of them).
I had a small project where I used CUDA and I can remember that float was faster than double there, too. For once the traffic between Host and Device is lower (Host is the CPU and the "normal" RAM and Device is the GPU and the corresponding RAM there). But even if the data resides on the Device all the time it's slower. I think I read somewhere that this has changed recently or is supposed to change with the next generation, but I'm not sure.
So it seems that the GPU simply can't handle double precision natively in those cases, which would also explain why GLFloat is usually used rather than GLDouble.
(As I said it's only as far as I can remember, just stumbled upon this while searching for float vs. double on a CPU.)