I have a portion of a calculation that turns a triplet of double-precision coordinates (x, y, z) into an integer index. To do this I use an integer cast of the form int cx = (int)(x/dx);. In testing, however, this seems much slower than I would expect; the division alone is much faster.
Is there a reason why this operation takes so long? Is there a way of making it faster?
I don’t think there’s nearly enough information here to answer (C? C++? Which compiler? Compiler flags? Architecture?), but here’s a test case that contradicts the idea that casting is much slower. I do see that the first calculation takes marginally longer than the second one, possibly due to caching.
#include <stdio.h>
#include <omp.h>

int main(int argc, char *argv[]) {
    double x, dx, dcx, s, e1, e2;
    int icx;

    x = 3.1;
    dx = 0.1;

    s = omp_get_wtime(); icx = (int)(x/dx); e1 = omp_get_wtime()-s;
    printf("%f with casting\n", e1);

    s = omp_get_wtime(); dcx = (x/dx); e2 = omp_get_wtime()-s;
    printf("%f without casting\n", e2);

    return 0;
}
When compiled with gcc -o cast_test cast_test.c -fopenmp using gcc 4.8.5 on Red Hat with an Intel Xeon E5-2680v4 CPU, I get an elapsed time of 0.000001 seconds for the first calculation and 0.000000 for the second, regardless of whether the casted calculation comes first or second.
So perhaps my test case or environment deviates in some way from the one with the observed behavior.
It’s a bit long to summarize in ask.ci, but the essence is that a cast implies moving a value from one register file to another, from a floating-point register to an integer register. In practice that move, and on some older implementations an accompanying change of the FPU rounding mode, can create a performance hit.