Originally Posted by ranthal
Maybe it would be better to create a 64-bit buffer of u8, memset it to 0, memcpy the 48-bit value to &buf, then memcpy buf into u64_val?
What architecture are you doing this on? Or do you intend the code to be portable across architectures?
On i386 and x86_64, values are stored least significant byte first. So the least significant byte of u8_val in your code has to go to the first byte of buf, the next least significant byte to the second byte of buf, and so on up to the most significant byte of u8_val, which goes to the sixth byte of buf.
Your idea, which I quoted above, assumes the opposite. That would be correct on some less common architectures.
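To make the byte order point concrete, here is a rough sketch of the buffer idea adjusted for a little-endian host. It is only an illustration under assumptions: I am assuming u8_val[0] holds the most significant byte of the 48-bit value, and the function name u48_to_u64_buf_le is made up for this example. It gives the intended result only on little-endian machines such as i386 and x86_64.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: buffer-and-memcpy approach, valid only on a
 * little-endian host. Assumes u8_val[0] is the most significant byte. */
uint64_t u48_to_u64_buf_le(const uint8_t u8_val[6])
{
    uint8_t buf[8] = {0};          /* the two high bytes stay zero */
    uint64_t u64_val;
    int i;

    /* Reverse the order: least significant byte goes to buf[0]. */
    for (i = 0; i < 6; i++)
        buf[i] = u8_val[5 - i];

    memcpy(&u64_val, buf, sizeof u64_val);
    return u64_val;
}
```

Note the extra reversing loop: a plain memcpy of u8_val into buf would put the bytes in the wrong order on these machines.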
The fact that you can easily get details like that wrong is even more reason to trust the optimizer and code just what you want done, not some alternative you think might be faster.
I'm usually on the other side of that argument. I push a little harder than other programmers for best speed in almost everything I code. I totally reject the often quoted nonsense about "premature optimization". But the compiler may do surprisingly well optimizing out the excess work if you code this the robust way, and even I can't think of an alternative that would clearly result in faster code. Your idea, properly corrected, probably would just make it slower.
If I really, really needed fast code in this situation, I'd write the robust version, then look at the generated asm code, then decide what to do about it. But I assume you don't know enough asm coding for that to help.
Probably the robust code with all the casts and shifts is your best bet.
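For reference, the robust version with casts and shifts might look roughly like the sketch below. I am assuming u8_val is a 6-byte array with the most significant byte in u8_val[0]; if your layout is the reverse, flip the indices. The function name u48_to_u64 is just a placeholder.

```c
#include <stdint.h>

/* Hypothetical helper: builds the value arithmetically, so the result
 * does not depend on the byte order of the host.
 * Assumes u8_val[0] is the most significant byte of the 48-bit value. */
uint64_t u48_to_u64(const uint8_t u8_val[6])
{
    /* Cast each byte to 64 bits before shifting so no bits are lost. */
    return ((uint64_t)u8_val[0] << 40) |
           ((uint64_t)u8_val[1] << 32) |
           ((uint64_t)u8_val[2] << 24) |
           ((uint64_t)u8_val[3] << 16) |
           ((uint64_t)u8_val[4] <<  8) |
            (uint64_t)u8_val[5];
}
```

Because the shifts work on values rather than on memory layout, the same source works unchanged on big-endian and little-endian machines.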