LinuxQuestions.org - Bit shifting 6-byte value into a u64

Bit shifting 6-byte value into a u64

Hey all,

I'm getting a compiler warning stating that the left shift count is >= the width of the type about code that is bit shifting a 6-byte (8 bit/byte) value into a u64. Is there a better way to perform this operation? Currently the code is:

Code:

u64_val = ((u8_val[0] << 40) | \
          (u8_val[1] << 32) | \
          (u8_val[2] << 24) | \
          (u8_val[3] << 16) | \
          (u8_val[4] << 8) | \
          (u8_val[5]));

Essentially I want the extra 16 bits to be zero-padded, since both values represent the same 48-bit unsigned integer.

I'm not sure whether there is a better way than correcting the code you have. But I understand the bug in your code that you seem to not understand.

Take a simpler example

u64_val = u8_val << 40;

The language standard does not care about the destination of the '=' when it is interpreting the '<<'. The data types involved in that '<<' are based just on its operands. The u8 operand is promoted to int, which is 32 bits on most platforms, so you are doing a 40-bit left shift in a type that has fewer than 40 bits, which is not what you intend.
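A small sketch of that point, in case it helps to see it. The u8/u64 typedefs are user-space stand-ins for the kernel types and the two helper names are made up for illustration. sizeof does not evaluate its operand, so this only reports the type the shift would be carried out in.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t  u8;   /* stand-ins for the kernel types (assumption) */
typedef uint64_t u64;

/* Width, in bytes, of the type a shift of a plain u8 is performed in.
 * The u8 is promoted to int, no matter where the result is stored later. */
static size_t shift_width_u8(u8 b)
{
    return sizeof(b << 1);
}

/* Same shift with the operand cast first: now performed in 64 bits. */
static size_t shift_width_u64(u8 b)
{
    return sizeof((u64)b << 1);
}
```

On a platform with 32-bit int the first helper should report 4 and the second 8, which is why a shift count of 40 only makes sense in the second form.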

To make the result of the left shift have 64 bits, you need to cast the input to have 64 bits before the shift. Maybe:

u64_val = (unsigned long long)u8_val << 40;

I understand you probably don't want all the excess processing that might imply (cast each 8-bit value to a 64-bit value, shift all of those, and OR them all together). But in such cases there might not be any good alternative to simply coding it that way and trusting the optimizer to throw away most of the excess work.
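Applied to all six bytes, the cast version might look like the sketch below. The function name pack48_be is hypothetical, and the typedefs again stand in for the kernel's u8/u64.

```c
#include <stdint.h>

typedef uint8_t  u8;   /* stand-ins for the kernel types (assumption) */
typedef uint64_t u64;

/* Hypothetical helper: combine 6 bytes, most significant first, into a
 * u64. Every byte is widened to 64 bits before it is shifted, so no
 * shift count reaches the width of the operand type. */
static u64 pack48_be(const u8 b[6])
{
    return ((u64)b[0] << 40) |
           ((u64)b[1] << 32) |
           ((u64)b[2] << 24) |
           ((u64)b[3] << 16) |
           ((u64)b[4] <<  8) |
            (u64)b[5];
}
```

Casting the lower bytes as well is deliberate. Without the cast, a byte of 0x80 or more shifted left by 24 does not fit in a 32-bit signed int, so that term can go wrong too even though the compiler does not warn about it.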

Quote:

Originally Posted by johnsfine (Post 3599928)

But I understand the bug in your code that you seem to not understand.

Take a simpler example

u64_val = u8_val << 40;

The language standard does not care about the destination of the '=' when it is interpreting the '<<'. The data types involved in that '<<' are based just on its operands. The u8 operand is promoted to int, which is 32 bits on most platforms, so you are doing a 40-bit left shift in a type that has fewer than 40 bits, which is not what you intend.

Ah, I think you got it right. I was worried there was some sort of architecture dependency on the u64 type. It looks like the shift was instead being done in a 32-bit type. So I cast it like you mentioned and it got rid of the warning. I am a little concerned about the overhead involved here, so maybe there are some other suggestions?

Maybe it would be better to create a 64-bit buffer of u8, memset it to 0, memcpy the 48-bit value to &buf[2], then memcpy buf into u64_val?

Quote:

Originally Posted by ranthal (Post 3599956)

Maybe it would be better to create a 64-bit buffer of u8, memset it to 0, memcpy the 48-bit value to &buf[2], then memcpy buf into u64_val?

What architecture are you doing this on? Or do you intend the code to be portable across architectures?

In i386 and x86_64 architectures, values are stored least significant byte first. So the least significant byte (coming from u8_val[5] in your code) goes to buf[0] and the next least significant (from u8_val[4]) goes to buf[1] and up to the most significant from u8_val[0] to buf[5].

Your idea, which I quoted above, assumes the opposite (most significant byte first). That would be correct only on big-endian architectures, which are less common.

The fact that you can easily get details like that wrong is even more reason to trust the optimizer and code just what you want done, not some alternative you think might be faster.

I'm usually on the other side of that argument. I push a little harder than other programmers for best speed in almost everything I code. I totally reject the often quoted nonsense about "premature optimization". But the compiler may do surprisingly well optimizing out the excess work if you code this the robust way, and even I can't think of an alternative that would clearly result in faster code. Your idea, properly corrected, probably would just make it slower.

If I really really needed fast code in this situation, I'd write the robust version, then look at the generated asm code, then decide what to do about it. But I assume you don't know enough asm coding for that to help.

Probably the robust code with all the casts and shifts is your best bet.
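For comparison, here is a rough sketch of what the memcpy idea has to do once byte order is taken into account. The names host_is_little_endian and pack48_memcpy are made up for this example, and it assumes the host is either plain little-endian or plain big-endian.

```c
#include <stdint.h>
#include <string.h>

typedef uint8_t  u8;   /* stand-ins for the kernel types (assumption) */
typedef uint64_t u64;

/* Returns 1 if the lowest-addressed byte of a u64 is its least
 * significant byte, 0 otherwise. */
static int host_is_little_endian(void)
{
    const u64 probe = 1;
    u8 first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* memcpy-style packing of 6 bytes (most significant first) into a u64.
 * Where the bytes land in the buffer depends on the host byte order. */
static u64 pack48_memcpy(const u8 b[6])
{
    u8 buf[8] = {0};
    u64 out;
    int i;

    if (host_is_little_endian()) {
        for (i = 0; i < 6; i++)
            buf[i] = b[5 - i];    /* least significant byte goes first */
    } else {
        memcpy(&buf[2], b, 6);    /* two zero bytes, then the value */
    }
    memcpy(&out, buf, sizeof(out));
    return out;
}
```

This gives the same value as the cast-and-shift form, but it needs a byte-order check and a temporary buffer to get there, so it is hard to see it being simpler or faster.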

Quote:

Originally Posted by johnsfine (Post 3599978)

What architecture are you doing this on? Or do you intend the code to be portable across architectures?

Portability is definitely an issue. I've been developing on a PowerPC (MPC 8313) and it will be used on an ARM (OMAP 2430). The reason for this is outside the scope of this discussion, but you make an excellent point about the order the bytes are stored in, so I guess the cast will have to do.

Another way to fill a u64 from u8s is to make a union of a u64 and a u8[8]. You can then write the u8[] member of the union and read the u64.

Compilers are pretty good at this sort of thing. Probably either method will generate similar or equivalent instructions.
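A minimal sketch of the union idea, with the same stand-in typedefs as before. The union tag u64_bytes and the function name are hypothetical. It places the six bytes at offset 2 without any byte-order handling, so it runs into the same endianness question discussed earlier in the thread.

```c
#include <stdint.h>
#include <string.h>

typedef uint8_t  u8;   /* stand-ins for the kernel types (assumption) */
typedef uint64_t u64;

/* Hypothetical union overlaying a u64 with its 8 bytes. */
union u64_bytes {
    u64 value;
    u8  bytes[8];
};

/* Write the 6 input bytes into bytes[2..7] and read the u64 back.
 * No byte-order handling: the result depends on the host. */
static u64 union_pack48_naive(const u8 b[6])
{
    union u64_bytes u = { 0 };    /* zero all 8 bytes first */

    memcpy(&u.bytes[2], b, 6);
    return u.value;
}
```

For the input bytes 01 02 03 04 05 06, a big-endian host reads back 0x010203040506, while a little-endian host reads back 0x0605040302010000. For code that has to run on both, the cast-and-shift form avoids having to care.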