[SOLVED] Generate a long value from a 5-byte char string. Casting... how?

volbus · 06-22-2014, 10:42 AM

Hi!
I have the following situation, in C programming:

Code:

unsigned char s[5];
unsigned long q;
/* s contains a decimal value, read elsewhere,  represented on 5-bytes; s[0]=most significant byte, s[4]=least significant byte */
q = s[0]*0x100000000u + s[1]*0x1000000u + s[2]*0x10000u + s[3]*0x100u + s[4];
/* now q contains the decimal value I need */

Is there any other, better way to get my unsigned value from the 5 bytes?
I ask because I'm new to C, and I think that, instead of just copying the 5 bytes into the long variable, the compiler will actually emit 4 long multiplications plus 4 additions, which seems quite complicated and unnecessary... to me!
Is there any magic, like q = (long) ...my_string ??
Thank you!

metaschima · 06-22-2014, 11:23 AM

It depends on the machine you are using. Is it big-endian or little-endian? Most likely it is little-endian.

Code:

#include <stdio.h>

int main (void)
{
	unsigned char s[5] = {0xfe, 0x23, 0x12, 0x45, 0x32};
	unsigned long q = s[0]*0x100000000u + s[1]*0x1000000u + s[2]*0x10000u + s[3]*0x100u + s[4];
	unsigned long * lp = s;

	printf ("%lx\n%lx\n", q, __builtin_bswap64(*lp) >> 24);

	return 0;
}

However, I recommend using standard fixed width int types:

Code:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>	/* PRIx64: printf format for uint64_t */

int main (void)
{
	uint8_t s[5] = {0xfe, 0x23, 0x12, 0x45, 0x32};
	uint64_t q = s[0]*0x100000000u + s[1]*0x1000000u + s[2]*0x10000u + s[3]*0x100u + s[4];
	uint64_t * lp = s;

	printf ("%" PRIx64 "\n%" PRIx64 "\n", q, __builtin_bswap64(*lp) >> 24);

	return 0;
}

volbus · 06-22-2014, 11:40 AM

Ok, interesting...
Compiling your code gives the following warning: "warning: initialization from incompatible pointer type [enabled by default]
unsigned long * lp = s;"
Now, my questions:
what exactly is "uint64_t, uint8_t"? Can I also define "uint16_t", "uint128_t" or "int32_t"? What is the "_t"?
And what is "__builtin_bswap64(*lp) >> 24"?
I understand it swaps the content of lp, but how? And why the logical shift by 24?
I haven't found anything like this in my beginning C courses, where can I find documentation on functions like these?
Thanks for the reply!

volbus · 06-22-2014, 11:45 AM

Ok, I see it writes the bytes into a 64-bit variable, starting with the most significant byte; that's why you have to shift it right by 3 bytes. As for little and big endian, I always confuse them... and it's a normal x86-64 machine.
And how would it work if s[0] were the least significant byte?

metaschima · 06-22-2014, 01:17 PM

Sorry, I'll try to simplify it.

The first issue is that 'unsigned long' may have a different size on a different system. For example, on a 32-bit system it is usually 4 bytes, while on a 64-bit Linux system it is 8 bytes (64-bit Windows keeps it at 4). This can affect calculations, and it is the main reason I recommended the fixed-width types defined in stdint.h. You can use these types as long as you include that header. They are really just typedefs for other types. For example, look at stdint.h:

Code:

#if __WORDSIZE == 64
typedef unsigned long int	uint64_t;
#else
__extension__
typedef unsigned long long int	uint64_t;
#endif

It defines the fixed-width type based on your computer's word size, so uint64_t is always 64 bits = 8 bytes, whether you use a 32-bit or a 64-bit system.

This should get rid of the warning; I forgot to cast the pointer. This solution may be too complex for you at this time. Just know that the bswap64 function reverses the byte order of a 64-bit value, which converts between big and little endian. This matters when converting between bytes and integer types. The function is compiler-dependent though: on gcc it is __builtin_bswap64, but on MS VC it is _byteswap_uint64. Defining a converter yourself, as you have done, would be more portable. You could also use macros to detect the compiler, but this is probably too advanced.

Code:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>	/* PRIx64: printf format for uint64_t */

int main (void)
{
	uint8_t s[5] = {0xfe, 0x23, 0x12, 0x45, 0x32};
	uint64_t q = s[0]*0x100000000u + s[1]*0x1000000u + s[2]*0x10000u + s[3]*0x100u + s[4];
	uint64_t * lp = (uint64_t *) s;	/* explicit cast silences the warning */

	printf ("%" PRIx64 "\n%" PRIx64 "\n", q, __builtin_bswap64(*lp) >> 24);

	return 0;
}

The right shift by 24 bits is there because 3 of the 8 bytes are not filled. Try filling all 8 bytes to see how it works:

Code:

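#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>	/* PRIx64: printf format for uint64_t */

int main (void)
{
	uint8_t s[8] = {0xfe, 0x23, 0x12, 0x45, 0x32, 0x12, 0x14, 0x45};
	/* q still combines only the first 5 bytes */
	uint64_t q = s[0]*0x100000000u + s[1]*0x1000000u + s[2]*0x10000u + s[3]*0x100u + s[4];
	uint64_t * lp = (uint64_t *) s;

	/* all 8 bytes are filled now, so no shift after the swap */
	printf ("%" PRIx64 "\n%" PRIx64 "\n", q, __builtin_bswap64(*lp));

	return 0;
}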

For gcc builtin functions:
http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

volbus · 06-22-2014, 01:57 PM

Ok, let me see if I get this:
s, being a pointer, at address s we have: 0xfe,23,12,45,32 (0x32 is in address s + 4);
if I have a variable: int i = 0x11223344; with little endian, at address i we have: 0x44,33,22,11; (0x11 in address i+3, right?)
now, after *lp = s, at address &lp we have: 0xfe23124532000000 (lp points to a 64-bit var.)
and if we call bswap(*lp) we have: 0x00000032451223fe, with 0xfe the most significant byte.
So now a logical shift right (here left! he-he) by 3 bytes, and then we have 0x32451223fe.
Yeah! It makes sense!
When I learned C programming (still learning!), it was mentioned somewhere that functions beginning with an underscore should be avoided and only used when one knows exactly what one is doing!
So, if s[0] were the least significant byte, I wouldn't have to call bswap64; then after *lp = s, I would have my decimal value without having to swap or shift anything, right? Will it fill the remaining 3 bytes with 0x00?

metaschima · 06-22-2014, 02:34 PM

To be correct you need s to be 8 bytes, otherwise the other bytes are not known.

The difference is that your method produces 000000fe23124532, while bswap alone (without the shift, and with the 3 extra bytes set to zero) produces fe23124532000000. Which one is correct depends on what you are doing with it.

volbus · 06-22-2014, 02:46 PM

Well, tomorrow is a new day, and I shall do some testing and then come back with the results, the way I understand it, and with further questions - if any!
Thanks!