Doubts about the "Internet checksum algorithm" in RFC 1071

saga_uni · 04-12-2012, 07:04 AM

I want to know whether the code below, from RFC 1071, works properly on a big-endian CPU.

When count is odd, sum += * (unsigned char *) addr adds only one byte's content, but I think it should add a 16-bit value at a time. Is that right on a big-endian CPU?

Here are my doubts:
When you pass an odd number of bytes to the function below, it should still work. Consider two situations:
1. You pass the odd-length data as it is. Execution reaches

if( count > 0 )
sum += * (unsigned char *) addr;

so sum only adds the last byte, into its low 8 bits.
2. With the same odd-length data, following the Internet checksum algorithm, we append one zero byte to make the length even, and then pass it to the function. Now the left-over branch is never taken and only the loop body runs:

sum += * (unsigned short *) addr++;
count -= 2;

On a big-endian CPU, sum then adds the 16-bit number formed from the last data byte (not the byte we appended) and the zero byte we appended. The sum in this situation is not equal to the previous one. Is this a paradox, and is the function in RFC 1071 right on all kinds of CPUs (for example, big-endian ones)?

unsigned short checksum(unsigned short *addr, int count)
{
    /* Compute Internet Checksum for "count" bytes
     * beginning at location "addr".
     */
    register long sum = 0;

    while( count > 1 ) {
        /* This is the inner loop */
        sum += * (unsigned short *) addr++;
        count -= 2;
    }

    /* Add left-over byte, if any */
    if( count > 0 )
        sum += * (unsigned char *) addr;

    /* Fold 32-bit sum to 16 bits */
    while (sum>>16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (unsigned short) ~sum;
}

saga_uni · 04-13-2012, 10:48 PM

Does no one know anything about this?

dwhitney67 · 04-14-2012, 09:19 AM

Quote:

Originally Posted by saga_uni

Does no one know anything about this?

This is the checksum function/algorithm that I have used successfully:

Code:

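#include <stdint.h>

uint16_t headerChecksum(const uint16_t* data, unsigned int nbytes)
{
  uint32_t sum = 0;

  /* Add up the data one 16-bit word at a time. */
  for (; nbytes > 1; nbytes -= 2)
  {
    sum += *data++;
  }

  /* Add the left-over byte, if the length is odd. */
  if (nbytes == 1)
  {
    sum += *(const unsigned char*) data;
  }

  /* Fold the carries in the upper 16 bits back into the lower 16 bits. */
  sum  = (sum >> 16) + (sum & 0xFFFF);
  sum += (sum >> 16);

  return (uint16_t) ~sum;
}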

Now, as for your primary concern, whether the result is computed identically on a big-endian system: I cannot recall. I do not have access to a system with that type of architecture, and nowadays most people probably don't either. It seems that Intel (and, by extension, AMD) won the endian battle against Motorola.

Have you done any experiments on a big-endian system to verify whether the algorithm produces valid checksums? All you have to do (and this is a tall order for novices) is create a raw TCP or UDP packet and send it across the wire, or even to localhost. Use Wireshark (formerly Ethereal) to monitor the network traffic.
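A check that needs neither a big-endian machine nor a packet capture is to compare against the worked numerical example in RFC 1071 (bytes 00 01 f2 03 f4 f5 f6 f7, one's complement sum 0xddf2). The sketch below is only an illustration: inet_checksum_bytes is a hypothetical helper, not code from the RFC or from the post above. It builds every 16-bit word from individual bytes, so the host byte order never enters the computation, and it folds the carry after every addition so the accumulator cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from RFC 1071 or the posts above).
 * Sums the data as big-endian 16-bit words assembled byte by byte,
 * so the result does not depend on the byte order of the host. */
uint16_t inet_checksum_bytes(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        /* Fold the carry right away; sum stays within 16 bits. */
        sum = (sum & 0xFFFFu) + (sum >> 16);
        data += 2;
        len -= 2;
    }

    if (len == 1) {
        /* Odd length: the last byte is the high half of a word
         * whose low half is a zero padding byte. */
        sum += (uint32_t)data[0] << 8;
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }

    return (uint16_t)(~sum & 0xFFFFu);
}
```

The return value is the checksum as a plain number. To place it in a packet, write the high byte first and the low byte second; for the RFC example bytes that gives 0x22 followed by 0x0d.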

Nominal Animal · 04-14-2012, 12:25 PM

If you read RFC 1071 carefully, you'll see that there are examples for both little-endian and big-endian architectures.

It is easy to see why one might trip up, though.

You see, the result that dwhitney67's code (and the example code in RFC 1071, as listed by the OP) produces must not be interpreted as a 16-bit unsigned integer in native byte order, but as a 16-bit unsigned integer in network byte order. In other words, to get the native value, you need to use ntohs() (from <arpa/inet.h>), i.e.

Code:

uint16_t ntohs(const uint16_t value)
{
    const uint8_t *const byte = (const uint8_t *)&value;

    return byte[0] * 256U + byte[1];
}
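The byte-order point can be made concrete by looking at the two bytes the result occupies in memory instead of its numeric value. The following is a minimal sketch under stated assumptions: native_word_checksum and checksum_as_network_value are hypothetical names, the length is assumed to be even, and memcpy is used for the word loads only to avoid alignment problems.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: sums native-order 16-bit words.
 * Assumes len is even; a trailing odd byte is ignored in this sketch. */
uint16_t native_word_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    uint16_t word;

    for (; len > 1; len -= 2, data += 2) {
        memcpy(&word, data, sizeof word);   /* load in host byte order */
        sum += word;
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }
    return (uint16_t)(~sum & 0xFFFFu);
}

/* Hypothetical helper: reads the raw result's two bytes in memory as a
 * network-order number, which is what the ntohs() above does. */
uint16_t checksum_as_network_value(const uint8_t *data, size_t len)
{
    uint16_t raw = native_word_checksum(data, len);
    uint8_t bytes[2];

    memcpy(bytes, &raw, sizeof raw);
    return (uint16_t)(bytes[0] * 256u + bytes[1]);
}
```

For the RFC 1071 example bytes 00 01 f2 03 f4 f5 f6 f7, the raw value on a little-endian host is 0x0d22, yet its bytes in memory are 22 0d, so the network-order reading is 0x220d. That is why the raw value can be copied into the checksum field without a further htons().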

Also, both dwhitney67's code and the example C code posted by the OP assume that the checksum will not overflow the temporary storage. It will on 32-bit architectures, for example for messages longer than 65537 bytes that contain only all-bits-set; in that case the checksum does not match the one described by RFC 1071.

RFC 1071 explicitly specifies that when adding, the carry bit (the overflow bit, if the result does not fit into the register) must be added back to the register. This allows the same algorithm to be implemented using 16-bit, 32-bit, or even 64-bit registers, as long as the carry bits are summed back, regardless of the register size or endianness.
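The carry rule can be illustrated with two accumulators of different width. This is a sketch with hypothetical function names, assuming the input is already an array of 16-bit words: one version adds the carry back after every addition in a 16-bit register, the other collects carries in the upper half of a 32-bit register and folds them at the end.

```c
#include <stddef.h>
#include <stdint.h>

/* One's complement addition of two 16-bit values: the carry out of
 * bit 15 is added back into bit 0 (end-around carry). */
uint16_t ones_add16(uint16_t a, uint16_t b)
{
    uint32_t t = (uint32_t)a + b;
    return (uint16_t)((t & 0xFFFFu) + (t >> 16));
}

/* 16-bit register: the carry is folded in after every addition. */
uint16_t sum_with_16bit_register(const uint16_t *words, size_t n)
{
    uint16_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = ones_add16(acc, words[i]);
    return acc;
}

/* 32-bit register: carries pile up in the top half and are folded at
 * the end. Only valid while n is small enough that acc cannot overflow. */
uint16_t sum_with_32bit_register(const uint16_t *words, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += words[i];
    while (acc >> 16)
        acc = (acc & 0xFFFFu) + (acc >> 16);
    return (uint16_t)acc;
}
```

With the words 0x0001, 0xf203, 0xf4f5, 0xf6f7 from the RFC 1071 example, both functions return 0xddf2. The 16-bit version never overflows regardless of message length, at the cost of a fold per addition.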

If you write a trivial program that computes the checksum using the RFC 1071 algorithm (for example, using dwhitney67's code for messages of at most 65537 bytes on 32-bit architectures) and output the checksum value as a network-endian number (using e.g. printf("0x%04x", ntohs(checksum));), you will find that it does, indeed, output the exact same result on both little-endian and big-endian architectures.

I confess I had to verify this logic myself (on little-endian x86-64 and big-endian sun4u sparc).

saga_uni · 04-18-2012, 11:29 PM

The point is that I can't find a big-endian CPU to verify it on.
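One way around the missing hardware is to simulate the word loads and stores of each byte order on whatever machine is at hand. The sketch below is an assumption-laden illustration, not a measurement on real hardware: load16 and simulated_checksum are hypothetical names, the loop mirrors the RFC 1071 listing (including adding the left-over byte unshifted), and the carry is folded after every addition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reads two bytes the way a host of the given
 * byte order would load a 16-bit word from memory. */
uint16_t load16(const uint8_t *p, int big_endian)
{
    return big_endian ? (uint16_t)((p[0] << 8) | p[1])
                      : (uint16_t)((p[1] << 8) | p[0]);
}

/* Hypothetical helper: runs the RFC 1071 style loop as a host of the
 * given byte order would, then writes the 16-bit result into out[]
 * the way that host would store it in memory. */
void simulated_checksum(const uint8_t *data, size_t count,
                        int big_endian, uint8_t out[2])
{
    uint32_t sum = 0;

    while (count > 1) {
        sum += load16(data, big_endian);
        sum = (sum & 0xFFFFu) + (sum >> 16);
        data += 2;
        count -= 2;
    }

    if (count > 0) {
        /* Left-over byte added to the low 8 bits, as in the RFC listing. */
        sum += *data;
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }

    uint16_t result = (uint16_t)(~sum & 0xFFFFu);
    if (big_endian) {
        out[0] = (uint8_t)(result >> 8);
        out[1] = (uint8_t)(result & 0xFFu);
    } else {
        out[0] = (uint8_t)(result & 0xFFu);
        out[1] = (uint8_t)(result >> 8);
    }
}
```

In this simulation, the even-length RFC example 00 01 f2 03 f4 f5 f6 f7 produces the bytes 22 0d for both byte orders. For the odd-length input 00 01 f2, the simulated little-endian host produces 0d fe, the same as the zero-padded input 00 01 f2 00, while the simulated big-endian host produces ff 0c unless the left-over byte is shifted left by 8 bits before it is added. So the two situations described in the first post do appear to differ on a big-endian host for odd lengths, at least under the assumptions of this sketch.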