If you read RFC 1071 carefully, you'll see that there are examples for both little-endian and big-endian architectures.
It is easy to see why one might trip up, though.
You see, the result that dwhitney67's code (and the example code in RFC 1071 as listed by the OP) produces must not be interpreted as a 16-bit unsigned integer in native byte order, but as a 16-bit unsigned integer in network byte order. In other words, to get the native value, you need to use ntohs() (from <arpa/inet.h>), which behaves like this:
Code:
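/* Illustration of what ntohs() does; use the one from <arpa/inet.h>
   in real code instead of defining your own. */
uint16_t ntohs(const uint16_t value)
{
    /* Look at the value as two bytes in memory order. */
    const uint8_t *const byte = (const uint8_t *)&value;
    /* Network byte order: most significant byte comes first. */
    return byte[0] * 256U + byte[1];
}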
Also, both dwhitney67's code and the example C code posted by the OP assume that the checksum will not overflow the temporary storage. It will on 32-bit architectures, for example for messages longer than 65537 bytes in which every byte has all bits set; in that case the checksum does not match the one described by RFC 1071.
RFC 1071 explicitly specifies that when adding, the carry bit (the overflow bit, if the result does not fit into the register) must be added back to the register. This allows the same algorithm to be implemented using 16-bit, 32-bit, or even 64-bit registers, regardless of endianness, as long as the carry bits are summed back.
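To make the carry handling concrete, here is a minimal sketch of my own (not dwhitney67's code, and not the RFC's listing). The function name inet_checksum is made up for illustration. It folds the carry back in after every addition, so the 32-bit accumulator never exceeds 0xFFFF and cannot overflow no matter how long the message is. It reads the 16-bit words in native byte order, so the return value still has to go through ntohs() to get the numeric value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: RFC 1071 style checksum with the carry added
   back after every addition (end-around carry). */
uint16_t inet_checksum(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    while (len > 1) {
        uint16_t word;
        memcpy(&word, p, 2);                  /* native-order 16-bit load */
        sum += word;
        sum = (sum & 0xFFFFu) + (sum >> 16);  /* add the carry bit back in */
        p += 2;
        len -= 2;
    }

    if (len) {
        /* Odd trailing byte: treat it as the first byte of a word
           whose second byte is zero. */
        uint16_t word = 0;
        memcpy(&word, p, 1);
        sum += word;
        sum = (sum & 0xFFFFu) + (sum >> 16);
    }

    /* One's complement of the folded sum, truncated to 16 bits. */
    return (uint16_t)~sum;
}
```

Folding on every addition costs a few extra instructions per word; a common alternative is to fold only every so often, as long as that happens before the accumulator can overflow.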
If you write a trivial program that computes the checksum using the RFC 1071 algorithm (for example, using dwhitney67's code for at most 65537-byte messages on 32-bit architectures), and you output the checksum value as a network-endian number (using e.g. printf("0x%04x", ntohs(checksum)); ), you will find that it will, indeed, output the exact same result on both little-endian and big-endian architectures.
I confess I had to verify this logic myself (on little-endian x86-64 and big-endian sun4u sparc).
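If you want to check the byte order claim yourself, here is a rough sketch, assuming a POSIX system for <arpa/inet.h>. The names fold16, checksum_native and checksum_numeric are my own, invented for this illustration. The first checksum sums native-order words the way the RFC example does; the second reads every word explicitly as big-endian and so yields the numeric value directly. On any host, ntohs() of the first should equal the second.

```c
#include <arpa/inet.h>   /* ntohs(), POSIX */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add the carry bits back in until the value fits in 16 bits. */
static uint32_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return sum;
}

/* Sums native-order 16-bit words; the result is in network byte order. */
uint16_t checksum_native(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    for (; len > 1; p += 2, len -= 2) {
        uint16_t word;
        memcpy(&word, p, 2);
        sum = fold16(sum + word);
    }
    if (len) {
        uint16_t word = 0;       /* odd byte, padded with a zero byte */
        memcpy(&word, p, 1);
        sum = fold16(sum + word);
    }
    return (uint16_t)~sum;
}

/* Reads each word as big-endian explicitly; the result is the numeric value. */
uint16_t checksum_numeric(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t sum = 0;

    for (; len > 1; p += 2, len -= 2)
        sum = fold16(sum + (((uint32_t)p[0] << 8) | p[1]));
    if (len)
        sum = fold16(sum + ((uint32_t)p[0] << 8));
    return (uint16_t)~sum;
}
```

For the sample bytes 00 01 f2 03 f4 f5 f6 f7 used in RFC 1071, both ntohs(checksum_native(msg, 8)) and checksum_numeric(msg, 8) should come out as 0x220d.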