Feb 07

It’s possible to write some C code to work out whether a machine’s architecture is little-endian or big-endian with respect to bytes.

Is it possible, using only ANSI C, to work out whether the machine’s architecture is big-endian or little-endian with respect to bits?

I don’t know the ANSI spec in nearly as much pedantic detail as would be necessary to answer the question, but I bet someone does.


8 Responses to “C question”

  1. http://ashley-y.livejournal.com/ Says:

    Is that even meaningful? Memory is a list of bytes, and a byte is simply a cell with N possible values.

  2. meta Says:

    Except that if you rotate a memory word left, the effect will be to multiply by 2 in one case, but to divide by 2 in the other case.

    So I guess the question is how exactly the standard defines the << and >> operators. If it says they rotate left and right, then you could use them to determine bit-endianness. But I have a hunch that ANSI probably standardized << to multiply, for code portability reasons.

  3. http://gareth-rees.livejournal.com/ Says:

    Tony Finch has a good explanation here. Basically, endianness as a concept only makes sense when you have some way of addressing the subunits. Most designs of CPU have no way of addressing bits: to access bits you load whole bytes or words and then use shifts, masks, and logical operations. So endianness makes no sense for these processors.

    However, there are some processors with bit-addressing instructions, for example the Intel 8051 microcontroller, which has 16 bytes of bit-addressable internal RAM. When programming for such a chip it would make sense to talk about the endianness of the bits, but only if you were programming in assembly language, because C has no facilities for addressing bits.

  4. http://gareth-rees.livejournal.com/ Says:

    The C shift operators are defined in terms of “left” and “right” (with respect to the conventional way of writing numbers), not “big” and “little” (with respect to addresses).

    I don’t have ANSI to hand, but K&R says “The value of E1<<E2 is E1 (interpreted as a bit pattern) left-shifted E2 bits; in the absence of overflow, this is equivalent to multiplying by 2^E2.”
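
    The quoted sentence can be checked directly. The sketch below, in C89, compares the shifts against plain multiplication for a few small unsigned values. The function names (times_pow2, shift_matches_multiply, check_shifts) are made up for illustration. It should pass on any conforming implementation whatever the byte order, which is the point: the shifts act on values, not on memory layout.

```c
#include <assert.h>

/* Returns x * 2^n computed by repeated doubling, with no shift operator. */
static unsigned times_pow2(unsigned x, unsigned n)
{
    while (n-- > 0)
        x *= 2u;
    return x;
}

/* Returns 1 if << and >> agree with multiplication and division by
   powers of two for a few small values (no overflow occurs: 5 * 128
   fits in any unsigned int). */
int shift_matches_multiply(void)
{
    unsigned n;

    for (n = 0; n < 8; n++) {
        if ((5u << n) != times_pow2(5u, n))
            return 0;
        if ((times_pow2(5u, n) >> n) != 5u)
            return 0;
    }
    return 1;
}

/* Self-check: aborts if the value-based reading of the shifts fails. */
void check_shifts(void)
{
    assert(shift_matches_multiply());
}
```

    Calling check_shifts() from a main function should return quietly on both little-endian and big-endian hosts.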

  5. meta Says:

    OK, that was what I suspected. So in fact, << might compile to SHR and >> to SHL on some architectures. So there’s no way to tell from C.

  6. drj11 Says:

    Well, kind of yes. Except that everyone, when writing a sequence of bits as a multi-bit integer, puts the most significant bits on the left. So << always compiles to SHL. There are two wrinkles here. Not everyone labels the most significant bit of a 32-bit word 31 and the least significant bit 0; the POWER architecture, for example, numbers them the other way round. This turns out to be irrelevant. Further, SHL is just a human convention; the actual machine code instruction is just a number. Human convention dictates that the instruction which moves bits to more significant positions is called SHL and the other one SHR.

  7. drj11 Says:

    I see no-one has answered the byte-endianness question. The answer is more or less, yes it is possible since any object can legitimately be accessed as a sequence of char (and C99 makes this much much clearer than it was in C89):

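    int main(void) {
        int a = 1;
        /* Read the first (lowest-addressed) byte of a through a char pointer. */
        int le = *(char *)&a;
        /* Exit status is that byte: 1 if it holds the low-order 1, otherwise 0. */
        return le;
    }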

    This program returns 1 to its environment on little-endian architectures and 0 otherwise. On Unix it could be named isbe, since an exit status of 0 counts as success there and 0 is what a big-endian machine gives. The C code is totally legal, I personally guarantee it.

    Of course that’s not to say it doesn’t have bugs. An int is represented in sizeof(int) bytes (C bytes, that is chars, might not be 8 bits each), so there are sizeof(int)! possible orderings (or endiannesses if you will), and this code only distinguishes two of them. I suppose it’s possible that the bits of an int might be assigned to the bits of a char in some unnatural way. That would mean that the bottom 8 bits of the int might be scrambled before being stored in a char so that the int 1 might not have any of its representation chars be 1. But really I’d be very very surprised if anyone thought that was allowed by the standard.
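
    A possible refinement of the one-byte probe above, as a sketch only: give each byte of an unsigned int a distinct value and read all of them back, so that orderings other than plain little-endian and big-endian show up as "other". The names byte_order and detect_byte_order are hypothetical, and the code assumes unsigned int has no padding bits.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_OTHER };

/* Classifies how the bytes of an unsigned int are laid out in memory.
   Assumption: unsigned int has no padding bits, so every shift below
   stays within the width of the type. */
enum byte_order detect_byte_order(void)
{
    unsigned int v = 0;
    const unsigned char *p = (const unsigned char *)&v;
    size_t n = sizeof v;
    size_t i;
    int little = 1, big = 1;

    assert(n <= UCHAR_MAX);     /* each marker value must fit in one byte */
    if (n == 1)
        return ORDER_OTHER;     /* a single byte has no order to detect */

    /* Give the byte of significance i the marker value i + 1. */
    for (i = 0; i < n; i++)
        v |= (unsigned int)(i + 1) << (i * CHAR_BIT);

    /* Compare the in-memory sequence with the two classic layouts. */
    for (i = 0; i < n; i++) {
        if ((size_t)p[i] != i + 1)
            little = 0;         /* not ascending significance */
        if ((size_t)p[i] != n - i)
            big = 0;            /* not descending significance */
    }
    return little ? ORDER_LITTLE : (big ? ORDER_BIG : ORDER_OTHER);
}
```

    This still says nothing about the scrambled-bits case; it only separates the two familiar byte orders from everything else.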

  8. http://bovlb.livejournal.com/ Says:

    The obvious way to test for bitwise endianness would be to pun on a bitfield. The following program will return 1 precisely when the least significant bit in an unsigned long corresponds to the first bit allocated as a bitfield. I suspect that this tells you absolutely nothing about the architecture.

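    union u {
        unsigned long l;
        struct {
            unsigned bit : 1;   /* first bit-field the compiler allocates */
        } s;
    };
    int main(void) {
        union u x;
        x.l = 0;
        x.s.bit = 1;            /* set the bit-field through the struct view */
        return x.l & 1;         /* 1 if it landed on the least significant bit of l */
    }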
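
    To see where the bit-field actually lands rather than only testing bit 0, a variation along these lines might help. The function name first_bitfield_position is hypothetical. Like the program above, it relies on reading a union member other than the one last written, and the order in which bit-fields are allocated within a unit is implementation-defined, so the result describes the compiler's layout choices as much as the hardware.

```c
#include <assert.h>

union u {
    unsigned long l;
    struct {
        unsigned bit : 1;       /* first bit-field the compiler allocates */
    } s;
};

/* Returns the index (0 = least significant) of the lowest bit of l that
   is set after storing 1 into the bit-field. */
int first_bitfield_position(void)
{
    union u x;
    int pos = 0;

    x.l = 0;
    x.s.bit = 1;
    assert(x.l != 0);           /* some bit of l is expected to be set now */

    /* Shift right until the set bit reaches position 0, counting steps. */
    while ((x.l & 1ul) == 0) {
        x.l >>= 1;
        pos++;
    }
    return pos;
}
```

    A result of 0 corresponds to the original program returning 1; any other value shows how far from the low end of the unsigned long the bit-field was placed.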
