I want to learn C, but I am thinking that it may be worth learning to write 64-bit C from the ground up. However, all of the tutorials that I have come across are aimed at writing 32-bit code.
Any materials that are available would be appreciated, as would any input or pointers on how to go about it.
There are some common problems that arise when you migrate from 32-bit to 64-bit (like assuming that sizeof(int) == sizeof(void *)), but if you start to learn on a 64-bit platform (or better, if you compile your programs in both modes), you won't have these problems.
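For example, a quick check like the following (a minimal sketch; the exact sizes depend on the compiler and ABI) shows the difference if you build it in both modes:

#include <stdio.h>

int main(void)
{
    /* On a typical 32-bit x86 build both sizes are 4 bytes; on a typical
       x86_64 (LP64) build int is 4 bytes but a pointer is 8 bytes, so code
       that stuffs a pointer into an int silently truncates it. */
    printf("sizeof(int)    = %zu\n", sizeof(int));
    printf("sizeof(void *) = %zu\n", sizeof(void *));
    return 0;
}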
Programs in C ought to be written portably, so they work despite differences between 32-bit and 64-bit.
For most programs, that isn't even an issue. It tends to take pretty advanced programming to even create a situation in which the natural way to code something is non-portable, and extra effort is then required to make it portable.
For example, the single most common non-portable usage is assuming that an int and a pointer are the same size (which is true on x86 and other typical 32-bit architectures but false on x86_64 and some other 64-bit architectures). A beginner probably never sees a situation in which there would be any reason to make such an assumption, so avoiding such assumptions doesn't take any effort at all.
In more abstract programming, you often have a situation in which a parameter is passed from code that knows its type, through intermediate code that doesn't know the type, to other code that again knows the type. You can do that correctly with a union. But it is enough easier via casts that many programmers have accidentally inserted non-portable size assumptions through such casts.
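A minimal sketch of that pattern (the names generic_arg, dispatch, and print_name are made up for illustration, not taken from any real API) might look like this:

#include <stdio.h>

/* A generic argument the middle layer can carry around without knowing
   which member is in use; the sending and receiving code agree on that. */
typedef union {
    long  as_long;
    void *as_ptr;
} generic_arg;

/* Middle layer: passes the argument along without knowing its real type. */
static void dispatch(void (*fn)(generic_arg), generic_arg arg)
{
    fn(arg);
}

/* Receiving code: knows the argument is really a string. */
static void print_name(generic_arg arg)
{
    printf("name = %s\n", (const char *)arg.as_ptr);
}

int main(void)
{
    generic_arg arg;
    arg.as_ptr = "example";   /* sending code knows the type */
    dispatch(print_name, arg);
    /* The tempting shortcut is to cast the pointer to int or long and back
       instead of using a union; that quietly assumes a pointer fits in that
       integer type, which is exactly the hidden size assumption above. */
    return 0;
}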
If you expect to care about performance, there is one programming habit I would start early because of its benefits on the x86_64 architecture (it does no harm on x86 or other architectures):
Invent an index type and use a typedef to make it the same as unsigned int:
typedef unsigned int index_t;
Then, whenever you have a variable that counts or indexes elements of an array, or a variable that loops through moderate-size non-negative values, use that type instead of the int that most people reach for:
for (index_t n = ...)
rather than
for (int n = ...)
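Put together, a complete (if trivial) example of the habit might look like the sketch below; index_t is just the typedef suggested above, not a standard type:

#include <stdio.h>

typedef unsigned int index_t;   /* the one easily-changed definition */

int main(void)
{
    double data[] = { 1.0, 2.5, 4.0, 8.5 };
    index_t count = sizeof data / sizeof data[0];
    double sum = 0.0;

    /* The loop counter is an index_t rather than an int or a size_t. */
    for (index_t n = 0; n < count; n++)
        sum += data[n];

    printf("sum = %f\n", sum);
    return 0;
}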
When doing this (using unsigned int instead of size_t) you are limiting your program to 4 billion elements in any one array. But when you think about that, you realize it is not much of a limit. A 64-bit program can use far more than 4 GB of memory without approaching 4G elements in one array. It may even have more than 4 GB in one array of structs without coming anywhere close to 4G structs in that array.
By using a typedef, you make it easy to change the program later, when problem size has grown so much that 4G elements per array is a real limit.
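If that day comes, the change is a single line, and everything declared as index_t follows along; for example:

#include <stddef.h>        /* for size_t */

/* Was: typedef unsigned int index_t; */
typedef size_t index_t;    /* indexes can now span arrays of more than 4G elements */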
By using unsigned int now, you get better code now.
In 32-bit, int and unsigned int and size_t all perform exactly the same in the binary code, so in the common case where any one of them is correct, there is no performance difference either.
But on x86_64, in the common loop and index situations where any of those types would give the right results, unsigned int tends to cause the compiler to generate smaller instructions than either int or size_t. That creates lots of situations in which the interaction between the L1 cache and instruction prefetch runs more smoothly, so the code runs slightly faster, even though the individual operations on unsigned int are nominally the same speed as those on size_t. (int often needs extra sign-extension instructions in places where a non-assembly programmer wouldn't expect any difference.)
I expect most other experts would argue in favor of going directly to size_t, so that you don't need to guess, at the time you write the program, whether someone will later use it with arrays of over 4G elements. Whether you agree with them or with me, part of the suggestion is the same: in 32-bit C programming you see a lot of indexes declared as int. In 64-bit C programming, int usually works as an index, but using it is a very bad habit to get into.
I would strongly echo JohnsFine's advice here, because when you define a type name like that (size_t, index_t, or whatever it may be), you are defining, in one easily-found and easily-changed place, a word that has meaning. In other words, size_t foo; is saying both to the computer (who really doesn't care) and to the programmer (who does!!) that "foo is a size_t."
It's very important when writing code to make your intentions clear, not just to the computer but also to the person who's picking up after you following "that very unfortunate incident involving the bread truck." (Your funeral is next Friday...) Simple tricks like the one he suggests not only make the code more portable, but also more meaningful. In some cases, and with some (other) languages, they can also enable the compiler to detect the kind of "niggling mistakes" that are so often the consequence of an unnoticed tpyo. (The hair follicles that you save will be your own, and the day will come, at least for all you gentlemen out there, when that is very important.)
When I used the example of index_t I was suggesting defining your own type name, for situations where a careless or beginner programmer would use int without thinking and where a careful programmer writing portable code might correctly use (the standard) size_t. In those situations, I prefer defining and using my own type name for the reasons described above.
On x86_64, size_t is a 64-bit unsigned integer, which can make it slower in many situations where a 32-bit unsigned integer is good enough.