It's not exactly that simple.
In addition to the compiled version of the code you write, and the functions in libraries such as
glibc, there are other pieces that get combined together to provide a context of sorts that the system expects to be present in an executable file.
In some environments, if you start with your foo.o and run a command such as "gcc -v foo.o", you will get a.out as usual, but you will also be shown:
* something of the configuration of
gcc.
* related environment variables.
* a list of directories in which libraries will be sought.
* the names of the standard files, those other pieces I mentioned, that are combined with
foo.o.
* a whole collection of options to specify exactly what the linking step is to do.
* with the
-l options, a portion of the name of the libraries to be used to link your particular program.
Generally, an option such as -labc tells the linker to look for a library that is conceptually named libabc. In the case of your foo.o file you should see options such as -lc, which tells the linkage procedure to look for a library conceptually named libc, and -lgcc, which tells it to look for a library conceptually named libgcc.
I'm speaking about the conceptual name of the library to distinguish that name from the specific name of the file containing the library.
In the system I'm using, the library conceptually known as
libgcc actually has a full file name of:
Quote:
/usr/lib64/gcc/x86_64-suse-linux/4.5/libgcc.a
The extension .a on the name of the file containing the library relates to the form of the library file. Library files can have various forms: .a marks a static archive, while shared libraries typically end in .so. On the system I'm using, the -v option shows me a list of directories in which libraries can be sought during linking; the directory /usr/lib64/gcc/x86_64-suse-linux/4.5 is just one of several on the list.
Typically, by default, the printf function won't actually be included in a.out. Instead, an indication of where to find printf will be included in a.out, and printf will be loaded from the library when a.out is run. That's dynamic linking.
If you supply the right options to gcc (for example -static), a.out could be statically linked, in which case the printf function would be copied into a.out when a.out is created.
I create a foo.c with the contents you describe, then allow gcc to create a.out from it.
The file command can display the conceptual type of a file. If I run "file a.out", the file command displays this output:
Quote:
a.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.4, not stripped
from which you can see that the file is dynamically linked. When I run "ldd a.out", it produces this output:
Quote:
linux-vdso.so.1 => (0x00007fff396ef000)
libc.so.6 => /lib64/libc.so.6 (0x00007fe086bd5000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe086f42000)
which shows what libraries and other components will be accessed when
a.out is run.
I let
gcc compile a file
foo.c which contains this code:
Code:
#include <stdio.h>

int main( int argc , char *argv[] )
{
    printf( "Hello World!\n" ) ;
}
At first I do what you mentioned and have gcc compile it to foo.o (for example with "gcc -c foo.c").
The nm command can display names/symbols from inside object modules such as foo.o, or executable files such as a.out.
If I run "nm foo.o", I get just this output:
Code:
0000000000000000 T main
                 U puts
There's no mention of
printf. If I change the line in
foo.c containing
printf to look like this:
Code:
printf( "%s %s\n" , "Hello" , "World!" ) ;
and have
gcc recreate
foo.o, now the
nm command shows this output:
Code:
0000000000000000 T main
                 U printf
That's because "optimization" built into the procedures handled by gcc is smart enough to know that a printf whose format string contains no conversion specifications and ends in a newline can really just be handled by the puts function. I mention this as a complication to linking, because sometimes you don't necessarily need to link the function you think you do. A reference to a different function from a library might have been substituted for you.
So then, to put it all together and go back to the "gcc -v" I mentioned earlier: if I grab the last command line from that output, on the system I'm using, this is the command to link foo.o:
Quote:
/usr/lib64/gcc/x86_64-suse-linux/4.5/collect2 --build-id --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib64/gcc/x86_64-suse-linux/4.5/../../../../lib64/crt1.o /usr/lib64/gcc/x86_64-suse-linux/4.5/../../../../lib64/crti.o /usr/lib64/gcc/x86_64-suse-linux/4.5/crtbegin.o -L/usr/lib64/gcc/x86_64-suse-linux/4.5 -L/usr/lib64/gcc/x86_64-suse-linux/4.5/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib64/gcc/x86_64-suse-linux/4.5/../../../../x86_64-suse-linux/lib -L/usr/lib64/gcc/x86_64-suse-linux/4.5/../../.. foo.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib64/gcc/x86_64-suse-linux/4.5/crtend.o /usr/lib64/gcc/x86_64-suse-linux/4.5/../../../../lib64/crtn.o
Simple, isn't it?!
Included in that
simple command you will find things such as:
* a specification of the architecture of the machine I'm using: x86, 64-bit, represented as "x86_64" ( not really that simple, but it's close enough for this purpose ).
* /lib64/ld-linux-x86-64.so.2, the jumping off point to the actual "dynamic" linker itself for an "x86_64" type machine.
* files with names matching patterns like crt*.o, which are those added pieces I mentioned. Some people think of the "crt" as representing the phrase "C Run Time".
* an indication of where to look for libraries, the arguments to the
-L options.
* an indication of what libraries to find, the arguments to the
-l ( lower case ) options.
From the lower case
-l options, you can see that here the linking procedure would be looking for the libraries
libc,
libgcc, and
libgcc_s.
From the upper case
-L options, you can see that here the linking procedure would be looking for those libraries in these directories:
Quote:
/usr/lib64/gcc/x86_64-suse-linux/4.5
/usr/lib64/gcc/x86_64-suse-linux/4.5/../../../../lib64
/lib/../lib64
/usr/lib/../lib64
/usr/lib64/gcc/x86_64-suse-linux/4.5/../../../../x86_64-suse-linux/lib
/usr/lib64/gcc/x86_64-suse-linux/4.5/../../..
which simplifies to this list:
Quote:
/usr/lib64/gcc/x86_64-suse-linux/4.5
/usr/lib64
/lib64
/usr/lib64
/usr/x86_64-suse-linux/lib
/usr/lib64
which is in turn just:
Quote:
/usr/lib64/gcc/x86_64-suse-linux/4.5
/usr/lib64
/lib64
/usr/x86_64-suse-linux/lib
AFAIK, all the backtracking in the paths, with the ".." stuff, is effectively from the procedure mechanically dealing with file system links. Even then, on the system I'm using, I believe the fourth path really just houses "scripts", prototypes of a sort, used by the linker, not libraries, per se.
All of these details are specific to the system I'm using, to this program, or both. Your system could be different. But if you really want to delve into all the details, you can try "gcc -v", or check the man page for gcc to see if the option is different for the system you are using.
Some of the commands I just touched on in passing,
file,
ldd, and
nm, can be helpful for making the contents of object files and executable files more tangible.
Also, on some systems, the manual page for a function will tell you if there are options you need to pass to
gcc to allow it to find the library which contains the function.
Personally, even though I understand it a little, I'm still going to let gcc link my files for me, even if I first have gcc separately compile a whole collection of object modules and then link them all together in one last step to create one executable program.
If this all seems confusing, then I probably have at least succeeded in conveying the complexity of the thing.
Seriously, I do hope this helps.