LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - Newbie (https://www.linuxquestions.org/questions/linux-newbie-8/)
-   -   Problem Understanding the Library - Linux OS (https://www.linuxquestions.org/questions/linux-newbie-8/problem-understanding-the-library-linux-os-4175552796/)

Claudio_Baldo 09-07-2015 12:08 AM

Problem Understanding the Library - Linux OS
 
Good day everybody,
I am new to the Linux OS. I have looked around for a topic similar to what I need, but I got lost in the thousands of pages, so I decided to start a new thread.
Please point me to the relevant post if someone has already asked what I am asking now.

I am reading 'The Linux Programming Interface' (Michael Kerrisk), which I find very interesting, but I am afraid I need a kick start on this....

I want to call the stdio.c function `open()`, but I am not sure whether I should invoke it from bash or from a compiled C executable.
Thank you in advance to anybody who can help me out.

Claudio

jpollard 09-07-2015 02:19 AM

The libraries are there or your system would not run.

The libraries themselves are in one of two places depending on your architecture: /lib (and /usr/lib) for 32-bit code; /lib64 (and /usr/lib64) for 64-bit code. The reason for two locations is that a 64-bit install can run 32-bit binaries, provided the appropriate 32-bit libraries are installed. Most (if not all) 64-bit Linux systems have SOME 32-bit libraries installed, and some 32-bit programs. Additional libraries may be located in other places, as defined by the projects that use them (such as /usr/share, or /usr/local/...).

If you are trying to do program development you need other tools/packages installed.

Depending on your distribution (I mostly use Fedora), the packages are grouped under "program development", and will include header files defining interfaces, extra libraries (some for debugging, others for various development targets - like X Window, Gnome, ...) and the usual tools: compilers, make, autoconf, version control...

You likely need to refer to the installation documentation for your distribution. Usually it will have various sections to help select what you need.

Claudio_Baldo 09-07-2015 02:45 AM

Hi jpollard,

Yes, I understand the libraries are there; what I am not quite understanding is how the environment works in Linux.
For example, if I want to open a file using the open() function defined by the stdio.c library, how should I do it?
Load the stdio.c library from somewhere?
I am using 32-bit Ubuntu; where should I find the stdio.c library, and how should I install the package?

wigry 09-07-2015 03:28 AM

You are reading the "Linux Programming Interface" guide, meaning you are dealing with programming. That in turn implies writing source code, compiling it into an executable, and running the resulting executable from the shell.

So open() is a function in the standard C library, and you need to write a C program to use it. If you have never done any C programming before, then I suggest setting The Linux Programming Interface aside for now and picking up a C tutorial (not C++, but plain C).

For example this one:
http://www.cprogramming.com/tutorial/c-tutorial.html
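
To make it concrete, a minimal sketch of such a program might look like the following (the file name passed on the command line is only an example):

Code:

/* open_example.c - a minimal sketch of calling open() from a C program.
   Compile and run with, for example:
       gcc -Wall open_example.c -o open_example
       ./open_example /etc/hostname
*/
#include <fcntl.h>      /* open(), O_RDONLY */
#include <unistd.h>     /* read(), close() */
#include <stdio.h>      /* fwrite(), fprintf(), perror() */

int main(int argc, char *argv[])
{
    char buf[256];
    ssize_t n;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);   /* ask the kernel for a file descriptor */
    if (fd == -1) {
        perror("open");
        return 1;
    }

    while ((n = read(fd, buf, sizeof(buf))) > 0)
        fwrite(buf, 1, (size_t)n, stdout);   /* copy the file to standard output */

    close(fd);
    return 0;
}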

jpollard 09-07-2015 03:30 AM

Ah, no need to do anything.

A linked executable image has within it references to the libraries it uses. An executable gets loaded with the help of a small program called ld.so, which is invoked by the kernel to provide the runtime linking. It reads the executable's header to locate the referenced libraries, loads them along with the executable, performs the final address resolution for the symbols, and then calls the executable's starting address (which is just a subroutine that sets everything else up for the application - environment variables, the command parameters - and then calls the main function of the application).

Normally, you don't need the stdio.c source - you can't execute it, and it is only useful if you are adding to or modifying the runtime for some reason. The library is already compiled (it is the /lib/libc.so library file) and is ready for use.

A C program uses prototypes that describe the interface to the library (the "#include <stdio.h>"), which causes the compiler to read the include file (stored in /usr/include...). This tells the compiler the data types of the parameters and the return type of each function used, which in turn allows the detection of basic errors.
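
As a rough illustration, this is the kind of thing the header provides (a simplified sketch, not the literal glibc header):

Code:

/* Roughly what "#include <stdio.h>" gives the compiler: declarations only,
   no function bodies (simplified - the real header has more qualifiers). */
typedef struct _IO_FILE FILE;
FILE *fopen(const char *path, const char *mode);
int fclose(FILE *stream);

/* With those prototypes visible, the compiler can reject obvious mistakes
   such as fopen(42) or fclose("not a FILE pointer") before anything is linked. */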

The compiled object code (not yet an executable) contains a list of undefined symbols (usually variables and functions that are defined in other object files or libraries). The linker (ld) then combines the specified object files to come up with a remaining list of undefined symbols, and searches through a list of known libraries (the runtime counterpart of this list is /etc/ld.so.cache, built from the /etc/ld.so.conf configuration file and the other files it designates). Any symbol defined by one of those libraries gets that library's reference included in the executable.

The "ldd" utility can be used to list the libraries used, and what file is being referenced by an executable. If there are no references (it can happen - via static linking, which requires the use of an object library instead of a shared library) then the reference will not exist - as the linker has already resolved any symbols needed by using other object files or object library. Object libraries would be identified by a ".a" extension - for "archive", as it is a file containing an archive of other object files.

Claudio_Baldo 09-09-2015 07:59 PM

Thanks jpollard...
I understand how #include works: it declares data types and functions, so when I call a function from an executable C program (or declare a specific data type), that function (or object) is found in the libraries.

I have been using C programming in closed environments; now I don't understand the boundaries between the kernel, the shell, and C programs.
I understand the kernel is the body responsible for allocating HW resources (CPU/RAM/VMEM, etc.), that C is used as a high-level language to let the programmer make the PC perform some tasks, and that on some OSes the shell is an integral part of the kernel; this is not the case for UNIX systems, where the shell is just a user program...

When I compile a C program from the shell I use the compiler gcc, so I 'translate' the C language into an executable...
1) Why can't the shell run programs written in C code directly?
Right now I run the compiled code from the shell using ./a.out, and I can also pass some data via argc and *argv[].
Let's assume I want to run the function open() from the shell in order to open a file; how can I do it?
2) glibc (or libc) should already be built at startup, so how can I access the open() function from stdio.c?

chrism01 09-10-2015 01:21 AM

It seems to me you're overthinking it; you just want a basic user-mode C program.
See 2nd answer here https://stackoverflow.com/questions/...ith-c-in-linux
Note the doc quote about fopen() vs open().

Claudio_Baldo 09-10-2015 05:57 AM

Hi chrism01,
The point is not the open() function written in C but something more general.
I am trying to draw the lines between the shell environment and the C programming language, trying to understand why I cannot run C code directly from the shell but instead need to compile the C code and then run the executable.
Things are starting to become a bit clearer now that I am working through some basic stuff.
The problem is that it is always hard to troubleshoot errors if you don't have a clear idea of the whole environment and system.

For example: are system calls all the calls made from the shell?
Can I write in C++? What about portability? Is a C++ compiler available on all UNIX systems?

jpollard 09-10-2015 06:12 AM

Quote:

Originally Posted by claudioba@aruba.it (Post 5418281)
Thanks jpollard...
I understand how #include works: it declares data types and functions, so when I call a function from an executable C program (or declare a specific data type), that function (or object) is found in the libraries.

I have been using C programming in closed environments; now I don't understand the boundaries between the kernel, the shell, and C programs.
I understand the kernel is the body responsible for allocating HW resources (CPU/RAM/VMEM, etc.), that C is used as a high-level language to let the programmer make the PC perform some tasks, and that on some OSes the shell is an integral part of the kernel; this is not the case for UNIX systems, where the shell is just a user program...

Not exactly.

The kernel does provide the allocation of resources - it is still just a program, but one that uses system calls/traps to enforce an interface with applications. These can be thought of as the equivalent of function calls, except that the kernel does not share its memory (code/data) with the application. Libraries (such as libc) provide a function-call interface to the system calls so that a standard definition of the functions can exist - different hardware provides different ways for system calls to work, and the libc functions contain the code that rearranges the parameters into the form required by the hardware. As such, libc provides a "wrapper" that presents a standard interface to the programmer.
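
To make the "wrapper" idea concrete, here is a small Linux-specific sketch (the file name is only an example, and real code should simply call open()):

Code:

/* wrapper_vs_syscall.c - a sketch showing that open() in libc is a thin
   wrapper around a kernel system call.  Linux-specific; the raw syscall()
   form is shown only for illustration. */
#include <fcntl.h>        /* open(), O_RDONLY, AT_FDCWD */
#include <unistd.h>       /* syscall(), close() */
#include <sys/syscall.h>  /* SYS_openat */
#include <stdio.h>

int main(void)
{
    /* The usual way: the libc wrapper arranges the arguments and traps
       into the kernel for us. */
    int fd1 = open("/etc/hostname", O_RDONLY);

    /* Roughly what the wrapper does underneath (modern kernels implement
       open() in terms of the openat system call). */
    int fd2 = (int)syscall(SYS_openat, AT_FDCWD, "/etc/hostname", O_RDONLY);

    printf("fd1 = %d, fd2 = %d\n", fd1, fd2);
    if (fd1 != -1) close(fd1);
    if (fd2 != -1) close(fd2);
    return 0;
}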

Other libraries provide functions unrelated to the kernel (such as database access, keyed files, string search). The libc library contains what is considered the "standard" library for C programming - no keyed files, no database, but it does include string handling (string.h - no system calls), buffered I/O support (stdio.h), and memory allocation (stdlib.h, which does use system calls to allocate memory). Math functions are in a separate library, libm (math.h).

The decision to use separate libraries is generally made to isolate special-purpose functions from general-use functions. libc is used by NEARLY every program (I say nearly because it IS possible not to use it, though awkward), but math functions, like keyed files, are not.

A shell is just a program that takes input (from a file), produces output (to another file), reports errors (to a third file), and carries out the actions directed by that input. I say "from a file" because on UNIX/Linux systems everything is considered a file - even a terminal. The shell language is designed to provide a relatively simple interface for a user. As such, its primary function is to translate the user's command into something that invokes a separate program (such as "cat"). The language implemented by the shell is a direct interpreter - it reads a line, identifies the parts of the line (the program to run, the parameters), initiates the program using various system calls, and then waits for it to finish.
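
A toy sketch of that read/run/wait loop in C (heavily simplified - no quoting, pipes, or redirection):

Code:

/* tiny_shell.c - a sketch of the loop a shell performs. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>      /* fork(), execvp(), _exit() */
#include <sys/wait.h>    /* waitpid() */

int main(void)
{
    char line[1024];

    for (;;) {
        printf("tiny$ ");
        fflush(stdout);
        if (fgets(line, sizeof(line), stdin) == NULL)   /* EOF -> exit */
            break;

        /* Split the line into whitespace-separated words. */
        char *argv[64];
        int argc = 0;
        for (char *tok = strtok(line, " \t\n");
             tok != NULL && argc < 63;
             tok = strtok(NULL, " \t\n"))
            argv[argc++] = tok;
        argv[argc] = NULL;
        if (argc == 0)
            continue;

        pid_t pid = fork();              /* system call: create a child process */
        if (pid == -1) {
            perror("fork");
            continue;
        }
        if (pid == 0) {
            execvp(argv[0], argv);       /* system call: replace the child with the program */
            perror("execvp");            /* only reached if exec failed */
            _exit(127);
        }
        waitpid(pid, NULL, 0);           /* system call: wait for the child to finish */
    }
    return 0;
}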

In addition to the simple translation, features are added to make the language more useful - thus the addition of various loop constructs, variables (everything is a string), some variable substitution capability...

The "everything is a file" becomes very useful by being able to use the shell itself as a command to process files (a "shell script" if you will). This extends the generality of the shell to becoming a basic programming environment. The shell can now invoke programs... or treat text files as programs...

A compiler is just another program. It takes input from one (usually more) files, and produces a new file for output. By design the C compiler is actually composed of multiple programs. A preprocessor (which takes the file to translate, looks for macros, expands them, combines include files...) passes its output to a translator (which itself can be multiple phases performing translation, optimization, and code generation), eventually producing an object file. By default the result is then passed to a linker to produce an executable.
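
The separate phases can be seen by running gcc one step at a time on a trivial program (a sketch; a plain "gcc hello.c" does all of this in one go):

Code:

/* hello.c - used to illustrate the phases gcc goes through:

     gcc -E hello.c -o hello.i    # preprocess only (expand #include, macros)
     gcc -S hello.i -o hello.s    # translate to assembly
     gcc -c hello.s -o hello.o    # assemble into an object file
     gcc hello.o -o hello         # link against libc, producing ./hello
*/
#include <stdio.h>

int main(void)
{
    printf("hello\n");
    return 0;
}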

Quote:

When I compile a C program from the shell I use the compiler gcc, so I 'translate' the C language into an executable...
1) Why can't the shell run programs written in C code directly?
Right now I run the compiled code from the shell using ./a.out, and I can also pass some data via argc and *argv[].
Let's assume I want to run the function open() from the shell in order to open a file; how can I do it?
The shell interpreter doesn't use "C" for the command language (it would be too awkward for commands). The "everything is a file" idea comes in because the shell controls three default files - standard input, standard output, and standard error. For most things, all you need to do is "redirect" input to a pre-existing program from something else. The standard Linux shell is "bash" - a tutorial is available at http://www.tldp.org/LDP/Bash-Beginners-Guide/html/
Quote:


2) glibc (or libc) should already be built at startup, so how can I access the open() function from stdio.c?
As stated, stdio.c doesn't exist on your system. stdio.h is an include file for the C compiler that defines how the standard I/O functions of the C library are to be used.

A tutorial on Linux is available at:
http://www.ee.surrey.ac.uk/Teaching/Unix/

This will give you a more complete understanding of how things work.

Claudio_Baldo 09-10-2015 06:27 AM

That makes things clearer.
I only have one point which is not 100% clear yet.
If stdio.c does not exist, where is the original code for the C functions stored?
If I invoke the open() function in C, the code for it must be stored somewhere; otherwise how could it be compiled?

jpollard 09-10-2015 07:34 AM

It happens to be in /lib/libc.so... or in /lib64/libc.so...

And until the executable actually tries to run... it doesn't have to exist at all.

Remember: a compiler only knows about symbols and syntax. It doesn't have anything to do with actual executables. The linker is what combines a file containing undefined symbols with definitions from other object files, and with references to other code.
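
A tiny sketch of that division of labour (the file and function names are just for illustration):

Code:

/* linker_demo.c - the compile/link distinction in miniature.
   helper() is only declared here; its body would live in another
   (hypothetical) file, other.c.

     gcc -c linker_demo.c              # succeeds: linker_demo.o simply lists
                                       #  "helper" as an undefined symbol
     gcc linker_demo.o -o demo         # fails: undefined reference to `helper'
     gcc linker_demo.o other.o -o demo # succeeds once other.o defines helper()
*/
int helper(int x);        /* declaration only - enough for the compiler */

int main(void)
{
    return helper(41);
}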

suicidaleggroll 09-10-2015 11:09 AM

Quote:

Originally Posted by claudioba@aruba.it (Post 5418440)
If I invoke the open() function in C, the code for it must be stored somewhere; otherwise how could it be compiled?

It's already been compiled, by another person on another computer. The pre-compiled binary blob (called a shared object library) that contains open() is at /lib/libc.so (or /lib64), as jpollard mentioned, and it gets pulled in whenever a program that needs it is run. This is called dynamic linking. Your executable does not contain every function that it requires to run; it only contains the "unique" code. Standard functions that your program calls are pulled in dynamically from the system libraries in /lib or /lib64 when needed.

If you run "ldd <executable>", replacing <executable> with the name of your executable, it will tell you which system libraries are required by your executable, which will be pulled in when you run it.

You can build your own shared object libraries as well. Say you write a function that's terrific, and you want to use it all the time in all sorts of projects. You have two options:

1) Copy the source code over to every project you want to use it in and build it into each project's executable. If you find a bug or want to improve some aspect of your function, you need to update this source code file in every project and rebuild every project in order to update them.

2) Build your source code into a shared object library, and let your other projects dynamically link to it. If you find a bug or want to improve some aspect of your function, just rebuild the shared object library, and every project that uses it will instantly be using the newest version.
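
A minimal sketch of option 2, with both files shown together for brevity (all file and function names here are made up):

Code:

/* terrific.c - the reusable function, built into a shared library:

     gcc -Wall -fPIC -c terrific.c            # position-independent object code
     gcc -shared -o libterrific.so terrific.o
*/
int terrific(int x)
{
    return x * 2 + 1;
}

/* use_terrific.c - a separate project that links against that library:

     gcc -Wall use_terrific.c -L. -lterrific -o use_terrific
     LD_LIBRARY_PATH=. ./use_terrific         # so ld.so can find libterrific.so
*/
#include <stdio.h>

int terrific(int x);      /* in a real project this would come from terrific.h */

int main(void)
{
    printf("%d\n", terrific(20));
    return 0;
}

Rebuilding libterrific.so after a fix updates every program that dynamically links against it, without relinking any of them.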

jpollard 09-10-2015 02:07 PM

Quote:

Originally Posted by suicidaleggroll (Post 5418540)
You can build your own shared object libraries as well. Say you write a function that's terrific, and you want to use it all the time in all sorts of projects. You have two options:

1) Copy the source code over to every project you want to use it in and build it into each project's executable. If you find a bug or want to improve some aspect of your function, you need to update this source code file in every project and rebuild every project in order to update them.

2) Build your source code into a shared object library, and let your other projects dynamically link to it. If you find a bug or want to improve some aspect of your function, just rebuild the shared object library, and every project that uses it will instantly be using the newest version.

3) Build your source code into a static object library (a ".a" archive); the linker then copies only the needed binary code into each executable that uses it...

Claudio_Baldo 09-10-2015 11:12 PM

Thank you to both of you.

