Linux - Newbie
This Linux forum is for members that are new to Linux.
Just starting out and have a question?
If it is not in the man pages or the how-to's this is the place!
I have a question that's been bothering me for a while.
We have been taught at uni that after compilation (so even in a .o file) there are
assembly instructions for a specific processor.
Also: the addresses of the instructions should be absolute? I am not talking about dynamic libraries or stubs, as those are inserted during program loading or during execution.
But how come the instruction addresses can be absolute? What happens if there is different code at that address? I feel like all programs should be PIC.
It would be of great help if some of you would answer my question.
The addresses referred to are the locations of functions, etc., within the compiled code. In other words, once the program is loaded into memory, those specific instructions are at a fixed offset from the beginning of the code. How else do you think the program would be able to access its own instructions and know where they are located?
What do you think would happen if the code were to be relocated to a different address while the app was running?
The details depend on the CPU architecture. All architectures can specify absolute addresses. Most can also specify addresses relative to a base address (offsets may be limited to 8, 16, or 32 bits). The linker and loader can relocate absolute addresses as needed to make an executable run at a specific address.
Ed
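To make the difference between an absolute operand and a relative one concrete, here is a toy sketch in Python. It is not how any real CPU encodes jumps (for instance, real relative jumps are often measured from the following instruction), and every name and address in it is made up for illustration. It only shows that an offset baked into the code keeps working when the code is loaded elsewhere, while a baked-in absolute address does not unless something patches it.

```python
# Toy model: a "jump" reaches its target either through an absolute
# address or through an offset from its own location. All names and
# addresses here are hypothetical.

LINK_BASE = 0x1000      # where the linker assumed the code would live
JUMP_OFFSET = 0x10      # position of the jump inside the code
TARGET_OFFSET = 0x40    # position of the jump target inside the code

# Operands as they would be baked into the code at link time.
abs_operand = LINK_BASE + TARGET_OFFSET      # 0x1040
rel_operand = TARGET_OFFSET - JUMP_OFFSET    # 0x30


def check(load_base):
    """Return (absolute_ok, relative_ok) for code loaded at load_base."""
    real_target = load_base + TARGET_OFFSET
    jump_address = load_base + JUMP_OFFSET
    absolute_ok = abs_operand == real_target
    relative_ok = jump_address + rel_operand == real_target
    return absolute_ok, relative_ok


print(check(0x1000))  # loaded where the linker assumed
print(check(0x7000))  # loaded somewhere else, without patching
```

In the second call only the relative form still finds the target, which is why an absolute operand needs the linker or loader to relocate it.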
The addresses created by the compiler are virtual. These are then translated on a page-by-page basis into physical memory addresses. The kernel maintains translation tables (page tables) that describe the mapping, and the CPU's memory management unit uses them to do the translation. But as far as the program code is concerned, the virtual addresses are perfectly real.
PIC, as used by libraries, is different in that these addresses are relative even by the program's own standards. It must be so because the same library can end up at a different virtual address in each program that uses it, and it has to run the same way wherever it lands.
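As a rough illustration of page-by-page translation, here is a toy sketch in Python. The 4 KiB page size, the table contents, and the function name are all assumptions for the example; a real page table is a multi-level structure walked by hardware, not a dictionary.

```python
PAGE_SIZE = 4096  # assumed page size; 4 KiB is common but not universal


def translate(page_table, virtual_address):
    """Map a virtual address to a physical one through a toy page table."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        # A real CPU would raise a page fault for the kernel to handle.
        raise LookupError(f"page fault at {virtual_address:#x}")
    return page_table[page] * PAGE_SIZE + offset


# Hypothetical table: virtual page number -> physical frame number.
table = {0x400: 0x12, 0x401: 0x9A}

print(hex(translate(table, 0x400123)))  # offset 0x123 inside frame 0x12
```

Only the page number is translated. The offset inside the page is carried over unchanged, which is why the program never notices which physical frame it got.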
So let's say one program's code virtual address is mapped to a physical address and then to memory.
But if I try to load another program with a similar (or the same) virtual address as the first one, then the mapping tables should see these virtual addresses are already mapped, right? So there might be some kind of collision. Or do the mapping tables have some workaround when two processes have the same virtual addresses?
Quote:
So let's say one program's code virtual address is mapped to a physical address and then to memory.
But if I try to load another program with a similar (or the same) virtual address as the first one, then the mapping tables should see these virtual addresses are already mapped, right?
Exactly! If the contents of a page are already in physical memory (say because it's a function in a common library and another program is running it concurrently), then the virtual page will be mapped directly to that physical page. So one physical page might correspond to several virtual pages in different programs.
Quote:
So there might be some kind of collision. Or do the mapping tables have some workaround when two processes have the same virtual addresses?
Each process has its own translation table, so two processes using the same virtual address do not collide by themselves. For every type of page, there are kernel functions that handle "page faults", such as not finding the virtual page in the translation table. If the page is code, the handler for such a fault checks whether the content of the page is already in memory as part of another program. If so, there's no need to copy it a second time. A translation table entry can be created for the virtual page, pointing to the existing physical memory page.
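Here is a small Python sketch of two processes, each with its own table. All page and frame numbers are invented, and the dictionaries only stand in for real page tables. It shows two things: the same virtual address can lead to different physical memory in each process, and one physical page (say, library code) can appear at different virtual addresses in each.

```python
PAGE_SIZE = 4096  # assumed page size


def translate(page_table, virtual_address):
    """Translate through one process's own toy page table."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset


# Hypothetical tables: virtual page number -> physical frame number.
# Both programs start at virtual page 0x400 but get different frames.
# Frame 0x55 holds shared library code, mapped at a different virtual
# page in each process.
process_a = {0x400: 0x12, 0x7F0: 0x55}
process_b = {0x400: 0x30, 0x7E8: 0x55}

# Same virtual address, different physical memory: no collision.
print(hex(translate(process_a, 0x400010)), hex(translate(process_b, 0x400010)))

# Different virtual addresses, same physical page: one shared copy.
print(hex(translate(process_a, 0x7F0008)), hex(translate(process_b, 0x7E8008)))
```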
But that means that the two programs will have the same virtual addresses if and only if the code is the same (or it is some library function). How would the compiler know such information?
In general, shared objects get loaded at different virtual addresses because the loader does not control a process's virtual address map. The goal is for the kernel to map all instances of a shared object to the same physical pages so that only one copy resides in memory. To do this, the shared object should be compiled with -fPIC. The compiler emits extra instructions (if needed) to make the code run at any virtual address.
Ed
Unfortunately, this discussion wandered almost immediately into error, apparently based on the ambiguous term, "absolute address."
When the compiler generates a ".o" file, that file contains not only binary instructions but also relocation information, which is needed to adjust memory locations within the loaded program image to account for where in memory it was loaded. The file also contains a list of symbols defined by this file and a list of symbols that it needs. The linker combines the object files and resolves these symbols between them. The so-called "loader" is then responsible for bringing all of the required materials into memory, thus deciding their address within the [virtual ...] memory space, and patching up whatever memory addresses still need it.
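A toy Python sketch of that patching step might look like this. The object layout, the field names, and the load() function are all hypothetical and far simpler than a real object format such as ELF. The sketch also leaves out undefined symbols that would have to be resolved against other files.

```python
def load(obj, base):
    """Copy a toy object's words to `base` and apply its relocations."""
    image = list(obj["words"])  # work on a copy; the "file" stays intact
    for index in obj["relocations"]:
        # The stored value is an offset from the start of the object,
        # so adding the chosen base turns it into a usable address.
        image[index] += base
    symbols = {name: base + off for name, off in obj["symbols"].items()}
    return image, symbols


# Hypothetical object: word 1 holds the address of 'helper' (offset 3),
# and the relocation list says "word 1 needs adjusting".
obj = {
    "words": [0xAA, 3, 0xBB, 0xCC],
    "relocations": [1],
    "symbols": {"main": 0, "helper": 3},
}

image, symbols = load(obj, 0x5000)
print([hex(w) for w in image], hex(symbols["helper"]))
```

Loading the same object at a different base gives a different patched word, but the word always agrees with where 'helper' actually ended up.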
It is possible for a symbol to be defined as equal to a fixed address-value, as opposed to being relative to the program unit in which it is defined. This facility is very rarely used.
However, nothing allows you to know anything about "absolute" as in "within physical CPU RAM." All programs run in a virtual memory space and perceive "memory" to be that space ... which it is, for them. Meanwhile, the virtual memory subsystem maps a portion of this (the portion that is currently being used) to physical RAM addresses – which are subject to change at any time. Only the operating system kernel knows what this mapping is at any particular microsecond.
In fact, the virtual memory address that you next reference might not be in physical RAM at all. This will produce a "page fault" interrupt. The operating system will then retrieve the "page" from backing storage (disk) where it had previously saved it, and it is free to put it anywhere in physical memory – any so-called "page frame."
Last edited by sundialsvcs; 12-14-2021 at 08:52 AM.
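The page-fault path described in that post can be sketched as a toy in Python. The class and its fields are invented for illustration, and the sketch leaves out eviction: it assumes a free frame is always available, which a real kernel cannot assume.

```python
class ToyMemory:
    """Resident pages live in frames; the rest sit in backing storage."""

    def __init__(self, frame_count, backing):
        self.free_frames = list(range(frame_count))
        self.backing = dict(backing)  # page -> contents saved on "disk"
        self.page_table = {}          # page -> frame, resident pages only
        self.frames = {}              # frame -> contents
        self.faults = 0

    def read(self, page):
        """Return a page's contents, faulting it in on first use."""
        if page not in self.page_table:     # not resident: page fault
            self.faults += 1
            frame = self.free_frames.pop()  # any free frame will do
            self.frames[frame] = self.backing[page]
            self.page_table[page] = frame
        return self.frames[self.page_table[page]]


mem = ToyMemory(frame_count=4, backing={7: "code page", 9: "data page"})
print(mem.read(7), mem.faults)  # first access faults the page in
print(mem.read(7), mem.faults)  # second access finds it resident
```

The caller of read() never learns which frame was chosen, just as a program never learns which physical page frame holds its data.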
PIC means Position Independent Code; in other words, it is relocatable.
In general it is explained here: https://en.wikipedia.org/wiki/Position-independent_code
In short: non-relocatable code has a fixed address inside the memory of the process.