LinuxQuestions.org


caprice-j 02-06-2017 08:32 AM

Number of Virtual Memory Layer
 
I heard that Linux can be configured to use multi-layer virtual memory from 1 to 4. Does anyone know a particular distribution where 4-layer virtual memory is used? Or is it common to use full four layers?

I only know that a specific mobile device uses 3-layer virtual memory in Android Kernel.

jpollard 02-06-2017 05:11 PM

Depends on the processor. Linux normally uses 3 layers... with a fourth value being the offset into the selected page.

https://www.kernel.org/doc/gorman/ht...rstand006.html

Jjanel 02-07-2017 05:23 AM

Might the word here be: 'level' (vs, 'layer')? (I'm a very-not-sure-Newbie here!)
I found references to PML4, like [2004 2.6! 64bit change]: http://lwn.net/Articles/106177
I wonder IF your "3 ... Android" refers to ['older'] 32bit/PAE (LONG story here) [?]

jpollard 02-07-2017 06:10 AM

Perhaps this would be a better reference: http://rayseyfarth.com/asm/pdf/ch04-memory-mapping.pdf

The fourth layer reference there is to the process page table (an index).

sundialsvcs 02-07-2017 08:09 AM

In all virtual-memory architectures, the virtual address is split into several different fields, accessing a "tree" of related data structures that are maintained by the kernel. Any of these structures may be marked "missing," triggering a page fault (or a memory protection exception). This exception stops the process from executing and transfers control to the operating system. When the operating system resolves the issue and transfers control back to user-land, the interrupted process re-tries (or, resumes) the instruction.
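As a rough illustration of the "split into several different fields" idea, here is a minimal Python sketch. It assumes the common x86_64 layout with 4 levels, 9 index bits per level and a 12-bit page offset (4 KiB pages); the function name and constants are made up for this example and other architectures use different widths.

```python
# Assumed layout (x86_64, 4-level paging, 4 KiB pages):
# 9 bits of table index per level, 12 bits of offset within the page.
LEVEL_BITS = 9
OFFSET_BITS = 12
LEVELS = 4

def split_virtual_address(vaddr, levels=LEVELS):
    """Return ([top-level index, ..., last-level index], page offset)."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    indices = []
    # Walk from the highest-order field down to the lowest-order one.
    for level in range(levels - 1, -1, -1):
        shift = OFFSET_BITS + level * LEVEL_BITS
        indices.append((vaddr >> shift) & ((1 << LEVEL_BITS) - 1))
    return indices, offset

# Build an address whose fields are easy to recognise.
vaddr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x5
indices, offset = split_virtual_address(vaddr)
print(indices, offset)  # prints: [1, 2, 3, 4] 5
```

Each index selects one entry in the table for that level, and the final offset is the "fourth value" (or fifth, with four levels) that picks the byte inside the selected page.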

To save time, processors also use some kind of "translation lookaside buffer (TLB)" (the old IBM mainframe term, which stuck ...) to instantly resolve recently-used virtual addresses without looking them up in the virtual-memory tables. Privileged instructions are used to invalidate all or part of the TLB entries when the underlying tables are changed.

A TLB is a parallel memory caching circuit, built into the CPU chip: every "bucket" in the cache is literally checked at the same instant.

jpollard 02-07-2017 09:31 AM

Think of the page tables as a radix tree search. The TLB is a cache of recently used translations, with no duplicate entries.

The page tables and the TLB are searched in parallel - and whichever one identifies the entry first is used. A TLB hit terminates the page table search (which is MUCH slower). But that means that the TLB isn't necessarily consistent with the page tables when the page tables change - thus the TLB has to be "invalidated" by the OS to force the correct values to be identified from the page tables. At that point, the TLB gets the new entry.

Since the TLB has a limited number of entries, a new one will replace an invalidated entry, or the least recently used entry.
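The replacement and invalidation behaviour described above can be sketched as a toy model in Python. This is only an illustration: the class name `TinyTLB`, its methods and the strict LRU policy are assumptions for the example, and real processors differ in size, associativity and replacement policy.

```python
from collections import OrderedDict

class TinyTLB:
    """Toy TLB: maps virtual page number (vpn) -> physical frame number (pfn)."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest (least recently used) entry first

    def lookup(self, vpn):
        """Return the cached pfn on a hit, or None on a miss."""
        if vpn in self.entries:
            self.entries.move_to_end(vpn)  # mark as most recently used
            return self.entries[vpn]
        return None

    def insert(self, vpn, pfn):
        """Cache a translation, evicting the least recently used one if full."""
        if vpn in self.entries:
            self.entries.move_to_end(vpn)
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        self.entries[vpn] = pfn

    def invalidate(self, vpn=None):
        """Drop one entry, or all entries when no vpn is given."""
        if vpn is None:
            self.entries.clear()
        else:
            self.entries.pop(vpn, None)

tlb = TinyTLB(capacity=2)
tlb.insert(1, 100)
tlb.insert(2, 200)
tlb.lookup(1)        # touch page 1, so page 2 is now least recently used
tlb.insert(3, 300)   # TLB is full: page 2 is evicted
tlb.invalidate(1)    # the OS changed the mapping for page 1
```

After the last line only the entry for page 3 remains, and the slot freed by the invalidation is available for the next new translation.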

caprice-j 02-08-2017 09:27 AM

Thank you for providing useful resources and comments.

To Jjanel,
Yes, you are right. What I meant was level.
And the document in the mm.txt file is nice -- I didn't know x86_64 itself has a 4-level structure (I tweaked ARM before).

To jpollard,
Thanks! So three levels are common.
Those slides are clearly explained, thanks.

And to sundialsvcs,

Wow, TLB was a term coined by IBM?
I learned about TLB in my OS class (though I wasn't aware that TLB itself can be multi-level like in the slides jpollard posted).

Thank you all.

sundialsvcs 02-08-2017 07:35 PM

I think that "TLB = Translation Lookaside Buffer" is an original-IBM term. (I still find it in my original "POP = Principles of Operation" manual which was ... koff koff ... printed on a line printer.)

... if you ... ("bah! humbug! these kids today!" ... ;)) even are aware of what "a line printer" even is ...

("koff, koff™ ...")

"But, I Digress.™" :D

TLBs are not "multi-level"! By design, they are parallel! Why? "Because they specifically exist to allow the CPU to avoid(!) a multi-level search!" :eek:

Here's the TLB's objective: "Here's the Question. Do you have (in one clock-cycle) the Answer?"

In all hardware architectures (however big or small ...), you will find this very-key component. It consists of a certain number of elements of so-called associative memory. In one clock-cycle, the virtual-address of interest is presented to all of the memory-elements at once:
  • If any of them has a "hit," it responds, and the otherwise-laborious process of searching through however-many "page and segment tables" (again, an IBM-hardware term ...) is averted.
  • If a "TLB Miss" occurs, the operating system will (laboriously ...) "resolve the issue," then it will furnish "the right answer" to the TLB, so that the entire question can be circumvented "next time."
Yeah, "hardware engineers" – yes, even now – face the same basic problems, and they solve them in the same basic ways, :) whether they're solving them as they did fifty years ago, or today with wondrous microchips.

jpollard 02-09-2017 05:54 AM

Most TLB misses are handled by the hardware by continuing the page table walk. It isn't an OS issue as that would be uselessly slow (I have used one that did that... the PDP-10, for all intents the associative memory had to be big enough to map the entire memory... or your speed was about that of a PDP-11/34)

ref: https://en.wikipedia.org/wiki/Transl...okaside_buffer

sundialsvcs 02-09-2017 08:29 AM

Quote:

Originally Posted by jpollard (Post 5667701)
Most TLB misses are handled by the hardware by continuing the page table walk. It isn't an OS issue as that would be uselessly slow (I have used one that did that... the PDP-10, for all intents the associative memory had to be big enough to map the entire memory... or your speed was about that of a PDP-11/34)

ref: https://en.wikipedia.org/wiki/Transl...okaside_buffer

Yes, both the TLB lookaside and the page-table walk are handled by the hardware. Fetching an address given a four-tier set of tables could result in four memory reads just to resolve the address. Or, one TLB hit. Another trick that is used is to make the memory data-bus very wide, so that if (as expected) the next request for data is "at the next consecutive location," it will already be on-board the chip.
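To make the "four memory reads, or one TLB hit" point concrete, here is a small Python sketch that counts table reads. It is a simplified model under stated assumptions: nested dicts stand in for a four-tier set of page tables, a plain dict stands in for the TLB, and the 9/9/9/9/12 bit split is the x86_64 layout. All names are hypothetical.

```python
LEVEL_BITS, OFFSET_BITS, LEVELS = 9, 12, 4

def walk(root, vaddr, counter):
    """Walk nested dicts standing in for page tables, one read per level."""
    node = root
    for level in range(LEVELS - 1, -1, -1):
        shift = OFFSET_BITS + level * LEVEL_BITS
        index = (vaddr >> shift) & ((1 << LEVEL_BITS) - 1)
        counter["reads"] += 1          # each level costs one memory read
        if index not in node:
            raise LookupError("page fault: entry marked missing")
        node = node[index]
    return node                        # the physical frame number

def translate(root, tlb, vaddr, counter):
    """Translate via the TLB if possible, otherwise walk and cache the result."""
    vpn = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    if vpn not in tlb:
        tlb[vpn] = walk(root, vaddr, counter)
    return (tlb[vpn] << OFFSET_BITS) | offset

root = {1: {2: {3: {4: 0x42}}}}        # one mapped page, frame 0x42
vaddr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x5
counter = {"reads": 0}
tlb = {}

first = translate(root, tlb, vaddr, counter)    # TLB miss: 4 table reads
reads_after_miss = counter["reads"]
second = translate(root, tlb, vaddr, counter)   # TLB hit: no table reads
reads_after_hit = counter["reads"]
```

In this model `reads_after_miss` is 4 and `reads_after_hit` is still 4, since the second translation is answered from the TLB without touching the tables.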

Ahh, yes ... the PDP. :)

jpollard 02-09-2017 10:13 AM

Not just four memory reads... it could involve another page fault where the page tables themselves may be paged out. Doesn't happen often, but for large memory machines it could.

