[SOLVED] I'm confused as to what is meant by 12-bit offset in 32-bit intel processors...
Intel 32-bit processors use a 4096-byte page size and word size = 32 bits. Which means each instruction is 32 bits or 4 bytes.
Take this memory location in hex
Code:
0x47F0Fxxx
From what I understand, the last 3 digits represent the 12-bit offset from the base register. What does that mean exactly? In hexadecimal, according to this
Hex/Oct/whatever is just another way of writing a value without using all the 1s and 0s, to save space. Hex F = binary 1111.
Code:
01010101 01010101 01010101 01010101
----------------------++++ ++++++++ <- Last 12 bits
2^12 = 4096 = 4K.
If the 32-bit value addresses memory, you could define the start of each 4K page as being where the last 12 bits are all 0. Any other value stored in/interpreted from those bits would be a location within that page, while we're retaining the page number/reference in the upper 20 bits (bits 12-31).
Taking this original full number and doing a bitwise AND with 0x00000FFF gives you a 32-bit value which, interpreted as a whole, is a number somewhere between 0 and 4095, and it was found in the low 12 bits (counting from the least significant bit) of the full 32-bit value.
Quote:
If the 32-bit value addresses memory, you could define the start of each 4K page as being where the last 12 bits are all 0. Any other value stored in/interpreted from those bits would be a location within that page, while retaining the page number/reference in the upper 20 bits (bits 12-31).
Taking this original full number and doing a bitwise AND with 0x00000FFF gives you a 32-bit value which could be interpreted as a number somewhere between 0 and 4095, found in the low 12 bits (from the least significant bit) of the full 32-bit value.
Yes, that is definitely correct. Just remember that the offset is the number used to find your location within memory, so when programming and dealing with offsets, you don't want to be even a digit off.
Quote:
word size = 32 bits. Which means each instruction is 32 bits or 4 bytes.
No, it does not mean that.
Instructions are variable length. Some instructions are just one byte long. Many are far more than four bytes long (I forget the rules for the absolute maximum instruction length).
The "word size" in much older simpler CPU architectures was very significant and determined the sizes of many different things in the CPU design. In Intel x86, the "32-bit" has limited significance. Many registers, data paths, address sizes, etc. are larger than 32 bits in 32-bit x86.
Quote:
Take this memory location in hex
Code:
0x47F0Fxxx
From what I understand that last 3 digits represent the 12-bit offset from the base register. What does that mean exactly?
I don't think the thing you are talking about is called a "base register".
In the mode of 32-bit x86 used by Linux and Windows, virtual addresses are 32 bits and every virtual address is translated to a physical address by the CPU. In that translation, the bottom 12 bits are not changed (the bottom 12 bits of the physical address match the bottom 12 bits of the virtual address).
So you can view the virtual address as consisting of 20 bits that select which page is addressed and 12 bits which select which byte within the page.
Quote:
The last three digits can be anywhere from 0-255 bits or 32 bytes.
What do they mean by a 12-bit offset?
I'm not sure what you think the above means. Maybe Proud already cleared up that part of your confusion (see post #2 in this thread). Three hex digits encode 4096 (decimal) different values. As the low 12 bits of an address, 3 hex digits encode a byte position of 0 to 4095 within a page.
In ordinary addressing, bytes are addressed, not bits. So 256 bits is just 32 bytes, but that fact is not directly relevant to ordinary addressing.
32-bit x86 architecture also supports segmented addressing that does involve "base" registers combined with offsets. But those offsets are not 12-bit, and that type of addressing is not significantly used in Linux or 32-bit Windows. So you seem to be combining some terminology and concepts from segmented addressing into a question about the non-segmented addressing Linux uses.
Ok, I think I'm understanding it better. The first 5 hex values represent a given Virtual Page.
0x47F0F000 - 0x47F0FFFF is one whole page = 4095 bytes. So the last 3 FFFs are where in that Virtual Page we are.
So the last 3 at FFF represent 4095 in decimal?
The purpose of the 12-bit offset? To make sure everything lines up so you won't be off by one bit? 2^12 = 4096 does not imply 4096/12
Quote:
0x47F0F000 - 0x47F0FFFF is one whole page = 4095 bytes.
4096 bytes. You forgot one byte - either first or last one.
General rule - if zero-based array (starts at index 0) has N elements, then index of last element is N-1.
Quote:
The purpose of the 12-bit offset? To make sure everything lines up so you won't be off by one bit? 2^12 = 4096 does not imply 4096/12
What strange meaning are you giving to the phrase "12-bit offset"?
A reasonable meaning is that there is an offset (which in this case refers to the byte position within a page) and 12 bits of the address are used to encode that offset.
12 bits is the size of the encoding of the offset. You seem to be treating it as a fixed amount of "offset". Even so, I don't understand what you expect would be divisible by 12.
The word "offset" in computer programming has a very general meaning. To know what it means in a specific context, you need to understand the context. Similarly, a lot of the words around "offset" must be interpreted either through context or reasonableness.
Some data structure might include a "500 byte offset". In that case reasonableness tells us the direct magnitude of the "offset" is 500 bytes. No one would use 500 bytes to encode an offset.
But when you talk about a "12 bit offset" mere reasonableness can't tell you whether that describes a fixed tiny offset in some bit packed data structure vs. the size of the encoding of a variable offset. You need context.
So here, 0x80497bf: in this case, is the page number 0x8049, a 16-bit page number, with 7bf as the 12-bit offset? That would be a 28-bit address. Addresses are normally 32-bit, no? Usually the page number is 5 hex digits instead of 4. I guess pages in lower memory are 28-bit? Or is it because 0x8049 is really 0x08049, in which case it would be a 20-bit page number and 7bf a 12-bit offset, which would give us a 32-bit address. Each hex digit is 4 bits, correct?