LinuxQuestions.org


ooff 05-12-2012 08:33 PM

mmap questions
 
Hi,

I'm working with a legacy system that recently started dumping core. Running strace on it, I found the following in the trace:

1. [pid xxx] mmap(NULL, 20480, PROT_READ|PROT_WRITE, MAP_SHARED, 116, 0x799300) = 0x2a955d700
2. [pid xxx] munmap(0x2a955d700, 20480) = 0
3. [pid xxx] mmap(NULL, 20480, PROT_READ|PROT_WRITE, MAP_SHARED, 116, 0x799400) = 0x2a955d700
4. [pid xxx] munmap(0x2a955d700, 20480) = 0
5. [pid xxx] mmap(NULL, 20480, PROT_READ|PROT_WRITE, MAP_SHARED, 116, 0x799800) = 0x2a955d700
6. [pid xxx] munmap(0x2a955d700, 20480) = 0
7. [pid xxx] mmap(NULL, 20480, PROT_READ|PROT_WRITE, MAP_SHARED, 116, 0x4eef000) = 0x2a955d700
8. [pid xxx] mmap(NULL, 18446744071562084352, PROT_READ|PROT_WRTITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 EINVAL (Invalid argument)
9. [pid xxx] mmap(NULL, 18446744071562084352, PROT_READ|PROT_WRTITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 EINVAL (Invalid argument)
10. [pid xxx] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2a955dc00

My questions are as follows:
1. Does it look right for line 3 above to request mmap with offset 0x799400? Line 1 asks for an mmap at offset 0x799300 with length 20480, so the first mapping should cover offset 0x799300 (which is 127479808 decimal) through 0x799300 + 20480 (decimal) => 0x7998000. I would expect line 5 to be the next mmap directly following line 1. Why does line 3 mmap something in the middle again? The trace file shows a lot of mmap calls in this pattern, i.e. there is always an mmap call between one offset and offset + len.

(1.a As a side note to question 1 above, it seems 0x799300 got converted to 127479808 decimal on a 64-bit Windows Calculator but 7967488 on a 32-bit one. I guess it overflows the 32-bit one.)

2. Does it look right for line 7 above without munmap? I know sometimes munmap isn't required. But given that lines 2, 4 and 6 all have munmap, I don't know why line 7 doesn't have a matching munmap.

3. Lines 8 and 9 are definitely the problem. The length argument is way too large. But what are lines 8 and 9 trying to do with MAP_PRIVATE|MAP_ANONYMOUS? Is this just trying to allocate more memory, which has nothing to do with fd 116?

Thanks

Nominal Animal 05-13-2012 10:39 AM

Quote:

Originally Posted by ooff (Post 4676961)
As a side note to question 1 above, it seems 0x799300 got converted to 127479808 decimal on a 64-bit Windows Calculator but 7967488 on a 32-bit one.

No, you made a typo. 0x799300 = 7967488, 0x7993000 = 127479808.
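
The two conversions are easy to check. A minimal sketch in Python (the variable names are only illustrative), which also adds the 20480-byte mapping length to each candidate offset:

```python
as_posted = 0x799300         # offset as written in the trace
with_extra_zero = 0x7993000  # offset with one more trailing zero

print(as_posted)             # 7967488
print(with_extra_zero)       # 127479808

# End of a 20480-byte (0x5000) mapping starting at each offset.
end_as_posted = hex(as_posted + 20480)
end_with_extra_zero = hex(with_extra_zero + 20480)
print(end_as_posted)         # 0x79e300
print(end_with_extra_zero)   # 0x7998000
```

Only the longer value gives the 0x7998000 end offset quoted in the question, which suggests (but does not prove) that the calculation there was done with 0x7993000.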

Quote:

Originally Posted by ooff (Post 4676961)
8. [pid xxx] mmap(NULL, 18446744071562084352, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 EINVAL (Invalid argument)

0xffffffff80004000 = 18446744071562084352. Looks like a bug in the size calculation. Note that you made a typo there too: it's PROT_WRITE, not PROT_WRTITE. Be more careful, please.
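
To see why this length points at a calculation bug, it helps to read the same 64 bits as a signed number. A small sketch using Python's standard `struct` module (the variable names are illustrative):

```python
import struct

length = 18446744071562084352  # length argument from lines 8 and 9 of the trace
length_hex = hex(length)
print(length_hex)              # 0xffffffff80004000

# Reinterpret the same 64 bits as a signed integer.
(signed,) = struct.unpack("<q", struct.pack("<Q", length))
print(signed)                  # -2147467264
print(signed == -(2**31 - 16384))
```

Read as a signed value, the requested length is negative, close to -2^31, which is what a signed 32-bit result would look like after being widened to 64 bits.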

Quote:

Originally Posted by ooff (Post 4676961)
The trace file shows a lot of mmap calls in this pattern, i.e. there is always an mmap call between one offset and offset + len.

It looks like Fortran I/O. The mmap is a sliding window. Because the entire record (or line) must be accessible in the window, the mmap is moved in increments smaller than the size.
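
A sliding window of this kind can be sketched as follows. This is only an illustration of the pattern, not the runtime's actual code: the helper `sliding_windows`, the one-page step, and the temporary test file are all assumptions, and the `flags`/`prot` arguments are Unix-only.

```python
import mmap
import tempfile

PAGE = mmap.ALLOCATIONGRANULARITY   # mmap offsets must be multiples of this
WINDOW = 5 * PAGE                   # 20480 bytes when pages are 4096 bytes
STEP = PAGE                         # hypothetical step, smaller than WINDOW

def sliding_windows(fd, file_size, window=WINDOW, step=STEP):
    """Yield (offset, first 4 bytes) for each position of the window."""
    offset = 0
    while offset < file_size:
        length = min(window, file_size - offset)
        # Map, read, and unmap again, like the mmap/munmap pairs in the trace.
        with mmap.mmap(fd, length, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ, offset=offset) as view:
            head = view[:4]         # copy out before the mapping is closed
        yield offset, head
        offset += step

with tempfile.TemporaryFile() as f:
    # Eight pages, each filled with its own page index.
    for i in range(8):
        f.write(bytes([i]) * PAGE)
    f.flush()
    seen = list(sliding_windows(f.fileno(), 8 * PAGE))

print([(hex(off), head) for off, head in seen])
```

Because the step is smaller than the window, consecutive mappings overlap, so any record shorter than the overlap is always fully visible in at least one window.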

Quote:

Originally Posted by ooff (Post 4676961)
Does it look right for line 3 above to request mmap with offset 0x799400?

Impossible to say, because it depends completely on the application. It is certainly stupid (inefficient, inelegant), but other than that, there is nothing wrong with it.

One reason for such a short offset adjustment would be alignment. A library would prefer to keep the offsets aligned, so that the kernel can just map the pages without extra copying. On the other hand, it's only aligned to 1024 bytes (0x400), which does not help at all when page size is 4096 bytes (0x1000), so that alignment adjustment would make no difference anyway.
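
The alignment of the posted offsets can be checked directly. A minimal sketch, assuming a 4096-byte page; `align_down` is a hypothetical helper showing how a library would typically round a file position down to a page boundary before mapping:

```python
PAGE_SIZE = 4096  # assumed page size, as in the paragraph above

offsets = [0x799300, 0x799400, 0x799800, 0x4eef000]  # as posted in the trace

# Which offsets are multiples of the page size?
page_aligned = {hex(off): off % PAGE_SIZE == 0 for off in offsets}
print(page_aligned)

def align_down(offset, alignment=PAGE_SIZE):
    """Round offset down to a multiple of alignment (a power of two)."""
    return offset & ~(alignment - 1)

base = align_down(0x799300)
skip = 0x799300 - base   # bytes to skip inside the mapping
print(hex(base), hex(skip))
```

Only 0x4eef000 is page-aligned. The mmap(2) man page states that the offset must be a multiple of the page size, so the other three offsets may well have lost a digit when the trace was retyped.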

Quote:

Originally Posted by ooff (Post 4676961)
Does it look right for line 7 above without munmap?

No, but the next mmap() is clearly b0rked, with its negative size. I'd say there was a bug in the application near line 7. On 32-bit architectures the bug could be accidentally papered over; have you just moved to a 64-bit architecture?

Quote:

Originally Posted by ooff (Post 4676961)
But what are lines 8 and 9 trying to do with MAP_PRIVATE|MAP_ANONYMOUS? Is this just trying to allocate more memory, which has nothing to do with fd 116?

When MAP_ANONYMOUS is used, both the fd and the offset are ignored. So yes, this looks like it is trying to allocate more memory. Because the obvious bug must have occurred earlier already, I wouldn't wonder too much about this; after a bug, code may do weird things.
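
For comparison, this is roughly what a successful anonymous mapping like line 10 of the trace looks like from user code. A small sketch using Python's `mmap` module on a Unix system; the 4096-byte size is taken from the trace, everything else is illustrative:

```python
import mmap

# fd -1 and no offset: the memory is not backed by any file.
buf = mmap.mmap(-1, 4096,
                flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                prot=mmap.PROT_READ | mmap.PROT_WRITE)

buf[:5] = b"hello"       # plain read/write memory
first = bytes(buf[:5])
rest = buf[5]            # anonymous pages start out zero-filled
size = len(buf)
buf.close()

print(first, rest, size)
```

This is the same mechanism memory allocators use for large blocks, which fits the reading that lines 8 to 10 are allocation attempts rather than file I/O.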

On the other hand, I seem to recall Fortran does temporary allocations related to I/O using the file descriptor, so these could be temporary array allocations, but with invalid size.

Note that -16384 = 0xFFFFC000 as a signed 32-bit integer. When you clear its sign bit, negate the result, and sign-extend to 64 bits (-(-16384 & 0x7FFFFFFF)), you get 0xffffffff80004000 = 18446744071562084352.
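
The arithmetic can be replayed step by step. A sketch in Python, where integers are unbounded, so the 32-bit and 64-bit widths are emulated with explicit masks (the helper `to_signed32` is illustrative):

```python
def to_signed32(x):
    """Interpret the low 32 bits of x as a signed 32-bit integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

value = -16384
bits32 = value & 0xFFFFFFFF                # 32-bit pattern of -16384
print(hex(bits32))                         # 0xffffc000

masked = value & 0x7FFFFFFF                # sign bit cleared
print(hex(masked))                         # 0x7fffc000

negated = to_signed32(-masked)             # negate, still a 32-bit value
print(negated)                             # -2147467264

as_size = negated & 0xFFFFFFFFFFFFFFFF     # sign-extended, read as a 64-bit size
print(hex(as_size), as_size)               # 0xffffffff80004000 18446744071562084352
```

The final value is exactly the length seen in lines 8 and 9 of the trace.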

So, if this is Fortran, and you have access to the source code, check that the size and offset calculations for I/O are done using (temporary) integers of sufficient size (INTEGER*8 or with KIND=SELECTED_INT_KIND(18)); especially look for expressions that use negation or subtraction.

