LinuxQuestions.org
Old 12-20-2014, 10:16 AM   #16
SoftSprocket
Member
 
Registered: Nov 2014
Posts: 399

Rep: Reputation: Disabled

Quote:
Originally Posted by jpollard View Post
Actually not.

malloc has been that way for about 30 years.


The reservation is marked in the process page tables (reserved), but physical allocation is delayed until actual use requires the page.

The problem is caused by allowing oversubscription of memory, which in turn, allows a lot more processes to run than can physically fit in memory.

Such oversubscription has been used in UNIX/Linux ever since UNIX had paging, and since Linux allowed oversubscription (around version 0.99).
I'm curious which other UNIXes behave that way. I've only been programming in the UNIX environment for 25 years (a little longer in others), but Solaris doesn't practise optimistic allocation, and while AIX can be configured to use quite a number of allocation algorithms, I'm not aware of one of them being optimistic. BSD doesn't do it. I don't remember what HP-UX does.

The ISO C standard says malloc returns NULL when there is no memory, so a programmer should check the return value. If Linux has not returned NULL despite the memory not being available, then no harm is done by checking. If NULL has been returned, then you can respond to it.
 
Old 12-20-2014, 10:32 AM   #17
SoftSprocket
Member
 
Registered: Nov 2014
Posts: 399

Rep: Reputation: Disabled
Quote:
Originally Posted by psionl0 View Post
That's the appalling bit. When a program segfaults, it simply prints "Segmentation fault" and dies instantly. You have no way of knowing if it's because it tried to access an illegal memory address or because the system couldn't allocate the memory required (unless you are prepared to go through a massive debugging session).

So is there a lower risk of segfaulting if you use BSS instead of requesting more system memory? As I understand it, system memory is allocated by changing the "program break" via brk() or sbrk(). If in doing so, you bump into another process then that would suggest an instant segfault.

My reason for asking is that malloc() and free() routines are pretty straightforward to write if you have a block in BSS for that purpose (see K&R's ANSI C). You can even write simpler routines if you know that only blocks of a certain size are to be allocated.

As a matter of fact, the malloc() routine can be extremely rudimentary if you know that:
  • You will not be freeing any memory or
  • Memory will be freed in the reverse order to which it was malloc'd or
  • You will be freeing the entire block of memory at once.
There's definitely a question of whether the Linux approach makes sense. Many people think it's broken. Does having the OOM killer choose processes to kill represent an improvement over allowing processes themselves to handle the condition? In any event, when you're out of memory it's hard to say what can be done gracefully.
 
Old 12-20-2014, 12:10 PM   #18
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by psionl0 View Post
That's the appalling bit. When a program segfaults, it simply prints "Segmentation fault" and dies instantly. You have no way of knowing if it's because it tried to access an illegal memory address or because the system couldn't allocate the memory required (unless you are prepared to go through a massive debugging session).
Only if it doesn't set up a signal handler.
Quote:

So is there a lower risk of segfaulting if you use BSS instead of requesting more system memory? As I understand it, system memory is allocated by changing the "program break" via brk() or sbrk(). If in doing so, you bump into another process then that would suggest an instant segfault.
The "bump into another process" doesn't happen. Memory pages are allocated wherever a free page is available. When a free page isn't available: (1) a page is taken from the list of pages that have already been written to disk (swap/mmap) and re-used; (2) if that can't be done, the system trims buffers and uses those; (3) if that also fails, the kernel starts trimming the memory of processes, which forces those pages to be written to disk (and then back to #1). If everything fails, the OOM killer starts aborting processes.

As for the BSS, it depends. The last time I checked (some years ago), the linker actually sets the memory for static arrays. If the array is uninitialized, it is expected to be zeros, and the segment descriptor doesn't actually have any memory allocated (it can even be zero length, and the system is expected to allocate pages as required)... so yes, a segfault COULD occur, but it is extremely unlikely.
Quote:

My reason for asking is that malloc() and free() routines are pretty straightforward to write if you have a block in BSS for that purpose (see K&R's ANSI C). You can even write simpler routines if you know that only blocks of a certain size are to be allocated.

As a matter of fact, the malloc() routine can be extremely rudimentary if you know that:
  • You will not be freeing any memory or
  • Memory will be freed in the reverse order to which it was malloc'd or
  • You will be freeing the entire block of memory at once.
Dynamic memory handling is never quite as simple as it seems.

It CAN be simple - but only if the entire machine is allocated to a single process, with a single executable. In that case you can easily compute how much memory will be available. That is usually only the case with embedded systems.

Once the system becomes multi-tasking (or worse, multi-tasking, multi-process, multi-user), it is no longer simple.

Old strategies used swapping (no oversubscription - and the system hangs if there isn't enough swap space). In these systems, memory expansion that "bumps into another process" means a process has to be written to the swap file (either the one bumping or the one being bumped). And the swap file must be as large as the largest allowed process times the number of processes allowed. You CAN run with less swap... but sometimes the system will hang - or have to abort the process trying to allocate the memory (this is the only case where malloc can actually return NULL on failure). This method doesn't use paging. It imposes the least system overhead, but is inflexible and requires a lot more disk space. Note: basic swapping doesn't allow for processes larger than physical memory minus the memory used by the system.

Some swapping systems used paging to emulate larger memory - but then the page file becomes fragmented, which slows down swapping (multiple scattered reads to get the process back into memory). More flexible, but slower. The advantage is that swap space could be allocated to processes up front, and if enough was not available the process didn't start (as I recall the error was "no swap"). Dynamic expansion of memory wasn't allowed (all processes had a fixed maximum size, so swap would be allocated for the worst case). That way, even if the process didn't use it all, it COULD expand usage to the maximum limit. These systems also allowed for oversubscription - but at the cost of stability. When process memory was expanded too far, processes couldn't be written out (no space), processes in memory couldn't continue, and sometimes the system would hang. This was occurring around UNIX v7 to System V release 2 (early-to-mid-1980s time frame). I've actually seen this occur on IRIX systems with 128 CPUs and 256GB of memory (mid-1990s; the system would deadlock, and adding more swap space wouldn't eliminate the deadlocks but did reduce their frequency). This was the first time I ran across malloc always returning an address even if none was available - it was assumed that another process would exit, making memory available when needed.

Fully paging systems were much more flexible, and allowed for processes to be larger than physical memory, and the problems were less noticeable.

But there were trade-offs. In Linux, the tradeoff is the kernel OOM killer that picks which process to abort. It does this instead of allowing a "livelock" (waiting for memory that cannot be provided). It doesn't eliminate the possibility of a livelock (one can still happen if the OOM procedure can't identify any process to kill). When the OOM procedure was originally implemented it could even pick init to be killed (which crashes the system). It now has designations for processes not to be considered (mostly root-owned processes, but others can also be protected).
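That per-process protection is visible in /proc; a quick look (assumes a modern Linux; writing negative values needs root or CAP_SYS_RESOURCE):

```shell
# oom_score is the OOM killer's computed "badness" for a process;
# oom_score_adj biases it: -1000 exempts the process entirely,
# +1000 makes it the preferred victim.
cat /proc/self/oom_score
cat /proc/self/oom_score_adj

# Exempting a process (privileged):
#   echo -1000 > /proc/<pid>/oom_score_adj
```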
 
Old 12-22-2014, 01:38 PM   #19
metallica1973
Senior Member
 
Registered: Feb 2003
Location: Washington D.C
Posts: 2,190

Original Poster
Rep: Reputation: 60
Many thanks for all the help. I think I will just go with jpollard's suggestion of using the getpwent function. It will satisfy my specific needs in achieving my goal. I learned a lot.
 
Old 12-22-2014, 02:32 PM   #20
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by SoftSprocket View Post
I'm curious what other UNIX behave that way. I've only been programming in the UNIX environment for 25 years (a little longer in others) but Solaris doesn't practise optimistic allocation, while AIX can be configured to use quite a number of allocation algorithms I not aware of one of them being optimistic. BSD doesn't do it. I don't remember what HPUX does.
All of them. It has been in AT&T System V since it started using paging.

Quote:
The ISO C standard says malloc returns NULL when there is no memory so a programmer should check the return value. If Linux has not returned NULL despite the memory not being available then no harm is done. If NULL has been returned then you can respond to it.
And what does the term "no memory" mean? The address range of a VM may be 64 bits. There may be PLENTY of "memory" available... yet if the kernel decides that a page is NOT available later, you are out of luck.

Yes, malloc will return NULL, but only when the address range is fully occupied (and it almost never is).

When you get an address, it exists just fine as a virtual address. But it MAY not exist at the time it is actually referenced.

The only way to ensure that it DOES exist is to turn off oversubscription. And then you have to deal with determining how much RAM+swap you really need. A single 32-bit address range can require 4 GB, so if you are going to have 100 processes, you have a worst-case need of 400GB of RAM+swap. Guess what - it gets a lot worse with a 64-bit address range.

The nice part is that with oversubscription disabled, it is possible to determine at the time malloc is called whether the allocation will fail. If oversubscription is allowed, it isn't.

Last edited by jpollard; 12-22-2014 at 02:36 PM.
 
Old 12-22-2014, 07:01 PM   #21
SoftSprocket
Member
 
Registered: Nov 2014
Posts: 399

Rep: Reputation: Disabled
Quote:
Originally Posted by jpollard View Post
All of them. It has been in AT&T System V since it started using paging.



And what does the term "no memory" mean? The address range of a VM may be 64 bits. There may be PLENTY of "memory" available... yet if the kernel decides that a page is NOT available later, you are out of luck.

Yes, malloc will return NULL, but only when the address range is fully occupied (and it almost never is).

When you get an address, it exists just fine as a virtual address. But it MAY not exist at the time it is actually referenced.

The only way to ensure that it DOES exist is to turn off oversubscription. And then you have to deal with determining how much RAM+swap you really need. A single 32-bit address range can require 4 GB, so if you are going to have 100 processes, you have a worst-case need of 400GB of RAM+swap. Guess what - it gets a lot worse with a 64-bit address range.

The nice part is that with oversubscription disabled, it is possible to determine at the time malloc is called whether the allocation will fail. If oversubscription is allowed, it isn't.
There's a big difference between oversubscription and virtual memory space. The memory allocator's job is to track free memory blocks. Linux will assign more than what is available - presuming that it will be available when required.
 
Old 12-22-2014, 08:23 PM   #22
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by SoftSprocket View Post
There's a big difference between oversubscription and virtual memory space. The memory allocator's job is to track free memory blocks. Linux will assign more than what is available - presuming that it will be available when required.
Yes, and no.

The memory allocators only track virtual memory.

And it is that "presuming that it will be available" that causes malloc to misbehave. Malloc will hand out virtual addresses until the range is exhausted. The kernel is what backs that allocation, and it cannot always do so. With oversubscription disabled, the kernel can easily determine whether the actual page will be available at the time malloc is called. With oversubscription enabled, it cannot - and the failure will not show up until the memory is actually written to, causing a segfault.

Thus the programmer cannot trust the return value from malloc.
 
Old 12-25-2014, 12:17 PM   #23
smeezekitty
Senior Member
 
Registered: Sep 2009
Location: Washington U.S.
Distribution: M$ Windows / Debian / Ubuntu / DSL / many others
Posts: 2,339

Rep: Reputation: 231
There is a lot of misinformation here.

Linux has a system called overcommitting. Basically what happens
is that any reasonable amount of memory allocated with malloc()
will be allocated in virtual address space and a live pointer to VM will be returned.

Basically the kernel says "I think I can make it fit! I think I can make it fit!" LOL

It doesn't actually allocate any physical memory for that allocated virtual memory
until you write to it. Once you write to that address, the kernel allocates a physical page
to the written region.

What happens if there is not enough physical memory?
Pretty simple: kswapd looks for pages that are not being actively used
and swaps them out to disk, then uses the freed space for the newly touched allocation.

But what if swap is disabled or depleted?

Things get dire: Linux starts evicting file-backed pages, including the program
text of running binaries, and re-reads them from disk on demand (the system slows to an absolute crawl).

Finally, if more memory keeps getting used, Linux triggers the OOM killer. That sends
the uncatchable SIGKILL to (usually) the process using the most memory.

But if you want, overcommitting can be turned off (through /proc I think)
In which case, allocations greater than free mem + free swap should fail

Saying that malloc() won't return NULL is wrong: when the request is clearly unsatisfiable, malloc() will return NULL.
Furthermore, you should never get a segfault by using memory (in bounds) returned by malloc.
The memory returned is valid in the virtual memory tables, hence no segfault. But if you use more than
is available, you risk a SIGKILL from the OOM killer.

Hope that clears it up

Last edited by smeezekitty; 12-25-2014 at 12:18 PM.
 
Old 12-25-2014, 12:42 PM   #24
genss
Member
 
Registered: Nov 2013
Posts: 742

Rep: Reputation: Disabled
Quote:
Originally Posted by smeezekitty View Post
There is a lot of misinformation here.

Linux has a system called overcommitting. Basically what happens
is that any reasonable amount of memory allocated with malloc()
will be allocated in virtual address space and a live pointer to VM will be returned.

Basically the kernel says "I think I can make it fit! I think I can make it fit!" LOL

It doesn't actually allocate any physical memory for that allocated virtual memory
until you write to it. Once you write to that address, the kernel allocates a physical page
to the written region.

What happens if there is not enough physical memory?
Pretty simple: kswapd looks for pages that are not being actively used
and swaps them out to disk, then uses the freed space for the newly touched allocation.

But if you want, overcommitting can be turned off (through /proc I think)
In which case, allocations greater than free mem + free swap should fail

Saying that malloc() won't return NULL is wrong. When the request is clearly unsatisfiable, malloc() will return NULL
Adding to all that has been said:
the mechanisms are mmap() and brk()
memory is in "pages", which are usually 4k bytes
mmap will not always return 0, more so with today's "advanced" "thread caching" memory allocators

/proc/sys/vm/overcommit_memory is the proc entry
https://www.kernel.org/doc/Documenta...mit-accounting is the documentation
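For reference, the knob and the accounting it controls can be inspected like this (assumes a Linux /proc; changing the mode needs root):

```shell
# 0 = heuristic overcommit (the default), 1 = always overcommit,
# 2 = strict accounting (commit limit = swap + overcommit_ratio% of RAM)
cat /proc/sys/vm/overcommit_memory

# Equivalent sysctl form for switching to strict accounting (root):
#   sysctl vm.overcommit_memory=2

# The current commit limit and total committed address space:
grep -E 'CommitLimit|Committed_AS' /proc/meminfo
```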


PS it gets even more complicated when you find out how the kernel treats files
 
Old 12-25-2014, 01:08 PM   #25
smeezekitty
Senior Member
 
Registered: Sep 2009
Location: Washington U.S.
Distribution: M$ Windows / Debian / Ubuntu / DSL / many others
Posts: 2,339

Rep: Reputation: 231
Quote:
all that has been said
But a lot of things that were said are incorrect, such as:

Quote:
As another nit - malloc will almost never return a NULL entry, even if it doesn't allocate memory. The only way to know if the pointer is really valid is to actually put data into it, and use a trap handler to catch segfaults.
Malloc will allocate virtual memory or return NULL.
The pointer will be valid, but that doesn't mean there is physical memory to back it.
 
Old 12-25-2014, 01:10 PM   #26
genss
Member
 
Registered: Nov 2013
Posts: 742

Rep: Reputation: Disabled
Quote:
Originally Posted by smeezekitty View Post
Malloc will allocate virtual memory or return NULL
The pointer will be valid but that doesn't mean there is physical memory to back it
Yes - depending on whether you take "allocate memory" to mean allocating virtual memory or real memory, it can be taken both ways.
I take it as allocating actual memory.
 
  

