LinuxQuestions.org
Old 12-20-2014, 10:16 AM   #16
SoftSprocket
Member
 
Registered: Nov 2014
Posts: 399

Rep: Reputation: Disabled

Quote:
Originally Posted by jpollard View Post
Actually not.

malloc has been that way for about 30 years.


The reservation is marked in the process page tables (reserved), but physical allocation is delayed until actual use requires the page.

The problem is caused by allowing oversubscription of memory, which in turn, allows a lot more processes to run than can physically fit in memory.

Such oversubscription has been used in UNIX/Linux ever since UNIX had paging, and since Linux allowed oversubscription (around version 0.99).
I'm curious which other UNIXes behave that way. I've only been programming in the UNIX environment for 25 years (a little longer in others), but Solaris doesn't practise optimistic allocation, and while AIX can be configured to use quite a number of allocation algorithms, I'm not aware of one of them being optimistic. BSD doesn't do it. I don't remember what HP-UX does.

The ISO C standard says malloc returns NULL when there is no memory, so a programmer should check the return value. If Linux has not returned NULL despite the memory not being available, then no harm is done by checking. If NULL has been returned, then you can respond to it.
 
Old 12-20-2014, 10:32 AM   #17
SoftSprocket
Member
 
Registered: Nov 2014
Posts: 399

Rep: Reputation: Disabled
Quote:
Originally Posted by psionl0 View Post
That's the appalling bit. When a program segfaults, it simply prints "Segmentation fault" and dies instantly. You have no way of knowing if it's because it tried to access an illegal memory address or because the system couldn't allocate the memory required (unless you are prepared to go through a massive debugging session).

So is there a lower risk of segfaulting if you use BSS instead of requesting more system memory? As I understand it, system memory is allocated by changing the "program break" via brk() or sbrk(). If in doing so, you bump into another process then that would suggest an instant segfault.

My reason for asking is that malloc() and free() routines are pretty straightforward to write if you have a block in BSS for that purpose (see K&R's ANSI C). You can even write simpler routines if you know that only blocks of a certain size are to be allocated.

As a matter of fact, the malloc() routine can be extremely rudimentary if you know that:
  • You will not be freeing any memory or
  • Memory will be freed in the reverse order to which it was malloc'd or
  • You will be freeing the entire block of memory at once.
There's definitely a question of whether the Linux approach makes sense. Many people think it's broken. Does having the OOM killer choose processes to kill represent an improvement over allowing processes themselves to handle the condition? In any event, when you're out of memory it's hard to say what can be done gracefully.
 
Old 12-20-2014, 12:10 PM   #18
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by psionl0 View Post
That's the appalling bit. When a program segfaults, it simply prints "Segmentation fault" and dies instantly. You have no way of knowing if it's because it tried to access an illegal memory address or because the system couldn't allocate the memory required (unless you are prepared to go through a massive debugging session).
Only if it doesn't set up a signal handler.
Quote:

So is there a lower risk of segfaulting if you use BSS instead of requesting more system memory? As I understand it, system memory is allocated by changing the "program break" via brk() or sbrk(). If in doing so, you bump into another process then that would suggest an instant segfault.
The "bump into another process" doesn't happen. Memory pages are allocated wherever a free page is available. When a free page isn't available: (1) a page is taken from the list of pages that have already been written to disk (swap/mmap) and re-used; (2) if that can't be done, the system trims buffers and uses those; (3) if that also fails, the kernel starts trimming the memory of processes, which forces those pages to be written to disk (and then back to #1). If everything fails, the OOM killer starts aborting processes.

As for the BSS, it depends. The last time I checked (some years ago), the linker actually sets the memory for static arrays. If the array is uninitialized, it is expected to be zeros, and the segment descriptor doesn't actually have any memory allocated (it can even be zero length, and the system is expected to allocate pages as required)... so yes, a segfault COULD occur, but it is extremely unlikely.
Quote:

My reason for asking is that malloc() and free() routines are pretty straightforward to write if you have a block in BSS for that purpose (see K&R's ANSI C). You can even write simpler routines if you know that only blocks of a certain size are to be allocated.

As a matter of fact, the malloc() routine can be extremely rudimentary if you know that:
  • You will not be freeing any memory or
  • Memory will be freed in the reverse order to which it was malloc'd or
  • You will be freeing the entire block of memory at once.
Dynamic memory handling is never quite as simple as it seems.

It CAN be simple - but only if the entire machine is allocated to a single process, with a single executable. In that case you can easily compute how much memory will be available. That is usually only the case with embedded systems.

Once the system becomes multi-tasking (or worse, multi-tasking, multi-process, multi-user), it is no longer simple.

Old strategies used swapping (no oversubscription - and the system hangs if there isn't enough swap space). In these systems, memory expansion that "bumps into another process" means a process has to be written to the swap file (either the one bumping or the one being bumped). And the swap file must be as large as the largest allowed process times the number of processes allowed. You CAN run with less swap... but sometimes the system will hang - or have to abort the process trying to allocate the memory (this is the only case where malloc can actually return NULL on failure). This method doesn't use paging. It imposes the least system overhead, but is inflexible and requires a lot more disk space. Note: basic swapping doesn't allow for processes larger than physical memory minus the memory used by the system.

Some swapping systems used paging to emulate larger memory - but then the page file becomes fragmented, which slows down swapping (multiple scattered reads to get the process back into memory). More flexible, but slower. The advantage is that swap space could be allocated to processes up front, and if enough was not available the process didn't start (as I recall the error was "no swap"). Dynamic expansion of memory wasn't allowed (all processes had a fixed maximum size, so swap would be allocated for the worst case). That way, even if the process didn't use it all, it COULD expand usage to the maximum limit. These systems also allowed for oversubscription - but at the cost of stability. When process memory was expanded too far, processes couldn't be written out (no space), processes in memory couldn't continue, and sometimes the system would hang. This was occurring around UNIX v7 to System V release 2 (early-to-mid-1980s time frame). I've actually seen this occur on IRIX systems with 128 CPUs and 256GB of memory (mid-1990s; the system would deadlock, and adding more swap space wouldn't eliminate the deadlocks but did reduce their frequency). This was the first time I ran across malloc always returning an address even if none was available - it was assumed that another process would exit, making memory available when needed.

Fully paging systems were much more flexible, and allowed for processes to be larger than physical memory, and the problems were less noticeable.

But there were trade-offs. In Linux, the tradeoff is the kernel OOM killer that picks which process to abort. It does this instead of allowing a "livelock" (waiting for memory that cannot be provided). It doesn't eliminate the possibility of a livelock (one can still happen if the OOM procedure can't identify any process to kill). When the OOM procedure was originally implemented it could even pick init to be killed (which crashes the system). It now has designations for processes not to be considered (mostly root-owned processes, but others can also be protected).
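That per-process protection is visible in /proc; a quick look (assumes a modern Linux; writing negative values needs root or CAP_SYS_RESOURCE):

```shell
# oom_score is the OOM killer's computed "badness" for a process;
# oom_score_adj biases it: -1000 exempts the process entirely,
# +1000 makes it the preferred victim.
cat /proc/self/oom_score
cat /proc/self/oom_score_adj

# Exempting a process (privileged):
#   echo -1000 > /proc/<pid>/oom_score_adj
```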
 
Old 12-22-2014, 01:38 PM   #19
metallica1973
Senior Member
 
Registered: Feb 2003
Location: Washington D.C
Posts: 2,190

Original Poster
Rep: Reputation: 60
Many thanks for all the help. I think I will just go with jpollard's suggestion of using the getpwent function. It will satisfy my specific needs in achieving my goal. I learned a lot.
 
Old 12-22-2014, 02:32 PM   #20
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by SoftSprocket View Post
I'm curious what other UNIX behave that way. I've only been programming in the UNIX environment for 25 years (a little longer in others) but Solaris doesn't practise optimistic allocation, while AIX can be configured to use quite a number of allocation algorithms I not aware of one of them being optimistic. BSD doesn't do it. I don't remember what HPUX does.
All of them. It has been in AT&T System V since it started using paging.

Quote:
The ISO C standard says malloc returns NULL when there is no memory so a programmer should check the return value. If Linux has not returned NULL despite the memory not being available then no harm is done. If NULL has been returned then you can respond to it.
And what does the term "no memory" mean? The address range of a VM may be 64 bits. There may be PLENTY of "memory" available... yet if the kernel decides that a page is NOT available later, you are out of luck.

Yes, malloc will return NULL, but only when the address range is fully occupied (and it almost never is).

When you get an address, it exists just fine as a virtual address. But it MAY not exist at the time it is actually referenced.

The only way to ensure that it DOES exist is to turn off oversubscription. And then you have to deal with determining how much RAM+swap you really need. A single 32-bit address range can require 4 GB, so if you are going to have 100 processes, you have a worst-case need of 400GB of RAM+swap. Guess what - it gets a lot worse with a 64-bit address range.

The nice part is that with oversubscription disabled, it is possible to determine at the time malloc is called whether the allocation will fail. If oversubscription is allowed, it isn't.

Last edited by jpollard; 12-22-2014 at 02:36 PM.
 
Old 12-22-2014, 07:01 PM   #21
SoftSprocket
Member
 
Registered: Nov 2014
Posts: 399

Rep: Reputation: Disabled
Quote:
Originally Posted by jpollard View Post
All of them. It has been in AT&T System V since it started using paging.



And what does the term "no memory" mean? The address range of a VM may be 64 bits. There may be PLENTY of "memory" available... yet if the kernel decides that a page is NOT available later, you are out of luck.

Yes, malloc will return NULL, but only when the address range is fully occupied (and it almost never is).

When you get an address, it exists just fine as a virtual address. But it MAY not exist at the time it is actually referenced.

The only way to ensure that it DOES exist is to turn off oversubscription. And then you have to deal with determining how much RAM+swap you really need. A single 32-bit address range can require 4 GB, so if you are going to have 100 processes, you have a worst-case need of 400GB of RAM+swap. Guess what - it gets a lot worse with a 64-bit address range.

The nice part is that with oversubscription disabled, it is possible to determine at the time malloc is called whether the allocation will fail. If oversubscription is allowed, it isn't.
There's a big difference between oversubscription and virtual memory space. The memory allocator's job is to track free memory blocks. Linux will assign more than what is available - presuming that it will be available when required.
 
Old 12-22-2014, 08:23 PM   #22
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by SoftSprocket View Post
There's a big difference between oversubscription and virtual memory space. The memory allocator's job is to track free memory blocks. Linux will assign more than what is available - presuming that it will be available when required.
Yes, and no.

The memory allocators only track virtual memory.

And it is that "presuming that it will be available" that causes malloc to misbehave. Malloc will hand out virtual addresses until the range is exhausted. The kernel is what backs that allocation, and it cannot always do so. With oversubscription disabled, the kernel can easily determine whether the actual page will be available at the time malloc is called. With oversubscription enabled, it cannot - and the failure will not show up until the memory is actually written to, causing a segfault.

Thus the programmer cannot trust the return value from malloc.
 
Old 12-25-2014, 12:17 PM   #23
smeezekitty
Senior Member
 
Registered: Sep 2009
Location: Washington U.S.
Distribution: M$ Windows / Debian / Ubuntu / DSL / many others
Posts: 2,339

Rep: Reputation: 231
There is a lot of misinformation here.

Linux has a system called overcommitting. Basically what happens
is that any reasonable amount of memory allocated with malloc()
will be allocated in virtual address space and a live pointer to VM will be returned.

Basically the kernel says "I think I can make it fit! I think I can make it fit!" LOL

It doesn't actually allocate any physical memory for that allocated virtual memory
until you write to it. Once you write to that address, the kernel allocates a physical page
to the written region.

What happens if there is not enough physical memory?
Pretty simple: kswapd looks for pages that are not being actively used
and swaps them out to disk, then uses the freed space for the newly touched allocation.

But what if swap is disabled or depleted?

Things get dire: Linux starts evicting file-backed pages, including the program
text of running binaries, and re-reads them from disk on demand (the system slows to an absolute crawl).

Finally, if more memory keeps getting used, Linux triggers the OOM killer. That sends
the uncatchable SIGKILL to (usually) the process using the most memory.

But if you want, overcommitting can be turned off (through /proc I think)
In which case, allocations greater than free mem + free swap should fail

Saying that malloc() won't return NULL is wrong: when the request is clearly unsatisfiable, malloc() will return NULL.
Furthermore, you should never get a segfault by using memory (in bounds) returned by malloc.
The memory returned is valid in the virtual memory tables, hence no segfault. But if you use more than
is available, you risk a SIGKILL from the OOM killer.

Hope that clears it up

Last edited by smeezekitty; 12-25-2014 at 12:18 PM.
 
Old 12-25-2014, 12:42 PM   #24
genss
Member
 
Registered: Nov 2013
Posts: 742

Rep: Reputation: Disabled
Quote:
Originally Posted by smeezekitty View Post
There is a lot of misinformation here.

Linux has a system called overcommitting. Basically what happens
is that any reasonable amount of memory allocated with malloc()
will be allocated in virtual address space and a live pointer to VM will be returned.

Basically the kernel says "I think I can make it fit! I think I can make it fit!" LOL

It doesn't actually allocate any physical memory for that allocated virtual memory
until you write to it. Once you write to that address, the kernel allocates a physical page
to the written region.

What happens if there is not enough physical memory?
Pretty simple: kswapd looks for pages that are not being actively used
and swaps them out to disk, then uses the freed space for the newly touched allocation.

But if you want, overcommitting can be turned off (through /proc I think)
In which case, allocations greater than free mem + free swap should fail

Saying that malloc() won't return NULL is wrong. When the request is clearly unsatisfiable, malloc() will return NULL
Adding to all that has been said:
the mechanisms are mmap() and brk()
memory is in "pages", which are usually 4k bytes
mmap will not always return 0, more so with today's "advanced" "thread caching" memory allocators

/proc/sys/vm/overcommit_memory is the proc entry
https://www.kernel.org/doc/Documenta...mit-accounting is the documentation
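For reference, the knob and the accounting it controls can be inspected like this (assumes a Linux /proc; changing the mode needs root):

```shell
# 0 = heuristic overcommit (the default), 1 = always overcommit,
# 2 = strict accounting (commit limit = swap + overcommit_ratio% of RAM)
cat /proc/sys/vm/overcommit_memory

# Equivalent sysctl form for switching to strict accounting (root):
#   sysctl vm.overcommit_memory=2

# The current commit limit and total committed address space:
grep -E 'CommitLimit|Committed_AS' /proc/meminfo
```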


PS it gets even more complicated when you find out how the kernel treats files
 
Old 12-25-2014, 01:08 PM   #25
smeezekitty
Senior Member
 
Registered: Sep 2009
Location: Washington U.S.
Distribution: M$ Windows / Debian / Ubuntu / DSL / many others
Posts: 2,339

Rep: Reputation: 231
Quote:
all that has been said
But a lot of things that were said are incorrect, such as:

Quote:
As another nit - malloc will almost never return a NULL entry, even if it doesn't allocate memory. The only way to know if the pointer is really valid is to actually put data into it, and use a trap handler to catch segfaults.
Malloc will allocate virtual memory or return NULL.
The pointer will be valid, but that doesn't mean there is physical memory to back it.
 
Old 12-25-2014, 01:10 PM   #26
genss
Member
 
Registered: Nov 2013
Posts: 742

Rep: Reputation: Disabled
Quote:
Originally Posted by smeezekitty View Post
Malloc will allocate virtual memory or return NULL
The pointer will be valid but that doesn't mean there is physical memory to back it
Yes - depending on whether you take "allocate memory" to mean allocating virtual memory or real memory, it can be taken both ways.
I take it as allocating actual memory.
 
  

