[Help!]linux,memory management,c/c++

kefeng.chen · 04-16-2006, 11:00 AM

hi all,

I met the following problems by writing a program for linux using c/c++:

1. how to allocate memory at given address?
2. how can i get to know, which byte in memory is modified?

Has anyone an idea??

kefeng

ta0kira · 04-16-2006, 12:27 PM

If you have an address which you can write to then it's already allocated; allocation is what gives you the address in the first place. You can use in-place new in C++ to create an object at a specific location. Example:

Code:

#include <memory>

int main()
{
        unsigned char Memory[ sizeof(int) ];
        new( (int*) &Memory ) int;
}

What is the purpose of using a specific address?
ta0kira

koningshoed · 04-16-2006, 12:44 PM

Quote:

1. how to allocate memory at given address?

With this I assume you mean get access to memory at a given physical address? If this is indeed what you mean then the following function should achieve what you need:

Code:

#include <sys/mman.h>

void* allocphysmem(off_t phys_addr, size_t sz)
{
    void* virt_addr;

    int devmem = open("/dev/mem", O_RDWR);
    if(devmem < 0)
        return NULL;
    
    virt_addr = mmap(NULL, sz, PROT_READ | PROT_WRITE, MAP_SHARED, devmem, phys_addr);
    close(devmem);

    return virt_addr;
}

This will return a void* pointer you can use to access the memory at phys_addr, please note that it is highly unlikely that this pointer will have any correlation to phys_addr. For example, calling allocphysmem(0xA0000, 0x10000) will allocate 65KiB of memory at some virtual address that just happens to point to physical memory address 0xA0000. You will probably cast the void* to an uint8_t* or unsigned char* or something more usable.

As for the memory trap, I have no idea how to do this. I'm pretty sure most hardware systems can provide this functionality and the kernel probably has some mechanism to expose it (afaik both valgrind and gdb sets up memory traps of sorts). Whether it's possible to detect when some external process modifies memory is debatable - I know where I work we have issues with a process needing to detect some chip modifies a specific value in memory, and detecting this is only possible inside a busy wait loop for all practical purposes.

kefeng.chen · 04-16-2006, 02:02 PM

Quote:

Originally Posted by ta0kira

What is the purpose of using a specific address?

I want to exactly copy the data segment of a process to the memory space of another process running on another computer, so that the target process can use the same pointer to access the same data.

To koningshoed,

what i want is not at a given physical address, but a given virtual address.

paulsm4 · 04-16-2006, 02:11 PM

Hi -

I'd recommend using "SysV IPCs".

You'll need to open a shared memory segment (shmget (), shmat(), etc). Then you'll probably also need to synchronize access to that segment (semget(), semop()).

Here's a good tutorial:
http://kt.squeakydolphin.com/sysv_shared_mem.jsp

This technique is portable across all flavors of *nix (Linux, AIX, HP-UX, Solaris, etc); it's usable from both C and C++

koningshoed · 04-16-2006, 02:29 PM

mmap can do that too. What you are trying to do though is pretty damn complex. What about stack space? Global vars?

And afaik SysV IPCs doesn't work between different computers at this point in time (unless projects like OpenMOSIX has progressed quite a bit further than when I looked at them). What can be said for IPC though is that it is available for any POSIX compliant programming language, or any programming language that allows you access to the appropriate system calls (including python, perl, php, C/C++, pascal ...).

I think I might have an inkling of an idea of what you are trying to do. This wouldn't happen to have anything to do with multi-threading between multiple machines? Building a cluster between a large number of lowish-end machines to create a pretty powerfull system. I'd highly recommend taking a look at systems like OpenMOSIX and/or PVM. Systems for PVM require special design but can be very powerfull. Then again, I may be totally off track

. And perhaps you are the next prodigy that is going to push parallel computing on Linux to be able to run a single memory image over multiple machines

. I know single system image is possible, but the last time I looked at the deffinition that didn't apply to memory space (each process still had to stick to the same machine, or migrate as a process, so even though you may have 500+ processors over the cluster, if each machine only had, say, 4 processors then that 4 would be the max any given process could utilize).

kefeng.chen · 04-16-2006, 04:17 PM

to paulsm4,
I think SysV IPCs is not suitable for my situation.

Quote:

Originally Posted by koningshoed

mmap can do that too. What you are trying to do though is pretty damn complex. What about stack space? Global vars?

can you give me a sample code using mmap so. i am not familiar with the usage of mmap.

Quote:

Originally Posted by koningshoed

I think I might have an inkling of an idea of what you are trying to do. This wouldn't happen to have anything to do with multi-threading between multiple machines? Building a cluster between a large number of lowish-end machines to create a pretty powerfull system.

oh, God! you are a genie!

Quote:

Originally Posted by koningshoed

I'd highly recommend taking a look at systems like OpenMOSIX and/or PVM.

OpenMOSIX is similar to what i want. But i hope to get it without changing the kernel.

thanks for all the answers.

kefeng

paulsm4 · 04-16-2006, 04:30 PM

Sorry, kefeng.chen - I overlooked the part about "on separate computers". Whoops!

koningshoed is absolutely correct - what you're trying to do is NOT trivial and you *definitely* want to investigate how others have addressed the problem before you try to "reinvent the wheel".

Another option you should seriously look at (in addition to konigshoed's excellent suggestions) is OpenMPI:
http://www.open-mpi.org/

Your .. PSM

kefeng.chen · 04-17-2006, 06:57 AM

to paulsm4,

using MPI one must change the manner that he writes programes.
I am trying to make MPI transparent to user, and let users use thread to implemente the parallel computing, because thread programming is familiar to most programmers.

anyway, thanks for your replies.

kefeng

kefeng.chen · 04-18-2006, 04:12 AM

no one knows a solution to the second problem???

aluser · 04-18-2006, 11:27 PM

you probably want mmap() with MAP_FIXED for the first question. For the second, you probably just want to compare the two regions byte-by-byte, but there is sort of a way to detect when your program is writing to various places in memory:

Mark all the memory you're watching as unwriteable, then catch SIGSEGV with sigaction(). Somewhere in the junk your sigaction handler gets is the memory address which caused the seg fault. (poke around in the man pages a bit..) In the handler, you set the memory writable and the faulting instruction will execute successfully after you return. Be careful about ending up in infinite loops. The problem is of course that you'll miss getting notified about the next bytes your program writes. You could try and figure out how to single-step your program it get just the one write while the memory is readable, but I haven't done anything like that.

kefeng.chen · 04-20-2006, 04:31 PM

sample code for allocation memory at certain address

Code:

mmap(given_address, size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS,0,0);

to aluser,

thanks for your replies.

Do you know, if the sig_handler is thread safe??

kefeng

aluser · 04-20-2006, 05:04 PM

Quote:

Do you know, if the sig_handler is thread safe??

well, the BUGS section of my pthread_signal(3) man page says that, on linux, signals are delivered to a thread, not a thread group (since threads are treated as full processes by the kernel). This would imply that the sigaction hack I suggested will work as expected; the faulting thread should handle the SIGSEGV, meaning your handler function will run in the faulting thread and return just before the second execution of the faulting instruction.

This post sounds a bit speculative and non-portable, and it likely is

koningshoed · 12-16-2006, 02:21 AM

Threads are no longer implemented as processes (well, not since recent kernels and glibc 2.4 no longer has support for linux threads, only nptl afaik). And no, signals are not thread safe, for example, you can't take locks within signal handlers, nor can you safely do strange things. Essentially I'd recommend the following advice: Unless it really has to be done inside a signal handler: Don't. And NEVER make any calls to the pthread library from a signal handler.

Sharing memory over different machines is at this point in time impossible to the best of my knowledge. As such one really needs to use something like MPI and one should adjust one's way of coding appropriately. All the memory traps etc that has been seen here are but the tip of the iceberg. What happens with mutexes? Semaphores? atomic variables? file descriptors? No, until hardware can't provide us with a SSI (Single System Image) over multiple machines (And there is research going in this direction) we'll have to settle for MPI-like solutions. Beowulf might prove me wrong, but afaik beowulf only gets an "almost SSI" in that the filesystem is consistent over the machines (well, large parts of it: NFS) and processes can migrate transparently. As for the shared memory over multiple machines, the blade center technology already interconnects the blades using a 32-lane PCI-X bus, as far as I understand it the plan is essentially to build a NUMA-like system over this where each memory bank in a different machine is just seen as a different NUMA-bank. Thus, like on current AMD opteron systems access to non-local RAM will be slower than accessing local RAM.

This _might_ have the added bonus of actually presenting a SMP-like system where you run a single operating system instance over the blades and thus every time you add a blade to the setup it's like hotplugging some extra RAM, CPU and HDD in one go. Alternatively it'll still be an OS/blade but with the ability to "share" RAM between them. I'd personally prefer the first mostly - except that offlining stuff or hardware failure in one blade _might_ cause the entire cluster to go down.

aluser · 12-16-2006, 10:30 AM

Quote:

Threads are no longer implemented as processes

The wikipedia page http://en.wikipedia.org/wiki/Native_...Thread_Library calls NPTL threads processes. I suppose "tasks" might be more appropriate, but it seems to say that they still get a task_struct in the kernel like a normal process.

Agreed about not doing things in signal handlers unless really necessary. I doubt that literal answers to the original questions will actually lead in useful non-academic directions; like you said the program probably needs to be restructured.