LinuxQuestions.org
View the Most Wanted LQ Wiki articles.
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices

Reply
 
Search this Thread
Old 12-11-2004, 01:33 PM   #1
Franziss
Member
 
Registered: Dec 2004
Posts: 35

Rep: Reputation: 15
How to do memory management of a program in C?


Hi!

I am working on a project which requires me to write a program in C. This program will build a suffix tree for a string, and at the same time it is suppose to do memory management. For example, if the string size is larger than the RAM, it will do paging strategy which will read in part of the string into the RAM. ( Is this possible? Sorry, I'm a newbie... )

So do I have to modify the Linux Kernel? Or do I make my program to access the kernel to implement the paging strategy? Or is my whole concept of memory management is wrong?
 
Old 12-11-2004, 02:11 PM   #2
btmiller
Senior Member
 
Registered: May 2004
Location: In the DC 'burbs
Distribution: Arch, Scientific Linux, Debian, Ubuntu
Posts: 4,143

Rep: Reputation: 322Reputation: 322Reputation: 322Reputation: 322
You don't have to muck about with the page tables to do memory management in C. Simply use malloc and free to allocate/deallocate memory, and let the kernel worry about when to put particular pages into memory and when to swap them. Simple, easy, and 100% portable. If you really feel like mucking with the page table, the kernel provides system calls that allow you to do so -- see mmap, mlock, mprotect, and friends.

BTW, you might want to request a mod to move this to the programming forum, as it probably fits better there.
 
Old 12-11-2004, 05:26 PM   #3
Mara
Moderator
 
Registered: Feb 2002
Location: Grenoble
Distribution: Debian
Posts: 9,539

Rep: Reputation: 149Reputation: 149
Moved: This thread is more suitable in Programming and has been moved accordingly to help your thread/question get the exposure it deserves.

I understand you want to make a simulation of memory management, not the management itself. To do it in a very basic way you need to kernel modifications. I'd split the strings into smaller parts (let's say 4KB - page size; if they're really that long) and keep them as separate variables. Now when you memory calcualtion shows you're running out of memory you can write a numbe of them to a file and mark them, so you can readtham back when needed.
 
Old 12-12-2004, 01:39 AM   #4
Franziss
Member
 
Registered: Dec 2004
Posts: 35

Original Poster
Rep: Reputation: 15
Quote:
Originally posted by Mara

To do it in a very basic way you need to kernel modifications. I'd split the strings into smaller parts (let's say 4KB - page size; if they're really that long) and keep them as separate variables. Now when you memory calcualtion shows you're running out of memory you can write a numbe of them to a file and mark them, so you can readtham back when needed. [/B]
How do I do these? I have basic knowledge of C programming. Is it enough to do the things that you mentioned? Or do I need to learn more? If so, do you have any recommendations on websites or books that will be helpful? Thanks!
 
Old 12-12-2004, 02:26 AM   #5
btmiller
Senior Member
 
Registered: May 2004
Location: In the DC 'burbs
Distribution: Arch, Scientific Linux, Debian, Ubuntu
Posts: 4,143

Rep: Reputation: 322Reputation: 322Reputation: 322Reputation: 322
How basic is basic. Do you know how to use fopen, fclose, fread, fwrite, fseek, and fprintf to do file I/O? That's really all you need to do what Mara suggested, along with enough knowledge to account for how much memory your are using. Also, is it a requirement that you do your own memory management? If not, yo can do what I suggested a couple posts up. I'm a bit confused as to exactly what the requirements of this project are?
 
Old 12-12-2004, 09:42 AM   #6
Franziss
Member
 
Registered: Dec 2004
Posts: 35

Original Poster
Rep: Reputation: 15
Yea, I need to do memory management and I got a rough idea of fopen, fseek, etc...

My project is on building Suffix Trees based on a string. The string can be DNA sequences of up to 30GB, and the built suffix tree will be about 8 to 10 times larger. Thus, the string and suffix tree data structure will not be able to store in the memory. So how do I do these?

For example, if I use fread to read in the 30Gb string and declare malloc on a pointer pointing to the string input, the pointer will be NULL, because there is not enough memory for it. I think I'll need to do paging strategy, maybe page in a section of the string and suffix tree data structure. But does Linux help me do all these? Or I will have to code these paging strategy?

The programming module that I learnt only taught me how to code, with the assumption that everything can be in the memory. Now, I'm facing the problem that the data structure will be bigger than the memory, which is totally new to me...
 
Old 12-12-2004, 12:09 PM   #7
btmiller
Senior Member
 
Registered: May 2004
Location: In the DC 'burbs
Distribution: Arch, Scientific Linux, Debian, Ubuntu
Posts: 4,143

Rep: Reputation: 322Reputation: 322Reputation: 322Reputation: 322
Then it that case your best strategy is to open up a file and write data that you don't need to use right away to it, as Mara suggested. Try to keep as little in memory as possible. If you know the basics of file I/O this should be relatively easy. Your program is going to have to keep track of what's in memory and what's is your scratch file.

The problem is, 30 GB is probably going to be bigger than memory + swap on just about every machines out there, so even trying to have the kernel move some of the pages to swap space is not likely to help, since there simply isn't enough to hold that amount of data.
 
Old 12-12-2004, 03:54 PM   #8
Mara
Moderator
 
Registered: Feb 2002
Location: Grenoble
Distribution: Debian
Posts: 9,539

Rep: Reputation: 149Reputation: 149
I suggest to keep the data not in one variable (one pointer), but in many. In fact, you don't have a choice - you can't have 30GB structure that way.

Divide it and store pointers to the fragments you currently have in memory (you probably need to perform calculations on the data, right?). 4KB fragments will be probably too small - 64KB or maybe longer.

Now how to do it. Keep the data in a file(s). When you need a fragment, read it and process. You can keep more fragments in memory at the same time and store pointers to them with the place in data when the fragment starts). When you need information, you first look into the list (or tree) of structures keeping pointers to the loaded fragments. If you find the right one - you can process. If not, load the needed fragment and add pointer to it to your tree. Remember to check memory usage from time to time (in short; how many fragments you have loaded) and free if threre are too many.

How would the structure look like:
Code:
struct memory_fragment{
    char *ptr; /* pointer to the fragment */
    unsigned long location; /* offset of the beginning of this fragment */
};
More information about lists, trees, memory allocations, reading/writing to/from file can be found in any C book.
 
Old 12-13-2004, 03:52 AM   #9
Franziss
Member
 
Registered: Dec 2004
Posts: 35

Original Poster
Rep: Reputation: 15
Hi guys!

Thanks for the advice! I guess I should stop reading books on Linux Kernel and start concentrating on C Programming!

Currently, the fastest time of a program to build a suffix tree using a 30gb Dna genome is 30 hrs, guess it got to do too many paging....
 
  


Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Memory Leak when using memory debugging C program on SuSE SLES8 babalina Linux - Distributions 0 10-06-2003 10:39 AM
Memory management Mojojo Linux - Hardware 11 08-31-2003 11:29 AM
memory management exigent Linux - General 1 08-17-2003 09:28 PM
Memory Management hhegab Linux - General 3 08-07-2002 11:20 PM
Memory Management mrsolo Linux - General 7 06-26-2002 01:55 AM


All times are GMT -5. The time now is 04:11 PM.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration