Old 07-31-2013, 11:30 PM   #1
undernet
LQ Newbie
 
Registered: Jul 2013
Location: UK
Distribution: Fedora
Posts: 6

Rep: Reputation: Disabled
cannot delete a file that has been memory-mapped until all dirty pages are written


Hi
I am memory-mapping files from Java, which wraps the mmap kernel function. It all works fine, except that when I close my program down and try to delete the memory-mapped file, the delete hangs for ages until all the dirty pages are written to the file. So if I memory-map a 25 GB file, do a load of writes (resulting in loads of dirty memory pages that map to the file), close the program down and then try to delete the file, the kernel will prevent the delete from completing until all 25 GB of dirty pages have been written out, and this causes any program using the drive to hang until it has finished. The computer will not shut down if I ask it to; it just hangs on the Fedora shutdown logo and I have to turn it off at the switch. If I do that, after the restart my SSD is frozen because of the power loss during writing, which means I have to disconnect the SATA power cable to reset it. Highly annoying!
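
For context, a mapping like the one described above is typically set up along these lines in Java (just a sketch, assuming the wrapper is the standard FileChannel.map()/MappedByteBuffer API; the path and sizes here are made up):

Code:
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MapExample {
    public static void main(String[] args) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile("/data/bigfile.dat", "rw");
             FileChannel ch = raf.getChannel()) {
            // FileChannel.map() is limited to 2 GB per mapping, so a 25 GB file
            // has to be covered by several regions like this one.
            MappedByteBuffer region =
                ch.map(FileChannel.MapMode.READ_WRITE, 0, Integer.MAX_VALUE);
            region.put(0, (byte) 42);   // dirties a page; the kernel writes it back later
            // Note there is no explicit unmap here -- which is exactly the problem.
        }
    }
}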

I would like to be able to delete the file immediately and have any unfinished dirty page writes simply discarded from memory, since the file they would be paged out to no longer exists. Is there a kernel parameter that enables this behaviour and gives deletes preference?

Last edited by undernet; 08-01-2013 at 09:14 AM.
 
Old 08-01-2013, 06:34 PM   #2
jailbait
Guru
 
Registered: Feb 2003
Location: Blue Ridge Mountain
Distribution: Debian Wheezy, Debian Jessie
Posts: 7,578

Rep: Reputation: 186
Do you issue a munmap command before you delete the file?

---------------------
Steve Stites
 
Old 08-02-2013, 06:55 PM   #3
undernet
LQ Newbie
 
Registered: Jul 2013
Location: UK
Distribution: Fedora
Posts: 6

Original Poster
Rep: Reputation: Disabled
It turns out there is no way to make sure a memory-mapped file is unmapped in Java; see this 11-year-old bug: http://bugs.sun.com/view_bug.do?bug_id=4724038

The only workaround is to null out every reference to the memory-mapped buffer, then call System.gc() in Java and pray that the garbage collector actually runs.
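
In code, the workaround amounts to something like this (a sketch only; the path is hypothetical, and there is no guarantee the collector unmaps anything promptly):

Code:
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class UnmapWorkaround {
    public static void main(String[] args) throws Exception {
        File file = new File("/data/bigfile.dat");          // hypothetical path
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        FileChannel ch = raf.getChannel();
        MappedByteBuffer buffer = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);

        // ... use the buffer ...

        ch.close();
        raf.close();
        buffer = null;   // drop every reference to the mapped buffer
        System.gc();     // and hope the GC finalizes it -- that is when the JVM calls munmap

        System.out.println("deleted: " + file.delete());
    }
}

(On pre-Java-9 JVMs there is also the unsupported trick of casting the buffer to sun.nio.ch.DirectBuffer and calling cleaner().clean() to force an immediate unmap, but that relies on internal API and can break between releases.)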
 
Old 08-05-2013, 12:06 PM   #4
sundialsvcs
Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 5,425

Rep: Reputation: 1159
It certainly does not surprise me in the slightest that, if you have dirtied hundreds or thousands of pages in a (memory-mapped) disk file, the operating system is going to make sure that all of those physical disk-writes actually get done!

If what you want instead is a RAM disk, such that you really don't care whether physical disk writes ever occur, then you can have that too ... and you can mmap() it as well.
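
A sketch of what that could look like, assuming a tmpfs mount such as /dev/shm (present by default on Fedora and most other distributions):

Code:
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class RamDiskMap {
    public static void main(String[] args) throws Exception {
        // /dev/shm is tmpfs, so the pages live in RAM (and swap) and nothing
        // is ever written to the SSD.
        try (RandomAccessFile raf = new RandomAccessFile("/dev/shm/scratch.dat", "rw");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 64L * 1024 * 1024);
            buf.putLong(0, 12345L);
            // Deleting the file later is instant -- there are no physical
            // disk writes to wait for.
        }
    }
}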
 
Old 08-05-2013, 02:55 PM   #5
undernet
LQ Newbie
 
Registered: Jul 2013
Location: UK
Distribution: Fedora
Posts: 6

Original Poster
Rep: Reputation: Disabled
How can you create a RAM disk for the file if the file is bigger than RAM itself? That's why I am memory-mapping the file: it will not all fit in RAM. If it did, I would just perform writes to the RAM disk and then write the whole RAM disk out to the SSD once every few minutes or so in case of power failure.
 
Old 08-06-2013, 11:16 AM   #6
sundialsvcs
Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 5,425

Rep: Reputation: 1159
Most commonly, you map a portion of the file in a "sliding window" approach.
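
In Java terms, the sliding window might look something like this (a rough sketch; the window size and class layout are invented, and each remap leaves the previous MappedByteBuffer for the garbage collector to release, as discussed earlier in the thread):

Code:
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class SlidingWindow {
    private static final long WINDOW = 256L * 1024 * 1024;   // 256 MB window, arbitrary

    private final FileChannel channel;
    private MappedByteBuffer window;
    private long windowStart = -1;

    public SlidingWindow(String path) throws Exception {
        channel = new RandomAccessFile(path, "rw").getChannel();
    }

    // Returns a buffer positioned at absolute file offset pos, remapping only
    // when pos falls outside the currently mapped window (assumes pos lies
    // within the existing file size).
    public MappedByteBuffer at(long pos) throws Exception {
        long start = (pos / WINDOW) * WINDOW;                 // align window to its size
        if (start != windowStart) {
            long len = Math.min(WINDOW, channel.size() - start);
            window = channel.map(FileChannel.MapMode.READ_WRITE, start, len);
            windowStart = start;
        }
        window.position((int) (pos - windowStart));
        return window;
    }
}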

However, it all comes down to your algorithm of choice. If you are truly accessing "that much data," and "truly accessing all of it," then in the end you are going to pay the price of all those disk-writes ... whether the disk-writes come from memory-mapped I/O or from paging caused by the virtual memory subsystem.

You don't describe what you are actually doing ... what your algorithm is ... so it is impossible to speculate on how the algorithm might be improved. But I would hazard an almost-certain guess that it could be, using hash tables or other data structures that permit addressing a very large name space while storing only the portion that is actually used. If you're waiting noticeable seconds or minutes for a bunch of pending writes, then you're beating up the computer pretty badly, and it's going to have bruises and a bad attitude.
 
1 member found this post helpful.
Old 08-06-2013, 08:49 PM   #7
undernet
LQ Newbie
 
Registered: Jul 2013
Location: UK
Distribution: Fedora
Posts: 6

Original Poster
Rep: Reputation: Disabled

It's funny that you mention hash tables, because that's exactly what my program is: a disk-based hash table, which I am currently stress-testing. The file is split into 4 KB buckets matching the page size of the OS and the SSD. When a write is performed, the key is hashed to the right bucket and the key/value pair inserted; the 4 KB bucket is then written to mapped memory for the OS to page out to disk whenever it feels like it. It seems the deletes were taking ages because the SSD I was using was not up to scratch. I was using a Samsung SSD 840 (not the Pro version) and it would just lock up under heavy writes for ages, and I was also using the XFS file system, which has issues with deletes. I have since switched to an Intel SSD 320 with ext4 and it's much better; I can delete the file soon after the program exits.
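
For what it's worth, the bucket addressing described above boils down to roughly this (a sketch; the record layout, hash, and names are invented for illustration, and map stands for a MappedByteBuffer covering the relevant part of the file):

Code:
import java.nio.MappedByteBuffer;
import java.nio.charset.StandardCharsets;

public class BucketStore {
    private static final int BUCKET_SIZE = 4096;   // matches the OS page / SSD page size
    private final MappedByteBuffer map;            // mapping of (part of) the file
    private final long bucketCount;

    public BucketStore(MappedByteBuffer map, long bucketCount) {
        this.map = map;
        this.bucketCount = bucketCount;
    }

    public void put(String key, byte[] value) {
        long bucket = (key.hashCode() & 0x7fffffffL) % bucketCount;   // key -> bucket index
        int pos = (int) (bucket * BUCKET_SIZE);                       // bucket -> byte offset
        byte[] k = key.getBytes(StandardCharsets.UTF_8);

        // Write a trivial [keyLen][key][valLen][value] record at the start of the
        // bucket; a real implementation scans for free space and handles overflow.
        map.putInt(pos, k.length);              pos += 4;
        for (byte b : k)     map.put(pos++, b);
        map.putInt(pos, value.length);          pos += 4;
        for (byte b : value) map.put(pos++, b);
        // The touched page is now dirty in the page cache; the kernel writes it
        // back to the file whenever it decides to.
    }
}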

The issue I'm having now is slowdowns on large files. If I insert 100 million 4 KB buckets into a 2 GB file it all goes fine; I can do about 200k inserts per second from start to finish. But if I do the same number of inserts with the same data on a 20 GB file, I get a gradual slowdown after about 5 million inserts, and it just gets worse from there. iotop says my inserts are taking place at 100 MB/sec with disk I/O near 100% when I first start, but 8 million inserts down the line my inserts are only running at 10 MB/sec with I/O still near 100%, which I can't get my head around. It's not as if my data writing patterns change during the insert, as each page being written is mostly random due to the hash function; the only thing I can think of that changes is the Linux kernel's paging behaviour, but I don't have time to figure out what is going on inside the kernel's paging thread.

I will try using a random access file instead of mapped memory and see if I get the same issues. It should be slower in theory, due to the explicit OS write calls, but maybe bypassing the OS virtual memory paging system will get me more consistent performance on large files.
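
The explicit-write variant being considered would look roughly like this (again a sketch, with the same invented 4 KB bucket layout; RandomAccessFile is assumed since that is the usual Java class for positioned writes):

Code:
import java.io.RandomAccessFile;

public class BucketStoreRaf {
    private static final int BUCKET_SIZE = 4096;
    private final RandomAccessFile file;
    private final long bucketCount;

    // Opening with "rw" still goes through the page cache; "rwd" would force
    // each write to reach the device before write() returns.
    public BucketStoreRaf(String path, long bucketCount) throws Exception {
        this.file = new RandomAccessFile(path, "rw");
        this.bucketCount = bucketCount;
    }

    public synchronized void writeBucket(String key, byte[] bucket) throws Exception {
        if (bucket.length != BUCKET_SIZE) {
            throw new IllegalArgumentException("bucket must be " + BUCKET_SIZE + " bytes");
        }
        long index = (key.hashCode() & 0x7fffffffL) % bucketCount;
        file.seek(index * BUCKET_SIZE);   // explicit positioning instead of page faults
        file.write(bucket);               // explicit write(2) instead of dirtying a mapped page
    }
}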

Last edited by undernet; 08-07-2013 at 02:45 PM.
 
  



