LinuxQuestions.org
Old 02-28-2012, 04:08 PM   #1
muggabug
LQ Newbie
 
Registered: Aug 2011
Posts: 18

Rep: Reputation: Disabled
mmap is not faster than read/write ???


Hi

I am trying to use mmap for copying files. However, my mmap programs are not any faster than my ordinary write/read programs.

I am looping over a number of files, and for each I do:

fdin=open(filenamein,O_RDONLY);
fstat(fdin,&statstruct);
fdout=open(filenameout,O_RDWR|O_CREAT|O_TRUNC,S_IRUSR|S_IWUSR);

for the mmap case I do

lseek(fdout,statstruct.st_size-1,SEEK_SET);
write(fdout,"",1);
mapfrom=mmap(0,statstruct.st_size,PROT_READ,MAP_FILE|MAP_SHARED,fdin,0);
mapto=mmap(0,statstruct.st_size,PROT_READ|PROT_WRITE,MAP_FILE|MAP_SHARED,fdout,0);
memcpy(mapto,mapfrom,statstruct.st_size);
munmap(mapto,statstruct.st_size);
munmap(mapfrom,statstruct.st_size);

and with an allocated 'buffer' of 'buffersize', I also tried:
while ((n=read(fdin,buffer,buffersize)) > 0)
    write(fdout,buffer,n);

Compiling with gcc/linux 2.4.X, both versions are just as fast. Why don't I get a faster copy with mmap as promised in the books?
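For reference, here is a minimal sketch of the mmap-to-mmap copy above with error checks filled in. It assumes a POSIX system; the function name copy_mmap is made up for illustration, and it uses ftruncate() to size the output file in place of the lseek()+write() trick, which is a substitution and not what the original code does. It is not expected to be any faster than the version above.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Copy src to dst through two shared mappings (hypothetical helper).
 * Returns 0 on success, -1 on any error. */
int copy_mmap(const char *src, const char *dst)
{
    struct stat st;
    void *from = MAP_FAILED, *to = MAP_FAILED;
    size_t len = 0;
    int fdin = -1, fdout = -1;
    int rc = -1;

    fdin = open(src, O_RDONLY);
    if (fdin < 0 || fstat(fdin, &st) < 0)
        goto out;
    fdout = open(dst, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fdout < 0)
        goto out;

    len = (size_t)st.st_size;
    if (len == 0) {             /* mmap() rejects a zero length */
        rc = 0;
        goto out;
    }

    /* Give the output file its final size before mapping it. */
    if (ftruncate(fdout, st.st_size) < 0)
        goto out;

    from = mmap(NULL, len, PROT_READ, MAP_SHARED, fdin, 0);
    if (from == MAP_FAILED)
        goto out;
    to = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fdout, 0);
    if (to == MAP_FAILED)
        goto out;

    memcpy(to, from, len);      /* page faults drive the actual disk I/O */
    rc = 0;

out:
    if (to != MAP_FAILED && munmap(to, len) < 0)
        rc = -1;
    if (from != MAP_FAILED)
        munmap(from, len);
    if (fdout >= 0 && close(fdout) < 0)
        rc = -1;
    if (fdin >= 0)
        close(fdin);
    return rc;
}
```

Calling copy_mmap(filenamein, filenameout) once per file in the loop would replace the fragment above.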
 
Old 02-28-2012, 04:13 PM   #2
sundialsvcs
Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 5,329

Rep: Reputation: 1099
Memory-mapped files push the responsibility for requesting I/O to the virtual-memory manager. However, the ultimate speed of the operation rests upon but one thing: was a physical I/O request necessary, or not?

If the answer is "yes," then of course there will be no appreciable speed difference. If, on the other hand, you are making many random requests for data within a particular window of a file, memory-mapping can help significantly, because it leverages the already highly-optimized algorithms of the virtual memory manager.

Last edited by sundialsvcs; 02-28-2012 at 04:14 PM.
 
Old 03-02-2012, 10:58 AM   #3
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: FreeBSD, Debian, Mint, Puppy
Posts: 3,287

Rep: Reputation: 173
I think mmap is more for ease of use than speed. That's why I use it anyway.

Just because you mmap'd a file doesn't mean it's been read into memory.
It will still just page it in when you access it. I imagine it will read in page-sized chunks, just as read does. You can try adjusting the size of the read buffer or comparing small and large files. If you stat a file it shows the preferred block size. Usually about 4096 bytes (4k) I think, or the size of a memory page.

Of course I could be wrong, or you could play about with this:
http://pubs.opengroup.org/onlinepubs...x_madvise.html
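A rough sketch of both ideas, assuming a POSIX system: one helper reports the preferred block size that stat gives (st_blksize), the other maps a file and passes a sequential-access hint with posix_madvise(). The names preferred_block_size and map_sequential are made up for illustration, and the hint is advisory only, so it may make no measurable difference.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return the preferred I/O block size for path, or -1 on error. */
long preferred_block_size(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;
    return (long)st.st_blksize;
}

/* Map a whole file read-only and hint that it will be read front to back.
 * Stores the mapping length in *len. Returns MAP_FAILED on error or if
 * the file is empty. */
void *map_sequential(const char *path, size_t *len)
{
    struct stat st;
    void *p = MAP_FAILED;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return MAP_FAILED;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        *len = (size_t)st.st_size;
        p = mmap(NULL, *len, PROT_READ, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            /* Advisory only: the kernel is free to ignore it. */
            (void)posix_madvise(p, *len, POSIX_MADV_SEQUENTIAL);
        }
    }
    close(fd);      /* the mapping stays valid after close() */
    return p;
}
```

The caller is responsible for munmap(p, len) when done with the mapping.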

Last edited by bigearsbilly; 03-02-2012 at 10:59 AM.
 
Old 03-02-2012, 12:53 PM   #4
sundialsvcs
Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 5,329

Rep: Reputation: 1099
That's exactly how it works. The virtual-memory system can "page in" from any source, not just the paging file. (This is used for example when managing modules and libraries of various kinds.) When you touch a portion of this mapped memory, a page-fault occurs and it is resolved by the OS from the specified file. It can be very efficient especially when many processes need to hit the same file because the copies can readily be shared using well-developed OS code. But, "fast" is entirely dependent on whether or not the data is present. If it's not, then a disk read is going to take place (as it would also take place with any other form of file I/O), and you're going to pay more or less the same price for the privilege. Under the right set of circumstances, for which it was designed, mmap() is the cat's meow. In other circumstances it is nondescript.
 
Old 03-02-2012, 01:30 PM   #5
muggabug
LQ Newbie
 
Registered: Aug 2011
Posts: 18

Original Poster
Rep: Reputation: Disabled
There is a bit more to it than I thought.
I had the code straight out of Stevens' Unix programming book. He used it for one single file. mmap is faster when used on one file, so I thought, well...
But for simply copying a list of files it doesn't seem to help because, I now understand, for each file it is certain that its contents will have to be read from disk.
 
Old 03-05-2012, 02:10 AM   #6
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: FreeBSD, Debian, Mint, Puppy
Posts: 3,287

Rep: Reputation: 173
It is a useful tool. Part of programming is to let the OS do the work for you.

It is good for strictly structured files of records.
e.g:
If you have a load of floating-point values in a file and you mmap them, you have an instant array.
No messing about with malloc and all that nonsense. Less chance of error.
Or if you are operating on a file, say encoding it, mmap it and you have a convenient giant char string.

Think of it more as saving programming time than processing time.
Much more valuable.

hint: if you extend an mmap'd file you will need to seek past the end first to establish the new size then write back at the append position.
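To illustrate the "instant array" idea, here is a small sketch assuming a POSIX system and a file holding raw float values in the machine's native byte order. The function name sum_floats and the file layout are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sum every float stored in a raw binary file by treating the mapping
 * as a plain array. Returns 0 on success, -1 on error or empty file. */
int sum_floats(const char *path, double *sum)
{
    struct stat st;
    const float *a;
    size_t len, count, i;
    double total = 0.0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    len = (size_t)st.st_size;

    /* No malloc() and no read() loop: the file simply becomes memory. */
    a = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (a == MAP_FAILED)
        return -1;

    count = len / sizeof(float);    /* trailing partial value is ignored */
    for (i = 0; i < count; i++)
        total += a[i];

    munmap((void *)a, len);
    *sum = total;
    return 0;
}
```

The mapping starts on a page boundary, so indexing it as float is safe as far as alignment goes.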

Last edited by bigearsbilly; 03-05-2012 at 02:13 AM.
 
Old 10-30-2012, 11:02 AM   #7
mbarley42
LQ Newbie
 
Registered: Oct 2012
Posts: 5

Rep: Reputation: Disabled
Quote:
Originally Posted by muggabug

lseek(fdout,statstruct.st_size-1,SEEK_SET);
write(fdout,"",1);
mapfrom=mmap(0,statstruct.st_size,PROT_READ,MAP_FILE|MAP_SHARED,fdin,0);
mapto=mmap(0,statstruct.st_size,PROT_READ|PROT_WRITE,MAP_FILE|MAP_SHARED,fdout,0);
memcpy(mapto,mapfrom,statstruct.st_size);
munmap(mapto,statstruct.st_size);
munmap(mapfrom,statstruct.st_size);

and with an allocated 'buffer' of 'buffersize', I also tried:
while ((n=read(fdin,buffer,buffersize)) > 0)
    write(fdout,buffer,n);

Compiling with gcc/linux 2.4.X, both versions are just as fast. Why don't I get a faster copy with mmap as promised in the books?
Hi, muggabug,

The problem with your code is that doing lseek() to the required end of file and then mmap() and memcpy() into it does not lay the file out nicely on the physical disk. If you are not on an SSD, your memcpy() will cause a lot of disk head movement. In fact, if you had copied first to an in-memory buffer and then to the destination mapping, you might have gotten a better time.

This is the advantage of the read()/write() example: you read into memory with the disk head held over the source file, then you write with the disk head held over the destination file. There is no head walking as in the mmap()-to-mmap() example.

Extending the destination file with lseek() is bound to create disk fragmentation, and you may get a better result by using mmap() instead of read() on the source file and plain write() on the output. With that approach I saw a 50% speedup over open() + read() in a large mailbox example.
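A sketch of that variant, assuming a POSIX system: the source is mapped read-only and the output is written with ordinary write() calls, so the destination file grows sequentially. The name copy_mmap_write is made up for illustration, and whether it beats plain read()/write() will depend on the disk and the page cache.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Copy src to dst: mmap() the source, write() the destination.
 * Returns 0 on success, -1 on any error. */
int copy_mmap_write(const char *src, const char *dst)
{
    struct stat st;
    const char *from = MAP_FAILED;
    size_t len = 0, done = 0;
    int fdin = -1, fdout = -1;
    int rc = -1;

    fdin = open(src, O_RDONLY);
    if (fdin < 0 || fstat(fdin, &st) < 0)
        goto out;
    fdout = open(dst, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fdout < 0)
        goto out;

    len = (size_t)st.st_size;
    if (len == 0) {             /* nothing to map or write */
        rc = 0;
        goto out;
    }

    from = mmap(NULL, len, PROT_READ, MAP_SHARED, fdin, 0);
    if (from == MAP_FAILED)
        goto out;

    /* write() may accept fewer bytes than asked for, so loop. */
    while (done < len) {
        ssize_t n = write(fdout, from + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto out;
        }
        done += (size_t)n;
    }
    rc = 0;

out:
    if (from != MAP_FAILED)
        munmap((void *)from, len);
    if (fdout >= 0 && close(fdout) < 0)
        rc = -1;
    if (fdin >= 0)
        close(fdin);
    return rc;
}
```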

Hope this helps.

Rgdz,
mbarley42
 
Old 10-30-2012, 11:39 AM   #8
mbarley42
LQ Newbie
 
Registered: Oct 2012
Posts: 5

Rep: Reputation: Disabled
In the following example, results are pretty disappointing for mmap() with lseek() to end of file and memcpy(). mmap() with a buffer roughly compares to plain read() in one huge chunk.

Code:
mtodorov@domac:~/c$ time ./mmap-cpy --read /var/mail/mtodorov m1
real    0m46.972s
user    0m0.000s
sys     0m2.580s
mtodorov@domac:~/c$ time ./mmap-cpy --mmap /var/mail/mtodorov m1
real    1m6.064s
user    0m0.320s
sys     0m1.400s
mtodorov@domac:~/c$ time ./mmap-cpy --mmap+buffer /var/mail/mtodorov m1
real    0m47.748s
user    0m0.632s
sys     0m1.600s
mtodorov@domac:~/c$
 
Old 10-30-2012, 03:13 PM   #9
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: FreeBSD, Debian, Mint, Puppy
Posts: 3,287

Rep: Reputation: 173
I have found in my travails that using a seek a lot is very expensive.

Have you tried using madvise?

Last edited by bigearsbilly; 10-30-2012 at 03:20 PM.
 
Old 11-01-2012, 02:08 PM   #10
NevemTeve
Senior Member
 
Registered: Oct 2011
Location: Budapest
Distribution: Debian/GNU/Linux, AIX
Posts: 1,676

Rep: Reputation: 487
Note: mmap is unix-specific, so if you want to develop multiplatform-programs, don't use it.
 
Old 11-02-2012, 10:33 AM   #11
mbarley42
LQ Newbie
 
Registered: Oct 2012
Posts: 5

Rep: Reputation: Disabled
Quote:
Originally Posted by bigearsbilly
I have found in my travails that using a seek a lot is very expensive.

Have you tried using madvise?
In the end you are defeated by disk speed even if you use the Linux-specific sendfile(2). The disk head just doesn't go any faster, and all four methods spend 2.000 - 2.500 seconds in work and 51.0s to 1m06s waiting on disk. Especially writing.
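For completeness, a sketch of what a sendfile(2) copy might look like. This is Linux-only, and file-to-file sendfile() needs kernel 2.6.33 or later (older kernels require the output to be a socket). The name copy_sendfile is made up for illustration; this is not the code behind the timings in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

/* Copy src to dst inside the kernel with sendfile(2).
 * Returns 0 on success, -1 on any error. */
int copy_sendfile(const char *src, const char *dst)
{
    struct stat st;
    off_t left;
    int rc = -1;
    int fdout;
    int fdin = open(src, O_RDONLY);

    if (fdin < 0)
        return -1;
    fdout = open(dst, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fdout < 0) {
        close(fdin);
        return -1;
    }

    if (fstat(fdin, &st) == 0) {
        left = st.st_size;
        rc = 0;
        /* One call may move fewer bytes than requested, so loop. */
        while (left > 0) {
            ssize_t n = sendfile(fdout, fdin, NULL, (size_t)left);
            if (n < 0 && errno == EINTR)
                continue;
            if (n <= 0) {       /* error, or the source shrank under us */
                rc = -1;
                break;
            }
            left -= n;
        }
    }

    if (close(fdout) < 0)
        rc = -1;
    close(fdin);
    return rc;
}
```

The data never passes through a user-space buffer here, but the disk still has to do the same reads and writes.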

Rgdz,
mbarley
 
  

