Linux - Newbie: This Linux forum is for members that are new to Linux.
Just starting out and have a question?
If it is not in the man pages or the how-to's this is the place!
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
I'm copying a load of 1GB files to a USB stick. It's USB 2, so I expect it to be slow, but that's not the problem.
I just want to understand what's happening. I use a terminal session with the cp command like this:
Code:
cp -v CBS1*.mp4 /media/jonke/MOVIES2/CBS/
There are ten files CBS10.mp4 to CBS19.mp4 all about 1.0GB in size.
Each copy takes about 15-20 minutes.
I monitor the process in Thunar, and you can see the timestamp changing and the size increasing as the file is copied. This only takes about 2 minutes, until the timestamp stops and Thunar reports the full 1.0GB is on the flash drive.
Then it appears to do nothing for the next 15-20 minutes before it copies the next file.
I'm guessing the copy is still going on even though you would think it's finished.
So for the last file you MUST wait until the terminal prompt is shown.
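One way to make the returning prompt actually mean "the data is on the stick" is to run sync after each copy. A minimal sketch with stand-in paths (replace them with your real source directory and /media/jonke/MOVIES2/CBS/):

```shell
# Stand-in paths for illustration only.
SRC=/tmp/cp_demo_src
DEST=/tmp/cp_demo_dest
mkdir -p "$SRC" "$DEST"
printf 'dummy movie data' > "$SRC/CBS10.mp4"

for f in "$SRC"/CBS1*.mp4; do
    cp -v "$f" "$DEST/"
    sync    # block until the cached writes have reached the device
done
```

With the sync inside the loop, each cp line in the output corresponds to a file that is really on the device before the next one starts.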
That is called caching. Files are copied into the cache, not directly to the pendrive.
Before the next file starts, the cache is flushed, so the [previous] file is written (copied from the cache) to the device.
That's what I thought. So if I copy just one file, Thunar shows that the full 1.0GB is on the drive after about 2 minutes.
Ten minutes later the terminal prompt returns, so during those ten minutes the cache is actually being written out and cleared.
If I removed the flash drive before the cache was written, i.e. before the terminal prompt comes back, the file would be incomplete even though Thunar shows the file has been written.
Is there any way to monitor the cache? Can I see what's in it and how large it is?
Is that what the sync command is for?
I don't know the answers to all these questions!
After a bit of DDG searching, I found three things I could do:
1. Upgrade cp with the latest coreutils
2. Install pv, then use: pv infile > outfile
3. Use dd: dd if=in of=out status=progress
Options 2 and 3 will require a loop - no problem with that.
Option 1 needs cp to be rebuilt.
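The loop for option 3 might look something like this (stand-in paths for illustration; point SRC and DEST at your real directories):

```shell
# Loop version of option 3: dd with a live progress readout.
SRC=/tmp/dd_demo_src
DEST=/tmp/dd_demo_dest
mkdir -p "$SRC" "$DEST"
printf 'dummy movie data' > "$SRC/CBS11.mp4"

for f in "$SRC"/CBS1*.mp4; do
    # status=progress prints bytes copied and the current rate to stderr
    dd if="$f" of="$DEST/$(basename "$f")" bs=4M status=progress
done
```

Option 2 is the same loop with `pv "$f" > "$DEST/$(basename "$f")"` in the body.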
Incomplete ignorance is worse than total ignorance. I have incomplete ignorance; you are edging toward it.
Does that ameliorate the angst?
I thought not.
The page cache is global, and no, you cannot monitor (or control) your particular usage of it for that copy process. Yes, sync can help, but not as you might wish - it generally lengthens the entire procedure, even if it provides data integrity on disk.
USB has its own particular issues, but they should have been cleaned up by now. I'll get back with a suggestion after I check my old notes.
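The global totals can at least be watched: the Dirty and Writeback lines in /proc/meminfo show how much cached data (system-wide, not per-copy) is still waiting to reach a device. During a big copy they climb, then drain back toward zero as the kernel flushes to the stick:

```shell
# System-wide dirty-cache totals; not specific to any one copy process.
grep -E '^(Dirty|Writeback):' /proc/meminfo
# To watch it live while a copy runs:
#   watch -n1 "grep -E '^(Dirty|Writeback):' /proc/meminfo"
```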
You can use dd with "oflag=direct" option to bypass the kernel cache. The "status=progress" option will then show the actual transfer rate to the device. Note that the block size may have an even larger effect on the transfer rate than it does when the cache is used.
@syg00: Remember that we are merely talking about a pen drive, not a fellow participant ... "participants are off limits."
-----
USB sticks are designed to be capacious, but the technical compromise of their design is that they are extremely slow on writing.
If you need to transfer data in this way, I simply recommend that you purchase a USB or USB-C SSD "hard drive" at your local office-supply store, which should cost you about fifty bucks (USD). This device is designed to be "fast," and it should have one or two terabytes of capacity. Like a "stick," it requires no external power.
Last edited by sundialsvcs; 06-21-2022 at 09:09 AM.
I'm not bothered about the speed - I know USB sticks are slow, even USB 3.0. What I want to see is a progress bar of some sort, and maybe a way of speeding it up like rknichols suggested. I'll test it out. Thanks
Just tried this and it's 10 times slower when you use oflag=direct.
dd and pv seem to be equally quick: a 1GB file took approx 160 seconds and there wasn't another 20-minute wait while it flushed through.
UPDATE
When I tried this with the actual files, all 1GB and over, they pv'd or dd'd in 160 seconds or so, but then it stopped - I gave up at this point and have gone back to cp.
sync is OK, it's just that it will sync everything, not only that pendrive.
eject may help too; you need to wait for the prompt again, but finally you can remove that device.
pv is not really useful - it is just a progress meter and will not influence the usage of the cache. dd may also write into the cache.
If you remove that pendrive without syncing (or ejecting), the result will be unpredictable (a corrupted filesystem, missing files/dirs, other kinds of surprises).
But in general your approach is wrong. The time required to copy 1GB is probably 15 minutes. If you don't need to remove that device, the OS will use the cache and complete the copy in 2 minutes (and will write the file later, without disturbing you). So the cache is used to mimic completion to the OS and put the real process into the background, but obviously it will not make it really faster - if you want to really have it on that device.
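On the "sync will sync everything" point: reasonably recent GNU coreutils (8.24 or later, if I remember right - check your version) lets sync take file arguments, so you can flush just one file, or just the filesystem it lives on, instead of every mounted device:

```shell
# coreutils sync with a file operand flushes only that file;
# -f flushes the whole filesystem containing it (e.g. just the pendrive).
echo 'demo' > /tmp/sync_demo.txt
sync /tmp/sync_demo.txt
sync -f /tmp/sync_demo.txt
```

Against the pendrive that would be something like `sync -f /media/jonke/MOVIES2`, leaving your other disks' caches alone.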
Yep, what you describe is exactly what's happening; it doesn't matter if you use a terminal and cp, or a file manager and drag and drop. It all takes the same amount of time.
Ah well, I just thought there may have been a way to speed things up.
Those little USB sticks have a wide range of write speeds. None of them write really fast, compared to an external hard drive. Drives/sticks stuck into a USB 2 port will write at 24-28MBps. But only if they want to. Usually one only gets 7-8MBps write speed on those little cheap thumb drives.
Three of them out of the same pack: two write at 7-12MBps, one writes at 1-3MBps. Obviously they have different chips in them.
At 7MBps, 1GB should take about 143 seconds to write, under 2.5 minutes.
Code:
>>> 1024000000 / 7168000
142.85714285714286
At 3MBps 1 GB should take 333 seconds to write, almost 6 minutes.
Code:
>>> 1024000000 / 3072000
333.3333333333333
At 1MBps, 1GB should take 1000 seconds, almost 17 minutes.
Code:
>>> 1024000000 / 1024000
1000.0
If yours are taking 15 min for 1GB, then they are dirt slow. I have noticed a difference in write speeds on mounted filesystems with different kernels. You could also look at the async and flush mount options. I know that works with NTFS.
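If the stick is FAT-formatted, the vfat driver's `flush` mount option tells the kernel to push data out soon after each operation, so less sits in the cache; the generic `sync` option goes further and makes every write synchronous (very slow, but the prompt then reflects reality). A hypothetical /etc/fstab line - the device name and mount point here are guesses, adjust to your system:

```
# hypothetical entry; /dev/sdb1 and the mount point are assumptions
/dev/sdb1  /media/jonke/MOVIES2  vfat  user,noauto,flush  0  0
```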
Quote:
Ah well, I just thought there may have been a way to speed things up.
Purchase faster/better sticks if you can find some. Or a little external SSD.
Quote:
Originally Posted by GPGAgent
Just tried this and it's 10 times slower when you use oflag=direct.
What block size option ("bs=" or "obs=") did you use? That operation is going to be dog slow doing a physical I/O operation for each 512-byte (the default) block.
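The block-size effect is easy to see even on a local scratch file. This sketch deliberately leaves out oflag=direct (so the cache softens the difference), but timing the two dd runs still shows the per-call overhead of tiny blocks:

```shell
# Write the same 16MB with tiny vs large blocks; run each dd under
# `time` to compare - bs=512 issues 32768 writes, bs=4M issues 4.
dd if=/dev/zero of=/tmp/bs_src bs=1M count=16 status=none
dd if=/tmp/bs_src of=/tmp/bs_small bs=512 status=none   # many small writes
dd if=/tmp/bs_src of=/tmp/bs_big   bs=4M  status=none   # few large writes
```

With oflag=direct the gap becomes dramatic, because every small block turns into a separate physical I/O to the device.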
Try the following to reduce the amount of I/O that is kept memory resident before being flushed. This is a system-wide metric and tramples on some other knobs so I tend to only use it when I have to (not in a long time actually). Set it back to zero afterwards. The value is somewhat arbitrary, but used to work for me.
Code:
sudo su -c "echo 10000000 > /proc/sys/vm/dirty_bytes"
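To see the current values without changing anything, and to restore the default afterwards (0 hands control back to the ratio-based dirty_ratio knobs):

```shell
# Read-only check of the current settings (no root needed).
cat /proc/sys/vm/dirty_bytes /proc/sys/vm/dirty_background_bytes
# When finished, restore the default (requires root):
#   sudo su -c "echo 0 > /proc/sys/vm/dirty_bytes"
```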