LinuxQuestions.org
Old 06-21-2022, 06:48 AM   #1
GPGAgent
Senior Member
 
Registered: Oct 2018
Location: Surrey UK
Distribution: Mint 20 xfce 64bit
Posts: 1,026
Blog Entries: 3

Rep: Reputation: 133
Copying large files to USB stick


I'm copying a load of 1GB files to a USB stick. It's USB-2, so I expect it to be slow, but that's not the problem.

I just want to understand what's happening. I use a terminal session with the cp command like this:
Code:
cp -v CBS1*.mp4 /media/jonke/MOVIES2/CBS/
There are ten files CBS10.mp4 to CBS19.mp4 all about 1.0GB in size.

Each copy takes about 15-20mins

I monitor the process in Thunar and you can see the timestamp changing as the file is copied. This only takes about 2 minutes till the timestamp stops and Thunar reports the full 1.0GB is on the flash drive. You can see the size increasing as the copy progresses.

Then it appears to do nothing for the next 15-20mins before it copies the next file.

I'm guessing the copy is still going on even though you would think it's finished.

So for the last file you MUST wait till the terminal prompt is shown.

What's going on?

Last edited by GPGAgent; 06-21-2022 at 06:50 AM.
 
Old 06-21-2022, 06:51 AM   #2
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,850

Rep: Reputation: 7309
That is called the cache. Files are copied into the cache, not directly to the pendrive.
For the next file the cache has to be cleared, so the previous file is written (copied from the cache) to the device.
 
Old 06-21-2022, 07:48 AM   #3
GPGAgent
Senior Member
 
Registered: Oct 2018
Location: Surrey UK
Distribution: Mint 20 xfce 64bit
Posts: 1,026

Original Poster
Blog Entries: 3

Rep: Reputation: 133
Quote:
Originally Posted by pan64 View Post
That is called the cache. Files are copied into the cache, not directly to the pendrive.
For the next file the cache has to be cleared, so the previous file is written (copied from the cache) to the device.
That's what I thought. So if I copy just one file Thunar shows that the full 1.0GB is on the drive after about 2 minutes.
Ten minutes later the terminal prompt returns, so during that ten minutes the cache is actually being written and cleared.
If I removed the flash drive before the cache was written, i.e. before the terminal prompt comes back, the file would be incomplete even though Thunar shows the file has been written.
Is there any way to monitor the cache? Can I see what's in it and how large it is?
Is that what the sync command is for?
I don't know, all these questions!
 
Old 06-21-2022, 08:07 AM   #4
GPGAgent
Senior Member
 
Registered: Oct 2018
Location: Surrey UK
Distribution: Mint 20 xfce 64bit
Posts: 1,026

Original Poster
Blog Entries: 3

Rep: Reputation: 133
After a bit of DuckDuckGo searching, I found three things I could do:
1. Upgrade cp with the latest coreutils
2. Install pv, then use: pv infile > outfile
3. Use dd: dd if=in of=out status=progress
Options 2 and 3 will require a loop, no problem with that.
Option 1 needs cp to be rebuilt.

Comments?
 
Old 06-21-2022, 08:36 AM   #5
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,128

Rep: Reputation: 4120
Incomplete ignorance is worse than total ignorance. I have incomplete ignorance, you are edging toward it.
Does that ameliorate the angst?

I thought not.

The page cache is global, and no, you cannot monitor (or control) your particular usage of it for that copy process. Yes, sync can help, but not as you might wish: it generally lengthens the entire procedure, even while providing data integrity on disk.
USB has its own particular issues, but they should have been cleaned up by now. I'll get back with a suggestion after I check my old notes.
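The share belonging to one copy cannot be singled out, but the system-wide totals can at least be watched. A minimal sketch, assuming a Linux /proc/meminfo that contains "Dirty:" and "Writeback:" lines reported in kB; the function names here are made up for illustration:

```python
import time


def parse_dirty_kib(meminfo_text):
    """Pick the system-wide Dirty and Writeback values (in kB) out of meminfo text."""
    wanted = {"Dirty", "Writeback"}
    found = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            # Lines look like "Dirty:   123456 kB"; keep only the number.
            found[name] = int(rest.split()[0])
    return found


def read_dirty_kib(path="/proc/meminfo"):
    """Read the current totals from the kernel (path is a parameter for testing)."""
    with open(path) as handle:
        return parse_dirty_kib(handle.read())


def watch_dirty(samples=10, interval=1.0, path="/proc/meminfo"):
    """Print the totals a fixed number of times, one sample per interval."""
    for _ in range(samples):
        print(read_dirty_kib(path))
        time.sleep(interval)
```

Calling watch_dirty() in a second terminal while cp runs should show Dirty shrinking as data reaches the stick. These numbers cover every device on the machine, not just the pendrive.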
 
1 members found this post helpful.
Old 06-21-2022, 09:06 AM   #6
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
You can use dd with "oflag=direct" option to bypass the kernel cache. The "status=progress" option will then show the actual transfer rate to the device. Note that the block size may have an even larger effect on the transfer rate than it does when the cache is used.
 
1 members found this post helpful.
Old 06-21-2022, 09:07 AM   #7
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

Rep: Reputation: 3941
@syg00: Remember that we are merely talking about a pen drive, not a fellow participant ... "participants are off limits."

-----

USB sticks are designed to be capacious, but the technical compromise of their design is that they are extremely slow on writing.

If you need to transfer data in this way, I simply recommend that you purchase a USB or USB-C SSD "hard drive" at your local office-supply store, which should cost you about fifty bucks (USD). This device is designed to be "fast," and it should have one or two terabytes of capacity. Like a "stick," it requires no external power.

Last edited by sundialsvcs; 06-21-2022 at 09:09 AM.
 
Old 06-21-2022, 09:26 AM   #8
GPGAgent
Senior Member
 
Registered: Oct 2018
Location: Surrey UK
Distribution: Mint 20 xfce 64bit
Posts: 1,026

Original Poster
Blog Entries: 3

Rep: Reputation: 133
Quote:
Originally Posted by sundialsvcs View Post
@syg00: Remember that we are merely talking about a pen drive, not a fellow participant ... "participants are off limits."

-----

USB sticks are designed to be capacious, but the technical compromise of their design is that they are extremely slow on writing.

If you need to transfer data in this way, I simply recommend that you purchase a USB or USB-C SSD "hard drive" at your local office-supply store, which should cost you about fifty bucks (USD). This device is designed to be "fast," and it should have one or two terabytes of capacity. Like a "stick," it requires no external power.
I'm not bothered about the speed; I know USB sticks are slow, even USB 3.0. What I want to see is a progress bar of some sort, and maybe a way of speeding it up like rknichols suggested. I'll test it out. Thanks.
 
Old 06-21-2022, 09:59 AM   #9
GPGAgent
Senior Member
 
Registered: Oct 2018
Location: Surrey UK
Distribution: Mint 20 xfce 64bit
Posts: 1,026

Original Poster
Blog Entries: 3

Rep: Reputation: 133
Quote:
Originally Posted by rknichols View Post
You can use dd with "oflag=direct" option to bypass the kernel cache. The "status=progress" option will then show the actual transfer rate to the device. Note that the block size may have an even larger effect on the transfer rate than it does when the cache is used.
Just tried this and it's 10 times slower when you use oflag=direct.


dd and pv seem to be equally quick: a 1GB file took approx 160 seconds and there wasn't another 20 min wait while it flushed through.

UPDATE
When I tried this with the actual files, all 1GB and over, they pv'd or dd'd in 160 seconds or so, but then it stopped. I gave up at this point and have gone back to cp.

I tried this on two machines with the same result.

Last edited by GPGAgent; 06-21-2022 at 10:49 AM.
 
Old 06-21-2022, 10:58 AM   #10
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,850

Rep: Reputation: 7309
sync is OK, but it will sync everything, not only that pendrive.
eject may help too: you need to wait for the prompt again, but after that you can finally remove the device.
pv is not really useful; it is just a progress meter and will not influence the usage of the cache. dd may also write into the cache.
If you remove the pendrive without syncing (or ejecting), the result will be unpredictable (corrupted filesystem, missing files/dirs, other kinds of surprises).

But in general your approach is wrong. The time required to copy 1GB is probably 15 minutes. If you don't need to remove the device right away, the OS will use the cache and complete the copy in 2 minutes (and will write the file later, without disturbing you). So the cache makes the operation look finished to the program while the real write continues in the background, but obviously it will not make the copy any faster if you really need the data on that device.
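If the goal is a progress figure that reflects what has actually reached the stick, one option is to flush after every chunk. A rough sketch of that idea, not what cp, pv or dd do: the function name and the report callback are made up for illustration, and forcing a flush per chunk may well make the whole copy slower.

```python
import os


def copy_with_fsync(src, dst, chunk_size=4 * 1024 * 1024, report=print):
    """Copy src to dst, pushing each chunk out to the device before continuing."""
    total = os.path.getsize(src)
    done = 0
    with open(src, "rb") as reader, open(dst, "wb") as writer:
        while True:
            chunk = reader.read(chunk_size)
            if not chunk:
                break
            writer.write(chunk)
            writer.flush()             # Python's buffer -> kernel page cache
            os.fsync(writer.fileno())  # ask the kernel to write this file out now
            done += len(chunk)
            report(done, total)        # progress now tracks flushed data
    return done
```

Unlike a bare sync, os.fsync() is requested for this one file only. Called as copy_with_fsync("CBS10.mp4", "/media/jonke/MOVIES2/CBS/CBS10.mp4"), it prints the bytes flushed so far next to the total after each chunk.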
 
1 members found this post helpful.
Old 06-21-2022, 12:20 PM   #11
GPGAgent
Senior Member
 
Registered: Oct 2018
Location: Surrey UK
Distribution: Mint 20 xfce 64bit
Posts: 1,026

Original Poster
Blog Entries: 3

Rep: Reputation: 133
Quote:
Originally Posted by pan64 View Post
sync is OK, but it will sync everything, not only that pendrive.
eject may help too: you need to wait for the prompt again, but after that you can finally remove the device.
pv is not really useful; it is just a progress meter and will not influence the usage of the cache. dd may also write into the cache.
If you remove the pendrive without syncing (or ejecting), the result will be unpredictable (corrupted filesystem, missing files/dirs, other kinds of surprises).

But in general your approach is wrong. The time required to copy 1GB is probably 15 minutes. If you don't need to remove the device right away, the OS will use the cache and complete the copy in 2 minutes (and will write the file later, without disturbing you). So the cache makes the operation look finished to the program while the real write continues in the background, but obviously it will not make the copy any faster if you really need the data on that device.
Yep, what you describe is exactly what's happening. It doesn't matter if you use a terminal and cp or a file manager and drag and drop; it all takes the same amount of time.


Ah well, I just thought there may have been a way to speed things up.
 
Old 06-21-2022, 01:18 PM   #12
teckk
LQ Guru
 
Registered: Oct 2004
Distribution: Arch
Posts: 5,137
Blog Entries: 6

Rep: Reputation: 1826
Those little USB sticks have a wide range of write speeds. None of them write very fast compared to an external hard drive. Drives/sticks plugged into a USB2 port will write at 24-28MBps at best. But only if they want to. Usually one only gets 7-8MBps write speed on those cheap little thumb drives.

I have three of them out of the same pack: two write at 7-12MBps, one writes at 1-3MBps. They obviously have different chips in them.

At 7MBps, 1GB should take about 143 seconds to write, almost 2.5 minutes.
Code:
>>> 1024000000 / 7168000
142.8571428571428
At 3MBps 1 GB should take 333 seconds to write, almost 6 minutes.
Code:
>>> 1024000000 / 3072000
333.3333333333333
At 1MBps, 1GB should take 1000 seconds to write, almost 17 minutes.
Code:
>>> 1024000000 / 1024000
1000.0
If yours are taking 15 min for 1GB, then they are dirt slow. I have noticed a difference in write speeds on mounted file systems with different kernels. You could also look at the async and flush mount options. I know that works with ntfs.
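The same arithmetic can be run in both directions. A small sketch, assuming decimal units (1 GB = 1,000,000,000 bytes, 1 MB = 1,000,000 bytes) and file sizes of exactly 1 GB; the function names are made up for illustration:

```python
GB = 1_000_000_000  # decimal gigabyte, assumed for illustration
MB = 1_000_000      # decimal megabyte


def transfer_seconds(size_bytes, rate_bytes_per_s):
    """Time needed to write size_bytes at a sustained rate."""
    return size_bytes / rate_bytes_per_s


def implied_rate_mb_per_s(size_bytes, seconds):
    """Sustained rate (in MB/s) implied by an observed copy time."""
    return size_bytes / seconds / MB


for rate in (7, 3, 1):
    secs = transfer_seconds(GB, rate * MB)
    print(f"{rate} MB/s: {secs:.0f} s ({secs / 60:.1f} min)")
# 7 MB/s: 143 s (2.4 min)
# 3 MB/s: 333 s (5.6 min)
# 1 MB/s: 1000 s (16.7 min)

# Working backwards from the 15-20 minutes per file reported at the top of the thread:
print(f"{implied_rate_mb_per_s(GB, 15 * 60):.2f} MB/s")  # 1.11 MB/s
print(f"{implied_rate_mb_per_s(GB, 20 * 60):.2f} MB/s")  # 0.83 MB/s
```

So 15-20 minutes per 1GB file works out to a stick that sustains roughly 1MBps.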

Quote:
Ah well, I just thought there may have been a way to speed things up.
Purchase faster/better sticks if you can find some. Or a little external SSD.
 
1 members found this post helpful.
Old 06-21-2022, 04:37 PM   #13
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
Quote:
Originally Posted by rknichols View Post
You can use dd with "oflag=direct" option to bypass the kernel cache. The "status=progress" option will then show the actual transfer rate to the device. Note that the block size may have an even larger effect on the transfer rate than it does when the cache is used.
Quote:
Originally Posted by GPGAgent View Post
Just tried this and it's 10 times slower when you use oflag=direct.
What block size option ("bs=" or "obs=") did you use? That operation is going to be dog slow doing a physical I/O operation for each 512-byte (the default) block.
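To put numbers on that, here is a quick count of how many block-sized writes a 1GB file needs at various block sizes. It is only a sketch: it assumes a file of exactly 1,000,000,000 bytes and that each block becomes one write to the device, as described above for oflag=direct.

```python
def writes_needed(size_bytes, block_size):
    """Number of block-sized writes needed to cover size_bytes (the last may be partial)."""
    return (size_bytes + block_size - 1) // block_size


ONE_GB = 1_000_000_000  # file size assumed for illustration

for block_size in (512, 2048, 1024 * 1024, 4 * 1024 * 1024):
    print(f"bs={block_size}: {writes_needed(ONE_GB, block_size)} writes")
# bs=512: 1953125 writes
# bs=2048: 488282 writes
# bs=1048576: 954 writes
# bs=4194304: 239 writes
```

Going from 512 bytes to a few megabytes per block cuts the number of device writes from millions to hundreds.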
 
Old 06-21-2022, 11:00 PM   #14
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,128

Rep: Reputation: 4120
Try the following to reduce the amount of I/O that is kept memory resident before being flushed. This is a system-wide metric and tramples on some other knobs so I tend to only use it when I have to (not in a long time actually). Set it back to zero afterwards. The value is somewhat arbitrary, but used to work for me.
Code:
sudo su -c "echo 10000000 > /proc/sys/vm/dirty_bytes"
 
1 members found this post helpful.
Old 06-22-2022, 06:37 AM   #15
GPGAgent
Senior Member
 
Registered: Oct 2018
Location: Surrey UK
Distribution: Mint 20 xfce 64bit
Posts: 1,026

Original Poster
Blog Entries: 3

Rep: Reputation: 133
Quote:
Originally Posted by rknichols View Post
What block size option ("bs=" or "obs=") did you use? That operation is going to be dog slow doing a physical I/O operation for each 512-byte (the default) block.
I used 2048 for both
 
  



All times are GMT -5. The time now is 10:34 AM.
