LinuxQuestions.org
Old 10-16-2019, 11:26 AM   #16
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,842

Rep: Reputation: 7308

do we really need xargs?
Code:
LC_ALL=C
DirPath=/share/CACHEDEV1_DATA/Backup/NetBakData/User

Dirs=( ${DirPath%/}/*/ )
[[ ${#Dirs[@]} -le 4 ]] && exit 
unset Dirs[-1] Dirs[-2] Dirs[-3] Dirs[-4] 
echo rm -rf "${Dirs[@]}"
If that command line is too long, we can cd to DirPath first. If it is still too long, we can try xargs.
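A minimal sketch of the xargs fallback mentioned here, assuming bash 4.3 or newer and GNU xargs. The function name prune_oldest and its arguments are hypothetical, not from the post. It keeps the echo from the original so nothing is deleted until that is removed.

```shell
#!/usr/bin/env bash
# Sketch: keep the last KEEP directories (in glob order), list the rest.
# prune_oldest is a hypothetical helper name used only for illustration.
prune_oldest() {
    local dir=${1%/} keep=${2:-4}
    # An empty result here means the argument was "" or "/": refuse both.
    [[ -n $dir && -d $dir ]] || return 1
    local -a dirs
    shopt -s nullglob            # an empty directory yields an empty array
    dirs=( "$dir"/*/ )
    shopt -u nullglob
    (( ${#dirs[@]} > keep )) || return 0
    # Drop the last KEEP entries from the deletion list.
    dirs=( "${dirs[@]:0:${#dirs[@]}-keep}" )
    # printf is a shell builtin, so the kernel's argument-length limit does
    # not apply to it; xargs -0 splits the list into as many calls as needed.
    printf '%s\0' "${dirs[@]}" | xargs -0 echo rm -rf --
}
```

Called as prune_oldest /path/to/backups 4, it prints the rm command lines it would run. Dropping the echo turns the dry run into a real deletion.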
 
Old 10-16-2019, 11:35 AM   #17
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,340

Rep: Reputation: Disabled
Quote:
Originally Posted by Firerat View Post
I'm writing these direct on the forum, no syntax highlight, no Ctrl+p ( vim ; )
I do that myself sometimes. It usually results in multiple edits.

BTW, have you considered the possible implications of this:
Quote:
Originally Posted by Firerat View Post
Code:
Dirs=( ${DirPath%/}/*/ )
Under some circumstances (not in these particular scripts, though, due to the loop) performing an rm -r on the resulting variable $Dir might have a, shall we say, interesting effect.

It makes me shudder whenever someone appends a "/" behind a variable used to hold a path or filename marked for deletion. I think the person who wrote the (un)installer for the Bungie game "Myth II" back in 1998 might have done just that.
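To make the concern concrete, here is a small sketch that only prints the pattern the shell would end up with. The function names show_target and guarded_target are made up for this illustration, and the guard shown (the ${var:?} expansion) is one common option, not something proposed in the thread.

```shell
#!/usr/bin/env bash
# Sketch: what the pattern becomes when the variable is empty, and one guard.
# show_target and guarded_target are hypothetical names for illustration.

show_target() {
    # Mirrors the pattern from the thread; prints it instead of expanding it.
    local DirPath=$1
    printf '%s\n' "${DirPath%/}/*/"
}

guarded_target() {
    # Strip one trailing slash first, so "/" becomes empty as well.
    local DirPath=${1%/}
    # ${var:?msg} fails the expansion if the variable is unset or empty;
    # the subshell keeps that failure from ending the calling script.
    ( printf '%s\n' "${DirPath:?empty or root path}/*/" ) 2>/dev/null
}

show_target ""        # prints /*/ which would match every top-level directory
show_target "/"       # prints /*/ as well
guarded_target "" || echo "refused: empty path"
```

With an empty or root DirPath the unguarded pattern is /*/, which is exactly the case to avoid handing to rm -r.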
 
Old 10-16-2019, 11:48 AM   #18
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,340

Rep: Reputation: Disabled
Quote:
Originally Posted by pan64 View Post
do we really need xargs?
Code:
LC_ALL=C
DirPath=/share/CACHEDEV1_DATA/Backup/NetBakData/User

Dirs=( ${DirPath%/}/*/ )
[[ ${#Dirs[@]} -le 4 ]] && exit 
unset Dirs[-1] Dirs[-2] Dirs[-3] Dirs[-4] 
echo rm -rf "${Dirs[@]}"
If that command line is too long, we can cd to DirPath first. If it is still too long, we can try xargs.
Not to be beating a dead horse or anything, but what's the advantage of that code as opposed to this:
Code:
#!/bin/sh
DirPath=/share/CACHEDEV1_DATA/Backup/NetBakData/User/
Keep=4

ls -1dtr "${DirPath}"* | head -n -${Keep} | xargs rm -r
This is measurably faster, a lot simpler (in my opinion), and it has the advantage of not relying on specific directory names. Instead, it keeps the four most recent directories based on timestamps, which is really what the OP asked for.

(Honest question.)

Edit: Yes, this also contains a variant of the "empty path" vulnerability, but you'd have to manually type in a blank or wrong path for it to fail.

Last edited by Ser Olmy; 10-16-2019 at 12:14 PM. Reason: Quotes around paths are nice
 
Old 10-16-2019, 11:55 AM   #19
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,842

Rep: Reputation: 7308
Quote:
Originally Posted by Ser Olmy View Post
Not to be beating a dead horse or anything, but what's the advantage of that code as opposed to this:
Code:
#!/bin/sh
DirPath=/share/CACHEDEV1_DATA/Backup/NetBakData/User/
Keep=4

ls -1dtr ${DirPath}* | head -n -${Keep} | xargs rm -r
This is measurably faster, a lot simpler (in my opinion), and it has the advantage of not relying on specific directory names. Instead, it keeps the four most recent directories based on timestamps, which is really what the OP asked for.

(Honest question.)
That is an interesting example, because now your code is more readable (and a lot simpler), yes. But I do not think it is faster. How did you measure that?
And - if the filenames (dirnames) contain spaces it will not work properly.
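One way around the whitespace problem is to pass names NUL-terminated through the whole pipeline. This is a sketch that assumes the GNU versions of find, sort, head, cut and xargs (their -z and -0 options); list_old_dirs is a hypothetical helper name, not something from the thread.

```shell
#!/usr/bin/env bash
# Sketch, assuming GNU find/sort/head/cut/xargs.
# list_old_dirs (hypothetical) prints, NUL-terminated, every subdirectory
# of $1 except the $2 most recently modified ones.
list_old_dirs() {
    local dir=$1 keep=${2:-4}
    find "$dir" -mindepth 1 -maxdepth 1 -type d -printf '%T@\t%p\0' |
        sort -z -n |            # oldest first, by modification time
        head -z -n "-$keep" |   # drop the newest $keep records
        cut -z -f2-             # strip the timestamp column
}

# Dry run: prints the rm command line instead of deleting anything.
# list_old_dirs /path/to/backups 4 | xargs -0 -r echo rm -r --
```

Because every record ends in a NUL byte rather than a newline, names containing spaces or even newlines reach rm as single arguments.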
 
Old 10-16-2019, 12:08 PM   #20
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
Quote:
Originally Posted by Ser Olmy View Post
I do that myself sometimes. It usually results in multiple edits.

BTW, have you considered the possible implications of this:

Under some circumstances (not in these particular scripts, though, due to the loop) performing an rm -r on the resulting variable $Dir might have a, shall we say, interesting effect.

It makes me shudder whenever someone appends a "/" behind a variable used to hold a path or filename marked for deletion. I think the person who wrote the (un)installer for the Bungie game "Myth II" back in 1998 might have done just that.

Yeah, tests should be done,
i.e. check that ${DirPath} exists (and is not /).

This is going to be a problem regardless;
it's one reason I had the echo in there.
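The kind of test described here could look roughly like the following. It is only a sketch: check_dir is a hypothetical helper name, and resolving the path with cd -P and pwd -P is one assumption about how to catch spellings such as /// or a symlink that points at the root.

```shell
#!/usr/bin/env bash
# Sketch: validate a path before anything destructive is run on it.
# check_dir is a hypothetical helper; it prints the resolved path on success.
check_dir() {
    local path=$1 resolved
    [ -n "$path" ] || { echo "empty path" >&2; return 1; }
    [ -d "$path" ] || { echo "not a directory: $path" >&2; return 1; }
    # Resolve symlinks and redundant slashes before comparing with /.
    resolved=$(cd -P -- "$path" && pwd -P) || return 1
    [ "$resolved" != "/" ] || { echo "refusing to work on /" >&2; return 1; }
    printf '%s\n' "$resolved"
}

# Usage: DirPath=$(check_dir "$DirPath") || exit 1
```

The check complements the echo dry run rather than replacing it: the echo shows what would be deleted, while the check stops the script before it gets that far.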
 
Old 10-16-2019, 12:14 PM   #21
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,340

Rep: Reputation: Disabled
Quote:
Originally Posted by pan64 View Post
That is an interesting example, because now your code is more readable (and a lot simpler), yes. But I do not think it is faster. How did you measure that?
I generated 100,000 directories (seq 0 1 99999 | xargs mkdir) and ran a test on an old server.

Turns out populating an array with that many elements takes a non-trivial amount of time. This is arguably irrelevant if the number of directories is anything but huge, but the point is that this makes the script a tiny bit slower, while the initial argument for using an array had to do with speed.
Quote:
Originally Posted by pan64 View Post
And - if the filenames (dirnames) contain spaces it will not work properly.
You're right, there have to be double quotes around the parameter to the ls command; I'll fix that.

Of course, I could argue that the OP's directory structure doesn't have directories with spaces, but then I just made the argument that the solution ought to be more generic, so... mea culpa.

Edit: And I just realized that this is an overly simplistic solution, vulnerable to precisely the same issues that I've been arguing against. Mea maxima culpa, I guess.

pan64 had the right idea:
Quote:
Originally Posted by pan64 View Post
If that command line is too long, we can cd to DirPath first.
That gets rid of the parameter expansion to the ls command, and provides an extra sanity check to boot.

Revised version:
Code:
#!/bin/sh
DirPath=/share/CACHEDEV1_DATA/Backup/NetBakData/User
Keep=4

cd "$DirPath" || exit 1

ls -1tr | head -n -${Keep} | xargs rm -r

Last edited by Ser Olmy; 10-16-2019 at 12:25 PM.
 
Old 10-16-2019, 12:19 PM   #22
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
why did you sort the array?

I did say that I was taking advantage of the dirname format and the default order bash would glob them in



background

https://mywiki.wooledge.org/BashGuide/Arrays
 
Old 10-16-2019, 12:24 PM   #23
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,340

Rep: Reputation: Disabled
Quote:
Originally Posted by Firerat View Post
why did you sort the array?

I did say that I was taking advantage of the dirname format and the default order bash would glob them in
I didn't sort anything, I just ran your code. I don't even know why I said that; my brain must have temporarily overheated.

Using the default order is certainly a valid approach in this particular case.
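A short sketch of why the default glob order is enough for names like the OP's: zero-padded YYYY-MM-DD-HH-MM-SS strings sort the same way lexically and chronologically. The function name demo_glob_order is hypothetical, and the directories are created in a throwaway temp directory.

```shell
#!/usr/bin/env bash
# Sketch: glob order equals chronological order for zero-padded timestamps.
# demo_glob_order is a hypothetical name used only for this illustration.
demo_glob_order() {
    local tmp d
    tmp=$(mktemp -d) || return 1
    # Created deliberately out of order.
    mkdir "$tmp/2019-10-16-11-30-00" "$tmp/2019-10-07-11-30-00" \
          "$tmp/2019-10-14-11-30-00" "$tmp/2019-10-09-11-30-00"
    local -a dirs=( "$tmp"/*/ )
    for d in "${dirs[@]}"; do
        d=${d%/}                     # drop the trailing slash from the glob
        printf '%s\n' "${d##*/}"     # print only the directory name
    done
    rm -rf -- "$tmp"
}

demo_glob_order
```

The oldest name comes out first regardless of creation order. This relies on the zero padding: a name such as 2019-10-9 would sort after 2019-10-14, and then timestamps would be the safer key.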
 
Old 10-16-2019, 12:40 PM   #24
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
Quote:
Originally Posted by Firerat View Post
I nearly missed that,

Nice

this is faster ( around 2x )

Code:
time cut -d " " -f6 </var/lib/slackpkg/pkglist | od -c -w1 -An | sort -u | grep -Ev "\n|\*"
and it deals with the *

this is even better

twice the speed of yours, and one ( ok maybe one and a half ) commands
Code:
time  awk '{gsub(/[[:print:]]/,"&\n",$6);printf "%s", $6 | "sort -u"}' /var/lib/slackpkg/pkglist
 
Old 10-16-2019, 12:53 PM   #25
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
hmm, you must be doing something wrong

I will write a script to time the population of the array
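Such a timing script might be sketched along these lines. The helper names make_dirs, count_with_array and count_with_pipe are hypothetical, and the directory naming is an assumption, not the layout used in anyone's actual test.

```shell
#!/usr/bin/env bash
# Sketch: compare populating a bash array from a glob with an ls pipeline.
# All three function names are hypothetical.

make_dirs() {            # create $2 zero-padded directories inside $1
    ( cd "$1" && seq -w 0 "$(( $2 - 1 ))" | sed 's/^/dir-/' | xargs mkdir )
}

count_with_array() {     # populate an array from the glob, print its size
    local -a dirs=( "$1"/*/ )
    echo "${#dirs[@]}"
}

count_with_pipe() {      # the same count via ls in a pipeline
    ( cd "$1" && ls -1 | wc -l )
}

# Usage, on a scratch directory only:
#   t=$(mktemp -d); make_dirs "$t" 100000
#   time count_with_array "$t"
#   time count_with_pipe "$t"
```

Both functions only count entries, so the difference between the two timings is roughly the cost of building the array versus streaming names through a pipe.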
 
Old 10-16-2019, 01:00 PM   #26
circus78
Member
 
Registered: Dec 2011
Posts: 273

Original Poster
Rep: Reputation: Disabled

Quote:
Originally Posted by scasey View Post
Run ls -lt in each case and see what the difference is. Let us know what you think.
Code:
# cd /share/CACHEDEV1_DATA/Backup/NetBakData/User
# ls -lt
drwxrwxrwx    3 admin    administ      4096 Oct 16 11:29 2019-10-16-11-30-00/
drwxrwxrwx    3 admin    administ      4096 Oct 15 11:30 2019-10-15-11-30-00/
drwxrwxrwx    3 admin    administ      4096 Oct 14 11:30 2019-10-14-11-30-00/
drwxrwxrwx    3 admin    administ      4096 Oct  9 11:30 2019-10-09-11-30-00/
drwxrwxrwx    3 admin    administ      4096 Oct  8 11:30 2019-10-08-11-30-00/
drwxrwxrwx    3 admin    administ      4096 Oct  7 11:29 2019-10-07-11-30-00/
Code:
# ls -lt /share/CACHEDEV1_DATA/Backup/NetBakData/User
drwxrwxrwx    3 admin    administ      4096 Oct 16 11:29 2019-10-16-11-30-00/
drwxrwxrwx    3 admin    administ      4096 Oct 15 11:30 2019-10-15-11-30-00/
drwxrwxrwx    3 admin    administ      4096 Oct 14 11:30 2019-10-14-11-30-00/
drwxrwxrwx    3 admin    administ      4096 Oct  9 11:30 2019-10-09-11-30-00/
drwxrwxrwx    3 admin    administ      4096 Oct  8 11:30 2019-10-08-11-30-00/
drwxrwxrwx    3 admin    administ      4096 Oct  7 11:29 2019-10-07-11-30-00/
spot the differences!
 
Old 10-16-2019, 01:03 PM   #27
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,842

Rep: Reputation: 7308
I think that is 1 (one), not l (ell):
ls -1t
 
Old 10-16-2019, 01:05 PM   #28
circus78
Member
 
Registered: Dec 2011
Posts: 273

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by berndbausch View Post
I don't think either of the two pipelines works. You create a long listing and pass filenames like drwxrwxrwx or admin to the rm command. Try without the -l option.

EDIT: The rm -f option prevents error messages from being printed, so it should actually work (though it's not pretty).
Hi, it actually works fine:

Code:
[~] # cd /share/CACHEDEV1_DATA/Backup/NetBakData/User
[/share/CACHEDEV1_DATA/Backup/NetBakData/User] # ls
2019-10-07-11-30-00/ 2019-10-09-11-30-00/ 2019-10-15-11-30-00/
2019-10-08-11-30-00/ 2019-10-14-11-30-00/ 2019-10-16-11-30-00/
[/share/CACHEDEV1_DATA/Backup/NetBakData/User] # ls -lt | tail -n+4
drwxrwxrwx    3 admin    administ      4096 Oct  9 11:30 2019-10-09-11-30-00/
drwxrwxrwx    3 admin    administ      4096 Oct  8 11:30 2019-10-08-11-30-00/
drwxrwxrwx    3 admin    administ      4096 Oct  7 11:29 2019-10-07-11-30-00/
[/share/CACHEDEV1_DATA/Backup/NetBakData/User] # ls -lt | tail -n+4 | xargs rm -rf
[/share/CACHEDEV1_DATA/Backup/NetBakData/User] # ls
2019-10-14-11-30-00/ 2019-10-15-11-30-00/ 2019-10-16-11-30-00/
[/share/CACHEDEV1_DATA/Backup/NetBakData/User] #
Without -l, same results:

Code:
[/share/CACHEDEV1_DATA/Backup/NetBakData/User] # ls
2019-10-07-12-00-00/ 2019-10-09-12-00-00/ 2019-10-14-12-00-00/ 2019-10-16-12-00-00/
2019-10-08-12-00-00/ 2019-10-11-12-00-00/ 2019-10-15-12-00-00/
[/share/CACHEDEV1_DATA/Backup/NetBakData/User] # cd
[~] # ls -t /share/CACHEDEV1_DATA/Backup/NetBakData/USer| tail -n+4 | xargs rm -rf
[~] # ls /share/CACHEDEV1_DATA/Backup/NetBakData/User
2019-10-07-12-00-00/ 2019-10-09-12-00-00/ 2019-10-14-12-00-00/ 2019-10-16-12-00-00/
2019-10-08-12-00-00/ 2019-10-11-12-00-00/ 2019-10-15-12-00-00/
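A small sketch of what xargs does with one line of a long listing may help explain both transcripts. The function name words_from_listing is hypothetical. The sample line is copied from the listing above.

```shell
#!/usr/bin/env bash
# Sketch: show which words xargs would hand to rm for one long-listing line.
# words_from_listing is a hypothetical name used only for illustration.
words_from_listing() {
    # xargs splits its input on whitespace, so every column becomes a
    # separate argument; -n 1 runs printf once per argument.
    printf '%s\n' "$1" | xargs -n 1 printf '%s\n'
}

words_from_listing 'drwxrwxrwx    3 admin    administ      4096 Oct  9 11:30 2019-10-09-11-30-00/'
```

Each of the nine columns becomes its own argument, so rm -rf receives drwxrwxrwx, 3, admin and so on, silently skips the ones that do not exist, and removes the one that does. It would just as silently remove a file or directory that happened to be called admin or 3. The names are also bare, so they are resolved relative to the current directory. That would be consistent with the second transcript, where the pipeline is run from ~ and the listing afterwards appears unchanged.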
 
Old 10-16-2019, 01:19 PM   #29
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,340

Rep: Reputation: Disabled
Benchmarks

FWIW, here's how long the various scripts took to delete all but the last 4 out of 100,000 directories.

The directories were created using this procedure:
Code:
~$ mkdir /tmp/test
~$ cd /tmp/test
/tmp/test$ seq -w 0 1 99999 | xargs -I % mkdir qwerty-long-directory-name-asdfghjkl-zxcvbnm-%
The underlying /tmp/test directory was deleted and re-created between tests.

The tests were run via time in a new subshell, like this:
Code:
sync ; time ( bash ../script.sh ; sync )
One dry-run was performed before real testing began, just to populate disk caches.

(I then added the syncs to minimize the effects of RAM caching, but it actually made surprisingly little difference.)

These were the results:

Script #1 (Firerat's initial draft):
Code:
/tmp/test$ sync ; time ( bash ../test1.sh; sync )

real    3m54.783s
user    0m37.145s
sys     3m27.716s
Script #2 (Firerat's first alternative):
Code:
/tmp/test$ sync ; time ( bash ../test2.sh; sync )

real    0m7.035s
user    0m7.090s
sys     0m2.727s
Script #3 (Firerat's second alternative, last revision):
Code:
/tmp/test$ sync ; time ( bash ../test3.sh; sync )

real    0m12.537s
user    0m8.433s
sys     0m2.583s
Script #4 (my script incorporating pan64's suggestion):
Code:
/tmp/test$ sync ; time ( bash ../pipes.sh; sync )

real    0m2.631s
user    0m0.532s
sys     0m2.201s
Yes, the numbers for the first test are accurate; I could hardly believe it myself, and ran the test several times to confirm.

Hardware: The server used is an elderly Fujitsu Primergy RX300 S8 with a single 6-core Xeon E5-2620 v2 CPU running at 2.1 GHz, with 64 GB DDR3 RAM. The drive array is a 12 TB 4-drive RAID5 SAS array driven by an LSI MegaRAID SAS 2108 controller.

The server was otherwise totally idle during the tests.

Last edited by Ser Olmy; 10-16-2019 at 01:53 PM. Reason: added some more hardware details
 
Old 10-16-2019, 01:29 PM   #30
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.9.2009
Posts: 5,727

Rep: Reputation: 2211
Quote:
Originally Posted by circus78 View Post
spot the differences!
I don't see one.
 
  

