LinuxQuestions.org
Old 10-16-2019, 04:48 AM   #1
circus78
Member
 
Registered: Dec 2011
Posts: 273

Rep: Reputation: Disabled
bash script question


Hi,
my goal is to delete all files/directories in specific path, but retain the last "n".
I mean:

Code:
[~] # ls -lt /share/CACHEDEV1_DATA/Backup/NetBakData/User
drwxrwxrwx    3 admin    administ      4096 Oct 16 09:59 2019-10-16-10-00-00/
drwxrwxrwx    3 admin    administ      4096 Oct 15 09:59 2019-10-15-10-00-00/
drwxrwxrwx    3 admin    administ      4096 Oct 14 09:59 2019-10-14-10-00-00/
drwxrwxrwx    3 admin    administ      4096 Oct  9 09:59 2019-10-09-10-00-00/
drwxrwxrwx    3 admin    administ      4096 Oct  8 09:59 2019-10-08-10-00-00/
drwxrwxrwx    3 admin    administ      4096 Oct  7 09:59 2019-10-07-10-00-00/
I would like to keep most recent 4 directories.

If I type this command in relevant path (/share/CACHEDEV1_DATA/Backup/NetBakData/User), all is ok:

$ cd /share/CACHEDEV1_DATA/Backup/NetBakData/User
$ ls -lt | tail -n+4 | xargs rm -rf

If I use same command with absolute path, nothing happens:

$ ls -lt /share/CACHEDEV1_DATA/Backup/NetBakData/User | tail -n+4 | xargs rm -rf



why?

Thank you!
 
Old 10-16-2019, 05:12 AM   #2
berndbausch
LQ Addict
 
Registered: Nov 2013
Location: Tokyo
Distribution: Mostly Ubuntu and Centos
Posts: 6,316

Rep: Reputation: 2002
Quote:
Originally Posted by circus78 View Post
$ cd /share/CACHEDEV1_DATA/Backup/NetBakData/User
$ ls -lt | tail -n+4 | xargs rm -rf

If I use same command with absolute path, nothing happens:

$ ls -lt /share/CACHEDEV1_DATA/Backup/NetBakData/User | tail -n+4 | xargs rm -rf

why?
I don't think either of those two pipelines works. You create a long listing and pass tokens like drwxrwxrwx or admin to the rm command as if they were filenames. Try without the -l option.

EDIT: The rm -f option prevents error messages from being printed, so it should actually work (though it's not pretty).

Last edited by berndbausch; 10-16-2019 at 05:14 AM.
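A quick way to see what each pipeline actually hands to rm is to put echo in front of it; nothing gets deleted, and the argument list is printed instead. Throwaway directories under mktemp stand in for the real path here:

```shell
# Safe inspection: `echo` in front of rm prints the arguments instead
# of deleting anything. With -l, whitespace-splitting turns the
# permission, owner, size and date fields into separate arguments;
# without -l, only the names are passed.
cd "$(mktemp -d)"
mkdir 2019-10-07-10-00-00 2019-10-08-10-00-00 2019-10-09-10-00-00
ls -lt | tail -n +3 | xargs echo rm -rf   # long-format tokens, not just names
ls -1t | tail -n +3 | xargs echo rm -rf   # names only
```

Note that `ls -lt` also emits a leading "total N" line, which ends up in the argument list as well.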
 
Old 10-16-2019, 07:10 AM   #3
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
well, does
Code:
ls -tl /share/CACHEDEV1_DATA/Backup/NetBakData/User
list the files including the full path?

no

your first example works (with lots of errors) because, when run from inside the directory, some of the tokens eventually match real names in the current directory:

Code:
drwxrwxrwx    3 admin    administ      4096 Oct  8 09:59 2019-10-08-10-00-00/
drwxrwxrwx    3 admin    administ      4096 Oct  7 09:59 2019-10-07-10-00-00/
the tokens that are not real names produce errors (they do not exist);
the directory names do exist, so they get deleted.

pipe trains are awful; don't use them.
ls output is for humans; don't parse it in scripts.

I'm going to ignore "files", and only look at the "dirs", taking advantage of the sort order to delete the oldest

This assumes your dirnames will always have that "date format"

Code:
#!/bin/bash
LC_ALL=C
# set the locale to ensure sort order is consistent

DirPath=/share/CACHEDEV1_DATA/Backup/NetBakData/User
Keep=4

Dirs=( ${DirPath%/}/*/ )

for (( i=0 ; i < $(( ${#Dirs[@]} - $Keep )) ; i++ ))
do
    echo rm -rf \"${Dirs[$i]}\"
done

Nearly forgot
https://mywiki.wooledge.org/BashGuide/Arrays

Last edited by Firerat; 10-16-2019 at 07:12 AM.
 
1 member found this post helpful.
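The same glob-array idea can be sketched with nullglob set, so that an empty directory yields an empty array instead of the literal unmatched pattern. The mktemp path here just stands in for the real backup path:

```shell
#!/bin/bash
# Keep-the-newest-N sketch; nullglob makes an unmatched glob expand
# to nothing rather than to the literal pattern string.
shopt -s nullglob
DirPath=$(mktemp -d)                      # stand-in for the real path
mkdir "$DirPath"/2019-10-0{7,8,9}-10-00-00 \
      "$DirPath"/2019-10-1{4,5,6}-10-00-00
Keep=4

Dirs=( "${DirPath%/}"/*/ )                # lexical sort == chronological here
for (( i = 0; i < ${#Dirs[@]} - Keep; i++ )); do
    echo rm -rf "${Dirs[$i]}"             # drop `echo` once the output looks right
done
```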
Old 10-16-2019, 08:04 AM   #4
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,371

Rep: Reputation: 2750
Quote:
pipe trains are awful; don't use them
Pourquoi?
For me, it is just following the *nix philosophy of using a simple tool for a single task, then linking those tasks together.
 
Old 10-16-2019, 08:13 AM   #5
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
Quote:
Originally Posted by allend View Post
Pourquoi?
For me, it is just following the *nix philosophy of using a simple tool for a single task, then linking those tasks together.
did that ideology work for the OP?

did I achieve what the OP wanted without using pipe trains?

pipe trains are slow and ugly.
They do have a use in reconnaissance while you explore something, but once you begin writing a script you need to stop using those trains, especially the ones with UUOC and UUOE (useless use of cat, useless use of echo). Awful scripts, truly awful.
 
Old 10-16-2019, 09:07 AM   #6
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,340

Rep: Reputation: Disabled
Quote:
Originally Posted by Firerat View Post
did that ideology work for the OP ?
No, because the commands weren't used correctly. That's not really what I'd call an "ideological" issue.
Quote:
Originally Posted by Firerat View Post
did I achieve what the OP wanted to without using pipe trains?
Perhaps, perhaps not:
Quote:
Originally Posted by Firerat View Post
Code:
Dirs=( ${DirPath%/}/*/ )
The above will break if the string containing all the file names in ${DirPath} exceeds the OS command-line length limit. This limit can be anything from a few kilobytes (Cygwin...) to a few megabytes, so this is a very real issue. For that reason the practice of using shell expansion directly in scripts is discouraged.
Quote:
Originally Posted by Firerat View Post
pipe trains are slow and ugly
Whether they are slower than using a different approach depends very much on the task being performed.

As for "ugliness", a list of piped commands is usually exceedingly easy to understand. Of course, you could also make a list of commands look like a jumbled mess, but that's true for any method or tool used in scripting or programming. "A poor craftsman blames his tools."

(Edit: BTW, did you try the code before posting it? It doesn't do what you think it does, and certainly not what the OP asked for.)

Last edited by Ser Olmy; 10-16-2019 at 09:24 AM.
 
Old 10-16-2019, 09:20 AM   #7
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.9.2009
Posts: 5,727

Rep: Reputation: 2211
Quote:
Originally Posted by circus78 View Post
Hi,
my goal is to delete all files/directories in specific path, but retain the last "n".
I mean:

Code:
[~] # ls -lt /share/CACHEDEV1_DATA/Backup/NetBakData/User
drwxrwxrwx    3 admin    administ      4096 Oct 16 09:59 2019-10-16-10-00-00/
drwxrwxrwx    3 admin    administ      4096 Oct 15 09:59 2019-10-15-10-00-00/
drwxrwxrwx    3 admin    administ      4096 Oct 14 09:59 2019-10-14-10-00-00/
drwxrwxrwx    3 admin    administ      4096 Oct  9 09:59 2019-10-09-10-00-00/
drwxrwxrwx    3 admin    administ      4096 Oct  8 09:59 2019-10-08-10-00-00/
drwxrwxrwx    3 admin    administ      4096 Oct  7 09:59 2019-10-07-10-00-00/
I would like to keep most recent 4 directories.

If I type this command in relevant path (/share/CACHEDEV1_DATA/Backup/NetBakData/User), all is ok:

$ cd /share/CACHEDEV1_DATA/Backup/NetBakData/User
$ ls -lt | tail -n+4 | xargs rm -rf

If I use same command with absolute path, nothing happens:

$ ls -lt /share/CACHEDEV1_DATA/Backup/NetBakData/User | tail -n+4 | xargs rm -rf



why?

Thank you!
Run the ls -lt in each case and see what the difference is. Let us know what you think.
 
Old 10-16-2019, 09:38 AM   #8
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
Quote:
Originally Posted by Ser Olmy View Post
No, because the commands weren't used correctly. That's not really what I'd call an "ideological" issue.

Perhaps, perhaps not:

The above will break if the string containing all file names in ${DirPath} exceeds the OS line length limit. This limit can be anything from a few kilobytes (Cygwin...) to a few megabytes, so this is a very real issue. For that reason the practice of using shell expansion directly in scripts is discouraged.

Whether they are slower than using a different approach depends very much on the task being performed.

As for "ugliness", a list of piped commands is usually exceedingly easy to understand. Of course, you could also make a list of commands look like a jumbled mess, but that's true for any method or tool used in scripting or programming. "A poor craftsman blames his tools."

vars have no issue with ARG_MAX
https://stackoverflow.com/questions/...nd-input-limit

but if you want to craft something that will fail as an example I will be happy to look at it


an alternative

Code:
LC_ALL=C
DirPath=/share/CACHEDEV1_DATA/Backup/NetBakData/User
Keep=4

Dirs=( ${DirPath%/}/*/ )

for (( i=0 ; i < $(( ${#Dirs[@]} - $Keep )) ; i++ ))
do
     printf "%s\n" "${Dirs[$i]}"
done | xargs echo rm -r

another
Code:
LC_ALL=C
DirPath=/share/CACHEDEV1_DATA/Backup/NetBakData/User

Dirs=( ${DirPath%/}/*/ )
[[ ${#Dirs[@]} -le 4 ]] && exit 
unset Dirs[-1] Dirs[-2] Dirs[-3] Dirs[-4]
xargs echo rm -r ${Dirs[@]}
 
Old 10-16-2019, 09:42 AM   #9
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,371

Rep: Reputation: 2750
Quote:
pipe trains are slow and ugly
so you would not like this
 
Old 10-16-2019, 10:34 AM   #10
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,340

Rep: Reputation: Disabled
Quote:
Originally Posted by Firerat View Post
vars have no issue with ARG_MAX
That's very interesting! I knew this worked in Linux, but what I did not know was that it's part of the POSIX specification, and as such it should work basically everywhere.

I still maintain that it's not exactly "best practices" to use what basically amounts to a trick that only works with builtins. If used in any other context than assigning values to variables, it will break if the command isn't a builtin, which in many cases depends entirely on the shell being used.

You could end up with a perfectly generic looking command that would turn out to be a bashism under certain non-obvious circumstances.
Quote:
Originally Posted by Firerat View Post
an alternative

Code:
LC_ALL=C
DirPath=/share/CACHEDEV1_DATA/Backup/NetBakData/User
Keep=4

Dirs=( ${DirPath%/}/*/ )

for (( i=0 ; i < $(( ${#Dirs[@]} - $Keep )) ; i++ ))
do
     printf "%s\n" "${Dirs[$i]}"
done | xargs echo rm -r
This works fine, but the process of reading directories into an array slows the whole script down. Try this with a few thousand directory names and see what happens.

Simply generating a list of directories and piping them directly to rm skips that step entirely, and by using the correct command to fetch the directory names, the list could arrive pre-sorted.

I mean, you're using a pipe anyway.
Quote:
Originally Posted by Firerat View Post
another
Code:
LC_ALL=C
DirPath=/share/CACHEDEV1_DATA/Backup/NetBakData/User

Dirs=( ${DirPath%/}/*/ )
[[ ${#Dirs[@]} -le 4 ]] && exit 
unset Dirs[-1] Dirs[-2] Dirs[-3] Dirs[-4]
xargs echo rm -r ${Dirs[@]}
Again, did you try this? What's that xargs command doing there?

Also, both xargs and rm are external commands, and as such are subject to ARG_MAX.

And all this to avoid a pipe sequence that could look as simple as this:
Code:
ls -1dtr ${DirPath}/* | head -n -4 | xargs rm -r

Last edited by Ser Olmy; 10-16-2019 at 10:45 AM. Reason: s/files/directories/, s/2/4/
 
Old 10-16-2019, 10:44 AM   #11
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
yes, I threw the xargs in to deal with huge numbers of dirs


ah bless, another one without the basics


xargs is specifically there to get round the ARG_MAX issue

that was never a problem with my original script
ok, with lots of dirs the individual rm -r calls from the for loop are going to be slow, but the actual unlink work is going to take much longer than iterating over the array

man xargs
Code:
NAME
       xargs - build and execute command lines from standard input

SYNOPSIS
       xargs [options] [command [initial-arguments]]

DESCRIPTION
       This manual page documents the GNU version of xargs.  xargs reads items from the standard input, delimited by blanks (which can be protected with double or single quotes or a backslash) or newlines, and executes the command (default is /bin/echo) one or more times with any initial-arguments followed by items read from standard input.  Blank lines on the standard input are ignored.

       The command line for command is built up until it reaches a system-defined limit (unless the -n and -L options are used).  The specified command will be invoked as many times as necessary to use up the list of input items.  In general, there will be many fewer invocations of command than there were items in the input.  This will normally have significant performance benefits.  Some commands can usefully be executed in parallel too; see the -P option.

       Because Unix filenames can contain blanks and newlines, this default behaviour is often problematic; filenames containing blanks and/or newlines are incorrectly processed by xargs.  In these situations it is better to use the -0 option, which prevents such problems.  When using this option you will need to ensure that the program which produces the input for xargs also uses a null character as a separator.  If that program is GNU find for example, the -print0 option does this for you.

       If any invocation of the command exits with a status of 255, xargs will stop immediately without reading any further input.  An error message is issued on stderr when this happens.
please educate yourself
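The -0 behaviour the man page describes is easy to verify: a filename containing a space gets split by whitespace-delimited xargs, but survives NUL delimiting. Throwaway files, with echo in front of rm to keep it harmless:

```shell
# find -print0 emits NUL-terminated names; xargs -0 splits only on
# NUL, so "file one" stays a single argument. The leading `echo`
# prints the rm command instead of running it.
d=$(mktemp -d)
touch "$d/file one" "$d/file two"
find "$d" -type f -print0 | sort -z | xargs -0 echo rm --
```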
 
Old 10-16-2019, 11:11 AM   #12
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
Quote:
Originally Posted by allend View Post
so you would not like this
I nearly missed that,

Nice

this is faster (around 2x)

Code:
time cut -d " " -f6 </var/lib/slackpkg/pkglist | od -c -w1 -An | sort -u | grep -Ev "\n|\*"
and it deals with the *
 
Old 10-16-2019, 11:12 AM   #13
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,340

Rep: Reputation: Disabled
Quote:
Originally Posted by Firerat View Post
yes, I threw the xargs in to deal with huge numbers of dirs
If so, you did it wrong. The xargs command never receives any data and just sits there, because it expects data to arrive via a pipe, something you're not using. A quick test run would have told you that.
Quote:
Originally Posted by Firerat View Post
Code:
SYNOPSIS
       xargs [options] [command [initial-arguments]]
No matter how xargs is invoked, it must in fact be invoked before anything happens, and anything you supply as "initial-arguments" (that would be ${Dirs[@]} in this case) is obviously subject to ARG_MAX.

This is laughably easy to test. (Spoiler: The error message from xargs is "xargs: can not fit single argument within argument list size limit".)
Quote:
Originally Posted by Firerat View Post
that was never a problem with my original script
No, besides the potentially memory/time-hogging array, your original script contained some odd escaping that resulted in something I've honestly never seen before. You should remove just the echo statement and try it; it's actually completely safe.
Quote:
Originally Posted by Firerat View Post
ok, with lots of dir individual rm -r from the for loop is going to be slow, but the actual unlink is going to take much longer than iterating over the array
It's generating the array that takes time, and lots of it, for no benefit at all.

There surely must exist scenarios where it makes sense to read everything into memory before processing the data, but this just isn't one.
Quote:
Originally Posted by Firerat View Post
ah bless, another one without the basics
Quote:
Originally Posted by Firerat View Post
please educate yourself
You don't think it's a bit arrogant to make statements such as these, when you haven't even checked to see that perhaps you might be in the wrong? Which in this case, you were. (And even if you weren't, why be rude?)
 
Old 10-16-2019, 11:19 AM   #14
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,850

Rep: Reputation: 7309
Quote:
Originally Posted by allend View Post
so you would not like this
that is completely off-topic here, but anyway: yes, it is ugly and slow.
But usually that does not count, because the machine is more or less idle and has time to execute a lot of programs simultaneously. Furthermore, 0.1 ms or 15 ms or anything similar (as execution time) is virtually the same; all are acceptable.
But in a production environment, where there are plenty of users running hundreds of tasks, this is just a waste of time, RAM and CPU.
That pipe chain [most of them] can easily be replaced by a single awk/perl/python/java/whatever script, which will be much faster and more efficient. You can find examples in this forum too.
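As an illustration of that point, here is a typical cut | sort | uniq -c chain next to a single-awk equivalent, over some made-up passwd-style data (the /tmp/pw.demo file is invented for the example):

```shell
# Count login shells: a three-process pipe chain vs one awk process.
printf 'a:x:/bin/bash\nb:x:/bin/sh\nc:x:/bin/bash\n' > /tmp/pw.demo

# pipe chain: cut, sort and uniq each fork a process
cut -d: -f3 /tmp/pw.demo | sort | uniq -c

# single awk: one process, same counts (order of output lines may differ)
awk -F: '{n[$3]++} END {for (s in n) print n[s], s}' /tmp/pw.demo
```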
 
Old 10-16-2019, 11:21 AM   #15
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
you are right


corrected (which would have happened during initial testing):
Code:
LC_ALL=C
DirPath=/share/CACHEDEV1_DATA/Backup/NetBakData/User

Dirs=( ${DirPath%/}/*/ )
[[ ${#Dirs[@]} -le 4 ]] && exit 
unset Dirs[-1] Dirs[-2] Dirs[-3] Dirs[-4]
<<<${Dirs[@]} xargs echo rm -r
I'm writing these directly on the forum, no syntax highlighting, no Ctrl+p (vim ;)
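The here-string mechanism itself is easy to sanity-check with throwaway names; echo in front of rm keeps it non-destructive:

```shell
# <<<"${Dirs[@]}" joins the remaining array elements with spaces and
# feeds them to xargs on stdin; the leading `echo` makes xargs print
# the rm command instead of running it.
Dirs=( /tmp/old-1/ /tmp/old-2/ )
<<<"${Dirs[@]}" xargs echo rm -r
```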
 
  

