LinuxQuestions.org
Old 06-26-2010, 08:03 PM   #1
jorgemarmo
LQ Newbie
 
Registered: Sep 2009
Posts: 17

Rep: Reputation: 0
I need a shell script to delete files whose locations are read from a log file


Hi all,

I have two external hard drives that hold all my files. Yesterday I copied all the files from hdd2 to hdd1, and now I want to eliminate the duplicates. I used FSLint to find them, so I now have a text file that looks like this:
Code:
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/TESIS/TESIS CVT LABVIEW Y CODEWARRIOR/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Borrable/Pen_Drive_4GB/Tesis/Super CD de la tesis/LabView/LabVIEW85RuntimeEngineFull.exe
multiplied by millions of entries...

Now I want to write a shell script that deletes all the files whose entries in the log file begin with:
Code:
 /media/My Book/HDD_Toshiba/****
HDD_Toshiba is the folder on hdd1 (My Book) that contains all the files from hdd2.

I would appreciate any ideas, suggestions, and comments.

Last edited by jorgemarmo; 06-26-2010 at 08:04 PM.
 
Old 06-26-2010, 09:04 PM   #2
GlennsPref
Senior Member
 
Registered: Apr 2004
Location: Brisbane, Australia
Distribution: Devuan
Posts: 3,656
Blog Entries: 33

Rep: Reputation: 283
Hi, I use fslint too.

I invoke it from the shell, but it comes up in a simple gui.

I usually delete the files with fslint. (using Ctrl+click to highlight the files to remove)

You may need to be root to access the files.

Otherwise, look up sed or grep to filter the list.

You could use grep like this, writing the matching lines to a new file:

Code:
grep '^/media/My Book/HDD_Toshiba/' "filename" > newfilename.txt
Use absolute paths, or cd to the appropriate directory first.

Regards Glenn

Last edited by GlennsPref; 06-26-2010 at 09:06 PM. Reason: Ctrl+click
 
Old 06-26-2010, 09:27 PM   #3
jorgemarmo
LQ Newbie
 
Registered: Sep 2009
Posts: 17

Original Poster
Rep: Reputation: 0
GlennsPref thanks for your reply!

But I didn't explain myself correctly. I need to delete the files named by those entries in the log file that begin with "/media/My Book/HDD_Toshiba/".

BTW, I do use FSLint with the GUI, but this log file is huge, and FSLint crashes when I try to do anything with it.

Last edited by jorgemarmo; 06-26-2010 at 09:43 PM.
 
Old 06-27-2010, 01:58 AM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
I would just like to ask for clarification, do you want to delete the file from the "My Book" or from the log file?

Because if it were from the path mentioned, surely you would just do:
Code:
rm -rf /media/My Book/HDD_Toshiba
If it is from the log file, I would then suggest sed:
Code:
sed -i.bak '/HDD_Toshiba/d' log_file
 
Old 06-27-2010, 02:51 AM   #5
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
Quote:
Originally Posted by grail
I would just like to ask for clarification, do you want to delete the file from the "My Book" or from the log file?

Because if it were from the path mentioned, surely you would just do:
Code:
rm -rf /media/My Book/HDD_Toshiba
If it is from the log file, I would then suggest sed:
Code:
sed -i.bak '/HDD_Toshiba/d' log_file
Hi,

If I understand the OP correctly, it is not that simple. The OP has several duplicates in this directory, but not all of its files are duplicates. The paths to the duplicates are stored in the input file, and he wants to delete only those duplicates in the /media/My Book/HDD_Toshiba directory. So just doing
Code:
rm -rf /media/My Book/HDD_Toshiba
would probably result in data loss. Well, actually it won't, since the rm command will break unless the path is quoted or the space in it is escaped.
I had an idea for doing this with a one-liner, but the spaces always broke the rm command. I am not sure whether I can correct that behaviour by setting a certain shell option.
Anyway, here is the solution I suggest:
First, copy all duplicate entries to a new file:
Code:
grep '/media/My Book/HDD_Toshiba/' filename > dupsonly
Then delete the files with
Code:
while read line; do
rm -f "$line"
done < dupsonly
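
The two steps above can also be combined into one script with a dry-run switch. This is only a minimal sketch under stated assumptions, not the method posted above: the function name `delete_listed`, the `DRY_RUN` variable, and the temporary demo paths are made up for illustration, and bash is assumed. It reads with `IFS= read -r` so that leading whitespace and backslashes in a path are left untouched.

```shell
#!/usr/bin/env bash
# Sketch: delete every file listed (one path per line) in a list file,
# but only if the path starts with a given prefix.
# delete_listed, DRY_RUN and the demo paths below are hypothetical names.

delete_listed() {
    local list=$1 prefix=$2 path
    while IFS= read -r path; do
        case $path in
            "$prefix"*)                       # literal prefix match, no regex
                if [ "${DRY_RUN:-1}" = 1 ]; then
                    printf 'would remove: %s\n' "$path"
                else
                    rm -f -- "$path"
                fi
                ;;
        esac
    done < "$list"
}

# Demo on throwaway data, not on the real disk.
demo=$(mktemp -d)
mkdir -p "$demo/My Book/HDD_Toshiba" "$demo/My Book/keep"
touch "$demo/My Book/HDD_Toshiba/a file.txt" "$demo/My Book/keep/a file.txt"
printf '%s\n' "$demo/My Book/keep/a file.txt" \
              "$demo/My Book/HDD_Toshiba/a file.txt" > "$demo/dups.log"

DRY_RUN=1 delete_listed "$demo/dups.log" "$demo/My Book/HDD_Toshiba/"   # only prints
DRY_RUN=0 delete_listed "$demo/dups.log" "$demo/My Book/HDD_Toshiba/"   # really deletes
```

Because `DRY_RUN` defaults to 1 in this sketch, forgetting to set it only prints the paths; deletion has to be requested explicitly.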

Last edited by crts; 06-27-2010 at 03:04 AM.
 
Old 06-27-2010, 11:35 AM   #6
jorgemarmo
LQ Newbie
 
Registered: Sep 2009
Posts: 17

Original Poster
Rep: Reputation: 0
grail, thanks for your reply, but I don't want to delete everything under "/media/My Book/HDD_Toshiba", just the files listed in the log file. Being in this log means that a file also exists somewhere else inside "/media/My Book/"; I want to delete the copy in "/media/My Book/HDD_Toshiba" and keep the one in the other location, wherever it is.

crts, thanks! Now I have a couple of questions.
I'm running some tests using echo instead of rm, and this seems to work just fine:
Code:
grep '/media/My Book/HDD_Toshiba' logfile.log | while read file; do echo "$file"; done
and the spaces do not seem to break the echo, so I wonder:
1) If I replace the echo with rm, should I expect it to break?
2) What is the difference between "while read file" with "$file" and "while read line" with "$line"? The answer sounds a bit obvious, but since "file" seems to work fine, I wonder what the difference is.
 
Old 06-27-2010, 12:50 PM   #7
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
Quote:
Originally Posted by jorgemarmo
I'm making some tests using echo instead of rm ...
That is a most wise thing to do.
Quote:
1) should I expect that if I replace the echo with rm, will it break?
I ran the command on some sample data and it did not break. I hope you still have a backup of the data on the Toshiba hdd. Since you have such a huge amount of data, I guess you cannot check everything, so it is a good thing to have a backup, just in case. It is also wise to test the command on some dummy data first.
As for your approach of piping the grep into the while loop, I tried it and it also seemed to work fine. However, I had some other ideas with pipes, and in those cases the command broke: it printed fine with echo, even with quotes, but the shell did some funny stuff like interpreting the quote as part of the filename and breaking at the first whitespace. In those cases, however, I did not pipe to a while loop, so your approach might work just as well.

As for your second point, there is no difference.
Code:
while read file; do echo "$file"; done
'file' in this case is just a variable that stores the line read by the read command. You can call it whatever you like. Since in cases like yours it holds a complete line of a file, most people call it 'line'.
I hope this clears things up a bit.
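
To illustrate the point about variable names, here is a small sketch (bash assumed, sample paths made up) that runs the same loop with `file` and with `line` and compares the results. It also shows one thing that does make a difference: plain `read` without `-r` treats backslashes as escape characters, which is why `IFS= read -r` is often suggested for file names.

```shell
#!/usr/bin/env bash
# Sketch: the variable name given to `read` is arbitrary.
# The sample paths are made up for illustration.
sample=$'/media/My Book/a  b.txt\n/media/My Book/back\\slash.txt'

with_file=$(printf '%s\n' "$sample" | while IFS= read -r file; do printf '[%s]' "$file"; done)
with_line=$(printf '%s\n' "$sample" | while IFS= read -r line; do printf '[%s]' "$line"; done)

# Plain `read` (no -r) treats backslashes as escapes and drops them.
plain=$(printf '%s\n' "$sample" | while read line; do printf '[%s]' "$line"; done)

printf '%s\n%s\n%s\n' "$with_file" "$with_line" "$plain"
```

The first two printed lines are identical; the third has lost the backslash in the second path.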
 
Old 06-27-2010, 01:42 PM   #8
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
There's one problem when operating on filenames in a script: spaces. If your file names have any spaces in them, the shell will see each space as a word separator and treat each part as a separate file name.

The easiest way to work around this is to change the shell's default internal field separator (IFS) to newline only, so that it ignores spaces. Unfortunately, looking at the above file structure, you seem to also have multiple filenames per line, so that wouldn't work quite right either.

If you can, you should try to export the names you want to a new file first, re-formatting it so that there's exactly one filename per line, and nothing else. Then it becomes almost trivial to process them. Even something as simple as this should work with a file like that:
Code:
IFS=$'\n'  #sets the shell separator to newline

rm -f $(<deletefile.txt)

unset IFS  #returns the separator to normal (space/tab/newline)
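
The effect of IFS on an unquoted `$(<file)` can be checked without deleting anything. The following is a sketch only, assuming bash; the helper `count_args` and the demo file are hypothetical. It also turns off filename expansion with `set -f`, because an unquoted expansion would otherwise still expand characters such as `*` or `?` in a file name.

```shell
#!/usr/bin/env bash
# Sketch: how IFS changes the word splitting of $(<file).
# count_args and the demo file are hypothetical; nothing is deleted here.
count_args() { echo "$#"; }

list=$(mktemp)
printf '%s\n' '/media/My Book/one file.txt' '/media/My Book/two.txt' > "$list"

default_count=$(count_args $(<"$list"))     # split on space, tab and newline

old_ifs=$IFS
IFS=$'\n'                                   # split on newline only
set -f                                      # also stop * ? [ ] from being expanded
newline_count=$(count_args $(<"$list"))
set +f
IFS=$old_ifs

echo "default IFS: $default_count arguments, newline IFS: $newline_count arguments"
rm -f -- "$list"
```

With the default IFS the two paths fall apart into five words; with newline-only IFS they stay as two arguments.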
 
Old 06-27-2010, 02:19 PM   #9
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,780

Rep: Reputation: 2081
Code:
grep '^/media/My Book/HDD_Toshiba/' logfile.log | xargs -d '\n' rm
EDIT:
oops, I'm so used to using the -0 option I forgot xargs uses whitespace as delimiters.
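
For reference, here is a sketch of both delimiter styles on throwaway data. It assumes GNU xargs for `-d '\n'` (that option is a GNU extension); the second variant converts newlines to NUL bytes with `tr` so that `-0` can be used instead. The directory layout and file names are made up for the demo.

```shell
#!/usr/bin/env bash
# Sketch: newline-delimited paths fed to xargs without splitting on spaces.
# The directory layout is throwaway demo data.
demo=$(mktemp -d)
mkdir -p "$demo/HDD_Toshiba/sub dir"
touch "$demo/HDD_Toshiba/sub dir/one file.txt" "$demo/HDD_Toshiba/two.txt"
printf '%s\n' "$demo/HDD_Toshiba/sub dir/one file.txt" > "$demo/log1"
printf '%s\n' "$demo/HDD_Toshiba/two.txt"              > "$demo/log2"

# GNU xargs: treat each input line as exactly one argument.
grep "^$demo/HDD_Toshiba/" "$demo/log1" | xargs -d '\n' rm -f --

# Alternative: turn newlines into NUL bytes and use -0.
grep "^$demo/HDD_Toshiba/" "$demo/log2" | tr '\n' '\0' | xargs -0 rm -f --
```

Both variants rely on the log having exactly one path per line and no newlines inside file names.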

Last edited by ntubski; 06-27-2010 at 06:00 PM. Reason: set delimiters
 
Old 06-27-2010, 02:46 PM   #10
crts
Senior Member
 
Registered: Jan 2010
Posts: 2,020

Rep: Reputation: 757
Quote:
Originally Posted by ntubski
Code:
grep '^/media/My Book/HDD_Toshiba/' logfile.log | xargs rm
That did not work. The command breaks because of the whitespace. According to the man page xargs splits the arguments just like the shell does.
Changing the IFS did work though, as suggested by David the H.
 
Old 06-27-2010, 07:49 PM   #11
GlennsPref
Senior Member
 
Registered: Apr 2004
Location: Brisbane, Australia
Distribution: Devuan
Posts: 3,656
Blog Entries: 33

Rep: Reputation: 283
text list-tools

Hi, I found this blog informative...

http://linux.dsplabs.com.au/rmnl-rem...-sam-ssam-p65/

Code:
#!/bin/zsh
echo "Create a space-separated list from a top-down list with preceding chars."
cd ~/build

echo "remove old files"

rm -f ~/build/file2.txt
rm -f ~/build/file3.txt

echo "Remove -  (dash space) from lead of lines using sed...."
echo "sed 's/^[-\ ]*//' file1.txt (from file1.txt in pwd dir)"
echo "redirect output to new file...file2.txt"

sed 's/^[-\ ]*//' file1.txt > file2.txt

echo "Now replace newlines with a space... using file2.txt, create file3.txt"

sed ':a;N;$!ba;s/\n/ /g' file2.txt > file3.txt
more info... http://sed.sourceforge.net/sed1line.txt
This is probably a long way around, but it did the job for me.
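
The two sed steps in the script can be tried on a small made-up list. This sketch assumes GNU sed and bash; the sample items are invented, and `paste -s` is shown only as a possible shorter alternative for the joining step, not as what the blog recommends. Note that a space-separated list loses the boundaries between names that themselves contain spaces, so it does not suit the paths in this thread.

```shell
#!/usr/bin/env bash
# Sketch: strip leading "- " from each line, then join the lines with spaces.
# The sample items are made up for illustration.
file=$(mktemp)
printf '%s\n' '- alpha' '- beta' '- gamma' > "$file"

stripped=$(sed 's/^[- ]*//' "$file")                               # drop leading dashes/spaces
joined_sed=$(printf '%s\n' "$stripped" | sed ':a;N;$!ba;s/\n/ /g') # join as in the script
joined_paste=$(printf '%s\n' "$stripped" | paste -sd ' ' -)        # same result with paste

printf '%s\n%s\n' "$joined_sed" "$joined_paste"
rm -f -- "$file"
```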

Cheers, Glenn
 
Old 06-27-2010, 10:22 PM   #12
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Quote:
Originally Posted by crts
That did not work. The command breaks because of the whitespace. According to the man page xargs splits the arguments just like the shell does.
Changing the IFS did work though, as suggested by David the H.
It may not be just because of the whitespace, but also because grep, like so many other unix tools, works on a per-line basis. Since your file isn't formatted that way (or at least doesn't seem to be, it isn't completely clear based on the small sample you gave), the expression would have to be modified to handle it.

By the way, I realized a few hours after posting that my suggestion of "rm $(<file.txt)" would probably not be suitable here. It would work OK with a short list, but there is a limit on the number of characters that can be passed in a single command line, so a long list like yours would probably make it error out. It's better to use a loop in this case. xargs should also work (once you take care of the word separation issue), as I believe it will process the input in suitably sized chunks if necessary.
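
The chunking behaviour of xargs can be observed with a harmless command. This is a sketch assuming GNU xargs and bash; the limit of three arguments per call (`-n 3`) is artificially small so that the batches are visible, and the "file N.txt" names are made up.

```shell
#!/usr/bin/env bash
# Sketch: xargs splits a long argument list into several command invocations.
# The limit of 3 arguments per call (-n 3) is artificially small for the demo.
echo "system limit for arguments plus environment: $(getconf ARG_MAX) bytes"

# Ten "paths"; each output line below corresponds to one invocation of echo.
batches=$(printf 'file %s.txt\n' 1 2 3 4 5 6 7 8 9 10 | xargs -d '\n' -n 3 echo)
printf '%s\n' "$batches"
batch_count=$(printf '%s\n' "$batches" | wc -l)
```

Without `-n`, xargs picks the batch size itself so that each invocation stays under the system limit.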
 
Old 06-28-2010, 12:16 AM   #13
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
Personally, I have found that using the while loop with process substitution works around the spacing issues.
So I would have done the following (which is along the same lines as crts's idea):
Code:
while read file
do
    rm -f "$file"
done< <(grep '^/media/My Book/HDD_Toshiba/' logfile.log)
Compared to piping the grep into the while loop, this has an advantage should you expand the script at a later date: any variables set in the loop will retain their values afterwards, because the loop runs in the current shell, whereas piping into the loop runs it in its own subshell.
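
The subshell difference is easy to see with a counter. This is a minimal sketch assuming bash with its default settings (the `lastpipe` option off); the input lines and variable names are made up.

```shell
#!/usr/bin/env bash
# Sketch: a counter set inside the loop survives only when the loop
# is not the receiving end of a pipeline. The input lines are made up.
input=$'a b\nc d\ne f'

piped=0
printf '%s\n' "$input" | while IFS= read -r line; do
    piped=$((piped + 1))        # changes a copy inside the pipeline's subshell
done

substituted=0
while IFS= read -r line; do
    substituted=$((substituted + 1))
done < <(printf '%s\n' "$input")   # process substitution: loop runs in the current shell

echo "after pipe: $piped, after process substitution: $substituted"
```

For a loop that only calls rm, as in this thread, the difference does not matter; it starts to matter once the script needs, say, a count of deleted files after the loop.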
 
Old 06-28-2010, 05:40 AM   #14
jorgemarmo
LQ Newbie
 
Registered: Sep 2009
Posts: 17

Original Poster
Rep: Reputation: 0
OK, I'm going to mark this thread as solved.
Many of the proposed solutions seem to work in much the same way. After trying several of them, I finally used this one,
first with echo, just to be sure:
Code:
grep '/media/My Book/HDD_Toshiba' logfile.log | while read file; do echo "$file"; done
then with rm
Code:
grep '/media/My Book/HDD_Toshiba' logfile.log | while read file; do rm "$file"; done
Those never broke on spaces, and there were many of them in the paths of the files to be deleted (a larger example of the log file follows).
About what David the H. said:
Quote:
I realized a few hours after posting that my suggestion of "rm $(<file.txt)" would probably not be suitable here. It would work ok if you had a short list, but the shell has a limit in the number of characters that can be handled in a single command line
With the lines I posted above there was no problem, even though I used them on a huge file (for reference: the original log file is about 10 MB, and a second log file I created with just the entries to be deleted was about 2 MB).
Here is the log file example:
Code:
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/TESIS/TESIS CVT LABVIEW Y CODEWARRIOR/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Borrable/Pen_Drive_4GB/Tesis/Super CD de la tesis/LabView/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/!TESIS/!Tesis - lo entregado/Super CD de la tesis/LabView/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/Otros/LabVIEW/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/Otros/TESIS desorden/TESIS CVT LABVIEW Y CODEWARRIOR/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/Otros/TESIS desorden/pruebas CVT/Tesis/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/TESIS/Super CD de la tesis/LabView/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/TESIS/TESIS CVT LABVIEW Y CODEWARRIOR/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/Tesis/Super CD de la tesis/LabView/LabVIEW85RuntimeEngineFull.exe
/media/My Book/INCOMING SHACK/CAD-Math-Pics/Lab.VIEW.N.I_8.5/LabVIEW85RuntimeEngineFull.exe
/media/My Book/!!!MIS DOCUMENTOS/Mis imágenes/!FOTOS/2008-03-17 La Trilla/photoshop/Nena/IMG_0394.psd
/media/My Book/!!!MIS DOCUMENTOS/Mis imágenes/!FOTOS/Otros photoshop/Nena/IMG_0394.psd
/media/My Book/!!!MIS DOCUMENTOS/Temporales/Carpeta de Fotos/photoshop/Nena/IMG_0394.psd
/media/My Book/HDD_Toshiba/Borrable/Pen_Drive_4GB/Documents/Pictures/Carpeta de Fotos/photoshop/Nena/IMG_0394.psd
/media/My Book/HDD_Toshiba/Fotos/Carpeta de Fotos/photoshop/Nena/IMG_0394.psd
/media/My Book/HDD_Toshiba/Fotos/Fotos Wandita post graduacion/2008-03-17 La Trilla/photoshop/Nena/IMG_0394.psd
/media/My Book/HDD_Toshiba/Fotos/Nena/IMG_0394.psd
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/!Tesis - lo entregado/2008-10-18 Libro Daniel Jorge/2008-10-18 Libro Daniel Jorge.pdf
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/!Tesis - lo entregado/2008-10-18 Libro Daniel Jorge/CD/CVT Banco de Pruebas y Caracterización.pdf
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/!Tesis - lo entregado/Super CD de la tesis/Libro de Tesis.pdf
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/TESIS/2008-10-18 Libro Daniel Jorge/2008-10-18 Libro Daniel Jorge.pdf
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/TESIS/2008-10-18 Libro Daniel Jorge/CD/CVT Banco de Pruebas y Caracterización.pdf
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/TESIS/Super CD de la tesis/Libro de Tesis.pdf
/media/My Book/HDD_Toshiba/Borrable/Pen_Drive_4GB/Tesis/2008-10-18 Libro Daniel Jorge/2008-10-18 Libro Daniel Jorge.pdf
/media/My Book/HDD_Toshiba/Borrable/Pen_Drive_4GB/Tesis/2008-10-18 Libro Daniel Jorge/CD/CVT Banco de Pruebas y Caracterización.pdf
/media/My Book/HDD_Toshiba/Borrable/Pen_Drive_4GB/Tesis/Super CD de la tesis/Libro de Tesis.pdf
/media/My Book/HDD_Toshiba/Documentos/!TESIS/!Tesis - lo entregado/2008-10-18 Libro Daniel Jorge/2008-10-18 Libro Daniel Jorge.pdf
/media/My Book/HDD_Toshiba/Documentos/!TESIS/!Tesis - lo entregado/2008-10-18 Libro Daniel Jorge/CD/CVT Banco de Pruebas y Caracterización.pdf
/media/My Book/HDD_Toshiba/Documentos/!TESIS/!Tesis - lo entregado/Super CD de la tesis/Libro de Tesis.pdf
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/TESIS/2008-10-18 Libro Daniel Jorge/2008-10-18 Libro Daniel Jorge.pdf
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/TESIS/2008-10-18 Libro Daniel Jorge/CD/CVT Banco de Pruebas y Caracterización.pdf
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/TESIS/Super CD de la tesis/Libro de Tesis.pdf
/media/My Book/HDD_Toshiba/Documentos/Tesis/2008-10-18 Libro Daniel Jorge/2008-10-18 Libro Daniel Jorge.pdf
/media/My Book/HDD_Toshiba/Documentos/Tesis/2008-10-18 Libro Daniel Jorge/CD/CVT Banco de Pruebas y Caracterización.pdf
/media/My Book/HDD_Toshiba/Documentos/Tesis/Super CD de la tesis/Libro de Tesis.pdf
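
On a log in this format, anchoring the pattern with `^` makes sure that only paths that start with the HDD_Toshiba prefix are selected. The sketch below assumes bash; the second path in the demo log is an invented case where the same text appears in the middle of a path, and it is not taken from the log above.

```shell
#!/usr/bin/env bash
# Sketch: why anchoring the pattern with ^ matters.
# The second path is a made-up case where the text appears mid-path.
log=$(mktemp)
cat > "$log" <<'EOF'
/media/My Book/HDD_Toshiba/Fotos/Nena/IMG_0394.psd
/media/My Book/!!!MIS DOCUMENTOS/notes about /media/My Book/HDD_Toshiba/readme.txt
/media/My Book/INCOMING SHACK/CAD-Math-Pics/Lab.VIEW.N.I_8.5/LabVIEW85RuntimeEngineFull.exe
EOF

unanchored=$(grep -c '/media/My Book/HDD_Toshiba' "$log")
anchored=$(grep -c '^/media/My Book/HDD_Toshiba/' "$log")
echo "unanchored: $unanchored matches, anchored: $anchored matches"
rm -f -- "$log"
```

Counting with `grep -c` before deleting is also a cheap sanity check on how many files a run would touch.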
 
Old 06-28-2010, 05:40 AM   #15
jorgemarmo
LQ Newbie
 
Registered: Sep 2009
Posts: 17

Original Poster
Rep: Reputation: 0
Thank you all!
 
  


Tags
batch, delete, parallel, script, xargs


