[SOLVED] I need a shell script to delete files whose locations are read from a log file
Hi you all....
I have two external HDDs on which I keep all my files. Yesterday I copied all the files from hdd2 to hdd1, and I want to eliminate the duplicates, so I used FSLint to find them. Now I have a txt file that looks like this:
Code:
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/TESIS/TESIS CVT LABVIEW Y CODEWARRIOR/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Borrable/Pen_Drive_4GB/Tesis/Super CD de la tesis/LabView/LabVIEW85RuntimeEngineFull.exe
multiplied by millions of entries...
Now I want to make a shell script that deletes all the files whose entries (read from the log file) begin with:
Code:
/media/My Book/HDD_Toshiba/****
since HDD_Toshiba is the folder on hdd1 (My Book) that contains all the files from hdd2.
I will appreciate any ideas, suggestions and comments.
Last edited by jorgemarmo; 06-26-2010 at 08:04 PM.
But I didn't explain myself correctly... I need to delete the files indicated by those entries in the log file that begin with "/media/My Book/HDD_Toshiba/".
BTW, I use FSLint with the GUI, but this log file is huge, and when I try to do anything with FSLint it crashes.
Last edited by jorgemarmo; 06-26-2010 at 09:43 PM.
I would just like to ask for clarification: do you want to delete the files from "My Book" or from the log file?
Because if it were from the path mentioned, surely you would just do:
Code:
rm -rf /media/My Book/HDD_Toshiba
If it is from the log file, I would then suggest sed:
Code:
sed -i.bak '/HDD_Toshiba/d' log_file
Hi,
if I understand the OP correctly then it is not that simple. The OP has several duplicates in this directory, but not all files are duplicates. The paths to the duplicates are stored in the input file. He wants to delete the duplicates in the /media/My Book/HDD_Toshiba directory. So just doing
Code:
rm -rf /media/My Book/HDD_Toshiba
would probably result in data loss. Well, actually it won't, since the rm command will break if the path is not quoted or the space in it is escaped.
I had an idea of doing this with a one-liner, but the spaces always broke the rm command. I am not sure if I can correct that behaviour by setting a certain shell option.
Anyway, here is the solution I suggest:
First, copy all duplicate entries to a new file
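Something along these lines should do it (a sketch: logfile.log and dupes.txt are just assumed names, and the grep pattern is anchored so only paths starting with the Toshiba prefix match):
Code:
```shell
# Step 1: copy the duplicate entries under the Toshiba copy into their own file.
grep '^/media/My Book/HDD_Toshiba/' logfile.log > dupes.txt

# Step 2: delete each listed file. Quoting "$file" keeps embedded spaces
# intact; IFS= and -r stop read from trimming blanks or eating backslashes.
while IFS= read -r file; do
    rm -f -- "$file"
done < dupes.txt
```
Splitting it into two steps also lets you inspect dupes.txt before anything is deleted.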
grail, thanks for your reply, but I don't want to delete all of "/media/My Book/HDD_Toshiba", just the files indicated in the log file: a file being in this log means it also exists somewhere else inside "/media/My Book/", so I want to delete it from "/media/My Book/HDD_Toshiba" and keep the copy in the other location, whatever it is.
crts, thanks! Now I have a couple of doubts...
I'm making some tests using echo instead of rm, and this seems to work just fine:
Code:
grep '/media/My Book/HDD_Toshiba' logfile.log | while read file; do echo "$file"; done
and the spaces don't seem to break the echo, so I wonder:
1) Should I expect it to break if I replace the echo with rm?
2) What's the difference between "while read file" with "$file" and "while read line" with "$line"? The answer sounds a bit obvious, but since "file" seems to work fine, I wonder what the difference is.
Quote:
I'm making some tests using echo instead of rm ...
That is a most wise thing to do.
Quote:
1) Should I expect it to break if I replace the echo with rm?
I ran the command on some sample data and it did not break. I hope you still have a backup of the data on the Toshiba hdd. Since you have such a huge amount of data, I guess you cannot check everything, so it is a good thing to have some backup - just in case. It is also wise practice to test the command on some dummy data first.
As for your approach of piping the grep into the while loop, I tried it and it also seemed to work fine. However, I had some other ideas with pipes, and in those cases the command broke; it printed fine with echo, even with quotes, but the shell did some funny stuff like interpreting the quote as part of the filename and breaking at the first whitespace. However, in those cases I did not pipe into a while loop, so your approach might work just as well.
As for your second point, there is no difference.
Code:
while read file; do echo "$file"; done
'file' in this case is just a variable which stores the output of the read command. You can call it whatever you like. Since in such cases as yours it is used to store a complete line of a file most people call it 'line'.
I hope this clears things up a bit.
There's one problem when operating on filenames in a script: spaces. If your filenames have any spaces in them, the shell will see them as word separators and treat each part as a separate file name.
The easiest way to work around this is to change the shell's default internal field separator (IFS) to newline only, making it ignore spaces. Unfortunately, looking at the above file structure, you seem to also have multiple filenames per line, so that wouldn't work quite right either.
If you can, you should try to export the names you want to a new file first, re-formatting it so that there's exactly one filename per line and nothing else. Then it becomes almost trivial to process them. Even something as simple as this should work with a file like that:
Code:
IFS=$'\n' #sets the shell separator to newline
rm -f $(<deletefile.txt)
unset IFS #returns the separator to normal (space/tab/newline)
That did not work. The command breaks because of the whitespace. According to the man page xargs splits the arguments just like the shell does.
Changing the IFS did work though, as suggested by David the H.
#!/bin/zsh
echo "Create a space-separated list from a top-down list with preceding chars."
cd ~/build
echo "remove old files"
rm -f ~/build/file2.txt
rm -f ~/build/file3.txt
echo "Remove - (dash space) from lead of lines using sed...."
echo "sed 's/^[-\ ]*//' file1.txt (from file1.txt in pwd dir)"
echo "redirect output to new file...file2.txt"
sed 's/^[-\ ]*//' file1.txt > file2.txt
echo "Now replace newlines with a space... using file2.txt, create file3.txt"
sed ':a;N;$!ba;s/\n/ /g' file2.txt > file3.txt
Quote:
That did not work. The command breaks because of the whitespace. According to the man page xargs splits the arguments just like the shell does.
Changing the IFS did work though, as suggested by David the H.
It may not be just because of the whitespace, but also because grep, like so many other unix tools, works on a per-line basis. Since your file isn't formatted that way (or at least doesn't seem to be, it isn't completely clear based on the small sample you gave), the expression would have to be modified to handle it.
By the way, I realized a few hours after posting that my suggestion of "rm $(<file.txt)" would probably not be suitable here. It would work ok if you had a short list, but the shell has a limit in the number of characters that can be handled in a single command line, so a long one like yours would probably force it to error out. It's better to use a loop in this case. xargs should also work (once you take care of the word separation issue), as I believe it will process the input in suitably-sized chunks if necessary.
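For what it's worth, GNU xargs can be told to split on newlines only (or on NULs), which sidesteps the shell-style word splitting the man page describes. This is a sketch that assumes GNU findutils, since -d and -r are GNU extensions:
Code:
```shell
# Split on newlines only, so a path with spaces stays one argument;
# -r skips running rm entirely if grep finds nothing.
grep '^/media/My Book/HDD_Toshiba/' logfile.log | xargs -r -d '\n' rm -f --

# Or convert newlines to NULs and use -0, the classic whitespace-safe pairing.
grep '^/media/My Book/HDD_Toshiba/' logfile.log | tr '\n' '\0' | xargs -0 -r rm -f --
```
Either way xargs handles the chunking itself, so the command-line length limit is no longer a concern.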
Personally I have found that using a while loop with process substitution seems to work around the spacing issues.
So I would have done the following (which is along the same idea as crts):
Code:
while read file
do
rm -f "$file"
done < <(grep '^/media/My Book/HDD_Toshiba/' logfile.log)
Compared to piping the grep into the while loop, should you expand this script at a later date, any variables set in the loop for later use will retain their values, whereas piping creates its own subshell.
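The subshell point is easy to check in bash (using process substitution, as in the loop above):
Code:
```shell
# A variable set inside a piped-to while loop is lost, because the
# pipe runs the loop body in a subshell:
count=0
printf 'a\nb\n' | while read -r line; do count=$((count + 1)); done
echo "$count"    # prints 0

# With process substitution the loop runs in the current shell, so
# the value survives:
count=0
while read -r line; do count=$((count + 1)); done < <(printf 'a\nb\n')
echo "$count"    # prints 2
```
Note that process substitution is a bash/zsh feature; it is not available in plain POSIX sh.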
OK, I'm going to put solved on this thread...
Many of the proposed solutions seem to work in the very same way. After trying several of them, I finally used this one:
first with echo, just to be sure
Code:
grep '/media/My Book/HDD_Toshiba' logfile.log | while read file; do echo "$file"; done
then with rm
Code:
grep '/media/My Book/HDD_Toshiba' logfile.log | while read file; do rm "$file"; done
Those never broke with spaces, and there were many of them in the paths to the files to be deleted (I'm going to put a larger example of the log file).
And about what David the H. said:
Quote:
I realized a few hours after posting that my suggestion of "rm $(<file.txt)" would probably not be suitable here. It would work ok if you had a short list, but the shell has a limit in the number of characters that can be handled in a single command line
Using the lines I put above, there was no problem even though I used it on a huge file (just as a reference: the original logfile is about 10 MB; a second logfile I created with just the entries to be deleted was about 2 MB).
and here is the log file example
Code:
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/TESIS/TESIS CVT LABVIEW Y CODEWARRIOR/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Borrable/Pen_Drive_4GB/Tesis/Super CD de la tesis/LabView/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/!TESIS/!Tesis - lo entregado/Super CD de la tesis/LabView/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/Otros/LabVIEW/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/Otros/TESIS desorden/TESIS CVT LABVIEW Y CODEWARRIOR/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/Otros/TESIS desorden/pruebas CVT/Tesis/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/TESIS/Super CD de la tesis/LabView/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/TESIS/TESIS CVT LABVIEW Y CODEWARRIOR/LabVIEW85RuntimeEngineFull.exe
/media/My Book/HDD_Toshiba/Documentos/Tesis/Super CD de la tesis/LabView/LabVIEW85RuntimeEngineFull.exe
/media/My Book/INCOMING SHACK/CAD-Math-Pics/Lab.VIEW.N.I_8.5/LabVIEW85RuntimeEngineFull.exe
/media/My Book/!!!MIS DOCUMENTOS/Mis imágenes/!FOTOS/2008-03-17 La Trilla/photoshop/Nena/IMG_0394.psd
/media/My Book/!!!MIS DOCUMENTOS/Mis imágenes/!FOTOS/Otros photoshop/Nena/IMG_0394.psd
/media/My Book/!!!MIS DOCUMENTOS/Temporales/Carpeta de Fotos/photoshop/Nena/IMG_0394.psd
/media/My Book/HDD_Toshiba/Borrable/Pen_Drive_4GB/Documents/Pictures/Carpeta de Fotos/photoshop/Nena/IMG_0394.psd
/media/My Book/HDD_Toshiba/Fotos/Carpeta de Fotos/photoshop/Nena/IMG_0394.psd
/media/My Book/HDD_Toshiba/Fotos/Fotos Wandita post graduacion/2008-03-17 La Trilla/photoshop/Nena/IMG_0394.psd
/media/My Book/HDD_Toshiba/Fotos/Nena/IMG_0394.psd
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/!Tesis - lo entregado/2008-10-18 Libro Daniel Jorge/2008-10-18 Libro Daniel Jorge.pdf
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/!Tesis - lo entregado/2008-10-18 Libro Daniel Jorge/CD/CVT Banco de Pruebas y Caracterización.pdf
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/!Tesis - lo entregado/Super CD de la tesis/Libro de Tesis.pdf
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/TESIS/2008-10-18 Libro Daniel Jorge/2008-10-18 Libro Daniel Jorge.pdf
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/TESIS/2008-10-18 Libro Daniel Jorge/CD/CVT Banco de Pruebas y Caracterización.pdf
/media/My Book/!!!MIS DOCUMENTOS/Documentos/2 sep2003-jun2009 USB/!TESIS/TESIS/Super CD de la tesis/Libro de Tesis.pdf
/media/My Book/HDD_Toshiba/Borrable/Pen_Drive_4GB/Tesis/2008-10-18 Libro Daniel Jorge/2008-10-18 Libro Daniel Jorge.pdf
/media/My Book/HDD_Toshiba/Borrable/Pen_Drive_4GB/Tesis/2008-10-18 Libro Daniel Jorge/CD/CVT Banco de Pruebas y Caracterización.pdf
/media/My Book/HDD_Toshiba/Borrable/Pen_Drive_4GB/Tesis/Super CD de la tesis/Libro de Tesis.pdf
/media/My Book/HDD_Toshiba/Documentos/!TESIS/!Tesis - lo entregado/2008-10-18 Libro Daniel Jorge/2008-10-18 Libro Daniel Jorge.pdf
/media/My Book/HDD_Toshiba/Documentos/!TESIS/!Tesis - lo entregado/2008-10-18 Libro Daniel Jorge/CD/CVT Banco de Pruebas y Caracterización.pdf
/media/My Book/HDD_Toshiba/Documentos/!TESIS/!Tesis - lo entregado/Super CD de la tesis/Libro de Tesis.pdf
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/TESIS/2008-10-18 Libro Daniel Jorge/2008-10-18 Libro Daniel Jorge.pdf
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/TESIS/2008-10-18 Libro Daniel Jorge/CD/CVT Banco de Pruebas y Caracterización.pdf
/media/My Book/HDD_Toshiba/Documentos/!TESIS/Desorden/TESIS/Super CD de la tesis/Libro de Tesis.pdf
/media/My Book/HDD_Toshiba/Documentos/Tesis/2008-10-18 Libro Daniel Jorge/2008-10-18 Libro Daniel Jorge.pdf
/media/My Book/HDD_Toshiba/Documentos/Tesis/2008-10-18 Libro Daniel Jorge/CD/CVT Banco de Pruebas y Caracterización.pdf
/media/My Book/HDD_Toshiba/Documentos/Tesis/Super CD de la tesis/Libro de Tesis.pdf