Copy files based on filename
Hi,
I'm new to these forums, so I want to apologize first of all for my English, and secondly if I am posting this question in the wrong forum. I am fairly new to Linux and am looking for ways to make my life more convenient. The problem I am currently facing is the following: I want to automatically sort files based on filename. The format of these names is <identifier>_<date>.xlsx. What I want to achieve is a script that copies the most recent file for each identifier to a specific folder. So for example I have the following files:
12345_20120901.xlsx
12345_20120902.xlsx
54321_20120901.xlsx
54321_20120902.xlsx
What I want is to have the files
12345_20120902.xlsx
54321_20120902.xlsx
placed in a different folder, say /mnt/khandi/stuff/. What is the best way to achieve this with a bash script? And again, I am very sorry if I posted this in the wrong subforum and/or if it is too much to ask. Thanks for the help, it is really appreciated in advance. Khandi |
Welcome to LQ!
If English is not your native language, I accept your apologies. We can't expect everyone to be good at English, but do try to write correct English. :) Accidentally posting in the wrong forum isn't a huge issue; a moderator will move your thread. (I think this is the right forum.) I don't know how to have bash find the most recent date, but if you want to do this for today's date, you could do: Code:
mv *$(date +%Y%m%d).xlsx /mnt/khandi/stuff The shell first runs the command inside $( ): Code:
date +%Y%m%d This prints today's date as YYYYMMDD, so the command becomes: Code:
mv *20120926.xlsx /mnt/khandi/stuff The wildcard then expands to the matching filenames: Code:
mv 12345_20120926.xlsx 54321_20120926.xlsx /mnt/khandi/stuff Use cp instead of mv if you want to copy the files. |
Menno,
Thank you very much for the help, I really appreciate it. I will look at your snippets tomorrow to see if they work for what I have in mind. They do seem able to get me where I want to be, in a way. Although, as I see it, because a wildcard is used for the identifier, it does not work on a specific identifier, right? For example, I would very much want to move the most recent file for a certain identifier based on the date, and if possible even the time; 12345_201209071345_yadada.xlsx is a valid filename as well. (Sorry if I misrepresented the problem by limiting it to the date only and not mentioning the time.) In that line of thought, I was wondering: since the date in the filename is built up as <year><month><day><time>, the most recent file should always have the highest number (if I ignore the first 6 characters of the filename, a.k.a. <identifier>_). Do you perhaps know a way in which I would be able to pick the file with the highest numerical value separately for each identifier? So identifier A has several files and I want to move/copy the most recent one, and I want the same to happen for identifier B. I don't want to bother you too much since you have already been of great help, but nevertheless I wanted to ask you this in the hope that you know of a way for me to get closer to a solution to my problem. Thanks for the help! |
I'm not a big expert on shell scripting, but this algorithm might work:
Code:
1: read filename You might want to read the Advanced Bash Scripting Guide at tldp.org and the various tutorials at the UNIX grymoire. |
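The idea raised above, that the newest file of an identifier is simply the one with the highest number in its name, can be sketched roughly as follows. This is only a sketch under assumptions: the function name latest_by_name is made up for illustration, names are assumed to look like <identifier>_<timestamp>[...].xlsx with a fixed-width timestamp per identifier, and filenames are assumed not to contain newlines.

```shell
#!/bin/sh
# Hypothetical helper: print, for every identifier, the file whose name
# carries the highest timestamp. Only the filename is inspected.
latest_by_name() (
    dir=${1:-.}
    for file in "$dir"/*_*.xlsx; do
        [ -e "$file" ] || continue          # the glob matched nothing
        printf '%s\n' "${file##*/}"         # strip the directory part
    done |
        LC_ALL=C sort |                     # text order = date order (fixed width)
        awk -F_ '
            { latest[$1] = $0 }             # later lines overwrite earlier ones
            END { for (id in latest) print latest[id] }
        ' |
        LC_ALL=C sort                       # awk prints in arbitrary order
)

# Possible use (copies instead of moving):
# latest_by_name . | while IFS= read -r f; do cp -- "$f" /mnt/khandi/stuff/; done
```

Because the files of one identifier share the same prefix, a plain text sort puts the highest timestamp last, and awk keeps that last line per identifier.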
Menno,
Thank you for the fresh perspective on the matter. It does seem I have quite some work ahead of me to figure this one out, but the pointers you gave me are really helpful. Tomorrow I have time to delve into the matter again and will most certainly give your train of thought a chance, to see if I can capture your steps in some sane scripting :D. Thanks for the links to those guides; they will most likely prove to be very useful. Whenever I have found a solution to my problem I will most certainly share it here for all to see, so we can all share our knowledge a bit :) Thanks, again, for your help! |
I think I've found something:
Code:
mv $(ls -t | grep '\.xlsx' | head -n 1) /mnt/khandi/stuff Code:
ls -t lists the files sorted by modification time, newest first. The output of ls goes to grep, which only prints the lines that match a given pattern (.xlsx in this case), and then to head, which only prints the number of lines specified after -n. The stuff between $( and ) thus gives the name of the most recently modified .xlsx file or (for a greater -n value) files, which mv then moves to /mnt/khandi/stuff. This also makes it redundant to put the date in the filename. I've tested this in bash and sh, but I'm not sure if it works in csh, zsh, ksh or tcsh. UPDATE: it also works in zsh. UPDATE: typo corrected, "ls -l" should have been "ls -t". |
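One caveat with the pipeline above is that the unquoted $( ) breaks on filenames containing spaces. A possible alternative that avoids parsing ls output is sketched below. The function name newest_xlsx is hypothetical, and the -nt ("newer than") test is an extension that bash, dash and zsh support but that is not guaranteed in every sh.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified .xlsx file in a
# directory, comparing modification times with the shell's -nt test.
newest_xlsx() (
    dir=${1:-.}
    newest=
    for file in "$dir"/*.xlsx; do
        [ -f "$file" ] || continue          # skip if the glob matched nothing
        if [ -z "$newest" ] || [ "$file" -nt "$newest" ]; then
            newest=$file
        fi
    done
    # Print nothing (and return non-zero) when no .xlsx file exists.
    [ -n "$newest" ] && printf '%s\n' "$newest"
)

# Possible use, quoted so that spaces in the name survive:
# f=$(newest_xlsx .) && mv -- "$f" /mnt/khandi/stuff/
```

Unlike grep '\.xlsx', the glob only matches names that actually end in .xlsx.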
That is so awesome!! I will try this first thing tomorrow morning and let you know if it works out! Thanks a lot! Really helpful! Great! :D
|
Quote:
This might differ between different versions of bash, at least on my Debian Lenny and OpenSuse 11 you need the 't'. |
Quote:
The -l option would output all sorts of other information that I would have to sed out, while grep already puts the individual files on their own lines. I use Code:
GNU bash, version 4.2.37(2)-release (x86_64-unknown-linux-gnu) |
There is one risk in using 'ls -t' for this problem. If the file has been 'touched' in some way or is not created on the date that is in the filename, the file's date does not reflect the datepart of the filename.
Code:
wim@i3-2120:~$ touch abc_19700101.xlsx |
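The risk described above can be reproduced with a small throwaway experiment. The sketch below is an illustration only: the function name demo_touch_pitfall and the second filename are made up, and everything happens in a temporary directory that is removed afterwards.

```shell
#!/bin/sh
# Hypothetical demo: a file with an old date in its name but a fresh
# modification time wins with 'ls -t', while a name-based sort is unaffected.
demo_touch_pitfall() (
    tmp=$(mktemp -d) || exit 1
    cd "$tmp" || exit 1
    touch -t 201209020000 abc_20120902.xlsx   # newest according to its name
    touch abc_19700101.xlsx                   # old name, but touched just now
    echo "by mtime: $(ls -t | head -n 1)"
    echo "by name:  $(ls | LC_ALL=C sort | tail -n 1)"
    cd / && rm -rf "$tmp"
)

# demo_touch_pitfall
# by mtime: abc_19700101.xlsx
# by name:  abc_20120902.xlsx
```

So whether 'ls -t' is safe depends on whether the files are ever touched or copied without preserving their timestamps.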
Ah, yes - good point!
Reading OP's first post again, I'm not quite sure about the needs. So, mr OP Khandi: what exactly is it you want to do?
* Put all files with same date in the filename in specific folder
* Put files created / changed on same date in specific folder
* Put the most recent files in specific folder |
Quote:
Which made me realize this whole discussion is pointless: he needs a version control system (I can recommend Mercurial. (see this slashdot thread or the Wikipedia article.)) and then he just executes Code:
mv * /mnt/khandi/stuff |
Again, thanks everyone for the kind advice. I will look into Mercurial to see if it has what I need.
Pingu: So, mr OP Khandi: what exactly is it you want to do?
* Put all files with same date in the filename in specific folder
* Put files created / changed on same date in specific folder
* Put the most recent files in specific folder
What I want is to move the most recent file per identifier to a different folder.
12345_20120901.xlsx
12345_20120902.xlsx
54321_20120901.xlsx
54321_20120902.xlsx
12345 and 54321 are the identifiers. I cannot change the filenames, but the files are not modified in that location, so the modification date is something I can work with. So I want the most recent file from 12345 AND the most recent one from 54321 moved to a single location. We are talking about roughly 800 different identifiers. That's why I want to be able to grab the most recent files in the most efficient way possible. The best thing would be to grab only the files modified between now and 48 hours in the past, to make sure everything stays tidy. Thanks for all the help. Sorry I responded this late! |
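The requirement as restated here (one file per identifier, only files modified in the last 48 hours, all into a single folder) could be approached along these lines. This is a hedged sketch, not a tested production script: the function name copy_recent_latest is invented, -maxdepth is a GNU/BSD find extension, and within one identifier the newest file is chosen by the timestamp in its name, which assumes a fixed-width timestamp. It copies with cp; swap in mv if the files should be moved.

```shell
#!/bin/sh
# Hypothetical helper: copy_recent_latest SRC DEST
# Copies, per identifier, the highest-dated file among those modified
# less than 48 hours ago.
copy_recent_latest() (
    src=$1
    dest=$2
    mkdir -p -- "$dest" || exit 1
    # -mtime -2 keeps files modified less than 2 * 24 hours ago.
    find "$src" -maxdepth 1 -type f -name '*_*.xlsx' -mtime -2 |
        sed 's|.*/||' |                     # keep only the filename
        LC_ALL=C sort |
        awk -F_ '{ latest[$1] = $0 } END { for (id in latest) print latest[id] }' |
        while IFS= read -r name; do
            cp -p -- "$src/$name" "$dest/"  # -p preserves the timestamps
        done
)

# Possible use:
# copy_recent_latest /path/to/incoming /mnt/khandi/stuff
```

A single find call plus one awk pass handles all identifiers at once, which should stay quick even with roughly 800 of them.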
Something like this might work:
Code:
ls -1 | cut -b -5 | sort | uniq | while read line |
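The loop above is cut off in this thread, so its original body is not known. One way such a loop might be completed is sketched below. Assumptions: the function name move_newest_per_id is hypothetical, identifiers are exactly five characters followed by an underscore, "most recent" means newest modification time (ls -t), the destination is given as an absolute path, and filenames contain no newlines. The grep filter for .xlsx is an addition that was not in the original pipeline.

```shell
#!/bin/sh
# Hypothetical helper: move_newest_per_id SRC DEST
# For each 5-character identifier, move its most recently modified file.
move_newest_per_id() (
    src=$1
    dest=$2                                 # use an absolute path (we cd below)
    cd "$src" || exit 1
    ls -1 | grep '\.xlsx$' | cut -b -5 | sort | uniq |
        while IFS= read -r id; do
            # ls -t lists this identifier's files newest first.
            newest=$(ls -t -- "$id"_*.xlsx | head -n 1)
            mv -- "$newest" "$dest"/
        done
)

# Possible use:
# move_newest_per_id /path/to/incoming /mnt/khandi/stuff
```

This runs ls once per identifier, which is simple but slower than a single pass when there are many identifiers.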
I think what pingu was getting at was "How do you define 'most recent'?" Are we going by the modification date of the file (e.g. what you get by running 'ls -l filename') or the date in the name of the file?
|