LinuxQuestions.org
Old 09-26-2012, 01:12 PM   #1
Khandi
LQ Newbie
 
Registered: Sep 2012
Posts: 6

Rep: Reputation: Disabled
Copy files based on filename


Hi,

I'm new to these forums, so I want to apologize first for my English and second in case I am posting this question in the wrong forum. I am fairly new to Linux and am looking for ways to make my life more convenient.

The problem I am currently facing is the following:

I want to automatically sort files based on filename. The format of these names is <identifier>_<date>.xlsx

What I want to achieve is a script that copies the most recent file for each identifier to a specific folder. For example, I have the following files:

12345_20120901.xlsx
12345_20120902.xlsx
54321_20120901.xlsx
54321_20120902.xlsx

What I want is for the files
12345_20120902.xlsx
54321_20120902.xlsx

to be placed in a different folder, say /mnt/khandi/stuff/

What is the best way to achieve this with a bash script?

And again, I am very sorry if I posted this in the wrong subforum and/or if it is too much to ask.

Thanks in advance for the help; it is really appreciated.

Khandi

Last edited by Khandi; 09-26-2012 at 03:37 PM.
 
Old 09-26-2012, 03:07 PM   #2
mennohellinga
Member
 
Registered: Jun 2012
Location: Netherlands
Distribution: Archlinux
Posts: 72

Rep: Reputation: 38
Welcome to LQ!
If English is not your native language, I accept your apologies. We can't expect everyone to be good at English, but do try to write correct English.
Accidentally posting in the wrong forum isn't a huge issue; a moderator will move your thread. (I think this is the right forum.)

I don't know how to have bash find the most recent date, but if you want to do this for today's date, you could do:
Code:
mv *$(date +%Y%m%d).xlsx /mnt/khandi/stuff
When bash reads this line, it first executes the command between $( and ) and replaces it with its output.

Code:
date +%Y%m%d
tells the computer to output the date in year-month-day format (with a four-digit year, to match your filenames), so the command then becomes

Code:
mv *20120926.xlsx /mnt/khandi/stuff
The asterisk is a wildcard that matches any number of any characters, so bash replaces the pattern with every filename in the current directory that ends in 20120926.xlsx. The command then expands to:

Code:
mv 12345_20120926.xlsx 54321_20120926.xlsx /mnt/khandi/stuff
mv is the command to move files from one place to another, so 12345_20120926.xlsx and 54321_20120926.xlsx are moved into /mnt/khandi/stuff.
Use cp instead of mv if you want to copy the files.
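
A minimal sketch of the same idea as a reusable function, assuming the names end in a four-digit-year date (YYYYMMDD) followed by .xlsx. The function name copy_by_date and its optional second argument are made up for illustration. It copies rather than moves, and it skips the wildcard when nothing matches instead of handing a literal pattern to cp.

```shell
#!/usr/bin/env bash
# Sketch: copy every .xlsx file whose name ends in a given date (default: today).
# copy_by_date is a hypothetical helper name, not something from this thread.
copy_by_date() {
    local dest=$1
    local stamp=${2:-$(date +%Y%m%d)}   # e.g. 20120926
    local f found=1
    for f in *"$stamp".xlsx; do
        [ -e "$f" ] || continue          # an unmatched glob stays literal; skip it
        cp -- "$f" "$dest"/ && found=0
    done
    return "$found"                      # non-zero when nothing was copied
}
```

Run it from the directory that holds the files: copy_by_date /mnt/khandi/stuff uses today's date, and copy_by_date /mnt/khandi/stuff 20120926 uses a specific one.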

Last edited by mennohellinga; 09-26-2012 at 03:42 PM. Reason: typo
 
2 members found this post helpful.
Old 09-26-2012, 03:42 PM   #3
Khandi
LQ Newbie
 
Registered: Sep 2012
Posts: 6

Original Poster
Rep: Reputation: Disabled
Menno,

Thank you very much for the help, I really appreciate it. I will look at your snippets tomorrow to see if it will work for what I have in mind. It does seem to be able to get me where I want to be, in a way.

Although as I see it, because it uses a wildcard for the identifier, it does not seem to work on a specific identifier, right? For example, I would very much like to move the most recent file for a certain identifier based on the date (and if possible even the time; for example, 12345_201209071345_yadada.xlsx is also a valid filename). Sorry if I misrepresented the problem by limiting it to the date only and not mentioning the time as well.

In that line of thought, I was wondering: since the date in the filename is built up as <year><month><day><time>, the most recent file should always have the highest number (if I ignore the first 6 characters of the filename, i.e. <identifier>_). Do you perhaps know a way to pick out the file with the highest numerical value for each identifier? So identifier A has several files and I want to move/copy the most recent one, and I want the same to happen for identifier B.

I don't want to bother you too much since you have already been of great help, but nevertheless I wanted to ask you this in the hope that you know of a way for me to get closer to a solution to my problem.

Thanks for the help!

Last edited by Khandi; 09-26-2012 at 03:52 PM.
 
Old 09-26-2012, 03:51 PM   #4
mennohellinga
Member
 
Registered: Jun 2012
Location: Netherlands
Distribution: Archlinux
Posts: 72

Rep: Reputation: 38
I'm not a big expert on shell scripting, but this algorithm might work:

Code:
1: read filename
2: remove filename extension
3: read last 8 characters (the date, YYYYMMDD) and store them in a variable, let's call it HIGHESTNUMBER
4: execute steps 1 and 2 for the next file, but only update HIGHESTNUMBER if its last 8 characters are greater than HIGHESTNUMBER
5: do step 4 until there are no more files left.
mv *$HIGHESTNUMBER.xlsx /mnt/khandi/stuff
We'll have to find a way to translate steps 1 through 5 to bash (or another language) and then it should work.
You might want to read the Advanced Bash Scripting Guide at tldp.org and the various tutorials at the UNIX grymoire.
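
One possible translation of those steps into bash, as a sketch only. It assumes every name is exactly <identifier>_<date>.xlsx with dates of equal length, and instead of counting characters it takes whatever follows the first underscore. The function name highest_stamp is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of steps 1-5: find the highest date stamp among <identifier>_<date>.xlsx names.
# highest_stamp is a hypothetical helper name.
highest_stamp() {
    local f stamp highest=""
    for f in *_*.xlsx; do           # step 1 and step 5: visit every file once
        [ -e "$f" ] || continue     # nothing matched the pattern
        stamp=${f%.xlsx}            # step 2: remove the extension
        stamp=${stamp#*_}           # step 3: keep what follows the first underscore
        # step 4: string comparison is enough for digit strings of equal length
        if [[ -z $highest || $stamp > $highest ]]; then
            highest=$stamp
        fi
    done
    [ -n "$highest" ] && printf '%s\n' "$highest"
}
```

It could then be used as mv *"$(highest_stamp)".xlsx /mnt/khandi/stuff. Note that this picks one date for the whole directory, so an identifier whose newest file is older than that date would be left behind.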
 
Old 09-26-2012, 03:59 PM   #5
Khandi
LQ Newbie
 
Registered: Sep 2012
Posts: 6

Original Poster
Rep: Reputation: Disabled
Menno,

Thank you for the fresh perspective on the matter. It does seem I have quite some work ahead of me to figure this one out, but the pointers you gave me are really helpful. Tomorrow I have time to delve into the matter again, and I will most certainly give your train of thought a chance and see if I can capture your steps in some sane scripting. Thanks for the links to those guides; they will most likely prove very useful. Whenever I have found a solution to my problem, I will most certainly share it here for all to see, so we can all share our knowledge a bit.

Thanks, again, for your help!
 
Old 09-26-2012, 04:51 PM   #6
mennohellinga
Member
 
Registered: Jun 2012
Location: Netherlands
Distribution: Archlinux
Posts: 72

Rep: Reputation: 38
I think I've found something:
Code:
mv $(ls -t | grep '\.xlsx' | head -n 1) /mnt/khandi/stuff
In UNIX-like operating systems, every file has a timestamp that tells you when it was last modified. The command
Code:
ls -t
prints all files in the directory, most recently modified first. The '|' character makes the output of one command the input of the next.
The output of ls goes to grep, which only prints lines that match a given pattern (.xlsx in this case), and then to head, which only prints the number of lines specified after -n. The stuff between $( and ) thus gives the name of the most recently modified .xlsx file or (for a greater -n value) files, which mv then moves to /mnt/khandi/stuff.
This also makes it redundant to put the date in the filename.

I've tested this in bash and sh, but I'm not sure if it works in csh, zsh, ksh or tcsh.

UPDATE: it also works in zsh
UPDATE: typo corrected, "ls -l" should have been "ls -t"
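
For comparison, a sketch that finds the most recently modified .xlsx file without reading the output of ls, which also copes with spaces in filenames. It relies on the -nt ("newer than") file test inside [[ ]], which bash provides; the function name newest_xlsx is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the most recently modified .xlsx file in the current directory.
# newest_xlsx is a hypothetical helper name.
newest_xlsx() {
    local f newest=""
    for f in *.xlsx; do
        [ -e "$f" ] || continue
        # -nt is true when the left file has a later modification time
        if [[ -z $newest || $f -nt $newest ]]; then
            newest=$f
        fi
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

Used as mv -- "$(newest_xlsx)" /mnt/khandi/stuff, this moves a single file, the same as head -n 1 does.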

Last edited by mennohellinga; 09-29-2012 at 06:35 AM. Reason: typo
 
Old 09-26-2012, 05:31 PM   #7
Khandi
LQ Newbie
 
Registered: Sep 2012
Posts: 6

Original Poster
Rep: Reputation: Disabled
That is so awesome! I will try this first thing tomorrow morning and let you know if it works out. Thanks a lot, this is really helpful!
 
Old 09-29-2012, 03:47 AM   #8
pingu
Senior Member
 
Registered: Jul 2004
Location: Skuttunge SWEDEN
Distribution: Debian preferably
Posts: 1,350

Rep: Reputation: 127
Quote:
Originally Posted by mennohellinga
The command
ls -l
prints all files in the directory, the most recently modified file first.
You need to add a 't' to get the output sorted - "ls -lt".
This might differ between different versions of bash, at least on my Debian Lenny and OpenSuse 11 you need the 't'.

Last edited by pingu; 09-29-2012 at 03:50 AM.
 
1 members found this post helpful.
Old 09-29-2012, 06:30 AM   #9
mennohellinga
Member
 
Registered: Jun 2012
Location: Netherlands
Distribution: Archlinux
Posts: 72

Rep: Reputation: 38
Quote:
Originally Posted by pingu
You need to add a 't' to get the output sorted - "ls -lt".
This might differ between different versions of bash, at least on my Debian Lenny and OpenSuse 11 you need the 't'.
Actually, I need "ls -t". I've corrected the typo.
The -l option would output all sorts of other information that I would have to sed out, while grep already puts the individual files on their own lines.

I use
Code:
GNU bash, version 4.2.37(2)-release (x86_64-unknown-linux-gnu)
grep (GNU grep) 2.14
but I don't think the bash version is an issue because it also works in zsh 5.0.0 (x86_64-unknown-linux-gnu).

Last edited by mennohellinga; 09-29-2012 at 06:33 AM.
 
Old 09-29-2012, 01:25 PM   #10
Wim Sturkenboom
Senior Member
 
Registered: Jan 2005
Location: Roodepoort, South Africa
Distribution: Slackware 10.1/10.2/12, Ubuntu 12.04, Crunchbang Statler
Posts: 3,786

Rep: Reputation: 282
There is one risk in using 'ls -t' for this problem. If the file has been 'touched' in some way or is not created on the date that is in the filename, the file's date does not reflect the datepart of the filename.

Code:
wim@i3-2120:~$ touch abc_19700101.xlsx
wim@i3-2120:~$ ls -l abc_19700101.xlsx 
-rw-rw-r-- 1 wim wim 0 Sep 29 19:17 abc_19700101.xlsx
wim@i3-2120:~$
Where will the file go? And where should it go?
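
To spot files where the two dates disagree, something along these lines might help. It is only a sketch: it assumes GNU date (where -r FILE prints the file's modification time) and names of the form <identifier>_<YYYYMMDD>.xlsx. The function name stamp_matches_mtime is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if the date in a file's name equals its modification date.
# Assumes GNU date, where -r FILE means "use the file's modification time".
# stamp_matches_mtime is a hypothetical helper name.
stamp_matches_mtime() {
    local f=$1 stamp
    stamp=${f##*/}           # drop any leading directory
    stamp=${stamp%.xlsx}     # drop the extension
    stamp=${stamp#*_}        # keep the date part after the first underscore
    [ "$stamp" = "$(date -r "$f" +%Y%m%d)" ]
}
```

For the file above, stamp_matches_mtime abc_19700101.xlsx fails, because the name says 1970 while the modification time is the moment touch ran.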
 
Old 09-29-2012, 01:50 PM   #11
pingu
Senior Member
 
Registered: Jul 2004
Location: Skuttunge SWEDEN
Distribution: Debian preferably
Posts: 1,350

Rep: Reputation: 127
Ah, yes - good point!
Reading OP's first post again, I'm not quite sure about the needs.
So, mr OP Khandi: what exactly is it you want to do?
* Put all files with same date in the filename in specific folder
* Put files created / changed on same date in specific folder
* Put the most recent files in specific folder

Last edited by pingu; 09-29-2012 at 01:50 PM. Reason: Spelling
 
1 members found this post helpful.
Old 09-29-2012, 03:46 PM   #12
mennohellinga
Member
 
Registered: Jun 2012
Location: Netherlands
Distribution: Archlinux
Posts: 72

Rep: Reputation: 38
Quote:
Originally Posted by Wim Sturkenboom
There is one risk in using 'ls -t' for this problem. If the file has been 'touched' in some way or is not created on the date that is in the filename, the file's date does not reflect the datepart of the filename.

Code:
wim@i3-2120:~$ touch abc_19700101.xlsx
wim@i3-2120:~$ ls -l abc_19700101.xlsx 
-rw-rw-r-- 1 wim wim 0 Sep 29 19:17 abc_19700101.xlsx
wim@i3-2120:~$
Where will the file go? And where should it go?
I think OP implements some sort of primitive version control system by clicking 'save as' every time he saves a modification. Also, I don't think most people will use any program other than OpenOffice/LibreOffice Calc on .xlsx files.

Which made me realize this whole discussion is pointless: he needs a version control system (I can recommend Mercurial; see this Slashdot thread or the Wikipedia article) and then he just executes
Code:
mv * /mnt/khandi/stuff
to move the most recent version to the mounted storage device.
 
1 members found this post helpful.
Old 10-02-2012, 09:17 AM   #13
Khandi
LQ Newbie
 
Registered: Sep 2012
Posts: 6

Original Poster
Rep: Reputation: Disabled
Again, thanks everyone for the kind advice. I will look into Mercurial to see if it has what I need.

Pingu:

So, mr OP Khandi: what exactly is it you want to do?
* Put all files with same date in the filename in specific folder
* Put files created / changed on same date in specific folder
* Put the most recent files in specific folder

What I want is to move the most recent file per identifier to a different folder.

12345_20120901.xlsx
12345_20120902.xlsx
54321_20120901.xlsx
54321_20120902.xlsx

12345 and 54321 are the identifiers. I am not able to change the filenames, but the files are not modified in that location, so the modification date is something I can work with. I want the most recent file for 12345 AND the most recent one for 54321 moved to a single location. We are talking about roughly 800 different identifiers, which is why I want to grab the most recent files in the most efficient way possible. Ideally I would grab only the files modified within the last 48 hours, to keep everything tidy.

Thanks for all the help. Sorry I responded this late!
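
For the 48-hour window, a sketch along these lines might be a starting point. It assumes a find that supports -maxdepth and -mmin (GNU and BSD find both do; they are not in POSIX), and the function name recent_xlsx is hypothetical. It only lists candidates; picking the newest per identifier is a separate step.

```shell
#!/usr/bin/env bash
# Sketch: list .xlsx files in a directory that were modified within the last 48 hours.
# Assumes a find with -maxdepth and -mmin (GNU and BSD find have both).
# recent_xlsx is a hypothetical helper name.
recent_xlsx() {
    local dir=${1:-.}
    # 48 hours = 2880 minutes; -mmin -2880 means "modified less than 2880 minutes ago"
    find "$dir" -maxdepth 1 -type f -name '*.xlsx' -mmin -2880
}
```

Calling recent_xlsx /some/source/dir prints one path per line, where /some/source/dir stands for wherever the files live.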

Last edited by Khandi; 10-02-2012 at 01:05 PM. Reason: typo
 
Old 10-02-2012, 09:32 AM   #14
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 5,027

Rep: Reputation: 845
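something like this might work:
Code:
# list each 5-character identifier once, then copy the last (newest-named) file for it
ls -1 | cut -b -5 | sort | uniq | while read -r line
do
 # braces are needed: $line_* would expand a variable named "line_"
 cp "$(ls -1 "${line}"_* | tail -n 1)" /mnt/khandi/stuff/
done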
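
A variant of the same idea that avoids running ls once per identifier, which may matter with roughly 800 identifiers. This is a sketch, assuming bash 4 or later (for associative arrays) and names of the form <identifier>_<date>.xlsx with dates of equal length. The function name copy_latest_per_id is hypothetical, and "newest" here means the highest date in the name, not the modification time.

```shell
#!/usr/bin/env bash
# Sketch: copy the newest file per identifier, judged by the date in the name.
# Assumes bash 4+ (associative arrays) and names like <identifier>_<date>.xlsx.
# copy_latest_per_id is a hypothetical helper name.
copy_latest_per_id() {
    local dest=$1 f id
    local -A latest=()
    for f in *_*.xlsx; do
        [ -e "$f" ] || continue
        id=${f%%_*}                 # text before the first underscore
        # Same identifier means same prefix, so comparing names compares the dates
        if [[ -z ${latest[$id]:-} || $f > ${latest[$id]} ]]; then
            latest[$id]=$f
        fi
    done
    for id in "${!latest[@]}"; do
        cp -- "${latest[$id]}" "$dest"/
    done
}
```

Run from the source directory as copy_latest_per_id /mnt/khandi/stuff; with the four example files it copies 12345_20120902.xlsx and 54321_20120902.xlsx.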

Last edited by schneidz; 10-02-2012 at 09:34 AM.
 
Old 10-03-2012, 06:21 AM   #15
Snark1994
Senior Member
 
Registered: Sep 2010
Location: Wales, UK
Distribution: Arch
Posts: 1,632
Blog Entries: 3

Rep: Reputation: 345
I think what pingu was getting at was "How do you define 'most recent'?" Are we going by the modification date of the file (e.g. what you get by running 'ls -l filename') or the date in the name of the file?
 
  

