LinuxQuestions.org
Old 03-23-2016, 11:04 AM   #1
NoRearView
LQ Newbie
 
Registered: Dec 2015
Posts: 13

Rep: Reputation: Disabled
Scripting - creating .gz files based on filename


I am working on a directory where customer files are uploaded automatically. I would like to create a script that cleans up the files and moves them to a separate directory. After they are cleaned up, I will have a cron job run on the first of the month to clean up files from the previous month. There are currently over 15,000 files in the folder.

The file name is CompanyName_INBOUND_month-day-year_time.zip. An example would be "acme_INBOUND_03-22-2016_1100.zip".

What I would like to do is to create .gz files based on the company, month, and year. So all of the acme files from July of 2014 would be in one .gz file, August of 2014 would be in another, and so on down the line. The same would go for the remaining companies.

Is it possible to create a script to read the file name and extract the company name, month, and year to create a .gz file and add each matching file to its respective .gz file? Unfortunately, all of the files were copied to another NAS late last year, so going by the modified date is not an option.

My fallback option is to just compress all of the files into a single .gz for each company and then schedule a script in cron to create a new file each month. I would still have to be able to somehow read the name of the company from the file name to do that, though.

Thanks!
 
Old 03-23-2016, 11:17 AM   #2
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,573

Rep: Reputation: 2142
gzip can't combine multiple files into one archive; that's tar's job. You could then gzip the tar file, but since the tarball would contain only .zip files, which are already compressed, it'd likely just be a waste of time.

It sounds like a pretty easy job. I would start by generating a list of unique company/month/year combinations, and then loop over them to create the tarballs. What all have you done so far and where are you stuck?
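A rough sketch of that approach in bash, in case it helps as a starting point. The function names list_groups and archive_groups are made up for illustration, the .tar extension is my choice, and it assumes company names contain no spaces and that the destination directory is given as an absolute path (the subshell does a cd):

```shell
#!/usr/bin/env bash
# Sketch only: list_groups and archive_groups are hypothetical helper names.
# Assumes files are named CompanyName_INBOUND_MM-DD-YYYY_HHMM.zip.

# Print one "company month year" line per unique combination found in directory $1.
list_groups() {
    local dir=$1 path name company date
    for path in "$dir"/*_INBOUND_*.zip; do
        [ -e "$path" ] || continue        # the glob matched nothing
        name=${path##*/}                  # strip the directory part
        company=${name%%_INBOUND_*}       # text before _INBOUND_
        date=${name#*_INBOUND_}           # MM-DD-YYYY_HHMM.zip
        date=${date%%_*}                  # MM-DD-YYYY
        printf '%s %s %s\n' "$company" "${date%%-*}" "${date##*-}"
    done | sort -u
}

# Create one uncompressed tarball per combination in $2 (absolute path)
# from the files in $1.
archive_groups() {
    local src=$1 dest=$2 company month year
    while read -r company month year; do
        ( cd "$src" && tar -cf "$dest/${company}_INBOUND_${month}-${year}.tar" \
              "${company}_INBOUND_${month}"-*-"${year}"_*.zip )
    done < <(list_groups "$src")
}
```

Something like archive_groups /path/to/uploads /path/to/DownloadArchive would then build every tarball in one pass. The tarballs are left uncompressed on purpose, since the .zip files inside are already compressed.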
 
Old 03-23-2016, 11:18 AM   #3
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 26,634

Rep: Reputation: 7965
Absolutely possible, as long as the naming convention is consistent. If it follows what you posted, you could simply read the directory, split on the underscore to get the company name, then split on the hyphens to get month/day/year. Processing would go by month/year, so tick up a counter (starting at '01' for the month, and whatever year you want), so that your input would be something like "ls acme*01-*-2014*.zip" to get everything for acme, January, 2014.

Can you post the script you've written so far? Using the logic above, you should be able to look at any of the bash scripting tutorials/examples on how to read directories, and get it to work.
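To make the splitting idea concrete, here is a minimal sketch. parse_name and count_by_month are hypothetical names of my own, and it assumes the company name itself never contains an underscore:

```shell
#!/usr/bin/env bash
# Sketch only: parse_name and count_by_month are hypothetical helpers.

# Split a file name on underscores, then split the date field on hyphens.
parse_name() {
    local name=${1##*/} company tag date rest month day year
    IFS=_ read -r company tag date rest <<< "$name"
    IFS=- read -r month day year <<< "$date"
    printf '%s %s %s %s\n' "$company" "$month" "$day" "$year"
}

# Step through the months for one company and year, globbing for matches.
# $1 = directory, $2 = company, $3 = year.
count_by_month() {
    local dir=$1 company=$2 year=$3 month files
    for month in 01 02 03 04 05 06 07 08 09 10 11 12; do
        files=( "$dir/${company}_INBOUND_${month}"-*-"${year}"_*.zip )
        [ -e "${files[0]}" ] || continue   # no files for this month
        printf '%s-%s: %d file(s)\n' "$month" "$year" "${#files[@]}"
    done
}

parse_name "acme_INBOUND_03-22-2016_1100.zip"   # prints: acme 03 22 2016
```

The array in count_by_month holds the same list that the "ls" pattern above would show, so it could be handed straight to tar instead of being counted.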
 
Old 03-24-2016, 11:53 AM   #4
NoRearView
LQ Newbie
 
Registered: Dec 2015
Posts: 13

Original Poster
Rep: Reputation: Disabled
Thanks!

I was actually looking for a place to start, and what you provided was enough to get me heading in the right direction. Here is the for loop I have written. (Pardon the commented-out echo commands. I was using them for troubleshooting.) The echos will be removed and the tar and rm commands will actually be run in the final product.

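for zipfile in "$filedirectory"/*.zip
do
    # Strip the directory part, leaving e.g. acme_INBOUND_03-22-2016_1100.zip
    filename=${zipfile##*/}
    # Everything before the first underscore is the company name
    company_name=${filename%%_*}
    # echo "$company_name"
    # Drop the company name: INBOUND_03-22-2016_1100.zip
    texthack=${filename#*_}
    # echo "$texthack"
    # Drop the trailing _time.zip, then the leading INBOUND_, leaving 03-22-2016
    date=${texthack%_*}
    date=${date##*_}
    # echo "$date"
    month=${date%%-*}
    year=${date##*-}
    # echo "$company_name" "$month" "$year"
    # echo "$year"
    echo tar -czf "../DownloadArchive/${company_name}_INBOUND_${month}-${year}.gz" "$zipfile"
    echo rm "$zipfile"
done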

One thing I did find that I will change is to put a variable in for the absolute path to the DownloadArchive directory to avoid possible confusion on the tar command. I may have also included a few more steps than need be, but it's my first experience scripting like this since the 90's other than basic Windows batch files.

Any thoughts/suggestions? Thanks again for your input!

Last edited by NoRearView; 03-24-2016 at 11:54 AM.
 
Old 03-24-2016, 12:50 PM   #5
NoRearView
LQ Newbie
 
Registered: Dec 2015
Posts: 13

Original Poster
Rep: Reputation: Disabled
Here is my final tar command:

tar -czf "${archivedirectory}/${company_name}_INBOUND_${month}-${year}.gz" "$filename"

I was using the full file path, which was adding extra directories when the files were extracted.
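For anyone who hits the same thing: passing a bare file name only works if the script is sitting in the upload directory when tar runs. One way around that, as a small sketch, is tar's -C option. The helper name add_flat is made up for illustration, and the archive path is best given as an absolute path:

```shell
#!/usr/bin/env bash
# Sketch only: add_flat is a hypothetical helper.
# $1 = directory holding the file, $2 = bare file name, $3 = archive to create.
add_flat() {
    # -C makes tar change into $1 before reading $2,
    # so only the bare file name is stored in the archive.
    tar -C "$1" -czf "$3" "$2"
}
```

Called as add_flat "$filedirectory" "$filename" "$archive", this stores the member without any leading directories, no matter where the script is run from.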
 
Old 03-24-2016, 04:39 PM   #6
NoRearView
LQ Newbie
 
Registered: Dec 2015
Posts: 13

Original Poster
Rep: Reputation: Disabled
Final update:

I found that the tar command I had was creating a new .gz file every time instead of adding to the existing one. So I added this to the script, which seems to have done the trick:

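# Note: without -z these are plain tar files despite the .gz name;
# tar -r cannot append to a compressed archive.
archive="${archivedirectory}/${company_name}_INBOUND_${month}-${year}.gz"
if [ -e "$archive" ]
then
    # Archive already exists: append to it
    tar -rvf "$archive" "$filename"
else
    # First file for this company/month/year: create the archive
    tar -cvf "$archive" "$filename"
fi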

I feel like there may have been a simpler way to do this, though.
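One possibly simpler variant, offered as a sketch rather than a tested replacement: as far as I know, GNU tar's -r creates the archive when it does not exist yet, so the if/else may not be needed. That behavior is an assumption worth checking on your own system. The function name append_all and the .tar extension are my own choices, and rm is left out until the archives have been verified:

```shell
#!/usr/bin/env bash
# Sketch only: append_all is a hypothetical name.
# $1 = upload directory, $2 = archive directory.
append_all() {
    local src=$1 dest=$2 path name company date month year
    for path in "$src"/*_INBOUND_*.zip; do
        [ -e "$path" ] || continue        # the glob matched nothing
        name=${path##*/}
        company=${name%%_INBOUND_*}
        date=${name#*_INBOUND_}           # MM-DD-YYYY_HHMM.zip
        date=${date%%_*}                  # MM-DD-YYYY
        month=${date%%-*}
        year=${date##*-}
        # -r appends; assumption: GNU tar also creates the archive on first use.
        # -C stores the bare file name without leading directories.
        tar -C "$src" -rf "$dest/${company}_INBOUND_${month}-${year}.tar" "$name" || return 1
    done
}
```

Because the archive stays an uncompressed .tar, the monthly cron run can keep appending to it, and tar -tf on the result gives a quick check before anything is deleted.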
 
  

