Linux - General: This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.
I am working on a directory where customer files are uploaded automatically. I would like to create a script that cleans up the files and moves them to a separate directory. After they are cleaned up, I will have a cron job run on the first of the month to clean up files from the previous month. There are currently over 15,000 files in the folder.
The file name is CompanyName_INBOUND_month-day-year_time.zip. An example would be "acme_INBOUND_03-22-2016_1100.zip".
What I would like to do is to create .gz files based on the company, month, and year. So all of the acme files from July of 2014 would be in one .gz file, August of 2014 would be in another, and so on down the line. The same would go for the remaining companies.
Is it possible to create a script to read the file name and extract the company name, month, and year to create a .gz file and add each matching file to its respective .gz file? Unfortunately, all of the files were copied to another NAS late last year, so going by the modified date is not an option.
My fallback option is to just compress all of the files into a single .gz for each company and then schedule a script in cron to create a new file each month. I would still have to be able to somehow read the name of the company from the file name to do that, though.
gzip can't combine multiple files into one archive; that's tar's job. You could then gzip the tar file, but since the tarball would contain only .zip files, which are already compressed, it would likely just be a waste of time.
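To make the difference concrete, here is a small sketch that uses two throwaway text files in a temporary directory (the file names and the `work` variable are made up for the demonstration). gzip turns each input into its own .gz file, while tar puts both into a single archive.

```bash
#!/usr/bin/env bash
# Hypothetical scratch files, created only for this demonstration.
work=$(mktemp -d)
printf 'one\n' > "$work/a.txt"
printf 'two\n' > "$work/b.txt"

# gzip compresses each file separately: this leaves a.txt.gz and b.txt.gz.
# -k keeps the originals so tar can still read them below.
gzip -k "$work/a.txt" "$work/b.txt"

# tar bundles both files into one archive. Adding -z would also gzip the
# result, which gains little when the members are already .zip files.
tar -cf "$work/bundle.tar" -C "$work" a.txt b.txt

# List the archive members.
tar -tf "$work/bundle.tar"
```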
It sounds like a pretty easy job. I would start by generating a list of unique company/month/year combinations, and then loop over them to create the tarballs. What all have you done so far and where are you stuck?
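A rough sketch of that approach follows. It assumes every name follows CompanyName_INBOUND_month-day-year_time.zip and that company names contain no underscores or spaces. The function names `list_keys` and `archive_by_month` are invented for this example, and it writes plain .tar files because the members are already compressed.

```bash
#!/usr/bin/env bash
# Print one "company month year" line per unique combination in a directory.
list_keys() {
    local dir=$1 f name company rest date
    for f in "$dir"/*_INBOUND_*.zip; do
        [ -e "$f" ] || continue          # the glob matched nothing
        name=${f##*/}                    # strip the directory part
        company=${name%%_*}              # text before the first underscore
        rest=${name#*_INBOUND_}          # e.g. 03-22-2016_1100.zip
        date=${rest%%_*}                 # e.g. 03-22-2016
        printf '%s %s %s\n' "$company" "${date%%-*}" "${date##*-}"
    done | sort -u
}

# Create one tarball per company/month/year combination.
archive_by_month() {
    local src=$1 dest=$2 company month year files
    list_keys "$src" | while read -r company month year; do
        # Collect every file for this company, month and year.
        files=("$src/${company}"_INBOUND_"${month}"-*-"${year}"_*.zip)
        # -C plus the bare names keeps directory paths out of the archive.
        tar -cf "$dest/${company}_INBOUND_${month}-${year}.tar" \
            -C "$src" "${files[@]##*/}"
    done
}
```

Calling `archive_by_month /path/to/uploads /path/to/DownloadArchive` (both paths are placeholders) would leave one tarball per combination in the second directory. The source files are not deleted, so the result can be checked with `tar -tf` first.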
Absolutely possible, if the naming convention is consistent. If it follows what you posted, you could simply read the directory, split on the underscores to get the company name, and then split on the hyphens to get month/day/year. Reading would go by month/year, so tick up a counter (starting at '01' for the month, and whatever year you want), so that your input would be something like "ls acme_INBOUND_01-*-2014_*.zip", to get everything for acme, January, 2014.
Can you post the script you've written so far? Using the logic above, you should be able to look at any of the bash scripting tutorials/examples on how to read directories, and get it to work.
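As an illustration of the splitting step, bash can break a name on underscores and hyphens in one pass with `read` and a custom `IFS`. This is only a sketch: `parse_name` is a made-up helper, and it assumes the company name itself contains no underscore or hyphen.

```bash
#!/usr/bin/env bash
# Split CompanyName_INBOUND_month-day-year_time.zip into its fields.
# Sets the global variables company, month, day and year.
parse_name() {
    local base=${1##*/} tag time
    base=${base%.zip}                    # drop the extension
    # Fields arrive in order: company, INBOUND, month, day, year, time.
    IFS='_-' read -r company tag month day year time <<< "$base"
}

parse_name "acme_INBOUND_03-22-2016_1100.zip"
echo "$company $month $year"             # prints: acme 03 2016
```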
I was actually looking for a place to start, and what you provided was enough to get me heading in the right direction. Here is the for loop I have written. (Pardon the commented-out echo commands. I was using them for troubleshooting.) The echoes will be removed, and the tar and rm commands will actually be run, in the final product.
for zipfile in "$filedirectory"/*.zip
do
    filename=${zipfile##*/}
    company_name=${filename%%_*}
    # echo "$company_name"
    texthack=${filename#*_}
    # echo "$texthack"
    date=${texthack%_*}
    date=${date##*_}
    # echo "$date"
    month=${date%%-*}
    year=${date##*-}
    # echo "$company_name $month $year"
    # echo "$year"
    echo tar -czf "../DownloadArchive/${company_name}_INBOUND_${month}-${year}.gz" "$zipfile"
    echo rm "$zipfile"
done
One thing I did find that I will change is to put a variable in for the absolute path to the DownloadArchive directory to avoid possible confusion on the tar command. I may have also included a few more steps than need be, but it's my first experience scripting like this since the 90's other than basic Windows batch files.
Any thoughts/suggestions? Thanks again for your input!
Last edited by NoRearView; 03-24-2016 at 11:54 AM.
I found that the tar command I had kept creating new .gz files instead of adding to the existing ones. So I added this to the script, which seems to have done the trick:
if [ -e "${archivedirectory}/${company_name}_INBOUND_${month}-${year}.gz" ]
then
    tar -rvf "${archivedirectory}/${company_name}_INBOUND_${month}-${year}.gz" "$filename"
else
    tar -cvf "${archivedirectory}/${company_name}_INBOUND_${month}-${year}.gz" "$filename"
fi
I feel like there may have been a simpler way to do this, though.
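One possibly simpler variant, offered as a sketch rather than a tested fix: as far as I know, GNU tar's -r (append) creates the archive when it does not exist yet, so the if/else may not be needed. That is worth confirming on your own system first. Note also that tar cannot append to a compressed archive, and neither -cvf nor -rvf compresses anything, so the files produced this way are plain tarballs. The sketch names them .tar for that reason. `append_zip` is a made-up helper name.

```bash
#!/usr/bin/env bash
# Append one zip file to the monthly tarball for its company.
# Assumes names like CompanyName_INBOUND_month-day-year_time.zip.
append_zip() {
    local zipfile=$1 archivedir=$2
    local name=${zipfile##*/}            # file name without the directory
    local company=${name%%_*}            # text before the first underscore
    local rest=${name#*_INBOUND_}        # e.g. 03-22-2016_1100.zip
    local date=${rest%%_*}               # e.g. 03-22-2016
    local archive="$archivedir/${company}_INBOUND_${date%%-*}-${date##*-}.tar"
    # -r appends; -C stores the bare file name instead of the full path.
    tar -rf "$archive" -C "$(dirname "$zipfile")" "$name"
}

# Possible use inside the existing loop (variable names from the script above):
#   for zipfile in "$filedirectory"/*.zip; do
#       append_zip "$zipfile" "$archivedirectory" && rm -- "$zipfile"
#   done
```

Removing each file only after tar succeeds means a failed append leaves the original in place.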