LinuxQuestions.org
Old 05-16-2018, 10:31 PM   #1
skagnola
Member
 
Registered: May 2017
Distribution: CentOS
Posts: 41

Rep: Reputation: Disabled
Need some script help scrubbing tarball list for directory name greater than


Hello, all

Figure some here are script ninjas and will know of a way to do this!

I have to pull a tarball from a remote site on a nightly basis, unpack it, and run an installer from within the created directory. I only want to extract a tarball if its contents are newer than the previous version; otherwise, the tarball should be removed. A new tarball is not guaranteed to be published every day, hence the need to verify the version inside before using it. The version is in the directory name(s) within the tarball.

Unfortunately, the version number does not appear in the tarball's file name after download. The download URL contains a wildcard that points to whatever the latest build is, so the downloaded file name shows a placeholder in parentheses instead of the version number. That forces me to inspect the contents.

i.e.
Code:
# software14.2-(version).tar.gz
#tar tvf software14.2-(version).tar.gz 
...
drwxr--r-- 0/0 11555 1898-01-07 05:06 software14.2-58224.443-bin/webapp/
...
The version number in the directory name (58224 in the listing above) is what I would like to read during the script run, so that the script keeps going only if the version is greater than, say, 58224.

The script so far, without checking the tar contents

Code:
#!/bin/bash

cd /usr/local/src

wget http://domain.com/software14.2-%7Bbuild.number%7D.tar.gz
RTN=$?
if [ $RTN -ne 0 ] ; then
  tail -n10 /tmp/dwnld.log | mail -s "Download Error Report" -r "root@vm.local" "skagnola@domain.com"
else
  find *.tar.gz -ctime -1 -exec tar zxpf {} \; \
  && find *.tar.gz -ctime -1 -exec rm -rf {} \; \
  && cd $(ls -1t | head -1) \
  && ./upgrade.sh --noPrompt
  RTN=$?
  if [ $RTN -ne 0 ] ; then
    tail -n10 /tmp/dwnld.log | mail -s "Download Error Report" -r "root@vm.local" "skagnola@domain.com"
  fi
fi
I'm not very familiar with awk, but I have seen some people use it to filter the output of a tar listing. Or is there another way to accomplish this?

Halp! heh
 
Old 05-17-2018, 05:19 PM   #2
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (15 current), Slack15, Ubuntu studio, MX Linux, FreeBSD 13.1, WIn10
Posts: 10,342

Rep: Reputation: 2242
scrub that idea....

Last edited by BW-userx; 05-17-2018 at 05:22 PM.
 
Old 05-17-2018, 06:57 PM   #3
keefaz
LQ Guru
 
Registered: Mar 2004
Distribution: Slackware
Posts: 6,552

Rep: Reputation: 872
A way:
Code:
old=58224
# The file name is quoted because unquoted parentheses are a syntax error in bash
new=$(tar tvf "software14.2-(version).tar.gz" | grep -Po -m1 '[^-]+-\K\d+(?=\.\d+-bin)')

if (( new > old )); then
  # continue with update
  :
fi
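
A minimal sketch of what the pattern captures, run against the sample listing line from the first post rather than a real tarball. The `extract_build` helper name is hypothetical, and `grep -P` assumes GNU grep built with PCRE support.

```shell
#!/bin/bash
# Hypothetical helper: read a `tar tvf` listing on stdin and print the build number
# that sits between a hyphen and ".<digits>-bin".
extract_build() {
  grep -Po -m1 '[^-]+-\K\d+(?=\.\d+-bin)'
}

# Sample listing line copied from the first post
line='drwxr--r-- 0/0 11555 1898-01-07 05:06 software14.2-58224.443-bin/webapp/'

new=$(printf '%s\n' "$line" | extract_build)
echo "extracted: $new"
```

The `\K` discards everything matched up to the hyphen, and the lookahead `(?=\.\d+-bin)` requires the decimal part and `-bin` to follow without including them in the output.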

Last edited by keefaz; 05-17-2018 at 07:01 PM.
 
Old 05-17-2018, 10:07 PM   #4
skagnola
Member
 
Registered: May 2017
Distribution: CentOS
Posts: 41

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by BW-userx
scrub that idea....
Ha! No kidding!

Quote:
Originally Posted by keefaz
A way:
Code:
old=58224
# The file name is quoted because unquoted parentheses are a syntax error in bash
new=$(tar tvf "software14.2-(version).tar.gz" | grep -Po -m1 '[^-]+-\K\d+(?=\.\d+-bin)')

if (( new > old )); then
  # continue with update
  :
fi
I will give this a shot and report back! Thanks, keefaz
 
Old 05-17-2018, 11:30 PM   #5
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,124

Rep: Reputation: 4120
Is the decimal portion of the number not needed for the comparison? What if only that portion increments? (Insufficient data presented to say.)
 
Old 05-18-2018, 09:03 AM   #6
skagnola
Member
 
Registered: May 2017
Distribution: CentOS
Posts: 41

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by syg00
Is the decimal portion of the number not needed for the comparison? What if only that portion increments? (Insufficient data presented to say.)
Yeah, I thought of that as well. I looked at the way the vendor increments their version numbers, and the longer portion is the more reliable number to go by.


So I am testing with this against an archive that contains a newer version than 58224 (58277):

Code:
old=58224
new=$(tar tvf "software14.2-(version).tar.gz" | grep -Po -m1 '[^-]+-\K\d+(?=\.\d+-bin)')
if (( new > old )); then
  find *.tar.gz -ctime -1 -exec tar zxpf {} \; && find *.tar.gz -ctime -1 -exec rm -rf {} \; && cd "$(ls -1t | head -1)"
else
  echo "Old file"
fi
But when running it against the archive, it is echoing back "Old file". What can be added to echo out what the regex is finding, for the sake of verifying?

btw, that regex is some really cool stuff! I need to brush up on regex skills. It does really cool things!

Last edited by skagnola; 05-18-2018 at 09:07 AM.
 
Old 05-18-2018, 09:19 AM   #7
skagnola
Member
 
Registered: May 2017
Distribution: CentOS
Posts: 41

Original Poster
Rep: Reputation: Disabled
Ahh pewp.

Apologies. The dir names inside the archive actually look like this, which may affect the regex...

Code:
software14.2-SNAPSHOT-58277.489-blabla-bin/
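
To see what the regex is actually finding, the captured value can be echoed before the comparison. The sketch below assumes a listing line shaped like the directory name above (permissions, size and date are made up) and GNU grep with PCRE support. It suggests why "Old file" came back: the pattern captures nothing, and an empty variable counts as 0 inside `(( ))`.

```shell
#!/bin/bash
old=58224
# Listing line shaped like the real directory name (permissions, size and date are made up)
line='drwxr-xr-x 0/0 0 2018-04-06 05:58 software14.2-SNAPSHOT-58277.489-blabla-bin/'

# The original pattern expects "-bin" right after the decimal part, so it finds nothing here
new=$(printf '%s\n' "$line" | grep -Po -m1 '[^-]+-\K\d+(?=\.\d+-bin)' || true)
echo "regex captured: '${new}'"

if [[ -z $new ]]; then
  result="no version found"   # an empty value would otherwise be treated as 0 inside (( ))
elif (( new > old )); then
  result="newer"
else
  result="Old file"
fi
echo "$result"
```

Checking for an empty capture separately keeps a pattern mismatch from being mistaken for an old version.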
 
Old 05-18-2018, 12:30 PM   #8
keefaz
LQ Guru
 
Registered: Mar 2004
Distribution: Slackware
Posts: 6,552

Rep: Reputation: 872
Yes, the regexp needs to be retailored. Is this naming pattern consistent across all tarballs?
Maybe you could extract the whole number then:

Code:
oldMajor=58224
oldMinor=487

# The file name is quoted because unquoted parentheses are a syntax error in bash
new=$(tar tvf "software14.2-(version).tar.gz" | grep -Po -m1 '\-\K\d+\.\d+(?=.*bin/)')
newMajor="${new%.*}"
newMinor="${new#*.}"

if (( newMajor > oldMajor || newMajor == oldMajor && newMinor > oldMinor ))
then
    # continue with update
    :
fi
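
The major/minor comparison can also be wrapped in a function and exercised with a few values. This is only a sketch: the `is_newer` name is hypothetical, and the `10#` prefix is an addition that forces base 10 so that a part with a leading zero (such as 089) is not read as octal by bash arithmetic.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) when version $1 ("major.minor") is newer than $2.
is_newer() {
  local newMajor=${1%.*} newMinor=${1#*.}
  local oldMajor=${2%.*} oldMinor=${2#*.}
  # 10# forces base 10, so a part with a leading zero (e.g. 089) is not read as octal
  (( 10#$newMajor > 10#$oldMajor || ( 10#$newMajor == 10#$oldMajor && 10#$newMinor > 10#$oldMinor ) ))
}

for candidate in 58277.489 58224.487 58224.488 58223.999; do
  if is_newer "$candidate" 58224.487; then
    echo "$candidate: newer"
  else
    echo "$candidate: not newer"
  fi
done
```

Both arguments are expected to contain exactly one dot; an empty or malformed value should be rejected before calling the function.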
 
2 members found this post helpful.
Old 05-18-2018, 05:27 PM   #9
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,124

Rep: Reputation: 4120
Bash keeps improving, but you still need to work around its shortcomings - nice keefaz.
Quote:
Originally Posted by skagnola
btw, that regex is some really cool stuff! I need to brush up on regex skills. It does really cool things!
Not a day goes by without me using regex. Couldn't live without it.
However ... the perlre that keefaz used is amongst the most difficult - to learn and get correct. It can frustrate as well as enthuse - I tend to use it only when "normal" regex fails me.
 
Old 05-18-2018, 08:02 PM   #10
keefaz
LQ Guru
 
Registered: Mar 2004
Distribution: Slackware
Posts: 6,552

Rep: Reputation: 872
It's not that difficult; perlre is just another approach. I tend to use it because it feels more natural to me,
e.g. instead of
Code:
sed -nr 's/.*SNAPSHOT-([^-]+).*bin\/$/\1/p'
(and I use -r options to avoid evil backslashes...)
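
For comparison, here is a small sketch that runs both styles against the same listing line. The line is made up to match the directory name posted earlier, and both commands assume the GNU versions of grep and sed.

```shell
#!/bin/bash
# Listing line shaped like the real directory name (permissions, size and date are made up)
line='drwxr-xr-x 0/0 0 2018-04-06 05:58 software14.2-SNAPSHOT-58277.489-blabla-bin/'

# PCRE via GNU grep: \K drops what was matched so far, (?=...) looks ahead without consuming
with_grep=$(printf '%s\n' "$line" | grep -Po -m1 '\-\K\d+\.\d+(?=.*bin/)')

# ERE via sed: capture the part after "SNAPSHOT-" up to the next hyphen
with_sed=$(printf '%s\n' "$line" | sed -nr 's/.*SNAPSHOT-([^-]+).*bin\/$/\1/p')

echo "grep -P: $with_grep"
echo "sed -r:  $with_sed"
```

Both print the same version string for this line; the sed form depends on the literal SNAPSHOT marker, while the grep form only needs a hyphen followed by a decimal number.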
 
Old 05-19-2018, 11:51 AM   #11
skagnola
Member
 
Registered: May 2017
Distribution: CentOS
Posts: 41

Original Poster
Rep: Reputation: Disabled
keefaz, that code is amazing. It works like a charm. I'm still in awe of that stuff. To an untrained eye it looks like gibberish, but it all has a purpose!


But now, the more I think about how this vendor tarball pull works, with the script comparing versions before doing anything, the more I wonder whether it would be more 'fool-proof' to check that a particular file inside has a timestamp within the last 24 hours. If it does, then proceed.

There is always a file...

Code:
-rw-r--r-- 0/0  14 2018-04-06 05:58 software14.2-SNAPSHOT-58277.489-blabla-bin/VERSION
... stamped with the date it was completed. Since the vendor has had days when they did not upload a tarball - trying to figure the best way to let the script run without downloading / unpacking a version already worked with.

"Well what if they don't deploy for over 24hrs? Say over a weekend they do nothing? How would the date check work?"

I can run this only during the M-F week. From what I see in their version list history, the most they have missed are single days during the work week.

Can't thank you guys enough for the help so far!
 
Old 05-20-2018, 08:37 PM   #12
keefaz
LQ Guru
 
Registered: Mar 2004
Distribution: Slackware
Posts: 6,552

Rep: Reputation: 872
I don't know the whole tar content, but say you have just one file whose name contains "SNAPSHOT":

Code:
# extract the date and time of the first listing entry whose name contains SNAPSHOT
dateTarbal=$(tar tvf "software14.2-(version).tar.gz" | awk '/SNAPSHOT/{print $4" "$5; exit}')

# current timestamp (seconds since the Unix epoch)
today=$(date +%s)

# snapshot timestamp
snapshot=$(date -d "$dateTarbal" +%s)

# subtract the timestamps and convert the result to hours
hours=$(( (today - snapshot) /60/60 ))

# proceed only when the snapshot is less than 24 hours old
if (( hours < 24 )); then
  # continue with update
  :
fi
But I think it would be better to rely on software versioning to check for update
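
A reproducible sketch of the age check, following the original poster's stated goal of proceeding only when the stamp falls within the last 24 hours. It uses the VERSION listing line quoted earlier instead of a real tarball. The `age_hours` helper and the fixed reference time are hypothetical, `date -d` assumes GNU date, and treating the stamp as UTC is an assumption because the listing carries no timezone.

```shell
#!/bin/bash
export TZ=UTC   # assumption: the listing has no timezone, so both times are read as UTC

# Hypothetical helper: age in whole hours of a "YYYY-MM-DD HH:MM" stamp relative to a
# reference time given in epoch seconds (defaults to now).
age_hours() {
  local stamp=$1 now=${2:-$(date +%s)}
  local stamp_epoch
  stamp_epoch=$(date -d "$stamp" +%s) || return 1
  echo $(( (now - stamp_epoch) / 3600 ))
}

# Date and time columns of the VERSION entry shown earlier in the thread
listing='-rw-r--r-- 0/0  14 2018-04-06 05:58 software14.2-SNAPSHOT-58277.489-blabla-bin/VERSION'
stamp=$(printf '%s\n' "$listing" | awk '/\/VERSION$/ {print $4" "$5; exit}')

# Fixed reference time so the example is reproducible; a real run would omit it
ref=$(date -d '2018-04-06 15:58' +%s)
hours=$(age_hours "$stamp" "$ref")

if (( hours < 24 )); then
  echo "stamp $stamp is $hours hours old: proceed"
else
  echo "stamp $stamp is $hours hours old: skip"
fi
```

Matching on the VERSION entry rather than on every SNAPSHOT line yields a single date, which `date -d` needs in order to parse it.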

Last edited by keefaz; 05-20-2018 at 08:47 PM.
 
1 members found this post helpful.
Old 05-20-2018, 09:23 PM   #13
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,124

Rep: Reputation: 4120
More so if you don't know the timezone(s). Since we're talking awk, how about something like
Code:
awk -v pre="$oldver" '/SNAPSHOT/ {match($0, /(.*SNAPSHOT-)([[:digit:].]+)(.*)/, a) ; if (a[2] > pre) {print "new version: " a[2]}}'
Presumes previous version in bash variable oldver - could also be passed in directly.
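
The three-argument form of `match()` is a gawk extension. The sketch below shows a variant that should also run on other POSIX awks, using `RSTART` and `RLENGTH` instead. The listing lines are made up to match the names in this thread, and stopping at the first match is an assumption about what is wanted.

```shell
#!/bin/bash
oldver=58224

# Sample listing lines shaped like the tarball contents (permissions, sizes and dates are made up)
listing='drwxr-xr-x 0/0 0 2018-04-06 05:58 software14.2-SNAPSHOT-58277.489-blabla-bin/
-rw-r--r-- 0/0 14 2018-04-06 05:58 software14.2-SNAPSHOT-58277.489-blabla-bin/VERSION'

result=$(printf '%s\n' "$listing" | awk -v pre="$oldver" '
  /SNAPSHOT-[0-9]/ {
    # match() sets RSTART and RLENGTH in every POSIX awk; skip the 9 characters of "SNAPSHOT-"
    match($0, /SNAPSHOT-[0-9.]+/)
    ver = substr($0, RSTART + 9, RLENGTH - 9)
    # adding 0 forces a numeric comparison (58277.489 is compared as a decimal number)
    if (ver + 0 > pre + 0) print "new version: " ver
    exit    # the first matching entry is enough
  }')
echo "$result"
```

Because the version is compared as a decimal number, this has the same limitation raised earlier in the thread: a minor part of 5 would rank above 489.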
 
  


All times are GMT -5.
