Linux - Software This forum is for Software issues.
09-12-2011, 06:36 PM
|
#1
|
|
Member
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware64 13.37/Slackware64 13.1/Slackware 12.1
Posts: 852
Rep:
|
Help for simple bash script - searching strings
OK, I've been at this one for too long. I'm just beginning with Bash scripting and I need help...
I have an application running on my router that monitors and lists the data bandwidth in/out of my WAN connection. This application has its own web page that I can access to see the amount of data that was uploaded & downloaded.
It's a simple PHP page. Nothing fancy. Just text.
What I want to do is write a simple bash script that I can run manually (or via cron, every hour for example) to extract the value that corresponds to my bandwidth.
To do so, I first used curl to "download" the content of the page, and with grep I can list the line where the value I am searching for is located. The problem is that I don't know how to extract the value from that line of text.
So more info:
I am using this command to get the page content and extract the line of interest:
Code:
curl -k --silent https://localrouter/vnstat2/ | grep "This month"
The result would be something like:
Code:
<tr><td class="label_even">This month</td><td class="numeric_even">5.10 GB</td><td class="numeric_even">110.30 MB</td><td class="numeric_even">5.21 GB</td></tr>
The value of interest is the last one in the row (5.21 GB). How do I extract this value? The "GB" is not necessary. Please note that the position of the first character of this value can change, since the number of digits in the numbers before it (5.10 & 110.30) could very well change... The number of digits of the value itself can also change. I have no control over this... The PHP script does it.
Any bash, sed, awk, or whatever gurus out there?
Thanks!
Last edited by lpallard; 09-12-2011 at 06:39 PM.
|
|
|
|
09-12-2011, 08:39 PM
|
#2
|
|
Senior Member
Registered: Jan 2005
Location: Manalapan, NJ
Distribution: Fedora x86 and x86_64, Debian PPC and ARM, Android
Posts: 4,591
|
Here's one way.
Code:
curl -k --silent https://localrouter/vnstat2/ | grep "This month" | sed -e 's/^.*numeric_even">//;s/<.*$//'
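To see why this works, here is a small sketch that runs the same sed expression against the sample line from post #1 instead of the live page. The variable names `line` and `total` are only for illustration. The first substitution is greedy, so it deletes everything up to and including the last `numeric_even">`; the second deletes from the next `<` to the end of the line.

```bash
#!/bin/bash
# Sample row copied from post #1; it stands in for the curl | grep output.
line='<tr><td class="label_even">This month</td><td class="numeric_even">5.10 GB</td><td class="numeric_even">110.30 MB</td><td class="numeric_even">5.21 GB</td></tr>'

# Step 1: the greedy match removes everything through the LAST 'numeric_even">'.
# Step 2: remove everything from the first remaining '<' to the end of the line.
total=$(printf '%s\n' "$line" | sed -e 's/^.*numeric_even">//;s/<.*$//')
echo "$total"        # value with its unit

# Optional: drop the unit with a parameter expansion, since only the number is wanted.
echo "${total% *}"
```

Because the match is anchored on the last cell rather than on a character position, it should not matter how many digits the earlier numbers have.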
|
|
|
|
09-12-2011, 09:31 PM
|
#3
|
|
Member
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware64 13.37/Slackware64 13.1/Slackware 12.1
Posts: 852
Original Poster
Rep:
|
That's perfect! Just before you posted I was trying to achieve the same but my command was wayyy longer and did not even produce the good result...
Thanks a lot!
I'm gonna keep the thread open for a little while because I'm not done with the script and I might need help later on...
Thanks again!
|
|
|
|
09-12-2011, 11:20 PM
|
#4
|
|
Guru
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 6,321
|
And awk:
Code:
curl -k --silent https://localrouter/vnstat2/ | awk -F'[<>]+' '/This month/{print $(NF-3)}'
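Counting fields from the end depends on exactly how the row is laid out, so a variant that does not rely on a fixed index may be useful. The sketch below is run against the sample line from post #1 (the variable names are illustrative) and simply keeps the last field that looks like "number, space, unit".

```bash
#!/bin/bash
# Sample row copied from post #1; it stands in for the curl output.
line='<tr><td class="label_even">This month</td><td class="numeric_even">5.10 GB</td><td class="numeric_even">110.30 MB</td><td class="numeric_even">5.21 GB</td></tr>'

# Split on runs of '<' or '>' and remember the last field shaped like "5.21 GB".
total=$(printf '%s\n' "$line" | awk -F'[<>]+' '/This month/ {
    for (i = 1; i <= NF; i++)
        if ($i ~ /^[0-9.]+ [KMG]B$/) last = $i   # most recent value-like field
    print last
}')
echo "$total"
```

To see every field with its index while debugging, the loop body can be swapped for `printf "%d: [%s]\n", i, $i`.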
Last edited by grail; 09-13-2011 at 06:50 PM.
|
|
|
|
09-13-2011, 06:01 AM
|
#5
|
|
Member
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware64 13.37/Slackware64 13.1/Slackware 12.1
Posts: 852
Original Poster
Rep:
|
When it's time to process text streams like this, which application is generally best suited? Sed or awk? I understand that if I had to modify strings (replacing expressions, removing characters, etc.), sed, being a stream editor, would be the best...
Do you guys have a good reference for learning these tools?
|
|
|
|
09-13-2011, 09:37 AM
|
#7
|
|
Bash Guru
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,577
|
I'm afraid I don't understand what this means:
Quote:
|
Please note the position of the first character of this value could change as the digits of the other numbers before (5.10 & 110.30) could very well change...
|
Also, is the output itself always uniform in format? And do you always only want the 3rd/last number?
Here are a couple of other solutions I thought of, assuming the above.
The first just loads the string into a variable, then strips off the unwanted parts, giving you the last number.
Code:
num=$( curl -k --silent https://localrouter/vnstat2/ | grep "This month" )
num=${num% [KMG]B*}
num=${num##*>}
echo "$num"
The second runs it through a second grep to extract all "nn.nn" style number strings, and loads the results into an array. All the numbers are thus available to you, if you need them.
Code:
nums=( $( curl -k --silent https://localrouter/vnstat2/ | grep "This month" | grep -Eo '[0-9]+\.[0-9]+' ) )
echo "${nums[2]}"
Edit: Regarding the last question; probably the most practical thing you can do first is to become familiar with regular expressions. This will give you more flexibility with all sorts of tools. Regex is supported by more applications than you know.
Then learn the basics of both sed and awk. Each has its own strengths and weaknesses. sed is line (actually stream) based, and can often more easily do substitutions, deletions, and regex pattern applications on individual lines and whole files. awk, on the other hand, is field-based, and is often easier to use when the text can be split into sections based on characters or patterns of characters. It's also a full scripting language capable of doing very complex text manipulations.
A lot of people forget that there are also a number of other, more specialized, tools available, like cut, tr, head, tail, paste, and fold. Many of these are faster and easier to use than sed and awk within their own areas of expertise.
And finally, there's the shell itself, which has many powerful string manipulation tools, like the parameter expansion and arrays I used above.
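As a rough illustration of the smaller tools mentioned above, the same value can be pulled out with only tr, grep, tail and cut. This is a sketch against the sample line from post #1, not the live page, and the variable names are made up for the example.

```bash
#!/bin/bash
# Sample row copied from post #1; it stands in for the curl | grep output.
line='<tr><td class="label_even">This month</td><td class="numeric_even">5.10 GB</td><td class="numeric_even">110.30 MB</td><td class="numeric_even">5.21 GB</td></tr>'

num=$(printf '%s\n' "$line" |
    tr '<>' '\n\n' |                  # put every tag and every cell text on its own line
    grep -E '^[0-9.]+ [KMG]B$' |      # keep only the lines shaped like "5.21 GB"
    tail -n 1 |                       # the last one is the total
    cut -d' ' -f1)                    # drop the unit
echo "$num"
```

Each stage does one small job, which makes the pipeline easy to debug by cutting it off after any `|`.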
Last edited by David the H.; 09-13-2011 at 10:00 AM.
Reason: as stated.
|
|
|
|
09-13-2011, 10:37 AM
|
#8
|
|
Guru
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 6,321
|
Just in case you want a ruby solution too:
Code:
curl -k --silent https://localrouter/vnstat2/ | ruby -ne 'puts $1 if $_ =~ /This month.*>([\d.]+)/'
Last edited by grail; 09-13-2011 at 06:50 PM.
|
|
|
|
10-01-2011, 05:13 PM
|
#9
|
|
Member
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware64 13.37/Slackware64 13.1/Slackware 12.1
Posts: 852
Original Poster
Rep:
|
All these solutions were helpful! At least I've learned some more!
Now I'd like to write a script to do the following tasks:
- Search a specific folder for sub-folders whose names contain certain strings;
- Rename these folders a certain way (by removing some stuff and reorganizing the content of the file name);
- Enter the sub-folder and rename a specific file inside (there should be only one file per sub folder) the same way as its parent folder;
- If need be, delete all other files from the subfolder;
- Move the renamed file to a certain location;
- Delete the subfolder...
OK an example:
./test/
----|
----./sub1-hello-just-a-subfolder-hello3
-------|
-------file1.txt
-------junk.dat
-------junk.src
-------junk.cpp
to
./test/
---|
---./just-a-subfolder-hello
-------|
-------just-a-subfolder-hello.txt
So for this example, the subfolder was renamed by removing "sub1" & "hello3" and moving the string "hello" to the end, after "just-a-subfolder".
Then the text file was renamed exactly like its parent folder, i.e. "just-a-subfolder-hello", while keeping its extension.
Also, all other files except the one we just renamed were deleted. Finally, "just-a-subfolder-hello.txt" will be moved to another location on the system, and the folder "./test/just-a-subfolder-hello" will be deleted. Not the ./test folder!
Does anybody have a suggestion for me? I played around trying to write a script, but I have problems with recursive operations... I'd normally try again but this time I am in a rush. I prefer bash because it does not require anything exotic, but if perl, ruby or any other language is better, please do not hesitate!
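The steps above could be sketched roughly as follows. This is only an outline under stated assumptions: it builds a throwaway copy of the example tree in a temporary directory, and both the destination folder and the `clean_name` function are hypothetical, with a renaming rule that fits this one example only.

```bash
#!/bin/bash
# Sketch only: recreate the example tree in a scratch area and process it there.
work=$(mktemp -d)                 # scratch area (assumption, not the real path)
src="$work/test"
dest="$work/done"                 # hypothetical destination for the renamed files
mkdir -p "$src/sub1-hello-just-a-subfolder-hello3" "$dest"
touch "$src/sub1-hello-just-a-subfolder-hello3/"{file1.txt,junk.dat,junk.src,junk.cpp}

# Hypothetical renaming rule for this one example:
# "sub1-hello-X-hello3" becomes "X-hello".
clean_name() {
    printf '%s\n' "$1" | sed -E 's/^sub1-hello-(.*)-hello3$/\1-hello/'
}

for dir in "$src"/*sub1*/; do
    [ -d "$dir" ] || continue              # the glob matched nothing
    dir=${dir%/}
    new=$(clean_name "$(basename "$dir")")
    for f in "$dir"/*.txt; do              # expected: exactly one .txt per folder
        [ -f "$f" ] || continue
        mv -- "$f" "$dest/$new.txt"        # rename and move in one step
    done
    rm -rf -- "$dir"                       # removes the junk files and the sub-folder
done

ls "$dest"
```

Since the file is moved out before the sub-folder is deleted, renaming the folder itself becomes unnecessary. The `rm -rf` is destructive, so a real script would be worth trying on copies first.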
Thanks!
Last edited by lpallard; 10-01-2011 at 05:14 PM.
|
|
|
|
10-02-2011, 03:45 AM
|
#10
|
|
Guru
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 6,321
|
Well the first suggestion is, what have you tried and where are you stuck?
If the following is correct:
Quote:
|
All these solutions were helpful! At least I've learned some more!
|
Then you need to demonstrate what you have learned. The idea is not for others to do all the work for you.
|
|
|
|
10-08-2011, 12:14 PM
|
#11
|
|
Member
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware64 13.37/Slackware64 13.1/Slackware 12.1
Posts: 852
Original Poster
Rep:
|
OK, sorry about the long delay in replying. I had to drop this for a few days, but I just returned and had a chance to play a bit more with it...
The task is rapidly outgrowing my capacity to code... There are too many scenarios with folder naming. I need to keep learning because I'm pretty bad.
So far I have adopted the baby-steps approach. Starting with a handful of folders each containing a file, I wrote a script to recursively enter each folder whose name contains a certain string, then do something in that folder. For the first trial, I decided the script would create a new sub-folder in the folders matching the string search. It works.
Now the problem I am facing is dealing with folders that are named pretty randomly. In the example I described in post 9 above, the folder was named "sub1-hello-just-a-subfolder-hello3", but in real life there is no guaranteed pattern for the folder name, only a guarantee that the name will contain certain strings. The order is not known and there could be more or fewer strings in the folder name. For example, "sub1" could be at the beginning, at the end or somewhere else in the name, there could be no "hello", and there will very likely be spaces or other stupid characters in the name... These folders are created by Windows users... They use all kinds of characters and sometimes more characters than necessary... For the initial search of the folders, this should not pose any problem, as even if a folder were named like this:
Code:
tretretretre_8989789++_ sub1 -hello-just_a_subfolder HELLO! efdsfdsf hello...3
searching for the string "sub1" would still return the folder in the results. It's renaming the files based on the folder's name that poses a problem. Instead of starting with the untouched folder name and removing string after string until I get something clean like "sub1-hello3", I think it would be better to remove everything EXCEPT certain strings.
That would mean from:
Code:
tretretretre_8989789++_ sub1 -hello-just_a_subfolder HELLO! efdsfdsf hello...3
removing everything except "sub1" & "hello3", then using the result to rename the file. That would, however, require adding a space between the strings so I get "sub1 hello3" rather than "sub1hello3".
My script so far, very primitive.
Code:
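#!/bin/bash
clear
# Base directory holding the sub-folders to process
cd /home/lpallard/test || exit 1
# Visit every directory whose name contains "sub1"
find . -type d -name '*sub1*' | while IFS= read -r d
do
    # Strip the unwanted strings from the folder name to build the new file name
    new=$(basename "$d" | sed -e 's/String1toremove//g' -e 's/String2toremove//g' -e 's/String3toremove//g')
    # Rename each .txt file directly inside that folder after the cleaned-up name
    find "$d" -maxdepth 1 -type f -name '*.txt' | while IFS= read -r f
    do
        mv -- "$f" "$d/$new.txt"
    done
done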
Booster please ? 
Thanks guys!
Last edited by lpallard; 10-08-2011 at 12:26 PM.
|
|
|
|
10-08-2011, 12:22 PM
|
#12
|
|
Member
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware64 13.37/Slackware64 13.1/Slackware 12.1
Posts: 852
Original Poster
Rep:
|
Looking at the real deal here (the actual folders & files), I believe it is simpler than I thought.
The current folders are more or less named like this:
Code:
StringA ### #### #### RandomStringA StringB RandomStringB
What I want is to
-Keep StringA
-Keep the # (representing numbers 0-9) but add a dash (-) in between them (so from 512 6654 7878 to 512-6654-7878)
-Keep RandomStringA
-Delete StringB
-Delete RandomStringB
-Add spaces between resulting strings
so the file would be renamed
Code:
StringA ###-####-#### RandomStringA.txt
I will keep trying more stuff. I hope this post will clarify a bit.
|
|
|
|
10-09-2011, 02:25 AM
|
#13
|
|
Guru
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 6,321
|
So I am struggling to understand where you are going with this.
Post #11 would be easily solved, as you seem to already know what you want to rename the file / folder to, so there is no need to extract anything; just use what you know.
As for post #12, if we assume that RandomStringA is unknown and there are only 2 spaces prior to it, you can use parameter substitution to remove the last 2 strings. Then you probably need something like sed to insert the dashes between the numbers.
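A minimal sketch of that suggestion, using the layout and the numbers from post #12. It assumes that none of the individual strings contains a space, which may not hold for real folder names; the variable names are illustrative.

```bash
#!/bin/bash
# Example name following the layout in post #12 (placeholder strings).
name='StringA 512 6654 7878 RandomStringA StringB RandomStringB'

# Parameter substitution: drop the last two space-separated words.
kept=${name% * *}

# sed: turn the "### #### ####" group into "###-####-####".
new=$(printf '%s\n' "$kept" | sed -E 's/([0-9]{3}) ([0-9]{4}) ([0-9]{4})/\1-\2-\3/')

echo "$new.txt"
```

`${name% * *}` removes the shortest suffix made of a space, a word, a space and a word, so only the last two words go.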
|
|
|
|
10-09-2011, 08:02 AM
|
#14
|
|
Member
Registered: Nov 2008
Location: Milky Way
Distribution: Slackware64 13.37/Slackware64 13.1/Slackware 12.1
Posts: 852
Original Poster
Rep:
|
OK, I added post 12 because I thought #11 was confusing, but it may have had the opposite effect... If you got the idea in post 11, can we proceed from there?
Let's take baby steps so I can get the point.
At this point what I *think* I have to do is remove certain strings (the garbage identified as RandomStringA & B) and reorganize the other portions of the filename.
Let's start with step 1: I tried to use sed to collect the numerals (##). It works, but I only got as far as extracting the last X digits when the numbers are either in front of the whole string or at the end...
Like "Hello 1978 1986" or "4521 2352 Hello".
This did not prove too useful at first, because I am not verifying the existence of the string but extracting from it (if it exists). What I need is something that will search for the existence of a pattern. Googling for this did not prove too successful.
So in my case, I need to search for the existence of a pattern of "[0-9][0-9][0-9] [0-9][0-9][0-9][0-9] [0-9][0-9][0-9][0-9]" and, if it exists, insert dashes instead of the whitespace and append the result to StringA. To extract StringA I can use sed to collect the first X characters of the filename or, similarly to the numeral search, search for a specific keyword.
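One way to test for the existence of such a pattern in plain bash is the `[[ string =~ regex ]]` operator, which also stores the captured groups in the `BASH_REMATCH` array. The sketch below is only an illustration; the sample name and variable names are made up, and the regex requires that the digit groups are not part of a longer run of digits.

```bash
#!/bin/bash
# Example name following the layout in post #12 (placeholder strings).
name='StringA 512 6654 7878 RandomStringA StringB RandomStringB'

# 3 digits, 4 digits, 4 digits, separated by single spaces and not
# glued to other digits on either side.
re='(^|[^0-9])([0-9]{3}) ([0-9]{4}) ([0-9]{4})([^0-9]|$)'

if [[ $name =~ $re ]]; then
    # Groups 2-4 hold the three number blocks; join them with dashes.
    digits="${BASH_REMATCH[2]}-${BASH_REMATCH[3]}-${BASH_REMATCH[4]}"
    echo "found: $digits"
else
    echo "no number group in: $name"
fi
```

Keeping the regex in a variable and leaving `$re` unquoted on the right-hand side is what makes bash treat it as a regular expression rather than a literal string.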
Am I confusing you?
Last edited by lpallard; 10-09-2011 at 08:12 AM.
|
|
|
|
10-09-2011, 10:08 AM
|
#15
|
|
Guru
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 6,321
|
Assuming I do understand (could be a big if), let us use this example and see if we are on the same page:
Code:
# we store folder name in variable x
x='StringA 123 4567 4568 RandomStringA StringB RandomStringB'
# We need the first string
first=${x%% *}
# We want all the digits (we assume here there are none elsewhere with the same pattern)
digits=$(echo "$x" | sed -rn 's/[^ ]* ([0-9]{3}) ([0-9]{4}) ([0-9]{4}).*/\1-\2-\3/p')
Throw in some echoes for checking and let me know if we are on the right page?
|
|
|
|
All times are GMT -5.