Help for simple bash script - searching strings
OK, I've been trying this one for too long. I'm just beginning in Bash scripting and I need help...
I have an application running on my router which monitors and lists the data bandwidth in/out of my WAN connection. This application has its own web page that I can access to see the amount of data that was uploaded and downloaded. It's a simple PHP page. Nothing fancy. Just text. What I want to do is write a simple bash script that I can run manually (or via cron every hour, for example) to extract the value that corresponds to my bandwidth. To do so, I first tried curl to "download" the content of the page, and using grep I can list the line where the value I am searching for is located. The problem is that I don't know how to extract the value from that line of text. So, more info: I am using this command to get the page content and extract the line of interest:
Code:
curl -k -silent https://localrouter/vnstat2/ | grep "This month"
Code:
<tr><td class="label_even">This month</td><td class="numeric_even">5.10 GB</td><td class="numeric_even">110.30 MB</td><td class="numeric_even">5.21 GB</td></tr>
Any bash, sed, awk, or whatever gurus out there? Thanks! |
Here's one way.
Code:
curl -k -silent https://localrouter/vnstat2/ | grep "This month" | sed -e 's/^.*numeric_even">//;s/<.*$//' |
That's perfect! Just before you posted I was trying to achieve the same but my command was wayyy longer and did not even produce the good result...
Thanks a lot! I'm gonna keep the thread open for a little while because I'm not done with the script and I might need help later on... Thanks again! |
And awk:
Code:
curl -k -silent https://localrouter/vnstat2/ | awk -F"[<>]*" '/This month/{print $(NF-2)}' |
When it's time to process text streams like this, which applications are generally best suited? Sed or awk? I understand that if I had to modify strings (replacing expressions, removing characters, etc.), sed, being a stream editor, would be the best...
Do you guys have a good reference for learning these tools? |
It is generally not a good idea to say "always use this tool" or "always use that one". Rather, use the one best suited to the job, or sometimes the one you are most adept with.
As for references :- http://www.gnu.org/software/gawk/man...ode/index.html http://www.grymoire.com/Unix/Sed.html |
I'm afraid I don't understand what this means:
Quote:
Here are a couple of other solutions I thought of, assuming the above. The first just loads the string into a variable, then strips off the unwanted parts, giving you the last number.
Code:
num=$( curl -k -silent https://localrouter/vnstat2/ | grep "This month" )
Code:
nums=( $( curl -k -silent https://localrouter/vnstat2/ | grep "This month" | grep -Eo '[0-9]+\.[0-9]+' ) )
Then learn the basics of both sed and awk. Each has its own strengths and weaknesses. sed is line (actually stream) based, and can often more easily do substitutions, deletions, and regex pattern applications on individual lines and whole files. awk, on the other hand, is field-based, and is often easier to use when the text can be split into sections based on characters or patterns of characters. It's also a full scripting language capable of doing very complex text manipulations.
A lot of people forget that there are also a number of other, more specialized tools available, like cut, tr, head, tail, paste, and fold. Many of these are faster and easier to use than sed and awk within their own areas of expertise. And finally, there's the shell itself, which has many powerful string manipulation tools, like the parameter expansion and arrays I used above. |
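For anyone following along, here is a rough sketch of how the two ideas mentioned in that post (parameter expansion and an array) could look when written out in full. It is not the poster's original code, which is not fully shown in the thread. The sample line is copied from the first post and hard-coded so the sketch runs without the router; in a real script it would come from the curl and grep pipeline.

```shell
#!/bin/bash
# Sample line from the first post (assumption: the real page prints it on one line).
line='<tr><td class="label_even">This month</td><td class="numeric_even">5.10 GB</td><td class="numeric_even">110.30 MB</td><td class="numeric_even">5.21 GB</td></tr>'

# Idea 1: parameter expansion on a single variable.
num=${line%</td></tr>}   # remove the closing tags from the end of the string
num=${num##*>}           # remove everything up to and including the last ">"
echo "$num"              # prints: 5.21 GB

# Idea 2: collect every decimal number on the line into an array.
nums=( $( grep -Eo '[0-9]+\.[0-9]+' <<< "$line" ) )
echo "${nums[0]} ${nums[1]} ${nums[2]}"   # prints: 5.10 110.30 5.21
```

The array version drops the units (GB/MB), so it only makes sense if you know in advance which unit each column uses.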
Just in case you want a ruby solution too:
Code:
curl -k -silent https://localrouter/vnstat2/ | ruby -ne 'puts $1 if $_ =~ /This month.*>([\d.]+)/' |
All these solutions were helpful! At least I've learned some more!
Now I'd like to write a script to do the following tasks:
OK, an example:
./test/
----|
----./sub1-hello-just-a-subfolder-hello3
-------|
-------file1.txt
-------junk.dat
-------junk.src
-------junk.cpp
to
./test/
---|
---./just-a-subfolder-hello
-------|
-------just-a-subfolder-hello.txt
So for this example, the subfolder was renamed by removing "sub1" and "hello3", and the string "hello" was moved behind "just-a-subfolder". Then the text file was renamed exactly like its parent folder, i.e. "just-a-subfolder-hello", while keeping its extension. Also, all other files except the one we just renamed were deleted. Finally, "just-a-subfolder-hello.txt" will be moved to another location on the system, and the folder "./test/just-a-subfolder-hello" will be deleted. Not the ./test folder!
Does anybody have a suggestion for me? I kinda played around trying to write a script, but I have problems with recursive operations... I'd normally try again but this time I am in a rush. I prefer bash because it does not require anything exotic, but if perl, ruby or any other language is better, please do not hesitate! Thanks! |
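To make the sequence of steps in that example concrete, here is a minimal sketch that replays it inside a throwaway directory created with mktemp. The target name is simply hard-coded from the example, the "dest" folder is a hypothetical stand-in for "another location on the system", and the sketch assumes the folder holds exactly one .txt file.

```shell
#!/bin/bash
# Build a disposable copy of the example tree so nothing real is touched.
work=$(mktemp -d)
mkdir -p "$work/test/sub1-hello-just-a-subfolder-hello3" "$work/dest"
touch "$work/test/sub1-hello-just-a-subfolder-hello3/"{file1.txt,junk.dat,junk.src,junk.cpp}

old="$work/test/sub1-hello-just-a-subfolder-hello3"
newname="just-a-subfolder-hello"     # target name, taken from the example
new="$work/test/$newname"

mv -- "$old" "$new"                                  # 1. rename the folder
mv -- "$new"/*.txt "$new/$newname.txt"               # 2. rename the text file (assumes exactly one .txt)
find "$new" -type f ! -name "$newname.txt" -delete   # 3. delete every other file
mv -- "$new/$newname.txt" "$work/dest/"              # 4. move the file to the other location
rmdir -- "$new"                                      # 5. remove the now-empty folder; ./test stays

ls "$work/dest" "$work/test"
```

Using rmdir rather than rm -r for the last step is deliberate: it refuses to remove the folder if anything unexpected is still inside.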
Well the first suggestion is, what have you tried and where are you stuck?
If the following is correct: Quote:
|
OK sorry about the long delay in replying, I had to drop this for a few days but I just returned and had a chance to play a bit more with this...
The task is rapidly outgrowing my capacity to code... There are too many scenarios with folder naming. I need to keep learning because I'm pretty bad :(
So far I have adopted the baby-steps approach. Starting with a handful of folders each containing a file, I wrote a script to recursively enter each folder whose name contains a certain string, then do something in that folder. For the purpose of the first trial, I decided the script would create a new sub-folder in the folders matching the string search. It works.
Now the problem I am facing is dealing with folders that are named pretty randomly. In the example I described at post 9 above, the folder was named "sub1-hello-just-a-subfolder-hello3", but in real life there is no guaranteed pattern for the folder name, just a guarantee that the name will contain certain strings. The order is not known and there could be more or fewer strings in the folder name. For example, "sub1" could be at the beginning, at the end or somewhere else in the name, there could be no "hello", and there will very likely be spaces or other stupid characters in the name... These folders are created by Windows users... They use all kinds of characters and sometimes more characters than enough...
For the initial search of the folders, this should not pose any problems as even if folders were named like this:
Code:
tretretretre_8989789++_ sub1 -hello-just_a_subfolder HELLO! efdsfdsf hello...3
That would mean from:
Code:
tretretretre_8989789++_ sub1 -hello-just_a_subfolder HELLO! efdsfdsf hello...3
Code:
sub1 hello3
My script so far, very primitive.
Code:
#!/bin/bash
Thanks guys! |
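The script itself is not shown in the thread, so the following is only a sketch of the first trial described above: visit every folder whose name contains a given string and create a new sub-folder inside it. The search string "sub1", the name "new-subfolder" and the sample folders are assumptions for illustration. It only looks one level deep; a nested tree would need find or a recursive function.

```shell
#!/bin/bash
# Sandbox with sample folders, including one with spaces and odd characters.
base=$(mktemp -d)
mkdir "$base"/'tretretretre_8989789++_ sub1 -hello-just_a_subfolder HELLO! efdsfdsf hello...3' \
      "$base/sub1-hello-just-a-subfolder-hello3" \
      "$base/unrelated"

match="sub1"     # string the folder name must contain (hypothetical choice)
count=0

# The glob expands to one word per directory, so spaces in names are safe
# as long as "$dir" stays quoted. The trailing slash restricts it to directories.
for dir in "$base"/*"$match"*/; do
    [ -d "$dir" ] || continue            # skip the literal pattern if nothing matched
    mkdir -p -- "${dir}new-subfolder"    # the "do something" step of the first trial
    count=$((count + 1))
done

echo "processed $count folders"
```

Letting the shell glob do the matching avoids parsing the output of ls, which is where names with spaces usually break.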
Looking at the real deal here (the actual folders & files), I believe it is simpler than I thought.
The current folders are more or less named like this:
Code:
StringA ### #### #### RandomStringA StringB RandomStringB
-Keep StringA
-Keep the # (representing numbers 0-9) but add a dash (-) in between them (so from 512 6654 7878 to 512-6654-7878)
-Keep RandomStringA
-Delete StringB
-Delete RandomStringB
-Add spaces between resulting strings
so the file would be renamed:
Code:
StringA ###-####-#### RandomStringA.txt |
So I am struggling to understand where you are going with this :(
Post #11 would be easily solved as you seem to already know what you want to rename the file / folder to so no need to extract anything just use what you know. As for post #12, if we assume that RandomStringA is unknown and there are only 2 spaces prior to it you can use parameter substitution to remove the last 2 strings. Then you probably need something like sed to insert the dashes between the numbers. |
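A minimal sketch of that suggestion, using the placeholder name from post #12 with made-up digits. It assumes StringB and RandomStringB are each a single word without spaces, so that they are exactly the last two space-separated words of the name.

```shell
#!/bin/bash
name='StringA 512 6654 7878 RandomStringA StringB RandomStringB'

# Parameter expansion: strip the last two space-separated words.
trimmed=${name% *}       # removes the shortest suffix matching " *", i.e. " RandomStringB"
trimmed=${trimmed% *}    # same again, removes " StringB"

# sed: join the 3-4-4 digit groups with dashes.
newname=$(sed -E 's/([0-9]{3}) ([0-9]{4}) ([0-9]{4})/\1-\2-\3/' <<< "$trimmed")

echo "$newname.txt"      # prints: StringA 512-6654-7878 RandomStringA.txt
```

If RandomStringB can itself contain spaces, counting words from the end no longer works and the known StringB would have to be used as the cut point instead.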
OK I added post 12 because I thought #11 was confusing but it may have had the opposite effect... If you got the idea on post 11, then can we proceed from there?
Let's adopt the baby steps so I can get the point. At this point what I *think* I have to do is remove certain strings (the garbage identified as RandomStringA & B) and reorganize the other portions of the filename.
Let's start with step 1: I tried to use sed to collect the numerals (##). It works, but I only got as far as extracting the last X digits when the numbers are either in front of the whole string or at the end... Like "Hello 1978 1986" or "4521 2352 Hello". This did not prove too useful at first because I am not verifying the existence of the string but extracting from it (if it exists). What I need is something that will search for the existence of a pattern. Googling for this did not prove too successful.
So in my case, I need to search for the existence of a pattern of "[0-9][0-9][0-9] [0-9][0-9][0-9][0-9] [0-9][0-9][0-9][0-9]" and, if it exists, insert dashes instead of whitespace and append it to StringA. To extract StringA I can use sed to collect the first X characters of the filename or, similarly to the numeral search, search for a specific keyword. Am I confusing you? |
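One way to test for the existence of such a pattern in bash itself is the [[ string =~ regex ]] operator, which also stores the captured groups in the BASH_REMATCH array. The sketch below is only an illustration; the function name dashed_number and the sample names are hypothetical.

```shell
#!/bin/bash
# Succeeds and prints the dashed number if a "### #### ####" group exists
# anywhere in the argument; fails (returns 1) and prints nothing otherwise.
dashed_number() {
    local re='([0-9]{3}) ([0-9]{4}) ([0-9]{4})'
    if [[ $1 =~ $re ]]; then
        echo "${BASH_REMATCH[1]}-${BASH_REMATCH[2]}-${BASH_REMATCH[3]}"
    else
        return 1
    fi
}

for name in 'StringA 512 6654 7878 RandomStringA' 'Hello 1978 1986'; do
    if number=$(dashed_number "$name"); then
        echo "$name -> $number"
    else
        echo "$name -> no match"
    fi
done
```

Keeping the regex in a variable and leaving it unquoted on the right-hand side of =~ is the usual way to avoid quoting surprises. Note that the pattern is not anchored, so a longer run of digits would also match on its last three digits.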
Assuming I do understand (could be a big if), let us use this example and see if we are on the same page:
Code:
# we store folder name in variable x |