[SOLVED] Help for simple bash script - searching strings
OK, I've tried this one for too long. I'm just beginning in Bash scripting and I need help...
I have this application running on my router which monitors and lists the data bandwidth in/out of my WAN connection. This application has its own web page that I can access to see the amount of data that was uploaded & downloaded.
It's a simple PHP page. Nothing fancy. Just text.
What I want to do is to program a simple bash script that I can run manually (or via cron every hour for example) and extract the value that corresponds to my bandwidth.
To do so, I first tried using curl to "download" the content of the page, and with grep I can list the line where the value I am searching for is located. The problem is that I don't know how to extract the value from that line of text.
So more info:
I am using this command to get the page content and extract the line of interest:
I highlighted the value of interest in bold (5.21 GB). How do I extract this value? The "GB" is not necessary. Please note the position of the first character of this value could change as the digits of the other numbers before (5.10 & 110.30) could very well change... The number of digits of the value itself can also change. I have no control over this... The PHP script does it.
When it's time to process text streams like this, which applications are generally best suited? Sed or awk? I understand that if I had to modify strings (replacing expressions, removing characters, etc.), sed, being a stream editor, would be the best...
Do you guys have a good reference for learning these tools?
Please note the position of the first character of this value could change as the digits of the other numbers before (5.10 & 110.30) could very well change...
Also, is the output itself always uniform in format? And do you always only want the 3rd/last number?
Here are a couple of other solutions I thought of, assuming the above.
The first just loads the string into a variable, then strips off the unwanted parts, giving you the last number.
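The code that went with this description is not shown here, so the following is only a minimal sketch of the idea using bash parameter expansion. The sample line is hypothetical (the real page output is not reproduced in this thread); it only assumes that the wanted number is the last one on the line and is followed by " GB".

```shell
# Hypothetical sample line; substitute the line returned by your curl | grep.
line='WAN traffic: 5.10 GB 110.30 GB 5.21 GB'

value=${line% GB}     # strip the trailing " GB"
value=${value##* }    # strip everything up to and including the last space

echo "$value"
```

No external command is needed, and the position or length of the number does not matter as long as it stays the last field before "GB".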
The second runs it through a second grep to extract all "nn.nn" style number strings, and loads the results into an array. All the numbers are thus available to you, if you need them.
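As a rough sketch of that second approach (again with a hypothetical sample line, and assuming a grep that supports -o and a bash with mapfile, i.e. version 4 or later):

```shell
# Hypothetical sample line; substitute the line returned by your curl | grep.
line='WAN traffic: 5.10 GB 110.30 GB 5.21 GB'

# grep -o prints every match on its own line; mapfile stores one line per element.
mapfile -t nums < <(printf '%s\n' "$line" | grep -oE '[0-9]+\.[0-9]+')

echo "all numbers: ${nums[*]}"
echo "third number: ${nums[2]}"    # array indexes start at 0
```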
Edit: Regarding the last question; probably the most practical thing you can do first is to become familiar with regular expressions. This will give you more flexibility with all sorts of tools. Regex is supported by more applications than you know.
Then learn the basics of both sed and awk. Each has its own strengths and weaknesses. sed is line (actually stream) based, and can often more easily do substitutions, deletions, and regex pattern applications on individual lines and whole files. awk, on the other hand, is field-based, and is often easier to use when the text can be split into sections based on characters or patterns of characters. It's also a full scripting language capable of doing very complex text manipulations.
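To make the difference concrete, here is a small sketch that pulls the same number out of the same line with each tool. The sample line is hypothetical, and the sed command assumes a version that accepts -E for extended regular expressions.

```shell
# Hypothetical sample line; substitute the line returned by your curl | grep.
line='WAN traffic: 5.10 GB 110.30 GB 5.21 GB'

# sed: describe the line with a pattern and keep only the captured part.
with_sed=$(printf '%s\n' "$line" | sed -E 's/.* ([0-9.]+) GB$/\1/')

# awk: split the line into whitespace-separated fields, print the next-to-last one.
with_awk=$(printf '%s\n' "$line" | awk '{ print $(NF-1) }')

echo "sed: $with_sed"
echo "awk: $with_awk"
```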
A lot of people forget that there are also a number of other, more specialized, tools available, like cut, tr, head, tail, paste, and fold. Many of these are faster and easier to use than sed and awk within their own areas of expertise.
And finally, there's the shell itself, which has many powerful string manipulation tools, like the parameter expansion and arrays I used above.
Last edited by David the H.; 09-13-2011 at 11:00 AM.
Reason: as stated.
So for this example, subfolder was renamed with removal of "sub1" & "hello3", and the string "hello" was placed in front of "just-a-subfolder"
Then the text file was renamed exactly as its parent folder i.e. "just-a-subfolder-hello" while conserving its extension.
Also all other files except the one we just renamed were deleted. Finally, "just-a-subfolder-hello.txt" will be moved to another location on the system, and folder "./test/just-a-subfolder-hello" will be deleted. Not the ./test folder!
Does anybody have a suggestion for me? I kind of played around trying to write a script, but I have problems with recursive operations... I'd normally try again, but this time I am in a rush. I prefer bash because it does not require anything exotic, but if perl, ruby or any other language is better, please do not hesitate!
OK, sorry about the long delay in replying. I had to drop this for a few days, but I just returned and had a chance to play a bit more with this...
The task is rapidly outgrowing my capacity to code... There are too many scenarios with folder naming. I need to keep learning because I'm pretty bad.
So far I have adopted the baby-steps approach. Starting with a handful of folders each containing a file, I wrote a script to recursively enter each folder whose name contains a certain string, then do something in that folder. For the purpose of the first trial, I decided the script would create a new sub-folder in the folders matching the string search. It works.
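For anyone following along, a first trial like that might look roughly like the sketch below. The directory names and the "new" sub-folder are made up for illustration, and it runs in a temporary directory so nothing real is touched. It assumes a find with -print0 and bash's read -d '', which together cope with spaces and other odd characters in names.

```shell
# Hypothetical test tree in a scratch directory.
root=$(mktemp -d)
mkdir -p "$root/sub1 first" "$root/other" "$root/second-sub1"

# For every directory whose name contains "sub1", create a "new" sub-folder in it.
find "$root" -type d -name '*sub1*' -print0 |
while IFS= read -r -d '' dir; do
    mkdir -p "$dir/new"
done

find "$root" -type d | sort
```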
Now the problem I am facing is dealing with folders that are named pretty randomly. In the example I described in post 9 above, the folder was named "sub1-hello-just-a-subfolder-hello3", but in real life there is no guaranteed pattern for the folder name, just a guarantee that the name will contain certain strings. The order is not known and there could be more or fewer strings in the folder name. For example, "sub1" could be at the beginning, at the end, or somewhere else in the name; there could be no "hello"; and there will very likely be spaces or other stupid characters in the name... These folders are created by Windows users... They use all kinds of characters, and sometimes more characters than necessary... For the initial search of the folders, this should not pose any problem, as even if folders were named like this:
searching for the string "sub1" would still return the folder in the results. It's renaming the files based on the folder's name that poses a problem. Instead of starting with the untouched folder name and removing string after string until I get something clean like "sub1-hello3", I think it would be better to remove everything EXCEPT certain strings.
removing everything except "sub1" & "hello3" to get:
Code:
sub1 hello3
then use the result to rename the file. It would, however, require adding spaces between the strings so I don't get "sub1hello3" but "sub1 hello3" instead.
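One possible way to do this "keep only what I know" idea in plain bash is sketched below. The folder name and the list of strings to keep are hypothetical; the result follows the order of the keep list, not the order in which the strings appear in the folder name.

```shell
# Hypothetical messy folder name and the strings we want to keep.
name='garbage sub1_some!!other hello3 stuff'
keep=(sub1 hello3)

result=''
for word in "${keep[@]}"; do
    # Keep the word only if it really occurs somewhere in the name.
    if [[ $name == *"$word"* ]]; then
        result+="${result:+ }$word"    # add a space only between words
    fi
done

echo "$result"
```

Note that this is a plain substring test, so "hello" would also be found inside "hello3"; stricter matching would need a regex.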
My script so far, very primitive.
Code:
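#!/bin/bash
clear
cd /home/lpallard/test || exit 1
# Visit every folder whose path contains "sub1"
find . -type d | grep sub1 | while IFS= read -r d
do
    d=${d#./}    # strip the leading "./"
    # Build the new name by deleting the unwanted strings from the folder name
    new=$(basename "$d" | sed -e 's/String1toremove//g' -e 's/String2toremove//g' -e 's/String3toremove//g')
    # Rename the .txt file directly inside that folder (assumes one per folder)
    find "$d" -maxdepth 1 -type f -name '*.txt' | while IFS= read -r f
    do
        mv -- "$f" "$d/$new.txt"
    done
done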
What I want is to
-Keep StringA
-Keep the # (representing numbers 0-9) but add a dash (-) in between them (so from 512 6654 7878 to 512-6654-7878)
-Keep RandomStringA
-Delete StringB
-Delete RandomStringB
-Add spaces between resulting strings
so the file would be renamed
Code:
StringA ###-####-#### RandomStringA.txt
I will keep trying more stuff. I hope this post will clarify a bit.
So I am struggling to understand where you are going with this.
Post #11 would be easily solved, as you seem to already know what you want to rename the file / folder to, so there is no need to extract anything; just use what you know.
As for post #12, if we assume that RandomStringA is unknown and there are only 2 spaces prior to it, you can use parameter substitution to remove the last 2 strings. Then you probably need something like sed to insert the dashes between the numbers.
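A minimal sketch of those two steps, assuming a hypothetical name laid out as described in post #12 (strings separated by single spaces, with the two unwanted strings at the end) and a sed that accepts -E:

```shell
# Hypothetical folder name following the layout from post #12.
x='StringA 123 4567 4568 RandomStringA StringB RandomStringB'

# Parameter substitution: each ${var% *} removes the last space-separated string.
trimmed=${x% *}          # drops " RandomStringB"
trimmed=${trimmed% *}    # drops " StringB"

# sed: turn "123 4567 4568" into "123-4567-4568".
new=$(printf '%s\n' "$trimmed" | sed -E 's/([0-9]{3}) ([0-9]{4}) ([0-9]{4})/\1-\2-\3/')

echo "$new.txt"
```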
OK, I added post 12 because I thought #11 was confusing, but it may have had the opposite effect... If you got the idea from post 11, then can we proceed from there?
Let's adopt the baby steps so I can get the point.
At this point what I *think* I have to do is to remove certain strings (the garbage identified as RandomStringA & B) and reorganize the other portions of the filename.
Let's start with step 1: I tried to use sed to collect the numerals (##). It works, but I only got as far as extracting the last X digits when the numbers are either at the front of the whole string or at the end...
Like "Hello 1978 1986" or "4521 2352 Hello".
This did not prove too useful at first because I am not verifying the existence of the string but extracting from it (if it exists). What I need is something that will search for the existence of a pattern. Googling for this did not prove too successful.
So in my case, I need to search for the existence of a pattern of "[0-9][0-9][0-9] [0-9][0-9][0-9][0-9] [0-9][0-9][0-9][0-9]" and, if it exists, insert dashes instead of whitespace and append the result to StringA. To extract StringA I can use sed to collect the first X characters of the filename or, similarly to the numeral search, search for a specific keyword.
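One way to test for the existence of such a pattern without leaving bash is the [[ ... =~ ... ]] operator, which stores captured groups in the BASH_REMATCH array. The sketch below uses a hypothetical name; note that the regex would also match inside longer runs of digits, so tighten it if that can happen in your names.

```shell
# Hypothetical name to test.
name='StringA 512 6654 7878 RandomStringA'

# Three digits, space, four digits, space, four digits.
re='([0-9]{3}) ([0-9]{4}) ([0-9]{4})'

if [[ $name =~ $re ]]; then
    # The pattern exists: rebuild the number group with dashes.
    digits="${BASH_REMATCH[1]}-${BASH_REMATCH[2]}-${BASH_REMATCH[3]}"
    first=${name%% *}    # everything before the first space
    echo "$first $digits"
else
    echo "no number group found in: $name" >&2
fi
```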
Assuming I do understand (could be a big if), let us use this example and see if we are on the same page:
Code:
# we store folder name in variable x
x='StringA 123 4567 4568 RandomStringA StringB RandomStringB'
# We need the first string
first=${x%% *}
# We want all the digits (we assume here there are none elsewhere with the same pattern)
digits=$(echo "$x" | sed -rn 's/[^ ]* ([0-9]{3}) ([0-9]{4}) ([0-9]{4}).*/\1-\2-\3/p')
Throw in some echoes for checking and let me know if we are on the right page?