[SOLVED] Shell Scripting: Scraping Public IP and Emailing - [: too many arguments
Ok, so I wrote a simple script to scrape whatismyip.org for my public IP address and then email it to me if it has changed. I set this up as a cron job every 30 minutes.
The problem is, it works for about 18 hours and then has an issue where it says "line 33: [: too many arguments".
Line 33 happens to be my if statement. Like I said, it works fine for about 18 hours. I run the script with 2>&1 redirected to a .out file so I can see what it's doing.
This is the output of the .out file; the "xxx" does actually show the IP.
Code:
xxx.xxx.xxx.xxx
already in file
xxx.xxx.xxx.xxx
already in file
xxx.xxx.xxx.xxx
already in file
xxx.xxx.xxx.xxx
already in file
xxx.xxx.xxx.xxx
already in file
xxx.xxx.xxx.xxx
already in file
/opt/public-ip/public-ip.sh: line 33: [: too many arguments
[<-] 220 mx.google.com ESMTP r33sm4143833qcs.42
here is the code for my script
Code:
23
24
25 # Obtain the current public IP Address
26
27 IP=`curl -s http://www.whatismyip.org`
28 echo $IP
29
30
31 # see if new ip is already in the file
32
33 if [ $IP = `grep $IP /opt/public-ip/currentip` ];then
34 echo "already in file"
35 else
36 echo $IP >> /opt/public-ip/currentip
37
38 # ssmtp -vvv no@no.com < /opt/public-ip/currentip
39 # ssmtp -vvv no@no.com < /opt/public-ip/currentip
40 # ssmtp -vvv no@no.com < /opt/public-ip/currentip
41 fi
42
43 exit 0
So it is something with my if statement, but I don't know what it could be. Like I said, it works right for 18 hours and then has a hiccup, shows "too many arguments", and emails me.
Could it be that my curl statement is returning something that does not match what's in the currentip file? If so, what kind of logic can I use to weed that out?
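One possible way to weed that out, as a rough sketch only: check that the fetched value looks like an IPv4 address before testing it, and let grep's exit status drive the if instead of comparing strings inside [ ]. The function names is_ipv4 and record_ip are made up for this illustration, and the address check is deliberately loose (it does not verify that each field is 0-255).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 looks like a dotted-quad address.
# Loose check: digits and dots only, exactly three dots, no empty fields.
is_ipv4() {
    case "$1" in
        ''|*[!0-9.]*|.*|*.|*..*|*.*.*.*.*) return 1 ;;
        *.*.*.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: record_ip ADDRESS FILE
# Appends ADDRESS to FILE unless an identical line is already there.
# Returns 2 without touching FILE if ADDRESS does not look like an address.
record_ip() {
    ip=$1
    file=$2
    is_ipv4 "$ip" || return 2
    # -q: no output, -x: match the whole line, -F: fixed string, not a regex
    if grep -qxF -- "$ip" "$file" 2>/dev/null; then
        echo "already in file"
    else
        printf '%s\n' "$ip" >> "$file"
        echo "added"
    fi
}
```

In the script above this might be called as record_ip "$IP" /opt/public-ip/currentip, with the email sent only when it prints "added". Because every expansion is quoted and grep is never substituted into [ ], an error page from the site or several matching lines in the file can no longer turn into extra arguments to the test.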
@Reuti - I get what you are saying, I did not think of that. I will adjust my code to compensate for multiple close entries, and I do like the idea of going with a greater than count!
I wrote a function for a script a while back that queries my external IP. It hits several sites on a random basis, and you can easily add or subtract from the list as you find them. It currently has seven working sites. All you need is a URL that sends back the address as a simple text string.
Code:
get_current_ip() {
    local -a ipsite
    local -i xc num i

    # Sites that send back the caller's address as plain text
    ipsite=(
        "http://automation.whatismyip.com/n09230945.asp"
        "http://showip.codebrainz.ca/"
        "http://cfaj.freeshell.org/ipaddr.cgi"
        "http://icanhazip.com/"
        "http://wooledge.org/myip.cgi"
        "http://ifconfig.me/ip"
        "https://secure.informaction.com/ipecho/"
    )

    xc=1
    num=${#ipsite[@]}

    # Keep trying randomly chosen sites until wget exits successfully
    until (( xc == 0 )) ; do
        (( i = RANDOM % num ))
        ip=$( wget -t 1 -T 5 -q -O- "${ipsite[i]}" )
        xc=$?
    done

    # Strip everything except digits and dots
    echo -n "${ip//[^0-9.]}"
}
Edit: note that this is assuming you are using bash. And speaking of which...
I must say I consider that extremely bad advice. I do realise you did say you assume Bash is used, but still.
First, original poster did not specify Bash, and the script name implies a generic POSIX shell (.sh suffix). Second, I believe [[ to be harmful.
[[ is Bash-specific. It is not a POSIX shell feature; you cannot assume it is implemented in anything but Bash. In many distros nowadays /bin/sh is symlinked to dash (which is a good idea, in my opinion, since it is an actual POSIX shell), which does not support [[.
Therefore, you can only rely on it being available if you use Bash explicitly (#!/bin/bash or equivalent).
The bigger reason is that the word-splitting no-escaping feature of [[ is a very close logical cousin of the magic quotes concept in PHP. Over a few years it was found that magic quotes in PHP produce security problems, since they allowed developers to ignore proper quoting and escaping rules. This, in my opinion, was a major reason why PHP was/is perceived as an inherently insecure language. Like [[, magic quotes were heavily recommended for use early on, even being enabled by default for PHP installations at one point.
Magic quotes in PHP are being phased out, both in the language and configuration files. The PHP site documents all of them to be deprecated, and states that the configuration options will definitely be removed in future versions.
I fail to see any significant differences between these two features. Both were intended to ease the rules for script writers, by overriding standard quoting and escaping rules. (Well, magic quotes were marketed as a security feature; perhaps some consider that a big difference.) Therefore, I suspect history will repeat itself, and that teaching users to prefer [[ over [ will lead to significant problems later on -- specifically, in a failure to understand proper quoting and escaping rules, this time in shell scripts. Personally, I'd recommend biting the bullet and learning them early on, and always applying them, even when technically not required.
Pattern matching can just as easily be done using a case statement, but I do like the regular expression matching [[ has.
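As a small sketch of what I mean by using case for pattern matching (the function name classify is made up for this example), this runs in any POSIX shell, dash included:

```shell
#!/bin/sh
# Hypothetical example: classify a string using only case glob patterns.
classify() {
    case "$1" in
        "")       echo "empty" ;;
        *[!0-9]*) echo "not a number" ;;   # contains a character that is not a digit
        *)        echo "number" ;;
    esac
}
```

Note that the word after case is not subject to word splitting or pathname expansion, so a value with spaces in it is still matched as one string. Quoting it anyway costs nothing and keeps the habit.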
Quote:
The bigger reason is that the word-splitting no-escaping feature of [[ is a very close logical cousin of the magic quotes concept in PHP. Over a few years it was found that magic quotes in PHP produce security problems, since they allowed developers to ignore proper quoting and escaping rules. This, in my opinion, was a major reason why PHP was/is perceived as an inherently insecure language. Like [[, magic quotes were heavily recommended for use early on, even being enabled by default for PHP installations at one point.
Magic quotes in PHP are being phased out, both in the language and configuration files. The PHP site documents all of them to be deprecated, and states that the configuration options will definitely be removed in future versions.
Are they really that similar? Aren't there some differences in how scripts are handled? I also believe it just depends on the version of the script or code, and on the parser or generator of bash.
Quote:
I fail to see any significant differences between these two features.
Lots of good differences, but those differences don't really matter anymore once you get used to the two and know how to take advantage of them, in the proper way that is. What I like with [[ though is speed in parsing and cleaner syntax, e.g. [[ NUMBER -op N || $VAR = "$VAR2" ]].
Quote:
Pattern matching can just as easily be done using a case statement, but I do like the regular expression matching [[ has.
Depends. Sometimes it's a lot easier with [[. But then, parsing with case statements is quicker. Again, it is in how you take advantage of their features.
sh is a universal shell but bash is still the most popular or most distributed shell around and easier (and safer) to code in my opinion.
also, you can't do this properly in sh:
Code:
VALUES='1 2 3 4 * abc[1] 1234?'
for A in $VALUES; do
echo "$A"
done
to yield:
Code:
1
2
3
4
* <- expect different output
...
when there's a file around. do we have to use sed to fix that?
Code:
while read -d ' '; do echo "$REPLY"; done <<< "$VALUES "
read -a VALUES_A <<< "$VALUES"; IFS=$'\n' eval "echo \"\${VALUES_A[*]}\""
` also has some issues. i already forgot them though.
some scripts also do this:
Code:
NEWLINE="
"
IFS="$NEWLINE" for A in `grep exp BIGFILE.TXT`; do ... ; done
which is very expensive since it allocates all the output of `*` at once with respect to the value of IFS and parse it as one command. It also may cause problems if BIGFILE.TXT contains glob expressions.
Code:
while read A; do ...; done < <(exec grep exp BIGFILE.TXT)
Last edited by konsolebox; 07-04-2011 at 04:10 AM.
That's why I deliberately used the word "consider". It all depends on how portable the script needs to be, and what the coder is willing to deal with. The link I gave fully details the positives and negatives of both forms, so it shouldn't be hard to decide which suits you the best. The very last paragraph even says it explicitly:
Quote:
When should the new test command [[ be used, and when the old one [? If portability to the BourneShell is a concern, the old syntax should be used. If on the other hand the script requires BASH or KornShell, the new syntax is much more flexible.
This applies to all shell features, really. If you can be reasonably certain that a script will never be executed in a non-bash environment, then there's no real reason to stick to POSIX-only syntax. I say take full advantage of what your shell has to offer whenever you can, and don't arbitrarily limit yourself to a less convenient subset of features. POSIX-compliance mode will still be there for when you really need it.
Now I have no idea exactly what the "magic quotes" thing in PHP is, but from your description I don't see the same level of worry happening here. They aren't going out of their way to override the need for quoting everywhere, they simply defined the [[ keyword so that it doesn't perform globbing or word-splitting after expansions. A variable or other substitution is always treated as a single element when inside the double-brackets, no matter what the contents. That's all. It's a specific, localized parsing rule that helps to avoid a lot of syntax problems that plague older tests. It doesn't mean you don't still have to be very careful everywhere else.
Don't read me wrong, either. Of course I agree with you that it's important to learn proper quoting. I just doubt very highly that this single exception is going to lead to the quoting/security slippery-slope-disaster you envision.
"$()" is specified by posix, by the way, so as long as you aren't using a truly ancient shell there's no real reason to use anything else.
(I managed to confuse this thread with the thread at hand; therefore the edit. The other thread talks about system-level scripting: startup scripts, service scripts, cron jobs, for which I recommend the POSIX shell, namely dash . For general scripting, I much prefer Bash, and am happy to use bash-specific features. Except for [[ .)
I would prefer not to advocate the use of [[ in bash for novice users. I personally avoid it in all my scripts, because the more traditional [ and case work well for my needs.
I do prefer bash over any other shell for general utility scripts. If you check out the shell scripts I've written here, they almost invariably use Bash explicitly. My use of `..` is an anachronism I'd prefer to get rid of, but the only big-endian machine I have access to runs SunOS 5.10, which has an ancient sh that does not support $(..). There are members here in a similar situation, so I tend to use `..` instead of $(..).
Quote:
Originally Posted by konsolebox
also, you can't do this properly in sh:
Code:
VALUES='1 2 3 4 * abc[1] 1234?'
for A in $VALUES; do
echo "$A"
done
to yield:
Code:
1
2
3
4
* <- expect different output
Sure you can. You simply add a set -f before the for loop to disable pathname expansion. You can add set +f as the first thing in the loop body, if you need pathname expansion in the loop body. Remember to add set +f after the loop to turn pathname expansion back on.
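A minimal sketch of that, using the same VALUES string as in the quoted post (the wrapper function list_values is only there for illustration):

```shell
#!/bin/sh
VALUES='1 2 3 4 * abc[1] 1234?'

# Hypothetical wrapper: print each word of $VALUES on its own line.
list_values() {
    set -f                  # disable pathname expansion; word splitting still happens
    for A in $VALUES; do
        echo "$A"
    done
    set +f                  # turn pathname expansion back on
}

list_values
```

With pathname expansion off, the *, abc[1] and 1234? words come through literally regardless of which files are in the current directory.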
But I'm still not advocating any POSIX shell over bash in general. (I just recommend using dash for startup scripts and such.)
Quote:
Originally Posted by konsolebox
` also has some issues. i already forgot them though.
Absolutely. I don't advocate its use either. $(..) is superior over `..` , and happens to be standard in POSIX shells too. No reason to not use $(..).
Unless you do use an ancient version of sh, like I do.
Quote:
Originally Posted by konsolebox
some scripts also do this:
Code:
NEWLINE="
"
IFS="$NEWLINE" for A in `grep exp BIGFILE.TXT`; do ... ; done
which is very expensive since it allocates all the output of `*` at once with respect to the value of IFS and parse it as one command. It also may cause problems if BIGFILE.TXT contains glob expressions.
Code:
while read A; do ...; done < <(exec grep exp BIGFILE.TXT)
The obvious alternative in POSIX shells,
Code:
grep exp BIGFILE.TXT | while read -r LINE ; do ... ; done
does stream the grep results line by line, but the loop body is a subshell (making it difficult to pass results outside). The -r parameter tells the shell to not interpret backslash escapes.
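If you need the results outside the loop in a plain POSIX shell, one workaround (a sketch, not the only way) is to feed the loop from a here-document containing a command substitution, so the loop runs in the current shell. The function produce_lines below is a made-up stand-in for grep exp BIGFILE.TXT.

```shell
#!/bin/sh
# Hypothetical stand-in for "grep exp BIGFILE.TXT": emits three lines.
produce_lines() {
    printf '%s\n' 'first line' 'second*' 'third\line'
}

count=0
last=
# The loop is not part of a pipeline, so it runs in the current shell
# and its variable assignments survive after "done".
while IFS= read -r LINE; do
    count=$((count + 1))
    last=$LINE
done <<EOF
$(produce_lines)
EOF

printf 'read %s lines, last was: %s\n' "$count" "$last"
```

The trade-off is that the command's whole output is collected before the loop starts, so this does not stream like the pipe does. Also, if the command prints nothing, the loop still runs once with an empty line, which you may need to check for.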
I personally like the Bash-specific <( list ) a lot. It solves very cleanly the aforementioned result-passing problem. However, the way it does it -- the expression resolves to a file name, with the "file" containing the data -- is perhaps a bit surprising. For example, this command will work just fine:
Code:
dd if=<( for A in one two three ; do echo "$A" ; done )
It is not an actual file, but a path to the read end of the pipe, usually in /proc/self/fd/. For example, command
Code:
echo <( true )
will echo the path to the read end of the pipe instead of accessing the pipe.
Quote:
Originally Posted by David the H.
@NominalAnimal:
That's why I deliberately used the word "consider". It all depends on how portable the script needs to be, and what the coder is willing to deal with.
I misread it as a polite recommendation. I didn't notice the reservation, sorry.
Quote:
Originally Posted by David the H.
The link I gave fully details the positives and negatives of both forms, so it shouldn't be hard to decide which suits you the best. The very last paragraph even says it explicitly [.]
Here I disagree strongly.
The linked page does not even acknowledge the risk in not understanding proper escaping and quoting rules, it just tells users to "do this, and you don't need to worry about it".
(That is exactly what happened with magic quotes in PHP. Magic quotes were a feature that was supposed to help prevent SQL injection attacks. Whatever input a PHP script received from an HTML form had single quotes ('), double quotes ("), backslashes (\) and NULs (zero byte) automatically escaped with a backslash. Sounds perfectly reasonable, doesn't it? And yet, it ended up causing a lot of grief instead.)
I personally would prefer users learned the quoting and escaping rules first (Bash uses POSIX rules AFAIK), and apply them always, even when not technically required. That solves a huge number of problems at once, from whitespace in file names to non-ASCII support.
In all the cases I'm aware of, creating exceptions to common rules to ease programming has resulted in more problems than it has solved. I hate seeing an error repeated.
Quote:
Originally Posted by David the H.
They aren't going out of their way to override the need for quoting everywhere, they simply defined the [[ keyword so that it doesn't perform globbing or word-splitting after expansions. A variable or other substitution is always treated as a single element when inside the double-brackets, no matter what the contents. That's all. It's a specific, localized parsing rule that helps to avoid a lot of syntax problems that plague older tests. It doesn't mean you don't still have to be very careful everywhere else.
You may be right, but I'm personally not convinced. I believe pointing new users at [[ will result in even more fragile scripts in the future, since fewer users will consider quoting rules at all.
I deal with a lot of (job submission) scripts written by a lot of different users, mostly in Bash. They're typically very fragile, mostly due to total ignorance of quoting rules. Most work only because all our path components happen to consist of just letters and numbers. The scripts tend to break at the smallest hint of change. Fortunately, most users seem to reuse known working scripts, so script failures are not common enough to be considered a problem; only when any kind of changes occur.
With very little effort, the scripts could be robust. Being aware of [[ first will reduce the incentive to understanding quoting and escaping rules to practically nil. In beginner scripts, I mostly see commands, variable assignments, and the if conditional. Often beginners perceive the need to quote command parameters as specific to that command -- and of course it is not, it is needed for and used by the shell to determine where the parameter boundaries are. Natively, each parameter is a separate string.
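To make the parameter boundary point concrete, here is a tiny sketch (count_args and the file name are made up for the example). The same thing happens to the arguments of [, which is why an unquoted expansion containing several words ends in "too many arguments".

```shell
#!/bin/sh
# Hypothetical helper: print how many parameters it received.
count_args() {
    echo "$#"
}

path="my file.txt"
unquoted=$(count_args $path)    # the shell splits the value at the space: two parameters
quoted=$(count_args "$path")    # quoting keeps the value together: one parameter

printf 'unquoted: %s, quoted: %s\n' "$unquoted" "$quoted"
```

The command being called never sees the quotes; they only tell the shell where one parameter ends and the next begins.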
Quote:
Originally Posted by David the H.
Don't read me wrong, either. Of course I agree with you that it's important to learn proper quoting. I just doubt very highly that this single exception is going to lead to the quoting/security slippery-slope-disaster you envision.
Sure, and that's ok. I'm often wrong. I did react strongly, because I do find your advice to others helpful, trustworthy and quite valuable.
Last edited by Nominal Animal; 07-04-2011 at 06:14 PM.
Reason: Oops, confused two different threads. Sorry! No change in opinion, though.
I like the comments here http://tldp.org/LDP/abs/html/testcon...ml#DBLBRACKETS and certainly [[ ]] is available in ksh as well as bash; possibly others.
The main rule I recommend for any prod scripts is to always specify the desired shell in the #! line, so that if not found the script simply dies immediately, instead of defaulting to the 'current' shell and possibly doing unexpected things.
Quote:
The linked page does not even acknowledge the risk in not understanding proper escaping and quoting rules, it just tells users to "do this, and you don't need to worry about it".
I am not sure where you got this sentiment from? As with many others I often like / respect anything you have to write,
but here I was not sure we are necessarily looking at the same link??
The page makes constant reference to the fact that if you wish to remain POSIX compliant that you should use [ over [[, eg.
Quote:
[[ is a new improved version of it, which is a keyword, not a program. This has beneficial effects on the ease of use, as shown below. [[ is understood by KornShell and BASH (e.g. 2.03), but not by the older POSIX or BourneShell.
#And
When should the new test command [[ be used, and when the old one [? If portability to the BourneShell is a concern, the old syntax should be used.
It does also mention that there are subtle 'differences' and identifies that quoting need not be done when using [[,
but I did not read this as "should not be done".
Personally I go with the old adage, use the right tool for the job. So to the OP, I will reiterate what others have said
(aside from the above discussion), if POSIX compliance is a must then I agree that [ should be the choice and many have
offered appropriate solutions to work out your issue, but if not, then letting 'if' do the work or [[ take away
the worry of using an empty variable can be alternatives.