I am running a raspberry pi farm and one of the little buggers acts as my media server.
Let's call it raspidlna, shall we?
On occasion, the software I use for this (minidlna) has a breakdown, due to one of many reasons related to the infrastructure in place.
As a big fan of self-healing approaches I have nagios monitoring the status of the http service provided by minidlna, but I would like to take it one step further, and actually monitor the number of video files being served.
This can be obtained through an http call to http://raspidlna:8200 which in turn responds with the following html:
I highlighted in bold the text block related to the Video Files, what I want to read is the sub block in italic and underlined, the number of files.
Now I know we can perform a lot of operations to parse strings using sed, awk, grep and even string substitutions with bash, but I must admit that whenever I try looking at all of it, I get quite a headache, so... that brings me to you experts. Hopefully one of you will have a nice, clean idea on how to work some magic with this...
All I have written so far, script-wise, is:
Code:
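#!/bin/bash
# Fetch the minidlna status page quietly and write it to stdout
content=$(wget http://raspidlna:8200/ -q -O -)
# ???? marks the missing piece: extracting the video count from $content
declare -i count=????
echo "Total Videos $count"
if (( count < 1 )); then
    echo would now run ./startMinidlna.sh
fi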
This might well be beyond the capabilities of Bash. However, utilities exist for parsing HTML. It seems hxselect from the html-xml-utils package is an option. Or pup. Both have references on Google.
I don't have experience with either one, but a lot of experience in Bash. And would not do this in Bash...
This can be done, but it will not be quite straightforward. There are some things in your favor.
#1 the string "Video files" is unique and specific to the line needed, so a simple grep will isolate the correct line.
#2 the format of the line is fixed, so the correct string should always start at the same character position or offset.
#3 once we isolate that string, the first non-numeric character (<) will mark the end of the numeric string we need.
Consider grep pattern matching and BASH string subset addressing and extraction and you have all of the tools needed to pull out the number correctly. Do some reading, then come back here with what you come up with and any questions. You CAN do it.
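As a minimal sketch of the string-extraction idea, assuming the number sits directly between the fixed text `Video files</td><td>` and the next `<` (the row layout quoted elsewhere in this thread), shell parameter expansion alone can pull it out. The function name `video_count_pe` and the sample values are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the video count from status HTML passed as $1,
# using only parameter expansion (no external tools).
video_count_pe() {
    local html marker rest
    html=$1
    marker='Video files</td><td>'   # assumed fixed text right before the number
    # Fail if the marker is absent, so the caller can tell "no data" from 0.
    case $html in
        *"$marker"*) ;;
        *) return 1 ;;
    esac
    rest=${html#*"$marker"}         # drop everything up to and including the marker
    printf '%s\n' "${rest%%<*}"     # keep only what precedes the next "<"
}

# Example with a made-up row:
video_count_pe '<tr><td>Video files</td><td>42</td></tr>'   # prints 42
```

Because this works on the whole string rather than on lines, it should not care whether the page arrives as one long line or many.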
PS: After some thought about #2 and #3, I realized that the only numeric characters on the line are exactly those you need. If we can filter out all non-numeric characters we have the answer even faster. Grep has an option for that.
Something like
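A minimal sketch of the digit-filtering idea, assuming the "Video files" row is on a line of its own and that the count is the only number on that line. The `-o` option makes grep print only the matching part of the line. The function name `video_count` and the sample value are assumptions for illustration, not taken from the thread.

```shell
#!/bin/bash
# Hypothetical helper: read status HTML on stdin, print the video count.
video_count() {
    # First grep isolates the row; second grep keeps only runs of digits;
    # head guards against more than one number turning up.
    grep 'Video files' | grep -o '[0-9][0-9]*' | head -n 1
}

# Example with a made-up row:
printf '%s\n' '<tr><td>Video files</td><td>123</td></tr>' | video_count   # prints 123
```

In the script from the first post, the placeholder could then become something like count=$(printf '%s\n' "$content" | video_count). If the whole page arrives on a single line, this approach would pick up the first number on that line instead, so it depends on the line-break assumption.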
I agree that you can find the line using bash. From there, using awk or tr, you can break it down into fields by selecting < or > as the delimiters, and then use the cut command to get just the number value.
What I'm not sure about is multiline files being stored as a variable. But this is due to my inexperience with that, not anything in particular I know about that part of the topic.
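On the multi-line question, a small sketch may help: as long as the variable is expanded inside double quotes, its embedded newlines are preserved, so line-oriented tools still see separate lines. The sample page below is made up, and the field number 9 holds only for the assumed row layout `<tr><td>Video files</td><td>N</td></tr>` when splitting on `<` and `>`.

```shell
#!/bin/bash
# Made-up multi-line sample standing in for the page held in $content.
content='<table>
<tr><td>Audio files</td><td>7</td></tr>
<tr><td>Video files</td><td>42</td></tr>
</table>'

# Hypothetical helper: split the matching line on "<" and ">" and print
# field 9, which is where the number lands for the assumed row layout.
video_count_awk() {
    awk -F'[<>]' '/Video files/ { print $9; exit }'
}

# Quoting "$content" keeps the newlines intact.
printf '%s\n' "$content" | video_count_awk   # prints 42
```

Leaving the quotes off (echo $content) would collapse the page onto one line, which is the usual source of trouble here.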
EDIT: You can probably just use sed to delete the following, and presumably, fixed terms:
<tr><td>Video files</td><td>
</td></tr>
The minor snag might be some indeterminate number of white spaces left before that, however once again, awk, or cut, and maybe tr will "see" column delimited data and you can find the string value of that number from there.
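The delete-the-fixed-terms idea could be sketched roughly as follows, assuming the row appears exactly as `<tr><td>Video files</td><td>N</td></tr>` on its own line, possibly with leading white space. The function name `video_count_strip` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read status HTML on stdin, print the video count.
video_count_strip() {
    grep 'Video files' |
        # Delete the fixed text before and after the number.
        sed -e 's|<tr><td>Video files</td><td>||' -e 's|</td></tr>||' |
        # Remove any white space (and the trailing newline) that is left over.
        tr -d '[:space:]'
}

# Example with a made-up, indented row:
printf '%s\n' '   <tr><td>Video files</td><td>42</td></tr>' | video_count_strip
```

Using `|` as the sed delimiter avoids having to escape the slashes in the closing tags.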
Just add a "shebang" line as the first line of your script, such as:
Code:
#!/usr/bin/ruby
Now you can write the entire remainder of your script in Ruby, and no one will ever know, nor care, that you did so. You now have at your disposal everything that Ruby brings to the table, which certainly includes an HTML parser.
And of course, you have your pick of languages: Perl, PHP, Python, Ruby, Haskell, Java (ick...) . . . .
"The bash scripting-language is highly overrated." You are not in any way confined to it.
The OP is only interested in the number - not parsing the tags. The grep offered above will do it; personally I would use a single call to sed. If there were multiple stanzas (that entailed summing), awk would be my first instinct, but appears unnecessary here.
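A single sed call along those lines might look like the sketch below. It assumes the count sits between `Video files</td><td>` and the next `<`; the function name `video_count_sed` and the sample input are made up.

```shell
#!/bin/bash
# Hypothetical helper: read status HTML on stdin, print the video count.
video_count_sed() {
    # -n suppresses normal output; the p flag prints only lines where the
    # substitution matched, replaced by the captured digits.
    sed -n 's|.*Video files</td><td>\([0-9][0-9]*\)<.*|\1|p'
}

# Example with several rows on one line:
printf '%s\n' '<tr><td>Audio files</td><td>7</td></tr><tr><td>Video files</td><td>42</td></tr>' | video_count_sed   # prints 42
```

Since the pattern is anchored on the text around the number rather than on the line as a whole, it should cope with the table being on one line or split across several.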
If your version of grep supports perl-compatible regular expressions you could use the -P option with some zero-width assertions. See "man perlre" for the details.
However, that is quite fragile and if the spacing, especially line breaking, changes then it will need to be adjusted. Same goes for other non-parsing solutions.
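As a rough sketch of the zero-width-assertion approach, assuming a grep built with PCRE support (GNU grep with `-P`) and the same `Video files</td><td>N<` layout: a lookbehind pins the match to the right cell, and a lookahead requires the `<` that follows. The function name `video_count_pcre` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read status HTML on stdin, print the video count.
# Requires a grep that supports -P (Perl-compatible regular expressions).
video_count_pcre() {
    # (?<=...) and (?=...) match positions only, so -o prints just the digits.
    grep -oP '(?<=Video files</td><td>)[0-9]+(?=<)'
}

# Example with a made-up row:
printf '%s\n' '<tr><td>Video files</td><td>42</td></tr>' | video_count_pcre   # prints 42
```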
For robustness you might use a proper XHTML processor instead like one of the XPath tools. Either of the following XPaths should work
XPath and regex are easy enough to do in Perl with the help of the HTML::TreeBuilder::XPath module from CPAN. Ruby, mentioned above, also has XPath modules, and I expect many of the other scripting languages do too.
so the HTML is provided by nagios?
i'm not sure how the number of video files served relates to minidlna crashing or not, but anyhow:
wouldn't it be cleverer to explore the possibilities of the software that is providing this information? nagios?
what you are doing there makes for some nice code golfing, but really it's just a hacky duct tape approach.
It's simply easier to use a language system which is readily available and which already includes such niceties as an HTML parser, regular-expression support, and so on.
Quote:
Actum Ne Agas: Do Not Do A Thing Already Done.
Thanks to "#!shebang," Bash makes it equally easy for you to write a "Bash script" in whatever is the most-appropriate scripting language. Only Dr. Korn's shell endeavored to build a "serious" programming language as its built-in scripting engine, and "#!shebang" is frankly a more-elegant way to do it.
As one author put it, "It's kind of like building a particularly-elegant archway over the front door of a supermarket. You might look upon it and even be proud of it, but you might not want to admit to having done it."
Not sure if this will help, but "JSON is a minimal readable format for structuring data. It's used primarily to transmit data between a server and web apps as an alternative to XML." Python has a built-in JSON library that can "pretty print" JSON output, which makes it easier to find a specific entry. Just use python -m json.tool to indent and organize the JSON output, e.g. cat test.json | python -m json.tool. For more advanced JSON parsing you can install jq, which has options to extract specific values from JSON input. In that case just pipe the output to jq instead...
Last edited by justmy2cents; 09-01-2017 at 01:07 AM.