Find command question

herpplederppleston · 02-10-2017, 02:16 PM

Quote:

Originally Posted by TB0ne

No worries, and yes, research is always the key to learning better. The options can get confusing, and we're always happy to explain things if you're stuck. The man pages are always a good starting point...unless you try looking at the ones for sed and awk, both of which can be hideously complicated, and are both very powerful commands. I think there's even an entire book written on them:
http://shop.oreilly.com/product/9781565922259.do

As a hint, look at the "-F" flag for awk, then look at your input string. See anything common at the beginning/end of what you're after that you can use as a field-separator?

Haha wow, I bet that's a fun read

I suppose the forward slashes would be usable as a field separator? I know I could use something like the below to define what a valid IP address is:

(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
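
For what it's worth, here is a minimal sketch of how that pattern could be put to work, assuming a grep that supports -E (extended regular expressions). The names octet, ip_re and is_ipv4 are made up for illustration, and the ^ and $ anchors are an addition so that the whole string has to match:

```shell
# One octet (0-255), taken from the pattern above.
octet='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'

# Four octets separated by literal dots, anchored at both ends.
ip_re="^${octet}\.${octet}\.${octet}\.${octet}\$"

# Hypothetical helper: succeeds only if its argument is a dotted-quad IPv4 address.
is_ipv4() {
    printf '%s\n' "$1" | grep -Eq "$ip_re"
}

is_ipv4 192.168.3.4   && echo "192.168.3.4 accepted"
is_ipv4 192.168.3.400 || echo "192.168.3.400 rejected"
```

Without the anchors, grep would also accept lines that merely contain something IP-shaped, which may or may not be what you want.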

astrogeek · 02-10-2017, 02:33 PM

Not as bad as it looks, and very much worth learning.

As you are actually learning and paying attention, perhaps a working example will encourage you to follow through.

Here is a simple awk that does the trick:

Code:

echo 'https://192.168.3.4/random/directories/here/' |awk -F\/ '{print $3}'
192.168.3.4

Hopefully you can see that it does work, and then when you figure out why it works you'll feel empowered to learn more.

Awk is the go-to tool for most text extractions and manipulations.

Regular expressions are the foundation for most of the real power of text manipulations, and sed puts those right at your fingertips. Here is a sed that produces the same result.

Code:

echo 'https://192.168.3.4/random/directories/here/' |sed 's/.*\/\/\([^\/]*\).*/\1/'
192.168.3.4

Not so many slashes, but it does not enforce "valid IP address" and works equally well to extract a host name:

Code:

echo 'https://some.host.com/random/directories/here/' |sed 's/.*\/\/\([^\/]*\).*/\1/'
some.host.com

Try to figure out why it works; it is not so difficult to understand!
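
If it helps, here is the same sed command with each piece of the expression annotated. The second form is only a sketch of an equivalent: it assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed both do) and uses # as the delimiter so the slashes need no escaping. The variable name url is just for illustration.

```shell
url='https://some.host.com/random/directories/here/'

# s/ PATTERN / REPLACEMENT /
#   .*\/\/       everything up to and including the "//"
#   \([^\/]*\)   group 1: the run of non-slash characters that follows
#   .*           whatever is left on the line
#   \1           replace the whole match with group 1 only
printf '%s\n' "$url" | sed 's/.*\/\/\([^\/]*\).*/\1/'

# Same idea, assuming sed -E is available; # as delimiter, no escaped slashes.
printf '%s\n' "$url" | sed -E 's#.*//([^/]*).*#\1#'
```

Both lines print some.host.com for this input.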

And just to expand on TB0ne's comments - the rules are there to keep things helpful and friendly here at LQ. If you are looking for someone to do your homework for you, this is not the place.

On the other hand, if you are learning and growing and want to share in the experience - it is all good!

Welcome to LQ and good luck!

herpplederppleston · 02-10-2017, 02:50 PM

Ah, thank you very much. I'll admit awk and sed are commands I'm not very familiar with, but clearly I should look into them some more! I'm definitely very much a noob and have only been learning Linux for a few months, as I'm self-teaching skills for pentesting.
Definitely here to learn/grow/share in the experience, so it seems like I came to the right place!

Thanks all for your input

TB0ne · 02-10-2017, 03:34 PM

Astrogeek nailed it. Your regex for identifying a valid IP address is a great start...and regexes can be VERY daunting. The plus is, once you get comfortable with them, you can start using them in all sorts of ways, including in sed/awk...even in cron jobs:
http://www.linuxquestions.org/questi...2/#post4654486

The complexity and flexibility of *nix are truly awesome compared to other OSes.

herpplederppleston · 02-10-2017, 03:44 PM

Yeah, I wish I could claim I figured that out on my own, but it was one clue I found via Google to potentially help formulate the correct command, even though I do understand how it works and how it matches a valid IP.

echo 'https://192.168.3.4/random/directories/here/' |awk -F\/ '{print $3}'

I completely understand this command up until $3
I know $ signs signify variables (?), but I don't understand why the number 3 is used?

Turbocapitalist · 02-11-2017, 12:18 AM

Quote:

Originally Posted by herpplederppleston

I completely understand this command up until $3
I know $ signs signify variables (?), but I don't understand why the number 3 is used?

The $3 stands for the third field in the line, fields being delimited by the pattern specified by -F. In this case you've set awk to define fields as spans of characters separated by a single slash /. Counting from the left, the first field is "https:", the second is the empty span between the two slashes of "//", and the third, the span after the second slash, is the IP address. If you use a pattern instead, you can treat a whole run of slashes as one delimiter. The -F can be an actual pattern in most (all?) versions of awk.
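
A quick way to see the field numbering for yourself is to have awk print every field with its index. This is only a sketch; the variable name url is made up, and the '/+' pattern in the second command assumes an awk whose -F accepts a regular expression.

```shell
url='https://192.168.3.4/random/directories/here/'

# Single slash as the separator: print each field next to its number.
# NF is awk's built-in count of fields on the current line.
printf '%s\n' "$url" | awk -F/ '{ for (i = 1; i <= NF; i++) printf "$%d = [%s]\n", i, $i }'

# One or more slashes as the separator: "//" now counts as a single
# delimiter, so the address moves from $3 to $2.
printf '%s\n' "$url" | awk -F'/+' '{ print $2 }'
```

With the single slash, $2 comes out empty and $3 holds 192.168.3.4; with the '/+' pattern the same address is in $2.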

Once you've identified the field, you can then work on the field further.
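
As one hedged example of working on the field further, awk can test the third field against a pattern before printing it. The pattern here is deliberately loose (four dot-separated runs of digits, with no range check on each octet) and the function name print_numeric_hosts is hypothetical:

```shell
# Print the host part only when it looks like a dotted quad.
# Loose check: digits.digits.digits.digits, octet ranges are NOT verified.
print_numeric_hosts() {
    awk -F/ '$3 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print $3 }'
}

printf '%s\n' \
    'https://192.168.3.4/random/directories/here/' \
    'https://some.host.com/random/directories/here/' |
    print_numeric_hosts
```

Only the first URL passes the test, so only 192.168.3.4 is printed.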

Keep checking the manual page for awk. It's a good reference work for the language and will make more and more sense over time. There are also some good books on sed and awk, so check your local college technical or engineering library. IFF the library is any good then there will be at least one such book available.

Some wiggle room in that task is that what is allowed in a URL is not universally agreed upon.

Regarding sed, the delimiter around the search and replace patterns can be almost any character, as long as the same one is used in all three positions. So if you are working with a lot of slashes in your patterns then you can use something else, like a pound sign #, a pipe | or an exclamation mark:

Code:

echo "$URL" | sed -e 's#^.*//##; s#/.*$##'
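
To make the two substitutions easier to follow, here is a sketch that runs them one at a time and shows the intermediate result. The variable names step1 and step2 are only for illustration, and URL is assumed to hold the address from earlier in the thread.

```shell
URL='https://192.168.3.4/random/directories/here/'

# First substitution: delete everything up to and including the "//".
step1=$(printf '%s\n' "$URL" | sed -e 's#^.*//##')
printf '%s\n' "$step1"    # 192.168.3.4/random/directories/here/

# Second substitution: delete from the first remaining slash to the end.
step2=$(printf '%s\n' "$step1" | sed -e 's#/.*$##')
printf '%s\n' "$step2"    # 192.168.3.4
```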

herpplederppleston · 02-11-2017, 03:16 AM

Got it, thanks for explaining that