LinuxQuestions.org
02-10-2017, 02:16 PM   #16
herpplederppleston
LQ Newbie
 
Registered: Feb 2017
Posts: 8

Original Poster
Rep: Reputation: Disabled

Quote:
Originally Posted by TB0ne
No worries, and yes, research is always the key to learning better. The options can get confusing, and we're always happy to explain things if you're stuck. The man pages are always a good starting point...unless you try looking at the ones for sed and awk, both of which can be hideously complicated, and are both very powerful commands. I think there's even an entire book written on them:
http://shop.oreilly.com/product/9781565922259.do

As a hint, look at the "-F" flag for awk, then look at your input string. See anything common at the beginning/end of what you're after that you can use as a field-separator?
Haha wow, I bet that's a fun read

I suppose the forward slashes would be usable as a field separator? I know I could use something like the regex below to define what counts as a valid IP address:

(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
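A pattern like this can be tried out with grep -E before it goes into a larger command. The sketch below is only an illustration: the octet variable and the is_ipv4 function are names made up here, and the ^ and $ anchors are an addition so that the whole string has to match, not just part of it.

```shell
# One octet, 0-255, as in the regex above (hypothetical variable name).
octet='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'

# is_ipv4 is a hypothetical helper: it succeeds only if the whole argument
# is a dotted quad with every octet in range.
is_ipv4() {
    printf '%s\n' "$1" | grep -Eq "^${octet}\.${octet}\.${octet}\.${octet}\$"
}

for candidate in 192.168.3.4 256.1.1.1 some.host.com; do
    if is_ipv4 "$candidate"; then
        echo "$candidate: valid"
    else
        echo "$candidate: not valid"
    fi
done
```

Without the anchors, grep would also accept a line that merely contains something address-like, such as the tail of 256.1.1.1.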
 
02-10-2017, 02:33 PM   #17
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,269
Blog Entries: 24

Rep: Reputation: 4196
Not as bad as it looks, and very much worth learning.

As you are actually learning and paying attention, perhaps a working example will encourage you to follow through.

Here is a simple awk that does the trick:

Code:
echo 'https://192.168.3.4/random/directories/here/' |awk -F\/ '{print $3}'
192.168.3.4
Hopefully you can see that it does work, and then when you figure out why it works you'll feel empowered to learn more.

Awk is the go-to tool for most text extractions and manipulations.

Regular expressions are the foundation for most of the real power of text manipulations, and sed puts those right at your fingertips. Here is a sed that produces the same result.

Code:
echo 'https://192.168.3.4/random/directories/here/' |sed 's/.*\/\/\([^\/]*\).*/\1/'
192.168.3.4
Not so many slashes, but it does not enforce "valid IP address" and works equally well to extract a host name:

Code:
echo 'https://some.host.com/random/directories/here/' |sed 's/.*\/\/\([^\/]*\).*/\1/'
some.host.com
Try to figure out why it works; it is not so difficult to understand!
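For anyone who wants to check their reading afterwards, here is one way to annotate the sed expression piece by piece. It is a sketch, not the only way to read it, and host_from_url is a hypothetical wrapper name that does not appear in the thread.

```shell
# host_from_url is a hypothetical wrapper around the sed command above.
host_from_url() {
    #  .*\/\/        everything up to and including the "//"
    #  \([^\/]*\)    capture group 1: a run of characters that are not a slash
    #  .*            whatever is left on the line
    #  \1            replace the whole line with what group 1 captured
    printf '%s\n' "$1" | sed 's/.*\/\/\([^\/]*\).*/\1/'
}

host_from_url 'https://192.168.3.4/random/directories/here/'
host_from_url 'https://some.host.com/random/directories/here/'
```

Because group 1 only says "not a slash", it captures whatever sits between the "//" and the next "/", which is why host names pass through as readily as IP addresses.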

And just to expand on TB0ne's comments - the rules are there to keep things helpful and friendly here at LQ. If you are looking for someone to do your homework for you, this is not the place.

On the other hand, if you are learning and growing and want to share in the experience - it is all good!

Welcome to LQ and good luck!
 
02-10-2017, 02:50 PM   #18
herpplederppleston
LQ Newbie
 
Registered: Feb 2017
Posts: 8

Original Poster
Rep: Reputation: Disabled
Ah, thank you very much. I'll admit awk and sed are commands I'm not very familiar with, but clearly I should look into them some more! I'm definitely very much a noob and have only been learning Linux for a few months, as I'm self-teaching skills for pentesting.
Definitely here to learn/grow/share in the experience, so it seems like I came to the right place!

Thanks all for your input
 
02-10-2017, 03:34 PM   #19
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 26,681

Rep: Reputation: 7971
Astrogeek nailed it. Your regex for identifying a valid IP address is a great start...and regexes can be VERY daunting. The plus is, once you get comfortable with them, you can start using them in all sorts of ways, including in sed/awk...even in cron jobs:
http://www.linuxquestions.org/questi...2/#post4654486

The complexity and flexibility of *nix is truly awesome, compared to other OSes.
 
02-10-2017, 03:44 PM   #20
herpplederppleston
LQ Newbie
 
Registered: Feb 2017
Posts: 8

Original Poster
Rep: Reputation: Disabled
Yeah, I wish I could claim I figured that out on my own, but that was one clue I found via Google to help formulate the correct command, even though I understand how it works and how it allows for a valid IP.

echo 'https://192.168.3.4/random/directories/here/' |awk -F\/ '{print $3}'

I completely understand this command up until $3
I know $ signs signify variables (?), but I don't understand why the number 3 is used?
 
02-11-2017, 12:18 AM   #21
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,328
Blog Entries: 3

Rep: Reputation: 3726
Quote:
Originally Posted by herpplederppleston
I completely understand this command up until $3
I know $ signs signify variables (?), but I don't understand why the number 3 is used?
The $3 stands for the third field in the line, with fields delimited by the separator given to -F. Here the separator is a single slash /, so awk treats each span of characters between slashes as a field. In your input, the first field is "https:", the second is the empty string between the two slashes of "//", and the third, the span after the second slash, is the address. If you use a pattern instead, you can treat a whole run of slashes as one delimiter. The -F can be an actual pattern (a regular expression) in most (all?) versions of awk.
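A quick way to see the numbering is to have awk print every field next to its index. This is a minimal sketch: show_fields and the url variable are names invented for the illustration, and the second call assumes an awk that treats a multi-character -F value as a regular expression, which POSIX awk does.

```shell
url='https://192.168.3.4/random/directories/here/'

# show_fields is a hypothetical helper: print every field with its number,
# using the separator passed as the second argument.
show_fields() {
    printf '%s\n' "$1" | awk -F"$2" '{ for (i = 1; i <= NF; i++) printf "$%d = [%s]\n", i, $i }'
}

show_fields "$url" '/'     # single slash: the host is $3, and $2 is empty
show_fields "$url" '/+'    # run of slashes as one separator: the host is $2
```

The empty brackets in the first listing are the field that sits between the two slashes of "//"; the trailing slash also leaves an empty last field.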

Once you've identified the field, you can then work on the field further.
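As one example of working on the field further, the third field can be tested against the octet regex posted earlier in the thread, so that only real IPv4 addresses are printed. This is a sketch under assumptions: ip_from_url and the awk variables o and re are illustrative names, and the regex is built by string concatenation to avoid relying on interval expressions such as {3}, which some older awks lack.

```shell
# ip_from_url is a hypothetical helper: print the host part of a URL
# only when it is a valid dotted-quad IPv4 address.
ip_from_url() {
    printf '%s\n' "$1" | awk -F/ '
        BEGIN {
            # one octet (0-255), taken from the regex earlier in the thread
            o  = "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
            re = "^" o "\\." o "\\." o "\\." o "$"
        }
        $3 ~ re { print $3 }'
}

ip_from_url 'https://192.168.3.4/random/directories/here/'    # prints the address
ip_from_url 'https://some.host.com/random/directories/here/'  # prints nothing
```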

Keep checking the manual page for awk; it's a good reference for the language and will make more and more sense. There are also some good books on sed and awk, so check your local college technical or engineering library. IFF the library is any good, there will be at least one such book available.

One caveat for the task: what is allowed in a URL is not universally agreed upon, so there is some wiggle room.

Regarding sed, the character delimiting the search and replace patterns can be almost anything, as long as the same one is used all three times. So if you are working with a lot of slashes in your patterns, you can use something else, like a pound sign #, a pipe |, or an exclamation mark !:

Code:
echo "$URL" | sed -e 's#^.*//##; s#/.*$##;'
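To see that the delimiter choice changes only the readability and not the result, the same two substitutions can be written three ways and compared. This is a sketch: the sample value of URL and the function names with_slash, with_hash, and with_pipe are assumptions made for the illustration.

```shell
URL='https://192.168.3.4/random/directories/here/'   # sample value, assumed for illustration

# The same two substitutions, written with three different delimiters.
# First: delete everything through the "//". Second: delete from the next "/" on.
with_slash() { printf '%s\n' "$1" | sed -e 's/^.*\/\///; s/\/.*$//'; }
with_hash()  { printf '%s\n' "$1" | sed -e 's#^.*//##; s#/.*$##'; }
with_pipe()  { printf '%s\n' "$1" | sed -e 's|^.*//||; s|/.*$||'; }

with_slash "$URL"
with_hash  "$URL"
with_pipe  "$URL"
```

All three print the same host; only the first needs a backslash in front of every slash in the pattern.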
 
02-11-2017, 03:16 AM   #22
herpplederppleston
LQ Newbie
 
Registered: Feb 2017
Posts: 8

Original Poster
Rep: Reputation: Disabled
Got it, thanks for explaining that
 
  


All times are GMT -5.
