LinuxQuestions.org
Old 01-26-2011, 11:24 AM   #1
zer0signal
Member
 
Registered: Oct 2010
Location: Cleveland
Distribution: Slackware, Fedora, RHEL (4,5), LFS 6.7, CentOS
Posts: 258

Rep: Reputation: 29
Question Bash Scripting - Output as Multiple Files


OK, here is what I am trying to do. I have written a one-line command that parses a file, locates the IP addresses in it, trims the output the way I want it, sorts it numerically, removes duplicates, and then appends (>>) the result to output.txt.

I can get all the IPs into one file, output.txt, but what I am really looking for is a way to create a text file for each IP it finds, named xxx.xxx.xxx.xxx.txt, and also put that IP address into that file.

xxx.xxx.xxx.xxx = the IP address it finds

Can anyone offer suggestions on the best approach for this?

Thanks
 
Old 01-26-2011, 11:38 AM   #2
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
If you have the IP stored in a variable (using command substitution) you can simply do
Code:
echo "$ip" > "$ip.txt"
What is the command line you mentioned? If using awk it can be even more straightforward.
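A minimal sketch of that idea in a loop, assuming the IPs are already in a file with one address per line. The file name list.txt matches the thread, but the scratch directory and the sample addresses are made up for illustration:

```shell
#!/bin/sh
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data standing in for the real list of IPs.
printf '%s\n' 10.0.0.1 192.168.0.7 > list.txt

# Read one IP per line and write it into a file named after it.
while IFS= read -r ip; do
    [ -n "$ip" ] || continue           # skip blank lines
    printf '%s\n' "$ip" > "$ip.txt"    # quoting keeps name and value intact
done < list.txt

ls
```

Quoting "$ip" costs nothing and protects against odd input such as a stray space in a line.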
 
Old 01-26-2011, 11:43 AM   #3
zer0signal
Member
 
Registered: Oct 2010
Location: Cleveland
Distribution: Slackware, Fedora, RHEL (4,5), LFS 6.7, CentOS
Posts: 258

Original Poster
Rep: Reputation: 29
Quote:
What is the command line you mentioned? If using awk it can be even more straightforward.
I am using awk to specify my separator.

cat /var/log/secure | grep "Failed password" | awk -F'from' '{ print $2 }' | cut -d" " -f2 | sort -n -u > list.txt
 
Old 01-26-2011, 12:26 PM   #4
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
awk can do everything that grep and cut do here, so your line can be condensed into:
Code:
awk '
/Failed password/ {
  ip = gensub(/.*from ([[:digit:].]+) .*/,"\\1","g")
  if ( ! _[ip]++ ) {  
    print ip > (ip ".txt")
    print ip
  }
}' /var/log/secure | sort -n > list.txt
This creates one file per IP and writes the whole list of IPs (without duplicates) into list.txt. Feel free to ask for an explanation if something is not clear.
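Note that gensub() is specific to GNU awk. For other awk implementations, the same extraction could be sketched with match() and substr(), which are standard. In the sketch below, secure.sample and its log lines are made up for illustration (modelled on what sshd typically writes), seen is just a more readable name for the _ array, and close() is an addition so that a log with many distinct IPs does not leave too many files open:

```shell
#!/bin/sh
# Work in a scratch directory with a hypothetical sample log.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > secure.sample <<'EOF'
Jan 26 11:00:01 host sshd[101]: Failed password for root from 10.0.0.5 port 4242 ssh2
Jan 26 11:00:03 host sshd[102]: Failed password for invalid user bob from 192.168.0.1 port 5151 ssh2
Jan 26 11:00:09 host sshd[103]: Failed password for root from 10.0.0.5 port 4243 ssh2
Jan 26 11:00:12 host sshd[104]: Accepted password for alice from 172.16.0.9 port 6161 ssh2
EOF

awk '
/Failed password/ && match($0, /from [0-9.]+ /) {
    # RSTART and RLENGTH describe the matched "from <ip> " text:
    # skip the 5 characters of "from " and drop the trailing space.
    ip = substr($0, RSTART + 5, RLENGTH - 6)
    if (!seen[ip]++) {        # true only the first time this IP shows up
        file = ip ".txt"
        print ip > file       # one file per IP
        close(file)           # release the file handle right away
        print ip              # standard output feeds the complete list
    }
}' secure.sample > list.txt

cat list.txt
```

A line that contains "Failed password" but no "from <digits and dots> " part is simply skipped here, whereas gensub() returns the whole line unchanged when its pattern does not match.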
 
Old 01-26-2011, 12:46 PM   #5
zer0signal
Member
 
Registered: Oct 2010
Location: Cleveland
Distribution: Slackware, Fedora, RHEL (4,5), LFS 6.7, CentOS
Posts: 258

Original Poster
Rep: Reputation: 29
Uh... OK? lol. I really don't know much awk, and I'm just learning to script. Would you mind breaking that down so I can understand the logic and flow? If not, I can research on the web what is actually being stated there. I am able to work out some of it myself. =)


awk ' <-- that seems to start the awk statement

/Failed password/ { <-- parsing ID

ip = gensub(/.*from ([[:digit:].]+) .*/,"\\1","g") <-- specifies the variable and the next delimiter... don't know what the rest of the line is.

if ( ! _[ip]++ ) { <- if statement for the variable ip

print ip > (ip ".txt") <-- print each variable to a .txt file with the variable as the file name

print ip <-- ?
}
}' /var/log/secure | sort -n > list.txt <-- dumping secure log to list.txt file?


Sorry, this looks really sloppy.

Last edited by zer0signal; 01-26-2011 at 01:06 PM.
 
Old 01-26-2011, 02:22 PM   #6
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
Well, here is my explanation. An awk rule is made of
Code:
pattern { action }
In my example, we have a single rule whose pattern is /Failed password/. This means that the action (that is, the code inside the braces) is executed only for those lines matching the regular expression. This accomplishes the task of grep in your code.
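To see the pattern and the action at work in isolation, here is a small sketch on three made-up log lines. The action prints the fourth whitespace-separated field, which in these sample lines happens to be the user name:

```shell
#!/bin/sh
# Only lines matching the pattern reach the action in braces.
users=$(printf '%s\n' \
    'Failed password for root from 10.0.0.5 port 4242 ssh2' \
    'Accepted password for alice from 172.16.0.9 port 6161 ssh2' \
    'Failed password for bob from 192.168.0.1 port 5151 ssh2' |
    awk '/Failed password/ { print $4 }')

printf '%s\n' "$users"
```

The "Accepted password" line never triggers the action, which is the same filtering that grep did in the original pipeline.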

First we have to extract the IP address from the line. I don't know exactly what the lines in your secure file look like, but I can guess based on your code. The gensub function (a GNU awk extension) performs substitutions in a string. Here we want to discard every part of the line except the IP address:
Code:
ip = gensub(/.*from ([[:digit:].]+) .*/,"\\1","g")
The regular expression matches any string followed by "from" and a space, then any string made of digits and dots (a rough expression to match IP addresses), and finally a space followed by any other string. Note the parentheses around the part matching the IP address: their purpose is to remember the matched text. The whole line can then be replaced by the matched IP address using "\\1" as the replacement string, where 1 refers to the first remembered group. A regular expression may contain several parenthesized groups to retain different parts of the string, in which case \\2, \\3 and so on can be used as well. Please refer to the GNU awk manual for more details.
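As a small illustration of two groups at once, the sketch below captures both the IP and the port from a made-up log line and joins them with a colon. It assumes gawk is installed under that name and falls back to a message otherwise; the third argument 1 asks for the first match only, which is equivalent to "g" here because the pattern consumes the whole line:

```shell
#!/bin/sh
# Hypothetical log line in the sshd style.
line='Jan 26 11:00:01 host sshd[101]: Failed password for root from 10.0.0.5 port 4242 ssh2'

if command -v gawk >/dev/null 2>&1; then
    # \\1 is the first parenthesized group (IP), \\2 the second (port).
    result=$(printf '%s\n' "$line" |
        gawk '{ print gensub(/.*from ([0-9.]+) port ([0-9]+).*/, "\\1:\\2", 1) }')
else
    result='gawk not found'
fi

echo "$result"
```

If the pattern does not match at all, gensub() returns the original string unchanged, so it can be worth checking the result before using it as a file name.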

Now that we have extracted the IP address with reasonable confidence, we want both to write it into a file (named after the IP address itself) and to add it to a complete list. For the list we simply print to standard output, since shell redirection will take care of the file later. The first task is accomplished by:
Code:
print ip > (ip ".txt")
where the file name is simply the concatenation, inside parentheses, of the content of the ip variable and the string .txt. The second task is even simpler:
Code:
print ip
However, we want to avoid duplicates: the complete list should not contain the same address more than once, and the .txt files should not be written multiple times. We want awk to print the ip variable only the first time it holds a particular IP address.

First, keep in mind that in awk true is any non-zero number or any non-empty string, whereas false is 0 or the null string. Here
Code:
_[ip]++
an array element (the array is named with a single underscore for brevity) whose index is the current IP address is incremented by one. Note the C-style ++ after the variable name. It means that the variable is evaluated as is and then incremented by one. The opposite would have been
Code:
++_[ip]
where the variable is first incremented and then evaluated. This subtle difference lets the expression evaluate to 0 (false) the first time we touch the ip-th element of the array, and to some other number (true) on every later occasion. It's difficult to explain this piece of code, but I hope it's a little clearer now.
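The difference between the two forms can be checked with a tiny sketch. The variable names a, b, x and y are arbitrary:

```shell
#!/bin/sh
out=$(awk 'BEGIN {
    a = 0; b = 0
    x = a++     # post-increment: x gets the old value 0, then a becomes 1
    y = ++b     # pre-increment: b becomes 1 first, then y gets 1
    print x, y, a, b
}')

echo "$out"    # prints: 0 1 1 1
```

Both variables end up as 1; only the value handed to the surrounding expression differs.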

However, we want a true condition only the first time the IP is encountered, in order to print it. Hence we have to invert the logical expression using the not operator (in awk, an exclamation mark):
Code:
if ( ! _[ip]++ )
and the trick is done!
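The same idiom works on any list of lines. A short sketch with made-up addresses, using seen as a more descriptive name for the array:

```shell
#!/bin/sh
# Keep only the first occurrence of each line, preserving input order.
unique=$(printf '%s\n' 10.0.0.5 192.168.0.1 10.0.0.5 192.168.0.1 10.0.0.5 |
    awk '{ if (!seen[$0]++) print }')

printf '%s\n' "$unique"
```

Unlike sort -u, this does not reorder the input, which is why the sort step is still applied afterwards in the full command.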

Finally, following your code, we want to sort the output numerically (note that duplicates have already been handled) and write it to the list.txt file:
Code:
... | sort -n > list.txt
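One caveat on sort -n with dotted addresses: it reads only the leading number of each line (treating 192.168 as a decimal), so the result is not a true octet-by-octet order. A possible refinement, sketched here with made-up addresses, is to sort numerically on each dot-separated field:

```shell
#!/bin/sh
# -t . splits on dots; each -k N,Nn sorts field N numerically.
sorted=$(printf '%s\n' 10.0.0.10 192.168.0.1 10.0.0.2 9.255.0.1 |
    LC_ALL=C sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n)

printf '%s\n' "$sorted"
```

With plain sort -n, 10.0.0.10 would land before 10.0.0.2, because the two lines tie on the leading 10.0 and are then compared as text.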

Last edited by colucix; 01-26-2011 at 02:27 PM. Reason: spelling corrected (hopefully)
 
2 members found this post helpful.
Old 01-26-2011, 03:16 PM   #7
zer0signal
Member
 
Registered: Oct 2010
Location: Cleveland
Distribution: Slackware, Fedora, RHEL (4,5), LFS 6.7, CentOS
Posts: 258

Original Poster
Rep: Reputation: 29
Quote:
_[ip]++
So this is creating a scaling variable based on how many IPs it finds... So variable would be ip1,ip2,ip3,ip4,ip5,ip6 etc. Until there is no more data for the variables?
 
Old 01-26-2011, 03:31 PM   #8
corp769
LQ Guru
 
Registered: Apr 2005
Location: /dev/null
Posts: 5,818

Rep: Reputation: 1007
Quote:
Originally Posted by zer0signal View Post
So this is creating a scaling variable based on how many IPs it finds... So variable would be ip1,ip2,ip3,ip4,ip5,ip6 etc. Until there is no more data for the variables?
Yes. Kind of like in C, var1 = var1++
 
Old 01-26-2011, 03:41 PM   #9
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
Quote:
Originally Posted by zer0signal View Post
So this is creating a scaling variable based on how many IPs it finds... So variable would be ip1,ip2,ip3,ip4,ip5,ip6 etc. Until there is no more data for the variables?
Nope. It will assign 1, 2, 3 and so on to the array element with index ip:
Code:
_[ip]
An uninitialized variable in awk has the value 0, so the first time the element is evaluated it returns 0. The second time it is evaluated, it returns 1, since it was incremented in the previous pass.
Code:
_[ip] = 0
_[ip] = 1
_[ip] = 2
Keep in mind that in awk the array index can be any string (not only a number). Suppose you read the IP 192.168.0.1. The first time you evaluate the element indexed by that IP you have:
Code:
_["192.168.0.1"] = 0
After that the ++ notation increments its value by one. The next time you encounter the same IP address you have
Code:
_["192.168.0.1"] = 1
and then again it's incremented by one, and so on.
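To watch the values change, here is a small sketch that prints what _[ip]++ evaluates to each time an address comes by. The addresses are made up:

```shell
#!/bin/sh
# For every input line, show the value the element had before the increment.
counts=$(printf '%s\n' 192.168.0.1 10.0.0.5 192.168.0.1 192.168.0.1 |
    awk '{ printf "%s seen before: %d\n", $0, _[$0]++ }')

printf '%s\n' "$counts"
```

Each address gets its own counter, starting from 0, so 10.0.0.5 is not affected by how often 192.168.0.1 has appeared.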
 
1 members found this post helpful.
Old 01-26-2011, 04:27 PM   #10
zer0signal
Member
 
Registered: Oct 2010
Location: Cleveland
Distribution: Slackware, Fedora, RHEL (4,5), LFS 6.7, CentOS
Posts: 258

Original Poster
Rep: Reputation: 29
OK, that makes it a lot clearer. I appreciate your advice and explanation on this subject! =) I have been going over the GNU awk page. Definitely something I'm going to dig deeper into, because of its abilities with text parsing and output!

Thanks again! =)
 
Old 01-26-2011, 04:30 PM   #11
corp769
LQ Guru
 
Registered: Apr 2005
Location: /dev/null
Posts: 5,818

Rep: Reputation: 1007
Quote:
Originally Posted by colucix View Post
Nope. It will assign 1, 2, 3 and so on to the array element with index ip [...]
My fault, I knew what to say, just didn't say it correctly. I do that a lot :P
 
Old 01-26-2011, 04:33 PM   #12
zer0signal
Member
 
Registered: Oct 2010
Location: Cleveland
Distribution: Slackware, Fedora, RHEL (4,5), LFS 6.7, CentOS
Posts: 258

Original Poster
Rep: Reputation: 29
Ha! =) It's cool, just trying to grasp the concept =)
 
Old 01-26-2011, 04:37 PM   #13
corp769
LQ Guru
 
Registered: Apr 2005
Location: /dev/null
Posts: 5,818

Rep: Reputation: 1007
Do you fully understand it now?
 
Old 01-26-2011, 06:12 PM   #14
zer0signal
Member
 
Registered: Oct 2010
Location: Cleveland
Distribution: Slackware, Fedora, RHEL (4,5), LFS 6.7, CentOS
Posts: 258

Original Poster
Rep: Reputation: 29
Quote:
Originally Posted by corp769 View Post
Do you fully understand it now?
For the most part I can decipher what is actually going on. Will I be able to pull this out of thin air the next time I have to code something like this? No, lol, but that is to be expected when starting off. But I understand what is actually going on. =)
 
Old 01-26-2011, 06:43 PM   #15
corp769
LQ Guru
 
Registered: Apr 2005
Location: /dev/null
Posts: 5,818

Rep: Reputation: 1007
That's good. It took me a while to get awk and gawk down to a science....
 
  



Tags
bash, multiple, output


