LinuxQuestions.org
Old 04-05-2019, 04:21 PM   #1
newtolinux2020
LQ Newbie
 
Registered: Apr 2019
Posts: 8

Rep: Reputation: Disabled
script to extract IP address from a honeypot log txt file


grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}" */pentbox/other/log_honeypot.txt > IP.txt

I'm using the above code to extract the IP addresses off pentbox honeypot but it creates the file and nothing is in it. Can anyone help me please?
 
Old 04-05-2019, 05:12 PM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,126

Rep: Reputation: 4120
The regex looks OK - look at the rest of the command. Any error messages?
 
Old 04-05-2019, 05:15 PM   #3
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.9.2009
Posts: 5,727

Rep: Reputation: 2211
Quote:
Originally Posted by newtolinux2020
grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}" */pentbox/other/log_honeypot.txt > IP.txt

I'm using the above code to extract the IP addresses off pentbox honeypot but it creates the file and nothing is in it. Can anyone help me please?
The redirect creates the output file even if grep produces no output, because the shell opens the file before it runs the command.

Please post a few lines...10 or so...of the log_honeypot.txt file.

Also, please post
Code:
ls -l */pentbox/other/log_honeypot.txt
...that leading * could be the problem. What is the actual path to the file?

(tested the grep regex...that works great!) -- so it's not finding the file.
You should be getting
Code:
grep: */pentbox/other/log_honeypot.txt: No such file or directory
on the screen when you run that line of code. Yes?
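To see both effects for yourself, here is a minimal sketch. It assumes a throwaway directory from mktemp (not the real pentbox tree) and default shell globbing (no nullglob or failglob set), so the unmatched pattern is handed to grep as a literal name.

```shell
# Throwaway directory (an assumption for the demo); it holds no pentbox tree,
# so the glob below cannot match anything.
workdir=$(mktemp -d)

# With default globbing, the unmatched pattern reaches grep as a literal name.
# grep reports "No such file or directory" on stderr and exits with status 2.
status=0
grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}" "$workdir"/*/pentbox/other/log_honeypot.txt > "$workdir/IP.txt" || status=$?

# The shell opened IP.txt for writing before grep started, so it exists but is empty.
ls -l "$workdir/IP.txt"
echo "grep exit status: $status"
```

An exit status of 2 means grep hit an error, while 1 would mean it read the file and found no match, so checking it tells the two cases apart.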
 
Old 04-05-2019, 06:04 PM   #4
newtolinux2020
LQ Newbie
 
Registered: Apr 2019
Posts: 8

Original Poster
Rep: Reputation: Disabled
thank you for replying.

the full path to the file is /root/pentbox-1.8/other/log_honeypot.txt

the first 10 lines are:

INTRUSION ATTEMPT DETECTED! from 000.00.00.00:46940 (2019-03-24 17:15:56 +0000)
-----------------------------
SSH-2.0-PUTTY

INTRUSION ATTEMPT DETECTED! from 000.00.00.00:40953 (2019-03-24 17:16:39 +0000)
-----------------------------
SSH-2.0-PUTTY

INTRUSION ATTEMPT DETECTED! from 000.00.00.00:61033 (2019-03-24 17:17:37 +0000)
-----------------------------
SSH-2.0-PUTTY

INTRUSION ATTEMPT DETECTED! from 000.00.00.00:22734 (2019-03-24 17:18:35 +0000)
-----------------------------
SSH-2.0-PUTTY

INTRUSION ATTEMPT DETECTED! from 000.00.00.00:56706 (2019-03-24 17:19:36 +0000)
-----------------------------
SSH-2.0-PUTTY

INTRUSION ATTEMPT DETECTED! from 000.00.00.00:35205 (2019-03-24 17:20:35 +0000)
-----------------------------
SSH-2.0-PUTTY

INTRUSION ATTEMPT DETECTED! from 000.00.00.00:22192 (2019-03-24 17:21:37 +0000)
-----------------------------
SSH-2.0-PUTTY

INTRUSION ATTEMPT DETECTED! from 000.00.00.00:33720 (2019-03-24 17:22:36 +0000)
-----------------------------
SSH-2.0-PUTTY

INTRUSION ATTEMPT DETECTED! from 000.00.00.00:42042 (2019-03-24 17:23:05 +0000)
-----------------------------
SSH-2.0-PUTTY

INTRUSION ATTEMPT DETECTED! from 000.00.00.00:43272 (2019-03-24 17:23:20 +0000)
-----------------------------
SSH-2.0-PUTTY


this is what happens when i run the ls -l

ls -l /root/pentbox-1.8/other/log_honeypot.txt
-rw-r--r-- 1 root root 1377443 Apr 5 21:01 /root/pentbox-1.8/other/log_honeypot.txt
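For anyone following along without the real log, here is a small sketch of the same grep pointed at an explicit path instead of a glob. The temporary file stands in for /root/pentbox-1.8/other/log_honeypot.txt, and the addresses 203.0.113.7 and 198.51.100.23 are documentation addresses chosen for illustration, not values from the actual log.

```shell
# Stand-in log file (hypothetical), written in the format shown above.
log=$(mktemp)
cat > "$log" <<'EOF'
INTRUSION ATTEMPT DETECTED! from 203.0.113.7:46940 (2019-03-24 17:15:56 +0000)
-----------------------------
SSH-2.0-PUTTY

INTRUSION ATTEMPT DETECTED! from 198.51.100.23:40953 (2019-03-24 17:16:39 +0000)
-----------------------------
SSH-2.0-PUTTY
EOF

# Same regex as in the first post; only the file argument changed.
ips=$(grep -E -o "([0-9]{1,3}[\.]){3}[0-9]{1,3}" "$log")
printf '%s\n' "$ips"
```

The port after the colon and the "2.0" in the SSH banner are not picked up, because neither contains four dot-separated groups.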
 
Old 04-05-2019, 06:16 PM   #5
newtolinux2020
LQ Newbie
 
Registered: Apr 2019
Posts: 8

Original Poster
Rep: Reputation: Disabled
I managed to get it to work. My next question: how could I get the script to count duplicate IP addresses and eliminate the 172 and 192 addresses? I'm also getting a lot of 00.00.00.0.0 addresses.
 
Old 04-05-2019, 08:20 PM   #6
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.9.2009
Posts: 5,727

Rep: Reputation: 2211
Quote:
Originally Posted by newtolinux2020
I managed to get it to work. My next question: how could I get the script to count duplicate IP addresses and eliminate the 172 and 192 addresses? I'm also getting a lot of 00.00.00.0.0 addresses.
Please use [code] tags when posting code or output. Thanks!
I found this regexp on the 'net
Code:
(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])
it only matches 'valid' IP addresses, so that should eliminate the 00.00.00.00 entries.

Pipe your output to
Code:
sort -u
to get a unique list. See man sort. That won't give you a count, tho. Pipe to just sort then iterate the resulting list and count the duplicates. You could also eliminate those beginning with 172 and 192 then, too.

So something like
Code:
grep -E -o (([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]) | \
sort | grep -v ^172 | grep -v ^192 > sortedIP.txt
to get the sorted, cleaned up list.
Then, loop sortedIP.txt to get the counts you want.
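The counting step can also be done without a loop, since uniq -c prefixes each line of sorted input with the number of times it repeats. A rough sketch, with the regexp quoted and a file name supplied. The sample log and its addresses are made up for illustration (203.0.113.x and 198.51.100.x are documentation ranges), and the octet variable is only a convenience to keep the line short.

```shell
# Hypothetical sample log in the honeypot's format.
log=$(mktemp)
cat > "$log" <<'EOF'
INTRUSION ATTEMPT DETECTED! from 203.0.113.7:46940 (2019-03-24 17:15:56 +0000)
INTRUSION ATTEMPT DETECTED! from 198.51.100.23:40953 (2019-03-24 17:16:39 +0000)
INTRUSION ATTEMPT DETECTED! from 203.0.113.7:61033 (2019-03-24 17:17:37 +0000)
INTRUSION ATTEMPT DETECTED! from 172.16.0.5:22734 (2019-03-24 17:18:35 +0000)
INTRUSION ATTEMPT DETECTED! from 192.168.1.10:56706 (2019-03-24 17:19:36 +0000)
INTRUSION ATTEMPT DETECTED! from 000.00.00.00:35205 (2019-03-24 17:20:35 +0000)
INTRUSION ATTEMPT DETECTED! from 203.0.113.7:22192 (2019-03-24 17:21:37 +0000)
EOF

# One octet, 0-255 without leading zeros (the same alternation as the regexp above).
octet='([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])'

# Extract addresses, drop those starting with 172. or 192., then count repeats.
# uniq -c needs sorted input; the final sort -rn puts the busiest address first.
counts=$(grep -E -o "($octet\.){3}$octet" "$log" \
  | grep -v -e '^172\.' -e '^192\.' \
  | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"
```

Each output line is a count followed by an address; uniq -c pads the count with leading spaces.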
 
Old 04-05-2019, 09:09 PM   #7
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,126

Rep: Reputation: 4120
@newtolinux2020, you need to understand any solutions you find - use them to learn if you don't. Use it as a base for your own solutions as shown above.
You also need to think about what you care about in your data - better to just toss the zero records before complicating things unnecessarily IMHO. Personally I would use a tool that has the logic to handle the summing as it processes the data - awk, perl, python, ... pick your favourite
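As one possible reading of that advice, here is an awk sketch that tosses the all-zero records and sums the rest in a single pass. It assumes the address:port pair is always the fifth whitespace-separated field, as in the sample lines posted above; the log contents here are invented for the example.

```shell
# Hypothetical sample log in the honeypot's format.
log=$(mktemp)
cat > "$log" <<'EOF'
INTRUSION ATTEMPT DETECTED! from 203.0.113.7:46940 (2019-03-24 17:15:56 +0000)
-----------------------------
SSH-2.0-PUTTY

INTRUSION ATTEMPT DETECTED! from 000.00.00.00:40953 (2019-03-24 17:16:39 +0000)
INTRUSION ATTEMPT DETECTED! from 198.51.100.23:61033 (2019-03-24 17:17:37 +0000)
INTRUSION ATTEMPT DETECTED! from 203.0.113.7:22734 (2019-03-24 17:18:35 +0000)
EOF

summary=$(awk '
/^INTRUSION ATTEMPT DETECTED! from / {
    split($5, parts, ":")                # $5 is "address:port"
    ip = parts[1]
    if (ip ~ /^0+\.0+\.0+\.0+$/) next    # toss the all-zero records
    count[ip]++                          # tally per address
}
END { for (ip in count) print count[ip], ip }
' "$log" | sort -rn)
printf '%s\n' "$summary"
```

awk does not promise any order for the "for (ip in count)" loop, which is why the result is piped through sort -rn.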
 
Old 04-05-2019, 09:25 PM   #8
newtolinux2020
LQ Newbie
 
Registered: Apr 2019
Posts: 8

Original Poster
Rep: Reputation: Disabled
Thank you all very much for the help. I'm still learning all this and you have given me a lot of new ideas so I can make my own script.
 
Old 04-06-2019, 12:43 PM   #9
newtolinux2020
LQ Newbie
 
Registered: Apr 2019
Posts: 8

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by scasey
Please use [code] tags when posting code or output. Thanks!
I found this regexp on the 'net
Code:
(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])
it only matches 'valid' IP addresses, so that should eliminate the 00.00.00.00 entries.

Pipe your output to
Code:
sort -u
to get a unique list. See man sort. That won't give you a count, tho. Pipe to just sort then iterate the resulting list and count the duplicates. You could also eliminate those beginning with 172 and 192 then, too.

So something like
Code:
grep -E -o (([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]) | \
sort | grep -v ^172 | grep -v ^192 > sortedIP.txt
to get the sorted, cleaned up list.
Then, loop sortedIP.txt to get the counts you want.
I was wondering if you could tell me where the file location goes in here as I'm just using it for learning and testing.
 
Old 04-06-2019, 01:33 PM   #10
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.9.2009
Posts: 5,727

Rep: Reputation: 2211
Quote:
Originally Posted by newtolinux2020
I was wondering if you could tell me where the file location goes in here as I'm just using it for learning and testing.
Oops. Typo. It goes after the regexp.
Code:
grep <regexp> <sourcefile> | sort | grep -v <regexp> | grep -v <regexp> > outputfile
See man grep and man sort
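One thing to keep in mind with the two grep -v stages: filtering on ^172 and ^192 drops every address that starts with those numbers, including public ones. If the goal is only to remove private addresses, a single pattern for the RFC 1918 ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) may be closer to what is wanted. This is a sketch under that assumption; the 10.x range was not mentioned in this thread, and the input list is made up.

```shell
# Made-up input: a mix of private and public addresses, one per line.
ips='10.1.2.3
172.16.0.5
172.31.255.1
172.217.4.46
192.168.1.10
192.0.2.44
203.0.113.7'

# Drop only the RFC 1918 private ranges, then de-duplicate.
# 172.16.0.0/12 covers second octets 16 through 31.
public=$(printf '%s\n' "$ips" \
  | grep -v -E '^(10\.|172\.(1[6-9]|2[0-9]|3[01])\.|192\.168\.)' \
  | sort -u)
printf '%s\n' "$public"
```

Here 172.217.4.46 and 192.0.2.44 survive the filter, whereas a plain ^172 or ^192 would have removed them.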
 
Old 04-06-2019, 03:05 PM   #11
newtolinux2020
LQ Newbie
 
Registered: Apr 2019
Posts: 8

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by scasey
Oops. Typo. It goes after the regexp.
Code:
grep <regexp> <sourcefile> | sort | grep -v <regexp> | grep -v <regexp> > outputfile
See man grep and man sort

So I should type that and it will sort it? I won't need the ((05) etc. next to it?
 
Old 04-06-2019, 03:43 PM   #12
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.9.2009
Posts: 5,727

Rep: Reputation: 2211
In #6 I left out the name of your input file. Add it after the regexp in that post.
Yes, you still need the regexp.
Code:
grep -E -o (([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4] [0-9]|25[0-5]) /root/pentbox-1.8/other/log_honeypot.txt | sort | grep -v ^172 | grep -v ^192 > sortedIP.txt

Last edited by scasey; 04-06-2019 at 04:00 PM.
 
Old 04-06-2019, 04:22 PM   #13
newtolinux2020
LQ Newbie
 
Registered: Apr 2019
Posts: 8

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by scasey
In #6 I left out the name of your input file. Add it after the regexp in that post.
Yes, you still need the regexp.
Code:
grep -E -o (([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4] [0-9]|25[0-5]) /root/pentbox-1.8/other/log_honeypot.txt | sort | grep -v ^172 | grep -v ^192 > sortedIP.txt
Code:
grep <regexp> /root/pentbox-1.8/other/log_honeypot.txt | sort -u | grep -v <regexp> | grep -v <regexp> > test.txt
keeps giving this error

bash: syntax error near unexpected token `|'
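For what it's worth, that message is what bash prints when the <regexp> placeholders are typed literally. Bash reads < and > as redirection operators, so in "grep -v <regexp> |" the > is left without a file name and the | is unexpected. A small sketch that checks this with bash -n, which parses a command without running it (log.txt is a stand-in file name, not the real path):

```shell
# The pipeline with the placeholders left in (log.txt is a stand-in name).
cmd='grep <regexp> log.txt | sort -u | grep -v <regexp> | grep -v <regexp> > test.txt'

# bash -n only parses; nothing is executed and no file is created or truncated.
status=0
msg=$(bash -n -c "$cmd" 2>&1) || status=$?

echo "$msg"
echo "exit status: $status"
```

The placeholders are meant to be replaced by a real, quoted pattern and the angle brackets dropped.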
 
Old 04-06-2019, 04:32 PM   #14
scasey
LQ Veteran
 
Registered: Feb 2013
Location: Tucson, AZ, USA
Distribution: CentOS 7.9.2009
Posts: 5,727

Rep: Reputation: 2211
Put delimiters around the regexp.
Code:
grep -E -o "(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])" file...etc.
Yes, I'm not being good about my copy/paste, but as syg00 pointed out, you need to understand what's being given you, too.
Sorry.
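The reason the quotes matter: characters such as ( ) and | mean something to the shell itself, so an unquoted regexp is torn apart before grep ever sees it. Here is a sketch that compares the two forms with bash -n (parse only, nothing runs). The pattern is a shortened stand-in for the full octet regexp and log.txt is a placeholder name; single quotes are used here, though double quotes as shown above work for this pattern too.

```shell
# Shortened stand-in pattern, first unquoted and then wrapped in single quotes.
unquoted='grep -E -o (([0-9]|[1-9][0-9])\.){3}[0-9] log.txt'
quoted="grep -E -o '(([0-9]|[1-9][0-9])\.){3}[0-9]' log.txt"

# Parse each command line without executing it.
unquoted_status=0
bash -n -c "$unquoted" 2>/dev/null || unquoted_status=$?
quoted_status=0
bash -n -c "$quoted" || quoted_status=$?

echo "unquoted parses with status: $unquoted_status"
echo "quoted parses with status:   $quoted_status"
```

Separately, the version in #12 has a stray space inside the last group ("2[0-4] [0-9]"), which would split the pattern into two arguments even when quoted correctly elsewhere, so it is worth copying the regexp from this post instead.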
 
Old 04-06-2019, 04:49 PM   #15
newtolinux2020
LQ Newbie
 
Registered: Apr 2019
Posts: 8

Original Poster
Rep: Reputation: Disabled
I managed to get it working now. It removes the duplicate IP addresses and drops the 192, 172 and 00 addresses.

Thank you so much for helping me learn from this and I'll head on to make it count them up now
 
  

