LinuxQuestions.org
Old 03-04-2015, 11:05 AM   #1
atjurhs
Member
 
Registered: Aug 2012
Posts: 190

Rep: Reputation: Disabled
diffing the line numbers


hi guys

i am trying to find the "size" of a "block" of data in LARGE data files, the example below test_data.txt is very simplified. by "size" i mean the difference in line numbers of a block, and the "size" will be constant throughout the file so

1234 6.600000 4321
1234 8.500000 4321
1234 1.800000 4321
1234 2.300000 4321
1234 8.500000 4321
1234 2.800000 4321

if i define a block as starting whenever i find 8.500000 in the second column, then in the example the block size would be 3, because 8.500000 occurs on the 5th line and on the 2nd. right now i am using

Code:
 grep -n "8.500000" test_data.txt | cut -f1 -d:
and/or

Code:
 awk '/8.500000/ {print FNR}' test_data.txt
obviously i don't remember how to tag text as code?

btw, the grep command is much much faster

both of these commands give an entire list of line numbers (a long list for files greater than a gig), which i then have to subtract one from another to come up with 3 in the example. not that i'm opposed to doing math, but i would think awk or grep should be able to do this for me

ideas?

tabby

Last edited by atjurhs; 03-04-2015 at 12:47 PM.
 
Old 03-04-2015, 12:08 PM   #2
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Arch
Posts: 3,136

Rep: Reputation: 1336
With grep:
Code:
#!/bin/sh
{ 
  grep -Fwm1 8.500000 >/dev/null # discard stdin until first "8.500000"
  grep -Fwnm1 8.500000 | cut -f1 -d: # output number of lines 'til the second one
} < test_data.txt
With awk:
Code:
awk '$2 == "8.500000" { 
  if (!start) {
    start = FNR;
  } else {
    print (FNR - start);
    exit;
  }}' test_data.txt
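The awk version above stops at the first pair of matches and trusts that the block size really is constant. As a rough sketch only, the same idea can be extended to compare every gap between consecutive matches and complain when one differs. The function name check_blocks is made up for illustration, and the match value is hard-coded just as in the original.

```shell
#!/bin/sh
# check_blocks is a made-up helper name, not something from the thread.
# It reads the named file, or stdin when called without arguments.
check_blocks() {
  awk '$2 == "8.500000" {
    if (prev) {
      gap = FNR - prev                 # distance to the previous match
      if (!size) {
        size = gap                     # the first gap defines the block size
      } else if (gap != size) {
        printf "line %d: gap %d, expected %d\n", FNR, gap, size
        bad = 1
      }
    }
    prev = FNR
  }
  END {
    if (size && !bad) print size       # print the size only if every gap agreed
    exit (bad ? 1 : 0)
  }' "$@"
}

# Demo on the sample from the first post.
check_blocks <<'EOF'
1234 6.600000 4321
1234 8.500000 4321
1234 1.800000 4321
1234 2.300000 4321
1234 8.500000 4321
1234 2.800000 4321
EOF
```

On a real file this would be called as check_blocks test_data.txt. Unlike the early-exit version it has to read the whole file, so it is slower on gigabyte-sized input and is probably only worth running as an occasional sanity check.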
Quote:
obviously i don't remember how to tag text as code?
http://www.linuxquestions.org/questi...do=bbcode#code

Last edited by ntubski; 03-04-2015 at 01:17 PM. Reason: echo $() is redundant
 
Old 03-04-2015, 12:52 PM   #3
atjurhs
Member
 
Registered: Aug 2012
Posts: 190

Original Poster
Rep: Reputation: Disabled
many thanks!

on the bash grep command how does the

Code:
 echo $(grep -Fwnm1 8.500000 | cut -f1 -d:)
give back the second occurrence? is it the f1? and where is the subtraction done? there is no subtraction operand, so i'm guessing it's done in the cut
 
Old 03-04-2015, 01:18 PM   #4
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Arch
Posts: 3,136

Rep: Reputation: 1336
Quote:
Originally Posted by atjurhs View Post
no subtraction operand, so i'm guessing it's done in the cut
No, the cut is just used for the same purpose as in your original post: it strips everything after the line number. There is actually no subtraction performed at all in the grep method. What happens is that the first grep reads until it finds the first "8.500000" and stops there, and the second grep continues where the first one left off, so the line number it prints is relative to where it started reading.
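To see the hand-off between the two greps in isolation, here is a small self-contained sketch that rebuilds the sample data in a temporary file and runs the same pair of commands. It assumes GNU grep, which repositions standard input to just after the last matching line when -m stops it early and stdin is a regular file. With a pipe as input, or with a grep that does not do this, the second grep may start somewhere else and the result would be wrong. The variable names tmp and block_size are only for this demo.

```shell
#!/bin/sh
# Recreate the sample from the first post in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
1234 6.600000 4321
1234 8.500000 4321
1234 1.800000 4321
1234 2.300000 4321
1234 8.500000 4321
1234 2.800000 4321
EOF

# Both greps inherit the same open file, so they share one read offset:
# the second grep starts on the line right after the first match.
block_size=$(
  {
    grep -Fwm1 8.500000 >/dev/null       # consume input up to the first match
    grep -Fwnm1 8.500000 | cut -d: -f1   # count lines from there to the next match
  } < "$tmp"
)
echo "block size: $block_size"
rm -f "$tmp"
```

The second grep numbers lines starting from 1 at the point where it begins reading, which is why its line number is already the distance between the two matches and no subtraction is needed.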



Oh, I just realized I did that thing where one uses echo on $(), which is completely redundant. Fixed.
 
  

