LinuxQuestions.org
Old 10-11-2015, 10:40 PM   #1
kumar23kan
LQ Newbie
 
Registered: Oct 2015
Posts: 7

Rep: Reputation: Disabled
commands to select range of information


Hi,
I have a text file with several numbers and characters separated by both spaces and tabs. I need to select each line that starts with a particular character and, in that line, look for numbers below, say, 3000. Beyond that value, the entire line has to be deleted, up to the next line. Can somebody help, please?
 
Old 10-11-2015, 11:07 PM   #2
berndbausch
Senior Member
 
Registered: Nov 2013
Location: Tokyo
Distribution: A few
Posts: 3,909

Rep: Reputation: 1096
Quote:
Originally Posted by kumar23kan
Hi,
I have a text file with several numbers and characters separated by both spaces and tabs. I need to select each line that starts with a particular character and, in that line, look for numbers below, say, 3000. Beyond that value, the entire line has to be deleted, up to the next line. Can somebody help, please?
awk, grep, cut, and sed are the commands you would typically use. Your description doesn't help me understand your line structure. Can you provide a few sample lines?
 
1 members found this post helpful.
Old 10-11-2015, 11:14 PM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 18,383

Rep: Reputation: 3044
Sounds like homework.
 
1 members found this post helpful.
Old 10-11-2015, 11:41 PM   #4
kumar23kan
LQ Newbie
 
Registered: Oct 2015
Posts: 7

Original Poster
Rep: Reputation: Disabled
Dear syg00, I am a biologist and this is not my homework. I am trying to organize information.

Dear berndbausch,
The content of the text file is as follows:


rd15256 1.1361 0.3236 0.0692 81.3499 28.2168 5.3160 1006 TG 1006 RG2 rim55501 (3,-1) Ringer sample
rd15256 DOS= 2.255E+00 litl(DOS)= -2.855E+01 liter123= 1.407E+01 -2.105E+01 -2.157E+01 square.= 0.02
rd15256 gen_vecs_uvz(segmental): -0.0047 -0.0009 0.0112 0.0112 -0.0064 0.0050 0.0071 0.0095 0.0045
! Specimen rd: sample hills ABR ABR_orth DOS litl(DOS)
ring+ 1.1359 0.3236 0.0696 81.3296 28.2136 5.3503 1 2.2673 -3.04321E+01
hills+ 1.1358 0.3235 0.0698 81.3211 28.2122 5.3645 2 2.2801 -3.20847E+01
hills+ 1.1358 0.3235 0.0700 81.3127 28.2109 5.3787 3 2.2981 -3.43627E+01
hills+ 1.1357 0.3235 0.0702 81.3043 28.2096 5.3928 4 2.3214 -3.73644E+01
hills+ 1.1356 0.3235 0.0704 81.2959 28.2083 5.4069 5 2.3503 -4.12124E+01
hills+ 1.1355 0.3235 0.0706 81.2875 28.2070 5.4209 6 2.3850 -4.60633E+01
hills+ 1.1355 0.3235 0.0707 81.2792 28.2056 5.4349 7 2.4256 -5.21206E+01
hills+ 1.1354 0.3234 0.0709 81.2709 28.2043 5.4487 8 2.4725 -5.96533E+01
hills+ 1.1353 0.3234 0.0711 81.2627 28.2030 5.4625 9 2.5256 -6.90238E+01
hills+ 1.1352 0.3234 0.0713 81.2545 28.2017 5.4762 10 2.5851 -8.07300E+01
hills+ 1.1352 0.3234 0.0715 81.2464 28.2005 5.4897 11 2.6510 -9.54735E+01
hills+ 1.1351 0.3234 0.0716 81.2384 28.1992 5.5031 12 2.7235 -1.14272E+02
hills+ 1.1350 0.3234 0.0718 81.2304 28.1979 5.5164 13 2.8027 -1.38659E+02
hills+ 1.1349 0.3234 0.0720 81.2225 28.1967 5.5296 14 2.8884 -1.71049E+02
hills+ 1.1349 0.3233 0.0721 81.2147 28.1954 5.5426 15 2.9808 -2.15472E+02
hills+ 1.1348 0.3233 0.0723 81.2070 28.1942 5.5554 16 3.0799 -2.79198E+02

rd15257 1.1398 0.3159 0.0582 81.7857 27.5442 4.4724 1006 TG 1006 SD rim55501 (3,-1) Ringer sample
rd15257 DOS= 1.273E+00 litl(DOS)= -4.041E+00 liter123= 9.115E+00 -6.256E+00 -6.900E+00 square.= 0.10
rd15257 gen_vecs_uvz(segmental): -0.0009 0.0104 0.0052 -0.0074 -0.0044 0.0084 0.0119 -0.0020 0.0085
! Specimen rd: sample hills ABR ABR_orth DOS litl(DOS)
ring+ 1.1398 0.3163 0.0584 81.7800 27.5805 4.4883 1 1.2806 -4.10694E+00
hills+ 1.1398 0.3165 0.0585 81.7772 27.5985 4.4962 2 1.2899 -4.32799E+00
hills+ 1.1398 0.3167 0.0586 81.7744 27.6164 4.5041 3 1.3028 -4.68175E+00
hills+ 1.1398 0.3169 0.0587 81.7715 27.6343 4.5120 4 1.3195 -5.17392E+00
hills+ 1.1397 0.3171 0.0588 81.7687 27.6522 4.5198 5 1.3396 -5.80978E+00
hills+ 1.1397 0.3173 0.0589 81.7659 27.6700 4.5276 6 1.3632 -6.59338E+00
hills+ 1.1397 0.3175 0.0590 81.7631 27.6876 4.5354 7 1.3900 -7.52640E+00
hills+ 1.1397 0.3177 0.0591 81.7602 27.7052 4.5431 8 1.4197 -8.60637E+00
hills+ 1.1397 0.3179 0.0592 81.7574 27.7227 4.5508 9 1.4520 -9.82422E+00
hills+ 1.1396 0.3181 0.0593 81.7546 27.7400 4.5584 10 1.4867 -1.11608E+01
hills+ 1.1396 0.3183 0.0594 81.7518 27.7572 4.5660 11 1.5233 -1.25820E+01
hills+ 1.1396 0.3185 0.0595 81.7490 27.7743 4.5735 12 1.5615 -1.40319E+01
hills+ 1.1396 0.3187 0.0596 81.7461 27.7913 4.5810 13 1.6008 -1.54239E+01
hills+ 1.1396 0.3189 0.0597 81.7433 27.8081 4.5884 14 1.6409 -1.66272E+01
hills+ 1.1396 0.3191 0.0598 81.7405 27.8248 4.5957 15 1.6815 -1.74496E+01
hills+ 1.1395 0.3193 0.0599 81.7376 27.8414 4.6030 16 1.7225 -1.76129E+01
hills+ 1.1395 0.3195 0.0600 81.7348 27.8578 4.6102 17 1.7639 -1.67202E+01


This goes on for several thousand lines. I can curate it manually, but I think using Linux commands might save a lot of time.
 
Old 10-11-2015, 11:56 PM   #5
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 18,383

Rep: Reputation: 3044
Given that input, what is the expected output? As stated, your initial post is so generic as to be meaningless.
 
Old 10-12-2015, 12:10 AM   #6
kumar23kan
LQ Newbie
 
Registered: Oct 2015
Posts: 7

Original Poster
Rep: Reputation: Disabled
The first line contains 1006 as the eighth character. I want to remove all data that exceeds 3000.
 
Old 10-12-2015, 12:58 AM   #7
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 524

Rep: Reputation: 316
Quote:
Originally Posted by kumar23kan
The first line contains 1006 as the eighth character. I want to remove all data that exceeds 3000.
You still haven't clearly described the problem. It looks like there are blocks of 20 or so lines, and your remark suggests that you want to filter blocks based on the eighth item of the first line of each block.

This could be done by a bash script with commands like read (to read lines), cut (to select one item from the line), a conditional statement like if [ "$item" -gt 3000 ], and echo (to write the lines you need to save). Put the appropriate code in a while loop to process blocks until you reach the end of the file. Inside that loop, you could use another while loop to copy the desired lines to the output file until the next block is detected.
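A minimal sketch of the loop described above, assuming blocks are separated by empty lines and the value to test is the eighth whitespace-separated item on the first line of each block. The function name filter_blocks and the small sample input are made up for illustration. It is written for plain sh so it also runs under bash, it uses a single loop with two flags rather than nested loops, and it uses awk to pick out the eighth item instead of cut, because cut treats every single space or tab as a delimiter and the file mixes both.

```shell
#!/bin/sh
# filter_blocks LIMIT (hypothetical helper): read blocks separated by empty
# lines on stdin; copy a block to stdout only when the eighth item of its
# first line is an integer no greater than LIMIT.
filter_blocks() {
    limit=$1
    keep=1    # 1 while the current block is being copied
    first=1   # 1 when the next non-empty line starts a new block
    while IFS= read -r line || [ -n "$line" ]; do
        if [ -z "$line" ]; then
            # An empty line ends a block; echo it only after a kept block.
            if [ "$keep" -eq 1 ]; then echo; fi
            first=1
            continue
        fi
        if [ "$first" -eq 1 ]; then
            first=0
            # awk splits on runs of spaces and tabs, unlike cut.
            item=$(printf '%s\n' "$line" | awk '{ print $8 }')
            case $item in
                '' | *[!0-9]*) keep=1 ;;   # not a plain integer: keep block
                *) if [ "$item" -gt "$limit" ]; then keep=0; else keep=1; fi ;;
            esac
        fi
        if [ "$keep" -eq 1 ]; then printf '%s\n' "$line"; fi
    done
    return 0
}

# Made-up sample: three short blocks with 1006, 3500 and 2999 in item 8.
sample=$(mktemp)
printf '%s\n' \
    'rdA 1 2 3 4 5 6 1006 TG' 'hills+ 1 2' '' \
    'rdB 1 2 3 4 5 6 3500 TG' 'hills+ 3 4' '' \
    'rdC 1 2 3 4 5 6 2999 TG' 'hills+ 5 6' > "$sample"

result=$(filter_blocks 3000 < "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Called with 3000, this copies the rdA and rdC blocks and skips the rdB block. Starting one awk process per block is slow on large files, so this is meant to show the logic rather than to be fast.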
 
Old 10-12-2015, 01:43 AM   #8
Sardog
LQ Newbie
 
Registered: Jun 2014
Location: Comox, BC
Distribution: CrunchBang
Posts: 7

Rep: Reputation: Disabled
You might have better luck parsing this file with a Python script. It is worth the effort.
 
Old 10-12-2015, 02:31 AM   #9
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 18,383

Rep: Reputation: 3044
Quote:
Originally Posted by kumar23kan
The first line contains 1006 as the eighth character. I want to remove all data that exceeds 3000.
No, the eighth field contains 1006.
If it is greater than 3000, do you want to delete that entire line and all lines down to the next blank line? Your terminology is just not logical.
 
Old 10-12-2015, 02:42 AM   #10
kumar23kan
LQ Newbie
 
Registered: Oct 2015
Posts: 7

Original Poster
Rep: Reputation: Disabled
Yes, if the eighth field exceeds 3000, I would like to delete that line and the following lines until the second group is reached. I am not experienced in writing Python scripts; I am just a beginner.

Last edited by kumar23kan; 10-12-2015 at 02:44 AM.
 
Old 10-12-2015, 02:47 AM   #11
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 13,498

Rep: Reputation: 4313
If not Python, you can use awk, Perl, or another language. Which do you prefer? (I mean, which one have you already tried, and which can you handle more easily?)
 
Old 10-12-2015, 02:52 AM   #12
berndbausch
Senior Member
 
Registered: Nov 2013
Location: Tokyo
Distribution: A few
Posts: 3,909

Rep: Reputation: 1096
To remove all lines whose eighth field is greater than 3000, the following awk program would be a solution:
Code:
awk '$8 <= 3000 { print }' nameofyourdatafile
awk programs are a series of condition-action pairs. Here, the condition is "field 8 is at most 3000", and the action is to print such a line.
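A quick way to see what this one-liner does on data shaped like the sample in post #4 is to run it on a few lines. The sketch below uses a made-up, shortened block (the name rd1 and the value 3500 are hypothetical) and a temporary file.

```shell
#!/bin/sh
# Hypothetical miniature of the data from post #4 (values shortened).
sample=$(mktemp)
cat > "$sample" <<'EOF'
rd1 1.1 0.3 0.06 81.3 28.2 5.3 3500 TG 1006 RG2
! Specimen rd: sample hills ABR ABR_orth DOS litl(DOS)
hills+ 1.1 0.3 0.06 81.3 28.2 5.3 2 2.28 -3.2E+01
EOF

# Same program as above: print every line whose field 8 is at most 3000.
per_line=$(awk '$8 <= 3000 { print }' "$sample")
printf '%s\n' "$per_line"
rm -f "$sample"
```

On this input only the hills+ row is printed. The rd1 line is dropped because 3500 is above the limit, and the "! Specimen" header is dropped too: its eighth field is the word DOS, so awk compares it with 3000 as a string rather than as a number. The data row from the same block stays, since the test is applied to each line separately.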

If you only want to delete the lines that start with "rd", for example, and whose 8th field is greater than 3000:
Code:
awk '/^rd/ && $8 > 3000 { next  }
                        { print }' nameofyourdatafile
The first condition is "line starts with rd and has an 8th field greater than 3000". The corresponding action is to skip to the next line, in other words, do nothing for the current line.
The second condition is empty and matches all lines.
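As a rough check of the two-rule version, the sketch below runs it on a made-up block (rdB and its values are hypothetical) whose first line carries 3500 in field 8.

```shell
#!/bin/sh
# Hypothetical block whose first line carries 3500 in field 8.
sample=$(mktemp)
cat > "$sample" <<'EOF'
rdB 1.1 0.3 0.05 81.7 27.5 4.4 3500 TG 1006 SD
rdB DOS= 1.2E+00 litl(DOS)= -4.0E+00 liter123= 9.1E+00 -6.2E+00 -6.9E+00
! Specimen rd: sample hills ABR ABR_orth DOS litl(DOS)
hills+ 1.1 0.3 0.05 81.7 27.5 4.4 2 1.28
EOF

# Rule 1 skips "rd" lines with field 8 above 3000; rule 2 prints the rest.
filtered=$(awk '/^rd/ && $8 > 3000 { next }
                                   { print }' "$sample")
printf '%s\n' "$filtered"
rm -f "$sample"
```

Only the first rdB line is removed here. The second rdB line (field 8 is -6.2E+00), the header, and the hills+ row are all printed, so this variant does not remove the rest of the block.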

I have to say though, I still don't understand what you want to achieve.

Last edited by berndbausch; 10-12-2015 at 02:54 AM.
 
1 members found this post helpful.
Old 10-12-2015, 03:46 AM   #13
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 18,383

Rep: Reputation: 3044
Slight modification to delete the block down to the next blank line, which is maybe what the OP wants.
Code:
awk '$8 <= 3000 { print $0"\n" }' RS='' nameofyourdatafile > reduceddatafile
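To illustrate why setting RS='' changes the result, here is a small sketch on a made-up three-block file (the block names rdA, rdB, rdC and their values are hypothetical). With an empty RS, awk reads each block separated by blank lines as one record and also splits fields at newlines, so $8 is the eighth item of the first line of the block.

```shell
#!/bin/sh
# Hypothetical three-block miniature of the data from post #4.
sample=$(mktemp)
cat > "$sample" <<'EOF'
rdA 1.1 0.3 0.06 81.3 28.2 5.3 1006 TG
! Specimen rd: sample hills ABR ABR_orth DOS litl(DOS)
hills+ 1.1 0.3 0.06 81.3 28.2 5.3 2 2.28

rdB 1.1 0.3 0.05 81.7 27.5 4.4 3500 TG
! Specimen rd: sample hills ABR ABR_orth DOS litl(DOS)
hills+ 1.1 0.3 0.05 81.7 27.5 4.4 2 1.28

rdC 1.1 0.3 0.05 81.7 27.5 4.4 2999 TG
hills+ 1.1 0.3 0.05 81.7 27.5 4.4 2 1.30
EOF

# Paragraph mode: one record per block, tested on the block's eighth item.
kept=$(awk '$8 <= 3000 { print $0 "\n" }' RS='' "$sample")
printf '%s\n' "$kept"
rm -f "$sample"
```

The rdA and rdC blocks come out whole, header line included, and the entire rdB block is gone. The extra "\n" in the print puts the blank separator line back between the surviving blocks.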
 
Old 10-12-2015, 05:39 AM   #14
kumar23kan
LQ Newbie
 
Registered: Oct 2015
Posts: 7

Original Poster
Rep: Reputation: Disabled
syg00 and berndbausch, thanks for the help.
 
Old 10-12-2015, 07:02 AM   #15
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,760

Rep: Reputation: 3050
Or if you like:
Code:
awk '$8 <= 3000' RS='' ORS='\n\n' nameofyourdatafile > reduceddatafile
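This version relies on two awk defaults: a condition with no action prints the record, and every print is followed by ORS, here set to two newlines. A small sketch, using a made-up three-block input, that compares it with the previous command and lists which blocks were filtered out as a sanity check:

```shell
#!/bin/sh
# Hypothetical three-block input; the middle block exceeds the limit.
sample=$(mktemp)
printf '%s\n' \
    'rdA 1 2 3 4 5 6 1006 TG' 'hills+ 1 2 3 4 5 6 2' '' \
    'rdB 1 2 3 4 5 6 3500 TG' 'hills+ 1 2 3 4 5 6 2' '' \
    'rdC 1 2 3 4 5 6 2999 TG' 'hills+ 1 2 3 4 5 6 2' > "$sample"

# Explicit action: append one newline, then print adds the default ORS.
explicit=$(awk '$8 <= 3000 { print $0 "\n" }' RS='' "$sample")
# No action: the record is printed by default, followed by ORS.
implicit=$(awk '$8 <= 3000' RS='' ORS='\n\n' "$sample")

if [ "$explicit" = "$implicit" ]; then
    same=yes
else
    same=no
fi
echo "identical output: $same"

# Sanity check: first field of every block that was filtered out.
dropped=$(awk '$8 > 3000 { print $1 }' RS='' "$sample")
echo "dropped blocks: $dropped"
rm -f "$sample"
```

Listing the dropped block names next to the reduced file is one way to confirm that nothing was removed by accident before discarding the original data.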
 
  

