LinuxQuestions.org


CHARL0TTE 10-07-2009 05:46 AM

Text file manipulation: Extracting specific rows according to numerical pattern
 
Hi everyone,

I hope that someone can help.

I have a large number of text files from which I need to extract specific rows of interest. The in-house software that generates these text files outputs lots of 'junk', so I just want to extract the relevant stuff. Fortunately, I have been able to script these files into a reasonably sensible sequence, so hopefully this is a simple problem for someone with a bit more experience than me!

So, I've got a text file from which I only want to extract rows 1, 4, 5, 8, 9, 12, 13, 16 etc. (so row 1, then +3, +1, +3 etc.). I know that I can use awk's NR to select the rows of interest, but this is rather clunky, and some files are longer than others, so it is far from ideal.

Can anyone help with something that takes advantage of the regular pattern in these data?

Many thanks,

Charlotte

catkin 10-07-2009 05:51 AM

Please show what you have tried so far and some sample data, preferably in code tags. Unless we know the format of the data we cannot suggest how to extract the necessary columns -- what defines a "column"?

CHARL0TTE 10-07-2009 06:21 AM

Hi Catkin,

I haven't come up with a solution. I could only fathom that awk's NR might be somewhere in the right direction, but I'm really not sure; my pocket guide doesn't seem to help on this one.

Here is a subsection of what my text files look like:

1 AXM234_1A 1 Picture S7_1000_9000 856325 99802 2 11935 5 0 11934
2 AXM234_1A 1 Picture S1_1000_9000 840356 4032 2 11935 5 0 11995
3 AXM234_1A 1 Picture S18_1000_9000 872293 4032 2 11935 5 0 11934
4 AXM234_1A 1 Picture P1_1_1_3_6 882623 4033 2 11935 5 0 11900
5 AXM234_1A 1 Picture S8_1000_9000 995334 99802 2 11935 5 0 11934
6 AXM234_1A 1 Picture S2_1000_9000 1011303 4032 2 11935 5 0 11995
7 AXM234_1A 1 Picture S3_1000_9000 1027271 2 11935 5 0 11934
8 AXM234_1A 1 Picture P4_2_1_2_4 1043408 2 11935 5 0 11900

I have generated the data in this format because I need to combine specific column data with rows 1, 4, 5, 8, 9, 12, 13, 16 etc., as these items represent the 'on' and 'off' of the stimuli of interest. So, using the above example, I want to extract all of the info in rows 1, 4, 5, and 8. Once I can do that, I think I know how to combine the relevant column bits.

Many thanks,

Charlotte

catkin 10-07-2009 07:14 AM

That's a very straightforward job for awk, even if there are multiple spaces between the columns (as would have been visible using code tags). See this example.
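
A minimal sketch of one way to do it, assuming the pattern really does repeat every four rows (the first and fourth row of each block of four are wanted, giving rows 1, 4, 5, 8, 9, 12, ...). The function name extract_rows and the file names in the usage line are hypothetical, used only for illustration.

```shell
# Hypothetical helper: print rows 1, 4, 5, 8, 9, 12, ... of each input file.
# FNR is the row number within the current file, so the pattern restarts
# for every file given on the command line (for a single file or stdin it
# is the same as NR).
# FNR % 4 is 1 for rows 1, 5, 9, ... and 0 for rows 4, 8, 12, ...
extract_rows() {
    awk 'FNR % 4 == 1 || FNR % 4 == 0' "$@"
}
```

Something like `extract_rows data.txt > wanted.txt` would then write the selected rows to a new file. Because the test looks only at the row number, it works for files of any length and does not care how many spaces separate the columns.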


All times are GMT -5.