LinuxQuestions.org
Old 03-08-2012, 09:02 PM   #16
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled

%%%%%

Last edited by Trd300; 05-01-2012 at 05:34 AM.
 
Old 03-09-2012, 12:05 AM   #17
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
%%%%%

Last edited by Trd300; 05-01-2012 at 05:34 AM.
 
Old 03-09-2012, 12:51 AM   #18
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
%%%%%

Last edited by Trd300; 05-01-2012 at 05:34 AM.
 
Old 03-09-2012, 10:39 AM   #19
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,255

Rep: Reputation: 2686
Script:
Code:
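#!/usr/bin/awk -f

# Input fields are separated by one or more tabs; rebuilt rows are joined
# with "|" so the field boundaries are easy to see.
BEGIN {
    FS = "\t+"
    OFS = "|"
}

# Skip the header line; only process rows whose fields 1, 3 and 4 contain a "-".
NR > 1 && $1 ~ /-/ && $3 ~ /-/ && $4 ~ /-/ {
    f4 = f3 = ""
    split($1, c1, "-")              # c1[1]-c1[2] is the reference range

    n = split($3, c3, " *; *")      # entries of field 3
    o = split($4, c4, " *; *")      # entries of field 4

    # Field 3: keep entries whose range lies entirely inside the reference range.
    for (i = 1; i <= n; i++) {
        split(c3[i], p, "[ -]")     # p[2] and p[3] are the entry's bounds

        if (p[2] >= c1[1] && p[2] <= c1[2] && p[3] >= c1[1] && p[3] <= c1[2])
            f3 = f3 (f3 ? "; " : "") c3[i]
    }

    # Field 4: keep entries whose range contains either end of the reference range.
    for (i = 1; i <= o; i++) {
        split(c4[i], q, "[ -]")

        if ((c1[1] >= q[2] && c1[1] <= q[3]) || (c1[2] >= q[2] && c1[2] <= q[3]))
            f4 = f4 (f4 ? "; " : "") c4[i]
    }

    $3 = f3
    $4 = f4
}

# A bare true pattern prints every line, modified or not.
1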
Data:
Code:
Field1	Field2	Field3	Field4
5-10	A2	AGE 6-8 text 1.; AGE 7-15 text 2.	SIZE 1-20 text 3.; SIZE 9-18 text 4.
12-22	B2	AGE 3-8 text 5.; AGE 14-20 text x.	SIZE 10-19 text 6.; SIZE 10-11 text 7.; SIZE 23-28 text 8.
20-40	C2	AGE 10-50 text q.; AGE 15-30 text w.; AGE 30-41 text e.; AGE 25-30 text r.	SIZE 34-46 text t.; SIZE 12-17 text y.; SIZE 25-36 text u.; SIZE 2-100 text i.; SIZE 16-22 text o.; SIZE 43-57 text p.
Output:
Code:
Field1	Field2	Field3	Field4
5-10|A2|AGE 6-8 text 1.|SIZE 1-20 text 3.; SIZE 9-18 text 4.
12-22|B2|AGE 14-20 text x.|SIZE 10-19 text 6.
20-40|C2|AGE 25-30 text r.|SIZE 34-46 text t.; SIZE 2-100 text i.; SIZE 16-22 text o.
I used "|" as the output separator to show the field boundaries more clearly.
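To see what the split on "[ -]" does to a single entry, here is a minimal sketch. The entry string and the bounds 5 and 10 are taken from the first data row purely for illustration, and the variable names (entry, lo, hi, result) are made up for this example:

```shell
# One entry in the same shape as the Field3 data above (illustrative value).
entry='AGE 6-8 text 1.'

# Split on a space or a hyphen, so the two range bounds land in p[2] and p[3],
# then test whether the entry's range lies inside lo..hi.
result=$(awk -v e="$entry" -v lo=5 -v hi=10 'BEGIN {
    n = split(e, p, "[ -]")
    inside = (p[2] >= lo && p[3] <= hi) ? "inside" : "outside"
    print n, p[2], p[3], inside
}')

echo "$result"
```

The first number printed is how many pieces split() produced; the next two are the bounds it pulled out of the entry.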
 
Old 03-09-2012, 10:40 PM   #20
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
Thanks a lot grail, it works like a charm!

It didn't occur to me at all to look at the separator argument of split...

However, I am curious.
* What does the "+" mean in
Code:
BEGIN { FS="\t+"
* Why can you replace "print" at the end with "1"?

Thanks again, you've been a great teacher!
 
Old 03-10-2012, 03:02 AM   #21
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,255

Rep: Reputation: 2686
* and + are regex quantifiers that apply to the preceding item:

* - zero or more

+ - one or more

Since we know that all fields are separated by at least a single tab, we want one or more.

As for the "1": a pattern with no action gets awk's default action, which is to print the line, and any true value counts as a matching pattern. Since we want to print all lines, it is the shortest way of writing:
Code:
{ print }
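Both points can be checked with a quick sketch. The sample line below is invented for the example (two tabs in a row between "A2" and "x"), and the shell variable names are arbitrary:

```shell
# Hypothetical line with two consecutive tabs between "A2" and "x".
line=$(printf 'A2\t\tx')

# "\t+" treats the run of tabs as one separator.
plus=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "\t+" } { print NF }')

# "\t" treats every tab as a separator, leaving an empty field in between.
single=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "\t" } { print NF }')

# A bare true pattern such as 1 has no action, so awk runs the default
# action, which is print.
copy=$(printf 'a\nb\n' | awk '1')

echo "fields with \\t+: $plus, fields with \\t: $single"
echo "$copy"
```

With "\t+" the line has two fields, with "\t" it has three (the middle one empty), and the last command simply echoes its input unchanged.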
 
Old 03-10-2012, 03:44 AM   #22
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
Ok, I understand now.

Thanks!

One last question (I swear it is the last one...), more general and unrelated to this script.
This line:
Code:
NR > 1 && $1 ~ /-/ && $3 ~ /-/ && $4 ~ /-/
means for all lines starting from the 2nd one, we want fields 1, 3 and 4 to match with "-", right ?

How do you write that you want a field matching a "space", or "nothing" in case the field is empty?
Is it correct to write it like this:
Code:
$1 ~ /[ ""]/

Last edited by Trd300; 03-10-2012 at 03:50 AM.
 
Old 03-10-2012, 06:44 AM   #23
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,255

Rep: Reputation: 2686
Quote:
means for all lines starting from the 2nd one, we want fields 1, 3 and 4 to match with "-", right ?
Yes, but I would say that each of the three fields "contains" a '-'. It is probably picky, but if you said a field "matched" a '-', I would take that to mean the '-' is all that is in the field,
i.e. $3 == "-"
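A small sketch of the difference between the two tests, using three made-up field values (the labels printed are just for this example):

```shell
# For each sample value, report whether it contains a "-" (regex match)
# and whether it is exactly "-" (string equality).
verdicts=$(printf '%s\n' '-' '5-10' 'A2' | awk '{
    print $1, ($1 ~ /-/ ? "contains" : "no-dash"), ($1 == "-" ? "equals" : "differs")
}')

echo "$verdicts"
```

Only the first value passes both tests; "5-10" passes the regex test but not the equality test.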
Quote:
How do you write that you want a field matching a "space", or "nothing" in case the field is empty?
For a space you can just use / /, but "nothing" only makes sense if the delimiter is exact. For example, if there is exactly one comma between each
field, then two commas side by side indicate an empty field. (With FS = "\t+" a run of tabs counts as a single separator, so a field between two separators can never be empty.) To look for an empty field you would then just need:
Code:
$1 == ""
So to look for a field that is empty or contains a space:
Code:
$1 == "" || $1 ~ / /
However, this would also accept a $1 that merely contains a space somewhere, such as:
Code:
word1 word2
To accept only a field that is empty or exactly one space, you could look for:
Code:
$1 == "" || $1 ~ /^ $/
To combine these and also accept any number of spaces, you can do:
Code:
$1 ~ /^ *$/
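To compare the three tests side by side, here is a sketch that runs them against four invented values: an empty field, one space, three spaces, and "word1 word2". It assumes a tab as the field separator so that spaces stay inside $1; the awk variable names (loose, single, blank) are made up for the example:

```shell
report=$(printf '%s\n' '' ' ' '   ' 'word1 word2' | awk -F'\t' '{
    loose  = ($1 == "" || $1 ~ / /)   ? "y" : "n"   # empty, or a space anywhere
    single = ($1 == "" || $1 ~ /^ $/) ? "y" : "n"   # empty, or exactly one space
    blank  = ($1 ~ /^ *$/)            ? "y" : "n"   # empty, or only spaces
    print "[" $1 "]", loose, single, blank
}')

echo "$report"
```

Each output line shows the field in brackets followed by the verdict of the three tests in the order given above, so it is easy to see that only the last test treats "three spaces" as blank while still rejecting "word1 word2".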
 
Old 03-10-2012, 07:27 PM   #24
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
Thanks grail!

This chat was very helpful.
 
Old 03-11-2012, 12:55 AM   #25
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
%%%%%

Last edited by Trd300; 05-01-2012 at 05:35 AM.
 
Old 03-11-2012, 01:31 AM   #26
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,255

Rep: Reputation: 2686
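I too found that tabs can be an iffy choice when using the empty-field logic. The general reason seems to be (for me) that an editor, or copying and pasting from a web page, can silently turn a tab into a run of spaces (8 in some tools, 4 in my editor), and awk only splits on a real tab character, never on the spaces that look like one. This may also be your problem if your data does not have real tabs in between items.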

So looking at the data you supplied, I had to make it look like the following:
Code:
5-10	A2	AGE 6 8 text 1.; AGE 7 15 text 2.	SIZE 1 20 text 3.; SIZE 9 18 text 4.
12-22	B2	AGE 3 8 text 5.; AGE 14 20 text x.	SIZE 10 19 text 6.; SIZE 10 11 text 7.; SIZE 23 28 text 8.
20-40	C2		SIZE 34 46 text t.; SIZE 12 17 text y.; SIZE 25 36 text u.; SIZE 2 100 text i.; SIZE 16 22 text o.; SIZE 43 57 text p.
Here you will note the extra gap (two tabs in a row) between C2 and the SIZE text on the last line, which marks the empty third field. When you run the following:
Code:
awk -F"\t" '{print NF}' file
The output shows that each line now has 4 fields. Try this and see if it helps answer your questions.
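If it is unclear whether a file really contains tabs, a check along these lines may help. It is only a sketch: the two sample lines are invented (one with real tabs, one where the gaps are runs of spaces), and the variable names are arbitrary:

```shell
# Same visible content, but different separators (illustrative data).
tabbed=$(printf '20-40\tC2\t\tSIZE 34 46 text t.')
spaced='20-40    C2        SIZE 34 46 text t.'

# Field count when splitting on a single tab.
nf_tabbed=$(printf '%s\n' "$tabbed" | awk -F'\t' '{ print NF }')
nf_spaced=$(printf '%s\n' "$spaced" | awk -F'\t' '{ print NF }')

# Number of tab characters on the line; gsub returns how many it replaced.
tabs_spaced=$(printf '%s\n' "$spaced" | awk '{ print gsub(/\t/, "") }')

echo "tabbed: NF=$nf_tabbed  spaced: NF=$nf_spaced  tabs in spaced line: $tabs_spaced"
```

A line whose gaps are really spaces reports a single field and zero tabs, even though it looks identical on screen.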
 
Old 03-11-2012, 05:18 AM   #27
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
I don't really see any difference between your input and mine.

But I tried adding an extra gap for the empty field, and it doesn't change anything!
It is exactly the same as in my previous post, for all the conditions I tested before.

Usually I don't have any problem with tabs/spaces in my editor, it is weird...
 
Old 03-11-2012, 06:58 AM   #28
Trd300
Member
 
Registered: Feb 2012
Posts: 89

Original Poster
Rep: Reputation: Disabled
%%%%%

Last edited by Trd300; 05-01-2012 at 05:35 AM.
 
  


All times are GMT -5.