LinuxQuestions.org
Old 11-19-2008, 08:35 AM   #16
herveld
LQ Newbie
 
Registered: Nov 2008
Posts: 11

Original Poster
Rep: Reputation: 0

Quote:
Originally Posted by PTrenholme View Post

Output for your test files:
Code:
[tmp]$ gawk -v kmin=2 -v field=2 -f match2.awk file1 file2

Match:  file1   file2   Value
        4       3       bbb
        5       4       aaa
        6       5       ccc
Thanks again, looks pretty good!

Just wondering how come it gives no result for kmin=4 ?
Output should be :
Match: file1 file2 Value
3 2 aaa
4 3 bbb
5 4 aaa
6 5 ccc
 
Old 11-20-2008, 09:04 AM   #17
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,187

Rep: Reputation: 354Reputation: 354Reputation: 354Reputation: 354
Ah, my bad!

I started with the first program I wrote which matched single lines, and that logic constrains sequences to have unique values. I didn't change that part when I modified the code, so . . .

Anyhow, I'll have to think about it some more.

Note, please, that you're now seeing that programmers, like computers, do what they understand your problem to be, not what you think they understand. An example of why they're always complaining about clients "changing the specifications" during the course of a project. Most often, what they're really saying is that they and the client failed to understand each other at the start of the project.

Aw well, let me see if I can get it right. More later.
 
Old 11-24-2008, 09:29 PM   #18
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,187

Rep: Reputation: 354Reputation: 354Reputation: 354Reputation: 354
Sorry, this took a while.

Here's the output from my latest iteration on trying to guess what you needed. Note that I added a couple of lines to your test "file2" so you could see what happens when one sequence on the left matches several sequences on the right.

One thing that wasn't clear was whether you wanted to see only the maximal length matching sequences, or all matching sequences of at least kmin values. In this code I opted to show all the matching sequences.
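
To make "all the matching sequences" concrete, here is a minimal Python sketch (not the posted gawk program, and written in Python only for illustration). It assumes each file has already been reduced to a list of match values, and it uses the values from the two test files shown in this post. The name `common_runs` is hypothetical.

```python
def common_runs(left, right, kmin=2):
    """Return (left_row, right_row, length) for every run of at least
    kmin consecutive values that occurs in both lists (rows are 1-based)."""
    runs = []
    for i in range(len(left)):
        for length in range(kmin, len(left) - i + 1):
            candidate = left[i:i + length]
            hits = [j for j in range(len(right) - length + 1)
                    if right[j:j + length] == candidate]
            if not hits:
                break  # a longer run starting at row i cannot match either
            runs.extend((i + 1, j + 1, length) for j in hits)
    return runs

file1 = ["111", "222", "aaa", "bbb", "aaa", "ccc", "333"]
file2 = ["444", "aaa", "bbb", "aaa", "ccc", "555", "666", "aaa", "ccc"]
runs = common_runs(file1, file2, kmin=2)
for left_row, right_row, length in runs:
    print(left_row, right_row, file1[left_row - 1:left_row - 1 + length])
```

With these inputs the sketch finds seven runs, including both right-hand positions (rows 4 and 8) of the pair "aaa", "ccc".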

First, the test files:
Code:
$ cat file1
1 111
2 222
3 aaa
4 bbb
5 aaa
6 ccc
7 333
$ cat file2
1 444
2 aaa
3 bbb
4 aaa
5 ccc
6 555
7 666
8 aaa
9 ccc
And here's the output produced:
Code:
$ gawk -f match5 -v field=2 -v kmin=2 file1 file2
File "file1" contained 7 records.
File "file2" contained 9 records.

7 sequences matched between "file1" and "file2":

        Sequence 1:
                "aaa"   3       2
                "bbb"   4       3

        Sequence 2:
                "aaa"   3       2
                "bbb"   4       3
                "aaa"   5       4

        Sequence 3:
                "aaa"   3       2
                "bbb"   4       3
                "aaa"   5       4
                "ccc"   6       5

        Sequence 4:
                "bbb"   4       3
                "aaa"   5       4

        Sequence 5:
                "bbb"   4       3
                "aaa"   5       4
                "ccc"   6       5

        Sequence 6:
                "aaa"   5       4
                "ccc"   6       5

        Sequence 7:
                "aaa"   5       8
                "ccc"   6       9
I also tried it on a couple of larger files. What I used were two chapters of one of Anvil's novels. Note the RS= setting in the command line, which tells gawk to treat each word as a record. (Perhaps a quick and dirty check for plagiarism, eh? Although a "good" checker would need to be somewhat more sophisticated.)
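
For readers unfamiliar with a regular-expression record separator, here is a rough Python sketch of the same splitting idea. It assumes ASCII punctuation only, and the name `word_records` is hypothetical. Empty pieces still consume a record number but are skipped, as the gawk program does with empty records.

```python
import re
import string

# Rough stand-in for gawk's RS="[[:punct:] ]" (assumption: ASCII punctuation only)
SEPARATOR = re.compile("[" + re.escape(string.punctuation) + " ]")

def word_records(text):
    """Return (record_number, word) pairs, skipping empty records."""
    pieces = SEPARATOR.split(text)
    return [(number, piece)
            for number, piece in enumerate(pieces, start=1) if piece]

records = word_records("opened his mouth, and shut it.")
print(records)
```

A comma followed by a space produces an empty record between the two separators. In the gawk output that follows, the jump from row 2893 to row 2895 is consistent with such a skipped empty record.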
Code:
$ gawk -f match5 -v RS="[[:punct:] ]" -v kmin=5 t1 t2
File "t1" contained 3153 records.                                        
File "t2" contained 6786 records.                                        

4 sequences matched between "t1" and "t2":

        Sequence 1:
                "of"    2389    4585
                "what"  2390    4586
                "appeared"      2391    4587
                "to"    2392    4588
                "be"    2393    4589

        Sequence 2:
                "opened"        2891    851
                "his"   2892    852
                "mouth" 2893    853
                "and"   2895    854
                "shut"  2896    855

        Sequence 3:
                "opened"        2891    851
                "his"   2892    852
                "mouth" 2893    853
                "and"   2895    854
                "shut"  2896    855
                "it"    2897    856

        Sequence 4:
                "his"   2892    852
                "mouth" 2893    853
                "and"   2895    854
                "shut"  2896    855
                "it"    2897    856
Here's the gawk program that does the above. Note that this code uses GNU awk extensions and, therefore, may not be portable to other awk implementations.
Code:
###############################################################################
#
# genmatch - Replacement of match that (optionally) returns the matched
#            string(s) in an array
#
# WARNING: REGEXP must be a string, not a regular expression constant.
#
# Function return: Number of matches made
#
###############################################################################
function genmatch(TARGET,       # String in which to search for matches
                  REGEXP,       # Expression to use to identify target strings
                  MATCHED,      # Array to contain the matched strings
# Local variables
                target,         # Local copy of TARGET, modified as needed
                retv,           # "match" return array
                i,              # Loop index
                n)              # Number of matches made
{
  # Clean out the return array
  delete MATCHED;
  # Create a local copy of the target string
  target = (TARGET) ? TARGET : $0;
  n = 0;
  # For each match in target . . .
  while (match(target, REGEXP, retv)) {
    # Set the MATCHED value and increment the count
    MATCHED[++n] = retv[0];
    # And, for any matched sub-expressions, add them to MATCHED
    i = 0;
    while (retv[++i]) {
      MATCHED[n,i]=retv[i];
    }
    # Now remove the matched expression from the target string so we can find
    # the next match, if any.
    sub(REGEXP, "", target);
  }
  # All done. Return the number of matched strings found
  return n;
}
###############################################################################
#
# Function to find matched sequences in two files
#
# "Global" variables used:
#
#       kmin    Minimum number of matched values needed to define a "sequence"
#       sep     Non-numeric character which does not occur in any match target
#       records Number of records in each file
#       data    Match values extracted from each input file
#
# Note: More than one identical sequence may occur in any file.
#
# Function return: The number of matched sequences found
#
################################################################################
function match_seq(left,        # "left-hand" file name
                   right,       # "right-hand" file name
                   ret,         # Matched sequence(s)
        # Local variables
                   l_values,    # Left file string:line# pairs
                   nv,          # Number of values (work variable - meaning changes)
                   ns,          # Number of matched sequences found
                   test,        # Candidate sequence modified for search
                   source,      # Left file candidate sequence before modification
                   fields,      # match() return array
                   seq,         # Length of current sequence
                   seq_v,       # Values in current sequence
                   rows,        # Right file matched row number array
                   i,           # Loop index
                   j,           # Loop index
                   k,           # Temp integer
                   l,           # Loop index
                   ll,          # Loop index
                   n_l,         # Length of "left file"
                   n_r)         # Length of "right file"
{
  # Get the number of match fields in the two files
  n_l = records[left];
  n_r = records[right];
  # Initialize the number of matched sequences to zero
  seq = 0;
  delete ret;

  # Sanity check: Return "no match found" if either file is too short to contain ANY match sequence
  if ((n_l < kmin) || (n_r < kmin)) {
    return seq;
  }

  # Get the number of match values in the left file
  nv = split(data[left], l_values, sep);

  # Find all sequences in "left" matching any sequences in "right"
  test = "";
  source = "";
  for (l = 2; l < nv - 1; ++l) {
    if (!match(l_values[l], /^(.*):([[:digit:]]+)$/, fields)) { # This fills "fields" with the field value and record number
      printf("match_seq: Fatal error: Record %d of \"%s\" (\"%s\") could not be parsed.\n", l, left, l_values[l]);
      exit;
    }
    source = sep fields[1] ":" fields[2];
    k =  1;
    values[k] = fields[1];
    rows[k]   = fields[2];
    test = sep fields[1] ":([[:digit:]]+)";
    for (ll = l + 1; ll < nv; ++ll) {
      if (!match(l_values[ll], /^(.*):([[:digit:]]+)$/, fields)) {
        printf("match_seq: Fatal error: Record %d of \"%s\" (\"%s\") could not be parsed.\n", ll, left, gensub(sep, "|","g",l_values[ll]));
        exit;
      }
      test = test sep fields[1] ":([[:digit:]]+)";
      # No point in proceeding if the candidate sequence has failed.
      if (data[right] !~ test sep) {
        break;
      }
      source = source sep fields[1] ":" fields[2];
      values[++k] = fields[1];
      rows[k]     = fields[2];
      # Report only once the candidate (entries l through ll) holds at least kmin values
      if (ll >= l + kmin - 1) {
        # Add this sequence to the set of matched sequences if we can find it in "right"
        if (ns = genmatch(data[right], test sep, seq_v)) {
          for (i = 1; i <=ns; ++i) {
            ++seq;
            for (j = 1; seq_v[i,j]; ++j) {
              ret["value", seq, j] = values[j];
              ret["row", seq, j]   = rows[j];
              ret[seq,j] = seq_v[i,j];
            }
          }
        }
      }
    }
  }
  return seq;
}

BEGIN {
  if (!kmin)    kmin =3;        # Minimum number of consecutive matches (Default: 3)
  if (!field)   field=0;        # Field on which to match (Default: Whole line)
  sep  = SUBSEP;                # Non-printing subscript separation character
  if (out) {                    # "csv" matched sequence output file (if specified)
    printf("\"Left_File\",\"Right_File\",\"Left_Row\",\"Right_Row\",\"Value\"\n") > out;
  }
}

# Read the match field data from all input files into memory
{
  value = $ field;                                      # Value read from input record
  if (value == "") next;                                # Skip empty records;
  records[FILENAME] = FNR;                              # Input file names (and record count)
  data[FILENAME] = data[FILENAME] sep value ":" FNR;    # Copy of input file match fields with record
                                                        # numbers appended
}

# At this point we've read all the input into memory
END {
  # Count the input files and put their names in the "name" array
  nfiles = asorti(records, name);
  # Add a terminating separator to each match field copy
  for (i = 1; i <= nfiles; ++i) {
    data[name[i]] = data[name[i]] sep;
    printf("File \"%s\" contained %d records.\n", name[i], records[name[i]]);
  }
  # Find the matched sequences in each pair of input files
  for (i = 1; i < nfiles; ++i) {
    for (j = i + 1; j <= nfiles; ++j) {
      if (nm = match_seq(name[i], name[j], matched)) {
        printf("\n%d sequences matched between \"%s\" and \"%s\":\n", nm, name[i], name[j]);
        for (k = 1; k <= nm; ++k) {
          printf("\n\tSequence %d:\n", k);
          for (l=1; matched[k,l]; ++l) {
            printf("\t\t\"%s\"\t%d\t%d\n", matched["value",k,l], matched["row",k,l], matched[k,l]);
          }
        }
      }
    }
  }
}
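
The trick in the listing above is that each file is stored as one long string of value:row entries joined by a separator, and a candidate sequence becomes a regular expression whose capture groups pick up the right-hand row numbers. Here is a minimal Python sketch of that idea, not a translation of the gawk code. The names `encode` and `find_rows` are hypothetical, and the use of `re.escape` is an addition (the gawk code does not escape values).

```python
import re

SEP = "\x1c"  # awk's default SUBSEP character

def encode(values):
    """Join values as SEP value:row SEP value:row ... SEP."""
    return "".join(f"{SEP}{v}:{row}" for row, v in enumerate(values, start=1)) + SEP

def find_rows(candidate, encoded):
    """Return the captured row numbers for each occurrence of candidate."""
    pattern = "".join(f"{SEP}{re.escape(v)}:([0-9]+)" for v in candidate) + SEP
    found = []
    target = encoded
    while True:
        m = re.search(pattern, target)
        if not m:
            return found
        found.append([int(g) for g in m.groups()])
        # Remove the matched text so the next search finds the next occurrence
        target = target[:m.start()] + target[m.end():]

file2 = ["444", "aaa", "bbb", "aaa", "ccc", "555", "666", "aaa", "ccc"]
hits = find_rows(["aaa", "ccc"], encode(file2))
print(hits)
```

For the nine-line file2 above, the candidate "aaa", "ccc" is found at rows 4-5 and again at rows 8-9, which corresponds to the right-hand columns of Sequences 6 and 7.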

 
Old 11-26-2008, 02:21 PM   #19
herveld
LQ Newbie
 
Registered: Nov 2008
Posts: 11

Original Poster
Rep: Reputation: 0
Looks great, thanks a lot !

Is there a way to output only the maximal length matching sequence ?

Hence for the "file1 and file2" tests files, only "Sequence 3" would be output.

Last edited by herveld; 11-26-2008 at 02:23 PM.
 
Old 11-26-2008, 04:58 PM   #20
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,187

Rep: Reputation: 354Reputation: 354Reputation: 354Reputation: 354
My point was that, if that were done, you'd miss "Sequence 7," since rows 8 and 9, which I added to the end of your file2, match a sub-sequence of "Sequence 3." The code could be changed to handle all that, but it would probably involve more "housekeeping" arrays.

On the other hand, if your (unstated so far) requirements would exclude such matches, then it might be fairly simple to change. Of course, there's another problem: What about short sequences and longer sequences (somewhere else in the same file) that match the same sequence in the other file? (Of course, the shorter sequence would only match part of the sequence that the longer sequence matched.)

Have you tried the code on your large files? I'm afraid that the execution time may increase geometrically with file size, and it took several seconds to compare the two chapters from Anvil's book, so 50,000-record files might take several minutes - or longer - to finish.
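
One common way to cut down this kind of cost is to index every window of kmin consecutive values in one file and look up the windows of the other file in that index. The Python sketch below shows that substitute technique; it is not what the gawk program does, no timings are claimed for it, and the names `seed_index` and `seeded_matches` are hypothetical.

```python
from collections import defaultdict

def seed_index(values, kmin):
    """Map each window of kmin consecutive values to its 0-based start positions."""
    index = defaultdict(list)
    for j in range(len(values) - kmin + 1):
        index[tuple(values[j:j + kmin])].append(j)
    return index

def seeded_matches(left, right, kmin):
    """Return 1-based (left_row, right_row) pairs where kmin consecutive values agree."""
    index = seed_index(right, kmin)
    pairs = []
    for i in range(len(left) - kmin + 1):
        for j in index.get(tuple(left[i:i + kmin]), []):
            pairs.append((i + 1, j + 1))
    return pairs

file1 = ["111", "222", "aaa", "bbb", "aaa", "ccc", "333"]
file2 = ["444", "aaa", "bbb", "aaa", "ccc", "555", "666", "aaa", "ccc"]
pairs = seeded_matches(file1, file2, kmin=2)
print(pairs)
```

Each window lookup is a dictionary probe rather than a scan of the whole right-hand file. The resulting pairs are only starting points; extending them into longer runs would be a separate step.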
 
Old 11-27-2008, 05:04 PM   #21
herveld
LQ Newbie
 
Registered: Nov 2008
Posts: 11

Original Poster
Rep: Reputation: 0
In fact I don't really care about sub-sequences, only the maximal length matching sequence matters.

I tried your latest code on a 400,000-record file and a 300-record subset file, and it took about an hour to complete on my MacBook Pro. I set kmin=200 and got 7500 matching sequences (instead of only 1 maximal length matching sequence).
 
Old 11-27-2008, 07:58 PM   #22
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,187

Rep: Reputation: 354Reputation: 354Reputation: 354Reputation: 354
OK, but can you define what you mean by "match" and "maximal length?"

For example, "a b c d" matches "a b c e" as "a b c," but matches the first and second "a b c" in "a b c e d a b e a b c." So what should be reported as a match in the second case?

And consider "a b c d" against "a b c e x b c d y a b." Is "a b c" or "b c d" to be reported as the "maximal length" match?

A definition should be something like this:
Quote:
Let X be an ordered set of n elements (x[1], x[2], ... , x[n]) and Y another ordered set of m elements, (y[1], y[2], ..., y[m]) from some domain D.
Let S = {{(i,j)} | (x[i], x[i+1],..., x[j]) for i = 1, 2, ..., n-1 and j=i+1, i+2, ..., n}
and T = {{(r,t)} | (y[r], y[r+1],..., y[t]) for r = 1, 2, ..., m-1 and t=r+1, r+2, ..., m}

Denote by S(i,j) and T(r,t) specific elements of the S and T sets.

Then S(i,j) "matches" T(r,t) when j - i = t - r and x[i]=y[r], x[i+1]=y[r+1], ..., x[j] = y[t] and
there does not exist any element, S(u,v) of S such that S(i,j) is a subset of S(u,v) and . . .
The problem I have is finding out what, precisely, you want to use to fill in the ellipsis at the end of that sample definition.
 
Old 11-28-2008, 10:38 AM   #23
herveld
LQ Newbie
 
Registered: Nov 2008
Posts: 11

Original Poster
Rep: Reputation: 0
OK, let's say that in the case of equal-length matches, such as "a b c" in your first example and "a b c" & "b c d" in your second example, all of these matching sequences should be reported, provided there are no longer matching sequences.

However, "a b", "b c", or "c d" should not be reported even if kmin=2.

Does that answer your question ?
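
The rule stated above (report every match of the greatest common length, and nothing shorter) can be sketched in Python as follows. This uses a standard dynamic-programming table of common-suffix lengths as an illustration; it is not the gawk code from this thread, and the name `longest_common_runs` is hypothetical.

```python
def longest_common_runs(left, right, kmin=2):
    """Return (left_row, right_row, length) for every common run whose length
    equals the longest common run length, or [] if that length is below kmin."""
    best = 0
    ends = []                           # 1-based end positions of the best runs
    prev = [0] * (len(right) + 1)
    for i, x in enumerate(left, start=1):
        curr = [0] * (len(right) + 1)
        for j, y in enumerate(right, start=1):
            if x == y:
                curr[j] = prev[j - 1] + 1   # extend the run ending at (i-1, j-1)
                if curr[j] > best:
                    best, ends = curr[j], [(i, j)]
                elif curr[j] == best:
                    ends.append((i, j))
        prev = curr
    if best < kmin:
        return []
    return [(i - best + 1, j - best + 1, best) for i, j in ends]

first = longest_common_runs("a b c d".split(), "a b c e d a b e a b c".split())
second = longest_common_runs("a b c d".split(), "a b c e x b c d y a b".split())
print(first)
print(second)
```

For the first example this reports "a b c" at both right-hand positions (1 and 9). For the second it reports "a b c" and "b c d", and none of the two-element matches.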
 
Old 11-28-2008, 11:20 AM   #24
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,187

Rep: Reputation: 354Reputation: 354Reputation: 354Reputation: 354
OK, I'll give it a shot. (If my wife doesn't protest too much . . .)
 
Old 11-29-2008, 01:49 PM   #25
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,187

Rep: Reputation: 354Reputation: 354Reputation: 354Reputation: 354
OK, here's my attempt to read your mind.

First, here are four test files based on your two original files:
Code:
$ cat file1
1 111
2 222
3 aaa
4 bbb
5 aaa
6 ccc
7 333
$ cat file2
1 444
2 aaa
3 bbb
4 aaa
5 ccc
6 555
7 666
$ cat file3
1 444
2 aaa
3 bbb
4 aaa
5 ccc
6 555
7 666
8 aaa
9 ccc
$ cat file4
1 444
2 aaa
3 bbb
4 aaa
5 ccc
6 333
7 666
And here's the output. Please look closely at each pair of files to see if the results match what you expect.
Code:
$ gawk -f match7 -v field=2 -v kmin=2 file1 file2 file3 file4
File "file1" contained 7 records.
File "file2" contained 7 records.
File "file3" contained 9 records.
File "file4" contained 7 records.

1 sequence matched between "file1" and "file2":

        Sequence 1:
                "aaa"   3       2
                "bbb"   4       3
                "aaa"   5       4
                "ccc"   6       5

1 sequence matched between "file1" and "file3":

        Sequence 1:
                "aaa"   3       2
                "bbb"   4       3
                "aaa"   5       4
                "ccc"   6       5

1 sequence matched between "file1" and "file4":

        Sequence 1:
                "aaa"   3       2
                "bbb"   4       3
                "aaa"   5       4
                "ccc"   6       5
                "333"   7       6

1 sequence matched between "file2" and "file3":

        Sequence 1:
                "444"   1       1
                "aaa"   2       2
                "bbb"   3       3
                "aaa"   4       4
                "ccc"   5       5
                "555"   6       6
                "666"   7       7

1 sequence matched between "file2" and "file4":

        Sequence 1:
                "444"   1       1
                "aaa"   2       2
                "bbb"   3       3
                "aaa"   4       4
                "ccc"   5       5

2 sequences matched between "file3" and "file4":

        Sequence 1:
                "444"   1       1
                "aaa"   2       2
                "bbb"   3       3
                "aaa"   4       4
                "ccc"   5       5

        Sequence 2:
                "aaa"   8       4
                "ccc"   9       5
Here's the program listing:

Note: The CSV output option has NOT been tested.

Code:
$ cat match7
###############################################################################
#
# genmatch - Replacement of match that (optionally) returns the matched
#            string(s) in an array
#
# WARNING: REGEXP must be a string, not a regular expression constant.
#
# Function return: Number of matches made
#
###############################################################################
function genmatch(TARGET,       # String in which to search for matches
                  REGEXP,       # Expression to use to identify target strings
                  MATCHED,      # Array to contain the matched strings
# Local variables
                target,         # Local copy of TARGET, modified as needed
                retv,           # "match" return array
                i,              # Loop index
                n)              # Number of matches made
{
  # Clean out the return array
  delete MATCHED;
  # Create a local copy of the target string
  target = (TARGET) ? TARGET : $0;
  n = 0;
  # For each match in target . . .
  while (match(target, REGEXP, retv)) {
    # Set the MATCHED value and increment the count
    MATCHED[++n] = retv[0];
    # And, for any matched sub-expressions, add them to MATCHED
    i = 0;
    while (retv[++i]) {
      MATCHED[n,i]=retv[i];
    }
    # Now remove the matched expression from the target string so we can find
    # the next match, if any.
    sub(REGEXP, "", target);
  }
  # All done. Return the number of matched strings found
  return n;
}
###############################################################################
#
# Function to find matched sequences in two files
#
# "Global" variables used:
#
#       kmin    Minimum number of matched values needed to define a "sequence"
#       sep     Non-numeric character which does not occur in any match target
#       records Number of records in each file
#       data    Match values extracted from each input file
#
# Note: More than one identical sequence may occur in any file.
#
# Function return: The number of matched sequences found
#
################################################################################
function match_seq(left,        # "left-hand" file name
                   right,       # "right-hand" file name
                   ret,         # Matched sequence(s) structure
        # Local variables
                   l_values,    # Left file (string:line#) pairs
                   nv,          # Number of values (work variable - meaning changes)
                   ns,          # Number of matched sequences found
                   prior,       # Values in the sequence matched so far
                   source,      # Left file candidate sequence before modification to "test"
                   prior_source,# Value of "source" prior to last candidate sequence value addition
                   test,        # Candidate sequence modified for search
                   prior_test,  # Prior value of "test"
                   fields,      # match() return array
                   seq,         # Length of current candidate sequence
                   seq_v,       # Values in current candidate sequence
                   rows,        # Right file matched row number array
                   i,           # Loop index
                   j,           # Loop index
                   k,           # Temp integer
                   l,           # Loop index
                   ll,          # Loop index
                   n_l,         # Length of "left file"
                   n_r)         # Length of "right file"
{
  # Get the number of match fields in the two files
  n_l = records[left];
  n_r = records[right];
  # Initialize the number of matched sequences to zero
  seq = 0;
  delete ret;

  # Sanity check: Return "no match found" if either file is too short to contain ANY match sequence
  if ((n_l < kmin) || (n_r < kmin)) {
    return seq;
  }

  # Get the number of match values in the left file (plus 2, since there is a "sep" at each end of "data[left]")
  nv = split(data[left], l_values, sep);

  # Find all maximal sequences in "left" matching any sequences in "right"
  for (l = 2; l < nv - 1; ++l) {
    if (!match(l_values[l], /^(.*):([[:digit:]]+)$/, fields)) { # This fills "fields" with the field value and record number
      # This section should be unreachable for correctly prepared input
      printf("match_seq: Fatal error: Record %d of \"%s\" (\"%s\") could not be parsed.\n", l, left, l_values[l]);
      exit;
    }
    source = sep fields[1] ":" fields[2];
    prior_source = "";
    k =  1;
    values[k] = fields[1];
    rows[k]   = fields[2];
    test = sep fields[1] ":([[:digit:]]+)"
    prior_test = test;
    for (ll = l + 1; ll < nv; ++ll) {
      if (!match(l_values[ll], /^(.*):([[:digit:]]+)$/, fields)) {
        # This section should be unreachable if the input is prepared correctly
        printf("match_seq: Fatal error: Record %d of \"%s\" (\"%s\") could not be parsed.\n", ll, left, gensub(sep, "|","g",l_values[ll]));
        exit;
      }
      prior_test = test;
      test = test sep fields[1] ":([[:digit:]]+)";
      # Exit the inner loop if this candidate sequence fails to match in data[right]
      if (data[right] !~ test sep) {
        test = prior_test;
        source = prior_source;
        --k;
        break;
      }
      prior_source = source;
      source = source sep fields[1] ":" fields[2];
      values[++k] = fields[1];
      rows[k]     = fields[2];
    }
    # Is the candidate long enough?
    if (k + 2 > kmin) {
    # O.K., we've found a live one. Add it to the return structure.
      if (ns = genmatch(data[right], test sep, seq_v)) {
        for (i = 1; i <=ns; ++i) {
          ++seq;
          for (j = 1; seq_v[i,j]; ++j) {
            ret["value", seq, j] = values[j];
            ret["row", seq, j]   = rows[j];
            ret[seq,j] = seq_v[i,j];
          }
        }
        # Don't look for matches from inside a matched sequence.
        l = l + k;
      }
      else {
      # This section should never be reached since we've already verified the existence of at least one match
        printf("match_seq: Fatal error: Confirmed sequence \"%s\" not found in \"%s\".\n",
                gensub(/:\(\[\[:digit:\]\]\+\)\+\|/, "|", "g", gensub(sep, "|", "g", test sep)), right);
        exit;
      }
    }
  }
  return seq;
}

BEGIN {
  if (!kmin)    kmin =3;        # Minimum number of consecutive matches (Default: 3)
  if (!field)   field=0;        # Field on which to match (Default: Whole line)
  sep  = SUBSEP;                # Non-printing subscript separation character
  if (out) {                    # "csv" matched sequence output file (if specified)
    printf("\"Left_File\",\"Right_File\",\"Left_Row\",\"Right_Row\",\"Value\"\n") > out;
  }
}

# Read the match field data from all input files into memory
{
  value = $ field;                                      # Value read from input record
  if (value == "") next;                                # Skip empty records;
  records[FILENAME] = FNR;                              # Input file names (and record count)
  data[FILENAME] = data[FILENAME] sep value ":" FNR;    # Copy of input file match fields with record
                                                        # numbers appended
}

# At this point we've read all the input into memory
END {
  # Count the input files and put their names in the "name" array
  nfiles = asorti(records, name);
  # Add a terminating separator to each match field copy
  for (i = 1; i <= nfiles; ++i) {
    data[name[i]] = data[name[i]] sep;
    printf("File \"%s\" contained %d records.\n", name[i], records[name[i]]);
  }
  # Find the matched sequences in each pair of input files
  for (i = 1; i < nfiles; ++i) {
    for (j = i + 1; j <= nfiles; ++j) {
      if (nm = match_seq(name[i], name[j], matched)) {
        printf("\n%d sequence%s matched between \"%s\" and \"%s\":\n", nm, (nm==1)?"":"s", name[i], name[j]);
        for (k = 1; k <= nm; ++k) {
          printf("\n\tSequence %d:\n", k);
          for (l=1; matched[k,l]; ++l) {
            printf("\t\t\"%s\"\t%d\t%d\n", matched["value",k,l], matched["row",k,l], matched[k,l]);
          }
        }
      }
    }
  }
}
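
In outline, the strategy of the listing above is: start at each left-hand row, extend the candidate for as long as it still occurs somewhere in the right-hand file, report it if it is long enough, then skip past it. Here is a minimal Python sketch of roughly that strategy. It is an approximation for illustration, not a line-by-line translation, and the names `positions` and `maximal_runs` are hypothetical.

```python
def positions(run, values):
    """Return the 0-based start indexes at which run occurs in values."""
    n = len(run)
    return [j for j in range(len(values) - n + 1) if values[j:j + n] == run]

def maximal_runs(left, right, kmin=2):
    """Greedy scan: (left_row, right_row, length) for each maximal run, 1-based."""
    results = []
    i = 0
    while i < len(left):
        length = 0
        # Grow the candidate while it still occurs somewhere in right
        while i + length < len(left) and positions(left[i:i + length + 1], right):
            length += 1
        if length >= kmin:
            for j in positions(left[i:i + length], right):
                results.append((i + 1, j + 1, length))
            i += length      # do not restart inside a reported run
        else:
            i += 1
    return results

file1 = ["111", "222", "aaa", "bbb", "aaa", "ccc", "333"]
file3 = ["444", "aaa", "bbb", "aaa", "ccc", "555", "666", "aaa", "ccc"]
file4 = ["444", "aaa", "bbb", "aaa", "ccc", "333", "666"]
print(maximal_runs(file1, file3))
print(maximal_runs(file3, file4))
```

On these test files the sketch reports one run for file1 against file3 and two runs for file3 against file4, in line with the output shown earlier in this post. Because the scan skips past each reported run, the extra "aaa", "ccc" pair at rows 8 and 9 of file3 is not reported against file1.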
 
Old 12-01-2008, 03:35 PM   #26
herveld
LQ Newbie
 
Registered: Nov 2008
Posts: 11

Original Poster
Rep: Reputation: 0
Looks like a winner, thank you so much!

You are the king of gawk!
 
  


All times are GMT -5.