LinuxQuestions.org

srimal 10-21-2009 04:39 AM

Shell script to read lines in a text file and filter user data
 
Hi all,

I have a file, myfile.txt, with some user data.

Example:
$ cat myfile.txt
FName|LName|Gender|Company|Branch|Bday|Salary|Age
aaaa|bbbb|male|cccc|dddd|19900814|15000|20|
eeee|asdg|male|gggg|ksgu|19911216|||
aara|bdbm|male|kkkk|acke|19931018||23|
asad|kfjg|male|kkkc|gkgg|19921213|14000|24|
aera|bprb|male|cccc|pppp||15000|20|
.
.
. // and so on


What I want to do is write the records that have missing fields to a file, in the following format:

<FName> <LName> <Company> Missing Field/s:<> <>

example output:

eeee asdg gggg Missing Field/s: Salary Age
aara bdbm kkkk Missing Field/s: Salary

Can anyone help me, please?

druuna 10-21-2009 06:29 AM

Hi,

I have to make a few assumptions to solve this:

1) Fields 1, 2 and 4 (FName, LName and Company) are always present.
2) Any of the other fields could be missing.

I came up with this:
Code:

#!/bin/bash

awk '
BEGIN { FS = "|" }   # fields are pipe-separated
{
    misFields = ""   # reset for every record

    # Collect the names of the optional fields that are empty.
    if ( $3 == "" ) { misFields = misFields " Gender" }
    if ( $5 == "" ) { misFields = misFields " Branch" }
    if ( $6 == "" ) { misFields = misFields " Bday" }
    if ( $7 == "" ) { misFields = misFields " Salary" }
    if ( $8 == "" ) { misFields = misFields " Age" }

    # Print only the records that have at least one empty field.
    if ( misFields != "" ) { print $1, $2, $4, "Missing Field/s:" misFields }
}
' myfile.txt

Test run with the data given in the first post:
Quote:

$ cat myfile.txt
FName|LName|Gender|Company|Branch|Bday|Salary|Age
aaaa|bbbb|male|cccc|dddd|19900814|15000|20|
eeee|asdg|male|gggg|ksgu|19911216|||
aara|bdbm|male|kkkk|acke|19931018||23|
asad|kfjg|male|kkkc|gkgg|19921213|14000|24|
aera|bprb|male|cccc|pppp||15000|20|
$
$
$ ./blaat
eeee asdg gggg Missing Field/s: Salary Age
aara bdbm kkkk Missing Field/s: Salary
aera bprb cccc Missing Field/s: Bday

Hope this helps.
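
Since the first post asked for the result in a file, here is a minimal sketch of the same approach with the output redirected. The output name missing.txt is only an assumption for illustration, and the sample data is recreated inline so the snippet can be run on its own:

```shell
#!/bin/bash

# Recreate the sample data from the first post.
cat > myfile.txt <<'EOF'
FName|LName|Gender|Company|Branch|Bday|Salary|Age
aaaa|bbbb|male|cccc|dddd|19900814|15000|20|
eeee|asdg|male|gggg|ksgu|19911216|||
aara|bdbm|male|kkkk|acke|19931018||23|
asad|kfjg|male|kkkc|gkgg|19921213|14000|24|
aera|bprb|male|cccc|pppp||15000|20|
EOF

# Same checks as above; missing.txt is a hypothetical output file name.
awk -F'|' '
{
    misFields = ""
    if ($3 == "") misFields = misFields " Gender"
    if ($5 == "") misFields = misFields " Branch"
    if ($6 == "") misFields = misFields " Bday"
    if ($7 == "") misFields = misFields " Salary"
    if ($8 == "") misFields = misFields " Age"
    if (misFields != "") print $1, $2, $4, "Missing Field/s:" misFields
}' myfile.txt > missing.txt

cat missing.txt
```

The header row is not reported because none of its fields are empty. Note that ">" overwrites missing.txt on every run; ">>" would append instead.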

srimal 10-21-2009 07:47 AM

Thank you
 
Thank you very much for your great help.
:D

ghostdog74 10-21-2009 08:04 AM

Code:

awk 'BEGIN{FS="|"}
NR==1{
    for(o=1;o<=NF;o++){
        a[o]=$o
    }
    next
}
{
    missing="";f=0
    for (i=1;i<=NF;i++){
        if (!$i){
            missing=missing" "a[i]
            f=1
        }
    }
    if(f){ print $1,$2,$4,missing}
}' file


druuna 10-21-2009 08:10 AM

@ghostdog74:

Nice solution, but...

1) It prints all the lines, not just the faulty ones,
2) The "Missing Field/s:" part is missing in your output.
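
A likely cause of point 1 with the sample data: every data row ends in a trailing "|", so awk sees an empty ninth field there, while the header row has only eight. A quick sketch to check the field counts (sample.txt and nf.txt are hypothetical file names; the two lines are taken from the first post):

```shell
# Two lines from the sample data: the header and the first data row.
printf '%s\n' \
  'FName|LName|Gender|Company|Branch|Bday|Salary|Age' \
  'aaaa|bbbb|male|cccc|dddd|19900814|15000|20|' > sample.txt

# Print the number of fields awk sees on each line.
awk -F'|' '{ print NR ": " NF " fields" }' sample.txt > nf.txt

cat nf.txt
```

With the loop running up to NF, that empty ninth field is flagged on every data row, even on rows where all eight named fields are filled in.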

ghostdog74 10-21-2009 08:41 AM

Quote:

Originally Posted by druuna (Post 3727165)
1) It prints all the lines, not just the faulty ones,
2) The "Missing Field/s:" part is missing in your output.

1) is trivial to solve by rearranging where the print statement is.
2) I left the output unformatted for the OP to finish if he wishes. What is important is the logic.
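
For reference, here is one way the two approaches in this thread could be combined. It is a sketch under assumptions, not either poster's final script: column names come from the header row, only the columns named there are checked, and the test is `$i == ""` rather than `!$i` so that a field holding 0 would not count as missing. The names ncols, name and report.txt are made up for illustration.

```shell
#!/bin/bash

# Recreate the sample data from the first post.
cat > myfile.txt <<'EOF'
FName|LName|Gender|Company|Branch|Bday|Salary|Age
aaaa|bbbb|male|cccc|dddd|19900814|15000|20|
eeee|asdg|male|gggg|ksgu|19911216|||
aara|bdbm|male|kkkk|acke|19931018||23|
asad|kfjg|male|kkkc|gkgg|19921213|14000|24|
aera|bprb|male|cccc|pppp||15000|20|
EOF

awk -F'|' '
NR == 1 {
    # Remember the column names and how many columns the header has.
    ncols = NF
    for (i = 1; i <= NF; i++) name[i] = $i
    next
}
{
    missing = ""
    # Check only the columns named in the header; this skips the empty
    # trailing field produced by the final "|" on each data row.
    for (i = 1; i <= ncols; i++)
        if ($i == "") missing = missing " " name[i]
    if (missing != "") print $1, $2, $4, "Missing Field/s:" missing
}' myfile.txt > report.txt

cat report.txt
```

The benefit over hard-coded field numbers is that added or renamed columns are picked up from the header. The printed columns ($1, $2, $4) still assume the layout from the first post.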


All times are GMT -5.