LinuxQuestions.org
Old 10-29-2009, 07:12 AM   #1
srimal
LQ Newbie
 
Registered: Oct 2009
Posts: 5

Rep: Reputation: 0
Shell script to read user data separated by empty lines in a text file and filter it


Hi all,
I have a file filled with user data in the following format. The fields of each user are not in a fixed order, and the users are separated by an empty line. I need to filter the data as shown below.

$ cat myfile.txt
Code:
userid: ASDFG
userNo: 200834570000Z
SupName: ALSKFJDENC2
PostDetail: SUP_COLOMBO_INCIDENT_MAN|20080827|99991231|
PostDetail: SUP_KANDY_INCIDENT_MAN|20080827|99991231|
creator : ddddd

userid: KMVNBBCMX
SupName: KSJMNCBBXW3
creator: ssss
userNo: 209738270000Z
PostDetail: SUP_KEGALLE_INCIDENT_MAN|20080827|99991231|

userNo: 234235345358Z
SupName: MLAHSRTXVV4
userid: MLDKSURNTNVMXJ
PostDetail: SUP_KEGALLE_INCIDENT_MAN|20080827|99991231|
Output format:
Quote:
userid|SupName|userNo|creator|PostDetail(first part)|PostDetail(first part)
My expected output:
Quote:
ASDFG|ALSKFJDENC2|200834570000Z|ddddd|SUP_COLOMBO_INCIDENT_MAN|SUP_KANDY_INCIDENT_MAN
KMVNBBCMX|KSJMNCBBXW3|209738270000Z|ssss|SUP_KEGALLE_INCIDENT_MAN
MLDKSURNTNVMXJ|MLAHSRTXVV4|234235345358Z||SUP_KEGALLE_INCIDENT_MAN
Can anyone guide me to a solution, please?
Thanks!

Last edited by srimal; 10-29-2009 at 07:15 AM.
 
Old 10-29-2009, 08:39 AM   #2
AngTheo789
Member
 
Registered: Sep 2009
Posts: 110

Rep: Reputation: 24
A shell script usually means something scripted in Bash or Ksh etc, but you can actually use many other scripting languages as well (like PHP, Perl or Python) and launch these on the command shell. You need to decide what scripting language you want to use - usually the one you know best.
 
Old 10-29-2009, 01:22 PM   #3
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,576
Blog Entries: 31

Rep: Reputation: 1195
Code:
#!/bin/bash
shopt -s extglob

PostDetail=''
while read line
do
    buf="${line%%:*}"
    keyword="${buf%% *}"
    buf="${line#*:}"
    data="${buf##*( )}"
    case "$keyword" in
        'userid' )
            userid="$data"
            ;;
        'userNo' )
            userNo="$data"
            ;;
        'SupName' )
            SupName="$data"
            ;;
        'PostDetail' )
            PostDetail="$PostDetail|${data%%|*}"
            ;;
        'creator' )
            creator="$data"
            ;;
        '' )
            echo "$userid|$SupName|$userNo|$creator$PostDetail"
            PostDetail=''
            ;;
    esac
done < input.txt
echo "$userid|$SupName|$userNo|$creator$PostDetail"
 
Old 10-30-2009, 06:51 AM   #4
srimal
LQ Newbie
 
Registered: Oct 2009
Posts: 5

Original Poster
Rep: Reputation: 0
RE:

What is meant by
Code:
shopt -s extglob
I got an error like
Code:
shopt: not found
In the output, the last user has a |ssss| entry instead of ||.
Anyway, many thanks for the guide. That'll be a great base for me to build on.
 
Old 10-30-2009, 09:57 AM   #5
bartonski
Member
 
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Blog Entries: 1

Rep: Reputation: 47
Quote:
Originally Posted by AngTheo789
A shell script usually means something scripted in Bash or Ksh etc, but you can actually use many other scripting languages as well (like PHP, Perl or Python) and launch these on the command shell. You need to decide what scripting language you want to use - usually the one you know best.
All of the languages Theo mentioned have a data structure called a 'hash' or 'associative array', which is particularly well suited to this type of text manipulation. A hash is essentially an array whose members are indexed by text strings rather than by numbers. You lose the built-in ordering of the array, but get an enormous amount of flexibility in return.

Here's my implementation in Perl:

Code:
#! /usr/bin/perl

# Print one record in the fixed field order, joined by '|'.
sub printd {
    my @d;
    for $key ( qw(userid SupName userNo creator PostDetail) ){
        $h{$key} ||= "";
        push @d, $h{$key};
    }
    print join( '|', @d ) . "\n";
}

while(<>){
    chomp;
    if( ! /:/ ){
        printd;
        %h = (); # clear data hash after we've printed a section.
    } elsif ( /^PostDetail\s*:\s*([^|]+)\|/ ) {
        # keep only the first '|'-separated part of each PostDetail line
        $h{"PostDetail"} = defined $h{"PostDetail"} ? $h{"PostDetail"} . "|$1" : $1;
    } elsif ( /^([^:\s]+)\s*:\s*(.*)/ ) {
        # allow optional spaces around the colon, e.g. "creator : ddddd"
        $h{$1} = $2;
    }
}
printd;
In this case '%h' is my hash. The key to the hash (i.e. the string which acts as an index) is the word before the colon in each case.

The strength of this approach is that you don't have to create an enormous case statement to handle each one of your fields. With a limited data set such as yours, this is not a tremendous win, but if you had a few hundred fields, you would only have to expand the qw() expression within printd().

Note, for the sake of brevity, I've removed a number of perl's safety features such as 'use strict' and 'use warnings'. Don't do that in any code that you care about.
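
For comparison, the same hash idea can be sketched in Bash itself, assuming bash 4.0 or later (associative arrays via 'declare -A' / 'local -A' are not available in older versions or in plain sh). The function name format_record is made up for this illustration, and it handles a single record on standard input; splitting the file into records on empty lines is left out to keep the sketch short.

```shell
#!/bin/bash
# Sketch only: requires bash 4.0+ for associative arrays.

# format_record (hypothetical name): read the "key: value" lines of ONE
# record on stdin and print the fields in a fixed order, joined by '|'.
format_record() {
    local -A h=()                 # the hash: field name -> value
    local line key value out
    while read -r line; do
        [[ $line == *:* ]] || continue
        key=${line%%:*}
        key=${key%% *}                       # drop a space before the colon
        value=${line#*:}
        value=${value#"${value%%[! ]*}"}     # drop leading spaces
        if [[ $key == PostDetail ]]; then
            value=${value%%|*}               # keep the first part only
            h[$key]="${h[$key]:+${h[$key]}|}$value"
        else
            h[$key]=$value
        fi
    done
    out=
    for key in userid SupName userNo creator PostDetail; do
        out+="${h[$key]:-}|"
    done
    printf '%s\n' "${out%|}"      # remove the one trailing separator
}

format_record <<'EOF'
userNo: 234235345358Z
SupName: MLAHSRTXVV4
userid: MLDKSURNTNVMXJ
PostDetail: SUP_KEGALLE_INCIDENT_MAN|20080827|99991231|
EOF
```

As in the Perl version, adding a field only means adding its name to the list in the for loop.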
 
Old 10-31-2009, 12:05 PM   #6
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,576
Blog Entries: 31

Rep: Reputation: 1195
Quote:
Originally Posted by srimal
What is meant by
Code:
shopt -s extglob
I got an error like
Code:
shopt: not found
In the output, the last user has a |ssss| entry instead of ||.
Anyway, many thanks for the guide. That'll be a great base for me to build on.
What is the first line of your script? #!/bin/bash or something else? Details of bash's shopt built-in are at the GNU Bash Reference Manual.

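Here's a corrected version of the script. It also clears creator after each record is printed, so a record without a creator line no longer inherits the previous record's value: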
Code:
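#!/bin/bash
shopt -s extglob    # needed for the *( ) pattern below

userid='' userNo='' SupName='' creator='' PostDetail=''
while read -r line
do
    buf="${line%%:*}"
    keyword="${buf%% *}"    # field name, without any space before the colon
    buf="${line#*:}"
    data="${buf##*( )}"     # value, with leading spaces stripped
    case "$keyword" in
        'userid' )
            userid="$data"
            ;;
        'userNo' )
            userNo="$data"
            ;;
        'SupName' )
            SupName="$data"
            ;;
        'PostDetail' )
            # keep only the first '|'-separated part
            PostDetail="$PostDetail|${data%%|*}"
            ;;
        'creator' )
            creator="$data"
            ;;
        '' )
            # empty line: the record is complete, so print it and reset every field
            echo "$userid|$SupName|$userNo|$creator$PostDetail"
            userid='' userNo='' SupName='' creator='' PostDetail=''
            ;;
    esac
done < input.txt
echo "$userid|$SupName|$userNo|$creator$PostDetail"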
 
Old 10-31-2009, 05:43 PM   #7
bartonski
Member
 
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Blog Entries: 1

Rep: Reputation: 47
... started composing this yesterday...

ps. Kudos to catkin; that is a fairly elegant piece of shell scripting. One thing puzzles me: just for kicks, I changed the hashbang line to '#! /bin/sh', and the script ran correctly. If I'm not mistaken, parameter expansion is not portable to POSIX, yet it works here... what's up with that?

Last edited by bartonski; 10-31-2009 at 05:51 PM.
 
Old 11-01-2009, 05:37 AM   #8
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,576
Blog Entries: 31

Rep: Reputation: 1195
Quote:
Originally Posted by bartonski
ps. Kudos to catkin; that is a fairly elegant piece of shell scripting. One thing puzzles me: just for kicks, I changed the hashbang line to '#! /bin/sh', and the script ran correctly. If I'm not mistaken, parameter expansion is not portable to POSIX, yet it works here... what's up with that?
Well, thank you bartonski, you are very kind and you raise an interesting question ...

According to the GNU Bash Reference Manual's section on invoking bash, "If Bash is invoked with the name sh, it tries to mimic the startup behavior of historical versions of sh as closely as possible, while conforming to the POSIX standard as well". That's quite a juggling act because the historical Bourne Shell is a long way from POSIX compliance!

The GNU Bash Reference Manual's section on bash POSIX mode lists many differences between bash in native mode and bash in POSIX mode, but it does not mention parameter expansion (except as regards $PS1 and $PS2), shopt (except as regards $PATH), extglob or filename expansion (except as regards ~ and redirection).

Surprising? I was surprised. Why, then, does srimal's shell report "shopt: not found"? Which shell is it? It will be interesting to learn what the first line of the script is.
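
One rough way to narrow this down: bash sets the BASH_VERSION variable even when it is invoked as sh, so a script can report whether bash is actually interpreting it. This is only a sketch, and the function name current_shell is made up for the illustration. Where /bin/sh is a different shell such as dash, there is no shopt built-in at all, which would be consistent with a "shopt: not found" message.

```shell
# current_shell (hypothetical name): report whether bash is running this code.
# BASH_VERSION is set by bash itself, including when bash is started as "sh".
current_shell() {
    if [ -n "${BASH_VERSION:-}" ]; then
        echo "bash ${BASH_VERSION}"
    else
        echo "not bash"
    fi
}

current_shell
```

Putting the call near the top of the script, and running the script exactly the way it is normally run, shows which interpreter the first line really selects.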

BTW, the shopt -s extglob is required for "${buf##*( )}":
Code:
c:~$ buf='    abc'
c:~$ echo "'${buf##*( )}'"
'    abc'
c:~$ shopt -s extglob
c:~$ echo "'${buf##*( )}'"
'abc'
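
If the script has to run under a shell without shopt, the leading spaces can probably be stripped without extglob. The %, %%, # and ## forms of parameter expansion are specified by POSIX; it is only the *( ) pattern that is a bash extension. A minimal sketch, where the function name ltrim is made up for this illustration:

```shell
# ltrim (hypothetical name): print the argument with leading spaces removed,
# using only POSIX parameter expansion (no extglob, no external commands).
ltrim() {
    s=$1
    # ${s%%[! ]*} is the run of leading spaces; strip exactly that prefix.
    printf '%s\n' "${s#"${s%%[! ]*}"}"
}

ltrim '    abc'    # prints: abc
```

In the script, the equivalent change would be data="${buf#"${buf%%[! ]*}"}" in place of the *( ) form, after which the shopt line is no longer needed.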
 
  


All times are GMT -5. The time now is 02:16 AM.
