10-29-2009, 06:12 AM
|
#1
|
LQ Newbie
Registered: Oct 2009
Posts: 5
Rep:
|
Shell script to read user data separated by empty lines in a text file and filter it
Hi all,
I have a file filled with user data in the following format. The fields of each user are not in a fixed order, and the users are separated by an empty line. I need to filter them out as shown below.
$ cat myfile.txt
Code:
userid: ASDFG
userNo: 200834570000Z
SupName: ALSKFJDENC2
PostDetail: SUP_COLOMBO_INCIDENT_MAN|20080827|99991231|
PostDetail: SUP_KANDY_INCIDENT_MAN|20080827|99991231|
creator : ddddd

userid: KMVNBBCMX
SupName: KSJMNCBBXW3
creator: ssss
userNo: 209738270000Z
PostDetail: SUP_KEGALLE_INCIDENT_MAN|20080827|99991231|

userNo: 234235345358Z
SupName: MLAHSRTXVV4
userid: MLDKSURNTNVMXJ
PostDetail: SUP_KEGALLE_INCIDENT_MAN|20080827|99991231|
Output format:
Quote:
userid|SupName|userNo|creator|PostDetail(first part)|PostDetail(first part)
|
My expected output:
Quote:
ASDFG|ALSKFJDENC2|200834570000Z|ddddd|SUP_COLOMBO_INCIDENT_MAN|SUP_KANDY_INCIDENT_MAN
KMVNBBCMX|KSJMNCBBXW3|209738270000Z|ssss|SUP_KEGALLE_INCIDENT_MAN
MLDKSURNTNVMXJ|MLAHSRTXVV4|234235345358Z||SUP_KEGALLE_INCIDENT_MAN
|
Can anyone guide me to a solution, please?
Thanks!
Last edited by srimal; 10-29-2009 at 06:15 AM.
|
|
|
10-29-2009, 07:39 AM
|
#2
|
Member
Registered: Sep 2009
Posts: 110
Rep:
|
A shell script usually means something scripted in Bash or Ksh etc, but you can actually use many other scripting languages as well (like PHP, Perl or Python) and launch these on the command shell. You need to decide what scripting language you want to use - usually the one you know best.
|
|
|
10-29-2009, 12:22 PM
|
#3
|
LQ 5k Club
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
|
Code:
#!/bin/bash
shopt -s extglob
PostDetail=''
while read line
do
    buf="${line%%:*}"
    keyword="${buf%% *}"
    buf="${line#*:}"
    data="${buf##*( )}"
    case "$keyword" in
        'userid' )
            userid="$data"
            ;;
        'userNo' )
            userNo="$data"
            ;;
        'SupName' )
            SupName="$data"
            ;;
        'PostDetail' )
            PostDetail="$PostDetail|${data%%|*}"
            ;;
        'creator' )
            creator="$data"
            ;;
        '' )
            echo "$userid|$SupName|$userNo|$creator$PostDetail"
            PostDetail=''
            ;;
    esac
done < input.txt
echo "$userid|$SupName|$userNo|$creator$PostDetail"
|
|
|
10-30-2009, 05:51 AM
|
#4
|
LQ Newbie
Registered: Oct 2009
Posts: 5
Original Poster
Rep:
|
RE:
What is meant by shopt? I got an error like "shopt: not found". In the output, the last user has a |ssss| entry instead of ||.
Anyway, many thanks for the guide. That'll be a great base for me to build on.

|
|
|
10-30-2009, 08:57 AM
|
#5
|
Member
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Rep:
|
Quote:
Originally Posted by AngTheo789
A shell script usually means something scripted in Bash or Ksh etc, but you can actually use many other scripting languages as well (like PHP, Perl or Python) and launch these on the command shell. You need to decide what scripting language you want to use - usually the one you know best.
|
All of the languages Theo mentioned have a data structure called a 'hash' or 'associative array', which is particularly well suited to this type of text manipulation. A hash is essentially an array which uses text fields to index the members of the array, rather than numbers. You lose the built-in ordering of the array, but get an enormous amount of flexibility in return.
Here's my implementation in Perl:
Code:
#! /usr/bin/perl
# Print the current record as one '|'-separated line.
sub printd {
    my @d;
    for $key ( qw(userid SupName userNo creator PostDetail) ){
        $h{$key} ||= "";
        push @d, $h{$key};
    }
    print join( '|', @d ) . "\n";
}
while(<>){
    chomp;
    if( ! /:/ ){
        printd;
        %h = (); # clear data hash after we've printed a section.
    } elsif ( /^PostDetail: ([^|]+)\|/ ) {
        $h{"PostDetail"} = defined $h{"PostDetail"} ? $h{"PostDetail"} . "|$1" : $1;
    } else {
        # Allow spaces around the colon, so "creator : ddddd" yields the key "creator".
        $_ =~ /^([^:]+?)\s*:\s*(.*)/;
        $h{$1} = $2;
    }
}
printd;
In this case '%h' is my hash. The key to the hash (i.e., the string which acts as an index) is the word before the colon, in each case.
The strength of this approach is that you don't have to create an enormous case statement to handle each one of your fields. With a limited data set such as yours, this is not a tremendous win, but if you had a few hundred fields, you would only have to expand the qw() expression within printd().
Note: for the sake of brevity, I've left out a number of Perl's safety features such as 'use strict' and 'use warnings'. Don't do that in any code that you care about.
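The same hash-based idea can also be sketched in bash itself. This is only a sketch, and it assumes bash 4.0 or later, which is where `declare -A` associative arrays became available. The names `fields`, `rec`, `print_record` and `parse_records` are made up for this illustration and do not come from the posts above.

```shell
#!/bin/bash
# Sketch: parse blank-line-separated "key: value" records with a bash
# associative array (assumes bash 4.0 or later).

fields=(userid SupName userNo creator PostDetail)   # output column order
declare -A rec=()                                   # fields of the current record

print_record() {
    local out="" key
    for key in "${fields[@]}"; do
        out+="${rec[$key]:-}|"          # missing fields become empty columns
    done
    printf '%s\n' "${out%|}"            # drop the single trailing separator
}

parse_records() {
    local line key value
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line != *:* ]]; then
            # A line without a colon ends the record; skip empty records.
            (( ${#rec[@]} )) && print_record
            rec=()
            continue
        fi
        key=${line%%:*}
        key=${key%% *}                       # "creator :" -> "creator"
        value=${line#*:}
        value=${value#"${value%%[! ]*}"}     # strip leading spaces
        if [[ $key == PostDetail ]]; then
            value=${value%%|*}               # keep only the first part
            rec[PostDetail]+="${rec[PostDetail]:+|}$value"
        else
            rec[$key]=$value
        fi
    done
    (( ${#rec[@]} )) && print_record         # last record has no blank line after it
    return 0
}
```

Called as `parse_records < myfile.txt`, it prints one line per record. As with the qw() list in the Perl version, supporting a new field only means adding its name to `fields`.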
|
|
|
10-31-2009, 11:05 AM
|
#6
|
LQ 5k Club
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
|
Quote:
Originally Posted by srimal
What is meant by shopt? I got an error like "shopt: not found". In the output, the last user has a |ssss| entry instead of ||.
Anyway, many thanks for the guide. That'll be a great base for me to build on.

|
What is the first line of your script? #!/bin/bash or something else? Details of bash's shopt built-in are in the GNU Bash Reference Manual.
Here's a corrected version of the script. The only change is that creator is now cleared after each record is printed:
Code:
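#!/bin/bash
shopt -s extglob    # needed for the *( ) pattern used below
PostDetail=''
while read line
do
    # Keyword: text before the first colon, minus anything after a space
    buf="${line%%:*}"
    keyword="${buf%% *}"
    # Data: text after the first colon, minus leading spaces
    buf="${line#*:}"
    data="${buf##*( )}"
    case "$keyword" in
        'userid' )
            userid="$data"
            ;;
        'userNo' )
            userNo="$data"
            ;;
        'SupName' )
            SupName="$data"
            ;;
        'PostDetail' )
            PostDetail="$PostDetail|${data%%|*}"
            ;;
        'creator' )
            creator="$data"
            ;;
        '' )
            # Empty line: print the record, then clear the optional fields
            echo "$userid|$SupName|$userNo|$creator$PostDetail"
            PostDetail=''
            creator=''
            ;;
    esac
done < input.txt
# The last record is not followed by an empty line, so print it here
echo "$userid|$SupName|$userNo|$creator$PostDetail"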
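The stray |ssss| came from a shell variable keeping its value until it is assigned again, so a record without a creator line inherits the creator of the previous record. A minimal sketch of that effect, using a made-up helper `join_fields` and values taken from the sample data:

```shell
#!/bin/bash
# Sketch of the stale-value problem: a variable set while reading one
# record keeps its value for the next record unless it is cleared.

join_fields() {    # hypothetical helper: prints "userid|creator"
    printf '%s|%s\n' "$userid" "$creator"
}

userid='KMVNBBCMX'; creator='ssss'   # second record has a creator line
join_fields                          # prints KMVNBBCMX|ssss

userid='MLDKSURNTNVMXJ'              # third record has no creator line
join_fields                          # prints MLDKSURNTNVMXJ|ssss (stale)

creator=''                           # clearing at the record boundary
join_fields                          # prints MLDKSURNTNVMXJ|
```

The same reasoning would apply to userid, SupName and userNo if any of them could be missing from a record.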
|
|
|
10-31-2009, 04:43 PM
|
#7
|
Member
Registered: Jul 2006
Location: Louisville, KY
Distribution: Fedora 12, Slackware, Debian, Ubuntu Karmic, FreeBSD 7.1
Posts: 443
Rep:
|
... started composing this yesterday...
P.S. Kudos to katkin; that is a fairly elegant piece of shell scripting. One thing puzzles me: just for kicks, I changed the hashbang line to '#! /bin/sh', and the script still ran correctly. If I'm not mistaken, parameter expansion is not portable to POSIX, yet it works here... what's up with that?
Last edited by bartonski; 10-31-2009 at 04:51 PM.
|
|
|
11-01-2009, 04:37 AM
|
#8
|
LQ 5k Club
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
|
Quote:
Originally Posted by bartonski
P.S. Kudos to katkin; that is a fairly elegant piece of shell scripting. One thing puzzles me: just for kicks, I changed the hashbang line to '#! /bin/sh', and the script still ran correctly. If I'm not mistaken, parameter expansion is not portable to POSIX, yet it works here... what's up with that?
|
Well, thank you bartonski, you are very kind, and you raise an interesting question ...
According to the GNU Bash Reference Manual's section on invoking bash, "If Bash is invoked with the name sh, it tries to mimic the startup behavior of historical versions of sh as closely as possible, while conforming to the POSIX standard as well". That's quite a juggling act because the historical Bourne Shell is a long way from POSIX compliance!
The GNU Bash Reference Manual's section on bash POSIX mode lists many differences between bash in native mode and bash in POSIX mode, but it does not mention parameter expansion (except as regards $PS1 and $PS2), shopt (except as regards $PATH), extglob or filename expansion (except as regards ~ and redirection).
Surprising? I was surprised. Why, then, does srimal's shell report "shopt: not found"? Which shell is it? It will be interesting to learn what the first line of the script is.
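One way to find out which shell is interpreting a script is to test for the BASH_VERSION variable. Bash sets it even when invoked as sh, while other shells normally leave it unset. A minimal sketch follows; the helper name `running_under_bash` is made up for this illustration.

```shell
#!/bin/sh
# Sketch: report whether the interpreting shell is bash, so that
# bash-only features such as shopt can be expected to work.

running_under_bash() {
    # BASH_VERSION is set by bash, including when it is invoked as sh;
    # dash and other shells normally leave it unset.
    [ -n "${BASH_VERSION:-}" ]
}

if running_under_bash; then
    echo "bash $BASH_VERSION: shopt is available"
else
    echo "not bash: try running the script with 'bash script.sh'" >&2
fi
```

On systems where /bin/sh is dash rather than bash (Ubuntu is one example), shopt does not exist, which would be consistent with a "shopt: not found" message. Running `ls -l /bin/sh` usually shows which shell sh points to.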
BTW, the shopt -s extglob is required for "${buf##*( )}" to strip the leading spaces:
Code:
c:~$ buf=' abc'
c:~$ echo "'${buf##*( )}'"
' abc'
c:~$ shopt -s extglob
c:~$ echo "'${buf##*( )}'"
'abc'
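For comparison, the `${var#pattern}`, `${var##pattern}`, `${var%pattern}` and `${var%%pattern}` forms are themselves specified by POSIX. Only the `*( )` extended pattern is a bash/ksh extension. A sketch of stripping leading spaces with POSIX expansions only, so that it should also run under dash; the function name `ltrim` is made up for this illustration.

```shell
#!/bin/sh
# Sketch: strip leading spaces without extglob, using only POSIX
# parameter expansion.

ltrim() {
    # ${1%%[! ]*} removes everything from the first non-space character
    # onward, leaving only the run of leading spaces; that run is then
    # removed from the front of the original string.
    printf '%s\n' "${1#"${1%%[! ]*}"}"
}

ltrim '   abc'     # prints abc
ltrim 'a b'        # prints a b (inner spaces are kept)
```

In the script above, `data="${buf#"${buf%%[! ]*}"}"` would be the equivalent of `data="${buf##*( )}"`, with no need for shopt.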
|
|
|
All times are GMT -5.