11-27-2009, 01:11 AM
|
#1
|
|
LQ Newbie
Registered: Nov 2009
Posts: 8
Rep:
|
Script to remove the lines that have a duplicate value in 2 fields
How do I write a script that removes the lines that have a duplicate value in the 1st field and in the last field, and redirects the result to another file? The file name should include the current timestamp.
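For the timestamp part of the question, one possible sketch is shown here. The "deduped_" prefix, the ".txt" suffix and the timestamp layout are assumptions chosen only for illustration; nothing in the question fixes them.

```shell
# Build an output file name that embeds the current date and time.
# The prefix "deduped_" and the suffix ".txt" are hypothetical.
timestamp=$(date +%Y%m%d_%H%M%S)    # year, month, day, underscore, hour, minute, second
outfile="deduped_${timestamp}.txt"
echo "$outfile"
```

Whatever command produces the filtered lines can then be redirected with > "$outfile".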
|
|
|
|
11-27-2009, 01:45 AM
|
#2
|
|
LQ 5k Club
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian Squeeze (server), Slackware 13.37 (netbook), Slackware64 14.0 (desktop),
Posts: 8,357
|
Please post a sample of the data, the desired output and what you have tried so far.
|
|
|
|
11-27-2009, 02:30 AM
|
#3
|
|
LQ Newbie
Registered: Nov 2009
Posts: 8
Original Poster
Rep:
|
I don't know how to proceed with writing the script, as I am new to Unix.
The sample data could be something like:
1|a|aaaa|11
2|b|bbbb|222
3|c|cccc|333
2|d|dddd|333
3|e|aaaa|222
and the output I would like to get is:
1|a|aaaa|11
2|b|bbbb|222
3|c|cccc|333
|
|
|
|
11-27-2009, 03:07 AM
|
#4
|
|
LQ Newbie
Registered: Nov 2009
Posts: 8
Original Poster
Rep:
|
Also, I need to preserve the order. And what if the fields I need to check for duplication (1st and last) are alphanumeric?
|
|
|
|
11-27-2009, 03:17 AM
|
#5
|
|
Member
Registered: Apr 2009
Location: Bengaluru, India
Distribution: RHEL 5.4, 6.0, Ubuntu 10.04
Posts: 704
Rep:
|
Hello ajcapri,
I have just tried out some code. I hope it works; please check.
Code:
#!/bin/bash
# Append to "temp" every line whose 4th (last) field contains the 1st field.
file=$1
while IFS= read -r line
do
    a=$(printf '%s\n' "$line" | cut -d '|' -f 1)    # first field
    b=$(printf '%s\n' "$line" | cut -d '|' -f 4)    # fourth field
    if printf '%s\n' "$b" | grep -qF -- "$a"
    then
        printf '%s\n' "$line" >> temp
    fi
done < "$file"
Execute it this way:
Code:
bash filename.sh /path/to/the/file/where/pattern/is/stored
Hope it helps....
Cheers !!!
|
|
|
|
11-27-2009, 05:11 AM
|
#6
|
|
LQ Newbie
Registered: Nov 2009
Posts: 8
Original Poster
Rep:
|
How do I do the same if I don't know the number of fields? In that situation, how do I find the $n value for the last field?
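A minimal sketch of two common ways to reach the last field without knowing the field count, assuming "|" is the only separator and never appears inside a field. The function name last_field_awk and the variable names are made up for this example; the sample line comes from the data posted earlier in the thread.

```shell
# awk sets NF to the number of fields on each line, so $NF is always the last field.
last_field_awk() {
    awk -F'|' '{ print $NF }' "$@"
}

# Pure-shell alternative for one line held in a variable.
line='2|b|bbbb|222'
last=${line##*|}     # remove the longest prefix that ends in '|'
first=${line%%|*}    # remove the longest suffix that starts with '|'
echo "$first $last"  # prints: 2 222
```

The awk form reads from the named files, or from standard input when no file is given.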
|
|
|
|
11-27-2009, 05:51 AM
|
#7
|
|
Member
Registered: Apr 2008
Location: HYD, INDIA.
Posts: 113
Rep:
|
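Write the code below in filename.sh:
#!/bin/bash
# Append to "temp" every line whose first character equals its last character.
file=$1
while IFS= read -r line
do
    [ -n "$line" ] || continue    # skip empty lines
    name1=${line:0:1}             # first character of the line
    name2=${line: -1}             # last character of the line
    if [ "$name1" = "$name2" ]
    then
        printf '%s\n' "$line" >> temp
    fi
done < "$file"
Run it as follows:
bash filename.sh /path/to/the/file/where/pattern/is/store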
Regards,
Nagendra Rednam
|
|
|
|
11-27-2009, 07:11 AM
|
#8
|
|
LQ Newbie
Registered: Nov 2009
Posts: 8
Original Poster
Rep:
|
I guess the code you have posted compares the values in the first and last fields. I want to delete records based on duplicate values in the first field and also based on duplicate values in the last field. Many thanks to everyone who has been helping.
|
|
|
|
11-27-2009, 07:15 AM
|
#9
|
|
LQ 5k Club
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian Squeeze (server), Slackware 13.37 (netbook), Slackware64 14.0 (desktop),
Posts: 8,357
|
Quote:
Originally Posted by ajcapri
I guess the code you have posted compares the values in the first and last fields. I want to delete records based on duplicate values in the first field and also based on duplicate values in the last field. Many thanks to everyone who has been helping.
|
What do you mean by "duplicate values in the first field"? A field containing XX would have a duplicate X. OK?
|
|
|
|
11-28-2009, 02:36 AM
|
#10
|
|
Member
Registered: Feb 2009
Posts: 340
Rep:
|
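Try this awk script.
You can put it into a file, say "awk_script", and your data into a file, say "data".
# Print every line whose first field equals the first character of its last field.
BEGIN { FS = "|" }
{
    str = substr($NF, 1, 1)    # first character of the last field
    if ($1 == str) {
        print $0
    }
}
Run it as follows:
awk -f awk_script data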
|
|
|
|
11-29-2009, 09:38 PM
|
#11
|
|
LQ Newbie
Registered: Nov 2009
Posts: 8
Original Poster
Rep:
|
By duplicates I mean:
abc|aaa|bbb|ccc
def|ddd|eee|fff
ghi|ddd|eee|ccc
abc|ggg|hhh|iii
ghi|rrr|sss|fff
For this sample data, I need the last 2 rows removed because the values in their first field ("abc" and "ghi") are already present in previously scanned rows. Furthermore, I need the 3rd and last rows removed because the values in their last field ("ccc" and "fff") are duplicates.
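One way to get this behaviour is a single pass with awk, sketched here rather than offered as a definitive answer. It rests on two assumptions read from the description above: a value counts as "already seen" even when the row that carried it was itself removed, and first-field values are tracked separately from last-field values. The function name dedupe_first_last and the array names are made up for this example.

```shell
# Keep a line only if neither its first nor its last field has appeared
# in any earlier line. Lines are printed in their original order.
dedupe_first_last() {
    awk -F'|' '
    {
        keep = !($1 in seen_first) && !($NF in seen_last)
        seen_first[$1] = 1    # remember both values even when the line is dropped
        seen_last[$NF] = 1
        if (keep) print
    }' "$@"
}
```

Because the array keys are strings, alphanumeric fields work the same way as numeric ones, and $NF copes with any number of fields. With the data in a file called "data", the result could be written to a timestamped file (the name is only an example) with: dedupe_first_last data > "data_$(date +%Y%m%d_%H%M%S).out"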
|
|
|
|
All times are GMT -5.