Consider the following scenario. I have a file that contains a list of users, e.g.:
jone
micheal
jone
jone
steve
adam
steve
Now, as you can see, this list contains repetitions. I need to remove the duplicates from this file, as it has hundreds of entries. Could I have a sample script? Please guide.
sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P' file
I don't think the OP will understand that.
I'm not sure if there are 100 people in the WORLD who would understand..... They say that C gives you the power to write incomprehensible code. SED's pretty good at that too........
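For anyone who finds the sed one-liner hard to read, here is a rough sketch of two more readable ways to get a de-duplicated list. The file name users.txt is an assumption for illustration; the sample names are the ones from the question. Neither command comes from the posts in this thread.

```shell
# Sample data from the question, written to a scratch file.
# (The name users.txt is made up for this example.)
printf '%s\n' jone micheal jone jone steve adam steve > users.txt

# Option 1: sort the list and drop duplicates.
# The output is in sorted order, not in the original order.
sort -u users.txt

# Option 2: keep the first occurrence of each line, in the original order.
# seen[$0]++ evaluates to 0 the first time a line appears, so !seen[$0]++
# is true only once per distinct line, and awk's default action prints it.
awk '!seen[$0]++' users.txt
```

If the original order of the names does not matter, sort -u is probably the simplest choice. The awk version behaves like the sed command above in that it keeps the first occurrence of each name where it was.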
I am trying to understand that sed command (and regular expression). However, it seems that I need more time.
So far I have found that the script works with sed on a Macintosh (most probably the BSD one; sed -v or --version gives me an error). With GNU sed version 4.1.5 on Linux (Debian lenny and etch) it does not work, even with the --posix option.
$ sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P' file
I can decipher everything except the "[ -~]" part.
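One way to read the command is to lay it out with one sed command per line. The following is only a sketch: the comments are my own reading of the script, the wrapper name dedupe_sed and the file name users.txt are made up for illustration, and forcing LC_ALL=C is an addition that is not in the original command (it pins the range to byte values; whether it explains the GNU sed failure reported above is not something I have verified).

```shell
# Sample data from the question (file name is hypothetical).
printf '%s\n' jone micheal jone jone steve adam steve > users.txt

# Hypothetical wrapper around the one-liner, one sed command per line.
dedupe_sed() {
  LC_ALL=C sed -n '
# Append a newline plus the hold space (lines printed so far).
G
# Double the first newline, so the current line is followed by an
# empty line and then the history.
s/\n/&&/
# If the current line, together with its newline, shows up again
# later in the pattern space, it was seen before: drop it.
/^\([ -~]*\n\).*\n\1/d
# Otherwise remove the extra newline again,
s/\n//
# store current line plus history back in the hold space,
h
# and print only the first line of the pattern space.
P
' "$@"
}

dedupe_sed users.txt
```

The hold space grows with every distinct line, so this approach suits short lists like the one in the question better than very large files.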
"[a-f]" means any character in the range a through f (it can also match A through F; it does on my system).
I assume that "[ -~]" is meant to match everything from " " (space) to "~". After several experiments, I am finding that ranges that include more than letters and digits can be ambiguous and unpredictable, if for no other reason than that characters within a range can have a special meaning. I have never seen anything about this in the books.
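Part of this unpredictability is likely down to the locale: POSIX leaves the meaning of a range unspecified outside the C (POSIX) locale, so an implementation may follow the locale's collation order instead of byte values. A minimal sketch, assuming a POSIX grep is available; setting LC_ALL=C is my suggestion and does not come from the thread.

```shell
# Two test lines: a lowercase b and an uppercase B.
# In the C locale a range is defined by byte values, so [a-f] covers only
# the lowercase letters a through f and the uppercase line is not counted.
printf '%s\n' b B | LC_ALL=C grep -c '[a-f]'

# The same idea applied to the range from the sed command: in the C locale
# [ -~] is exactly the bytes from space (32) through tilde (126), so a line
# that contains a tab is not made up entirely of characters in that range.
printf 'a b\na\tb\n' | LC_ALL=C grep -c '^[ -~]*$'
```

Running the same commands without LC_ALL=C may or may not give the same counts, depending on the system's locale settings and the tool's version.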
On the other hand, I skimmed Effective AWK Programming and didn't see any warning, nor in Programming Perl, 3rd edition. Perhaps such warnings are taken for granted by the time one is ready for awk and perl ... cheers, makyo
I can't find my handy little reference, but I believe [[:print:]] and [ -~] are the same thing, at least for plain ASCII in the C locale.
So it would match any non-control character, i.e. any ASCII character from char(32) (space) through char(126) (~).
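A quick way to check that claim is to run both spellings against the same input. This is a sketch under the assumption of the C locale (forced with LC_ALL=C); the file name sample.txt is made up. Note that inside a bracket expression the class is written [[:print:]].

```shell
# Three lines: printable ASCII only, one with a tab (char 9),
# and one with DEL (char 127, written as octal \177).
printf 'plain text ~\nwith\ttab\nwith\177del\n' > sample.txt

# Count the lines that consist only of printable characters,
# once with the explicit range and once with the character class.
LC_ALL=C grep -c '^[ -~]*$' sample.txt
LC_ALL=C grep -c '^[[:print:]]*$' sample.txt
```

Both commands should report the same count, since only the first line is free of control characters. In other locales [[:print:]] can cover more than ASCII, which is one reason to prefer the class over the literal range when the input is not plain ASCII.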
Last edited by /bin/bash; 03-02-2008 at 08:11 AM.
Reason: Turn off smilies.