Linux - Newbie
This Linux forum is for members that are new to Linux. Just starting out and have a question? If it is not in the man pages or the how-to's, this is the place!
I was wondering what command I should use to find duplicate names in a file. This is a decent-sized file and I need to make sure that there are no duplicate names throughout the whole file. Thanks again for any help. Here is what I have so far, and I'm going to try using the sort command.
You'll need to sort the data in the file first, including the --unique flag for sort, or just sort it with no options and pipe it into uniq and use its -u flag.
It'd help if you gave a sample of the input file as it might need to be sanitised before either solution will work.
You can use the sort command with -u option to get unique records from a file.
Say you have a file called "list" with the following contents:
billy
bob
john
bob
ralph
You can see bob is in there twice. If you run "sort -u list" it will show only:
billy
bob
john
ralph
You could redirect that into a new file and move it over the original.
Of course if you don't have the entire line the same on every occurrence the sort -u won't exclude it. So if your list had:
bob 10
bob 20
Both lines would be output because the second field is different.
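Putting the suggestion above together, a minimal sketch (the file name "names.txt" and its contents are just stand-ins for your own file):

```shell
#!/bin/sh
# Sample data standing in for the real file.
printf 'billy\nbob\njohn\nbob\nralph\n' > names.txt

# Deduplicate into a temp file, then move it over the original.
sort -u names.txt > names.txt.sorted
mv names.txt.sorted names.txt

cat names.txt
# billy
# bob
# john
# ralph
```

Don't redirect sort's output straight back into the same file — the shell truncates it before sort reads it, which is why the temp-file-then-move step is there.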
Well, the file contains only names, and I just want to make sure there are no duplicates. I don't think the sort command would work, because the duplicates would still be there; it just wouldn't show them. I want to see if there are any so I can later delete the duplicates.
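If you want to *see* the duplicates rather than hide them, pipe the sorted list through uniq -d, which prints only lines that occur more than once. A small sketch (the file name is made up):

```shell
#!/bin/sh
# Sample data with one duplicated name.
printf 'billy\nbob\njohn\nbob\nralph\n' > names.txt

# uniq -d prints each duplicated line once; the file itself is untouched.
sort names.txt | uniq -d
# bob
```

If it prints nothing, the file has no duplicates. `uniq -c` instead of `-d` would show a count next to every name, if you want to know how many times each one appears.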
Or that little awk-proggie (shamelessly snaffled from awk's documentation):
Code:
# remove duplicate lines from unsorted data, e.g. history files,
# firewall rules, that kind of stuff
{
    if (data[$0]++ == 0)
        lines[++count] = $0
}
END {
    for (i = 1; i <= count; i++)
        print lines[i]
}
Its nicest feature is that it doesn't change the order in which
the records originally appear.
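You would normally save that script to a file and run it with `awk -f script.awk yourfile`. The same idea also fits in the well-known one-liner below — the condition is true only the first time a line is seen, so awk prints each line once, in original order (the file name is just an example):

```shell
#!/bin/sh
# Sample data; duplicates are scattered, not adjacent.
printf 'bob\nbilly\nbob\nralph\n' > names.txt

# Print each line the first time it appears, keeping the original order.
awk '!seen[$0]++' names.txt
# bob
# billy
# ralph
```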
A better question is: can the /etc/passwd file actually contain duplicate login names? I don't think it can, unless they're added manually. Duplicate uid/gid, sure, but I don't think the useradd/adduser commands will let you dup the name.
It does allow duplicate login names. Remember that the names are only for humans; under the skin it's all done with numbers (uid/gid).
Duplicate uids aren't recommended, but it can be done...
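If you want to check your own system, something along these lines lists any login name (field 1 of the passwd file) that appears more than once; it prints nothing when there are no duplicates. The sketch below uses a made-up sample file so it's safe to run anywhere — on a real system you'd point cut at /etc/passwd instead:

```shell
#!/bin/sh
# Sample passwd-style data with a deliberately duplicated login name.
printf 'root:x:0:0::/root:/bin/sh\nbob:x:1000:1000::/home/bob:/bin/sh\nbob:x:1001:1001::/home/b2:/bin/sh\n' > passwd.sample

# Field 1 (colon-delimited) is the login name; uniq -d shows repeats.
cut -d: -f1 passwd.sample | sort | uniq -d
# bob
```

Swapping `-f1` for `-f3` would do the same check on uids.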