Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!


07-19-2012, 05:45 PM   #1
LQ Newbie
Registered: Jul 2012
Posts: 3

Rep: Reputation: Disabled
Help with looking for similarities in two files

I'm using bash on Mac Terminal. I have two files with sequences that look like, for example:
File 1:

File 2:

I want to find similar sequences: not only those that are completely identical (like ABCDE11111112345), but also those that share the same first 5 or the same last 5 characters. For example, ABCDE3453454 would count as the same as ABCDE11111112345 because the first 5 characters of both are ABCDE. ADSFASDGAS34123 and ADSFAFG243234123 would count as the same because the last 5 characters of both are 34123. I want to find all such "same" sequences between the two files.

Is there a way to do such a search?

Thank you!
07-19-2012, 06:29 PM   #2
Registered: May 2009
Location: Milan, Italy
Distribution: Ubuntu, Debian, Fedora, Oracle Linux
Posts: 109

Rep: Reputation: 10
Yes, there should be a way!
Do you mean: "Is there a way to do this with system commands, bash, or existing programs?"

I think a "scanner" like that would require you to program it yourself, with a solid knowledge of regular expressions!
I would suggest Perl as the programming language (and in that case I would be happy to help you!).
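
For what it's worth, here is a rough sketch of the matching rule from post #1, written in Python rather than the Perl suggested above (a swapped-in language, not this poster's method). It assumes "same" means identical first 5 or identical last 5 characters, and that sequences shorter than 5 characters only match when they are exactly equal. The names `similar` and `similar_pairs` are made up for illustration. Plain string slicing seems to be enough here, so regular expressions may not be needed.

```python
KEY_LEN = 5  # assumption: compare the first 5 and last 5 characters


def similar(a, b, n=KEY_LEN):
    """Return True if a and b share the same first n or last n characters."""
    if len(a) < n or len(b) < n:
        # Assumption: sequences too short to have an n-character key
        # only count as "same" when they are exactly equal.
        return a == b
    return a[:n] == b[:n] or a[-n:] == b[-n:]


def similar_pairs(seqs1, seqs2, n=KEY_LEN):
    """Brute force: compare every sequence in seqs1 with every one in seqs2."""
    return [(a, b) for a in seqs1 for b in seqs2 if similar(a, b, n)]
```

Identical sequences match automatically, since they share both the first and the last 5 characters. The brute-force comparison is quadratic, which should be fine for small files but could get slow for large ones.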

Hope this helps!
07-19-2012, 10:25 PM   #3
LQ Guru
Registered: Jan 2006
Location: Virginia, USA
Distribution: Slackware, Ubuntu MATE, Mageia, and whatever VMs I happen to be playing with
Posts: 19,229
Blog Entries: 28

Rep: Reputation: 6115
You might want to look at awk and diff and regular expressions (aka regex).

I don't know enough to tell you how to do the search--still trying to learn those--but those tools sound appropriate to this issue, if they are available on your Mac.

See man awk and man diff for more. Wikipedia has an article on regex.
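
The usual awk approach to this kind of problem is to load one file into an associative array and look up each line of the other file in it. A rough sketch of that same idea, written in Python instead of awk (so a substitute, not the tool named above), might look like the following. It assumes one sequence per line and a key length of 5; the function names and the file names in the usage note are hypothetical.

```python
from collections import defaultdict


def read_sequences(path):
    """Return the non-empty lines of a file with surrounding whitespace removed."""
    with open(path) as fh:
        return [line.strip() for line in fh if line.strip()]


def match_files(path1, path2, n=5):
    """Yield (seq_from_file1, seq_from_file2) pairs that share the same
    first n or last n characters."""
    by_prefix = defaultdict(list)  # first n characters -> sequences in file 1
    by_suffix = defaultdict(list)  # last n characters  -> sequences in file 1
    for seq in read_sequences(path1):
        by_prefix[seq[:n]].append(seq)
        by_suffix[seq[-n:]].append(seq)

    for seq in read_sequences(path2):
        seen = set()  # report each file-1 sequence at most once per file-2 line
        candidates = by_prefix.get(seq[:n], []) + by_suffix.get(seq[-n:], [])
        for hit in candidates:
            if hit not in seen:
                seen.add(hit)
                yield hit, seq
```

To try it, something like `for a, b in match_files("file1.txt", "file2.txt"): print(a, b)` would print each matching pair on its own line. Each line of the second file costs two dictionary lookups, so this avoids comparing every sequence against every other one.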

More on awk:


All times are GMT -5.