Help with looking for similarities in two files
Hello,
I'm using bash on Mac Terminal. I have two files with sequences that look like, for example:
File 1:
ABCDE11111112345
BERSD222222223453
ADSFAFG243234123
File 2:
ABCDE11111112345
ABCDE3453454
ADSFASDFF12345
ADSFASDGAS34123
I want to find similar sequences: not only those that are completely identical (like ABCDE11111112345), but also those that share the same first 5 or the same last 5 characters. For example, ABCDE3453454 would count as the same as ABCDE11111112345 because the first 5 characters of both are ABCDE. ADSFASDGAS34123 and ADSFAFG243234123 would count as the same because the last 5 characters of both are 34123. I want to find all such "same" pairs between the two files.
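To pin down the rule, here is a rough awk sketch of the comparison I have in mind. It is only an illustration under some assumptions: one sequence per line, sequences shorter than 5 characters are skipped, and the file names sample1.txt and sample2.txt and the function name find_similar are placeholders I made up. It compares every line of the second file against every line of the first, so it may be slow on large files.

```shell
#!/bin/bash
# Placeholder sample files reproducing the example above (names are made up).
printf '%s\n' ABCDE11111112345 BERSD222222223453 ADSFAFG243234123 > sample1.txt
printf '%s\n' ABCDE11111112345 ABCDE3453454 ADSFASDFF12345 ADSFASDGAS34123 > sample2.txt

# find_similar FILE1 FILE2
# Prints tab-separated lines: FILE1 sequence, FILE2 sequence, reason for the match.
find_similar() {
  awk -v n=5 '
    # First file: remember every sequence that has at least n characters.
    NR == FNR { if (length($0) >= n) seqs[++count] = $0; next }

    # Second file: compare each sequence against all remembered ones.
    length($0) >= n {
      head = substr($0, 1, n)                  # first n characters
      tail = substr($0, length($0) - n + 1)    # last n characters
      for (i = 1; i <= count; i++) {
        s = seqs[i]
        if (s == $0)                                    why = "identical"
        else if (substr(s, 1, n) == head)               why = "same first " n
        else if (substr(s, length(s) - n + 1) == tail)  why = "same last " n
        else continue
        print s "\t" $0 "\t" why
      }
    }
  ' "$1" "$2"
}

find_similar sample1.txt sample2.txt
```

Each pair is reported once, with the first rule that applies (identical, then first 5, then last 5). Changing n=5 would change how many leading or trailing characters have to agree.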
Is there a way to do such a search?
Thank you!