LinuxQuestions.org
Linux - Software This forum is for Software issues.
Old 04-15-2016, 05:22 AM   #1
qrange
Senior Member
 
Registered: Jul 2006
Location: Belgrade, Yugoslavia
Distribution: Debian stable/testing, amd64
Posts: 1,061

Rep: Reputation: 47
ascii file similarities


I have a large group of small text files (thousands). What is the easiest and fastest way to evaluate the similarity of a new file to any file in the group?
Can a 'neural network' be trained so that its output is the similarity to the group?
Thanks.
 
Old 04-15-2016, 05:26 AM   #2
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,359

Rep: Reputation: 2751
Well, first I'd want to know what your definition of 'similar' (in this context!) is ...
 
Old 04-15-2016, 05:49 AM   #3
qrange
Senior Member
 
Registered: Jul 2006
Location: Belgrade, Yugoslavia
Distribution: Debian stable/testing, amd64
Posts: 1,061

Original Poster
Rep: Reputation: 47
I've used Perl's String::Similarity and it's OK, but slow; I compared the new file against the group file by file.
Similar files are those that have some characters changed or shifted, or something along those lines.
It's hard to define; I guess it needs AI.

Something akin to spam filtering.
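
As a rough illustration of one cheaper way to score "some characters changed or shifted", the sketch below compares files by their sets of overlapping character trigrams (Jaccard similarity). This is a swapped-in technique, not what String::Similarity does, and it is written in Python rather than Perl. The function names, the trigram size, and the `*.txt` pattern are all assumptions made for the example. The point is that each file in the group is reduced to a set once, so checking a new file costs only set intersections.

```python
from pathlib import Path


def char_ngrams(text, n=3):
    """Return the set of overlapping character n-grams in text."""
    if len(text) < n:
        # Very short input: treat the whole string as a single gram.
        return {text} if text else set()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def build_index(texts, n=3):
    """Precompute one n-gram set per file. texts maps name -> content."""
    return {name: char_ngrams(content, n) for name, content in texts.items()}


def best_match(new_text, index, n=3):
    """Return (name, score) of the indexed file most similar to new_text."""
    grams = char_ngrams(new_text, n)
    best_name, best_score = None, -1.0
    for name, other in index.items():
        score = jaccard(grams, other)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


def load_group(directory, pattern="*.txt"):
    """Hypothetical helper: read every matching file in a directory."""
    return {p.name: p.read_text(errors="replace")
            for p in Path(directory).glob(pattern)}


if __name__ == "__main__":
    index = build_index({"a.txt": "hello world",
                         "b.txt": "completely different"})
    print(best_match("hello w0rld", index))
```

A shifted block of text keeps most of its trigrams, so this measure tolerates shifts as well as small substitutions. Whether it matches the intended meaning of "similar" would still have to be checked against real files.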

Last edited by qrange; 04-15-2016 at 05:51 AM.
 
Old 04-15-2016, 06:01 AM   #4
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,848

Rep: Reputation: 7309
Still unspecified. It would be nice to see what is slow. I mean, you probably have some code, a script, or whatever it is you want to speed up.
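
For reference, a file-by-file comparison of the kind described in post #3 might look roughly like the sketch below. The original Perl script was not posted, so this is only a stand-in: it uses Python's standard `difflib.SequenceMatcher` instead of String::Similarity, and the function name and the `texts` mapping are assumptions. It shows why such a loop gets slow: every file in the group is aligned in full against the new file, and each alignment can take quadratic time in the file length.

```python
from difflib import SequenceMatcher


def rank_by_similarity(new_text, texts):
    """Compare new_text against every file and rank the results.

    texts maps file name -> file content. Returns a list of
    (name, ratio) pairs, most similar first. ratio is in [0, 1].
    """
    # autojunk=False disables the heuristic that ignores very common
    # characters in long inputs, so scores stay comparable across files.
    matcher = SequenceMatcher(autojunk=False)
    # SequenceMatcher caches details about the second sequence, so the
    # new file is set once and only the first sequence changes per loop.
    matcher.set_seq2(new_text)

    scores = []
    for name, content in texts.items():
        matcher.set_seq1(content)
        scores.append((name, matcher.ratio()))

    scores.sort(key=lambda item: item[1], reverse=True)
    return scores


if __name__ == "__main__":
    group = {"a": "abcdef", "b": "zzzzzz"}
    for name, ratio in rank_by_similarity("abcxef", group):
        print(name, round(ratio, 3))
```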
 
  

