LinuxQuestions.org
Old 12-04-2002, 03:54 PM   #1
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
need *fast* algorithm for binary file search


Hi guys ...

I have to search huge binary files for strings in order
to parse the information in them. The files consist of a
number of building blocks in no particular order, some
of them with a fixed structure. I am looking for a fast
search algorithm, preferably ready-to-use code in C++ :),
that returns the position of the searched-for pattern
within the file.

Cheers,
Tink

P.S.: Huge is > 100MB :}
 
Old 12-05-2002, 02:15 AM   #2
DavidPhillips
LQ Guru
 
Registered: Jun 2001
Location: South Alabama
Distribution: Fedora / RedHat / SuSE
Posts: 7,163

Rep: Reputation: 58
how about grep?
 
Old 12-05-2002, 02:17 AM   #3
DavidPhillips
LQ Guru
 
Registered: Jun 2001
Location: South Alabama
Distribution: Fedora / RedHat / SuSE
Posts: 7,163

Rep: Reputation: 58
grep -n string file
 
Old 12-05-2002, 01:10 PM   #4
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067

Original Poster
Blog Entries: 11

Rep: Reputation: 928
LOL ... thanks, but thanks no ... first of all,
grep works line-oriented, and I need the location
of my hits within the BINARY file ... :}

And no, I've looked at the source and DON'T want
to use it, it would take me ages to understand it, not
to speak of modifying, C++-ifying and using it ...

Cheers,
Tink
 
Old 12-05-2002, 04:41 PM   #5
Azrael
Member
 
Registered: Sep 2002
Location: Germany
Distribution: SuSE 8.0
Posts: 96

Rep: Reputation: 15
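Maybe you want to have a look at string matching algorithms like Knuth-Morris-Pratt. These have a lower complexity than the naive version (KMP is linear in the length of the data plus the length of the pattern, while the naive search can degrade to the product of the two), but they will still take their time and of course some extra space.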
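
For reference, a minimal sketch of the Knuth-Morris-Pratt idea in C++, since that is the language asked for. The function names (build_failure_table, kmp_find_all) are made up for illustration, and the data is assumed to be in memory already as a std::string, which copes with embedded \0 bytes as long as lengths are given explicitly. Treat it as a starting point under those assumptions, not as tuned code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Failure table: fail[i] is the length of the longest proper prefix of
// pattern[0..i] that is also a suffix of pattern[0..i].
std::vector<std::size_t> build_failure_table(const std::string& pattern) {
    std::vector<std::size_t> fail(pattern.size(), 0);
    std::size_t k = 0;  // length of the currently matched prefix
    for (std::size_t i = 1; i < pattern.size(); ++i) {
        while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
        if (pattern[i] == pattern[k]) ++k;
        fail[i] = k;
    }
    return fail;
}

// Returns the byte offset of every match of pattern in data,
// overlapping matches included. Uses O(pattern.size()) extra space.
std::vector<std::size_t> kmp_find_all(const std::string& data,
                                      const std::string& pattern) {
    std::vector<std::size_t> hits;
    if (pattern.empty() || pattern.size() > data.size()) return hits;

    const std::vector<std::size_t> fail = build_failure_table(pattern);
    std::size_t k = 0;  // number of pattern bytes matched so far
    for (std::size_t i = 0; i < data.size(); ++i) {
        // On a mismatch, fall back in the pattern; never step back in data.
        while (k > 0 && data[i] != pattern[k]) k = fail[k - 1];
        if (data[i] == pattern[k]) ++k;
        if (k == pattern.size()) {
            hits.push_back(i + 1 - k);  // offset where this match starts
            k = fail[k - 1];            // keep going for overlapping matches
        }
    }
    return hits;
}
```

Called as kmp_find_all(buffer, std::string("S\0G", 3)), it returns the offsets of all matches; the explicit length 3 keeps the \0 in the pattern from being treated as a terminator.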
 
Old 12-05-2002, 04:46 PM   #6
llama_meme
Member
 
Registered: Nov 2001
Location: London, England
Distribution: Gentoo, FreeBSD
Posts: 590

Rep: Reputation: 30
Quote:
LOL ... thanks, but thanks no ... first of all,
grep works line-oriented, and I need the location
of my hits within the BINARY file ... :}
grep works fine with binary files (it doesn't try to operate on a per line basis). Not sure if you can get it to print the byte-offset of the match, but it's worth a look at the man page methinks.

the strings command might be useful (depending on what exactly you're doing, you didn't make it very clear)

Alex
 
Old 12-05-2002, 06:25 PM   #7
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067

Original Poster
Blog Entries: 11

Rep: Reputation: 928
I have a bunch of files that contain data from "foreign"
echo-sounders, and am in the process of writing a tool that
converts their heterogeneous chunk into our tidy set of files :}

The files are binary and contain different objects (configuration,
sounder setup, actual acoustic data, navigational data,
annotations, ...) in no particular order, some of them
unfortunately of varying length, too. The files can be huge,
the biggest ones I saw were over 120MB ... I want to split the files
into chunks in memory, and write the appropriate sections
into the appropriate files that we use to store that kind of information,
so we can analyze data that we didn't record using our own
equipment/software.


Quote:
grep works fine with binary files (it doesn't try to operate on a per line basis). Not sure if you can get it to print the byte-offset of the match, but it's worth a look at the man page methinks.
Hmm ...
it outputs gibberish 'til it hits end-of-file or a \0 ...
at least it does here. With other options the offset
doesn't seem right, either.

Quote:
(depending on what exactly you're doing, you didn't make it very clear)
Quote:
looking for a fast search algorithm, preferably ready-to-use
code in C++ :)
Cheers,
Tink

Last edited by Tinkster; 12-05-2002 at 06:31 PM.
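
One possible way to get byte offsets out of a 100MB+ file without holding all of it in memory is to read it in blocks and carry the last (pattern length - 1) bytes over into the next block, so that a match straddling a block boundary is not missed. The sketch below does that with std::search from the standard library rather than KMP. The name find_all_in_stream and the 1 MiB default block size are assumptions for illustration, not anything taken from the posts above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>  // not needed by the function; handy for trying it on in-memory data
#include <string>
#include <vector>

// Scans a binary stream block by block and returns the absolute byte
// offset of every match of needle (overlapping matches included).
std::vector<std::uint64_t> find_all_in_stream(std::istream& in,
                                              const std::string& needle,
                                              std::size_t chunk_size = 1 << 20) {
    assert(chunk_size > 0);
    std::vector<std::uint64_t> hits;
    if (needle.empty()) return hits;

    const std::size_t overlap = needle.size() - 1;
    std::vector<char> buf;   // carried-over tail followed by the fresh block
    std::uint64_t base = 0;  // stream offset of buf[0]

    while (in) {
        const std::size_t carried = buf.size();
        buf.resize(carried + chunk_size);
        in.read(buf.data() + carried, static_cast<std::streamsize>(chunk_size));
        const std::size_t got = static_cast<std::size_t>(in.gcount());
        buf.resize(carried + got);
        if (got == 0) break;  // nothing new was read, so nothing new to find

        // Compare by length, so \0 bytes are ordinary data here.
        auto it = buf.begin();
        while ((it = std::search(it, buf.end(), needle.begin(), needle.end()))
               != buf.end()) {
            hits.push_back(base + static_cast<std::uint64_t>(it - buf.begin()));
            ++it;  // continue one byte further to catch overlapping matches
        }

        // Keep only the tail that could still be the start of a match.
        const std::size_t keep = std::min(overlap, buf.size());
        base += buf.size() - keep;
        buf.erase(buf.begin(), buf.end() - static_cast<std::ptrdiff_t>(keep));
    }
    return hits;
}
```

For a real file, the stream would be a std::ifstream opened with std::ios::binary. Because the carried tail is one byte shorter than the pattern, a match can never lie entirely inside it, so no offset is reported twice.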
 
  

