LinuxQuestions.org
Old 04-01-2006, 12:54 AM   #1
topworld
LQ Newbie
 
Registered: Feb 2006
Posts: 21

Rep: Reputation: 15
Efficient search technique for text file of size 2 mb or more


Hi all,

I want to write a C program that finds a user-specified number or word in a text file of around 2 MB or more.
(The text file is a combination of words and numbers.)

What would be an efficient search technique?

Thank you.
 
Old 04-01-2006, 02:19 PM   #2
ta0kira
Senior Member
 
Registered: Sep 2004
Distribution: FreeBSD 9.1, Kubuntu 12.10
Posts: 3,078

Rep: Reputation: Disabled
I recommend a bash script as a front end to find candidate text files and then pass those files to your C program. Look at 'man find'; I'm pretty sure it can take care of the size criterion. Then look at 'man file' or 'man stat': after obtaining a list of everything > 2MB, you can use these to determine whether each entry is a text file. You'll have to use 'grep' and/or 'sed' to get nicely formatted output from it, though.
ta0kira

Last edited by ta0kira; 04-01-2006 at 02:24 PM.
 
Old 04-01-2006, 03:06 PM   #3
Mara
Moderator
 
Registered: Feb 2002
Location: Grenoble
Distribution: Debian
Posts: 9,696

Rep: Reputation: 232
If the file is not sorted and you have no clue where the thing you're searching for might be, linear search is the way to go. A C program that reads data into a buffer, searches the buffer, and then reads the next fragment is a simple and rather effective approach.
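
A minimal sketch of that read-search-read loop, under some assumptions that are not from the thread: the function names search_stream and find_bytes, the 64 KB CHUNK_SIZE, and the choice to return a byte offset are all illustrative. The one subtle point is that the last few bytes of each chunk are carried over, so a match that straddles two reads is not missed.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 65536  /* assumed buffer size, not from the thread */

/* Naive scan of hay[0..hay_len) for needle; returns index or -1. */
static long find_bytes(const char *hay, size_t hay_len,
                       const char *needle, size_t needle_len)
{
    size_t i;

    if (needle_len == 0 || hay_len < needle_len)
        return -1;
    for (i = 0; i + needle_len <= hay_len; i++) {
        if (hay[i] == needle[0] && memcmp(hay + i, needle, needle_len) == 0)
            return (long)i;
    }
    return -1;
}

/* Returns the byte offset (relative to the stream position at the time of
 * the call) of the first occurrence of needle, or -1 if it is not found.
 * Uses a static buffer, so this sketch is not thread-safe. */
long search_stream(FILE *fp, const char *needle)
{
    static char buf[CHUNK_SIZE];
    size_t needle_len = strlen(needle);
    size_t keep = 0;   /* bytes carried over from the previous chunk */
    long base = 0;     /* stream offset that buf[0] corresponds to */
    size_t got;

    if (needle_len == 0 || needle_len >= CHUNK_SIZE)
        return -1;

    while ((got = fread(buf + keep, 1, CHUNK_SIZE - keep, fp)) > 0) {
        size_t len = keep + got;
        long pos = find_bytes(buf, len, needle, needle_len);

        if (pos >= 0)
            return base + pos;

        /* Keep the last needle_len - 1 bytes so that a match spanning
         * two chunks is still seen on the next pass. */
        keep = needle_len - 1;
        if (keep > len)
            keep = len;
        memmove(buf, buf + len - keep, keep);
        base += (long)(len - keep);
    }
    return -1;
}
```

Typical use would be to fopen() the file, call search_stream(fp, "xyz.c:15"), and treat -1 as "not found". For a 2 MB file, reading the whole thing into memory in one go would also be reasonable; the chunked form just keeps memory use fixed.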
 
Old 04-01-2006, 04:49 PM   #4
addy86
Member
 
Registered: Nov 2004
Location: Germany
Distribution: Debian Testing
Posts: 332

Rep: Reputation: 31
Read
http://en.wikipedia.org/wiki/String_searching_algorithm
Considering the length of the string (2M characters), a simple brute-force search, which is O(mn) for a pattern of length m and a text of length n, is almost certainly not the fastest way.
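
The post does not name a particular algorithm, so as one illustration here is a sketch of Boyer-Moore-Horspool, one of the algorithms usually covered under that topic. The function name horspool_search and its signature are assumptions made for this example. On typical text it skips ahead by up to m bytes per step, although its worst case is still O(mn).

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Boyer-Moore-Horspool search.
 * Returns the offset of the first occurrence of pat[0..m) in text[0..n),
 * or -1 if there is none. */
long horspool_search(const char *text, size_t n, const char *pat, size_t m)
{
    size_t shift[UCHAR_MAX + 1];
    size_t i;

    if (m == 0 || m > n)
        return -1;

    /* Bytes that do not occur in the pattern allow a jump of m. */
    for (i = 0; i <= UCHAR_MAX; i++)
        shift[i] = m;

    /* For every pattern byte except the last, record how far the window
     * must move to line that byte up with the window's final position. */
    for (i = 0; i + 1 < m; i++)
        shift[(unsigned char)pat[i]] = m - 1 - i;

    i = 0;
    while (i + m <= n) {
        if (memcmp(text + i, pat, m) == 0)
            return (long)i;
        /* Shift according to the text byte under the window's last slot. */
        i += shift[(unsigned char)text[i + m - 1]];
    }
    return -1;
}
```

With the file contents loaded into a buffer of length n, a call such as horspool_search(buf, n, "xyz.c:15", 8) would return the offset of the first match. Whether this actually beats a plain scan for short patterns on a 2 MB file is something to measure rather than assume.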
 
Old 04-01-2006, 05:12 PM   #5
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
Your first question, as Mara noted, is whether there's any order in the file itself: is the file sorted? Can you read it a line at a time, or is it just a random byte stream? And so on.

The next question is whether you need to read through the entire file for each query, or whether it makes sense to index the file (as the Wikipedia article addy86 linked suggests).
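
One very simple form of indexing, sketched under stated assumptions: the file contents are already in a writable, NUL-terminated buffer, entries are separated by whitespace, and queries must match a whole entry exactly. The names build_index and index_contains are hypothetical. The tokens are sorted once, after which each query is a binary search instead of a full scan.

```c
#include <stdlib.h>
#include <string.h>

/* Comparator for arrays of char * (shared by qsort and bsearch). */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Splits buf in place on whitespace, collects pointers to the tokens,
 * and sorts them. On success stores the array in *out and returns the
 * token count; on allocation failure sets *out to NULL and returns 0.
 * The caller frees *out. Note: buf is modified by strtok. */
size_t build_index(char *buf, char ***out)
{
    size_t count = 0, cap = 1024;
    char **idx = malloc(cap * sizeof *idx);
    char *tok;

    *out = NULL;
    if (idx == NULL)
        return 0;

    for (tok = strtok(buf, " \t\r\n"); tok != NULL;
         tok = strtok(NULL, " \t\r\n")) {
        if (count == cap) {
            char **grown = realloc(idx, cap * 2 * sizeof *idx);
            if (grown == NULL) {
                free(idx);
                return 0;
            }
            idx = grown;
            cap *= 2;
        }
        idx[count++] = tok;
    }

    qsort(idx, count, sizeof *idx, cmp_str);
    *out = idx;
    return count;
}

/* Returns 1 if key is exactly one of the indexed tokens, otherwise 0. */
int index_contains(char **idx, size_t count, const char *key)
{
    return bsearch(&key, idx, count, sizeof *idx, cmp_str) != NULL;
}
```

This only pays off when the same file is queried many times; for a single lookup, building the index costs more than one linear pass.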

It would be interesting to do some tests, but I think it's unlikely you could easily write a C program that would necessarily out-perform "grep" or "awk" for basic pattern matching (i.e. "search") speed and efficiency. (I'm prepared to be 100% wrong about that statement, by the way ;-))

'Hope that helps .. PSM
 
Old 04-03-2006, 01:56 AM   #6
topworld
LQ Newbie
 
Registered: Feb 2006
Posts: 21

Original Poster
Rep: Reputation: 15
Thank you all for your help :-)

The inputs to the program are as follows:

1) The C program takes one text file as input (it has been prepared beforehand, and the program uses it directly).

2) The user input is a word and a number, such as xyz.c and 15.

This will appear as xyz.c:15 somewhere in the 2 MB file.
There is no specific pattern in the text file.

So I think linear search is the only option remaining.
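
For this particular input shape, a linear search can be sketched as follows, assuming the whole file has already been read into a NUL-terminated buffer (2 MB fits comfortably in memory). The function name find_entry, the 256-byte key limit, and the rule that a match must not be followed by another digit (so that xyz.c:15 does not match xyz.c:150) are assumptions for illustration, not something stated in the thread.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Builds the key "name:number" and returns a pointer to its first
 * occurrence in text that is not immediately followed by a digit,
 * or NULL if there is no such occurrence. */
const char *find_entry(const char *text, const char *name, long number)
{
    char key[256];   /* assumed upper bound on the key length */
    int len = snprintf(key, sizeof key, "%s:%ld", name, number);
    const char *p = text;

    if (len < 0 || (size_t)len >= sizeof key)
        return NULL;   /* formatting failed or the key did not fit */

    while ((p = strstr(p, key)) != NULL) {
        /* Reject hits such as "xyz.c:150" when looking for "xyz.c:15". */
        if (!isdigit((unsigned char)p[len]))
            return p;
        p++;   /* resume the scan just past this false hit */
    }
    return NULL;
}
```

A call such as find_entry(buf, "xyz.c", 15) returns a pointer into buf, so the offset is the returned pointer minus buf. The sketch does not check the character before the match, so an entry like axyz.c:15 would also be reported.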


Thank you

Last edited by topworld; 04-03-2006 at 02:00 AM.
 
  


All times are GMT -5. The time now is 01:25 PM.
