LinuxQuestions.org

Glenn_UOI 10-07-2011 07:32 AM

find string in txt file
 
Hi there,
I am new to programming and I am stuck on a problem.
I need to find a certain string in a txt file which is approximately 100 MB.
I need to find, for example, "Model opel" in the txt file and put the "cursor" there so I can read the rest from there on.
I also need to do this iteratively, about 1400 times.
I need something like this:
[code]
for (int i = 0; i < 1400; i++)
{
    xfunction(i);
}
[/code]

I need to find that xfunction.
Please help me.

millgates 10-07-2011 07:41 AM

In what language? C? C++? Other?
What do you mean by "putting the cursor there"? Is it a GUI application?
Why do you want to do it 1400 times? Like finding 1400 occurrences of the string? Or searching 1400 files?

Glenn_UOI 10-07-2011 07:59 AM

Sorry about that.
I am new to this forum.

I need C++ code for that.
I need to do this 1400 times because I want to find 1400 occurrences of the string ("Model %s", date[i]), where date[] is an array of 1400 dates.
That is why I do this in a for loop.
As for the "cursor" matter, the .txt file looks like this:
Model 123
a
b
c
d
e
.
.
.

Model 124
f
g
h
i
.
.
.

and so on ...
There are 1400 "Model ..." headers with 1000 lines beneath each (that is why the .txt file is 100 MB).
I need to place the "cursor" there because I want to make changes to the 1000 lines beneath the specific "Model ...".
I don't want to use _stricmp because it is time consuming.

Can you or anyone else help me, please?

millgates 10-07-2011 08:33 AM

I am still a bit confused...
You have this entire file loaded in memory? How is it stored? As an array of lines? A single char array containing the entire file?
So you basically need to scan lines until you find a line that begins with "Model", right?
Do you need a case-insensitive search?
I have never used _stricmp, but I don't think it would be more time consuming than necessary. Maybe it is the algorithm you are using?
If you have an xfunction(i) that searches the file for the i-th occurrence of the string, starting from the beginning each time, and you call it 1400 times in the loop you posted above, it will be inefficient.

Glenn_UOI 10-07-2011 09:01 AM

The .txt file is 116222 KB.
It is as follows:

bdarla: Model aerosmith-Aerosmith-01-Make_It" results :
1.aerosmith-Aerosmith-01-Make_It" 0
2.aerosmith-Aerosmith-04-One_Way_Street" 0.012394
3.depeche_mode-Music_for_the_Masses-01-Never_let_me_down_again" 0.015244
4.aerosmith-Aerosmith-07-Movin_Out" 0.016144
5.led_zeppelin-Led_Zeppelin_I-04-Dazed_And_Confused" 0.016541
6.aerosmith-Toys_In_The_Attic-06-Sweet_Emotion" 0.017774
7....
...
... (it goes on until line 1412)

bdarla: Model aerosmith-Aerosmith-02-Somebody" results :
1.aerosmith-Aerosmith-02-Somebody" 0
2.aerosmith-Aerosmith-04-One_Way_Street" 0.018489
3.aerosmith-Aerosmith-01-Make_It" 0.020151
4.led_zeppelin-Led_Zeppelin_I-04-Dazed_And_Confused" 0.020199
5.aerosmith-Aerosmith-03-Dream_On" 0.020235
6.beatles-Revolver-11-Doctor_Robert" 0.020719
7.aerosmith-Aerosmith-07-Movin_Out" 0.021579
8.aerosmith-Toys_In_The_Attic-09-You_See_Me_Crying" 0.023198
9. ...
..
.. (another 1412 lines)

and so on.

I have 1412 headers of the form ("bdarla: Model %s results :", audioNames[i]), where i runs up to audiodbSize.

The file is opened as follows:

FILE *stream;
stream=fopen ("audio.txt","r");

I do not need a case-sensitive search.

I need something like this:

for (int i = 0; i < audiodbSize; i++)
{
    xfunction("bdarla: Model %s results :", audioNames[i]);
    ...[other commands]
    ...[other commands]
    ...[other commands]
    ...[other commands]
    return();
}

As you can see, I am not looking for the i-th occurrence but for the block of lines which starts with the desired "bdarla: Model ..." header.

I need to make changes to the float numbers after the names of the songs.

johnsfine 10-07-2011 09:07 AM

Quote:

Originally Posted by Glenn_UOI (Post 4492388)
I want to find 1400 occurrences of the string ("Model %s", date[i]), where date[] is an array of 1400 dates.

Can you predict the sequence of date[i] within the file? The task is easier if date[], as an array in memory, either already is in the same sequence as in the file or can be presorted to that sequence.

If date[i] cannot be presorted to the right sequence, it is best to copy the date[] array into an associative container before reading the file.

It sounds like you want to read through the file looking for occurrences of "Model " then look for the immediately following text in your date container. (That is tricky, but still practical if the length of that text varies and the sequence isn't known in advance). If found, then you want to do some processing on the portion of the file immediately following that match. (You were very unclear about that part of your requirement).

Quote:

Originally Posted by Glenn_UOI (Post 4492388)
I don't want to use _stricmp because it is time consuming.

You want to read a file and write back changes. That takes enough time per byte that fairly inefficient use of _stricmp would not make a noticeable difference.

The key to decent performance in this problem is to process the whole file in one pass. You don't want to start over at the beginning of the file for each element of date[]. Even if you had to _stricmp every element of date to the text following "Model " each time you find "Model " in the file, that would still be faster than rescanning the whole file for each element of date[]. But by putting the elements of date[] in an associative container you can improve things even more.

Low level details, such as whether to use _stricmp vs. something more efficient for scanning for "Model " etc., make little performance difference compared to the high level organization of the algorithm.

Quote:

Originally Posted by Glenn_UOI (Post 4492446)
I need to make changes to the float numbers after the names of the songs.

Do those changes preserve the text size (number of digits) in each float?

If you intend in place modification of the file, then a change in text size is a big problem.

The usual way to modify a file is to read the original file while writing a modified file. If you want it to appear to the user as if you had really modified the file (rather than made a modified copy) then at the end you delete the input file and rename the output file to the name of the input file.
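
The read-original, write-copy, then delete-and-rename sequence could look roughly like this. It is a sketch with assumptions: the ".tmp" suffix, the function names rewriteFile and scaleScore, and the "multiply the score by a factor" change are all invented for illustration, since the thread does not say what change is actually needed. Note that the copy always ends each line with a newline, and the score is reprinted with default stream formatting, so the text width of the numbers may change.

```cpp
#include <cstdio>      // std::remove, std::rename
#include <cstdlib>     // std::strtod
#include <fstream>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical change: multiply the number after the last space by `factor`.
// Lines whose last token is not a plain number are returned unchanged.
std::string scaleScore(const std::string& line, double factor)
{
    std::string::size_type sp = line.rfind(' ');
    if (sp == std::string::npos)
        return line;
    const char* begin = line.c_str() + sp + 1;
    char* end = nullptr;
    double value = std::strtod(begin, &end);
    if (end == begin || *end != '\0')
        return line;                       // last token is not a number
    std::ostringstream os;
    os << line.substr(0, sp + 1) << value * factor;
    return os.str();
}

// Copy `path` line by line through `transform` into a temporary file,
// then replace the original with the copy.
bool rewriteFile(const std::string& path,
                 const std::function<std::string(const std::string&)>& transform)
{
    const std::string tmp = path + ".tmp";     // assumed temporary name
    {
        std::ifstream in(path);
        std::ofstream out(tmp);
        if (!in || !out)
            return false;
        std::string line;
        while (std::getline(in, line))
            out << transform(line) << '\n';
        out.flush();
        if (in.bad() || !out)
            return false;                      // read or write error
    }                                          // both files are closed here
    if (std::remove(path.c_str()) != 0)        // delete the input file
        return false;
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}
```

A call such as rewriteFile("audio.txt", [](const std::string& l) { return scaleScore(l, 2.0); }) would process the whole file in one pass. Because the output is a fresh file, it does not matter whether the number of digits in each float changes.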

Glenn_UOI 10-07-2011 09:50 AM

Thank you for all the help.

Thank you millgates, because your questions made me understand my problem better, and thank you johnsfine for the response.

A colleague of mine helped me and solved the problem.
He used #include <vector>, #include <string> and some other commands that I now need to understand.

Thank you all again.


All times are GMT -5.