text indexing program
Hi everybody.
I'm looking for a program (preferably one that runs from the command line) that can index the words in a txt file, associating each word with the page numbers on which it appears. For instance:
pages
------
word1 - 1,4,6
word2 - 5,9,23
word3 - 7,44,88
Thanks for any help |
How would you define pages in a plaintext file?
|
Thanks for answering
Quote:
So I guess there is no solution to my original question? And if I want to index all the words by page, do I need to change the format of the file? |
There may be a control character inserted into the text file at the end of each page, in which case a program could easily determine the page number using that. There is an ASCII code for it, named FF (form feed, hex value 0C, octal 014). You can insert a FF in vim using control-L.
If your text files have this character at the end of each page, you could write a very quick Perl script to do your indexing, something like this:
Code:
#!/usr/bin/perl |
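The Perl script in the post above is cut off in this copy of the thread. As a rough illustration of the form feed approach, here is a minimal sketch in Python rather than Perl; it is not the poster's script. It assumes pages are separated by the FF character, treats a word as a run of letters, digits, or apostrophes, and lowercases everything. The names build_index and format_index are made up for this example.

```python
#!/usr/bin/env python3
"""Sketch: index the words of a text file by page, assuming form feed page breaks."""
import re
import sys
from collections import defaultdict

# Assumption: a "word" is a run of ASCII letters, digits, or apostrophes.
WORD_RE = re.compile(r"[A-Za-z0-9']+")


def build_index(text):
    """Map each lowercased word to the sorted list of pages it appears on."""
    index = defaultdict(set)
    # Assumption: pages are separated by the form feed character (hex 0C).
    for page_number, page in enumerate(text.split("\f"), start=1):
        for word in WORD_RE.findall(page):
            index[word.lower()].add(page_number)
    return {word: sorted(pages) for word, pages in index.items()}


def format_index(index):
    """Render the index as 'word - 1,4,6' lines, sorted alphabetically."""
    return "\n".join(
        f"{word} - {','.join(str(p) for p in pages)}"
        for word, pages in sorted(index.items())
    )


# Only runs when exactly one file name is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], encoding="utf-8") as handle:
        print(format_index(build_index(handle.read())))
```

Saved under a hypothetical name such as index_pages.py, it would be run as python3 index_pages.py input.txt. If the file has no form feeds at all, every word is simply reported on page 1.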
Thanks a lot but ... that's too beautiful to be true!
It's my first Perl script. From the command line, in the directory that contains both the Perl script and the txt file, I run: perl perl_script.pl (I hope the extension is right; I have checked it). But nothing happens. How do I pass in the file to be indexed? |
First you need to make the script file executable:
Code:
chmod 755 perl_script.pl
Then run it, passing the file to be indexed as an argument:
Code:
./perl_script.pl input.txt |
Grand.
Let me offer you a Guinness when you come to Rome. One last thing, if you don't mind: after a few tries, I've saved the output like this:
Quote:
Thanks again |
All cat does is read standard input and write to standard output. You can accomplish the same thing like this:
Code:
./perl_script.pl input.txt > output.txt |
Thanks a lot
|