Convert text paragraph for database
I have successfully scanned and OCR'd thousands of "famous quotes". Now I have a text file:
Code:
One or more
Code:
"One or more lines of text immediately followed by.","The author name on one line."
Hi.
Code:
$ cat infile |
GNU-awk:
Code:
$ awk 'BEGIN { RS = "\n\n\n" }
GNU Awk:
Code:
# a one liner should be less than 80 characters
As a learning exercise I like to implement and test solutions posted by respondents to interesting problems such as this one. My test program (shown below) uses four solutions -- my own, and those already posted by firstfire, colucix, and ntubski. I constructed a test file of real-world quotations.
It is perplexing to find that the output files from the four solutions differ to some degree. Perhaps I have misunderstood the problem; perhaps there is ambiguity in the OP's problem statement.
Input file ...
Code:
Politics is the art of looking for trouble, finding it everywhere,
Code:
#!/bin/bash
Daniel B. Martin
Thank you very much for these answers.
I quickly noticed that my question was inaccurate about how many line breaks there are. Thanks for the comment, ntubski.
- So I replaced \n\n\n with \n\n\n* where appropriate.
- The use of "gensub" in a script worked only after installing gawk.
--
"Oikeastaan tiedämme vain, kun tiedämme vähän: tietämisen mukana kasvaa epäilys.","Goethe, Maximen und Reflexionen."
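For anyone who cannot install gawk, here is a rough sketch that may do the same job. It is not one of the solutions posted above. It swaps the RS regex for awk's POSIX paragraph mode (RS = ""), which treats any run of blank lines as a record separator, and it uses gsub instead of gensub. The function name quotes_to_csv is made up for illustration. Doubling embedded double quotes is an assumption about what the database import expects, and so is skipping records that have only one line.

```shell
#!/bin/sh
# quotes_to_csv: hypothetical helper name, not from the thread.
# Reads blank-line-separated records (quote lines followed by one author line)
# and prints one "quote","author" CSV row per record.
quotes_to_csv() {
    awk 'BEGIN { RS = ""; FS = "\n" }   # paragraph mode: each line is a field
    NF < 2 { next }                     # assumption: skip records with no author line
    {
        quote = $1
        for (i = 2; i < NF; i++)        # join all lines except the last
            quote = quote " " $i
        author = $NF                    # last line of the record is the author
        gsub(/"/, "\"\"", quote)        # assumption: double embedded quotes for CSV
        gsub(/"/, "\"\"", author)
        printf "\"%s\",\"%s\"\n", quote, author
    }' "$@"
}
```

Usage would be something like `quotes_to_csv infile > quotes.csv`, or piping text into the function on standard input. Since paragraph mode does not care whether records are separated by one, two, or more blank lines, it sidesteps the question of exactly how many line breaks the OCR output contains.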