LinuxQuestions.org


lynx81191 11-15-2007 08:54 PM

Bash read in variable length text records
 
I need to be able to read in variable length text records on a daily basis without losing track I know to use date and a file to store what record I'm on because the program will execute every day and needs to process the next record in the file but how to determine what record? the separator between records is going to be an extra line so what I have is a long text file with several pages worth of text and I have an additional line of white space as a separator so as to delineate records I would use while but the time frame for record processing needs to be a day and the variable record length creates a problem? Thanks in advance.

chrism01 11-15-2007 10:10 PM

It's a little unclear what you are asking for, especially as you have no punctuation.
IIUC, each data rec is in fact 2 recs, one with data, one empty.
All you need to do is count which rec you have reached, and as you have noted, save that value for next time in a temp file.
Use cron to run the prog each day, which then reads the temp file and uses the rec cnt therein to read to that rec in the data file and extract the next data rec, incrementing the rec cnt and saving it.
HTH
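
A minimal sketch of the counter-file idea described above. The function name `next_record` and both file arguments are hypothetical, and it assumes each data rec is a single non-empty line with an empty line after it, as guessed in this post. It reads the saved count, prints the next rec, and saves the new count for the following run.

```shell
#!/bin/bash
# next_record DATA_FILE STATE_FILE  (both names are hypothetical)
# Prints the next unprocessed record and advances the saved counter.
next_record() {
    local data_file=$1 state_file=$2 count=0 next rec

    # Read the number of the last processed record; start at 0 if no state exists.
    [ -f "$state_file" ] && count=$(<"$state_file")
    next=$((count + 1))

    # Assumption: one record per non-empty line; blank separator lines are skipped.
    rec=$(awk -v want="$next" 'NF { if (++n == want) { print; exit } }' "$data_file")

    # Nothing left to process: leave the counter alone and report failure.
    [ -n "$rec" ] || return 1

    printf '%s\n' "$rec"
    printf '%s\n' "$next" > "$state_file"   # save the position for the next run
}
```

A script that calls `next_record data record.count` once and pipes the output to mail could then be scheduled from cron to run once a day.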

lynx81191 11-15-2007 11:06 PM

clarification
 
I apologize. Let me clarify. I have a 14 page long text file spaced with an extra blank line between records. The records consist of words of text separated by spaces. I need to pull records one at a time, 1 per day, and the only field delimiter I have to use is an extra blank line. I can write bash scripts but I have no idea how to process the records using the field delimiter that I have to work with. I've looked into regular expressions but am unsure what command to use and the exact structure of the regular expression that I would need. If I use a 'for' loop with 'cut' I cannot use a multiple character field delimiter (there are spaces between the words in the records). And I can't use various other commands because of this, so I am looking to sed and awk but have no knowledge of how to use a regular expression to accomplish this.

lynx81191 11-15-2007 11:31 PM

Quote:

Originally Posted by chrism01 (Post 2960514)
It's a little unclear what you are asking for, especially as you have no punctuation.
IIUC, each data rec is in fact 2 recs, one with data, one empty.
All you need to do is count which rec you have reached, and as you have noted, save that value for next time in a temp file.
Use cron to run the prog each day, which then reads the temp file and uses the rec cnt therein to read to that rec in the data file and extract the next data rec, incrementing the rec cnt and saving it.
HTH

#!/bin/bash
# Mail each line of the file "data" as its own message
while read -r text
do
echo "$text" | mail -s record user@domain.com
done < data

This will not work because of the field delimiters?

I apologize. Let me clarify. I have a 14 page long text file spaced with an extra blank line between records. The records consist of words of text separated by spaces. I need to pull records one at a time, 1 per day, and the only field delimiter I have to use is an extra blank line. I can write bash scripts but I have no idea how to process the records using the field delimiter that I have to work with. I've looked into regular expressions but am unsure what command to use and the exact structure of the regular expression that I would need. If I use a 'for' loop with 'cut' I cannot use a multiple character field delimiter (there are spaces between the words in the records). And I can't use various other commands because of this, so I am looking to sed and awk but have no knowledge of how to use a regular expression to accomplish this.

chrism01 11-17-2007 08:53 PM

Here's the basics

Code:

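file=yourfile
# You want e.g. rec 3 from yourfile - see my previous post
num1=3

# Get the rec: keep the first $num1 lines, then take the last of those
rec=$(head -n "$num1" "$file" | tail -n 1)

# awk will use/accept any amount of whitespace as the field separator
field=$(echo "$rec" | awk '{print $3}')
echo "$field"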
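
If a record can span several lines, the head/tail approach above picks a single line and not a whole record. A possible alternative, sketched here with a hypothetical helper `get_record`, is awk's paragraph mode: with `RS` set to the empty string, awk treats runs of blank lines as the record separator. This assumes the separator lines are truly empty; a line containing only spaces or tabs would not count as a separator.

```shell
#!/bin/bash
# get_record N FILE: print the Nth blank-line-separated record of FILE.
# (get_record is a hypothetical helper name, not from the thread.)
get_record() {
    # RS= (empty) puts awk in paragraph mode: records are separated by
    # one or more blank lines, so a record may span several lines.
    awk -v RS= -v want="$1" 'NR == want { print; exit }' "$2"
}
```

For example, `get_record 3 yourfile` would print the third record, however many lines it has.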



All times are GMT -5.