Pulling text from a text file and append to CSV
Hi All,
I don't post here often, but scripting really isn't my thing but I have had an idea that may well save a lot of time ! We have a lot of staff who complete weekly timesheets on paper, and they are all returned here to HQ to process. One of the workers came up with a great idea - send in your timesheet details via SMS on a daily basis. We had a quick chat about formats and I have enough information to get started and prove the concept. I have installed and configured SMS Server Tools3 on my SIM enabled netbook and have tested it's ability to receive SMS messages in a specific directory. What I would like to do is create a script that will check the specified directory regularly, and from each file it finds, extract the information into a CSV file. The incoming format is like this: Code:
From: 447123456789 Code:
EMPLOYEE_NO:DATE:CLIENT:START_TIME:ENDTIME So could someone point in the right direction with the tools I would need for a shell script? Ideally this would also create new CSV's every monday for the next seven days. I am familiar with the linux shell, just not had to do any scripting in the past! Any advice greatly appreciated! |
Let me first say this: what you describe is probably best suited for a scripting language like python or perl for readability. I may come back and edit this post or add a response with a python script.
That said, I took this as an opportunity to teach myself a little more about using sed. The command is long, but I managed to have sed do everything you asked (I think) when it comes to saving and formatting the data. Here's the command: Code:
sed -srn '/^[0-9]+/ { s@:@,@g ; H } ; /^From:/ { s@From: (.*)$@\1,@ ; H } ; /^Sent:/ { s@Sent: (.*)$@\1,@ ; H } ; /^Received:/ { s@Received: (.*)$@\1,@ ; H } ; $ { g ; s@\n@@g ; p }' filename.txt Code:
user@localhost$ sed -srn '/^[0-9]+/ { s@:@,@g ; H } ; /^From:/ { s@From: (.*)$@\1,@ ; H } ; /^Sent:/ { s@Sent: (.*)$@\1,@ ; H } ; /^Received:/ { s@Received: (.*)$@\1,@ ; H } ; $ { g ; s@\n@@g ; p }' testdata.txt If you do not want the colons replaced by commas, delete the part of the command in blue. The idea is, you run the command on each file and redirect the output (with >>) to some central CSV file. For instance (I made some sample copies of your data file, modified them a bit, and left them all with a ".txt" extension): Code:
user@localhost$ sed -srn '/^[0-9]+/ { s@:@,@g ; H } ; /^From:/ { s@From: (.*)$@\1,@ ; H } ; /^Sent:/ { s@Sent: (.*)$@\1,@ ; H } ; /^Received:/ { s@Received: (.*)$@\1,@ ; H } ; $ { g ; s@\n@@g ; p }' *.txt Like I said, I might come along and post a python script a little later. Right now: sleep. |
Wow!
Thank you so much - that will definitely get me going :-) Have a great sleep! |
Here's a pure bash solution for the parsing and output
Code:
#!/bin/bash |
Here's a modified form of the command... Still does the same thing. It just does it in fewer characters :)
Code:
sed -srn '/^[0-9]+/ { s@:@,@g ; H } ; /^(From|Sent|Received):/ { s@[^:]*:[[:space:]]*(.*)$@\1,@ ; H } ; $ { g ; s@\n@@g ; p }' *.txt Just as an FYI, the above command triggers on four kinds of lines in the file. Lines that begin with: 1. From: 2. Sent: 3. Received: 4. <one or more digits> The From, Sent and Received lines can have any number or type of space after the colon. Anything after the space(s) will be printed at the end. The line that starts with one or more digits will be printed regardless. For the python script... it's more or less similar to catkin's shell script. This python script would be invoked as: Code:
/path/to/script.py /path/to/sms_files/*.txt Code:
#!/usr/bin/python Quote:
You just need to determine how often. Cron can schedule a task once per minute all the way up to once per year. "extract the information to a CSV" = use the command (or one of the scripts provided) and use redirection (e.g. "sed -srn <pattern> /path/to/specific/dir/*.txt >> /path/to/dir/with/file.csv") Quote:
Code:
mv /path/to/dir/with/file.csv /path/to/archive/dir/file_$(date +%Y%m%d).csv ; touch /path/to/dir/with/file.csv |
Hats off to you chaps! I had already sorted uploading the csv to Google apps spreadsheets and manipulating in the middle, all the info here will now fill the gaps for me - and more to the point will get me started in scripting which is what I really want to do to develop other items here.
Really, really appreciate your contributions - extra kudos to you both for donating some of your time to help a charity! :-) I'll post back here what I end up with! Regards Ifan |
All times are GMT -5. The time now is 07:57 PM. |