LinuxQuestions.org
Review your favorite Linux distribution.
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - Software
User Name
Password
Linux - Software This forum is for Software issues.
Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.

Notices


Reply
  Search this Thread
Old 08-31-2012, 03:44 AM   #1
lothario
Member
 
Registered: Apr 2004
Posts: 340

Rep: Reputation: 30
Need shell script help with a CSV file.


I have spent a lot of time extracting and formatting a very large (CSV) data file from a real-time device.
Here is a sample:
Code:
"zada68e","mv47b2","pd85sq5","fneab7","b3936k68"
"c076a4cb7","d6d1367","1b5812d6","ce8e82","ie5637"
"011b124","bc339504","color RED","dym989","8a5d78"
On      20
Off     24
Total   44
"69ad2","d363ad0","f41328","766ge1b","a29f"
"a8da68e","47db2","fd85","feab7","c936868"
"10e3","bda1b","143da56","0f19472","495bcexje"
"c55f950","nm787x","color GREEN","25329","10a9f2b"
On      15
Off     14
Total   29
Where:
  • Each data row has 5 fields.
  • Each field has double quotes around it.
  • Fields are separated by a comma.
  • Fields have variable lengths.
  • The number of data rows is unknown.
  • Every group of data rows is followed by 3 lines. On, Off and Total.
I'm not a shell script expert.
Can you help me with a basic shell script that will make the data look like this:
Code:
"zada68e","mv47b2","pd85sq5","fneab7","b3936k68"
"c076a4cb7","d6d1367","1b5812d6","ce8e82","ie5637"
"011b124","bc339504","color RED","dym989","8a5d78"
On      20
Off     24
Total   44

Date__________________color RED___________________
<new page>


"69ad2","d363ad0","f41328","766ge1b","a29f"
"a8da68e","47db2","fd85","feab7","c936868"
"10e3","bda1b","143da56","0f19472","495bcexje"
"c55f950","nm787x","color GREEN","25329","10a9f2b"
On      15
Off     14
Total   29

Date________________color GREEN___________________
<new page>
Where:
  1. After the Total line, there is a blank line.
  2. The line above the On is copied but fields 1 and 2 are replaced by Date________________
  3. Field 3 is copied as-is.
  4. Fields 4 and 5 are replaced by "_" underscores.
  5. Finally, the new page control character(s) is inserted to ensure that nothing else prints on this page.
Thank you.
 
Old 08-31-2012, 09:38 AM   #2
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,541

Rep: Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065
Whenever I have to deal with stuff like this I do a "pre-edit" (with sed) then use awk to get what I need in a form I want.

I create a little sed file, something like "edit.sed" that replaces the commas with a vertical bar or a tab and gets rid of the double quotes. Something like this:
Code:
s/","/|/g     < globally replace "," with a vertical bar>
s/^"//g       < globally replace leading " with nothing>
s/"$//g       < globally replace trailing " with nothing>
and maybe a few other messy things -- this is just to get you started. The Good Thing about sed is that it's a streaming editor so every one of those directives are acted on a line-by-line basis rather than scan the entire file for each directive; that makes it quick.

One messy thing you can get rid of or alter is a blank line in your input:
Code:
s/  *$//g     < replace any trailing space with nothing >
/$^/d         < delete blank lines >
Or substitute
Code:
s/  *$//g
s/^$/Date/g
Just a hint that may help.

You run that with
Code:
sed -f edit.sed filename.csv > newfilename.csv
Why the vertical bar? Well, they don't appear in any language and they're convenient as a field separator (you could also use a tab character). If you already have tab character in your input file (not as field separators but as actual text), add something like
Code:
s/<tab>/<space>/g
to the first line ("<tab>" is an actual tab character and "<space>" is a, you know, space).

Or, if there are tabs, you could replace them with vertical bars (same as above).

When you use AWK, you simply set a field separator in the BEGIN {} section:
Code:
BEGIN {
     FS="|"
}
Or, if you decide to use tabs as field separators,
Code:
BEGIN {
     FS="\t"
}
Then you refer to your five fields as $1, $2, ..., $5.

AWK provides you programming capability; i.e., you can do what you need to do by comparing and matching strings then act on them.

Hope this helps some.

Last edited by tronayne; 08-31-2012 at 09:45 AM. Reason: Fumble finger
 
1 members found this post helpful.
Old 08-31-2012, 11:40 AM   #3
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 25,624

Rep: Reputation: 7641Reputation: 7641Reputation: 7641Reputation: 7641Reputation: 7641Reputation: 7641Reputation: 7641Reputation: 7641Reputation: 7641Reputation: 7641Reputation: 7641
Quote:
Originally Posted by lothario View Post
I have spent a lot of time extracting and formatting a very large (CSV) data file from a real-time device.

I'm not a shell script expert. Can you help me with a basic shell script that will make the data look like this:

Where:
  1. After the Total line, there is a blank line.
  2. The line above the On is copied but fields 1 and 2 are replaced by Date________________
  3. Field 3 is copied as-is.
  4. Fields 4 and 5 are replaced by "_" underscores.
  5. Finally, the new page control character(s) is inserted to ensure that nothing else prints on this page.
Thank you.
Not meaning to sound nasty, but this seems to be a recurring theme:
https://www.linuxquestions.org/quest...-field-750045/
https://www.linuxquestions.org/quest...t-file-751241/
https://www.linuxquestions.org/quest...attern-658139/
https://www.linuxquestions.org/quest...-files-644604/

Can you post what you've tried/done on your own first, before asking others to write your scripts for you?? Bash scripting tutorials are abundant, and applying what you've been given in the past should also be a good starting point:
http://tldp.org/LDP/abs/html/
http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html
 
1 members found this post helpful.
  


Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
Reading a .CSV file and then calculating average per minute basis in shell script. krishdeeps Linux - Newbie 1 04-23-2010 04:38 PM
Shell Script for CSV file comparision aravind_balan Programming 1 02-17-2009 03:33 AM
Calculate average from csv file in shell script khairilthegreat Linux - Newbie 5 11-21-2007 12:57 PM
Filtering a CSV file from web log with shell script? Micro420 Programming 8 08-22-2007 03:13 AM
Shell script to read from csv file hendemeg Programming 1 05-11-2004 08:23 PM

LinuxQuestions.org > Forums > Linux Forums > Linux - Software

All times are GMT -5. The time now is 12:02 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration