LinuxQuestions.org
Help answer threads with 0 replies.
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices

Reply
 
Search this Thread
Old 08-26-2010, 09:14 PM   #1
wowsignal
LQ Newbie
 
Registered: Aug 2010
Posts: 2

Rep: Reputation: 0
How to translate data from a CSV file to a MySQL Database?


I have a CSV called source1 with data like:

Date 12:00:00 AM 12:30:00 AM 01:00:00 AM 01:30:00 AM 02:00:00 AM 02:30:00 AM 03:00:00 AM ...
07/31/10 51.52 52.48 52.64 52.8 63.36 63.84 45.6 ...
08/01/10 49.44 49.92 61.28 60.16 52.64 49.12 47.52 ...
08/02/10 49.12 48.96 48.96 50.72 60.16 58.08 51.04 ...
08/03/10 60.32 60.64 59.2 60.96 37.92 33.28 32.8 ...

I need to get this data into a MySQL database table with columns source, timestamp and data, the translated data looking like:

source timestamp data
source1 10/07/31 12:00:00 51.52
source1 10/07/31 12:30:00 52.48
source1 10/07/31 01:00:00 52.64
source1 10/07/31 01:30:00 52.8
source1 10/07/31 02:00:00 63.36
source1 10/07/31 02:30:00 63.84
source1 10/07/31 03:00:00 45.6
.
.
.
10/08/01 03:30:00 49.44
..
..
..

Also, the CSV file is updates every few hours and I need to load any new data from the file to the database without creating duplicates of data that has already been loaded to the database.

I was trying this using a shell script, but it was starting to get messy. I would appreciate advice about what language/approaches to use to get an elegant solution.

This is my first post, and any help is greatly appreciated in advance.
 
Old 08-26-2010, 10:55 PM   #2
14moose
Member
 
Registered: May 2010
Posts: 83

Rep: Reputation: Disabled
Hi -

1. Data import

There are actually many different approaches you could take, including writing a quick'n'dirty Perl or Python script.

I would recommend you take a look at "mysqlimport":
http://dev.mysql.com/doc/refman/5.0/en/mysqlimport.html

2. Avoiding duplicate data

Again, there are many different approaches you could take.

I would recommend a "two pronged" approach:

a) write a shell script that periodically renames the original datafile (e.g. "myfile.csv") to some other name (e.g. "myfile.001.csv"). The new data will be written to a (new) "myfile.csv"; your "import program" will only read from the renamed files (myfile.001.csv, myfile.002.csv, etc). You'll probably want a second script that will do the import, then roll off the processed file (for example, move it into a "done/" folder).

b) Another way to avoid "duplicates" is possibly to make the combination of date/time and sequence number (if your data has a sequence number) as the primary key. Primary keys, by definition, won't allow duplicates. If you try, you'll get a SQL error on "insert".

Again - you've got many different options. Take my suggestions as a starting point.

Good luck!
 
Old 08-27-2010, 05:50 AM   #3
wowsignal
LQ Newbie
 
Registered: Aug 2010
Posts: 2

Original Poster
Rep: Reputation: 0
Thanks 14moose

Never came across mysqlimport before, but it looks extremely useful.

Each file that I download into the application (approximately every hour) will have a copy of the previous data, and some new entries at the bottom, hence the need for duplication avoidance. The idea of using MySQL to reject duplicates sounds appealing, as this will save me programming time. I will have a sourceid to identify the source of the data, and at date/time to identify when the data is from. There should not be two rows with the same "sourceid" and "date/time" entry in the table. I will attempt to set up a combination of these columns as primary key.

Never used perl, but I have some PHP experience. I'll try and give it a go with that unless anyone can think of a good reason not to do it in PHP.
 
Old 08-27-2010, 09:34 AM   #4
TB0ne
Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 14,647

Rep: Reputation: 2575Reputation: 2575Reputation: 2575Reputation: 2575Reputation: 2575Reputation: 2575Reputation: 2575Reputation: 2575Reputation: 2575Reputation: 2575Reputation: 2575
Quote:
Originally Posted by wowsignal View Post
Thanks 14moose

Never came across mysqlimport before, but it looks extremely useful.

Each file that I download into the application (approximately every hour) will have a copy of the previous data, and some new entries at the bottom, hence the need for duplication avoidance. The idea of using MySQL to reject duplicates sounds appealing, as this will save me programming time. I will have a sourceid to identify the source of the data, and at date/time to identify when the data is from. There should not be two rows with the same "sourceid" and "date/time" entry in the table. I will attempt to set up a combination of these columns as primary key.

Never used perl, but I have some PHP experience. I'll try and give it a go with that unless anyone can think of a good reason not to do it in PHP.
No, PHP will work just fine. However, here's a little Perl-snippet to get you going, in case you want that path:
Code:
#!/usr/bin/perl

# The DataBaseInterface module.
use DBI;

# Set database variables here
$dbh = DBI->connect("DatabaseName", "DB User ID", "DB Password",{ RaiseError => 1})      # Open the database
  or die "Could not connect to database! $DBI::errstr";

open (INFILE, "/path/to/your.csv") or die "Can't open file!";
while (<INFILE>)
  {
   @words = split(',', $_);
   $var1 = $words[0];  # First data field in CSV
   $var2 = $words[1];  # Second, etc....
   $dbh->do("INSERT INTO TABLENAME values (\"$var1\",\"$var2\"");
  }
Pretty straightforward, and very generic. All you've got to change is the table/db names, and the # of variables, and you can import ANY csv to any MySQL table.
 
Old 08-27-2010, 10:22 AM   #5
Wim Sturkenboom
Senior Member
 
Registered: Jan 2005
Location: Roodepoort, South Africa
Distribution: Slackware 10.1/10.2/12, Ubuntu 12.04, Crunchbang Statler
Posts: 3,786

Rep: Reputation: 282Reputation: 282Reputation: 282
You can also look at the MySQL statement load data infile if you want to do it using MySQL statements (e.g. from within PHP)
 
Old 09-01-2010, 11:57 PM   #6
stevephillips79
LQ Newbie
 
Registered: Aug 2010
Posts: 2

Rep: Reputation: 0
Hello,
You have to make sure that you provide structure information along with you CSV data. The first line of the csv file should contain the list of field names in order to deliver necessary database structure information. Here are two samples of valid CSV data. Thank you.

[crabboy] removed spam [/crabboy]

Last edited by crabboy; 09-06-2010 at 09:33 PM.
 
Old 09-06-2010, 09:34 PM   #7
crabboy
Moderator
 
Registered: Feb 2001
Location: Atlanta, GA
Distribution: Slackware
Posts: 1,823

Rep: Reputation: 120Reputation: 120
stevephillips79: The forum rules do not permit advertising. Please visit http://www.linuxquestions.org/advertising/ for more information on advertising. Feel free to contact the forum admin if you have any questions about this policy.
 
  


Reply

Tags
automation, csv, mysql, shell script


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Importing a csv file into a mysql database kaplan71 Linux - Server 13 09-18-2009 06:13 PM
Need help on writing data to csv file tucs_123 Linux - Newbie 12 12-29-2008 03:42 AM
Processing data from a 'foreign' database with mysql, or tools to pre-process data. linker3000 Linux - Software 1 08-14-2007 08:36 PM
help extracting data from csv file willinusf Linux - General 10 10-27-2006 09:10 PM
need to get data from xml file to MySQL database, and then use php to access Armon Linux - General 1 01-18-2006 02:54 PM


All times are GMT -5. The time now is 03:57 PM.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration