How to load data from a CSV file into a MySQL database?
Also, the CSV file is updated every few hours, and I need to load any new data from it into the database without creating duplicates of rows that have already been loaded.
I was trying this using a shell script, but it was starting to get messy. I would appreciate advice about what language/approaches to use to get an elegant solution.
This is my first post, and any help is greatly appreciated in advance.
Again, there are many different approaches you could take.
I would recommend a "two pronged" approach:
a) write a shell script that periodically renames the original datafile (e.g. "myfile.csv") to some other name (e.g. "myfile.001.csv"). The new data will be written to a (new) "myfile.csv"; your "import program" will only read from the renamed files (myfile.001.csv, myfile.002.csv, etc). You'll probably want a second script that will do the import, then roll off the processed file (for example, move it into a "done/" folder).
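A minimal sketch of step (a) in plain sh — the directory and file names here are only placeholders (the script creates a temporary directory so it is self-contained):

```shell
#!/bin/sh
set -e
DIR=$(mktemp -d)                       # stands in for the real data directory
mkdir -p "$DIR/done"
printf 'sourceid,datetime,value\n' > "$DIR/myfile.csv"   # pretend feed output

# Pick the next free sequence number: myfile.001.csv, myfile.002.csv, ...
n=1
while [ -e "$DIR/$(printf 'myfile.%03d.csv' "$n")" ]; do
    n=$((n + 1))
done
snap="$DIR/$(printf 'myfile.%03d.csv' "$n")"

mv "$DIR/myfile.csv" "$snap"           # the feed will recreate myfile.csv

# ... the import program reads "$snap" here ...

mv "$snap" "$DIR/done/"                # roll the processed file off
```

Because `mv` within one filesystem is atomic, the feed never sees a half-renamed file.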
b) Another way to avoid "duplicates" is to use the combination of date/time and sequence number (if your data has a sequence number) as the primary key. Primary keys, by definition, won't allow duplicates. If you try, you'll get a SQL error on "insert".
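As a sketch of option (b) — the table and column names here are made up, so adjust them to your schema — MySQL's INSERT IGNORE then skips any row whose key already exists instead of aborting the whole load:

```sql
-- Composite primary key: at most one row per (sourceid, obstime) pair.
CREATE TABLE readings (
    sourceid INT       NOT NULL,
    obstime  DATETIME  NOT NULL,
    value    DECIMAL(10,2),
    PRIMARY KEY (sourceid, obstime)
);

-- Re-inserting an existing key is silently skipped, not an error:
INSERT IGNORE INTO readings VALUES (1, '2007-06-01 12:00:00', 3.14);
```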
Again - you've got many different options. Take my suggestions as a starting point.
Never came across mysqlimport before, but it looks extremely useful.
Each file that I download into the application (approximately every hour) will have a copy of the previous data, plus some new entries at the bottom, hence the need for duplicate avoidance. The idea of using MySQL to reject duplicates sounds appealing, as this will save me programming time. I will have a sourceid to identify the source of the data, and a date/time to identify when the data is from. There should not be two rows with the same "sourceid" and "date/time" entry in the table. I will attempt to set up a combination of these columns as the primary key.
Never used perl, but I have some PHP experience. I'll try and give it a go with that unless anyone can think of a good reason not to do it in PHP.
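If you want to stay out of Perl/PHP entirely, mysqlimport can do the load by itself. A sketch — the database name, user, and path are assumptions, and note that mysqlimport derives the table name from the file name:

```shell
# Load mydb.readings from readings.csv; --ignore skips rows whose
# primary key already exists, so re-loading the full file is safe.
mysqlimport --local --ignore \
    --fields-terminated-by=',' \
    --lines-terminated-by='\n' \
    -u dbuser -p mydb /path/to/readings.csv
```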
No, PHP will work just fine. However, here's a little Perl-snippet to get you going, in case you want that path:
Code:
#!/usr/bin/perl
# Generic CSV-to-MySQL loader using DBI, the database interface module.
use strict;
use warnings;
use DBI;

# Set database connection details here
my $dbh = DBI->connect("DBI:mysql:DatabaseName", "DB User ID", "DB Password",
    { RaiseError => 1 })    # open the database
    or die "Could not connect to database! $DBI::errstr";

open(my $infile, '<', '/path/to/your.csv') or die "Can't open file: $!";
while (<$infile>)
{
    chomp;                      # drop the trailing newline
    my @words = split(',', $_);
    my $var1  = $words[0];      # first data field in CSV
    my $var2  = $words[1];      # second, etc....
    # Placeholders (?) let DBI quote the values safely.
    $dbh->do("INSERT INTO TABLENAME VALUES (?, ?)", undef, $var1, $var2);
}
close($infile);
$dbh->disconnect;
Pretty straightforward, and very generic. All you've got to change is the table/db names, and the # of variables, and you can import ANY csv to any MySQL table.
Hello,
You have to make sure that you provide structure information along with your CSV data. The first line of the CSV file should contain the list of field names, so that the necessary database structure information is delivered. Here are two samples of valid CSV data. Thank you.
stevephillips79: The forum rules do not permit advertising. Please visit http://www.linuxquestions.org/advertising/ for more information on advertising. Feel free to contact the forum admin if you have any questions about this policy.