LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - Newbie (https://www.linuxquestions.org/questions/linux-newbie-8/)
-   -   Combine files using shell scripting (https://www.linuxquestions.org/questions/linux-newbie-8/combine-files-using-shell-scripting-401664/)

ccray 01-10-2006 11:49 AM

Combine files using shell scripting
 
I am not a programmer and don't have a lot of experience scripting (only some editing of existing scripts). I have two csv files that both contain the same "header columns" or first lines that need to be combined into one csv file keeping the one header column and deleting the other.

I also may need to change the order of some "columns" of csv data.

Any help is appreciated. I don't think this should be a lot of code but am not sure what to do.

Example of csv data below:

file1.csv
ID,Password,Last Name,First Name,Category
bsmith,pass,Smith,Bob,Spring

file2.csv
ID,Last Name,First Name,Password,Category
jsmith,Smith,Jake,pass,Summer

MensaWater 01-10-2006 12:02 PM

Several ways to do this but if the header is always the same I'd just do:

Code:

echo "ID,Password,Last Name,First Name,Category" >new.csv

grep -v "ID,Password,Last Name,First Name,Category" file1.csv >>new.csv

grep -v "ID,Password,Last Name,First Name,Category" file2.csv >>new.csv

This would put the header in first then put in only the lines OTHER than the header from each of the files. The new file would be "new.csv.

Matir 01-10-2006 12:47 PM

It appears that his format changes from file to file. PERL is probably the best answer for a script to handle this (using hashes).

pljvaldez 01-10-2006 04:01 PM

A "cheater" way of doing this could also be to import both files to separate open office spreadsheets, adjust the columns as needed, copy the rows over to combine the files, and then export it as a CSV. I know, not a script and not very elegant, but effective none the less...

ccray 01-11-2006 09:07 AM

Manual method
 
Yes. That has been the way I have done it in the past and it just adds time I don't have. I am trying to get automate these sorts of manual monkey tasks. I am going to see if I can edit the script that creates the originals to printout the data in the same order. That will allow me to use the suggestion above.

If anyone has a quick change columns suggestions in case I run into problems, those are appreciated as well.

Matir 01-11-2006 09:20 AM

I wrote a program similar to what you need for my unix class a year ago, using bash, awk, and sed. It was a clunker. I honestly think PERL is your best option. Use the first line to get the re-ordering, then use split and print to reorder the lines.

ccray 01-11-2006 09:37 AM

I haven't written perl from scratch, just manipulated it. Do you have a sample or piece of code to work from?

MensaWater 01-11-2006 09:53 AM

Perl certainly has a lot of power but for what you originally wrote my solution would work fine and is a 3 line scipt already done.

I see in latter posts some discussion of reordering the columns which wasn't part of your original. You can do this with the awk command since its a comma separated file by using comma as the delimiter.

For example:
Code:

awk -F, '{print $2","$1","$4","$3","$5}' filename
Would print your original records that have format:
ID,Password,Last Name,First Name,Category

with new format:
Password,ID,First Name,Last Name,Category.

The -F tells awk which character to use as delimiter. The $1 through $5 are the 5 fields. You have to put quotes around the commas in the print statement as you want a literal comma to appear in the output (without the quotes it would just put a space between them).

You of course can change the order anyway you want.

Here again simple shell script with 3 lines would suffice.

Code:

echo "Password,ID,First Name,Last Name,Category" >new.csv

awk -F, '{print $2","$1","$4","$3","$5}' file1.csv |grep -v  "Password,ID,First Name,Last Name,Category" >>new.csv

awk -F, '{print $2","$1","$4","$3","$5}' file1.csv |grep -v "Password,ID,First Name,Last Name,Category" >>new.csv


Matir 01-11-2006 10:57 AM

Quote:

Originally Posted by jlightner
Perl certainly has a lot of power but for what you originally wrote my solution would work fine and is a 3 line scipt already done.

I see in latter posts some discussion of reordering the columns which wasn't part of your original. You can do this with the awk command since its a comma separated file by using comma as the delimiter.
...

This is actually quite an elegant solution, though his original post did have columns in different orders and did (briefly) mention reordering them. :)

MensaWater 01-11-2006 11:01 AM

Thanks.

I stand (sit actually) corrected. His post did mention possibly needing to reorder the columns. Not only that I'd completely missed the fact that he had a slightly different order in csv1 than in csv2.

He'd just have to change the awk numbering for the two files to get them the same. In my code I'd put the same numbering in both awk lines but that wouldn't work. (e.g. $2 in csv1 is $4 in csv2).

Matir 01-11-2006 11:14 AM

Somewhat odd reports he seems to be dealing with, of course. :) Out of curiousity, what part of the ATL are you from? (Don't mean to get side-tracked, just noticed you're from Atlanta)

MensaWater 01-11-2006 11:45 AM

Not that bad - the only thing that is bad is it seems to imply he has passwords stored in non-encrypted files. Hackers would love to find those csv files.

As for me I'm in Cobb county. How about you? Also are you a member of ALE?

Matir 01-11-2006 11:53 AM

You know, I never noticed the PASSWORD field in there. We can hope they're crypted passwords. :)

I live in North DeKalb. I've never managed to make it to ALE, though I would really like to. Unfortunately, being a college student, I have a class until 7:10 on tuesdays and thursdays.

ccray 01-12-2006 09:13 AM

Actually, the passwords are somewhat irrelevant in that they are generated for both sets of accounts but are overridden by the server referencing LDAP. One of the files generated does have relevant passwords but these don't have much value to anyone as they are mainly a way of getting a user viewpoint for an administrator who by default can test from this vantage point.

Thanks for the heads up though. I am going to see how far I can get with all the suggestions. From what I do know I think they should do the trick.

I am not a member of ALE (need to wikipedia that one). The code level work I do is somewhat a sideline. I am working to grow more and more into managerial roles. This sort of thing gives me a better understanding of the dark cubicle world.:)

I have to say this first experience using this group has been far better than other forums I have tried.

MensaWater 01-12-2006 11:14 AM

ALE is a "Atlanta Linux Enthusiasts" a metro Atlanta Linux Users Group (LUG). My question was directed at Matir as we both happen to be in the the Atlanta area. There likely is a LUG in your area and its sometimes nice to go and hear what others are doing.


All times are GMT -5. The time now is 04:49 PM.