I am trying to create a single csv file, using multiple .txt files as input. The output format should be: Column1 - name of .txt file, Column2 - contents of txt file.
I'm assuming the contents of the text files are rather small.
If that is the case, then a simple approach would be:
Code:
for file in <list>; do
    # Quote "$file" so names containing spaces survive
    contents=$(cat "$file")
    echo "$file,$contents" >> csvfile
done
where <list> is a list of the text files.
That list could be the result of an 'ls' command, or the contents of yet another file containing a list of file names, or an explicitly entered list or pattern.
If your text files contain multiple lines, then you may need to translate them into single lines (for example with tr) if you want only two columns in your csv file. You can do that on the fly in the above loop if desired.
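A minimal sketch of doing that translation on the fly, assuming newlines should simply become spaces. The helper name txt2csv and its argument order are made up for illustration; they are not from the post above.

```shell
# txt2csv is a hypothetical helper: the first argument is the CSV to write,
# the remaining arguments are the text files to read.
txt2csv() (
    out=$1
    shift
    : > "$out"                              # start from an empty CSV
    for file in "$@"; do
        # tr replaces every newline with a space, so the whole file
        # becomes one line; blank lines just turn into extra spaces.
        contents=$(tr '\n' ' ' < "$file")
        contents=${contents% }              # drop the space left by the final newline
        printf '%s,%s\n' "$file" "$contents" >> "$out"
    done
)
```

It would be called as txt2csv csvfile *.txt. Commas inside the text are not escaped here, so this only gives a clean two-column file when the contents contain none.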
Thanks raconteur, yes, you are correct: the contents of the text files are small, 4 or 5 lines at most, but they may contain leading spaces and content separated by blank lines. However, when I tried to run your code (after first creating a file called list with ls > list), the output seemed to be just the contents of the file called list:
list,xyz
abc
def
ghi
where abc, xyz etc are the names of the files and the contents of list
You may want to add quotes around the second field. This is usually done for text fields in csv files. CSV files don't normally have fields with newlines in them. It also raises the question of how embedded quotes are handled. I tried an experiment, exporting a two-record csv file from oocalc that contained quotes in the second column.
Code:
"file1","This is a test. How will it export a csv file that contains “embedded quotes?” I will also open up the file and insert newline characters if need be."
"file2","Line1 Line2 Line3 Line4"
"File 3","This is the last line. "
I did add newlines to the csv file and when I reloaded them into oocalc, they were stripped out.
A csv file with a multiline field would probably not be very portable. You might consider using an xml format instead.
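For the quoting itself, a common convention (the one described in RFC 4180) is to wrap each field in double quotes and double any quote character inside it. A rough sketch under that assumption follows; csv_quote and csv_row are hypothetical helper names, not something from this thread.

```shell
# Quote one field: double every embedded " and wrap the result in quotes.
csv_quote() {
    printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')"
}

# Build a two-column row: file name, then contents.
csv_row() {
    printf '%s,%s\n' "$(csv_quote "$1")" "$(csv_quote "$2")"
}
```

Whether a given spreadsheet program accepts newlines inside such a quoted field is a separate matter, as the oocalc experiment above suggests.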
If the purpose is to create these files from a script, this is often done in bash using HERE documents. A HERE document can even contain bash variables that are expanded before the file is written.
Code:
#!/usr/bin/env bash
MAX=300
cat >afile1.conf <<EOF
[general]
max = $MAX
min = 100
EOF
cat >anotherFile.conf <<EOF
This is the second file.
Second line of second file.
I need some sleep because this is the extent of
my imagination.
EOF
You need an extra 'cat' in the first line, e.g. if your input list of files is in_list.txt, then
Code:
# Loop over the names stored in in_list.txt, not over the literal word
for file in $(cat in_list.txt)
do
    contents=$(cat "$file")
    echo "$file,$contents" >> csvfile
done
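One caveat with looping over the output of cat: the shell splits it on any whitespace, so a name like "File 3" would be treated as two files. A possible variant, assuming the list holds one file name per line, reads it with while read instead. The function name list2csv is only for illustration.

```shell
# list2csv is a hypothetical helper: $1 is the list of file names
# (one per line), $2 is the CSV to write.
list2csv() (
    : > "$2"                                # start from an empty CSV
    while IFS= read -r file; do
        [ -f "$file" ] || continue          # skip names that are not regular files
        contents=$(cat "$file")
        printf '%s,%s\n' "$file" "$contents" >> "$2"
    done < "$1"
)
```

If the list was made with ls > list in the same directory, it will also contain the list file itself, so that entry may need removing first.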
Thanks Chrism01, that works fine if there is just a single line of text in the txt file. However, I think the blank lines in the text files are causing me some problems: if a file has more than one line, the extra lines spill over into the first column of the output csv file. My problem is that there are hundreds of text files, and I will have to recreate them again from the csv file (using awk -F, '{ print $2 > $1 }' myfile.csv), so I need the first column to contain only the names of the txt files.
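For the reverse step, here is a sketch of one way to split the csv back into files. It assumes unquoted rows of the form name,contents on a single line, and the wrapper name csv2txt is made up. Unlike -F, it splits only on the first comma, so commas inside the contents are kept, and it closes each file after writing, which may matter with hundreds of files.

```shell
# csv2txt is a hypothetical helper: $1 is the CSV to split.
csv2txt() {
    awk '{
        i = index($0, ",")                  # position of the first comma
        if (i == 0) next                    # skip rows with no separator
        name = substr($0, 1, i - 1)         # column 1: file name
        print substr($0, i + 1) > (name)    # column 2: everything after it
        close(name)                         # do not keep hundreds of files open
    }' "$1"
}
```

Newlines that were flattened when the csv was built are not restored by this.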
If you are going for a generalised solution (as alluded to by jschiwal and yourself here), use a more powerful language, e.g. Perl.
If the input files can have newlines, blank lines, extra quotes (or not) etc., that's my recommendation.
Bash would get very messy.
I'm assuming you'd want to convert newlines inside the (input) files to spaces or something in the output file?
Hmm.... I'm thinking my best bet would be to just remove newlines from the input files (using perl -pi -e 's/\n//g' *.txt or something like that), then run your
Code:
for file in `cat in_list.txt`
do
contents=`cat $file`
echo "$file,$contents" >> csvfile
done
Cheers Tink, it might not be the prettiest looking line of awk written ;-) but it will do the trick too - once I removed the newlines which were causing the text to spread across different cells.
Thanks a lot guys for the suggestions
-richmur!
My bad ... I forgot to put something in the BEGIN section that I was actually using in my test runs ...
In fact, the whole NF loop with the printf's wouldn't make any sense without the missing FS="\n". And you could of course go and choose a less ugly RS; for me a "\x04" worked just fine, but I don't know the data you're dealing with .... ;}
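The awk line being discussed is not quoted in this thread, so what follows is only a guess at its general shape, not the original: FS set to a newline so that every line of a file becomes a field, RS set to a character assumed never to occur in the data so that the whole file is one record, and a loop over NF to join the fields. The octal form "\004" is used for the same character as "\x04" because octal escapes are the more portable spelling in awk; the wrapper name txt2csv_awk is made up.

```shell
# txt2csv_awk is a hypothetical wrapper: it prints one CSV row per file
# given on the command line.
txt2csv_awk() {
    for f in "$@"; do
        awk 'BEGIN { FS = "\n"; RS = "\004" }   # whole file = one record, each line = one field
        {
            line = ""
            for (i = 1; i <= NF; i++)
                if ($i != "")                   # skip blank lines and the empty last field
                    line = line (line == "" ? "" : " ") $i
            printf "%s,%s\n", FILENAME, line
        }' "$f"
    done
}
```

It would be run as txt2csv_awk *.txt > csvfile. An empty input file produces no row at all with this approach.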