Create 1 csv file from multiple txt files
Hi,
I am trying to create a single csv file, using multiple .txt files as input. The output format should be: Column1 - name of .txt file, Column2 - contents of txt file. Thanks a lot! |
I'm assuming the contents of the text files are rather small.
If that is the case, then a simple approach would be: Code:
for file in <list>
do
  contents=`cat $file`
  echo "$file,$contents" >> csvfile
done
That list could be the result of an 'ls' command, the contents of yet another file containing a list of file names, or an explicitly entered list or pattern. If your text files contain multiple lines, then you may need to translate them into single lines if you want only two columns in your csv file. You can do that on the fly in the above loop if desired. |
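As a concrete sketch of raconteur's on-the-fly translation idea: assuming the list comes from a glob like *.txt, each file's newlines can be flattened to spaces with tr before the record is written. The sample file names and the output name csvfile below are invented for the demo.

```shell
#!/usr/bin/env bash
# Sketch: build csvfile from every .txt file, collapsing each file's
# newlines to spaces so each record stays on one line.
# Runs in a throwaway directory with made-up sample files.
tmp=$(mktemp -d) && cd "$tmp"
printf 'abc\ndef\n' > file1.txt
printf 'xyz\n'      > file2.txt

for file in *.txt; do
    contents=$(tr '\n' ' ' < "$file")   # newlines -> spaces
    echo "$file,$contents" >> csvfile
done
cat csvfile
```

Note that tr leaves a trailing space where each file's final newline was; it could be trimmed with sed if that matters.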
Thanks raconteur, yes you are correct, the contents of the text files are small, 4 or 5 lines max, but they may contain leading spaces and content separated by blank lines. However, when I tried to run your code (after first creating a file called list with ls > list), the output I got seemed to be just the name and contents of the file called list:
list,xyz abc def ghi
where abc, xyz etc. are the names of the files, i.e. the contents of list |
You need an extra 'cat' in the first line, e.g. if your input list of files is in_list.txt, then
Code:
for file in `cat in_list.txt` |
You may want to add quotes around the second field; this is usually done for text fields in csv files. CSV files don't normally have fields with newlines in them. It also raises the question of how embedded quotes are handled. I tried an experiment, exporting a two-record csv file from oocalc that contained quotes in the second column.
Code:
"file1","This is a test. How will it export a csv file that contains “embedded quotes?” I will also open up the file and insert newline characters if need be."
A csv file with a multiline field would probably not be very portable. You might consider using an xml format instead. If the purpose is to create these files from a script, this is often done in bash using HERE documents. A HERE document can even contain bash variables that are expanded before the file is written. Code:
#!/usr/bin/env bash |
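jschiwal's HERE-document suggestion might look like this minimal sketch; the variable names and the file name sample.csv are invented, and the point is only that the variables are expanded before the file is written.

```shell
#!/usr/bin/env bash
# Minimal HERE-document sketch (names are made-up examples):
# $name and $body are expanded when the document is written out.
name="file1"
body="This is a test."
cat > sample.csv <<EOF
"$name","$body"
EOF
cat sample.csv
```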
If you are going for a generalised solution (as alluded to by jschiwal and by yourself here), use a more powerful language, e.g. Perl.
If the input files can have newlines, blank lines, extra quotes (or not) etc., that's my recommendation. Bash would get very messy. I'm assuming you'd want to convert newlines inside the (input) files to spaces or something in the output file? |
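If you do stay in the shell, a slightly more defensive sketch could quote the second field and double any embedded quotes (the usual CSV escape) while converting newlines to spaces. The sample file and output names below are invented for the demo.

```shell
#!/usr/bin/env bash
# Defensive sketch: quote field 2 and double embedded quotes ("" is the
# usual CSV escape); newlines inside files become spaces.
tmp=$(mktemp -d) && cd "$tmp"
printf 'say "hi"\nbye\n' > a.txt

for file in *.txt; do
    contents=$(tr '\n' ' ' < "$file" | sed 's/"/""/g')
    printf '%s,"%s"\n' "$file" "$contents" >> out.csv
done
cat out.csv
```

This still won't cover every CSV corner case (commas in file names, for one), which is why Perl or a real CSV library remains the safer bet.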
Or you could find a very unlikely record separator and (ab-)use awk ...
Code:
awk 'BEGIN{RS="123@^~456"}{ printf "%s,", FILENAME; for (i=1; i<=NF; i++){printf "%s ", $i}; printf "\n" }' *txt > list.csv
Cheers, Tink |
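To see the one-liner in action on throwaway sample files (names invented; the demo uses *.txt rather than *txt): because the record separator never occurs in the data, each whole file becomes one record, and awk's default field splitting then treats every whitespace-separated word as a field.

```shell
#!/usr/bin/env bash
# Demo of the awk approach: an unlikely RS means each file is one record,
# so the fields are the file's whitespace-separated words. Sample data is
# made up.
tmp=$(mktemp -d) && cd "$tmp"
printf 'abc\ndef\n' > one.txt
printf 'xyz\n'      > two.txt

awk 'BEGIN{RS="123@^~456"}{ printf "%s,", FILENAME
     for (i=1; i<=NF; i++){printf "%s ", $i}; printf "\n" }' *.txt > list.csv
cat list.csv
```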
Quote:
for file in `cat in_list.txt`
do
  contents=`cat $file`
  echo "$file,$contents" >> csvfile
done
will hopefully do the trick - cheers |
Quote:
Thanks a lot guys for the suggestions -richmur! |
Quote:
Here is the code that I was actually using in my test-runs ... Code:
awk 'BEGIN{RS="\x04";FS="\n"}{ printf "%s,", FILENAME; for (i=1; i<=NF; i++){printf "%s ", $i}; printf "\n" }' *txt > list.csv
The version I posted earlier doesn't make any sense w/o the missing FS="\n". And you could of course go and choose a less ugly RS; for me a "\x04" worked just fine, but I don't know the data you're dealing with .... ;} Cheers, Tink |