LinuxQuestions.org


Haba2015 04-14-2015 03:45 PM

How to filter multiple csv files according to a complete rows
 
Hi everyone, I have multiple csv files (more than 100). They are rain-gauge station files for precipitation measurement. The number of stations is not the same in every file (i.e. some stations are missing from some files). I want to keep only the stations that are present in all of the files. Each station has a unique station id in column 3. Is this possible in Linux?

It may be something along the lines of: for h in *.csv; do sed '?????' "$h" > "rippe_$h" && mv "rippe_$h" "$h.xls"; done

Lnthink 04-14-2015 04:27 PM

You've got the right idea...

for i in *.csv
do
cut -f3 -d"," "$i"
done | sort | uniq

This gives you a unique list of all the values in column 3 across all of the .csv files in the current directory, assuming that your field delimiter is a comma ",".

I just realized that this should also work:
cat *.csv | cut -f3 -d"," | sort | uniq

This should work too:
cut -f3 -d"," *.csv | sort | uniq

-- note: the uniq command only removes duplicate lines that are adjacent to each other, so it does not filter properly on unsorted data; hence the sort command in front of it.
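
A quick way to see the difference, using three made-up lines instead of real station data:

```shell
printf 'B\nA\nB\n' | uniq          # prints B, A, B: the two B lines are not adjacent
printf 'B\nA\nB\n' | sort | uniq   # prints A, B
printf 'B\nA\nB\n' | sort -u       # same result; sort -u sorts and de-duplicates in one step
```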

Hope these options help you along in your task.
They should give you the list of every station that has reported/recorded data in at least one file of this set of .csv files.
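
Note that the commands above list every station id that appears anywhere, while the question asks for the stations present in all of the files. One possible way to get there is to count, for each id, how many files contain it and keep only the ids whose count equals the number of files. The following is only a rough sketch under some assumptions: the files are plain comma-delimited with no quoted commas, the station ids contain no spaces, and the files sit in the current directory. The names common_ids, filter_common, common_ids.txt and filtered/ are made up for this example.

```shell
# Print the station ids (column 3) that occur in every file given as an argument.
# Assumes comma-delimited input and ids without whitespace.
common_ids() {
    n=$#                                  # number of files being compared
    for f in "$@"; do
        cut -d, -f3 "$f" | sort -u        # each id at most once per file
    done | sort | uniq -c |               # count how many files contain each id
        awk -v n="$n" '$1 == n { print $2 }'   # keep ids found in all n files
}

# Write a copy of each file into filtered/ that keeps only the rows
# whose column 3 is one of the common ids. The originals are left untouched.
filter_common() {
    common_ids "$@" > common_ids.txt
    mkdir -p filtered
    for f in "$@"; do
        # First input (common_ids.txt) fills the lookup table, second input is filtered.
        awk -F, 'NR == FNR { keep[$1]; next } $3 in keep' common_ids.txt "$f" > "filtered/$f"
    done
}
```

It could then be run as: filter_common *.csv
If every file starts with the same header row, that header should survive the filter too, since its column 3 value appears in all files. It is worth checking common_ids.txt by hand before trusting the filtered copies.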

