How to filter multiple CSV files down to the rows common to all of them
Hi Everyone, I have multiple CSV files (>100). They are rain-gauge station files for precipitation measurements. The number of stations is not the same in every file (i.e. some stations are missing from some files). I want to keep only the stations that are present in all of the files. Each file has a unique station ID in column 3. I want to ask if this is possible in Linux?
It may be something along the lines of: for h in *.csv; do sed '?????' $h > rippe_$h && mv rippe_$h $h.xls ; done
You've got the right idea...
for i in *.csv; do cut -f3 -d"," "$i"; done | sort | uniq

This gives you a unique list of all the items in column 3 across all of the .csv files in the current directory, assuming that your field delimiter is a comma (",").

I just realized - this should also work:

cat *.csv | cut -f3 -d"," | sort | uniq

As should this:

cut -f3 -d"," *.csv | sort | uniq

Note: the uniq command (which filters for uniqueness) doesn't work on unsorted data, hence the sort in front of it.

Hope these options help get you along in your task; this should give you the list of all stations, across all of the files, that have reported/recorded data in this set of .csv files.
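One thing to note: the commands above give the union of station IDs (every station that appears in at least one file), while the original question asks for the intersection (stations present in every file). Here is a minimal sketch of one way to get that, building on the same cut/sort/uniq idea. The file name common_ids.txt and the filtered_ output prefix are just placeholders I made up; it also assumes comma-delimited files with no header row and station IDs that contain no embedded commas.

n=$(ls *.csv | wc -l)    # number of files a station must appear in

# Count how many files each station id appears in. The "sort -u" inside
# the loop makes each file contribute a given id at most once.
for f in *.csv; do cut -f3 -d"," "$f" | sort -u; done |
    sort | uniq -c | awk -v n="$n" '$1 == n {print $2}' > common_ids.txt

# Keep only the rows whose column 3 is one of the common ids;
# writes a filtered copy of each file.
for f in *.csv; do
    awk -F"," 'NR==FNR {keep[$1]; next} $3 in keep' common_ids.txt "$f" > "filtered_$f"
done

The trick is that uniq -c prefixes each id with the number of files it appeared in, so any id whose count equals the number of files must be present in all of them. Run it in a clean directory, since re-running it would also count the filtered_ copies.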