Long time lurker, first time I haven't been able to easily search for my answer!
I have a text file in this format:
Each character can only be A, B, C, D or -.
For each column (not row), I would like to calculate the highest number of times any single character occurs (A, B, C, D only; - is not counted).
An output for the above example would be:
I have written this very clunky script, but am unhappy with the speed.
Could anyone suggest a faster way of doing this?
# Loop over the columns, from 1 to RowLength
for (( n=1; n<=RowLength; n++ ))
do
    INPUT=$(cut -c "$n" "$TargetFile") # Extract column n from every row
    A=$(echo "$INPUT" | tr -dc 'A' | wc -c) # Count the number of A, B, C, D in this column
    B=$(echo "$INPUT" | tr -dc 'B' | wc -c)
    C=$(echo "$INPUT" | tr -dc 'C' | wc -c)
    D=$(echo "$INPUT" | tr -dc 'D' | wc -c)
    ABCD=$(printf '%s\n' "$A" "$B" "$C" "$D" | sort -n | tail -1) # Highest of the four counts
done
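For comparison, here is a minimal sketch of the same per-column count done in a single pass with awk, which might avoid re-reading the file once per column. The function name max_per_column is hypothetical, and printing one number per line (one line per column) is an assumption, since the output format may need to be different.

```shell
# Hypothetical helper: for each column of the file given as $1, print the
# highest count of any single letter among A, B, C, D.
# '-' (and any other character) is ignored.
# Assumption: output is one number per line, in column order.
max_per_column() {
    awk '
    {
        if (length($0) > width) width = length($0)   # remember the widest row
        for (i = 1; i <= length($0); i++) {
            ch = substr($0, i, 1)
            if (ch ~ /^[ABCD]$/) count[i, ch]++      # tally per column and letter
        }
    }
    END {
        n = split("A B C D", letters, " ")
        for (i = 1; i <= width; i++) {
            max = 0
            for (k = 1; k <= n; k++) {
                c = count[i, letters[k]] + 0         # missing entries count as 0
                if (c > max) max = c
            }
            print max
        }
    }' "$1"
}
```

It would be called as max_per_column "$TargetFile". The file is read once and all tallies are kept in one awk array, rather than running cut, tr and wc for every column.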
Many thanks for any help!