Counting characters in a string

powah · 07-19-2007, 09:29 AM

How do I count how many times each character occurs in a string?
For example, if the input string is "test", the output should look like this:
e : 1
s : 1
t : 2

P.S.
I know how to count a particular character (X) in a string in Perl with the tr/// operator, like so:
$string="ThisXlineXhasXsomeXx'sXinXit";
$count = ($string =~ tr/X//);
print "There are $count Xs in the string";

However, I am trying to find out what to do next.

ghostdog74 · 07-19-2007, 10:07 AM

Code:

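echo "test" | awk 'BEGIN { FS = "" }   # empty FS: one field per character in gawk (not guaranteed by POSIX)
{
  # tally every character on the line
  for (i = 1; i <= NF; i++) {
    array[$i]++
  }
}
# for-in visits the keys in no particular order
END { for (i in array) print i ": " array[i] }'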

output:

Code:

 # ./test.sh
e: 1
s: 1
t: 2

makyo · 07-19-2007, 12:42 PM

Hi.

Here is a version using shell commands. It handles multiple arguments, which can be files as well as strings:

Code:

#!/bin/bash

# @(#) s1       Demonstrate splitting strings, counting characters.

set -o nounset
echo
echo "GNU bash $BASH_VERSION" >&2
sed --version | head -1 >&2
sort --version | head -1 >&2
echo

F=${1?" $0: must supply an argument"}

j=0
for F in "$@"
do
  (( j++ ))
  if [ ! -f "$F" ]
  then
    echo " note - :$F: is not a file, treated as string."
    command="echo"
  else
    command="cat"
  fi

  echo
  echo " Contents of item $j is:"
  $command "$F" |
  nl

  echo
  echo " Character counts are:"
  $command "$F" |
  sed -e 's/\(.\)/\1\n/g' |
  sort |
  uniq -c
done

exit 0

producing:

Code:

% ./s1 test data1

GNU bash 2.05b.0(1)-release
GNU sed version 4.1.2
sort (coreutils) 5.2.1

 note - :test: is not a file, treated as string.

 Contents of item 1 is:
     1  test

 Character counts are:
      1
      1 e
      1 s
      2 t

 Contents of item 2 is:
     1  Now is the time
     2  for all good men
     3  to come to the aid
     4  of their country.

 Character counts are:
      4
     12
      1 .
      1 N
      2 a
      2 c
      2 d
      6 e
      2 f
      1 g
      3 h
      4 i
      2 l
      3 m
      2 n
      9 o
      3 r
      1 s
      7 t
      1 u
      1 w
      1 y

If you use this on longer files, you could replace the sed | sort | uniq pipeline with an adaptation of ghostdog74's awk script. See the man pages for details ... cheers, makyo
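
For anyone who wants to try that swap, here is a rough sketch (not makyo's actual script). The function name count_chars is made up for illustration, and it takes a single item rather than looping over all arguments. It walks each line with substr() instead of relying on an empty FS, so it should work with any POSIX awk, and the final sort is only there to give a stable output order.

```shell
#!/bin/sh
# Sketch only: count_chars is a hypothetical helper, not part of the thread.
# It treats its argument as a file if one exists, otherwise as a string,
# and tallies characters in one awk pass (no sed | sort | uniq -c).

count_chars() {
  if [ -f "$1" ]
  then
    cat "$1"
  else
    printf '%s\n' "$1"
  fi |
  awk '
    {
      # substr() picks out one character at a time
      n = length($0)
      for (i = 1; i <= n; i++)
        count[substr($0, i, 1)]++
    }
    END {
      for (c in count)
        print c ": " count[c]
    }' |
  LC_ALL=C sort     # awk gives no ordering guarantee, so sort the result
}

count_chars "test"
```

Run as shown, this should print e: 1, s: 1 and t: 2 on separate lines. Unlike the sed version, newlines are not counted, because awk strips the line terminator before the loop sees the line.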

bulliver · 07-19-2007, 02:03 PM

Quote:

$string="ThisXlineXhasXsomeXx'sXinXit";
$count = ($string =~ tr/X//);
print "There are $count Xs in the string";

However, I am trying to find out what to do next.

Code:

# pseudocode

lettercount = Hash
for char in string
    if char in lettercount
        lettercount[char]++
    else
        lettercount[char] = 1

Basically, create a hash to hold the letter counts (the key is the character, the value is its count). Loop over the string. If the character is already in the hash, add 1 to its value. If it is not in the hash, add it with a value of 1...
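
To make the pseudocode above concrete, here is one possible translation. It is written as a shell function around awk rather than in Perl, with an awk associative array standing in for the hash. The function name letter_count is invented for this example. The variable names follow the pseudocode.

```shell
#!/bin/sh
# Sketch only: letter_count is a hypothetical name, and awk is used
# here in place of Perl. The awk array "lettercount" plays the hash.

letter_count() {
  printf '%s\n' "$1" |
  awk '
    {
      n = length($0)
      for (i = 1; i <= n; i++) {
        char = substr($0, i, 1)
        if (char in lettercount)
          lettercount[char]++       # already in the hash: add 1
        else
          lettercount[char] = 1     # first occurrence: start at 1
      }
    }
    END {
      for (char in lettercount)
        print char " : " lettercount[char]
    }' |
  LC_ALL=C sort     # hash order is arbitrary, so sort for readability
}

letter_count "test"
```

For the string "test" this should print e : 1, s : 1 and t : 2, one per line. In awk the explicit if/else is optional, since incrementing a missing key starts it from zero, but it is kept here to mirror the pseudocode step by step.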