LinuxQuestions.org


craigjward 12-09-2007 12:35 AM

Detect length of line in file
 
Hello,
I am writing a script and want to detect the number of characters in each line of a file. Is there a command I can use to do this?

ghostdog74 12-09-2007 01:01 AM

Hint: man wc. There's an option to count characters.
To count for each line in the file, use a loop (while or for).
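
A minimal sketch of that hint, assuming the file is plain text: read the file line by line and hand each line to wc -m. The function name line_lengths_wc is made up for illustration.

```shell
#!/bin/bash
# Print the character count of every line in a file, one number per line.
# line_lengths_wc is a hypothetical helper name, not part of any tool.
line_lengths_wc() {
    local line
    # IFS= keeps leading/trailing blanks; -r keeps backslashes literal.
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # printf '%s' omits the newline, so wc -m counts only the text.
        # The arithmetic expansion strips the padding some wc versions add.
        echo $(( $(printf '%s' "$line" | wc -m) ))
    done < "$1"
}
```

Called as line_lengths_wc somefile, it prints one count per line. Starting wc once per line is slow on big files, so this is only meant to show the idea.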

gnashley 12-09-2007 12:46 PM

If you are using 'read' to get each line, then you can get the line length with shell parameter expansion, without calling any external program:
Code:

# IFS= and -r keep leading blanks and backslashes intact
while IFS= read -r LINE ; do
THISLINE=$LINE
no_of_chars=${#THISLINE}
echo "This line has $no_of_chars characters"
done < file

Actually, you don't have to use read, but it is an easy way to get each line into a variable.
In fact, you can do the whole thing in BASH without using any external programs at all. Here's the full code for a mostly complete implementation of wc in pure shell. The part that you really need is fairly short, but this code includes line, word and character counting:
Code:

#!/bin/bash
# copyright 2007 Gilbert Ashley <amigo@ibiblio.org>
# BashTrix wc is an implementation of the 'wc' command
# written in pure shell. Most wc options are supported.
VERSION=0.2

# show program usage
show_usage() {
echo
echo "Usage: ${0##*/} [OPTION]... [FILE]..."
echo "  or: ${0##*/} -[m|w|l|L] FILE"
echo "  or: (cat|echo) | ${0##*/} [OPTION]... "
echo "Print newline, word, and byte counts for each FILE"
echo "and a total line if more than one FILE is specified."
# echo "With no FILE, or when FILE is - read standard input." #conflicts
echo "  -m, --chars            print the character counts"
echo "  -l, --lines            print the newline counts"
echo "  -L, --max-line-length  print the length of the longest line"
echo "  -w, --words            print the word counts"
echo "      --help    display this help and exit"
echo "      --version  output version information and exit"
exit
}
show_version() {
echo "${0##*/} (BashTrix) $VERSION"
echo "Copyright 2007 Gilbert Ashley <amigo@ibiblio.org>"
echo "This is free software written in pure POSIX shell."
exit
}

# Minimum number of arguments needed by this program
MINARGS=1
# show usage if '-h' or  '--help' is the first argument or no argument is given
case $1 in
        ""|"-h"|"--help") show_usage ;;
        "--version") show_version ;;
esac
# get the number of command-line arguments given
ARGC=$#
# check to make sure enough arguments were given or exit
if [[ $ARGC -lt $MINARGS ]] ; then
 echo "Too few arguments given (Minimum:$MINARGS)"
 echo
 show_usage
fi

for WORD in "$@" ; do
        case $WORD in
                -*)  true ;
                        case $WORD in
                                -m) COUNT_CHARS=1 ; shift ;;
                                -w) COUNT_WORDS=1 ; shift ;;
                                -l) COUNT_LINES=1 ; shift ;;
                                -L) MAX_LINE_LENGTH=1 ; shift ;;
                                --help) show_usage ;;
                                --version) show_version ;;
                                # -) READ_STDIN=1 ; shift ;;
                                -*) echo "Unrecognized argument" ; show_usage ;;
                        esac
                ;;
        esac
done

# function _freq counts the number of matches
# of PATTERN in PARSESTRING and returns FREQ
# example usage: _freq $PATTERN $PARSESTRING
function _freq() { FREQ=0
! [[ $PATTERN ]] && PATTERN=$1
! [[ $PARSESTRING ]] && PARSESTRING=$2
while [[ $PARSESTRING != "" ]] ; do
        case $PARSESTRING in
                *$PATTERN*) (( FREQ++ )) ;
                        PARSESTRING=${PARSESTRING#*${PATTERN}} ;;
                *) PARSESTRING="" ;;
        esac
done
echo $FREQ
}

# convert tabs to spaces
function uniform_white() {
while read GAGA ; do
 echo $GAGA
done
}

# function _line_word_count counts the words in a line
function _line_word_count() { LINE_WORD_COUNT=0
# turn all TABS into single spaces
STRING=$(echo $1 | uniform_white)
# strip off leading spaces
SEP=" "
while [[ ${STRING:0:1} = $SEP ]] ; do
 STRING=${STRING:1}
done
# strip trailing spaces
OFFSET=$(( ${#STRING} - 1 ))
while [[ ${STRING:$OFFSET:1} = $SEP ]] ; do
 # remove one CHAR from the STRING
 STRING=${STRING:0:$OFFSET}
 # decrement the OFFSET by one for the removed character
 (( OFFSET-- ))
done
PARSESTRING=$STRING
# count the number of spaces
_freq " " $PARSESTRING 1> /dev/null
# the number of words is spaces +1 except for blank lines
if [[ "$STRING" != "" ]] ; then
 LINE_WORD_COUNT=$(( $FREQ + 1 ))
fi
}

# function _line_char_count counts the characters in a line UNUSED
function _line_char_count() { LINE_CHAR_COUNT=0
#PARSESTRING=$(echo $1 | uniform_white)
PARSESTRING="$1"
while [[ $PARSESTRING != "" ]] ; do
        # read one character
        FC=${PARSESTRING:0:1}
        # advance the pointer one character
        PARSESTRING=${PARSESTRING:1}
        (( LINE_CHAR_COUNT++ ))
done
# add an extra character for the end of line CHAR
(( LINE_CHAR_COUNT++ ))
}

if [[ $# -gt 0 ]] ; then
        FILE_COUNT=0
        TOTAL_LINE_COUNT=0
        TOTAL_WORD_COUNT=0
        TOTAL_CHAR_COUNT=0
        TOTAL_LONGEST_LINE=0
        while [[ $# -gt 0 ]] ; do
                # count number of input files
                (( FILE_COUNT++ ))
                FILE_NAME="$1"
                if [ ! -r "$1" ] ; then
                        echo "Cannot find file $1" 1>&2
                        exit 1
                else
                        FILE_LINE_COUNT=0
                        FILE_WORD_COUNT=0
                        FILE_CHAR_COUNT=0
                        FILE_LONGEST_LINE=0
                        LINE=
                        IFS=
                        while read LINE ; do
                          # add the current line to the line counter
                          (( FILE_LINE_COUNT++ ))
                          # capture the text of the line
                          STRING="$LINE"
                          # count the words in this line
                          if [[ $COUNT_WORDS ]] ; then
                            _line_word_count $STRING
                            FILE_WORD_COUNT=$(( $FILE_WORD_COUNT + $LINE_WORD_COUNT ))
                          fi
                          # count the characters in this line
                          if [[ $COUNT_CHARS ]] ; then
                            LINE_CHAR_COUNT=$(( ${#LINE} + 1 ))
                            FILE_CHAR_COUNT=$(( $FILE_CHAR_COUNT + $LINE_CHAR_COUNT ))
                          fi
                          if [[ $MAX_LINE_LENGTH ]] ; then
                                LINE_CHAR_COUNT=$(( ${#LINE} ))
                                if [[ $LINE_CHAR_COUNT -gt $FILE_LONGEST_LINE ]] ; then
                                        FILE_LONGEST_LINE=$LINE_CHAR_COUNT
                                fi
                          fi
                        # go to next LINE
                        done <"$1"
                        TOTAL_LINE_COUNT=$((TOTAL_LINE_COUNT + $FILE_LINE_COUNT))
                        TOTAL_WORD_COUNT=$(( $TOTAL_WORD_COUNT + $FILE_WORD_COUNT ))
                        TOTAL_CHAR_COUNT=$(( $TOTAL_CHAR_COUNT + $FILE_CHAR_COUNT ))
                fi
               
                if [[ $FILE_LONGEST_LINE -gt  $TOTAL_LONGEST_LINE ]] ; then
                                TOTAL_LONGEST_LINE=$FILE_LONGEST_LINE
                fi
               
                FILE_OUTPUT="$(echo $FILE_LINE_COUNT $FILE_WORD_COUNT $FILE_CHAR_COUNT \
                                $FILE_LONGEST_LINE $FILE_NAME | uniform_white)"
                echo $FILE_OUTPUT
        # go to next FILE in $@
        shift
        done
       
        # detect multiple input files so the totals can be shown
        if [[ $FILE_COUNT -gt 1 ]] ; then
        TOTAL_OUTPUT="$(echo $TOTAL_LINE_COUNT $TOTAL_WORD_COUNT $TOTAL_CHAR_COUNT \
                        $TOTAL_LONGEST_LINE total | uniform_white)"
        echo $TOTAL_OUTPUT
        fi
        exit
else
        # accept piped-in input only if "$@" is empty  ( $#=0 )
        # elif [[ $READ_STDIN ]] would enforce the POSIX
        # piped input is presumed to be separated into lines already
        FILE_LINE_COUNT=0
        FILE_WORD_COUNT=0
        FILE_CHAR_COUNT=0
        FILE_LONGEST_LINE=0
        IFS=
        while read LINE ; do
                # capture the text of the line
                STRING="$LINE"
                #increment the line counter
                (( FILE_LINE_COUNT++ ))
                # count the words in a line
                if [[ $COUNT_WORDS ]] ; then
                        _line_word_count $STRING
                        TOTAL_WORD_COUNT=$(( $TOTAL_WORD_COUNT + $LINE_WORD_COUNT ))
                fi
                # count the characters in a line
                if [[ $COUNT_CHARS ]] ; then
                        LINE_CHAR_COUNT=$(( ${#LINE} +1 ))
                        TOTAL_CHAR_COUNT=$(( $TOTAL_CHAR_COUNT + $LINE_CHAR_COUNT ))
                fi
                # find the length of the longest line
                if [[ $MAX_LINE_LENGTH ]] ; then
                        LINE_CHAR_COUNT=$(( ${#LINE} ))
                        if [[ $LINE_CHAR_COUNT -gt $FILE_LONGEST_LINE ]] ; then
                                FILE_LONGEST_LINE=$LINE_CHAR_COUNT
                        fi
                fi
                # go to the next LINE
                shift
        done
        TOTAL_LINE_COUNT=$((TOTAL_LINE_COUNT + $FILE_LINE_COUNT))
        TOTAL_WORD_COUNT=$(( $TOTAL_WORD_COUNT + $FILE_WORD_COUNT ))
        TOTAL_CHAR_COUNT=$(( $TOTAL_CHAR_COUNT + $FILE_CHAR_COUNT ))
        TOTAL_LONGEST_LINE=$FILE_LONGEST_LINE
        SHOW_TOTALS=1
        shift
        # printf "%-5s %-20s %s\n" "hello" "you" there
        TOTAL_OUTPUT="$(echo $TOTAL_LINE_COUNT $TOTAL_WORD_COUNT $TOTAL_CHAR_COUNT \
                        $TOTAL_LONGEST_LINE total | uniform_white)"
        TOTAL_OUTPUT="$(printf "%s %-2s %-2s %-2s %s\n" $TOTAL_LINE_COUNT $TOTAL_WORD_COUNT \
                $TOTAL_CHAR_COUNT $TOTAL_LONGEST_LINE total)"
       
        echo $TOTAL_OUTPUT
        exit
fi

exit $ERROR


craigjward 12-09-2007 02:28 PM

Thanks,

Essentially, I am making a sliding-puzzle game in the terminal using an ASCII art file. I am trying to check the length of each line in the file to make sure that the file has the right dimensions, and this is as far as I have gotten. Later I will need to divide the length of each line by 3, and the number of lines by 3, so that I have 9 panels to work with. In my code so far, it seems as if I am getting stuck in my loop. Can anyone tell me where my problem is?

Here is my code so far:

Code:

#!/bin/bash

CURRENTLINE=
CURRENTLINELENGTH=0
MAXLINELENGTH=0
PUZZLEFILE=


echo "Craig Ward's Puzzle Game"
echo "------------------------"

if [ $# -gt 1 ]; then
        echo "Usage: puzzle [filepath]"
        exit 1
elif [ $# -eq 1 ]; then
        echo "Checking Puzzle File $1"
        MAXLINELENGTH= wc -L $1

        for i in wc -l $1; do
                CURRENTLINE= sed -n '$ip' $1
                CURRENTLINELENGHTH= wc -m $CURRENTLINE
                if [ $CURRENTLINELENGTH -lt $MAXLINELENGTH ]; then
                        echo "Puzzle File Corrupt. Ending Script."
                        exit 1
                fi
        done
        PUZZLEFILE= $1

elif [ $# -lt 1 ]; then
        echo "No puzzle file specified. Using default puzzle"
fi

Thanks again

chrism01 12-09-2007 07:32 PM

Use either of these forms to assign a command's output to a variable:
var=$(wc -l "$1")
or
var=`wc -l "$1"`
Also, you never seem to assign anything to $ip.

ntubski 12-10-2007 02:29 AM

You have a few problems in your code. I suggest you test out code fragments at the command line to see how it works:
Code:

$ MAXLINELENGTH= wc -L sample-file
22 sample-file
$ echo $MAXLINELENGTH

$ MAXLINELENGTH=wc -L sample-file
bash: -L: command not found
$ MAXLINELENGTH=$(wc -L sample-file)
$ echo $MAXLINELENGTH
22 sample-file
$ MAXLINELENGTH=$(wc -L <sample-file)
$ echo $MAXLINELENGTH
22

I think you wanted i to go from 1 to the number of lines in the file. However, the form of for loop you used iterates over the list of words after the in keyword:
Code:

$ for i in wc -l sample-file; do echo $i; done
wc
-l
sample-file
$ for ((i=1; "$i" <= 5; i++)); do echo $i; done
1
2
3
4
5

Lastly, instead of
Code:

CURRENTLINE= sed -n '$ip' $1
you want
Code:

CURRENTLINE=$(sed -n "${i}p" "$1")
Note the double quotes: single quotes prevent variables from being substituted. The ${} syntax is needed to stop the shell from looking for a variable called ip.

After you've figured that out, see if you can rewrite the loop with the read builtin. It will be easier to understand (and probably more efficient). Useful resource: http://tldp.org/LDP/abs/html/
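
One possible shape for such a read-based loop, as a rough sketch rather than the definitive answer: take the length of the first line as the expected width and stop at the first line that differs. The function name check_rectangular and the message wording are made up for illustration.

```shell
#!/bin/bash
# Succeed (status 0) only if every line of the file has the same length.
# check_rectangular is a hypothetical name used for illustration.
check_rectangular() {
    local width="" lineno=0 line
    while IFS= read -r line || [ -n "$line" ]; do
        lineno=$((lineno + 1))
        # The first line sets the expected width.
        if [ -z "$width" ]; then
            width=${#line}
        fi
        if [ "${#line}" -ne "$width" ]; then
            echo "line $lineno: expected $width characters, got ${#line}" >&2
            return 1
        fi
    done < "$1"
    echo "$lineno lines of $width characters"
}
```

In the puzzle script this could be used as: check_rectangular "$1" || exit 1. The file is read only once, instead of running sed and wc once per line.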

bigearsbilly 12-10-2007 04:00 AM

Quote:

can anyone tell me where my problem is?
Yes: trying to do something this complex in a shell script ;)


Maybe you should make your functions a bit smaller?

A function should do one thing only and be around 30 lines of code.

craigjward 12-10-2007 09:37 AM

Thanks a bunch!
I'm well on my way now!

craigjward 12-10-2007 10:07 AM

I'm sorry for being so frustrating, but how would I break a line into three equal parts and cat them into three separate files?

Also, how would I go about randomizing the order of a sequence of numbers from 1-9?

osor 12-11-2007 11:13 PM

Quote:

Originally Posted by craigjward (Post 2986039)
how would I break a line into three equal parts and cat them into three separate files?

The following will result in the three files foo0, foo1, and foo2.
Code:

#!/bin/bash

PREFIX=foo # The prefix of the resulting files

read LINE

no_chars=$((${#LINE}+1)) # The initial number of characters, including newline
total=$((no_chars + 2 * no_chars % 3)) # Up to the next multiple of 3
bytes=$((total/3)) # Bytes per file

echo "$LINE" | split "-b$bytes" -da1 - $PREFIX
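
An alternative sketch that stays inside bash, assuming the line length is already a multiple of 3 (as the puzzle file requires): cut the line with substring expansion and append each third to its own file. The name split_thirds and the panel prefix in the usage note are made up for illustration.

```shell
#!/bin/bash
# Append the three thirds of one line to ${prefix}0, ${prefix}1, ${prefix}2.
# split_thirds is a hypothetical name; it assumes the line length is a
# multiple of 3 and refuses anything else.
split_thirds() {
    local line=$1 prefix=$2
    local len=${#line}
    if (( len % 3 != 0 )); then
        echo "line length $len is not a multiple of 3" >&2
        return 1
    fi
    local width=$(( len / 3 )) part offset
    for part in 0 1 2; do
        offset=$(( part * width ))
        # ${line:offset:width} is bash substring expansion.
        printf '%s\n' "${line:offset:width}" >> "${prefix}${part}"
    done
}
```

Because it appends, calling it once per line (for example inside while IFS= read -r LINE; do split_thirds "$LINE" panel; done < file) builds up three column strips in panel0, panel1 and panel2.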

Quote:

Originally Posted by craigjward (Post 2986039)
how would I go about randomizing the order of a sequence of numbers from 1-9?

Code:

#!/bin/bash

for ((i = 1; i <= 9; i++)); do
        # pair each number with a random key, sort by the key, keep the number
        printf '%s\t%s\n' "$RANDOM" "$i"
done | sort -n | cut -f2
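
If you would rather avoid the external sort and cut, here is a rough sketch of a Fisher-Yates shuffle on a bash array. The function name shuffle_1_to_9 is made up for illustration, and $RANDOM is only a simple pseudo-random source, which should be fine for a game.

```shell
#!/bin/bash
# Print the numbers 1..9 in random order, one per line.
# shuffle_1_to_9 is a hypothetical name; $RANDOM is bash's built-in
# pseudo-random number (0..32767).
shuffle_1_to_9() {
    local nums=(1 2 3 4 5 6 7 8 9)
    local i j tmp
    # Fisher-Yates: walk backwards, swapping each slot with a randomly
    # chosen slot at or before it.
    for (( i = ${#nums[@]} - 1; i > 0; i-- )); do
        j=$(( RANDOM % (i + 1) ))
        tmp=${nums[i]}
        nums[i]=${nums[j]}
        nums[j]=$tmp
    done
    printf '%s\n' "${nums[@]}"
}
```

On systems with GNU coreutils, shuf -i 1-9 should give the same kind of result in one command.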



All times are GMT -5.