[shell] advanced string parsing

grail · 10-22-2011, 12:26 PM

Quote:

However, the OP wants to convert to RPN, so he cannot lose the operators.

Fair enough but his example does not seem to show that

Unfortunately you can't perform the bash substitution to get rid of the numbers as a asterisk
returns all files and folders in the current directory

crts · 10-22-2011, 12:54 PM

@OP: No offense, but I am getting the impression that you have yet to realize the complexity of the task that lies ahead. Do you have a rough idea of how you want to convert to RPN? I am asking because from the examples you seem to dismiss the operators. I also see that you are only using one array. But you will need at least two; one of it will function as a stack.
Here is some good pseudo-code for the shunting-yard algorithm:
http://en.wikipedia.org/wiki/Shunting-yard_algorithm

Am I assuming correctly, that at the moment you intend to only implement +/- operation and simply append the operators at the end of the array?

JoseUP · 10-22-2011, 01:49 PM

anybody dont know where is problem?

crts: its ok, I really dont know much about shell and this task is work beyond me. but i think i used bad idea when i used "rpn calculator". i simply do string calculator with basic arithmetic operations.

"Am I assuming correctly, that at the moment you intend to only implement +/- operation and simply append the operators at the end of the array?"
-First step is parse string to operators and operands to the two arrays. then I try to calculate result with it.

catkin · 10-22-2011, 05:43 PM

Quote:

Originally Posted by JoseUP

this is my code, i only fill variable, sorry if i misunderstood code

The problem is REPLY="20+10+15" which should be s="20+10+15". The contents of $s are fed to the read by <<< "$s". $REPLY is the default variable that read reads into.

Nominal Animal · 10-22-2011, 10:04 PM

If I may butt in? If you need an RPN calculator, then your input string should be of form

Code:

 20 10 15 + +

Let us assume you need a simple calculator which does not support proper operator precedence, but always computes left to right. In other words, given string 10+5*2 it actually computes (10+5)*2. Such a calculator is very simple, and requires exactly one stack, one accumulator (a running result value), and one temporary value.

Assuming the input expression is given at the command line, the first step is to construct the stack. I'm assuming Bash shell:

Code:

#!/bin/bash

set -f
stack=(`echo "$*" | tr -s '\t\n\v\f\r ' '      ' | sed -e 's| *\([-+/*]\) *| \1 |g'`)

echo "Stack:" "${stack[@]}" >&2

The set -f disables pathname expansion, so * stays * instead of being expanded to the list of files in current directory. set +f would turn it back on, but there is no reason to turn it back on in this script. It is only effective for the duration of this script; the caller is not affected at all.

The stack line constructs an array containing of each number and operator separately. I used backticks instead of the preferred $(...) notation for clarity only -- the preferred form being stack=($(...)) which I think is a bit harder to parse. The tr will combine all consecutive whitespace to a single normal space, and the sed expression makes sure there is a space before and after each operator (+ - / *). Thus, each number and each operator will be a separate item in the array.

The last echo command is for illustation only. It will output the stack to standard error. You can safely omit it, if you don't like the verbose output.

The leftmost value may be preceded by a plus or minus sign. We could handle it specially, but it is easiest just to start with a logical zero -- as if the expression started with 0 + . It will not change the result, but it does shorten the code.

Code:

result=0
stack=('+' "${stack[@]}")

Next, we need a loop, processing the stack until the stack is empty.

Code:

while [ ${#stack[@]} -gt 0 ]; do

Now, the leftmost item in the stack must be an operator, so we pop that. (If it is not, the expression is malformed. You'll see in the next snippet how we deal with unary operators + and -.)

Code:

    operator=${stack[0]}
    unset stack[0]
    stack=("${stack[@]}")

Next, we pop the value. There may be unary operators ("signs") + and/or - preceding it; they are logically part of the value. So, we combine them into the value.

Code:

    sign=1
    while [ "${stack[0]}" = "-" ] || [ "${stack[0]}" = "+" ]; do
        [ "${stack[0]}" = "-" ] && sign=$[ -sign ]
        unset stack[0]
        stack=("${stack[@]}")
    done

    if [ ${#stack[@]} -lt 1 ]; then
        echo "Expression cannot end with a mathematical operator!" >&2
        exit 1
    fi

    value=$[ sign * (${stack[0]} -0) ] || exit $?
    unset stack[0]
    stack=("${stack[@]}")

    echo "result=$result, operator=$operator, value=$value" >&2

The echo here outputs the current result, the operator, and the popped value to standard error. Again, you can remove it if you don't like the verbose output. The if clause makes sure there is a value in the stack. If you want, you can omit that; the value expression calculates "nothing" as zero.

Next, we need to apply the operator to the current result and the popped value. A case statement is perfect for this:

Code:

    case "$operator" in
    '+') result=$[ result + value ] || exit $? ;;
    '-') result=$[ result - value ] || exit $? ;;
    '*') result=$[ result * value ] || exit $? ;;
    '/') result=$[ result / value ] || exit $? ;;
    *)   echo "$operator: Not a recognized mathematical operator." >&2
         exit 1
         ;;
    esac
done

Note that the last case acts as a catchall, and will catch e.g. numbers, if the expression is malformed.

After the above, there is nothing else to do, but output the result:

Code:

echo "$result"

I haven't really tested the above well, and it's pretty late on a Saturday night, so it might very well contain errors.. but it does seem to work for me. There are a few obvious enhancements, though. For example, you could use a sed expression to eliminate unary operators (signs), simplifying the code quite a bit.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

A true RPN calculator is in some ways much simpler. You usually use one stack for values, and another for operators. Note that you need some special handling for + and - operators, namely combine all except the initial one into the value. In other words, --+-+5 should be parsed as - (5) or - (+5) or + (-5) (just as value -5 at the beginning of an expression). Otherwise you'd need to be able to somehow differentiate between '-' the substraction operator, and '-' the negation operator.

The loop in an RPN calculator pops off one operator from the operator stack, and applies it to the top of the value stack. +, -, * and / pop off two values, apply the operator to them, and push the result back in. (Certain other operators, like negation (unary -), would only modify the topmost value.)

(When only one stack is used, you start at the bottom of the stack, skipping values. Each operator you encounter apply to the values immediately below your current position in the stack. All items in the stack below your current position will always be numbers. In computer programs, it is usually simpler and more efficient to use two separate stacks instead; one will contain numeric values, and the other operators. In the single stack model, you can move all numbers before any operators, without changing the result; this is why the two-stack model always gives the same results.)

When the operator stack is empty, the value stack should have a single result value. If not, the expression was faulty. If the expression was faulty, you may also run out of values when applying the operators.

As others have pointed out, there are very efficient algorithms for converting normal mathematical expressions to RPN -- search for "infix to postfix".

Hope this helps.

grail · 10-22-2011, 11:40 PM

Well NA as usual cleared up a lot for me

So here is another look at the same idea

Code:

#!/bin/bash

set -f

result=0
mal_test='^[\/*]|[\/*+-]$|[\/*+-][\/*]|[^0-9\/*+-]'

num=(${1//[\/*+-]/ })
ops=(${1//[0-9]/ })

if (( ${#ops[*]} > ${#num[*]} )) || [[ $1 =~ $mal_test ]]
then
    echo "Malformed expression"
    exit 1
fi

if (( ${#ops[*]} != ${#num[*]} ))
then
    (( result += ${num[0]} ))
    num=(${num[*]:1})
fi

for i in ${!num[*]}
do
    if (( ${#ops[i]} == 2 ))
    then
        (( num[i] *= ${ops[i]:(-1)}1 ))
        ops[i]=${ops[i]:0:1}
    fi
    (( result ${ops[i]}= ${num[i]} ))
done

echo $result

JoseUP · 10-25-2011, 10:42 AM

I am pretty big donk, I dont understood what you want and give you too little information every time. I want do string calculator which support operator precedence. But I will understand if you gave up it

but thanks, codes what you writed here helps me a lot

grail · 10-26-2011, 03:52 AM

Well precedence is a pretty big thing not to include and requires a quite larger look into the line prior to manipulation.
Maybe based on what everyone has given you, see how you go and come back if you need more help?

Not sure if anyone asked, but this is not homework is it?