LinuxQuestions.org


ptml 09-17-2008 05:05 PM

BASH string manipulation help
 
The expression:

expr match "now is the the time" ".*the"

Returns 14, the "e" in the second "the"

What is the expression to return the position of the FIRST "the"? (The position of any character in it is fine.)

Thanks,
Paul

unSpawn 09-17-2008 07:01 PM

"expr index"?: i="now is the the time"; n=$(expr index "$i" "t"); echo "${i:$((n-1)):1} -> $n"

ptml 09-18-2008 12:43 AM

Ya, I went down that road and found that it is great for finding the first instance of a single letter, but it doesn't work for a string search. Thanks for taking a shot at it!

Cheers,
Paul

ghostdog74 09-18-2008 12:55 AM

Code:

# s="now is the the time"
# echo ${s%%e*}
now is th
# s=${s%%e*}
# echo "Index of  first e is $(( ${#s} + 1 ))"
Index of  first e is 10
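
The same trick should extend from a single letter to a whole substring such as "the". A minimal sketch, assuming bash; the function name first_index and the convention of printing 0 when nothing is found are illustrative choices, not something from the posts above:

```shell
# first_index STRING NEEDLE
# Prints the 1-based position of the first NEEDLE in STRING, or 0 if absent.
first_index() {
  local s needle prefix
  s=$1
  needle=$2
  # Strip the longest suffix that starts with NEEDLE; what remains is the
  # text before its first occurrence. Quoting "$needle" keeps it literal.
  prefix=${s%%"$needle"*}
  if [ "$prefix" = "$s" ]; then
    echo 0                        # nothing was stripped, so NEEDLE is absent
  else
    echo $(( ${#prefix} + 1 ))
  fi
}

first_index "now is the the time" "the"   # prints 8
first_index "now is the the time" "xyz"   # prints 0
```

This uses only parameter expansion, so no external command is run.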


Ashok_mittal 09-18-2008 03:00 AM

Quote:

Originally Posted by ptml (Post 3283597)
The expression:

expr match "now is the the time" ".*the"

Returns 14, the "e" in the second "the"

What is the expression to return the position of the FIRST "the" (any character is fine).

Thanks,
Paul

expr match returns the number of characters matched. The match is anchored at the start of the string and ".*" is greedy, which is why your pattern ".*the" matches everything up to the end of the second "the" and returns 14.

Use this:
Code:

echo `expr index "now is the the time" "[a-z]*the"`
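It returns 8, because the first 't' is at position 8. Note that expr index treats its second argument as a set of characters rather than a pattern, so this finds the first occurrence of any one of those characters, not of the string "the".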
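
To make the difference between the two expr forms concrete, here is a small sketch. It assumes GNU expr (the match and index keywords are extensions), and the second sample string is made up for illustration:

```shell
s="now is the the time"

# match is anchored at the start and '.*' is greedy, so the match runs
# through the LAST "the" in the string.
greedy=$(expr match "$s" '.*the')
echo "match .*the -> $greedy"        # prints: match .*the -> 14

# index takes a set of characters: this is the first 't', 'h' or 'e'.
first_t=$(expr index "$s" 'the')
echo "index the   -> $first_t"       # prints: index the   -> 8

# With a different string, index lands on the 'e' of "echo" even though
# the word "the" only starts at position 6.
misleading=$(expr index "echo the time" 'the')
echo "index the   -> $misleading"    # prints: index the   -> 1
```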

Kenhelm 09-19-2008 11:14 AM

Reversing the line overcomes the limitation of expr to matching strings from the start of the line:-
Code:

r=$(echo "now is the the time" | rev)  # Gives "emit eht eht si won"
p=$(expr match "$r" ".*\beht\b")
if [ $p -eq 0 ];then
  echo 0
else
  expr length "$r" + 1 - $p
fi

Using sed is better:-
Code:

expr length "$(echo "now is the the time" | sed -n 's/\bthe\b.*/t/p')"
The first 'the' to the end of the line is replaced with 't'.
\b represents a word boundary (to avoid matches with words like 'then')
-n is so that 0 is returned if there isn't a 'the' in the line.
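
The effect of the word boundary can be checked with a line that contains "then" before "the". This is only a sketch, assuming GNU sed (\b is a GNU extension); the function name word_index is hypothetical, the word is assumed to contain no regex or '/' characters, and ${#out} is used for the length in place of expr length:

```shell
# word_index LINE WORD
# Prints the 1-based position of the first whole-word WORD in LINE, or 0.
word_index() {
  local line word out
  line=$1
  word=$2
  # Replace the first whole-word match and everything after it with "t";
  # -n with the p flag prints the line only if a substitution happened.
  out=$(printf '%s\n' "$line" | sed -n "s/\b${word}\b.*/t/p")
  echo "${#out}"
}

word_index "and then the time" "the"     # prints 10, skipping "then"
word_index "now is the the time" "the"   # prints 8
word_index "nothing to see" "the"        # prints 0
```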

EDIT: A shorter version of the first method
Code:

r=$(echo "now is the the time" | rev)
expr length match "$r" ".*\beh\(t\b.*\)"


ptml 09-20-2008 12:21 AM

Bravo. Never in a million years would I have thought of reversing the string. Makes perfect sense. What would make more sense is for "expr" to include a flag to seek the first occurrence of something! Thanks for your insight.

philluder 08-26-2010 05:58 PM

string manipulation
 
If you use bash 4.x, you can source oobash, a string library written in bash in an OO style. You only need to source the file and you are up and running. A help system is embedded, so you can ask for the description of every "method":

http://sourceforge.net/projects/oobash/

String is the constructor function:

>String a abcda

>a.indexOf a

>0

>a.lastIndexOf a

>4

>a.indexOf da

>3

There are many more "methods" for working with strings in your scripts:

-base64Decode
-base64Encode
-capitalize
-center
-charAt
-concat
-contains
-count
-endsWith
-equals
-equalsIgnoreCase
-reverse
-hashCode
-indexOf
-isAlnum
-isAlpha
-isAscii
-isDigit
-isEmpty
-isHexDigit
-isLowerCase
-isSpace
-isPrintable
-isUpperCase
-isVisible
-lastIndexOf
-length
-matches
-replaceAll
-replaceFirst
-startsWith
-substring
-swapCase
-toLowerCase
-toString
-toUpperCase
-trim
-zfill

catkin 08-26-2010 07:38 PM

Alternatively, using parameter substitution
Code:

c@CW9:~$ string='now is the time'
c@CW9:~$ buf=the${string#*the}
c@CW9:~$ echo $(( ${#string} - ${#buf} + 1 ))
8

