LinuxQuestions.org - [SOLVED] Translating part of a line

Translating part of a line

I want to change the first character of each line to upper case.

Sample input:

Quote:

tom
dick
harry

Desired output:

Quote:

Tom
Dick
Harry

The brute-force technique uses cut to make two work files, tr to convert one of them to upper case, and paste to arrive at the desired result. Googling in search of a more direct method, I came upon this one:

Code:

sed 's/\([a-z]\)\([a-zA-Z0-9]*\)/\u\1\2/g' $InFile

It works but I don't understand it. Attempting to simplify this one-liner I tried this:

Code:

sed 's/\([a-z]\)\(.*\)/\u\1\2/' $InFile

This also works but why?!? The magic may lie in the \u but what does that mean?

Daniel B. Martin
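For comparison, the brute-force approach mentioned in the question might look roughly like the sketch below. It is a reconstruction, not the poster's actual script: the function name capitalize_brute_force and the mktemp work files are assumptions made for illustration.

```shell
# Sketch of the cut/tr/paste approach; the function name is hypothetical.
capitalize_brute_force() {
    infile=$1
    # Two work files: one for the first characters, one for the rest of each line.
    first=$(mktemp) || return 1
    rest=$(mktemp) || { rm -f "$first"; return 1; }
    cut -c1  "$infile" | tr '[:lower:]' '[:upper:]' >"$first"
    cut -c2- "$infile" >"$rest"
    # -d '\0' tells paste to join the two columns with no delimiter at all.
    paste -d '\0' "$first" "$rest"
    rm -f "$first" "$rest"
}
```

Called as capitalize_brute_force "$InFile", it writes the capitalized lines to standard output.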

\u is an escape sequence meaning upper case: in the replacement text, it converts the next character to upper case.
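A small experiment may make both the grouping and the case escapes easier to see. In the one-liner from the question, \([a-z]\) captures one lower-case letter as \1 and \(.*\) captures the rest of the line as \2. The sketch below assumes GNU sed, since \u, \U, and \L are not available in other implementations; the function names are made up for this example.

```shell
# Assumes GNU sed; the function names are illustrative only.

# \1 is the first lower-case letter found, \2 is everything after it.
upper_first_lowercase() { sed 's/\([a-z]\)\(.*\)/\u\1\2/'; }

# \U upper-cases everything that follows it in the replacement.
upper_all() { sed 's/.*/\U&/'; }

# \L lower-cases what follows, so only the first character keeps its case.
lower_rest() { sed 's/^\(.\)\(.*\)/\1\L\2/'; }

echo 'harry' | upper_first_lowercase   # Harry
echo 'harry' | upper_all               # HARRY
echo 'HARRY' | lower_rest              # Harry
```

Note that the first pattern is not anchored with ^, so it acts on the first lower-case letter anywhere on the line rather than strictly on the first character.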

For a less confusing solution:

Code:

awk 'BEGIN{ FS=""; OFS="" }{ $1=toupper($1); print $0 }' input
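Setting FS to the empty string so that each character becomes its own field is a common awk extension (gawk supports it) rather than something POSIX guarantees. If portability matters, a variant built on substr might be safer. This is only a sketch, and the function name is hypothetical.

```shell
# Portable awk sketch; capitalize_first_awk is an illustrative name.
capitalize_first_awk() {
    # Upper-case character 1, then append the rest of the line unchanged.
    awk '{ print toupper(substr($0, 1, 1)) substr($0, 2) }' "$@"
}

printf 'tom\ndick\nharry\n' | capitalize_first_awk
```

With no arguments the function reads standard input; with file names it passes them straight to awk.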

The sed code you posted will capitalize the first letter of every word. If you just need the first letter of a line, it can be simpler:

Code:

sed 's/^./\u&/' $InFile

Reference for \u

Hi.

Quote:

Originally Posted by ntubski (Post 4597700)

The sed code you posted will capitalize the first letter of every word. If you just need the first letter of a line, it can be simpler:

Code:

sed 's/^./\u&/' $InFile

Reference for \u

Even if you want to capitalize the first letter of every word, it can be simpler:

Code:

$ echo abba dav | sed 's/\b./\u&/g'

Abba Dav

# add brackets to show what matched

$ echo abba dav | sed 's/\b./[\u&]/g'

[A]bba[ ][D]av

This works because non-alphabetic characters (the space in particular) cannot be capitalized, so the extra matches are left unchanged.
Similarly, you can capitalize the last character of each word:

Code:

$ echo abba dav | sed 's/.\b/[\u&]/g'

abb[A][ ]da[V]

Excerpt from `info sed escapes':

Quote:

`\b'
Matches a word boundary; that is it matches if the character to
the left is a "word" character and the character to the right is a
"non-word" character, or vice-versa.
`\B'
Matches everywhere but on a word boundary; that is it matches if
the character to the left and the character to the right are
either both "word" characters or both "non-word" characters.

Note that these escape sequences, as well as the case-conversion ones (\u etc.), are all GNU extensions.
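The \B escape from the excerpt can be tried the same way. As a sketch, again assuming GNU sed, the following upper-cases every character that is not preceded by a word boundary, which is the opposite of the \b. example above. The function name is made up.

```shell
# Assumes GNU sed. \B. matches a character only when there is no word
# boundary immediately before it.
upper_inner() { sed 's/\B./\u&/g'; }

echo abba dav | upper_inner   # aBBA dAV
```

The first letter of each word and the space between the words sit right after a word boundary, so they are skipped.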

Thank you, H_TexMex_H, ntubski, and firstfire for useful contributions to this thread. Double thanks to ntubski for providing a pointer to the place where I can learn more on this subject.

I am in awe of the amazing range of operations of which sed is capable.

We can mark this thread SOLVED!

Daniel B. Martin

Bash-only solutions (version 4+), using parameter expansion:

Code:

# IFS= and -r keep leading whitespace and backslashes intact
while IFS= read -r line; do
        echo "${line^}" >>tempfile.txt
done <origfile.txt

[[ -n "$line" ]] && echo "${line^}" >>tempfile.txt        # processes the final line if the text has no trailing newline

# after a successful mv there is no tempfile.txt left to remove
mv -f tempfile.txt origfile.txt

Or store the text in an array first, with mapfile:

Code:

mapfile -t lines <origfile.txt
printf "%s\n" "${lines[@]^}" >tempfile.txt
mv -f tempfile.txt origfile.txt