LinuxQuestions.org


danielbmartin 02-08-2012 10:54 AM

Translating part of a line
 
I want to change the first character of each line to upper case.

Sample input:
Quote:

tom
dick
harry
Desired output:
Quote:

Tom
Dick
Harry
The brute force technique uses cut to make two work files, tr to convert one of them to upper case, and paste to arrive at the desired result. Googling in search of a more direct method, I came upon this one:
Code:

sed 's/\([a-z]\)\([a-zA-Z0-9]*\)/\u\1\2/g' "$InFile"
It works but I don't understand it. Attempting to simplify this one-liner I tried this:
Code:

sed 's/\([a-z]\)\(.*\)/\u\1\2/' "$InFile"
This also works but why?!? The magic may lie in the \u but what does that mean?

Daniel B. Martin

H_TeXMeX_H 02-08-2012 11:14 AM

\u is a GNU sed escape sequence used in the replacement text: it converts the next character to upper case.
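
To unpack the simplified one-liner: \([a-z]\) captures the first lower-case letter as \1, \(.*\) captures the rest of the line as \2, and \u in the replacement upper-cases only the single character that follows it. A minimal sketch, assuming GNU sed (the case-conversion escapes are not part of POSIX sed); the function name cap_first_sed is made up for illustration:

```shell
# Assumes GNU sed; cap_first_sed is a hypothetical helper name.
cap_first_sed() {
    # \1 = first lower-case letter, \2 = rest of the line;
    # \u upper-cases only the next character of the replacement.
    sed 's/\([a-z]\)\(.*\)/\u\1\2/'
}

printf 'tom\ndick\nharry\n' | cap_first_sed

# For comparison, \U upper-cases everything up to \E or the end of the replacement.
printf 'tom\n' | sed 's/.*/\U&/'
```

Note that the pattern is not anchored, so it capitalizes the first lower-case letter found anywhere on the line, which is not necessarily the first character.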

For a less confusing solution:

Code:

awk 'BEGIN{ FS=""; OFS="" }{ $1=toupper($1); print $0 }' input
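
The one-liner above relies on an empty FS splitting each line into single characters, which gawk does but which, as far as I know, POSIX leaves unspecified. A sketch of a variant that avoids the empty FS and uses only substr and toupper; the function name cap_first_awk is made up for illustration:

```shell
# cap_first_awk is a hypothetical helper name; uses only standard awk functions.
cap_first_awk() {
    # Upper-case the first character and append the rest of the line unchanged.
    awk '{ print toupper(substr($0, 1, 1)) substr($0, 2) }'
}

printf 'tom\ndick\nharry\n' | cap_first_awk
```

Empty lines pass through unchanged, since both substr calls return an empty string for them.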

ntubski 02-08-2012 10:47 PM

The sed code you posted will capitalize the first letter of every word. If you just need the first letter of a line, it can be simpler:
Code:

sed 's/^./\u&/' "$InFile"
Reference for \u
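
To see the difference between the two commands, try them on a line with more than one word. This is only a sketch and assumes GNU sed; the sample text is made up:

```shell
# Assumes GNU sed for \u; the sample line is made up for illustration.
line='tom jones'

# Every word: the g flag repeats the substitution for each match on the line.
echo "$line" | sed 's/\([a-z]\)\([a-zA-Z0-9]*\)/\u\1\2/g'

# First character of the line only: ^. matches one character at the start,
# and & stands for whatever the pattern matched.
echo "$line" | sed 's/^./\u&/'
```

With GNU sed the first command should print Tom Jones and the second Tom jones.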

firstfire 02-09-2012 01:55 AM

Hi.

Quote:

Originally Posted by ntubski (Post 4597700)
The sed code you posted will capitalize the first letter of every word. If you just need the first letter of a line, it can be simpler:
Code:

sed 's/^./\u&/' "$InFile"
Reference for \u

Even if you want to capitalize the first letter of every word, it can be simpler:
Code:

$ echo abba dav | sed 's/\b./\u&/g'
Abba Dav
# add brackets to show what matched
$ echo abba dav | sed 's/\b./[\u&]/g'
[A]bba[ ][D]av

This works because non-alphabetic characters (a space in particular) cannot be capitalized, so \u leaves them unchanged.
Similarly, you can capitalize the last character of each word:
Code:

$ echo abba dav | sed 's/.\b/[\u&]/g'
abb[A][ ]da[V]

Excerpt from `info sed escapes':
Quote:

`\b'
Matches a word boundary; that is it matches if the character to
the left is a "word" character and the character to the right is a
"non-word" character, or vice-versa.
`\B'
Matches everywhere but on a word boundary; that is it matches if
the character to the left and the character to the right are
either both "word" characters or both "non-word" characters.
Note that these escape sequences, as well as the case-conversion ones (\u etc.), are all GNU extensions.

danielbmartin 02-09-2012 01:27 PM

Thank you, H_TeXMeX_H, ntubski, and firstfire for useful contributions to this thread. Double thanks to ntubski for providing a pointer to the place where I can learn more on this subject.

I am in awe of the amazing range of operations of which sed is capable.

We can mark this thread SOLVED!

Daniel B. Martin

David the H. 02-11-2012 06:07 AM

Bash-only solutions (version 4+), using parameter expansion:

Code:

# IFS= and -r keep leading whitespace and backslashes intact
while IFS= read -r line || [[ -n "$line" ]]; do        #the || test processes the final line if there's no ending newline in the text
        printf '%s\n' "${line^}"
done <origfile.txt >tempfile.txt

mv -f tempfile.txt origfile.txt

Or store the text in an array first, with mapfile:

Code:

mapfile -t lines <origfile.txt
printf "%s\n" "${lines[@]^}" >tempfile.txt
mv -f tempfile.txt origfile.txt
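
For reference, a small sketch of the case-modifying expansions on their own, outside any file handling. It assumes bash 4 or later (a plain POSIX sh will reject the syntax), and the variable names are made up for illustration:

```shell
#!/usr/bin/env bash
# Requires bash 4+; the variable names are made up for illustration.
name=harry
names=(tom dick harry)

echo "${name^}"     # upper-case the first character
echo "${name^^}"    # upper-case every character
echo "${name,,}"    # lower-case every character (no change here)

# Applied to an array, the operator acts on each element in turn.
printf '%s\n' "${names[@]^}"
```

This is why the mapfile version needs no loop: the ^ operator is applied to every element of the array in a single expansion.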



All times are GMT -5.