LinuxQuestions.org

discomurder 06-09-2011 04:17 PM

Replace character with hex value of my choosing from shell cmd line? (Solaris 5, ksh)
 
Hi everyone. I have a rather odd task I am trying to accomplish.

I am bouncing a file across platforms (Windows->Solaris->mainframe), and the file starts out with a "special" character (the registered trademark "circle R") in some of the records. This character is not in the EBCDIC character set on the MF, so it is unrecognizable there.

The MF developer I am working with asked if it is possible to replace the character with a specific hex value (AF) before it gets to the MF.

I was putzing around with sed, tr, etc. on the ksh command line, hoping to find an easy way to get one of them to substitute hex instead of ASCII. I have found that the usual shell utilities recognize the trademark character, so homing in on what to replace is solved. But I cannot get anything to actually substitute in the hex sequence I want. E.g. I was thinking something like...

>cat special_file | sed 's/R/\0xAF/g'

But my version of sed does not seem to have hex "editing" capability.

Any thoughts? Thanks in advance.

orgcandman 06-09-2011 06:19 PM

man tr

David the H. 06-09-2011 06:22 PM

You might try using iconv instead. This is just the kind of situation it was designed for.
Code:


iconv -f UTF-8 -t EBCDIC-US//TRANSLIT file

The TRANSLIT option means it will attempt to replace unsupported characters with supported substitutes. There's also an IGNORE option, which will simply drop them instead.

jschiwal 06-09-2011 06:36 PM

The command you posted changes capital R's to 0xAF. Was that just a problem posting it here?

I think you need to use `\xAF' instead of `\0xAF' in sed.

The tr command can also do what you want as a previous poster has stated.
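
For reference, a minimal sketch of the `\x` form, assuming GNU sed (the hex escape is not available in every sed) and assuming the file is in a single-byte encoding such as ISO-8859-1 or Windows-1252, where the registered sign is the byte 0xAE. The function name is made up for illustration.

```shell
# Assumption: GNU sed. Other sed variants may not understand \xHH at all.
# Assumption: single-byte input where the registered sign is byte 0xAE.
# LC_ALL=C makes sed treat the input as plain bytes.
swap_reg_sign_gnu_sed() {
    LC_ALL=C sed 's/\xAE/\xAF/g'
}

# Feed in a sample line containing byte 0xAE (octal 256) and dump the result
# so the substituted byte can be inspected.
printf 'Acme\256 widget\n' | swap_reg_sign_gnu_sed | od -c
```

On real data this would be run as `swap_reg_sign_gnu_sed < special_file > fixed_file`. If the file were UTF-8 instead, the character would be two bytes (C2 AE) and this pattern would not be the right one.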

discomurder 06-10-2011 12:39 PM

Thanks to David and jschiwal for the real replies. It appears that my installation of Solaris does not have the EBCDIC character set for iconv. My /usr/lib/iconv/iconv_data file is as follows:
-----------------------------------------------------------------------------
646 8859 8859.646 8859.646.t
646de 8859 646de.8859 646de.8859.t
646da 8859 646da.8859 646da.8859.t
646en 8859 646en.8859 646en.8859.t
646fr 8859 646fr.8859 646fr.8859.t
646it 8859 646it.8859 646it.8859.t
646es 8859 646es.8859 646es.8859.t
646sv 8859 646sv.8859 646sv.8859.t
8859 646 8859.646 8859.646.t
8859 646de 8859.646de 8859.646de.t
8859 646da 8859.646da 8859.646da.t
8859 646en 8859.646en 8859.646en.t
8859 646fr 8859.646fr 8859.646fr.t
8859 646it 8859.646it 8859.646it.t
8859 646es 8859.646es 8859.646es.t
8859 646sv 8859.646sv 8859.646sv.t
----------------------------------------------------------------------------------

So I could not get iconv to work. And you are correct, sed does not need \0x, it just needs \x (I just typed that command off the top of my head). Unfortunately, it appears that the variant of sed on our system does not have hex substitution in the first place.

I did manage to get tr to work, at least on the Solaris side. After running the file through tr and looking at a hex dump (via od) of the result, it appears that the desired hex value was substituted in (if one uses the correct octal equivalent in the tr command). But when I transfer the file to the mainframe and look at the raw hex there, it seems that the file is being modified by the transfer. I guess if I'm ftp'ing in ASCII mode, there's no reason to expect the non-printable characters to make it through intact.

Anyway, thanks again for the real replies. The experimentation continues...

jschiwal 06-11-2011 06:03 PM

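Looking in the sed info file, I see that the \x hex escape is a GNU extension.
I also looked at the POSIX (1p) man page for the `tr' command. You should be able to translate the character if you use octal codes. 0xAF is octal 257, so the escape to give tr is \257 (tr reads at most three octal digits after the backslash, so \0257 would be taken as \025 followed by a literal 7).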
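
A rough sketch of the octal approach, assuming the file is in a single-byte encoding (ISO-8859-1 or Windows-1252), where the registered sign is byte 0xAE, i.e. octal 256. That source byte is an assumption, so check it with od on the real file first. The function name is made up for illustration.

```shell
# Assumption: the registered sign is stored as the single byte 0xAE (octal 256).
# Target byte requested by the mainframe side: 0xAF (octal 257).
# LC_ALL=C keeps tr working on raw bytes regardless of the locale.
replace_reg_sign() {
    LC_ALL=C tr '\256' '\257'
}

# Sample run: a line containing byte 0xAE goes in, and od shows the bytes
# that come out, so the substitution can be checked by eye.
printf 'Acme\256 widget\n' | replace_reg_sign | od -c
```

On the real file this would be `replace_reg_sign < special_file > fixed_file`, followed by an od dump of fixed_file to confirm the byte changed.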


All times are GMT -5.