LinuxQuestions.org
Linux - General This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.

12-01-2012, 07:40 PM   #1
romagnolo
Member
 
Registered: Jul 2009
Location: Montaletto
Distribution: Debian GNU/Linux
Posts: 107

Rep: Reputation: 5
Bash: $'\x00' -- What is this?


Our super-modern, commerce-oriented web search engines are simply too stupid to let programmers search for information about specific sequences of symbols... That is why I'm looking for a solution here.

I came across a Bash expansion or substitution that I had never seen before; it consists of sequences of characters like
Code:
$'\x00\x10\x20'
What it does is replace the UTF-8 byte sequence with the character it represents.

I've never seen this in Bash's manual (and I'm inclined to think it is not described there at all), nor anywhere else.

Do you have any information on this construct, such as its name, or a document where this is described?
 
12-02-2012, 02:51 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Those are eight-bit characters written in hexadecimal, a notation that many programs understand. Have a look at man ascii for their specific meaning.

This is from the bash man page:
Code:
Words of the form $'string' are treated specially.  The word expands to
       string, with backslash-escaped characters replaced as specified by  the
       ANSI  C  standard.  Backslash escape sequences, if present, are decoded
       as follows:
              \a     alert (bell)
              \b     backspace
              \e
              \E     an escape character
              \f     form feed
              \n     new line
              \r     carriage return
              \t     horizontal tab
              \v     vertical tab
              \\     backslash
              \'     single quote
              \"     double quote
              \nnn   the eight-bit character whose value is  the  octal  value
                     nnn (one to three digits)
              \xHH   the  eight-bit  character  whose value is the hexadecimal
                     value HH (one or two hex digits)
              \cx    a control-x character

       The expanded result is single-quoted, as if the  dollar  sign  had  not
       been present.
Example:
Code:
# using hexadecimal values
$ echo  $'\x21\x22\x23'
!"#

# using octal values:
$ echo '!~' | tr '\041\176' 'X'
XX
PS: They are not UTF-8 specific.
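To make the "not UTF-8 specific" point concrete, here is a minimal sketch, assuming bash plus the usual wc and od utilities; the variable names are only illustrative. It shows that each escape inside $'...' stands for exactly one byte, whatever the terminal later makes of those bytes.

```shell
#!/usr/bin/env bash
# Each \xHH or \nnn escape inside $'...' expands to exactly one byte.
hex=$'\x21\x22\x23'      # bytes 0x21 0x22 0x23
oct=$'\041\176'          # bytes 0x21 0x7e, written in octal
printf '%s\n' "$hex"     # prints: !"#
printf '%s\n' "$oct"     # prints: !~

# Two escapes are two bytes, even if a UTF-8 terminal draws them as one
# character (0xc3 0xa9 happens to be the UTF-8 encoding of "é").
two=$'\xc3\xa9'
nbytes=$(printf '%s' "$two" | wc -c)
printf 'byte count: %d\n' "$nbytes"   # prints: byte count: 2

# Hex dump of the same bytes, to see them independent of any font or locale.
printf '%s' "$two" | od -An -tx1
```

Piping through od is a handy habit whenever it is unclear what a quoted string really contains.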
 
12-02-2012, 07:39 AM   #3
romagnolo
Member
 
Registered: Jul 2009
Location: Montaletto
Distribution: Debian GNU/Linux
Posts: 107

Original Poster
Rep: Reputation: 5
Thank you, that's very useful. I didn't know that we had a man ascii too.
If only there were an effective way to search across man pages...
 
12-02-2012, 08:35 AM   #4
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
You will note that the first character is 0x00, which is NUL. Each \xHH escape is a single byte, so the sequence is three separate bytes (NUL, DLE, space) rather than the hex representation of one UTF-32 glyph. How any printable bytes end up being displayed is under the control of the terminal emulator and the font it uses...
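One practical consequence of that leading 0x00 is worth checking by hand. The sketch below assumes bash and wc, and its strings are made up for illustration: bash keeps strings NUL-terminated internally, so a \x00 inside $'...' cuts the expansion short, whereas the printf builtin can still write a real NUL byte to a pipe.

```shell
#!/usr/bin/env bash
# Bash strings are NUL-terminated internally, so everything from \x00
# onward is dropped when $'...' is expanded.
s=$'AB\x00CD'
printf '%s\n' "${#s}"        # prints: 2

# The printf builtin interprets the escapes itself and writes the bytes
# straight to its output, so here the NUL byte survives.
n=$(printf 'AB\x00CD' | wc -c)
printf '%d\n' "$n"           # prints: 5
```

So $'\x00\x10\x20' as written in the first post expands to an empty string in bash; to emit those three bytes, something like printf '\x00\x10\x20' is the usual route.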
 
12-02-2012, 08:51 AM   #5
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Quote:
Originally Posted by romagnolo
Thank you, that's very useful. I didn't know that we had a man ascii too.
If only there were an effective way to search across man pages...
You do know that you can search all the man pages present on your box?
Code:
$ man -k utf-8
utf8 (7)             - an ASCII compatible multibyte Unicode encoding
FcStrCmp (3)         - compare UTF-8 strings
FcStrCmpIgnoreCase (3) - compare UTF-8 strings ignoring case
FcStrStr (3)         - locate UTF-8 substring
FcStrStrIgnoreCase (3) - locate UTF-8 substring ignoring ASCII case
FcUcs4ToUtf8 (3)     - convert UCS4 to UTF-8
FcUtf8Len (3)        - count UTF-8 encoded chars
FcUtf8ToUcs4 (3)     - convert UTF-8 to UCS4
utf-8 (7)            - an ASCII compatible multibyte Unicode encoding
uxterm (1)           - X terminal emulator for Unicode (UTF-8) environments

$ man -k ascii
aaxine (1)           - an ASCII art video player
ascii (7)            - the ASCII character set encoded in octal, decimal, and hexadecimal
asciitopgm (1)       - convert ASCII graphics into a portable graymap
asctime (3)          - transform date and time to broken-down time or ASCII
asctime_r (3)        - transform date and time to broken-down time or ASCII
ctime (3)            - transform date and time to broken-down time or ASCII
.
.
.
strtold (3)          - convert ASCII string to floating-point number
toascii (3)          - convert character to ASCII
utf-8 (7)            - an ASCII compatible multibyte Unicode encoding
utf8 (7)             - an ASCII compatible multibyte Unicode encoding
See man man for details (yes, there's a man page for man itself).
 
12-03-2012, 03:26 AM   #6
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,362

Rep: Reputation: 2751
... and if you don't like computerese such as switches (-k), there is normally an apropos command, which is equivalent to man -k; thus
Code:
apropos utf-8
returns the same answers.
 
  


All times are GMT -5.
