"[:print:]" is a character class. It contains a pre-defined list of common characters used in regex tests. Let's see what it's supposed to contain (as taken from the grep info page):
Code:
`[:print:]'
Printable characters: `[:alnum:]', `[:punct:]', and space.
So let's also look up [:alnum:] and [:punct:]:
Code:
`[:alnum:]'
Alphanumeric characters: `[:alpha:]' and `[:digit:]'; in the `C'
locale and ASCII character encoding, this is the same as
`[0-9A-Za-z]'.
`[:punct:]'
Punctuation characters; in the `C' locale and ASCII character
encoding, this is `! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \
] ^ _ ` { | } ~'.
(You can look up [:alpha:] and [:digit:] on your own if you need to, but they're pretty self-explanatory.)
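If you want to check these definitions yourself, here's a minimal sketch. Keep in mind that in grep a class name only works inside a bracket expression, so it's written as [[:print:]], not [:print:]. The in_class function is just a made-up helper for this post, not a standard command.

```shell
# Hypothetical helper: succeed if the single character $1 belongs to
# the POSIX character class named in $2 (alnum, punct, print, ...).
in_class() {
    # LC_ALL=C pins the class to the ASCII definition quoted above.
    printf '%s\n' "$1" | LC_ALL=C grep -q "[[:$2:]]"
}

in_class 'a' alnum && echo "a is in alnum"
in_class '!' punct && echo "! is in punct"
in_class ' ' print && echo "space is in print"
in_class ' ' punct || echo "space is not in punct"
```

All four messages should be printed, which matches the info page: space counts as printable, but it isn't punctuation.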
Two things are clear from this. First, the results depend to some degree on your current locale, so be sure to explicitly set the C locale in your script first if you want them to be consistent. Second, these are fixed definitions. You can't exclude a character that's in a set. If none of the pre-defined sets work for you, you have to set up your own customized list of characters in a regex bracket expression.
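As a rough sketch of both points, assuming a POSIX-style shell and grep: the locale is pinned once at the top, and a set that can't be expressed by subtracting from a class is simply spelled out by hand. The sample strings and variable names are made up for illustration.

```shell
# Pin the locale once, near the top of the script, so that
# [[:alnum:]] means exactly [0-9A-Za-z] for every command below.
LC_ALL=C
export LC_ALL

# Count the sample lines made up solely of alphanumerics.
# 'abc123' qualifies; 'abc_123' does not, since "_" is punctuation.
count=$(printf '%s\n' 'abc123' 'abc_123' | grep -c '^[[:alnum:]]*$')
echo "$count"

# There is no way to say "[:alnum:] except 0", so that set has to be
# written out as a custom bracket expression instead.
custom=$(printf '%s\n' 'abc123' 'abc102' | grep -c '^[1-9A-Za-z]*$')
echo "$custom"
```

Each count comes out as 1, since only the first sample line passes in both cases.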
On the other hand, if you want to expand a list, you can also combine character classes, either with each other or with individual characters. To create a list of just the characters you want, you need to do something like this:
Code:
$'[-\"\'!#%&()*+,/:;<=>@_`|~ [:alnum:]]'
Note also that there are some limitations on where certain characters can appear in a bracket expression if they are to be taken literally (']' needs to be first, '^' can't be first, and '-' needs to be first or last), and you have to ensure that anything that can be interpreted by the shell is properly escaped as well. In this case the $'' quoting style (historically not POSIX, but supported by most modern shells) can be very handy, as it expands certain backslash escapes into their literal equivalents. See the bash man page for details. (Actually, I'm not completely sure I built the above string correctly. It still needs to be tested.)
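One way to test the string is sketched below, assuming bash (or another shell with $'' quoting) and grep. It prints what the shell actually hands to grep once the escapes are expanded, then tries the expression on two made-up sample strings. The set_expr and all_in_set names are only for illustration.

```shell
#!/usr/bin/env bash
# The expression from above. $'' turns \" and \' into plain quote
# characters and leaves everything else, backtick included, untouched.
set_expr=$'[-\"\'!#%&()*+,/:;<=>@_`|~ [:alnum:]]'

# Show the expanded bracket expression as grep will see it.
printf '%s\n' "$set_expr"

# Hypothetical helper: succeed if every character of $1 is in the set.
all_in_set() {
    printf '%s\n' "$1" | LC_ALL=C grep -q "^${set_expr}*\$"
}

all_in_set 'Hello, "world"!' && echo "accepted"
all_in_set 'cost: $5'        || echo "rejected"
```

The second string is rejected because '$' was left out of the list. The '-' sits right after the opening '[', so it is read as a literal character rather than as part of a range.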
If you're using tr though, it can be a bit easier, as you don't need to build a full [] bracket expression, just a list of the raw characters.
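Here's a minimal sketch of the tr version, assuming the goal is to strip everything that isn't in the list. With tr, the class name goes straight into the list with no outer brackets. The keep variable and keep_listed function are made-up names, and the list is built in steps only to keep the quoting readable.

```shell
# Characters to keep: alphanumerics, space, and the chosen punctuation.
keep='[:alnum:] !"#%&()*+,/:;<=>@_`|~'   # everything but the single quote
keep="$keep'"                             # append the single quote
keep="$keep"'\n-'                         # keep newlines; "-" goes last so tr reads it literally

# Hypothetical helper: delete (-d) every character that is in the
# complement (-c) of the list, i.e. everything not listed above.
keep_listed() {
    LC_ALL=C tr -cd "$keep"
}

printf '%s\n' 'cost: $5 {approx}' | keep_listed
```

This should print "cost: 5 approx", with the '$' and both braces removed.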