As I said before, I'm guessing the problem is the file's encoding. Most text editors can auto-detect the encoding, but the shell displays UTF-8, and most of the simpler tools handle only UTF-8 or ASCII. If your file isn't UTF-8, it won't display properly in the terminal:
$ echo "foo§bar" > file.txt
$ file file.txt
file.txt: UTF-8 Unicode text
$ cat file.txt
foo§bar
$ iconv -f UTF-8 -t ISO-8859-1 file.txt > file2.txt
$ file file2.txt
file2.txt: ISO-8859 text
$ cat file2.txt
foo�bar
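If you want to see exactly why the second file displays wrong, dumping the raw bytes makes the difference visible. This is just a sketch (assuming a UTF-8 shell and that `iconv` is installed): `§` is U+00A7, which is the two bytes `c2 a7` in UTF-8 but the single byte `a7` in ISO-8859-1, and a lone `a7` is invalid UTF-8, so the terminal can't render it.

```shell
# Dump the bytes of the string in both encodings.
# § (U+00A7) is c2 a7 in UTF-8 but just a7 in ISO-8859-1.
printf 'foo§bar' | od -An -tx1
printf 'foo§bar' | iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1
```

The first dump contains `c2 a7` where the second has only `a7`, which is why `file` can tell the two apart.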
And I said that cut can't handle multi-byte characters as delimiters, not that it can't process text containing them. Whenever I try to use anything other than ASCII in the -d option, I get an error saying "the delimiter must be a single character", which shows that it won't accept multi-byte delimiters.
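As a practical workaround (a sketch, assuming a UTF-8 locale), awk accepts a multi-byte field separator where cut refuses one:

```shell
# cut -d'§' fails with "the delimiter must be a single character",
# but awk's -F happily takes the multi-byte separator:
printf 'foo§bar\n' | awk -F'§' '{ print $2 }'    # prints "bar"
```

awk treats -F as a string (or regex), so the two-byte UTF-8 sequence for § is matched as a unit rather than rejected as "more than one character".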
Edit: The cut info page lists this (non-functional) option:
`-n' Do not split multi-byte characters (no-op for now).
This tells me that cut currently can't distinguish between single- and multi-byte characters, but that the maintainers intend to add that ability in the future.