Windows 7 txt file to Linux conversion problems
Hi,
I'm having a lot of trouble using a .txt file in Linux that was created with Microsoft Office Word on Windows.
I save the file in Word on Windows as a .txt file and select Unicode (UTF-8) as the encoding, which is also the final output encoding I need. I then have a conversion program on Ubuntu Linux that needs to run on this file. However, I run into difficulties because the text file shows characters like <C3><AF> when I view it with cat or Emacs.
I have tried almost everything: saving in different formats, converting with iconv and dos2unix, and checking Ubuntu's character encoding settings. But I always end up with the same problem: characters shown between <>. Can someone give me the winning combination?
example line: Ze werken op de computer waarop ze ge<95>nstalleerd zijn
How it should be: Ze werken op de computer waarop ze geïnstalleerd zijn
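To help narrow this down, here is a minimal Python sketch for inspecting the raw bytes of such a file. It relies on one fact: the byte pair C3 AF is the UTF-8 encoding of ï, while a lone 95 byte is not valid UTF-8. So the two symptoms above may come from differently encoded files. The function names describe_bytes and normalize are made up for illustration, and the sketch only checks for a UTF-8 byte order mark and Windows line endings. It does not guess which legacy encoding a non-UTF-8 file uses.

```python
def describe_bytes(data: bytes) -> str:
    """Report whether data is valid UTF-8, starts with a BOM, or has CRLF endings."""
    has_bom = data.startswith(b"\xef\xbb\xbf")  # UTF-8 byte order mark
    has_crlf = b"\r\n" in data                  # Windows-style line endings
    try:
        data.decode("utf-8")
        valid = True
    except UnicodeDecodeError:
        valid = False
    return f"valid_utf8={valid} bom={has_bom} crlf={has_crlf}"


def normalize(data: bytes) -> bytes:
    """Strip a leading UTF-8 BOM and turn CRLF into LF; other bytes are untouched."""
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    return data.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    # The pair C3 AF decodes to the expected character.
    print(b"\xc3\xaf".decode("utf-8"))
    # The example line with a lone 0x95 byte is not valid UTF-8.
    print(describe_bytes(b"ge\x95nstalleerd"))
```

If describe_bytes reports valid_utf8=True for the file, the bytes themselves are probably fine and the <C3><AF> display more likely points at the terminal or editor not being set to a UTF-8 locale. If it reports False, the file was likely saved in some other encoding and would need converting from that encoding first.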