Old 03-25-2008, 04:47 PM   #1
Registered: Dec 2005
Location: Madrid
Distribution: Solaris 10, Solaris Express Community Edition
Posts: 547

Converting UTF-16 files to another encoding (such as UTF-8)


I received a batch (more than 1,700) of scripts generated by Microsoft SQL Server Enterprise Manager, and I have to work on them. I think they are UTF-16 files (UTF-16 is the internal text representation in Windows 2000 and later), and on Solaris the file command just reports them as data:
bash-3.2$ file dbo.tTransactionIncidents.TAB
dbo.tTransactionIncidents.TAB: data
In other words, I cannot grep or sed through them unless I re-encode them first. In vim I can :set fileencoding=utf-8 and then write the file, and that works, but there are so many files that I need to do it from a script, and I'm not aware of any tool or command (not even vim) that can do the job.
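One way to check the UTF-16 guess before converting anything is to look at the first two bytes of a file, since UTF-16 files written by Windows tools usually start with a byte-order mark. The following is only a sketch: bom_of is a hypothetical helper name, and it assumes an od that accepts the POSIX -A, -t and -N options.

```shell
#!/bin/sh
# bom_of FILE: print the first two bytes of FILE as hex digits.
#   fffe -> UTF-16 little-endian byte-order mark (typical for Windows tools)
#   feff -> UTF-16 big-endian byte-order mark
# bom_of is a hypothetical helper name, not something from the thread.
bom_of() {
    od -An -tx1 -N2 "$1" | tr -d ' \n'
}

# Example: bom_of dbo.tTransactionIncidents.TAB
```

Any other result means the file has no UTF-16 byte-order mark; it could still be UTF-16 without one, so treat the check as a hint rather than proof.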

Have you got any suggestion?
Thanks a lot,
Old 03-25-2008, 06:04 PM   #2
Senior Member
Registered: Nov 2002
Location: Edmonton AB, Canada
Distribution: Gentoo x86_64; Gentoo PPC; FreeBSD; OS X 10.9.4
Posts: 3,760
Blog Entries: 4


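require 'iconv'  # part of the standard library in Ruby 1.8/1.9; removed in Ruby 2.0
ic = Iconv.new("ASCII", "UTF-16LE") # Iconv.new(to, from); replace 'ASCII' with 'UTF-8' if you prefer

ARGV.each do |file|
  # Read the whole file in binary mode: splitting UTF-16 data on "\n" bytes
  # would cut a two-byte character in half.
  in_data = File.open(file, "rb") { |in_file| in_file.read }
  File.open("#{file}.out", "wb") { |out_file| out_file.write(ic.iconv(in_data)) }
end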
Note: This is untested. It re-encodes every input file to ASCII and writes the result to "original_name.out".
You will need to use shell globbing or find/xargs to supply it with all your file names.



You can skip the middleman. Ruby iconv is just a wrapper for the iconv C library/utility. Have a look at 'man iconv'.

Last edited by bulliver; 03-25-2008 at 06:14 PM.
Old 03-25-2008, 06:20 PM   #3
Registered: Feb 2004
Location: Outside Paris
Distribution: Solaris10, Solaris 11, Mint, OL
Posts: 9,571

Or simpler:
iconv -f UTF-16 -t UTF-8 file
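For the 1,700-odd files in the original question, the command above can be wrapped in a loop. This is a minimal sketch rather than something posted in the thread: the *.TAB pattern is taken from the example file name, while convert_dir and the utf8 output directory are hypothetical names. It assumes each file starts with a byte-order mark so that iconv can resolve plain UTF-16; if that is not the case, -f UTF-16LE may be needed instead.

```shell
#!/bin/sh
# convert_dir SRC DST: convert every *.TAB file in SRC from UTF-16 to UTF-8,
# writing the results under DST so the originals stay untouched.
# convert_dir is a hypothetical helper name, not something from the thread.
convert_dir() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for f in "$src"/*.TAB; do
        [ -f "$f" ] || continue              # the glob matched nothing
        name=$(basename "$f")
        if iconv -f UTF-16 -t UTF-8 "$f" > "$dst/$name"; then
            echo "converted: $name"
        else
            echo "failed: $name" >&2
            rm -f "$dst/$name"               # drop the partial output
        fi
    done
}

# Example: convert_dir . ./utf8
```

Writing to a separate directory keeps the original files available in case some of them turn out not to be UTF-16 after all.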
Old 03-25-2008, 06:30 PM   #4
Registered: Dec 2005
Location: Madrid
Distribution: Solaris 10, Solaris Express Community Edition
Posts: 547

Original Poster
Thank you very much, to both of you, it works!


