Old 06-27-2006, 12:58 PM   #1
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Rep: Reputation: 148
convert file from UTF8 to ASCII encoding


I have been writing a VBA script (not my area of expertise) to extract some data from an Excel spreadsheet and generate an XML file. The XML file is going to be read in by an existing program. The file will include some Unicode characters, so VBA needs it to be saved as a Unicode (UTF-8) file, but the program that will read the file needs it to be saved in ASCII format.

I have opened the file with Notepad++, switched the encoding to ASCII and saved the file, and this works.

However, I am going to generate about 5000 files, and I don't really want to do this for each one by hand! Does anyone know of a way to automate this process?

Ideas, solutions welcome.

Thanks very much.
 
Old 06-27-2006, 01:09 PM   #2
unSpawn
Moderator
 
Registered: May 2001
Posts: 27,140
Blog Entries: 54

Rep: Reputation: 2791
iconv? From "apropos iconv": iconv (1) - Convert encoding of given files from one encoding to another
 
Old 06-27-2006, 03:03 PM   #3
jim mcnamara
Member
 
Registered: May 2002
Posts: 964

Rep: Reputation: 34
There is a file, usually /usr/lib/nls/iconv/config.iconv, that lists all of the possible (allowable) conversions.

On my box
Code:
iconv -f utf8 -t iso89 oldfile > newfile
does the job. You need to find config.iconv, then read it to find what your commands are.
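
As a rough sketch of the same idea, the command can be wrapped in a small function. This assumes the target really is plain ASCII (as in the thread title) and that your iconv accepts the names UTF-8 and ASCII; the function name to_ascii is made up for illustration. Most iconv implementations will also print the encoding names they know with "iconv -l", which may be easier than hunting for config.iconv.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for this sketch): write an ASCII copy
# of a UTF-8 file to standard output, leaving the original untouched.
# Assumption: the iconv on this system knows the names UTF-8 and ASCII.
to_ascii() {
    iconv -f UTF-8 -t ASCII "$1"
}

# Typical use (not executed here):
#   iconv -l                      # list the encoding names iconv accepts
#   to_ascii oldfile > newfile    # convert one file
```

How characters with no ASCII equivalent are handled depends on the iconv implementation, so it is worth checking the exit status and the output on a sample file first.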
 
Old 06-27-2006, 10:02 PM   #4
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Original Poster
Rep: Reputation: 148
Excellent response, thanks unSpawn & jim mcnamara. iconv is supported in PHP, so I can put together a quick script to locate each file and convert it. I'm a little confused by the encodings that I needed to use, but I've tested it and it works, so I'm relieved.

Once again thanks very much.
 
Old 06-28-2006, 01:38 AM   #5
jlliagre
Moderator
 
Registered: Feb 2004
Location: Outside Paris
Distribution: Solaris10, Solaris 11, Mint, OL
Posts: 9,493

Rep: Reputation: 355
I'm surprised it worked, as iso89 isn't a known encoding.

Didn't you run "iconv -f utf8 -t ascii oldfile > newfile" instead?

What Unicode characters were causing a problem in your source file?
 
Old 06-28-2006, 02:00 AM   #6
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Original Poster
Rep: Reputation: 148
The php function I've written is as follows:
PHP Code:
function encodeFiles($rootPath)
{
    $directories = scandir($rootPath);
    // Loop through each directory in the rootPath
    foreach ($directories as $dir)
    {
        $fileName = $rootPath . "\\" . $dir . "\\metadata.xml";
        // Check to see if the current directory has a file called metadata.xml
        if (file_exists($fileName))
        {
            // Read in the contents
            $data = file_get_contents($fileName);
            // Just display on the screen the file being modified
            echo "Converting " . $fileName . "...\n";
            // Convert the contents
            $data = iconv("UCS-2", "UTF-8", $data);
            // Write back out to the same file
            file_put_contents($fileName, $data);
        }  // end if file exists
    }  // end loop for each directory in the root directory
}  // end of function encodeFiles()
When I open the file in Notepad++ it tells me the encoding is ANSI, which is what I needed and the program I feed the file into accepts it. So I'm now making progress in the conversion job.
 
Old 06-28-2006, 02:56 AM   #7
jlliagre
Moderator
 
Registered: Feb 2004
Location: Outside Paris
Distribution: Solaris10, Solaris 11, Mint, OL
Posts: 9,493

Rep: Reputation: 355
Okay, so instead of converting from UTF-8 to ASCII, you are converting from UCS-2 (essentially UTF-16) to UTF-8.

Your posting title was misleading: UTF-8 is close to ASCII, while UTF-16 is definitely not.
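
To illustrate the difference, here is a small sketch that counts the bytes produced when ASCII-only text is re-encoded. It assumes an iconv that accepts the names ASCII, UTF-8 and UTF-16LE; the helper name byte_count is invented for this example. UTF-16LE is used rather than plain UTF-16 so that no byte-order mark is added to the count.

```shell
#!/bin/sh
# Hypothetical helper: print how many bytes a piece of ASCII text occupies
# after being re-encoded.
#   $1 = target encoding name, $2 = text to encode
byte_count() {
    printf '%s' "$2" | iconv -f ASCII -t "$1" | wc -c | tr -d ' '
}

# Typical use (not executed here):
#   byte_count UTF-8 abc      # ASCII bytes pass through unchanged
#   byte_count UTF-16LE abc   # every character becomes two bytes
```

An ASCII-only file is therefore already valid UTF-8 byte for byte, whereas the UTF-16 form is twice the size and unreadable to a program that expects ASCII.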
 
Old 06-28-2006, 03:07 AM   #8
graemef
Senior Member
 
Registered: Nov 2005
Location: Hanoi
Distribution: Fedora 13, Ubuntu 10.04
Posts: 2,379

Original Poster
Rep: Reputation: 148
My title was describing what I thought I needed to do at the time. The program I'm creating the XML files for is poorly documented, and I had spent the whole afternoon trying to figure out why they were being rejected: no errors, no log files, not much in the way of assistance.

But I'm now over that hurdle; I wonder what it will throw at me next.

Last edited by graemef; 06-28-2006 at 03:27 AM.
 
Old 12-15-2008, 04:45 AM   #9
kalimat
Member
 
Registered: Dec 2008
Posts: 30

Rep: Reputation: 0
Hi, I have a question: what can I do if I want a script that converts every file in a folder?

I started like this:

#!/bin/bash
cd folder
n=$(ls | wc -l)
i=1
p=...   # path of the file

# and now, for every file in the folder (?)

while test "$i" -le "$n"; do
iconv -f UTF-16 -t ASCII "$p"
((i++))
p=.....   # path of the next file
done

How do I make it retain the path of each file?
I am waiting for an answer.
Thanks a lot.
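
One possible approach, offered as a sketch rather than a definitive answer: let the shell expand a glob in a for loop, so the loop variable holds the path of each file in turn and no counter is needed. The function name convert_folder and the idea of writing to a separate output folder are assumptions made for this example; the encodings are the ones from the script above.

```shell
#!/bin/sh
# Hypothetical helper: convert every regular file in a folder from UTF-16
# to ASCII, writing the results into a separate output folder.
#   $1 = source folder, $2 = output folder (created if missing)
convert_folder() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for path in "$src_dir"/*; do
        # Skip subfolders, and the literal pattern if the folder is empty.
        [ -f "$path" ] || continue
        # "$path" already holds the path of the current file.
        iconv -f UTF-16 -t ASCII "$path" > "$out_dir/$(basename "$path")" ||
            echo "could not convert: $path" >&2
    done
}

# Typical use (not executed here):
#   convert_folder folder folder-ascii
```

Writing to a second folder keeps the originals intact if a conversion fails partway; redirecting iconv's output onto its own input file would truncate the file before it is read.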
 
  


All times are GMT -5.