LinuxQuestions.org


graemef 06-27-2006 12:58 PM

convert file from UTF8 to ASCII encoding
 
I have been writing a VBA script (not my area of expertise) to extract some data from an Excel spreadsheet and generate an XML file. The XML file is going to be read in by an existing program. The file will include some Unicode characters, so VBA needs it to be saved as a Unicode (UTF-8) file, but the program that will read the file needs it to be saved in ASCII format.

I have opened the file with Notepad++, switched the encoding to ASCII and saved the file, and this works.

However, I am going to generate about 5000 files, and I don't really want to do this for each one by hand! Does anyone know of a way in which I can automate this process?

Ideas, solutions welcome.

Thanks very much.

unSpawn 06-27-2006 01:09 PM

iconv? From "apropos iconv": iconv (1) - Convert encoding of given files from one encoding to another

jim mcnamara 06-27-2006 03:03 PM

There is a file, usually /usr/lib/nls/iconv/config.iconv, that lists all of the possible (allowable) conversions.

On my box
Code:

iconv -f utf8 -t iso89 oldfile > newfile
does the job. You need to find config.iconv, then read it to find what your commands are.
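
Encoding names differ between iconv implementations, so a name that works on one box may be rejected on another. As a rough sketch (the helper name `encoding_supported` is made up for illustration), one way to check whether the local iconv accepts a name is to try converting empty input from it:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the local iconv accepts "$1" as a source
# encoding name. It converts empty input, so no files are touched.
encoding_supported() {
    iconv -f "$1" -t UTF-8 </dev/null >/dev/null 2>&1
}

for name in UTF-8 iso89; do
    if encoding_supported "$name"; then
        echo "$name: accepted"
    else
        echo "$name: not accepted here"
    fi
done
```

Most implementations will also print the names they know with `iconv -l`.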

graemef 06-27-2006 10:02 PM

Excellent response, thanks unSpawn & jim mcnamara. iconv is supported in PHP, so I can put together a quick script to locate each file and convert it. I'm a little confused by the encodings that I needed to use, but I've tested it and it works, so I'm relieved.

Once again thanks very much.

jlliagre 06-28-2006 01:38 AM

I'm surprised it worked, as iso89 isn't a known encoding.

Didn't you run "iconv -f utf8 -t ascii oldfile > newfile" instead?

What Unicode characters were causing a problem in your source file?
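
The question about problem characters matters because plain ASCII has no representation for anything outside its 128 characters, so a strict conversion is expected to stop with an error at the first such character. A small sketch, assuming GNU iconv behaviour (other implementations may substitute a placeholder character instead); the function name `to_ascii` is only illustrative:

```shell
#!/bin/sh
# Illustrative helper: strictly convert UTF-8 on stdin to ASCII on stdout.
# The exit status is iconv's own.
to_ascii() {
    iconv -f UTF-8 -t ASCII
}

# Pure ASCII text passes through unchanged.
printf 'plain text\n' | to_ascii

# "cafe" with an acute accent on the e, written as UTF-8 bytes using octal
# escapes. With GNU iconv this is expected to fail, because the accented
# letter has no ASCII equivalent.
if printf 'caf\303\251\n' | to_ascii >/dev/null 2>&1; then
    echo "converted without error"
else
    echo "cannot be represented in ASCII"
fi
```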

graemef 06-28-2006 02:00 AM

The PHP function I've written is as follows:
PHP Code:

function encodeFiles($rootPath)
{
    $directories = scandir($rootPath);
    // Loop through each directory in the rootPath
    foreach ($directories as $dir)
    {
        $fileName = $rootPath."\\".$dir."\\metadata.xml";
        // Check to see if the current directory has a file called metadata.xml
        if (file_exists($fileName))
        {
            // Read in the contents
            $data = file_get_contents($fileName);
            // Just display on the screen the file being modified
            echo "Converting " . $fileName . "...\n";
            // Convert the contents
            $data = iconv("UCS-2","UTF-8",$data);
            // Write back out to the same file
            file_put_contents($fileName,$data);
        }  // end if file exists
    }  // end loop for each directory in the root directory
}  // end of function encodeFiles()

When I open the file in Notepad++ it tells me the encoding is ANSI, which is what I needed, and the program I feed the file into accepts it. So I'm now making progress in the conversion job.

jlliagre 06-28-2006 02:56 AM

Okay, so instead of converting from UTF-8 to ASCII, you are converting from UTF-8 to UTF-16.

Your posting title was misleading: UTF-8 is close to ASCII, while UTF-16 is definitely not.
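
The difference is easy to see at the byte level. A quick sketch, assuming an iconv that knows the names `UTF-8` and `UTF-16LE` and the standard `od` tool; the helper name `hex_bytes` is made up here:

```shell
#!/bin/sh
# Illustrative helper: print stdin as one contiguous string of hex byte values.
hex_bytes() {
    od -An -t x1 | tr -d ' \n'
    echo
}

# ASCII text is already valid UTF-8, so the bytes are identical.
printf 'Hi' | hex_bytes                                   # 4869
printf 'Hi' | iconv -f ASCII -t UTF-8 | hex_bytes         # 4869

# UTF-16LE uses two bytes per character here, so a program expecting
# one byte per character sees an extra zero byte after each letter.
printf 'Hi' | iconv -f ASCII -t UTF-16LE | hex_bytes      # 48006900
```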

graemef 06-28-2006 03:07 AM

My title was describing what I thought I needed to do at the time. The program I'm creating the XML files for is fairly poor on documentation, and I had spent the whole afternoon trying to figure out why they were being rejected: no errors, no log files, not much in the way of assistance.

But I'm now over that hurdle; I wonder what it will throw at me next. :)

kalimat 12-15-2008 04:45 AM

Hi, I have a question:
if I want to write a script that converts every file in a folder, what can I do?

I started like this:

#!/bin/bash
cd folder
n=$(ls | wc -l)
i=1
p=... # path of the file

# and now, for every file in the folder (?)

while test "$i" -le "$n"; do
iconv -f UTF-16 -t ASCII "$p"
((i++))
p=..... # path of the next file
done

How do I make it keep track of the path of each file?
I am waiting for an answer.
Thanks a lot.
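
One possible shape for such a script, offered as a sketch rather than a definitive answer: let the shell glob supply each path in turn instead of counting files. The function name `convert_folder`, its argument order, and the choice to write converted copies into a separate output folder are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert every regular file in folder $1 from
# encoding $3 to encoding $4, writing the results into folder $2.
convert_folder() {
    src=$1 dst=$2 from=$3 to=$4
    mkdir -p "$dst" || return 1
    for p in "$src"/*; do
        # Skip subdirectories and the unexpanded pattern of an empty folder.
        [ -f "$p" ] || continue
        name=$(basename "$p")
        if iconv -f "$from" -t "$to" "$p" > "$dst/$name"; then
            echo "converted $p"
        else
            echo "failed on $p" >&2
            rm -f "$dst/$name"
        fi
    done
}

# Example call (folder names are placeholders):
#   convert_folder folder converted UTF-16 ASCII
```

Writing to a separate folder keeps the originals intact; redirecting iconv's output onto its own input file would truncate that file before it is read.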


All times are GMT -5.