04-02-2006, 09:39 PM   #1
Registered: Dec 2004
Location: Capitola, CA
Distribution: Debian
Posts: 51

Rep: Reputation: 15
determine encoding type of a file (ie - UTF-8)

I've tried several methods, including "file -i file.html" and "stat file.html", but neither tells me the encoding of the file.

I have <?xml version="1.0" encoding="UTF-8"?> in the head of my xhtml file, but how do I know it is really UTF-8?
04-03-2006, 01:46 AM   #2
Senior Member
Registered: Jun 2004
Posts: 2,553

Rep: Reputation: 51
Detecting this reliably is really hard.
Remember that a file which is 100% ASCII at the byte level but declares itself UTF-8 is a valid UTF-8 file, because UTF-8 is a superset of ASCII (plain English text, for instance, comes out byte-for-byte the same).
Files and strings that contain only 7-bit ASCII characters have the same encoding under both ASCII and UTF-8. That is, these files are both ASCII files and UTF-8 files.
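
To make the overlap concrete, here is a small sketch in Python (not something used in this thread; the sample strings are made up for illustration). It encodes the same English text both ways and compares the bytes, then shows that Hebrew text produces bytes with the high bit set, which is the only thing a tool can latch onto.

```python
# Sample strings are assumptions for illustration, not taken from the thread.
english = "plain english text"

ascii_bytes = english.encode("ascii")
utf8_bytes = english.encode("utf-8")

# Pure 7-bit text: both encodings produce identical bytes.
same_bytes = ascii_bytes == utf8_bytes

# Every byte is below 0x80, so nothing marks the data as UTF-8 rather than ASCII.
all_seven_bit = all(b < 0x80 for b in utf8_bytes)

# Four Hebrew letters, written as escapes to keep this source file ASCII-only.
hebrew = "\u05e9\u05dc\u05d5\u05dd"
hebrew_bytes = hebrew.encode("utf-8")

# Each of these letters becomes a two-byte sequence with the high bit set.
has_high_bytes = any(b >= 0x80 for b in hebrew_bytes)

print(same_bytes, all_seven_bit, has_high_bytes, len(hebrew_bytes))
```

So for an all-English file there is simply no byte-level evidence either way; the declaration in the file is the only thing that says "UTF-8".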
"file" is the Linux utility that tells you encoding
If I save a file as UTF-8 with only English text in it and run
(gary) ~/test $ file utf8.txt
I get as output
utf8.txt: ASCII text, with no line terminators
But if I save a file with Hebrew text as UTF-8, file says
(gary) ~/test $ file utf8.txt
utf8.txt: UTF-8 Unicode text, with no line terminators
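
If you want to check the bytes yourself, a rough sketch along these lines may help. This is my own approximation in Python, not how "file" works internally; the function name `classify` and its three labels are made up for this example. It answers "ASCII" when every byte is 7-bit, "UTF-8" when the bytes decode cleanly as UTF-8, and "not UTF-8" otherwise.

```python
def classify(data: bytes) -> str:
    """Rough three-way guess; hypothetical helper, not the logic of file(1)."""
    # Only 7-bit bytes: valid as ASCII and as UTF-8 at the same time.
    # (Empty input also lands here.)
    if all(b < 0x80 for b in data):
        return "ASCII"
    try:
        # bytes.decode is strict by default and raises on malformed sequences.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not UTF-8"
    return "UTF-8"


english_guess = classify(b"hello world")
hebrew_guess = classify("\u05e9\u05dc\u05d5\u05dd".encode("utf-8"))
# "ete" with acute accents saved as Latin-1: 0xE9 is not followed by
# continuation bytes, so it is malformed as UTF-8.
latin1_guess = classify(b"\xe9t\xe9")

print(english_guess, hebrew_guess, latin1_guess)
```

For a real file you would pass in `open("file.html", "rb").read()`. A clean decode does not prove the author meant UTF-8, but text in a legacy 8-bit encoding rarely happens to form valid UTF-8 sequences, so in practice it is a fairly strong hint.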

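Sometimes I see people talk about byte order marks (prefix bytes) for Unicode encodings. You can see these on Linux for UTF-16 using a hex editor, but I have never seen one for UTF-8 (UTF-8 does allow one, the bytes EF BB BF, but it is optional).
A byte order mark is not necessary in an XML file at all, but an XML file that starts with its declaration has a leading less-than sign, and the way that less-than sign is stored can give away the encoding.
But again, it is stored the same way (the single byte 3C) in ASCII and UTF-8, so it cannot tell those two apart.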
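
Here is a sketch of that idea in Python, loosely following the autodetection notes in the XML specification. The function name `sniff_xml_start` and its labels are made up for this example, and UTF-32 is left out to keep it short (so a UTF-32 little-endian BOM would be misreported as UTF-16 here).

```python
import codecs


def sniff_xml_start(data: bytes) -> str:
    """Guess an encoding family from the first bytes; hypothetical helper."""
    # Byte order marks first.
    if data.startswith(codecs.BOM_UTF8):        # EF BB BF
        return "utf-8 with BOM"
    if data.startswith(codecs.BOM_UTF16_BE):    # FE FF
        return "utf-16-be with BOM"
    if data.startswith(codecs.BOM_UTF16_LE):    # FF FE
        return "utf-16-le with BOM"
    # No BOM: look at how the leading "<?" is stored.
    if data.startswith(b"\x00<\x00?"):
        return "utf-16-be without BOM"
    if data.startswith(b"<\x00?\x00"):
        return "utf-16-le without BOM"
    if data.startswith(b"<?xm"):
        # One byte per character: ASCII, UTF-8, Latin-1 and friends all look
        # identical here, so the declaration text has to be trusted or checked.
        return "ascii-compatible"
    return "unknown"


decl = '<?xml version="1.0" encoding="UTF-8"?>'

as_utf8 = sniff_xml_start(decl.encode("utf-8"))
as_ascii = sniff_xml_start(decl.encode("ascii"))
as_utf16_be = sniff_xml_start(decl.encode("utf-16-be"))
as_utf16_le_bom = sniff_xml_start(codecs.BOM_UTF16_LE + decl.encode("utf-16-le"))

print(as_utf8, as_ascii, as_utf16_be, as_utf16_le_bom)
```

The first two results come out the same, which is the point made above: the less-than sign separates UTF-16 from the one-byte-per-character encodings, but not ASCII from UTF-8.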

(I was just playing with the encoding on my keyboard, so I hope this post is still readable English.)


All times are GMT -5. The time now is 09:22 AM.
