Sounds like it could be an encoding issue.
Running file -i on the questionable file may reveal more info: it prints the MIME type along with the charset it guesses for the file.
Encoding detection is almost a whole field in itself. I've used a few libraries in the past that ranged from poor to fair at it, and none were perfect. (Web site authors are famous for declaring charset="XXX" when they're really using some other encoding.)
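To give an idea of why detection is only ever a guess, here is a rough sketch in Python that just tries a list of candidate encodings and returns the first one that decodes cleanly. The function name and the candidate list are my own assumptions for illustration, not anything from a particular library. A clean decode doesn't prove the guess is right, since several legacy encodings will happily accept the same bytes.

```python
# Candidate encodings to try, in order. This list is an assumption made for
# illustration; replace it with the encodings you actually expect to see.
CANDIDATES = ["utf-8", "shift_jis", "euc_jp", "big5", "euc_kr", "gb18030"]


def guess_encoding(data, candidates=CANDIDATES):
    """Return the first candidate that decodes `data` without error, else None.

    This is a heuristic: success only means the bytes are *valid* in that
    encoding, not that the text was really written in it.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None
```

You'd call it on the raw bytes, e.g. guess_encoding(open("somefile.txt", "rb").read()), where somefile.txt is a placeholder name. Putting utf-8 first is deliberate, since non-UTF-8 data rarely passes as valid UTF-8 by accident.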
The other tool to use is "iconv", a very popular re-encoding tool. You tell it which encoding to read the input as and which to write the output as, using -f (from) and -t (to): iconv -f XYZ -t ABC.
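If you'd rather do the same re-encoding from a script, a minimal Python sketch looks like this. The function names, file paths and encoding names are placeholders I made up; you'd substitute whatever the file really uses.

```python
from pathlib import Path


def reencode(data, src_enc, dst_enc="utf-8"):
    """Decode bytes as `src_enc` and encode them again as `dst_enc`."""
    return data.decode(src_enc).encode(dst_enc)


def reencode_file(src_path, dst_path, src_enc, dst_enc="utf-8"):
    """Re-encode a whole file, working on raw bytes so that line endings
    are passed through unchanged."""
    raw = Path(src_path).read_bytes()
    Path(dst_path).write_bytes(reencode(raw, src_enc, dst_enc))
```

For example, reencode_file("in.txt", "out.txt", "shift_jis") would write a UTF-8 copy of a Shift_JIS file. If the source encoding is wrong you'll either get a UnicodeDecodeError or, worse, silently garbled text, so it's worth opening the output to check.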
The other possibility is that, since you're using CJK, you may already be using a UTF-N encoding, in which case you may just need to install the correct fonts. A quick Google search gave me this: http://kile.sourceforge.net/Documentation/html/cjk.html