LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Old 12-06-2014, 01:00 PM   #1
lonesoac0
Member
 
Registered: Jan 2010
Distribution: Ubuntu
Posts: 94

Rep: Reputation: 4
Strange characters in a file


Hello all,

I recently downloaded a dataset from http://www.omdbapi.com/. In the data, some characters show up as a question mark on a black background, as if certain characters cannot be read. I get this result when I run the command more FILE_NAME.txt. I have also attached a screenshot.
Attached Thumbnails
Name: error code.PNG (8.9 KB)
 
Old 12-06-2014, 02:42 PM   #2
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 2,404

Rep: Reputation: Disabled
The text file was created or saved on a system using a different character encoding from the one your computer or application is using.

Only US ASCII codes are reasonably universal; these cover the letters A-Z and a-z, the digits, basic punctuation, and a small selection of special characters such as the dollar, hash and percent signs, plus some very basic mathematical symbols. All other characters are considered "special", and various encoding schemes exist to handle the different "extended" character sets.

If there's a mismatch between the encoding schemes used by a sender and a recipient of data, any "extended" codes may be interpreted incorrectly. In your case, accented characters aren't displayed properly. This is a very common problem with "pure" text files, since they lack any sort of header that identifies the character set encoding scheme being used.
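To make the mismatch concrete, here is a small Python sketch. The sample word and the two encodings (ISO-8859-1 and UTF-8) are my own picks for illustration; nothing in the thread says which encodings are actually involved in your file.

```python
# Illustrative only: the sample text and both encodings are assumptions.
text = "Amélie"

# Sender saves the text as ISO-8859-1 (Latin-1): "é" becomes the single byte 0xE9.
latin1_bytes = text.encode("latin-1")

# Recipient assumes UTF-8. A lone 0xE9 is not valid UTF-8, so the decoder
# substitutes U+FFFD, which many terminals draw as a "?" in a dark box.
garbled = latin1_bytes.decode("utf-8", errors="replace")

# The opposite mismatch: UTF-8 bytes read as Latin-1 give two wrong
# characters per accented letter instead of a replacement mark.
mojibake = text.encode("utf-8").decode("latin-1")

print(latin1_bytes)
print(garbled)
print(mojibake)
```

The bytes in the file never change in either case; only the interpretation does, which is why converting the file (or telling the viewer the right encoding) fixes the display.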

If you can figure out which encoding scheme was used to create the file, you can convert it to the encoding scheme you're using with the iconv command.
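On the command line that would be something like iconv -f ISO-8859-1 -t UTF-8 FILE_NAME.txt > FILE_NAME.utf8.txt, where ISO-8859-1 is only a guess at the source encoding and the output file name is made up. The same conversion can be sketched in Python; the helper name convert_to_utf8 and the demo file contents below are hypothetical.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="iso-8859-1"):
    """Rewrite src as UTF-8 in dst, assuming src is in source_encoding.

    The default encoding is a guess, not something known about the file.
    """
    raw = Path(src).read_bytes()
    # Latin-1 accepts every byte value, so a wrong guess here will not raise;
    # stricter encodings such as UTF-8 raise UnicodeDecodeError on bad input.
    text = raw.decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
    return text


# Demo with a throwaway file standing in for FILE_NAME.txt.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "FILE_NAME.txt"
    dst = Path(tmp) / "FILE_NAME.utf8.txt"
    src.write_bytes(b"Am\xe9lie\n")  # "Amélie" stored as Latin-1 bytes
    convert_to_utf8(src, dst, "iso-8859-1")
    converted = dst.read_bytes()

print(converted)
```

If the result still looks wrong after conversion, the guessed source encoding was probably not the right one, and it is worth trying another candidate.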
 
Old 12-06-2014, 04:24 PM   #3
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,604

Rep: Reputation: 1241
As above...

but in addition, this is common when the data originates on a Windows system. Microsoft software tends to generate/use some not quite standard character sets. In at least one instance such screwups involved having a parity bit set on the apostrophe character... thus showing up as a ? instead.

In your specific case, it does look a bit more like just a different character font, but it could just be some Windows software with the not-quite-standard characters.
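One common example of a Windows-specific character is the curly apostrophe, which Windows-1252 stores as the byte 0x92. This is only an illustration of the kind of mismatch described above; whether the file in question actually contains such bytes is an assumption. A quick Python sketch:

```python
# Illustrative bytes only: "It's" with a Windows-1252 curly apostrophe (0x92).
raw = b"It\x92s"

# Read with the encoding it was written in, the apostrophe comes out intact.
as_cp1252 = raw.decode("cp1252")

# Read as UTF-8, a lone 0x92 is invalid and is replaced with U+FFFD,
# which is typically rendered as a question mark symbol.
as_utf8 = raw.decode("utf-8", errors="replace")

print(as_cp1252)
print(as_utf8)
```

If the odd characters in the file sit where you would expect quotes, apostrophes or dashes, Windows-1252 is a reasonable first encoding to try.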
 
  


All times are GMT -5.
