LinuxQuestions.org
10-17-2016, 03:41 AM   #1
FrancisG
LQ Newbie
 
Registered: Aug 2010
Location: West Cork, Ireland
Posts: 16

Rep: Reputation: 0
[SOLVED] Strange Characters reading .csv file in Python 2


I am trying to read a CSV file in Python, but strange characters show up in the output.
This file is downloaded from a Solar Photovoltaic Array System.
The file looks fine in gedit, geany and vim. I can import it into Libreoffice Calc, no problem.
Here are a couple of lines in gedit to show what it should look like:-
Code:
20/09/2016 00:00:00;16901.962;0.000
20/09/2016 00:05:00;16901.962;0.000
However, when I try to read the file in Python using

Code:
_file = open(fl,'rU')
for line in _file:
    print line
I get gaps between each character and extra lines:-
Code:
 2 0 / 0 9 / 2 0 1 6   0 0 : 0 0 : 0 0 ; 1 6 9 0 1 . 9 6 2 ; 0 . 0 0 0

 

 2 0 / 0 9 / 2 0 1 6   0 0 : 0 5 : 0 0 ; 1 6 9 0 1 . 9 6 2 ; 0 . 0 0 0
Without the 'rU' in open() (just 'r'), I get a single odd character printed, followed by a blank line, instead of the actual line, even though I can see the line in the debugger. I am using PyCharm.
In LibreOffice Writer I get:-
#2#0#/#0#9#/#2#0#1#6# #0#0#:#0#0#:#0#0#;#1#6#9#0#1#.#9#6#2#;#0#.#0#0#0##
#2#0#/#0#9#/#2#0#1#6# #0#0#:#0#5#:#0#0#;#1#6#9#0#1#.#9#6#2#;#0#.#0#0#0##
What are all the hashes about? Is this some sort of strange encoding issue? I am using utf-8 encoding at the beginning of my script.
Thanks in advance
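
A minimal sketch of how this symptom can arise, assuming the file is UTF-16 little-endian (an assumption at this point in the thread, not something the post confirms). In that encoding every ASCII character is stored as two bytes, the character followed by a NUL byte, and anything that treats the bytes as single-byte text keeps those NULs between the characters. The sample line is taken from the post; `raw`, `naive` and `decoded` are illustrative names. Written for Python 3.

```python
# Assumption: the file is UTF-16 little-endian.
sample = u"20/09/2016 00:00:00;16901.962;0.000\r\n"

# What would be stored on disk for this line (no byte order mark here).
raw = sample.encode("utf-16-le")

# Each ASCII character becomes two bytes: the character, then a NUL byte.
nul_count = raw.count(b"\x00")

# Treating the bytes as single-byte text keeps the NULs, which terminals
# and editors may render as gaps, '#' or other placeholder glyphs.
naive = raw.decode("latin-1")

# Decoding with the matching codec restores the original line.
decoded = raw.decode("utf-16-le")

print(repr(naive[:8]))
print(nul_count, len(sample))
```

If the real file behaves like this sample, the number of NUL bytes in a line equals the number of visible characters, which is a quick way to tell UTF-16 apart from a single-byte encoding.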

Last edited by FrancisG; 10-18-2016 at 02:05 AM.
 
10-17-2016, 06:59 AM   #2
FrancisG
LQ Newbie
 
Registered: Aug 2010
Location: West Cork, Ireland
Posts: 16

Original Poster
Rep: Reputation: 0
Answering my own post: I think this is definitely an encoding problem. The original .csv file comes from a Windows 10 machine.
Here is the string as shown by repr(line):
Code:
'\\'\\x002\\x000\\x00/\\x000\\x009\\x00/\\x002\\x000\\x001\\x006\\x00 \\x000\\x000\\x00:\\x000\\x000\\x00:\\x000\\x000\\x00;\\x001\\x006\\x009\\x000\\x001\\x00.\\x009\\x006\\x002\\x00;\\x000\\x00.\\x000\\x000\\x000\\x00\\r\\x00\\n\\''
I think this would suggest UTF-16, but I am not sure.
Here is the output from a line
Code:
 2 0 / 0 9 / 2 0 1 6   0 0 : 0 0 : 0 0 ; 1 6 9 0 1 . 9 6 2 ; 0 . 0 0 0
I have tried all sorts of Unicode-related things.
Here are some of them, none of which changed the output
(the variable 'line' is a line from the csv file):
Code:
codecs.encode(unicode(line),'utf-8')
line.encode('utf-8')
Then I tried this function, taken from a good presentation:
Code:
def to_unicode_or_bust(obj, encoding='utf-8'):
    if isinstance(obj, basestring):
        if not isinstance(obj, unicode):
            obj = unicode(obj, encoding)
    return obj
called as:
Code:
to_unicode_or_bust(line)
and output is identical.
Anyone good on codecs?
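
A likely reason the attempts above changed nothing: they all treat the bytes as ASCII or UTF-8, and a NUL byte is valid in both, so the NULs pass straight through. The bytes have to be decoded as UTF-16 instead. Below is a minimal sketch, assuming the file starts with a UTF-16 byte order mark (if it does not, 'utf-16-le' would be the codec to try). The file name `solar_sample.csv` and the field names `value_a` and `value_b` are placeholders, since the thread does not say what the two numeric columns mean. Written for Python 3; `io.open` accepts the same `encoding` argument in Python 2.

```python
import io
import os
import tempfile

# Hypothetical stand-in for the exported file: two lines from the post,
# written as UTF-16 with a BOM and Windows line endings (an assumption).
text = (u"20/09/2016 00:00:00;16901.962;0.000\r\n"
        u"20/09/2016 00:05:00;16901.962;0.000\r\n")
path = os.path.join(tempfile.mkdtemp(), "solar_sample.csv")
with io.open(path, "wb") as f:
    f.write(text.encode("utf-16"))  # the 'utf-16' codec prepends a BOM

rows = []
# Text mode with an explicit encoding decodes while reading; the default
# newline handling also turns '\r\n' into '\n'.
with io.open(path, "r", encoding="utf-16") as f:
    for line in f:
        # Placeholder names: the column meanings are not given in the thread.
        timestamp, value_a, value_b = line.rstrip("\n").split(";")
        rows.append((timestamp, float(value_a), float(value_b)))

print(rows)
```

Decoding at the point where the file is opened keeps the rest of the script working with ordinary text, so no per-line encode or decode calls are needed.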
 
10-17-2016, 07:03 AM   #3
schneidz
LQ Guru
 
Registered: May 2005
Location: boston, usa
Distribution: fedora-30
Posts: 5,290

Rep: Reputation: 916
dos2unix ?
 
10-17-2016, 08:00 AM   #4
FrancisG
LQ Newbie
 
Registered: Aug 2010
Location: West Cork, Ireland
Posts: 16

Original Poster
Rep: Reputation: 0
Perfect!!! Works a treat
Thank you so much, saves a load of faffing about.
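
The thread does not say why dos2unix fixed the problem. One plausible explanation (an assumption, not confirmed here) is that the installed dos2unix recognised a UTF-16 byte order mark and rewrote the file in the locale's encoding with Unix line endings. For anyone who wants the same result from inside Python, here is a rough sketch of that conversion; `utf16_to_utf8` and the file names are hypothetical, and the input is assumed to start with a BOM. Written for Python 3.

```python
import io
import os
import tempfile

def utf16_to_utf8(src, dst):
    """Rewrite a UTF-16 file (with BOM) as UTF-8 with Unix line endings."""
    # Default newline handling on read translates '\r\n' to '\n';
    # newline='\n' on write keeps '\n' unchanged on every platform.
    with io.open(src, "r", encoding="utf-16") as fin:
        with io.open(dst, "w", encoding="utf-8", newline="\n") as fout:
            for line in fin:
                fout.write(line)

# Demo on a temporary file built from a line in the post.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "export_utf16.csv")
dst = os.path.join(tmp, "export_utf8.csv")
with io.open(src, "wb") as f:
    f.write(u"20/09/2016 00:00:00;16901.962;0.000\r\n".encode("utf-16"))

utf16_to_utf8(src, dst)
with io.open(dst, "rb") as f:
    converted = f.read()
print(repr(converted))
```

After a conversion like this, the original loop from the first post should see plain single-byte text with no NUL bytes in it.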
 
  


Tags
csv, encoding, python


All times are GMT -5.