I'm trying to load a CSV with pandas, but I run into a problem when the file name contains accents. It's clearly an encoding problem, but although read_csv lets you set the encoding for the text within the file, I can't figure out how to encode the file name properly.
input_file = r'C:\...\Datasets\%s\Provinces\Points\%s.csv' % (country, province)
self.locs = pandas.read_csv(input_file,sep=',',skipinitialspace=True)
The CSV file is Anzoátegui.csv. When I get the error, input_file is:
input_file = 'C:\\...\Datasets\Venezuela\Provinces\Points\Anzoátegui.csv'
Error message:
OSError: File b'C:\\PF2\\QGIS Valmiera\\Datasets\\Venezuela\\Provinces\\Points\\Anzo\xc3\xa1tegui.csv' does not exist
So maybe it's converting my string to bytes? I tried using io.StringIO(input_file) as well, which puts the correct file name as a column header on an empty DataFrame:
Empty DataFrame
Columns: [C:\PF2\QGIS Valmiera\Datasets\Venezuela\Provinces\Points\Anzoátegui.csv]
Index: []
Any ideas on how to get this file to load? Unfortunately I can't just strip out accents, as I have to interface with software that requires the proper name, and I have a ton of files to format (not just the one). Thanks!
Edit: Full error
Traceback (most recent call last):
File "C:\PF2\eclipse-standard-kepler-SR2-win32-x86_64\eclipse\plugins\org.python.pydev_3.3.3.201401272249\pysrc\pydevd_comm.py", line 891, in doIt
result = pydevd_vars.evaluateExpression(self.thread_id, self.frame_id, self.expression, self.doExec)
File "C:\PF2\eclipse-standard-kepler-SR2-win32-x86_64\eclipse\plugins\org.python.pydev_3.3.3.201401272249\pysrc\pydevd_vars.py", line 486, in evaluateExpression
result = eval(compiled, updated_globals, frame.f_locals)
File "<string>", line 1, in <module>
File "C:\Python33\lib\site-packages\pandas\io\parsers.py", line 404, in parser_f
return _read(filepath_or_buffer, kwds)
File "C:\Python33\lib\site-packages\pandas\io\parsers.py", line 205, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "C:\Python33\lib\site-packages\pandas\io\parsers.py", line 486, in __init__
self._make_engine(self.engine)
File "C:\Python33\lib\site-packages\pandas\io\parsers.py", line 594, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "C:\Python33\lib\site-packages\pandas\io\parsers.py", line 952, in __init__
self._reader = _parser.TextReader(src, **kwds)
File "parser.pyx", line 330, in pandas.parser.TextReader.__cinit__ (pandas\parser.c:3040)
File "parser.pyx", line 557, in pandas.parser.TextReader._setup_parser_source (pandas\parser.c:5387)
OSError: File b'C:\\PF2\\QGIS Valmiera\\Datasets\\Venezuela\\Provinces\\Points\\Anzo\xc3\xa1tegui.csv' does not exist
Ok folks, I got a little lost in dependency hell, but it turns out that this issue was fixed in pandas 0.14.0. Install the updated version to get files named with accents to import correctly.
Thanks for the input!
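For anyone who can't upgrade right away, one workaround that may help is to open the file yourself and pass the open handle to read_csv, so that Python resolves the accented file name rather than the pandas C parser. This is only a sketch: I have not verified it against pandas versions older than 0.14.0, the helper name read_csv_via_handle is made up for illustration, and the demo writes its own throwaway CSV with invented contents instead of using my real dataset. It assumes Python 3, where open() accepts an encoding argument.

```python
import os
import tempfile

import pandas


def read_csv_via_handle(path, encoding="utf-8", **kwargs):
    """Hypothetical helper: let Python open the file, then give pandas the handle."""
    # open() receives the path as a normal str, so the accented name is
    # resolved by Python itself; pandas only ever sees the open text stream.
    with open(path, encoding=encoding, newline="") as handle:
        return pandas.read_csv(handle, **kwargs)


# Demo with a throwaway file whose name contains an accent.
# The column names and values here are invented for the example.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "Anzoátegui.csv")
    with open(demo_path, "w", encoding="utf-8", newline="") as out:
        out.write("x, y\n1, 2\n3, 4\n")
    locs = read_csv_via_handle(demo_path, sep=",", skipinitialspace=True)

# Print the installed version so you can tell whether you already have 0.14.0.
print(pandas.__version__)
print(locs)
```

The encoding argument here applies to the file's contents, not its name, so set it to whatever your CSVs actually use.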