TypeError: data type not understood while parsing CSV with Pandas
When parsing a CSV file with Pandas, columns that contain datetime values are assigned the dtype object by default.
How do I make sure the first column has the correct type assigned in this example?
import pandas as pd
import datetime as datetime
data = pd.read_csv("scans.csv")
# dtypes = {
# 'date': datetime,
# 'muscle': str,
# 'side': str,
# 'MQ(0-100)': float,
# 'MQ(raw)': int,
# 'fat': float
# }
# data = pd.read_csv("scans.csv", dtype=dtypes)
print(data.head())
print(data.dtypes)
Here is the console output:
               date      muscle side  MQ(0-100)  MQ(raw)   fat
0  12/16/2018 16:08      glutes    R       99.7      154   8.6
1  12/16/2018 16:08       total    R       81.8      129  17.0
2  12/16/2018 16:04      glutes    L       98.1      140  10.8
3  12/16/2018 16:03  upper_back    R       70.2      132  11.6
4  12/16/2018 16:02  upper_back    L       77.8      136  11.4
date object
muscle object
side object
MQ(0-100) float64
MQ(raw) int64
fat float64
dtype: object
Error when running the full code:
/Users/Developer/PycharmProjects/Sculpt/venv/bin/python /Users/Developer/PycharmProjects/Sculpt/script.py
Traceback (most recent call last):
File "/Users/Developer/PycharmProjects/Sculpt/venv/lib/python3.8/site-packages/pandas/core/dtypes/common.py", line 2050, in pandas_dtype
npdtype = np.dtype(dtype)
TypeError: data type not understood
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/Developer/PycharmProjects/Sculpt/script.py", line 18, in <module>
data = pd.read_csv("scans.csv", dtype=dtypes)
File "/Users/Developer/PycharmProjects/Sculpt/venv/lib/python3.8/site-packages/pandas/io/parsers.py", line 685, in parser_f
return _read(filepath_or_buffer, kwds)
File "/Users/Developer/PycharmProjects/Sculpt/venv/lib/python3.8/site-packages/pandas/io/parsers.py", line 457, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/Users/Developer/PycharmProjects/Sculpt/venv/lib/python3.8/site-packages/pandas/io/parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "/Users/Developer/PycharmProjects/Sculpt/venv/lib/python3.8/site-packages/pandas/io/parsers.py", line 1135, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/Users/Developer/PycharmProjects/Sculpt/venv/lib/python3.8/site-packages/pandas/io/parsers.py", line 1917, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 490, in pandas._libs.parsers.TextReader.__cinit__
File "/Users/Developer/PycharmProjects/Sculpt/venv/lib/python3.8/site-packages/pandas/core/dtypes/common.py", line 2054, in pandas_dtype
raise TypeError("data type not understood")
TypeError: data type not understood
Here is a one-liner solution:
data = pd.read_csv("scans.csv", parse_dates=['date'])
This now gives the expected result:
date datetime64[ns]
muscle object
side object
MQ(0-100) float64
MQ(raw) int64
fat float64
dtype: object
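As for why the original attempt failed: the dtype argument expects NumPy or pandas dtypes, and the question's mapping passed the datetime module itself, which NumPy cannot interpret as a dtype. A minimal sketch of keeping the dtype mapping for the other columns while handing the date column to parse_dates could look like the following. The inline csv_text is a stand-in for scans.csv and assumes the file is comma-separated; the rows are copied from the console output in the question.

```python
import io

import pandas as pd

# Inline stand-in for scans.csv (assumed comma-separated; rows taken from
# the console output shown in the question).
csv_text = """date,muscle,side,MQ(0-100),MQ(raw),fat
12/16/2018 16:08,glutes,R,99.7,154,8.6
12/16/2018 16:08,total,R,81.8,129,17.0
12/16/2018 16:04,glutes,L,98.1,140,10.8
"""

# The dtype mapping lists only the non-date columns.
# The date column is deliberately left out and goes to parse_dates instead.
dtypes = {
    "muscle": str,
    "side": str,
    "MQ(0-100)": float,
    "MQ(raw)": int,
    "fat": float,
}

data = pd.read_csv(io.StringIO(csv_text), dtype=dtypes, parse_dates=["date"])
print(data.dtypes)
```

With a real file, io.StringIO(csv_text) would be replaced by the path "scans.csv".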
The "muscle" and "side" columns already have the correct dtype: Pandas stores strings as object dtype. This is described in the pandas documentation.
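If the file has already been loaded, or the date format needs to be spelled out explicitly, the column can also be converted afterwards with pd.to_datetime. This is an alternative to the one-liner, not part of the original answer. The format string below is an assumption inferred from the sample rows (month/day/year, 24-hour time), and csv_text again stands in for scans.csv.

```python
import io

import pandas as pd

# Inline stand-in for scans.csv (assumed comma-separated; rows taken from
# the console output shown in the question).
csv_text = """date,muscle,side,MQ(0-100),MQ(raw),fat
12/16/2018 16:08,glutes,R,99.7,154,8.6
12/16/2018 16:08,total,R,81.8,129,17.0
12/16/2018 16:04,glutes,L,98.1,140,10.8
"""

# Read without any date handling; the date column comes back as plain strings.
data = pd.read_csv(io.StringIO(csv_text))

# Convert the column explicitly. The format is an assumption based on the
# sample values, e.g. "12/16/2018 16:08".
data["date"] = pd.to_datetime(data["date"], format="%m/%d/%Y %H:%M")

print(data["date"].dtype)
print(data["date"].dt.minute.tolist())
```

Once the column is a datetime type, the .dt accessor (for example .dt.hour or .dt.minute) becomes available for filtering and grouping.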