
Converting the data type of a column in a CSV file

I am trying to change the data type of a column in Python (running in PyCharm) using the NumPy and pandas libraries, but I get the following error.

dataset.fillna(1e6).astype(int)

D:\Softwares\Python3.6.1\python.exe D:/PythonPractice/DataPreprocessing/DataPreprocessing_1.py
   Country   Age   Salary Purchased
0   France  44.0  72000.0        No
1    Spain  27.0  48000.0       Yes
2  Germany  30.0  54000.0        No
3    Spain  38.0  61000.0        No
4  Germany  40.0      NaN       Yes
5   France  35.0  58000.0       Yes
6    Spain   NaN  52000.0        No
7   France  48.0  79000.0       Yes
8  Germany  50.0  83000.0        No
9   France  37.0  67000.0       Yes
Traceback (most recent call last):
  File "D:/PythonPractice/DataPreprocessing/DataPreprocessing_1.py", line 6, in <module>
    dataset.fillna(1e6).astype(int)
  File "D:\Softwares\Python3.6.1\lib\site-packages\pandas\util\_decorators.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "D:\Softwares\Python3.6.1\lib\site-packages\pandas\core\generic.py", line 3299, in astype
    **kwargs)
  File "D:\Softwares\Python3.6.1\lib\site-packages\pandas\core\internals.py", line 3224, in astype
    return self.apply('astype', dtype=dtype, **kwargs)
  File "D:\Softwares\Python3.6.1\lib\site-packages\pandas\core\internals.py", line 3091, in apply
    applied = getattr(b, f)(**kwargs)
  File "D:\Softwares\Python3.6.1\lib\site-packages\pandas\core\internals.py", line 471, in astype
    **kwargs)
  File "D:\Softwares\Python3.6.1\lib\site-packages\pandas\core\internals.py", line 521, in _astype
    values = astype_nansafe(values.ravel(), dtype, copy=True)
  File "D:\Softwares\Python3.6.1\lib\site-packages\pandas\core\dtypes\cast.py", line 625, in astype_nansafe
    return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
  File "pandas\_libs\lib.pyx", line 917, in pandas._libs.lib.astype_intsafe (pandas\_libs\lib.c:16260)
  File "pandas\_libs\src\util.pxd", line 93, in util.set_value_at_unsafe (pandas\_libs\lib.c:73093)
ValueError: invalid literal for int() with base 10: 'France'

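Your error message, ValueError: invalid literal for int() with base 10: 'France', indicates that the conversion is being applied to the Country column, whose contents are strings and cannot be converted to integers. Calling astype(int) on the whole DataFrame tries to convert every column, so restrict the conversion to the columns you actually want to change.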
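One way to restrict the conversion is sketched below. It rebuilds the sample data from the printed output in the question (the real script presumably reads it from a CSV file, which is not shown) and assumes that Age and Salary are the columns meant to become integers. The fill value 1e6 is the one used in the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the DataFrame printed in the question.
dataset = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain", "Germany",
                "France", "Spain", "France", "Germany", "France"],
    "Age": [44, 27, 30, 38, 40, 35, np.nan, 48, 50, 37],
    "Salary": [72000, 48000, 54000, 61000, np.nan,
               58000, 52000, 79000, 83000, 67000],
    "Purchased": ["No", "Yes", "No", "No", "Yes",
                  "Yes", "No", "Yes", "No", "Yes"],
})

# Assumption: only these two columns should be converted.
fill_values = {"Age": 1e6, "Salary": 1e6}
target_types = {"Age": int, "Salary": int}

# Passing dicts limits fillna() and astype() to the named columns,
# so the string columns Country and Purchased are left untouched.
converted = dataset.fillna(fill_values).astype(target_types)

print(converted.dtypes)
```

Because the string columns are never passed to int(), the ValueError from the question should not occur here.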

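You cannot convert 'France' to an integer. If you want integer codes for the countries, map the strings to numbers first:

    dataset['Country'] = dataset['Country'].map({'France': 0, 'Spain': 1, 'Germany': 2})

Then: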

    dataset['Country'] = dataset['Country'].astype(int)
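Put together, the two steps look roughly like this. It is a minimal sketch on a four-row sample; the `codes` dictionary name is introduced here for illustration, and the code values 0, 1, 2 are the ones suggested above.

```python
import pandas as pd

# Small sample with the three country names that appear in the question.
dataset = pd.DataFrame({"Country": ["France", "Spain", "Germany", "Spain"]})

# Mapping from country name to integer code (values from the answer above).
codes = {"France": 0, "Spain": 1, "Germany": 2}

# map() replaces each string with its code; astype(int) then succeeds
# because every value in the column is now numeric.
dataset["Country"] = dataset["Country"].map(codes).astype(int)

print(dataset["Country"].tolist())  # [0, 1, 2, 1]
```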

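If you still get an error like this:

ValueError: Cannot convert non-finite values (NA or inf) to integer

it is because dataset['Country'] contains some NaN values. This happens when the column already had missing entries, or when it holds a value that is not a key in the mapping.

Handle these NaNs with fillna(), dropna(), or a similar method before calling astype(int), and the error will go away.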
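The following sketch shows how such a NaN can arise and two ways of dealing with it. The value "Italy" and the placeholder code -1 are hypothetical, chosen only to illustrate a name that is missing from the mapping.

```python
import pandas as pd

# "Italy" is a hypothetical value that has no entry in the mapping.
country = pd.Series(["France", "Italy", "Spain"])
codes = {"France": 0, "Spain": 1, "Germany": 2}

# Values without a matching key become NaN, so the result is float.
mapped = country.map(codes)

# Option 1: replace NaN with a placeholder code, then convert.
filled = mapped.fillna(-1).astype(int)

# Option 2: drop the rows with NaN, then convert.
dropped = mapped.dropna().astype(int)

print(filled.tolist())   # [0, -1, 1]
print(dropped.tolist())  # [0, 1]
```

Which option fits depends on whether rows with an unknown country should be kept in the data.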

