CSV using '-' as NULL: error converting column to int
I have a CSV:
import pandas as pd
df = pd.read_csv('data.csv')
Table:

| Column A | Column B | Column C |
|---|---|---|
| 4068744 | -1472525 | 2596219 |
| 198366 | - | - |

The file uses '-' for null values.
I tried converting to int without handling the '-'.
My question is: how do I strip the string '-' without changing the negative values?
df['Column B'] = df['Column B'].astype(int)
ValueError: invalid literal for int() with base 10: '-'
Newer versions of pandas have nullable integer dtypes (such as Int64) that can hold missing values. The ordinary int conversion does not support null values.
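To illustrate the difference, here is a minimal sketch on a made-up Series (the values are for illustration only, not read from the CSV). Casting to plain `int` fails when a missing value is present, while the nullable `Int64` dtype keeps the gap as `<NA>`.

```python
import numpy as np
import pandas as pd

# Illustrative data only: a float column with one missing value.
s = pd.Series([-1472525.0, np.nan])

try:
    s.astype(int)  # NumPy-backed int64 cannot represent a missing value
    plain_int_ok = True
except (ValueError, TypeError) as exc:
    plain_int_ok = False
    print("plain int failed:", exc)

# The nullable integer dtype stores the missing entry as <NA>.
nullable = s.astype("Int64")
print(nullable)
```

Note that the dtype name is the string 'Int64' with a capital I, which is distinct from NumPy's 'int64'.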
# replace '-' with a null value
df.replace('-', pd.NA, inplace=True)
# then cast with the nullable dtype 'Int64' (capital I, given as a string)
df['Column B'] = df['Column B'].astype('Int64')
Output:
> df
Column A Column B Column C
0 4068744 -1472525 2596219
1 198366 <NA> <NA>
> df['Column B'].info
Name: Column B, dtype: Int64>
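As an alternative, the placeholder can be handled at load time. The sketch below assumes the same three-column layout as the table in the question and uses an inline string as a stand-in for data.csv. It relies on the `na_values` parameter of `read_csv`, which matches whole cells, so a bare '-' is read as missing while negative numbers such as -1472525 are left alone.

```python
import io
import pandas as pd

# Stand-in for data.csv (assumed layout, matching the table in the question).
csv_text = """Column A,Column B,Column C
4068744,-1472525,2596219
198366,-,-
"""

# Treat a cell that is exactly '-' as missing; '-1472525' is not affected.
df = pd.read_csv(io.StringIO(csv_text), na_values="-")

# Columns containing missing values are read as float64,
# so cast them to the nullable integer dtype afterwards.
df = df.astype({"Column B": "Int64", "Column C": "Int64"})

print(df)
print(df.dtypes)
```

With a real file, `io.StringIO(csv_text)` would be replaced by the file path. This avoids a separate `replace` step, at the cost of naming the affected columns explicitly in the cast.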