
How to handle TypeErrors when doing vectorized calculations?

I want to avoid crashes when performing vectorized calculations on pandas DataFrames (Python 3.6).

For example, I have a DataFrame with two columns, A and B. I want to create a column C where C = A - B. However, one cell in column A is a string, and this causes a TypeError. Have a look at the picture below.

[Image: example DataFrame]

Column C is the outcome that I want to achieve.

Currently I get a TypeError message:

TypeError: unsupported operand type(s) for -: 'float' and 'str'

which is expected.

This is possible with numpy.select, but the output column then contains mixed values (numbers and strings):

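import numpy as np
import pandas as pd

df = pd.DataFrame({
         'A':[7,8,9,10,5],
         'B':[1,2,3,'str',np.nan],
})

# Non-numeric values in B become NaN instead of raising
b = pd.to_numeric(df['B'], errors='coerce')
# First condition: B was already missing; second: B could not be converted
df['C'] = np.select([df['B'].isna(), b.isna()], [np.nan, 'ERROR'], default=df['A'] - b)
print(df)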
    A    B      C
0   7    1    6.0
1   8    2    6.0
2   9    3    6.0
3  10  str  ERROR
4   5  NaN    nan
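
The np.select call above relies on NumPy finding a common dtype for strings and floats, which is why every value in C ends up as text (note the lowercase nan in the last row). As a rough sketch of an alternative that does not depend on that promotion, the result can be cast to object first and the marker written in afterwards. The names bad and c are only illustrative, and this assumes the same sample data as above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [7, 8, 9, 10, 5],
    'B': [1, 2, 3, 'str', np.nan],
})

b = pd.to_numeric(df['B'], errors='coerce')

# True only where B held a value that could not be parsed as a number
# (missing values in the original B are not counted as errors)
bad = b.isna() & df['B'].notna()

# Cast to object so floats and the marker string can share one column
c = (df['A'] - b).astype(object)
c.loc[bad] = 'ERROR'
df['C'] = c
print(df)
```

Here the numeric results stay real floats and the originally missing row stays a real NaN, but C is still an object column, so later arithmetic on it will hit the same TypeError.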

If the column needs further processing later, the best option is to convert to numeric with to_numeric and simply subtract, so that invalid values become NaN:

# Invalid entries in B are coerced to NaN, so C stays a float column
b = pd.to_numeric(df['B'], errors='coerce')
df['C'] = df['A'] - b
print(df)
    A    B    C
0   7    1  6.0
1   8    2  6.0
2   9    3  6.0
3  10  str  NaN
4   5  NaN  NaN
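
With this approach a bad value and a missing value both show up as NaN in C. If that distinction matters, one possible sketch is to keep C numeric and record the conversion failures in a separate boolean column. The column name B_invalid is made up for illustration and is not part of the original answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [7, 8, 9, 10, 5],
    'B': [1, 2, 3, 'str', np.nan],
})

b = pd.to_numeric(df['B'], errors='coerce')
df['C'] = df['A'] - b

# Hypothetical helper column: True where B was present but not numeric
df['B_invalid'] = b.isna() & df['B'].notna()

print(df)
print(df.dtypes)
```

C remains float64 and can be used in further vectorized calculations, while B_invalid can be used to filter or report the rows that failed to convert.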
