Removing dash strings from a mixed-dtype column in a pandas DataFrame

Question

I have a dataframe in which object (string) values may be mixed with numerical values.

My goal is to convert every value to a plain integer; however, some of the values have - between the digits.

A minimal working example looks like:

import pandas as pd

d = {'API':[float(4433), float(3344), 6666, '6-9-11', '8-0-11', 9990]}
df = pd.DataFrame(d)

I tried:

df['API'] = df['API'].str.replace('-','')

But this leaves me with nan for the numeric entries, because .str.replace only operates on the string values and returns nan for everything else.

The output is:

API

nan
nan
nan
6911
8011
nan

I'd like an output where all types are int.

Is there an easy way to handle just the object types in the Series while leaving the actual numeric values intact? I'm using this technique on large data sets (300,000+ lines), so something like a lambda or Series operations would be preferred over a loop search.

Answer 1

Use df.replace with regex=True:

df = df.replace('-', '', regex=True).astype(int)

    API
0   4433
1   3344
2   6666
3   6911
4   8011
5   9990
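As a minimal sketch of why this works, using the example frame from the question: with regex=True, df.replace applies the pattern only to string values, so the floats and ints pass through unchanged, and astype(int) then converts each element. The names cleaned and result are illustrative only and do not come from the original answer.

```python
import pandas as pd

d = {'API': [float(4433), float(3344), 6666, '6-9-11', '8-0-11', 9990]}
df = pd.DataFrame(d)

# Regex replacement rewrites only the string elements; numeric ones are left alone
cleaned = df.replace('-', '', regex=True)

# Cast the mixed object column to int, element by element
result = cleaned.astype(int)

print(result['API'].tolist())
```

Working on a copy (cleaned) rather than overwriting df makes it easy to compare the column before and after the replacement.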

Answer 2

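Also, you can cast everything to str first. Because astype(str) turns 4433.0 into '4433.0', go through float before converting to int:

df['API'] = df['API'].astype(str).apply(lambda x: x.replace('-', '')).astype(float).astype(int)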
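If the numeric values should not pass through a string conversion at all, one option is to touch only the string elements. This is a minimal sketch under the assumption that the column looks like the example frame from the question; the mask name is_str is hypothetical and not part of either answer.

```python
import pandas as pd

d = {'API': [float(4433), float(3344), 6666, '6-9-11', '8-0-11', 9990]}
df = pd.DataFrame(d)

# Boolean mask marking the elements that are Python strings
is_str = df['API'].map(lambda v: isinstance(v, str))

# Strip dashes from the string elements only; numeric elements are untouched
df.loc[is_str, 'API'] = df.loc[is_str, 'API'].str.replace('-', '', regex=False)

# Convert the whole column to int in one step
df['API'] = df['API'].astype(int)

print(df['API'].tolist())
```

The mask is built with a lambda over the Series rather than an explicit loop over rows, which matches the preference stated in the question.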
