Pandas: extract numbers from strings

Question

Given the following data frame:

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': ['1a', np.nan, '10a', '100b', '0b']})
df

    A
0   1a
1   NaN
2   10a
3   100b
4   0b

I'd like to extract the numbers from each cell (where they exist). The desired result is:

    A
0   1
1   NaN
2   10
3   100
4   0

I know it can be done with str.extract, but I'm not sure how.

Answer 1

Give it a regex capture group. Pass expand=False to get a Series back; newer pandas versions return a DataFrame by default:

df.A.str.extract(r'(\d+)', expand=False)

Gives you:

0      1
1    NaN
2     10
3    100
4      0
Name: A, dtype: object
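
Note that the extracted values are strings (dtype: object), not numbers. A minimal sketch of converting them afterwards, assuming the same example frame as in the question; the variable names digits and numbers are made up for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['1a', np.nan, '10a', '100b', '0b']})

# expand=False returns a Series when the pattern has a single capture group
digits = df['A'].str.extract(r'(\d+)', expand=False)

# The captured values are still strings; pd.to_numeric converts them
# and leaves the missing entry as NaN
numbers = pd.to_numeric(digits)
print(numbers)
```

Because of the missing value, the converted column ends up as a floating-point column rather than an integer one.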

Answer 2

To answer @Steven G's question in the comments above, this should work:

df.A.str.extract(r'(^\d*)')
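
The two patterns behave differently when the digits are not at the start of the string. A small sketch of the difference, assuming a made-up value 'a10' that is not in the original data:

```python
import numpy as np
import pandas as pd

# 'a10' is a hypothetical value added to show the difference
s = pd.Series(['1a', np.nan, 'a10', '100b'])

# (\d+) captures the first run of digits anywhere in the string
first_run = s.str.extract(r'(\d+)', expand=False)

# (^\d*) captures only digits at the very start of the string;
# with no leading digits it still matches, capturing an empty string
leading = s.str.extract(r'(^\d*)', expand=False)

print(pd.DataFrame({'first_run': first_run, 'leading': leading}))
```

So the anchored pattern yields an empty string rather than NaN for 'a10', which may need handling before any numeric conversion.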

Answer 3

You can use the assign function to replace your column with the result:

df = df.assign(A=lambda x: x['A'].str.extract(r'(\d+)', expand=False))
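
If integers are wanted rather than strings, the conversion can be folded into the same assign call. This is only a sketch, assuming a pandas version that supports the nullable Int64 dtype; it is one possible approach, not part of the original answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['1a', np.nan, '10a', '100b', '0b']})

# Extract the digits, convert to numbers, then cast to the nullable
# integer dtype so the missing value does not force a float column
df = df.assign(
    A=lambda x: pd.to_numeric(
        x['A'].str.extract(r'(\d+)', expand=False)
    ).astype('Int64')
)
print(df)
```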

Answer 1
67 (accepted) 2016-06-07 15:39:21

Answer 2
5 2017-07-07 00:32:28

Answer 3
2 2020-10-30 00:06:41
