Extract numbers from a pandas DataFrame column under 2 conditions

Question

I have a DataFrame with one column in Python, and I need to extract the 2 numbers that come after a "#" character or the string "door". For example:

String_column
#12-123,456
mom101, door 101, pop10

I only want the 2 numbers that come after the # sign or the word door. How would I go about doing this? This is what I currently have, but I think it only picks up the numbers that come after the # character:

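import pandas as pd

# The file name must be passed as a string
df = pd.read_csv('data.csv')
# Column name matches the sample header (String_column)
df['qwerty'] = df['String_column'].str.extract(
    r'(?<=#)(\d+)', expand=False
).fillna(0).astype(int)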
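
For reference, here is a minimal sketch of what this lookbehind pattern does on the two sample rows. The inline DataFrame (used instead of data.csv) and the pd.to_numeric conversion are assumptions made for illustration; they are not part of the original code.

```python
import pandas as pd

# Sample rows from the question, built inline instead of reading data.csv
df = pd.DataFrame({"String_column": ["#12-123,456", "mom101, door 101, pop10"]})

# Lookbehind: match digits only when they directly follow '#'
extracted = df["String_column"].str.extract(r"(?<=#)(\d+)", expand=False)

# Rows without a match come back as NaN, so they end up as 0
df["qwerty"] = pd.to_numeric(extracted, errors="coerce").fillna(0).astype(int)

print(df)
```

The second row ends up as 0 because "door 101" is not preceded by "#", which confirms that this pattern covers only the first condition.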

Answer 1

You can use df.loc combined with apply, which selects all the rows for which the condition is true.

Here is a simple example:

In [5]: df = pd.DataFrame({'String_column': ['not useful', 'door useful1', '! useful 2', 'not useful']})

In [6]: df
Out[6]: 
  String_column
0    not useful
1  door useful1
2    ! useful 2
3    not useful

Now apply the filter:

In [7]: df.loc[df['String_column'].apply(lambda x: x.startswith(('!', 'door')))]
Out[7]: 
  String_column
1  door useful1
2    ! useful 2

We used startswith to match both conditions and keep the rows that start with '!' or 'door'. Note that this selects whole rows whose text begins with one of the markers; it does not extract the digits themselves.
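
A startswith test only matches a marker at the very beginning of the string, while the question's second sample row has "door" in the middle. As a hedged sketch, the same boolean-mask idea can be written with str.contains so the marker may appear anywhere. The third row and the regular expression here are illustrative assumptions, not part of the answer above.

```python
import pandas as pd

# First two rows are the question's samples; the third is a hypothetical non-matching row
df = pd.DataFrame(
    {"String_column": ["#12-123,456", "mom101, door 101, pop10", "no marker 7"]}
)

# True where '#' or 'door' (optionally followed by spaces) is followed by a digit
mask = df["String_column"].str.contains(r"(?:#|door\s*)\d", regex=True)

# Keep only the rows where the mask is True
matching = df.loc[mask]
print(matching)
```

This still only filters rows; pulling out the digits needs a capturing group with str.extract.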

Answer 2

IIUC, you can use a non-capturing group to list your different options (# or door\s*):

df['num'] = (df['String_column'].str.extract(r'(?:#|door\s*)(\d+)', expand=False)
             .fillna(0).astype(int)
            )

Output:

             String_column  num
0              #12-123,456   12
1  mom101, door 101, pop10  101

regex demo
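
As a self-contained sketch of the non-capturing-group approach, the snippet below builds the sample frame inline and also shows one way to collect every match per row rather than only the first. The third row, the all_nums column, and the use of pd.to_numeric and str.findall are assumptions added for illustration.

```python
import pandas as pd

# First two rows are the question's samples; the third is a hypothetical row with two markers
df = pd.DataFrame(
    {"String_column": ["#12-123,456", "mom101, door 101, pop10", "#7 and door 42"]}
)

# Either '#' or 'door' plus optional whitespace, then capture the digits
pattern = r"(?:#|door\s*)(\d+)"

# First match per row; rows without a match would become 0
first = df["String_column"].str.extract(pattern, expand=False)
df["num"] = pd.to_numeric(first, errors="coerce").fillna(0).astype(int)

# Every match per row, converted to a list of ints
found = df["String_column"].str.findall(pattern)
df["all_nums"] = found.apply(lambda xs: [int(x) for x in xs])

print(df)
```

str.extract returns only the first match in each row, so a row containing both markers needs str.findall (or str.extractall) if all of the numbers are wanted.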

Answer 1
0 2022-09-16 07:51:57

Answer 2
0 2022-09-16 07:52:04
