Alter number string in pandas

Question

Background

I have the following sample df, which is a variation of the one in "Alter number string in pandas column":

import pandas as pd
df = pd.DataFrame({'Text': ['Jon J Smith  Record #:  0000004 is this ',
                            'Record #:  0000003 Mary Lisa Hider found here',
                            'Jane A Doe is also here Record #:  0000002',
                            'Record #:  0000001'],
                   'P_ID': [1, 2, 3, 4],
                   'N_ID': ['A1', 'A2', 'A3', 'A4']})

#rearrange columns
df = df[['Text','N_ID', 'P_ID']]
df

    Text                                        N_ID  P_ID
0   Jon J Smith Record #: 0000004 is this       A1    1
1   Record #: 0000003 Mary Lisa Hider fou...    A2    2
2   Jane A Doe is also here Record #: 000...    A3    3
3   Record #: 0000001                           A4    4

Goal

1) Replace the number after Record #: with **BLOCK**

Jon J Smith Record #: 0000004 is this
Jon J Smith Record #: **BLOCK** is this

2) Create a new column

Desired Output

    Text    N_ID    P_ID    New_Text
0                           Jon J Smith Record #: **BLOCK** is this
1                           Record #: **BLOCK**  Mary Lisa Hider fou...
2                           Jane A Doe is also here Record #: **BLOCK**
3                           Record #: **BLOCK**

Tried

I have tried the following, but it is not quite right:

df['New_Text']= df['Text'].replace(r'(?i)record\s+#: \d+', r"Date of Birth: **BLOCK**", regex=True)

Question

How do I alter my code to get my desired output?

Answer 1

Your pattern matches exactly one space after the colon, but the sample text has two. Change it to \s+ (or to a space followed by + if only spaces can occur there), and wrap the first part in a capturing group so it can be kept in the replacement.

(?i)(record\s+#:\s+)\d+

In the replacement, use

\1**BLOCK**

The final code looks like this:

df['New_Text'] = df['Text'].replace(r'(?i)(record\s+#:\s+)\d+', r"\1**BLOCK**", regex=True)
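To check the capturing-group approach against the sample data from the question, here is a minimal self-contained sketch. It assumes the label in the text is "Record #:" (as in the sample df) and uses Series.str.replace, which should behave the same as Series.replace with regex=True here; the variable name pattern is only for illustration.

```python
import pandas as pd

# Sample data copied from the question (only the Text column is needed here).
df = pd.DataFrame({
    'Text': ['Jon J Smith  Record #:  0000004 is this ',
             'Record #:  0000003 Mary Lisa Hider found here',
             'Jane A Doe is also here Record #:  0000002',
             'Record #:  0000001'],
})

# Group 1 captures the label plus the whitespace after the colon,
# so only the digits that follow are replaced.
# Assumption: the label is "Record #:", matched case-insensitively via (?i).
pattern = r'(?i)(record\s+#:\s+)\d+'

# \1 puts the captured label back in front of **BLOCK**.
df['New_Text'] = df['Text'].str.replace(pattern, r'\1**BLOCK**', regex=True)

print(df['New_Text'].tolist())
```

Because the label and its trailing whitespace are captured and reinserted, the original spacing in each row is preserved and the rest of the text is left untouched.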
