Replace strings in pandas cell using regex, between two specific strings

There were some topics about replacing text 'between' strings, but I think something is wrong with my regex, or maybe I should use a different approach.

In my Name column I need to replace a word (in this case it is "is", but it will not always be "is"; sometimes it is a different word) with "is not". This specific word sits directly between two numbers ending in 'h'.

my df:

import pandas as pd

df=pd.DataFrame({'Name':['Adam is 23.2h is 223h mike is 223h',
'Katie is 13.2h is 22h mike is 223h','Ilam is 2h is 223h mike is 223h',
'Katie','Brody','Brody like mike'],
'B':[20,20,21,21,22,21]})

    B                                Name
0  20  Adam is 23.2h is 223h mike is 223h
1  20  Katie is 13.2h is 22h mike is 223h
2  21     Ilam is 2h is 223h mike is 223h
3  21                               Katie
4  22                               Brody
5  21                     Brody like mike

expected output:

    B                                     Name
0  20   Adam is 23.2h is not 223h mike is 223h
1  20   Katie is 13.2h is not 22h mike is 223h
2  21      Ilam is 2h is not 223h mike is 223h
3  21                                    Katie
4  22                                    Brody
5  21                          Brody like mike

code:

df.Name=df.Name.replace({'([0-9]{1,8}.[0-9]{1,4}h|[0-9]{1,8}h)(.*?)([0-9]{1,8}.[0-9]{1,4}h|[0-9]{1,8}h)':'is not'},regex=True)

To use a matching group, write the replacement as r'\1 is not \3'. It also seems you can use a slightly simpler regex:

   df.Name.replace({r'([0-9]{1,8}(?:\.[0-9]{1,4})?h)(.*?)([0-9]{1,8}(?:\.[0-9]{1,4})?h)':r'\1 is not \3'}, regex=True)

0    Adam is 23.2h is not 223h mike is 223h
1    Katie is 13.2h is not 22h mike is 223h
2       Ilam is 2h is not 223h mike is 223h
3                                     Katie
4                                     Brody
5                           Brody like mike
Name: Name, dtype: object
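
To see why the backreferences matter, the same pattern can be tried with plain re.sub on a single string. This is only a sketch: the names `num` and `pattern` are made up for illustration, and the decimal point is escaped, which is an assumption about the intended number format.

```python
import re

text = 'Adam is 23.2h is 223h mike is 223h'

# `num` is an illustrative name: digits, an optional decimal part, then 'h'
num = r'[0-9]{1,8}(?:\.[0-9]{1,4})?h'
pattern = rf'({num})(.*?)({num})'

# Plain replacement: the whole match, both numbers included, is overwritten
lost = re.sub(pattern, 'is not', text)

# Backreferences \1 and \3 put the two captured numbers back
kept = re.sub(pattern, r'\1 is not \3', text)

print(lost)  # Adam is is not mike is 223h
print(kept)  # Adam is 23.2h is not 223h mike is 223h
```

The first call shows what the code in the question does: the numbers on both sides are part of the match, so they disappear along with the word between them.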

You can try using apply with re.sub(r'(?<=\dh )is', 'is not', text).

Code

import pandas as pd
import re

df=pd.DataFrame({'Name':['Adam is 23.2h is 223h mike is 223h',
'Katie is 13.2h is 22h mike is 223h','Ilam is 2h is 223h mike is 223h',
'Katie','Brody','Brody like mike'],
'B':[20,20,21,21,22,21]})

df['Name'] = df['Name'].apply(lambda t: re.sub(r'(?<=\dh )is', 'is not', t))

Output

print(df)
#                                      Name   B
# 0  Adam is 23.2h is not 223h mike is 223h  20
# 1  Katie is 13.2h is not 22h mike is 223h  20
# 2     Ilam is 2h is not 223h mike is 223h  21
# 3                                   Katie  21
# 4                                   Brody  22
# 5                         Brody like mike  21
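
The pattern above hardcodes "is", while the question says the word may differ. Below is a minimal sketch of one possible generalization. It assumes the word to replace is a single run of word characters, separated by single spaces from a '<number>h' token on each side. The names NUM, BETWEEN and replace_between are hypothetical, and the second sample string is made up to show a word other than "is".

```python
import re

# Assumed shape of a duration token: digits, optional decimal part, then 'h'
NUM = r'\d+(?:\.\d+)?h'
# Lookbehind: the previous token ends in digit + 'h'; lookahead: another token follows
BETWEEN = re.compile(rf'(?<=\dh )\w+(?= {NUM})')

def replace_between(text, new='is not'):
    """Replace the single word sitting between two '<number>h' tokens."""
    return BETWEEN.sub(new, text)

print(replace_between('Adam is 23.2h is 223h mike is 223h'))
# Adam is 23.2h is not 223h mike is 223h
print(replace_between('Katie was 13.2h was 22h mike is 223h'))
# Katie was 13.2h is not 22h mike is 223h
print(replace_between('Brody like mike'))
# Brody like mike
```

With the DataFrame from this answer it could be applied as df['Name'] = df['Name'].apply(replace_between). Because the numbers are only checked by lookarounds, they are never consumed, so no backreferences are needed.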
