
Regex match two characters following each other

I have strings in a pandas column that contain several spaces followed by commas. The strings are organized like this:

original_string = "okay, , , , humans"

I want to remove the spaces and the subsequent commas so that the string becomes:

goodstring = "okay,humans"

But when I use the regex pattern [\s,]+ I get something different. I get:

badstring = "okayhumans"

It removes the comma after okay, but I want the result to look like goodstring. How can I do that?

Replace:

[\s,]*,[\s,]*

With:

,

See an online demo


  • [\s,]* - 0+ leading whitespace characters or commas;
  • , - a literal comma (ensures we don't replace a single space);
  • [\s,]* - 0+ trailing whitespace characters or commas.

In Pandas, this would translate to something like:

df[<YourColumn>].str.replace(r'[\s,]*,[\s,]*', ',', regex=True)
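As a minimal sketch of how that call might look on real data, the snippet below builds a small DataFrame and applies the same pattern. The column names `text` and `clean` and the sample rows are assumptions made up for illustration; only the pattern and the `str.replace` call come from the answer.

```python
import pandas as pd

# Hypothetical column name and sample rows, for illustration only.
df = pd.DataFrame({"text": ["okay, , , , humans", "no commas here", "a ,b"]})

# Collapse every run of whitespace/commas that contains at least one comma.
df["clean"] = df["text"].str.replace(r"[\s,]*,[\s,]*", ",", regex=True)

print(df["clean"].tolist())
# ['okay,humans', 'no commas here', 'a,b']
```

Note that `str.replace` returns a new Series, so the result has to be assigned back to a column to be kept.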

You have two issues with your code:

  1. Since [\s,]+ matches any combination of spaces and commas (e.g. a single comma), you should not remove the match but replace it with ','.
  2. [\s,]+ also matches runs that contain no comma at all, e.g. just a space ' '. That is not what we are looking for: we must be sure that at least one comma is present in the match.
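The two issues can be seen side by side in a short sketch. The sample string here is an assumption: it extends the question's string with a few plain words so that the effect on lone spaces is visible.

```python
import re

# Hypothetical sample: the question's string plus some ordinary words.
text = "okay, , , , humans are here"

# Issue 1: deleting the match also deletes the comma we want to keep.
removed = re.sub(r"[\s,]+", "", text)

# Issue 2: replacing with "," keeps a comma, but a lone space also matches.
replaced = re.sub(r"[\s,]+", ",", text)

# Requiring a comma inside the match leaves plain spaces untouched.
required = re.sub(r"\s*,[\s,]*", ",", text)

print(removed)   # okayhumansarehere
print(replaced)  # okay,humans,are,here
print(required)  # okay,humans are here
```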

Code:

import re

text = 'okay, ,  ,,,, humans! A,B,C'

# Replace each comma-containing run of spaces and commas with a single comma
result = re.sub(r'\s*,[\s,]*', ',', text)

Pattern:

\s*    - zero or more (leading) whitespaces
,      - comma (we must be sure that we have at least one comma in a match)
[\s,]* - arbitrary combination of spaces and commas
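The breakdown above can also be kept next to the pattern itself by compiling it with `re.VERBOSE`, which ignores whitespace and `#` comments inside the pattern. This is only a sketch of one way to write it; the name `COMMA_RUN` is an assumption introduced here, not part of the answer.

```python
import re

# COMMA_RUN is a hypothetical name; the pattern is the one explained above.
COMMA_RUN = re.compile(
    r"""
    \s*      # zero or more leading whitespace characters
    ,        # one literal comma, so every match contains a comma
    [\s,]*   # any further mix of whitespace and commas
    """,
    re.VERBOSE,
)

text = 'okay, ,  ,,,, humans! A,B,C'
result = COMMA_RUN.sub(',', text)

print(result)  # okay,humans! A,B,C
```

The space between "humans!" and "A" survives because that run contains no comma.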

Please try this:

re.sub(r'[,\s+,]+', ',', original_string)

You want to replace ",[space]," with ",".
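One caveat about this pattern, shown as a small sketch: inside a character class `+` is a literal plus sign and the repeated comma is redundant, so the class behaves like [\s,+]. It gives the desired result for the question's string, but it also rewrites plus signs and lone spaces. The second input below is a made-up example.

```python
import re

pattern = r'[,\s+,]+'

# Works for the string from the question.
first = re.sub(pattern, ',', "okay, , , , humans")

# Hypothetical input: the literal '+' and the surrounding spaces also match.
second = re.sub(pattern, ',', "1 + 2")

print(first)   # okay,humans
print(second)  # 1,2
```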

You could use substitution:

import re

pattern = r'[\s,]+'
original_string = "okay, , , , humans"
re.sub(pattern, ',', original_string)
