Regex match two characters following each other

Question

I have a pandas column whose strings contain several spaces followed by commas. The strings look like this:

original_string = "okay, , , , humans"

I want to remove the spaces and the subsequent commas so that the string will be:

goodstring = "okay,humans"

But when I use the regex pattern [\s,]+ I get something different:

badstring = "okayhumans"

It also removes the comma after okay, whereas I want the result to look like goodstring. How can I do that?
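To see why this happens, here is a minimal sketch using plain re on the string from the question. It assumes the match was replaced with an empty string, which is what the reported output suggests.

```python
import re

original_string = "okay, , , , humans"

# The character class accepts commas and whitespace alike, so the whole
# separator run, including the comma right after "okay", is a single match.
matches = re.findall(r"[\s,]+", original_string)
print(matches)    # [', , , , ']

# Replacing that match with the empty string removes every comma.
badstring = re.sub(r"[\s,]+", "", original_string)
print(badstring)  # okayhumans
```

So the pattern itself finds the right span here; the missing piece is what the span gets replaced with.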

Answer 1

Replace:

[\s,]*,[\s,]*

With:

,

[\s,]* - 0+ leading whitespace characters or commas;
, - a literal comma (ensures we don't replace a single space);
[\s,]* - 0+ trailing whitespace characters or commas.

In Pandas, this would translate to something like:

df[<YourColumn>].str.replace(r'[\s,]*,[\s,]*', ',', regex=True)
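For trying the pattern without a DataFrame, here is a rough plain-re sketch. The list values and the helper name collapse_separators are made up for illustration, and it assumes the pandas call above substitutes the same way re.sub does for this pattern.

```python
import re

# Pattern from this answer: a run of whitespace/commas that contains
# at least one comma.
SEPARATOR = re.compile(r"[\s,]*,[\s,]*")

def collapse_separators(text):
    """Replace each comma-containing separator run with a single comma."""
    return SEPARATOR.sub(",", text)

# Hypothetical stand-in for the column values.
values = ["okay, , , , humans", "hello world, , again", "no commas here"]
cleaned = [collapse_separators(v) for v in values]
print(cleaned)
# ['okay,humans', 'hello world,again', 'no commas here']
```

Note that the single space in "hello world" survives, because a match must contain a literal comma.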

Answer 2

Your code has two issues:

[\s,]+ matches any combination of spaces and commas (e.g. a single comma), so you should not remove the match but replace it with ','.
[\s,]+ also matches runs that contain no comma at all, e.g. a single space ' '. That is not what we are looking for: the match must contain at least one comma.

Code:

import re

text = 'okay, ,  ,,,, humans! A,B,C'

result = re.sub(r'\s*,[\s,]*', ',', text)

Pattern:

\s*    - zero or more (leading) whitespaces
,      - comma (we must be sure that we have at least one comma in a match)
[\s,]* - arbitrary combination of spaces and commas
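As a quick check of the second issue, this sketch runs both patterns on the same sample text. The variable names comma_required and any_run are only for illustration.

```python
import re

text = 'okay, ,  ,,,, humans! A,B,C'

# Pattern from this answer: a match must contain at least one comma.
comma_required = re.sub(r'\s*,[\s,]*', ',', text)

# Pattern from the question: any run of spaces and/or commas matches.
any_run = re.sub(r'[\s,]+', ',', text)

print(comma_required)  # okay,humans! A,B,C
print(any_run)         # okay,humans!,A,B,C
```

The only difference is the space before A: the question's pattern turns it into a comma, while the comma-requiring pattern leaves it alone.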

Answer 3

Please try this:

re.sub(r'[,\s]+', ',', original_string)

This replaces each run of commas and whitespace, such as ",[space],", with a single ",".

Answer 4

You could use substitution:

import re

pattern = r'[\s,]+'
original_string = "okay, , , , humans"
goodstring = re.sub(pattern, ',', original_string)  # "okay,humans"

4 answers

solution1
2 2022-07-14 09:23:51

solution2
1 2022-07-14 09:33:20

solution3
0 2022-07-14 09:23:33

solution4
0 2022-07-14 09:25:29
