Removing repeated trailing characters from a string in Python

Question

I have a field with comments. Some of the comments are just "no", but with a varying number of trailing "o"s. I want to transform these comments so that only "no" is returned. How can I achieve this using regex?

E.g.:

remove_trailing_os("noooooo") should output "no"

remove_trailing_os("nooOOoooooooo") should output "no"

Answer 1

You could use a case insensitive backreference:

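import re
# The inline (?i:...) group already makes the backreference case insensitive,
# so no extra flag is needed. (A flag would have to be passed as flags=...;
# the fourth positional argument of re.sub is count.)
re.sub(r'(.)(?i:\1)+$', r'\1', "nooOOoooooooo")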

output: 'no'

regex:

(.)        # match a character
(?i:\1)+$  # match trailing case insensitive repeats of the character
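
For reference, here is a minimal sketch of how the pattern above could be wrapped into the remove_trailing_os function named in the question. The compiled-pattern name _TRAILING_REPEATS is my own choice, and scoped inline flags such as (?i:...) need Python 3.6 or newer.

```python
import re

# Hypothetical helper, named after the function in the question.
# (.)        captures one character
# (?i:\1)+   matches one or more case-insensitive repeats of it
# $          anchors the run to the end of the string
_TRAILING_REPEATS = re.compile(r'(.)(?i:\1)+$')

def remove_trailing_os(comment):
    """Collapse a trailing run of one repeated character to a single copy."""
    return _TRAILING_REPEATS.sub(r'\1', comment)

print(remove_trailing_os("noooooo"))        # no
print(remove_trailing_os("nooOOoooooooo"))  # no
print(remove_trailing_os("book"))           # book (the repeated run is not at the end)
```

Despite the name, this collapses a trailing run of any character, not only "o"; replace (.) with (o) if only "o" should be handled.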

Answer 2

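You can try join with a set, keeping the first occurrence of each character:

cc = "noooooo"
cc1 = "nooOOoooooooo"
print(''.join(sorted(set(cc), key=cc.index)))
# Lowercase first, so that every character in the set can be found by index()
cc1_lower = cc1.lower()
print(''.join(sorted(set(cc1_lower), key=cc1_lower.index)))
# Caveat: this drops repeated characters anywhere in the string, not only trailing ones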

will give

no
no

With a regex you can also do:

import re

# Collapse every run of a repeated word character (case insensitive) to one character
repeat_pattern = re.compile(r'(\w)\1*', flags=re.IGNORECASE)
d = repeat_pattern.sub(r"\1", cc)
d1 = repeat_pattern.sub(r"\1", cc1)
print(d)
print(d1)

will also give

no
no
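
Note that this regex collapses every repeated run, not only the trailing one. The sketch below compares it with a variant anchored with $; the anchored variant and the sample string "goood nooo" are my own additions for illustration, not part of the original answer.

```python
import re

# Pattern from the answer: collapses every run of a repeated word character.
collapse_all = re.compile(r'(\w)\1*', flags=re.IGNORECASE)
# Variant anchored with $ (an assumption about the intended behaviour):
# only a run at the very end of the string is collapsed.
collapse_trailing = re.compile(r'(\w)\1+$', flags=re.IGNORECASE)

for comment in ["nooOOoooooooo", "goood nooo"]:
    print(collapse_all.sub(r'\1', comment), '|', collapse_trailing.sub(r'\1', comment))
# no | no
# god no | goood no
```

Which behaviour is wanted depends on the data: if a comment can contain legitimate double letters, the anchored variant is the safer choice.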

Answer 3

This seems similar to the question "How can I remove all characters after the second occurrence of a ' ' (space)", except that here the character is 'o' instead of a space. Hence:

# The two example inputs
t = 'noooooo'
t2 = 'nooOOoooooooo'
# Slice up to (not including) the second lowercase 'o'
print(t[:t.find('o', t.find('o') + 1)])     # no
print(t2[:t2.find('o', t2.find('o') + 1)])  # no
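
One thing to watch with this approach: str.find returns -1 when there is no second 'o', so for a plain 'no' the slice becomes t[:-1] and yields 'n'. A guarded version might look like the sketch below; the function name keep_up_to_second and its ch parameter are hypothetical names of mine.

```python
def keep_up_to_second(text, ch='o'):
    """Cut text just before the second occurrence of ch (case sensitive)."""
    first = text.find(ch)
    second = text.find(ch, first + 1) if first != -1 else -1
    # str.find returns -1 when nothing is found; slicing with -1 would
    # silently drop the last character, so return the text unchanged instead.
    return text if second == -1 else text[:second]

print(keep_up_to_second('noooooo'))        # no
print(keep_up_to_second('nooOOoooooooo'))  # no
print(keep_up_to_second('no'))             # no
```

This is still case sensitive and cuts at the second 'o' wherever it occurs, so it only suits comments that really are "no" followed by extra "o"s.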

solution1
3 2022-01-27 11:09:01

solution2
-1 2022-01-27 11:12:53

solution3
-1 2022-01-27 11:16:45
