How to find and get rid of consecutive repeated punctuation signs without using regular expressions in Python?
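I want to get rid of consecutive repeated punctuation marks, leaving only one of each.
If I have string = 'Is it raining????', I want to get string = 'Is it raining?'.
But I don't want to get rid of '...'.
I also need to do this without using regular expressions. I'm a beginner in Python and would appreciate any advice or hints. Thanks :)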
Another approach, using groupby:
from itertools import groupby
from string import punctuation

punc = set(punctuation) - set('.')
s = 'Thisss is ... a test!!! string,,,,, with 1234445556667 rrrrepeats????'
print(s)

newtext = []
for k, g in groupby(s):
    if k in punc:
        newtext.append(k)
    else:
        newtext.extend(g)

print(''.join(newtext))
Output:
Thisss is ... a test!!! string,,,,, with 1234445556667 rrrrepeats????
Thisss is ... a test! string, with 1234445556667 rrrrepeats?
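The groupby approach above could be wrapped in a reusable function. The following is only a minimal sketch and not part of the original answer; the function name `collapse_punctuation` and its `keep` parameter are hypothetical names chosen for illustration.

```python
from itertools import groupby
from string import punctuation


def collapse_punctuation(text, keep='.'):
    """Collapse each run of a repeated punctuation character to one character.

    Characters listed in `keep` (a hypothetical parameter) are left untouched.
    """
    collapsible = set(punctuation) - set(keep)
    parts = []
    # groupby yields one group per run of identical consecutive characters.
    for char, run in groupby(text):
        if char in collapsible:
            parts.append(char)          # keep a single copy of the run
        else:
            parts.append(''.join(run))  # keep the whole run unchanged
    return ''.join(parts)


print(collapse_punctuation('Is it raining????'))
print(collapse_punctuation('Wait... what?!?!'))
```

Note that an alternating sequence such as '?!?!' is left as it is, because each run there has length one; only identical consecutive characters are collapsed.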
How about the following approach:
import string

text = 'Is it raining???? No,,,, but...,,,, it is snoooowing!!!!!!!'

for punctuation in string.punctuation:
    if punctuation != '.':
        while True:
            replaced = text.replace(punctuation * 2, punctuation)
            if replaced == text:
                break
            text = replaced

print(text)
This gives the following output:
Is it raining? No, but..., it is snoooowing!
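The `while True` loop in the approach above is needed because a single `str.replace` call only substitutes non-overlapping pairs, so one pass shortens a long run without reducing it to one character. A small sketch to illustrate this (the `once` and `passes` variables are introduced here purely for demonstration):

```python
text = '?????'  # a run of five question marks

# One pass replaces non-overlapping pairs, so the run shrinks but does not
# collapse to a single character.
once = text.replace('??', '?')
print(once)

# Repeating the replacement until nothing changes leaves a single character.
passes = 0
while True:
    replaced = text.replace('??', '?')
    if replaced == text:
        break
    text = replaced
    passes += 1

print(text, passes)
```

Each pass roughly halves the run, so the number of passes grows only logarithmically with the run length, but every pass still rescans the whole string for every punctuation character.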
Or, for a more efficient version that gives the same result:
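import string

text = 'Is it raining???? No,,,, but...,,,, it is snoooowing!!!!!!!'
last = None      # last punctuation character kept, or None
output = []

for c in text:
    if c == '.':
        # Periods are always kept; reset `last` so that punctuation
        # separated by a period (e.g. ',.,') is not treated as a repeat.
        last = None
        output.append(c)
    elif c != last:
        if c in string.punctuation:
            last = c
        else:
            last = None
        output.append(c)

print(''.join(output))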
# Another groupby variant: count each run, then rebuild the string.
import string
from itertools import groupby

# get all punctuation minus period.
puncs = set(string.punctuation) - set('.')

s = 'Is it raining???? No but...,,,, it is snowing!!!!!!!###!@#@@@@'

# get count of consecutive characters
t = [[k, len(list(g))] for k, g in groupby(s)]

s = ''
for ele in t:
    char = ele[0]
    count = ele[1]
    # collapse a run of repeated punctuation to a single character
    if char in puncs and count > 1:
        count = 1
    s += char * count

print(s)
# Is it raining? No but..., it is snowing!#!@#@
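# A variant that collects the surplus repeated characters and removes them.
from itertools import groupby

s = 'Is it raining???? okkkk!!! ll... yeh""" ok?'

# Every character that repeats its predecessor, excluding letters and '.'.
# Note: the test is "not alphabetic", so repeated digits and spaces also match.
replaceables = [ch for i, ch in enumerate(s) if i > 0 and s[i - 1] == ch and (not ch.isalpha() and ch != '.')]
# Group the surplus characters into one run per repeated sequence.
replaceables = [list(g) for k, g in groupby(replaceables)]

start = 0
for replaceable in replaceables:
    replaceable = ''.join(replaceable)
    start = s.find(replaceable, start)
    # Drop the surplus run once, starting from where it was found.
    r = s[start:].replace(replaceable, '', 1)
    s = s.replace(s[start:], r)

print(s)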