
python - How to find and get rid of consecutive repeated punctuation signs without using regular expressions in Python?

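I want to get rid of consecutive repeated punctuation signs and leave only one of each.

If I have string = 'Is it raining????', I want to get string = 'Is it raining?'. However, I don't want to get rid of '...'.

I also need to do this without using regular expressions. I am a beginner in Python and would appreciate any advice or hints. Thanks :)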

Another approach, using groupby:

from itertools import groupby 
from string import punctuation

punc = set(punctuation) - set('.')

s = 'Thisss is ... a test!!! string,,,,, with 1234445556667 rrrrepeats????'
print(s)

newtext = []
for k, g in groupby(s):
    if k in punc:
        newtext.append(k)
    else:
        newtext.extend(g)

print(''.join(newtext))

Output:

Thisss is ... a test!!! string,,,,, with 1234445556667 rrrrepeats????
Thisss is ... a test! string, with 1234445556667 rrrrepeats?
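
The same groupby idea can be wrapped in a function so that it is easy to reuse. This is only a sketch: the function name collapse_punctuation and the keep parameter are made up for illustration and do not appear in the answer above.

```python
from itertools import groupby
from string import punctuation


def collapse_punctuation(text, keep='.'):
    """Collapse each run of one punctuation character to a single character.

    `keep` is a hypothetical parameter: punctuation listed there is left alone.
    """
    collapsible = set(punctuation) - set(keep)
    pieces = []
    for char, run in groupby(text):
        if char in collapsible:
            pieces.append(char)   # keep a single copy of the run
        else:
            pieces.extend(run)    # keep the whole run unchanged
    return ''.join(pieces)


print(collapse_punctuation('Is it raining????'))
```

Passing keep='.-' would, under the same assumptions, also leave runs of hyphens untouched.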

How about the following approach:

import string

text = 'Is it raining???? No,,,, but...,,,, it is snoooowing!!!!!!!'

for punctuation in string.punctuation:
    if punctuation != '.':
        while True:
            replaced = text.replace(punctuation * 2, punctuation)
            if replaced == text:
                break
            text = replaced

print(text)

This gives the following output:

Is it raining? No, but..., it is snoooowing!
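
The while loop matters because str.replace works on non-overlapping matches, so one pass only roughly halves a run. A small illustration (the variable names once and twice are just for this example):

```python
text = 'Is it raining????'

# One pass turns each non-overlapping '??' into '?', so four marks become two.
once = text.replace('??', '?')
print(once)    # Is it raining??

# A second pass finishes the job for a run of this length.
twice = once.replace('??', '?')
print(twice)   # Is it raining?
```

Each pass shortens a run of n characters to about n/2, so the number of passes grows with the logarithm of the longest run.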

Or, for a more efficient version that gives the same result:

import string

text = 'Is it raining???? No,,,, but...,,,, it is snoooowing!!!!!!!'
last = None
output = []

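for c in text:
    if c == '.':
        # periods are always kept, and they end any run of punctuation
        last = None
        output.append(c)
    elif c != last:
        if c in string.punctuation:
            last = c      # remember it so an immediate repeat is skipped
        else:
            last = None   # letters, digits and spaces end a run
        output.append(c)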

print(''.join(output))
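
A single-pass variant written as a function might look like the sketch below. The name collapse_repeats and the keep parameter are assumptions for illustration. The tracker is reset on every character that is not collapsible punctuation, so only directly adjacent repeats are dropped.

```python
import string


def collapse_repeats(text, keep='.'):
    """Single pass: skip a punctuation character that repeats the one just before it."""
    collapsible = set(string.punctuation) - set(keep)
    previous = None            # last character seen, if it was collapsible punctuation
    output = []
    for char in text:
        if char in collapsible:
            if char == previous:
                continue       # same punctuation again: skip it
            previous = char
        else:
            previous = None    # letters, digits, spaces and kept characters end a run
        output.append(char)
    return ''.join(output)


print(collapse_repeats('Is it raining???? No,,,, but...,,,, it is snoooowing!!!!!!!'))
```

Because the string is read once and joined at the end, the work is proportional to the length of the text.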

Another groupby-based answer, which first counts the length of each run:

import string
from itertools import groupby

# get all punctuation minus period.
puncs = set(string.punctuation) - set('.')
s = 'Is it raining???? No but...,,,, it is snowing!!!!!!!###!@#@@@@'

# get count of consecutive characters
t = [[k, len(list(g))] for k, g in groupby(s)]

s = ''
for char, count in t:
    # a run of punctuation (other than periods) is cut down to one character
    if char in puncs and count > 1:
        count = 1
    s += char * count

print(s)
# Is it raining? No but..., it is snowing!#!@#@
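
An approach that first collects the surplus repeated characters and then removes them from the string:

from itertools import groupby

s = 'Is it raining???? okkkk!!! ll... yeh""" ok?'
# every character that repeats the one before it, is not a letter and is not '.'
# (note: this test also matches repeated digits and whitespace)
replaceables = [ch for i, ch in enumerate(s) if i > 0 and s[i - 1] == ch and (not ch.isalpha() and ch != '.')]
# group the surplus characters into runs, e.g. ['?', '?', '?']
replaceables = [list(g) for k, g in groupby(replaceables)]

start = 0
for replaceable in replaceables:
    replaceable = ''.join(replaceable)
    start = s.find(replaceable, start)
    # remove the surplus characters once, searching from start onwards
    r = s[start:].replace(replaceable, '', 1)
    s = s[:start] + r
print(s)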

