How can we strip punctuation at the start of a string using Python?

I want to strip all kinds of punctuation from the start of a string using Python. My list contains strings, and some of them start with some kind of punctuation. How can I strip every type of punctuation from those strings?

For example, if my word is ",,gets", I want to strip ",," from the word and get "gets" as the result. I also want to strip spaces and numbers from the strings in the list. I have tried the following code, but it does not produce the correct result.

If 'a' is a list containing some words:

for i in range(len(a)):
    a[i] = a[i].lstrip().rstrip()
    print(a[i])

You can use strip():

Return a copy of the string with the leading and trailing characters removed. The chars argument is a string specifying the set of characters to be removed.

Passing string.punctuation will remove all leading and trailing punctuation characters:

>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

>>> l = [',,gets', 'gets,,', ',,gets,,']
>>> for item in l:
...     print(item.strip(string.punctuation))
... 
gets
gets
gets

Or use lstrip() if you need only leading characters removed, and rstrip() for trailing characters.
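To make the difference between the three methods concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration and are not from the question.

```python
import string

word = ",,gets!!"

# lstrip() touches only the left end, rstrip() only the right end,
# and strip() both ends.
leading_only = word.lstrip(string.punctuation)
trailing_only = word.rstrip(string.punctuation)
both_ends = word.strip(string.punctuation)

print(leading_only)   # gets!!
print(trailing_only)  # ,,gets
print(both_ends)      # gets
```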

Hope that helps.

Pass the characters you want removed to lstrip and rstrip:

'..foo..'.lstrip('.').rstrip('.') == 'foo'

strip(), when used without parameters, strips only whitespace. If you want to strip any other character, you need to pass it as an argument to strip(). In your case you should be doing:

a[i]=a[i].strip(',')
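The following sketch shows why the argument matters when a string has both whitespace and commas at its ends. The sample value is an assumption chosen for illustration. Note that the argument is treated as a set of characters, not as a prefix or suffix.

```python
raw = "  ,,gets,, \n"

# No argument: only whitespace is removed, the commas stay.
default_strip = raw.strip()

# Only ',' given: nothing changes, because the outermost characters
# are whitespace and stripping stops at the first character not in the set.
comma_strip = raw.strip(",")

# Commas and whitespace given together: both kinds are removed.
combined_strip = raw.strip(", \n")

print(repr(default_strip))   # ',,gets,,'
print(repr(comma_strip))     # '  ,,gets,, \n'
print(repr(combined_strip))  # 'gets'
```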

To remove punctuation, spaces, and numbers from the beginning of each string in a list of strings:

import string

chars = string.punctuation + string.whitespace + string.digits    
a[:] = [s.lstrip(chars) for s in a]

Note: this does not take into account non-ASCII punctuation, whitespace, or digits.
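If non-ASCII input matters, one possible approach is to classify characters with the standard unicodedata module instead of listing them. This is only a sketch: the helper names is_strippable and lstrip_unicode are hypothetical, and it assumes that "punctuation" means the Unicode categories P* and S* (symbols such as $ or +, which string.punctuation also contains).

```python
import unicodedata
from itertools import dropwhile


def is_strippable(ch):
    # Assumption: strip whitespace, digits, and any character whose
    # Unicode category starts with P (punctuation) or S (symbol).
    return ch.isspace() or ch.isdigit() or unicodedata.category(ch)[0] in "PS"


def lstrip_unicode(s):
    # Drop characters from the left until the first one that should be kept.
    return "".join(dropwhile(is_strippable, s))


a = ["，，gets", "«12 word»", ",,gets"]
a[:] = [lstrip_unicode(s) for s in a]
print(a)  # ['gets', 'word»', 'gets']
```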

If you want to remove it only from the beginning, try this:

    import re
    s = '"gets'
    # Anchor with ^ so that only a leading " or ,, is removed.
    re.sub(r'^("|,,)', '', s)
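A regex can also cover every ASCII punctuation character rather than a fixed pair. The sketch below builds a character class from string.punctuation and anchors it at the start of the string. The names LEADING_PUNCT and strip_leading_punct are hypothetical and used only for illustration.

```python
import re
import string

# re.escape() makes characters such as ], ^, - and \ safe inside [...].
LEADING_PUNCT = re.compile("^[" + re.escape(string.punctuation) + "]+")


def strip_leading_punct(s):
    # Remove one or more punctuation characters, but only at the start.
    return LEADING_PUNCT.sub("", s)


print(strip_leading_punct('"gets'))     # gets
print(strip_leading_punct(",,gets,,"))  # gets,,
print(strip_leading_punct('it"s'))      # it"s
```

For plain ASCII input this gives the same result as s.lstrip(string.punctuation), so the regex form is mainly useful when it is part of a larger pattern.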

Assuming you want to remove punctuation from both ends of every word in a list containing strings (which may contain multiple words), this should work:

test1 = ",,gets"
test2 = ",,gets,,"
test3 = ",,this is a sentence and it has commas, and many other punctuations!!"
test4 = [" ", "junk1", ",,gets", "simple", 90234, "234"]
test5 = "word1 word2 word3 word4 902344"

import string

remove_l = string.punctuation + " " + "1234567890"

for t in [test1, test2, test3, test4, test5]:
    if isinstance(t, str):
        print(" ".join([x.strip(remove_l) for x in t.split()]))
    else:
        # Skip non-string items and strings that are empty after stripping.
        print([x.strip(remove_l) for x in t
               if isinstance(x, str) and len(x.strip(remove_l))])

for i, each_string in enumerate(a):
    # lstrip() returns a new string, so store the result; list every character you want stripped.
    a[i] = each_string.lstrip(',./";:')
