
How to remove special characters from the beginning of a string in Python

I am getting my data from XML, which may sometimes contain special characters at the beginning, like:

'This is a sample title or %&*I don't know if this is the text

I tried title[0].isstring() or title[0].isdigit() and then removing the character. But if there is more than one special character at the beginning, how do I remove them? Do I need a for loop?

You could use a regular expression:

import re
mystring = re.sub(r"^\W+", "", mystring)

This removes all non-word characters (anything other than letters, digits, and underscores) from the start of your string.

Explanation:

^   # Start of string
\W+ # One or more non-word characters (not a letter, digit, or underscore)

>>> import re
>>> re.sub(r'^\W*', '', "%&*I don't know if this is the text")
"I don't know if this is the text"

#or

>>> "%&*I don't know if this is the text".lstrip("!@#$%^&*()")
"I don't know if this is the text"

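One detail worth checking with either regex above: `\W` treats the underscore as a word character, and for `str` patterns in Python 3 it also treats non-ASCII letters as word characters. A minimal sketch of that behaviour (the function name `strip_leading_nonword` is hypothetical, not taken from the answers):

```python
import re

def strip_leading_nonword(text):
    """Remove leading characters that are not letters, digits, or underscores."""
    return re.sub(r"^\W+", "", text)

# Leading punctuation is removed.
print(strip_leading_nonword("%&*I don't know if this is the text"))
# Underscores count as word characters, so they are kept.
print(strip_leading_nonword("__init__"))
# Non-ASCII letters also count as word characters for str patterns.
print(strip_leading_nonword("¿Qué tal?"))
```

If leading underscores should go as well, an explicit character class is probably the safer choice.
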
If there are just a few specific kinds of characters you want to remove, use lstrip() ("left strip").

For instance, if you wanted to remove any leading %, &, or * characters, you'd use:

actual_title = title.lstrip("%&*")
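
Note that lstrip() treats its argument as a set of characters rather than a literal prefix, so the order and repetition of the characters in the input do not matter. A small sketch with made-up input strings:

```python
# lstrip() removes any mix of the given characters from the left side.
title = "%&*%%Sample title"          # hypothetical input
actual_title = title.lstrip("%&*")
print(actual_title)

# Stripping stops at the first character that is not in the set,
# so the later '%' survives here.
partial = "%&*#%title".lstrip("%&*")
print(partial)
```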

On the other hand, if you want to remove any characters that aren't part of a certain set (e.g. alphanumerics), then the regex in Tim Pietzcker's answer is probably the easiest way.
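
As a sketch of the "not part of a certain set" case, a negated character class spells out exactly which characters are allowed to start the string. The function name `strip_leading_non_alnum` and the choice of ASCII letters and digits as the allowed set are assumptions for illustration:

```python
import re

def strip_leading_non_alnum(text):
    """Drop leading characters that are not ASCII letters or digits."""
    return re.sub(r"^[^A-Za-z0-9]+", "", text)

print(strip_leading_non_alnum("%&*I don't know if this is the text"))
# Unlike \W, this class also removes a leading underscore.
print(strip_leading_non_alnum("_42 is the answer"))
```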

Use the strip() method to remove any of a given set of special characters from the beginning and end of the string. Example:

s = ").* this is text .("
s.strip(")(.* ")
# Output: 'this is text'

If you want to remove characters only from the beginning of the string, use lstrip(). Example:

s = ").* this is text .("
s.lstrip(")(.* ")
# Output: 'this is text .('

If you want to remove characters only from the end of the string, use rstrip(). Example:

s = ").* this is text .("
s.rstrip(")(.* ")
# Output: ').* this is text'
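
When the exact special characters are not known in advance, one option is to build the strip set from the standard library's `string.punctuation`. This is a sketch under the assumption that "special" means ASCII punctuation plus whitespace; the names `SPECIAL` and `raw` are illustrative:

```python
import string

# Assumption: "special" means ASCII punctuation plus whitespace.
SPECIAL = string.punctuation + string.whitespace

raw = ").* this is text .("
print(repr(raw.strip(SPECIAL)))    # both ends
print(repr(raw.lstrip(SPECIAL)))   # beginning only
print(repr(raw.rstrip(SPECIAL)))   # end only
```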
