How to remove special characters from the beginning of a string in Python
I am getting my data from XML, which may sometimes contain special characters at the beginning, like:
'This is a sample title or %&*I don't know if this is the text
I tried checking title[0].isstring() or title[0].isdigit() and then removing the character. But if there is more than one special character at the beginning, how do I remove them all? Do I need a for loop?
You could use a regular expression:
import re
mystring = re.sub(r"^\W+", "", mystring)
This removes all non-alphanumeric characters from the start of your string. (Strictly, \W matches anything other than a letter, a digit, or an underscore, so leading underscores are kept.)
Explanation:
^ # Start of string
\W+ # One or more non-alphanumeric characters
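As a minimal sketch of the pattern above, the substitution can be wrapped in a small helper. The function name strip_leading_special is hypothetical, chosen only for this illustration, and the [\W_] variant is one possible way to also drop leading underscores.

```python
import re

def strip_leading_special(text):
    """Remove leading characters that are not letters, digits, or underscores."""
    return re.sub(r"^\W+", "", text)

print(strip_leading_special("%&*I don't know if this is the text"))
# I don't know if this is the text

# The underscore counts as a word character, so it is left in place.
print(strip_leading_special("_%title"))
# _%title

# Adding "_" to the character class removes leading underscores as well.
print(re.sub(r"^[\W_]+", "", "_%title"))
# title
```

A string that already starts with a letter or digit is returned unchanged, so the helper is safe to apply to every title.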
>>> import re
>>> re.sub(r'^\W*', '', "%&*I don't know if this is the text")
"I don't know if this is the text"
#or
>>> "%&*I don't know if this is the text".lstrip("!@#$%^&*()")
"I don't know if this is the text"
If there are just a few specific kinds of characters you want to remove, use lstrip() ("left strip").
For instance, if you wanted to remove any starting %, &, or * characters, you'd use:
actual_title = title.lstrip("%&*")
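If the set of unwanted characters is broader than three symbols, one option is to build the argument from the standard library's string constants rather than typing it out. This is only a sketch: it assumes that ASCII punctuation and whitespace are what should be stripped, and the variable names are illustrative.

```python
import string

title = "%&* I don't know if this is the text"

# string.punctuation holds the ASCII punctuation characters;
# string.whitespace adds spaces, tabs, and newlines.
unwanted = string.punctuation + string.whitespace

actual_title = title.lstrip(unwanted)
print(actual_title)
# I don't know if this is the text
```

Non-ASCII symbols (for example typographic quotes) are not in string.punctuation, so they would survive this call.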
On the other hand, if you want to remove any characters that aren't part of a certain set (e.g. alphanumerics), then the regex given in Tim Pietzcker's answer is probably the easiest way.
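For comparison, the "keep only what belongs to a set" idea can also be written without a regex and without an explicit for loop. The sketch below uses itertools.dropwhile with str.isalnum; the function name is hypothetical. Note one difference from \W: isalnum() is false for the underscore, so leading underscores are removed here.

```python
from itertools import dropwhile

def strip_leading_non_alnum(title):
    """Drop characters from the left until the first letter or digit."""
    return "".join(dropwhile(lambda ch: not ch.isalnum(), title))

print(strip_leading_non_alnum("%&*I don't know if this is the text"))
# I don't know if this is the text

print(strip_leading_non_alnum("_%title"))
# title
```

Only the leading run is dropped; special characters later in the string, such as the apostrophe in "don't", are untouched.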
Use the strip function to remove any special characters from the beginning and end of the string. Ex.
text = ").* this is text .("
text.strip(")(.* ")
Output: 'this is text'
If you want to remove only from the beginning of the string, use lstrip(). Ex.
text = ").* this is text .("
text.lstrip(")(.* ")
Output: 'this is text .('
If you want to remove only from the end of the string, use rstrip(). Ex.
text = ").* this is text .("
text.rstrip(")(.* ")
Output: ').* this is text'
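One point worth illustrating about the three strip examples above: the argument is treated as a set of characters, not as a prefix or suffix, so the order of characters in it does not matter and any run made of those characters is removed. A short sketch, with made-up sample strings:

```python
text = ").* this is text .("

# The order of the characters in the argument is irrelevant.
print(text.strip(")(.* ") == text.strip(" *.()"))
# True

# Because the argument is a set, lstrip keeps going while characters match.
print("%%&&**title".lstrip("%&*"))
# title

# This can strip more than intended if the real text starts with one of them.
print("abc-cab title".lstrip("abc"))
# -cab title
```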