Regular expression to check whitespace in the beginning and end of a string
This code is supposed to reject strings that have whitespace at the beginning or end. For some reason I get a negative result with this code:
import re
def is_match(pattern, string):
    return True if len(re.compile(pattern).findall(string)) == 1 else False
print(is_match("[^\s]+[a-zA-Z0-9]+[^\s]+", '1'))
However, other strings work fine. Can anyone explain why the result is negative, or even provide a better function? (I'm a newbie in Python.)
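One likely explanation, sketched here as a quick experiment: each of the three parts of the pattern ends in `+`, so each must consume at least one character, and the whole pattern cannot match anything shorter than three characters. The helper name `count_matches` is only illustrative and is not part of the question.

```python
import re

PATTERN = r"[^\s]+[a-zA-Z0-9]+[^\s]+"  # the pattern from the question

def count_matches(string):
    """Return how many non-overlapping matches findall() reports."""
    return len(re.findall(PATTERN, string))

# Each of the three parts needs at least one character,
# so one- and two-character strings can never match.
for s in ["1", "12", "123", " 123 "]:
    print(repr(s), count_matches(s))
```

Note that `findall()` searches anywhere in the string, so a padded string such as `' 123 '` still yields exactly one match. The pattern therefore does not reject surrounding whitespace either.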
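The simplest way to check whether a string has whitespace at the beginning or end does not involve a regular expression.
if test_string != test_string.strip():
    print("leading or trailing whitespace")  # test_string is the string under test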
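The comparison against `strip()` can be wrapped in a small function. This is a minimal sketch; the name `has_edge_whitespace` is an assumption made for illustration.

```python
def has_edge_whitespace(s):
    """True if s starts or ends with whitespace, as defined by str.strip()."""
    return s != s.strip()

for s in ["no spaces", " starts", "ends ", "\t\tboth\n\n", ""]:
    print(repr(s), has_edge_whitespace(s))
```

An empty string is reported as clean, because stripping it changes nothing.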
The regexp you're looking for is ^\s|\s$:
xs = ["no spaces", " starts", "ends ", "\t\tboth\n\n", "okay"]
import re
print([x for x in xs if re.search(r'^\s|\s$', x)])
## [' starts', 'ends ', '\t\tboth\n\n']
^\s.*?\s$ only matches whitespace on both ends:
print([x for x in xs if re.search(r'^\s.*?\s$', x, re.S)])
## ['\t\tboth\n\n']
An inverse expression (no starting-ending whitespace) is ^\S.*?\S$:
print([x for x in xs if re.search(r'^\S.*?\S$', x, re.S)])
## ['no spaces', 'okay']
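One caveat is worth checking before relying on the inverse expression above. In Python, `$` also matches just before a newline at the very end of the string, and the pattern needs at least two characters. The sketch below compares it with a variant that uses `\Z` and an optional tail. The variant and the names `LOOSE` and `STRICT` are my own assumptions, not part of the answer.

```python
import re

LOOSE = re.compile(r'^\S.*?\S$', re.S)        # expression from the answer
STRICT = re.compile(r'^\S(?:.*\S)?\Z', re.S)  # assumed variant: \Z and optional tail

# Columns: the string, LOOSE result, STRICT result.
for s in ["okay", "okay\n", "a", "ends "]:
    print(repr(s), bool(LOOSE.search(s)), bool(STRICT.search(s)))
```

With these inputs, `LOOSE` accepts `'okay\n'` despite the trailing newline and rejects the single character `'a'`, while `STRICT` does the opposite in both cases.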
def is_whiteSpace(string):
    # startswith()/endswith() accept a tuple of candidate prefixes/suffixes
    t = ' ', '\t', '\n', '\r'
    return string.startswith(t) or string.endswith(t)
print(is_whiteSpace(" GO"))   # True
print(is_whiteSpace("GO"))    # False
print(is_whiteSpace("GO "))   # True
print(is_whiteSpace(" GO "))  # True
No fancy regex needed; just use the far more readable:
>>> def is_whitespace(s):
...     from string import whitespace
...     return any((s[0] in whitespace, s[-1] in whitespace))
...
>>> list(map(is_whitespace, ("foo", "bar ", " baz", "\tspam\n")))
[False, True, True, True]
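Indexing with `s[0]` raises `IndexError` on an empty string. A possible guard, sketched under the assumption that an empty string should count as having no edge whitespace, is to slice instead of index. The name `edge_whitespace_safe` is hypothetical.

```python
def edge_whitespace_safe(s):
    """True if the first or last character of s is whitespace.

    The slices s[:1] and s[-1:] yield '' for an empty string instead of
    raising IndexError, and ''.isspace() is False.
    """
    return s[:1].isspace() or s[-1:].isspace()

print([edge_whitespace_safe(s) for s in ("foo", "bar ", " baz", "\tspam\n", "")])
# [False, True, True, True, False]
```

Be aware that `str.isspace()` also recognises Unicode whitespace, so it is somewhat broader than a membership test against `string.whitespace`.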
Instead of trying to construct a regular expression that detects strings without spaces, it's easier to check for strings that DO have spaces and then invert the logic in your code.
Remember that re.match() returns None (a logical false value) if it doesn't find a match, and an SRE_Match object (a logical true value) if it does find a match. Use that to write something like this:
In [24]: spaces_pattern = re.compile ( r"^(\s.+|.+\s)$" )
In [27]: for s in ["Alpha", " Bravo", "Charlie ", " Delta "]:
   ....:     if spaces_pattern.match(s):
   ....:         print ( "%s had whitespace." % s )
   ....:     else:
   ....:         print ( "%s did not have whitespace." % s )
   ....:
Alpha did not have whitespace.
 Bravo had whitespace.
Charlie  had whitespace.
 Delta  had whitespace.
Note the use of the ^$ anchors to force the match over the entire input string.
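As an alternative to explicit anchors, Python 3.4 and later provide `fullmatch()`, which only succeeds when the pattern covers the whole string. A minimal sketch, assuming the same alternation as above; the function name `had_whitespace` is made up for illustration.

```python
import re

# Same alternation as the anchored pattern, without the explicit ^ and $.
spaces_pattern = re.compile(r"\s.+|.+\s")

def had_whitespace(s):
    """Mirror of the anchored check; like it, this needs at least two characters."""
    return spaces_pattern.fullmatch(s) is not None

for s in ["Alpha", " Bravo", "Charlie ", " Delta "]:
    print(repr(s), had_whitespace(s))
```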
Edit:
This doesn't even need regexp at all - you only need to check the first and last characters:
test_strings = ['a', ' b', 'c ', ' d ', 'e f', ' g h', ' i j', ' k l ']
for s in test_strings:
    if s[0] in " \n\r\t":
        print("'%s' started with whitespace." % s)
    elif s[-1] in " \n\r\t":
        print("'%s' ended with whitespace." % s)
    else:
        print("'%s' was whitespace-free." % s)
Edit 2:
A regex that should work anywhere: ^\S(.*\S)?$. You may need to come up with a local equivalent to \S ("anything but whitespace") if your regex dialect doesn't include it.
test_strings = ['a', ' b', 'c ', ' d ', 'e f', ' g h', ' i j', ' k l ']
import re
pat = re.compile(r"^\S(.*\S)?$")  # raw string keeps \S from being read as a string escape
for s in test_strings:
    if pat.match(s):
        print("'%s' had no whitespace." % s)
    else:
        print("'%s' had whitespace." % s)
Note that \S is the negated form of \s, i.e. \S means "anything but whitespace".
Also note that strings of length 1 are accounted for by making part of the match optional. (You might think to use \S.*\S, but this forces a match length of at least 2.)
'a' had no whitespace.
' b' had whitespace.
'c ' had whitespace.
' d ' had whitespace.
'e f' had no whitespace.
' g h' had whitespace.
' i j' had whitespace.
' k l ' had whitespace.
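The point about length-1 strings can be checked directly by running both forms side by side. The variable names below are illustrative only.

```python
import re

with_optional = re.compile(r"^\S(.*\S)?$")  # pattern from the answer
without_optional = re.compile(r"^\S.*\S$")  # the tempting shorter form

# Columns: the string, result with the optional group, result without it.
for s in ["a", "ab", "e f", " b"]:
    print(repr(s), bool(with_optional.match(s)), bool(without_optional.match(s)))
```

Only the single-character string `'a'` separates the two: the optional group accepts it, while the shorter form rejects it.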
A variant on ch3ka's suggestion:
import string
whitespace = tuple(string.whitespace)
'a '.endswith(whitespace)
## True
'a '.startswith(whitespace)
## False
'a\n'.endswith(whitespace)
## True
'a\t'.endswith(whitespace)
## True
I find it easier to remember than the regex stuff (except maybe the bit converting whitespace to a tuple).
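The tuple conversion matters because `startswith()` and `endswith()` treat a plain string as one multi-character prefix or suffix, whereas a tuple is read as a set of alternatives. A short sketch that wraps the idea in a function; `has_edge_whitespace` and `WHITESPACE` are names assumed for illustration.

```python
import string

# startswith()/endswith() accept a tuple of candidates; a plain str
# would be treated as one multi-character prefix/suffix instead.
WHITESPACE = tuple(string.whitespace)

def has_edge_whitespace(s):
    """True if s starts or ends with a character from string.whitespace."""
    return s.startswith(WHITESPACE) or s.endswith(WHITESPACE)

print('a '.endswith(string.whitespace))  # False: looks for the whole string as a suffix
print('a '.endswith(WHITESPACE))         # True: any single whitespace character
print([has_edge_whitespace(s) for s in ("GO", " GO", "GO\n", "")])
```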