
re.match pattern matching with ^,$

In the code below:

    >>> import re
    >>> pattern = re.compile(r'^<HTML>')
    >>> pattern.match("<HTML>")
    <_sre.SRE_Match at 0x1043bc8b8>
    >>> print(pattern.match("⇢ ⇢ <HTML>", 2)) # ⇢ stands for whitespace character.
    None

When the pattern starts with the ^ metacharacter, whitespace at the beginning of the string prevents a match, as shown above, even if the 'pos' argument is equal to 2. The reason given was that the metacharacter ^ cannot be matched in such cases (< is at position 2, and that position does not satisfy ^).

    >>> pattern = re.compile(r'<HTML>$')
    >>> pattern.match("<HTML>⇢", 0, 6) # ⇢ stands for whitespace character.
    <_sre.SRE_Match object at 0x1007033d8>
    >>> pattern.match("<HTML>⇢"[:6])
    <_sre.SRE_Match object at 0x100703370>

But when we use $ at the end of the regular expression and pass the 'endpos' argument, there is a match. Why the difference?

You'd have to dig a little into the docs, but the answer lies there. The following is from the docs for pattern.search; the same description applies to pattern.match as well.

The optional second parameter pos gives an index in the string where the search is to start; it defaults to 0. This is not completely equivalent to slicing the string; the '^' pattern character matches at the real beginning of the string and at positions just after a newline, but not necessarily at the index where the search is to start.
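The quoted behaviour can be checked with a short script. This is only a sketch: the sample strings and variable names are made up for illustration, and real spaces are used instead of the ⇢ placeholder. The last case assumes the re.MULTILINE flag, since that is the mode in which ^ also matches just after a newline.

```python
import re

pattern = re.compile(r'^<HTML>')
text = "  <HTML>"  # two leading spaces, so '<' sits at index 2

# pos=2 starts the attempt at '<', but '^' still refers to index 0 of text.
with_pos = pattern.match(text, 2)

# Slicing builds a new string whose real beginning is '<', so '^' matches.
sliced = pattern.match(text[2:])

# Without '^', match() with pos=2 already only tries index 2.
unanchored = re.compile(r'<HTML>').match(text, 2)

# With re.MULTILINE, '^' also matches right after a newline,
# so pos=2 works here because index 1 is '\n'.
multiline = re.compile(r'^<HTML>', re.MULTILINE).match("x\n<HTML>", 2)

print(with_pos)                 # None
print(sliced is not None)       # True
print(unanchored is not None)   # True
print(multiline is not None)    # True
```

If the goal is simply "match starting at index 2", dropping the ^ and passing pos is enough, because match() never scans forward from its starting position.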

So the start-of-line anchor ^ matches only at the true beginning of the string, not at the position dictated by pos. On the other hand, the docs say this about endpos:

The optional parameter endpos limits how far the string will be searched; it will be as if the string is endpos characters long, so only the characters from pos to endpos - 1 will be searched for a match.

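Note the wording "it will be as if the string is endpos characters long". It means that a pattern with the end-of-line anchor $ will actually match at endpos, which is treated as the end of the string (unlike pos, which does not move the beginning).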
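The same kind of check for the end of the string, again as a sketch with an illustrative sample string (a real trailing space stands in for ⇢):

```python
import re

pattern = re.compile(r'<HTML>$')
text = "<HTML> "  # one trailing space at index 6

# No endpos: the space sits between '>' and the real end, so '$' fails.
no_limit = pattern.match(text)

# endpos=6: the string is treated as if it were 6 characters long,
# so '$' matches right after '>'.
with_endpos = pattern.match(text, 0, 6)

# Slicing gives the same result for the end of the string.
sliced = pattern.match(text[:6])

print(no_limit)                 # None
print(with_endpos.span())       # (0, 6)
print(sliced.span())            # (0, 6)
```

So endpos behaves like slicing off the tail, while pos does not behave like slicing off the head.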
