Python Regex - Match and Start()

Let's say I need to find the word "water" in a string. This word cannot be part of another word, and it can't be preceded by punctuation (so I'm assuming it can only be preceded by a " " or sit at the beginning of the string). I need to return the index of the word's first char, "w". So I'm trying this code:

import re
s = re.search(r"(\A| )\bwater\b", "Need water")
print(s.start())  # This prints 4, the index of the char " " :(

Is it possible to ignore the (\A| ) part of the pattern so that s.start() always returns the index of the char "w"? Or am I thinking about this wrong?

You can use

(?<!\S)\bwater\b

See the regex demo.

Explanation:

  • (?<!\S) - a negative lookbehind that fails the match if there is a non-whitespace character immediately before the whole word water
  • \bwater\b - the whole word water

Here is a Python demo:

import re
s = re.search(r"(?<!\S)\bwater\b", "Need water") 
if s:
    print(s.start())
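
To make the difference concrete, here is a small sketch comparing the original consuming prefix with the lookbehind on the same input. The third variant (a capture group queried with start(1)) is not from the answer above; it is an alternative shown on the assumption that you might want to keep the prefix in the pattern.

```python
import re

text = "Need water"

# Consuming prefix: the space is part of the match, so start() points at it.
consuming = re.search(r"(\A| )\bwater\b", text)
print(consuming.start())   # 4

# Lookbehind: zero-width, so the match itself begins at the "w".
lookbehind = re.search(r"(?<!\S)\bwater\b", text)
print(lookbehind.start())  # 5

# Alternative (assumption, not from the answer): keep the prefix,
# capture the word, and ask for the start of that group.
grouped = re.search(r"(?:\A| )(\bwater\b)", text)
print(grouped.start(1))    # 5
```

Because a lookbehind only asserts what precedes the match without consuming it, no index arithmetic is needed afterwards.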

You don't need that "beginning of the string or space" check. You've already applied the word boundary check:

>>> s = re.search(r"\bwater\b", "Need water")
>>> s.start()
5
>>> s = re.search(r"\bwater\b", "water is needed")
>>> s.start()
0
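
It may be worth checking this against the question's "not preceded by punctuation" requirement. The sketch below uses a made-up input, "Need ,water", and suggests that \b on its own still matches there, whereas the (?<!\S) lookbehind does not.

```python
import re

text = "Need ,water"  # hypothetical input with punctuation right before the word

# \b only needs a word/non-word transition, and "," is a non-word character.
plain = re.search(r"\bwater\b", text)
print(plain.start())  # 6

# The lookbehind also rejects any non-whitespace character before the word.
strict = re.search(r"(?<!\S)\bwater\b", text)
print(strict)         # None
```

So the plain word-boundary pattern is enough when punctuation before the word is acceptable, and the lookbehind is the stricter choice when it is not.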

You don't even need regex. Just match the space and the word; that gives you the index of the space, but you want the first letter, so add 1.

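bigString = "I drink water"

if bigString.startswith("water"):
    # The word opens the string, so there is no leading space to match.
    print(0)
else:
    # find() returns the index of the space (or -1 if absent); add 1 to reach the "w".
    spaceIndex = bigString.find(" water")
    print(spaceIndex + 1 if spaceIndex != -1 else -1)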
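
A somewhat more defensive take on the same regex-free idea is sketched below. The helper name find_word and the rule that the word must be followed by a non-word character (or the end of the string) are assumptions for illustration, not part of the answer above.

```python
def find_word(text, word):
    """Return the index of the first standalone `word` in `text`, or -1.

    Hypothetical helper: "standalone" here means preceded by whitespace or
    the start of the string, and not followed by a letter, digit or "_".
    """
    start = text.find(word)
    while start != -1:
        end = start + len(word)
        # Preceded by whitespace or the start of the string.
        before_ok = start == 0 or text[start - 1].isspace()
        # Followed by the end of the string or a non-word character.
        after_ok = end == len(text) or not (text[end].isalnum() or text[end] == "_")
        if before_ok and after_ok:
            return start
        # Keep searching after this rejected occurrence.
        start = text.find(word, start + 1)
    return -1


print(find_word("I drink water", "water"))    # 8
print(find_word("waterfall water", "water"))  # 10
print(find_word("Need ,water", "water"))      # -1
```

Unlike a single find(" water") call, this loop skips occurrences embedded in longer words and reports -1 when nothing qualifies.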
