
Regex match within the same line only

A sample of the text I am trying to process with a regex in Python is shown below:

it is amamzing to look at the evening sky and the color
color of the sky is blue
color
sky color is blue

I am trying to find up to 3 words preceding the word color, but I only want to extract those words if they are on the same line as it.

The highlighted text is the output I am looking for:

it is amamzing to look at the evening sky and the color
color of the sky is blue
color
sky color is blue

The code I am using is:

((?:\S+\s+){0,4}\b(?=color)\b\s*)

A sample is available here:

https://regex101.com/r/Q61Hi7/1

This may be a duplicate question, but I couldn't find any answer that solves it.
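To see the problem concretely, the pattern from the question can be run against the sample text in Python. This is only a minimal sketch: the variable names are illustrative and do not come from the original post. It checks that at least one match spans a line break, which is the behaviour the question wants to avoid.

```python
import re

# Sample text from the question (the typo "amamzing" is kept as posted)
text = (
    "it is amamzing to look at the evening sky and the color\n"
    "color of the sky is blue\n"
    "color\n"
    "sky color is blue"
)

# Pattern from the question: the \s+ between words also matches "\n"
question_pattern = re.compile(r"((?:\S+\s+){0,4}\b(?=color)\b\s*)")

matches = question_pattern.findall(text)

# Any match containing a newline was assembled from more than one line
cross_line_matches = [m for m in matches if "\n" in m]
print(cross_line_matches)
```

A non-empty list here means the pattern has pulled in text from the previous line.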

Instead of using \s, which matches any whitespace (including newlines), use a literal space to match only spaces. You could add \t if you want to include tabs too.

((?:\S+ ){0,4}\b(?=color)\b\s*)
  • \n is included in \s, which is why your match reads across lines; you can use [ \t] (space + tab) instead
  • for up to 3 words, I'd use {1,3} to get 1, 2 or 3 words ({0,4} also allows zero words and 4 words)

This results in ((?:\S+[ \t]){1,3}\b(?=color)\b\s*) https://regex101.com/r/Q61Hi7/3
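The two points above can be checked quickly in Python. The sketch below first confirms that \s matches a newline while [ \t] does not, then runs the resulting pattern on the question's sample text. The variable names are illustrative assumptions, not part of the answer.

```python
import re

# Sample text from the question
text = (
    "it is amamzing to look at the evening sky and the color\n"
    "color of the sky is blue\n"
    "color\n"
    "sky color is blue"
)

# \s accepts a newline; the character class [ \t] does not
newline_matches_s = re.fullmatch(r"\s", "\n") is not None
newline_matches_space_tab = re.fullmatch(r"[ \t]", "\n") is not None
print(newline_matches_s, newline_matches_space_tab)
# → True False

# Resulting pattern from the answer: 1 to 3 words, separated by space or tab
pattern = re.compile(r"((?:\S+[ \t]){1,3}\b(?=color)\b\s*)")
matches = pattern.findall(text)
print(matches)
# → ['sky and the ', 'sky ']
```

Each match keeps its trailing space; calling str.strip() on the results removes it if that matters.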

Try (?:\S+ +){0,3}color

Explanation:

(?:...) - non-capturing group

\S+ - match 1+ non-whitespace characters (to match a word)

 + (a space followed by +) - match 1+ spaces (you can include other whitespace characters here, but don't use \s: it also matches the newline character, which would break your single-line requirement)

{0,3} - match the preceding pattern between 0 and 3 times

color - match color literally
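As a rough illustration of how this pattern behaves in Python, the sketch below runs it on the question's sample text. Because the pattern consumes the word color itself, every match includes it. The second pattern, with a capturing group, is a hypothetical variation that is not part of the answer and is shown only as one way to keep just the preceding words.

```python
import re

# Sample text from the question
text = (
    "it is amamzing to look at the evening sky and the color\n"
    "color of the sky is blue\n"
    "color\n"
    "sky color is blue"
)

# Suggested pattern: words are separated by literal spaces only
pattern = re.compile(r"(?:\S+ +){0,3}color")
full_matches = pattern.findall(text)
print(full_matches)
# → ['sky and the color', 'color', 'color', 'sky color']

# Hypothetical variation (not from the answer): capture only the preceding words,
# then drop empty captures and trailing spaces
words_only = re.compile(r"((?:\S+ +){0,3})color")
preceding = [m.strip() for m in words_only.findall(text) if m]
print(preceding)
# → ['sky and the', 'sky']
```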

Demo

You may use

\S+(?:[^\S\r\n]+\S+){0,3}(?=[^\S\r\n]+color\b)

See the regex demo and the Regulex graph.


Details

  • \S+ - 1+ non-whitespace chars
  • (?:[^\S\r\n]+\S+){0,3} - zero to three occurrences of
    • [^\S\r\n]+ - 1+ horizontal whitespace chars (assuming line endings are CR/LF)
    • \S+ - 1+ non-whitespace chars
  • (?=[^\S\r\n]+color\b) - immediately to the right of the current location, there must be 1+ horizontal whitespace chars and then the whole word color.
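The character class [^\S\r\n] is the least obvious part of the pattern, so here is a small check of what it accepts. It reads as "not a non-whitespace character, not CR, not LF", which leaves whitespace other than line breaks. The dictionary keys below are just illustrative labels.

```python
import re

# [^\S\r\n]: any whitespace character except carriage return and line feed
horizontal_ws = re.compile(r"[^\S\r\n]")

checks = {
    "space": horizontal_ws.fullmatch(" ") is not None,
    "tab": horizontal_ws.fullmatch("\t") is not None,
    "newline": horizontal_ws.fullmatch("\n") is not None,
    "carriage return": horizontal_ws.fullmatch("\r") is not None,
    "letter": horizontal_ws.fullmatch("a") is not None,
}
print(checks)
# → {'space': True, 'tab': True, 'newline': False, 'carriage return': False, 'letter': False}
```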

Python demo:

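import re

# One word, then up to three more words separated by horizontal whitespace only;
# the lookahead requires horizontal whitespace and the whole word "color" right after
rx = r"\S+(?:[^\S\r\n]+\S+){0,3}(?=[^\S\r\n]+color\b)"
s = "it is amamzing to look at the evening sky and the color\ncolor of the sky is blue\ncolor\nsky color is blue"
print(re.findall(rx, s))
# => ['evening sky and the', 'sky']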
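Note that this pattern matches one word plus up to three more, so it can return four words, as the first result shows. If at most three words are wanted, as the question states, one possible adjustment (not part of the original answer) is to lower the quantifier to {0,2}. The name rx_three is made up for this sketch.

```python
import re

s = "it is amamzing to look at the evening sky and the color\ncolor of the sky is blue\ncolor\nsky color is blue"

# Assumed adjustment: one word plus at most two more, i.e. up to three words in total
rx_three = r"\S+(?:[^\S\r\n]+\S+){0,2}(?=[^\S\r\n]+color\b)"

result = re.findall(rx_three, s)
print(result)
# => ['sky and the', 'sky']
```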
