Include following lines in a regex match

I'm trying to set up a regular expression in PHP so that it matches both single-line and multi-line entries. What I'm trying to do with the data below is match all of the entries that fall within the 1 p.m. hour (13:00 to 13:59).

[24-Jun-2021 09:15:24 America/New_York] One line data here
[24-Jun-2021 09:15:27 America/New_York] One line data here
[24-Jun-2021 09:15:31 America/New_York] One line data here
[24-Jun-2021 13:49:21 America/New_York] One line data here
[24-Jun-2021 13:49:27 America/New_York] One line data here
[24-Jun-2021 13:49:27 America/New_York] One line data here
[24-Jun-2021 13:49:28 America/New_York] Start of multi-line data here
multi-line data here
multi-line data here
multi-line data here
multi-line data here
multi-line data here
multi-line data here
end of multi-line data here
[24-Jun-2021 13:51:16 America/New_York] One line data here
[24-Jun-2021 14:51:25 America/New_York] One line data here

The regex I'm using (it could probably be written better, but my regex-fu is weak) is

/(\[24-Jun-2021\s13:.+)\n\[\d\d/mi

(Again, I'm specifically looking for that date/time, which is why I have the date and time hard-coded in the regex.) And the line of PHP I'm using is

preg_match_all('/(\[24-Jun-2021\s13:.+)\n\[\d\d/mi', $string, $array);

That pattern has no problem matching all of the entries that are only "One line data here". But what I need is for it to match everything in the 1 p.m. hour, including the entirety of the multi-line entry. I've tried a variety of different patterns, but they resulted in either no matches at all or still just the single-line matches. I've tried adding the s modifier to the end of the pattern (i.e., /mis), but all that does is match everything from the first 1 p.m. entry all the way down to the end of the string, and that's definitely not what I want.

I've been beating my head against the wall for several hours. I've been searching for something that might help, but I keep coming up empty. I'm hoping that someone knows how to do this.

Thanks,
Christoph

Using /s will make the dot match a newline. In the pattern /(\[24-Jun-2021\s13:.+?)\n\[\d\d/mis, the non-greedy .+? captures as few lines as possible in group 1, and the pattern then matches \n\[\d\d.

Because the match consumes the \n\[\d\d at the start of the following entry, that entry can no longer begin a match of its own, so the pattern will not match the "multi-line data here" part.
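To make the skipped entry concrete, here is a minimal sketch that runs the consuming pattern against an abbreviated copy of the log from the question. The variable names ($log, $matches) and the shortened sample are illustrative assumptions, not part of the original post.

```php
<?php
// Abbreviated sample of the log from the question (assumed "\n" line endings).
$log = implode("\n", [
    '[24-Jun-2021 09:15:31 America/New_York] One line data here',
    '[24-Jun-2021 13:49:21 America/New_York] One line data here',
    '[24-Jun-2021 13:49:27 America/New_York] One line data here',
    '[24-Jun-2021 13:49:27 America/New_York] One line data here',
    '[24-Jun-2021 13:49:28 America/New_York] Start of multi-line data here',
    'multi-line data here',
    'end of multi-line data here',
    '[24-Jun-2021 13:51:16 America/New_York] One line data here',
    '[24-Jun-2021 14:51:25 America/New_York] One line data here',
]);

// The trailing \n\[\d\d is part of each match, so it consumes the "[24"
// that the next entry would need in order to start its own match.
preg_match_all('/(\[24-Jun-2021\s13:.+?)\n\[\d\d/is', $log, $matches);

// Print what group 1 captured for each match.
foreach ($matches[1] as $i => $entry) {
    echo "#$i: $entry\n---\n";
}
```

On this sample, only three of the five 13:xx entries are captured in group 1, and the multi-line entry is not among them.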

You might overcome this issue using a positive lookahead (?=\n\[\d\d), which asserts that part on the next line instead of matching it.

\[24-Jun-2021\s13:.+?(?=\n\[\d\d)


The line in PHP (without the /m flag, as there are no anchors in the pattern):

preg_match_all('/\[24-Jun-2021\s13:.+?(?=\n\[\d\d)/is', $string, $array);
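As a rough check of the lookahead version, the sketch below applies it to an abbreviated copy of the question's log. The sample data and the variable names ($log, $matches) are assumptions made for illustration.

```php
<?php
// Abbreviated sample of the log from the question (assumed "\n" line endings).
$log = implode("\n", [
    '[24-Jun-2021 09:15:31 America/New_York] One line data here',
    '[24-Jun-2021 13:49:21 America/New_York] One line data here',
    '[24-Jun-2021 13:49:27 America/New_York] One line data here',
    '[24-Jun-2021 13:49:27 America/New_York] One line data here',
    '[24-Jun-2021 13:49:28 America/New_York] Start of multi-line data here',
    'multi-line data here',
    'end of multi-line data here',
    '[24-Jun-2021 13:51:16 America/New_York] One line data here',
    '[24-Jun-2021 14:51:25 America/New_York] One line data here',
]);

// The lookahead only checks for "\n[" plus two digits; it does not consume
// them, so the next entry is still available to start the next match.
preg_match_all('/\[24-Jun-2021\s13:.+?(?=\n\[\d\d)/is', $log, $matches);

// Each full match is one entry; the multi-line entry spans three lines.
foreach ($matches[0] as $i => $entry) {
    echo "#$i: $entry\n---\n";
}
```

One thing to keep in mind with this approach: the lookahead requires another entry to follow, so a 13:xx entry that is the very last one in the string would not be matched.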

As you want to start the match with the specific date, followed by all lines that do not start with a date pattern (and using a non-greedy quantifier .*? increases backtracking), I would suggest a pattern that matches all following lines that do not start with \[\d\d.

As the specific date is at the start of a line, you can add an anchor ^ and the /m flag.

^\[24-Jun-2021\s13:.+(?:\R(?!\[\d\d).*)*

The pattern matches:

  • ^ Start of a line (because of the /m flag)
  • \[24-Jun-2021\s13:.+ Match the specific date and the rest of the line
  • (?: Non-capture group
    • \R(?!\[\d\d) Match a newline and assert that what is directly to the right is not [ followed by 2 digits
    • .* If that is the case, match the whole line
  • )* Close the non-capture group, and optionally repeat it to match all lines


The line in PHP (without the /s flag):

preg_match_all('/^\[24-Jun-2021\s13:.+(?:\R(?!\[\d\d).*)*/im', $string, $array);
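For illustration, here is a small sketch of the anchored pattern run against an abbreviated copy of the question's log. The sample data and variable names ($log, $matches) are assumptions, not taken from the original post.

```php
<?php
// Abbreviated sample of the log from the question (assumed "\n" line endings).
$log = implode("\n", [
    '[24-Jun-2021 09:15:31 America/New_York] One line data here',
    '[24-Jun-2021 13:49:21 America/New_York] One line data here',
    '[24-Jun-2021 13:49:27 America/New_York] One line data here',
    '[24-Jun-2021 13:49:27 America/New_York] One line data here',
    '[24-Jun-2021 13:49:28 America/New_York] Start of multi-line data here',
    'multi-line data here',
    'end of multi-line data here',
    '[24-Jun-2021 13:51:16 America/New_York] One line data here',
    '[24-Jun-2021 14:51:25 America/New_York] One line data here',
]);

// Without /s the dot stops at a newline, so .+ takes only the first line.
// The repeated group then adds each following line that does not begin
// with "[" and two digits, i.e. each continuation line of the entry.
preg_match_all('/^\[24-Jun-2021\s13:.+(?:\R(?!\[\d\d).*)*/im', $log, $matches);

// Each full match is one complete entry.
foreach ($matches[0] as $i => $entry) {
    echo "#$i: $entry\n---\n";
}
```

On this sample it yields the five 13:xx entries, with the multi-line entry returned as a single match. Since nothing has to follow the matched lines, it should also pick up a 13:xx entry that is the last one in the string.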
