Regular expression: match all until a certain word (PHP)

I'm processing a file with PHP.

This file contains a few blocks, which always start with the word "Step" (step 1, step 2, etc.) and always end with "end step". Within a block there can be newlines, but never two consecutively.

I'm trying to build a regex that will turn this into an array.

What I have so far is

preg_match_all("/Step([^\"end step\"]*)/s", $content, $matches);

The /s at the end of the pattern is there to allow newlines to be included too. But of course this does not work, since every character of "end step" is excluded individually, not only when they appear together as one phrase. How can I write the correct regex?
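To make the failure concrete, here is a minimal sketch of what the character class does. The sample content is hypothetical; the real file layout is an assumption.

```php
<?php
// Hypothetical sample input (an assumption, not the asker's real file).
$content = "Step 1\nopen the door\nend step\n";

// The pattern from the question. The negated class excludes each of the
// characters ", e, n, d, space, s, t and p individually.
preg_match_all("/Step([^\"end step\"]*)/s", $content, $matches);

// The character right after "Step" is a space, which the class excludes,
// so the capture group matches nothing at all.
var_dump($matches[1][0]); // string(0) ""
```

A negated class can only reject single characters, so it cannot express "anything except this phrase".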

One simple way:

preg_match_all('/Step(.*?)"end step"/s', $content, $matches);

This matches any text from Step to the nearest "end step". However, the lazy quantifier has to test whether the rest of the pattern matches after every single character it consumes, which could be slow.
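A minimal sketch of the lazy pattern applied to sample data. The sample content is hypothetical, and it assumes the end marker appears in the file without surrounding double quotes, so the literal quotes are dropped from the pattern here; keep them if the file really contains them.

```php
<?php
// Hypothetical sample input: two blocks separated by a blank line.
$content = "Step 1\nopen the door\nend step\n\nStep 2\nwalk in\nend step\n";

// Lazy match from "Step" to the nearest "end step".
// The /s modifier lets "." match newlines as well.
preg_match_all('/Step(.*?)end step/s', $content, $matches);

// $matches[1] holds the captured text of each block; trim() removes the
// whitespace left around each capture.
$blocks = array_map('trim', $matches[1]);

echo count($blocks), "\n"; // 2
```

Here $blocks is the array the question asks for, with one entry per block.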

Slightly more explicit and possibly more efficient:

preg_match_all('/Step((?:(?!"end step").)*)/s', $content, $matches);

This matches all the text from Step up to, but not including, the nearest "end step". It will match until the end of the string if "end step" never comes. The regex looks ahead at every character position to check whether the string "end step" could be matched there, and ends the match if it could.
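The run-to-end behaviour can be seen in a small sketch. The sample content is hypothetical and again assumes the marker appears without literal double quotes. The second pattern, which appends the terminator, is one possible variation and not part of the answer above.

```php
<?php
// Hypothetical sample input: the second block is never closed.
$content = "Step 1\nopen the door\nend step\n\nStep 2\nnever closed\n";

// Lookahead version: consume one character at a time, as long as
// "end step" does not start at the current position.
preg_match_all('/Step((?:(?!end step).)*)/s', $content, $open);
$all = array_map('trim', $open[1]);

// Variation (an assumption): require the terminator after the group,
// so that unterminated blocks are skipped.
preg_match_all('/Step((?:(?!end step).)*)end step/s', $content, $closed);
$complete = array_map('trim', $closed[1]);

echo count($all), "\n";      // 2, the unterminated block runs to the end
echo count($complete), "\n"; // 1, only the properly closed block
```

Whether an unterminated block should be kept or skipped depends on how strictly the file format is meant to be enforced.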
