Regex matching multiple lines multiple times

I have a string like this:

Name: John Doe

Age: 23

Primary Language: English

Description: This is a multiline
description field that I want 
to capture

Country: Canada

That's not the actual data, but you can see what I'm trying to do. I want to use regex to get an array of the "key" fields (Name, Age, Primary Language, Description, Country) and their values.

I'm using PHP.

My current attempt is this, but it doesn't work:

preg_match( '/^(.*?\:) (.*?)(\n.*?\:)/ism', $text, $matches );

Here's one solution: http://rubular.com/r/uDgXcIvhac .

    \s*([^:]+?)\s*:\s*(.*(?:\s*(?!.*:).*)*)\s*

Note that I've used a negative lookahead assertion, (?!.*:) . It lets you check that the next line doesn't look like a new field and, at the same time, continue where you left off, because the check itself consumes no input. (This is why lookaheads and lookbehinds are known as zero-width assertions.)

EDIT: Removed bit about arbitrary-width lookaheads; I was mistaken. The above solution is fine.
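
For reference, here is a minimal sketch of how the same lookahead idea might be applied in PHP with preg_match_all. The pattern below is a tightened variant written for this example, not the exact pattern above, and the names $pattern and $fields are only illustrative. It assumes every key starts at the beginning of a line and that continuation lines of a value never contain a colon.

```php
<?php
// Sample input modeled on the data in the question.
$text = <<<TEXT
Name: John Doe
Age: 23
Primary Language: English
Description: This is a multiline
description field that I want
to capture
Country: Canada
TEXT;

// Group 1: the key, from the start of a line up to the first colon.
// Group 2: the rest of that line, plus any following lines that do not
// contain a colon (checked by the negative lookahead before consuming them).
$pattern = '/^([^:\n]+):[ \t]*(.*(?:\n(?![^:\n]*:).*)*)/m';

// PREG_SET_ORDER yields one array per field: [full match, key, value].
preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

$fields = [];
foreach ($matches as $m) {
    $fields[trim($m[1])] = trim($m[2]);
}

print_r($fields);
```

With the sample input, $fields should end up with five entries, and the Description entry keeps its three lines joined by newlines. A value whose continuation line contains a colon (a time such as 10:30, say) would be mistaken for a new field, so this only fits data shaped like the example.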

Would PHP's strtok help you? You could use it with ":" as the delimiter/token and trim leading and trailing spaces to remove the unwanted newlines.

http://php.net/manual/en/function.strtok.php
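
A rough sketch of what that could look like, under some assumptions of mine: each token after the first holds a value followed by the next key on its last line, so the token has to be split at its last newline. The variable names are only illustrative, and the approach assumes values never contain a colon.

```php
<?php
$text = "Name: John Doe\nAge: 23\nDescription: This is a multiline\n"
      . "description field\nCountry: Canada";

// Collect every colon-delimited token first. strtok() remembers its position
// between calls, so the later calls pass only the delimiter.
$tokens = [];
for ($tok = strtok($text, ':'); $tok !== false; $tok = strtok(':')) {
    $tokens[] = $tok;
}

$fields = [];
// The text before the first colon is the first key.
$key = trim((string) array_shift($tokens));
$last = count($tokens) - 1;

foreach ($tokens as $i => $tok) {
    $pos = strrpos($tok, "\n");
    if ($i === $last || $pos === false) {
        // Final token (or no line break at all): the whole token is the value.
        $fields[$key] = trim($tok);
        continue;
    }
    // Otherwise everything up to the last newline is the value,
    // and the last line is the key of the next field.
    $fields[$key] = trim(substr($tok, 0, $pos));
    $key = trim(substr($tok, $pos + 1));
}

print_r($fields);
```

Note that strtok treats consecutive delimiters as one and skips empty tokens, so a field with an empty value at the very end of the input would be dropped by this sketch.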
