
PHP RegEx: How to extract certain words from a sentence?

I have some wikitext like this:

Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.

I am trying to write a regex that fetches only David Fincher and Jim Uhls; both names will vary depending on the URL. I came up with the following regex, and it does work (after replacing the unwanted text), but is there a better way?

/(Directed by)([\w\s]+). (Written by)([\w\s]+). /g

(?:Directed|Written)\s*by\s+ matches "Directed by" or "Written by", followed by whitespace.

\K resets the start of the reported match, so everything matched before it is dropped from the result.

[^.]+ matches every character up to, but not including, the next dot.

Regex: /(?:Directed|Written)\s*by\s+\K[^.]+/g


<?php

ini_set('display_errors', 1);
$string = 'Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.';
// \K drops the "Directed by " / "Written by " prefix, so $matches[0] holds only the names.
preg_match_all('/(?:Directed|Written)\s*by\s+\K[^.]+/', $string, $matches);
print_r($matches);

Output:

Array
(
    [0] => Array
        (
            [0] => David Fincher
            [1] => Jim Uhls
        )

)
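
For comparison, here is a minimal sketch of the same extraction without \K, assuming a plain capturing group is acceptable. The variable name $names is only illustrative. The names then land in $matches[1] rather than $matches[0]:

```php
<?php

$string = 'Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.';

// Same prefix as above, but the name is wrapped in a capturing group instead of using \K.
preg_match_all('/(?:Directed|Written)\s*by\s+([^.]+)/', $string, $matches);

// $matches[0] holds the full matches (prefix included); $matches[1] holds only the names.
$names = $matches[1];
print_r($names);
```

The \K version keeps the result in a single array, while the capturing-group version also works in regex flavours that do not support \K.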

Here's the cleanest I can think of:

<?php

// https://regex101.com/r/dxft9p/1

$string = 'Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.';
$regex = '#Directed by (?<director>\w+\s?\w+)\. Written by (?<author>\w+\s?\w+)#';
preg_match($regex, $string, $matches);

echo 'Director - '.$matches['director']."\n";
echo 'Author - '.$matches['author'];

See here for a working example https://3v4l.org/AuDL0

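When you write (?<somelabel> ... ) inside the pattern, you create a named capture group, so the captured text is available as $matches['somelabel'] instead of only by numeric index. Very handy!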
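
Named groups can also be combined with preg_match_all when the order of the credits is not known in advance. The following is only a sketch under that assumption: the helper extractCredits and the group names role and name are hypothetical, not part of either answer above.

```php
<?php

// Hypothetical helper: maps each role keyword ("Directed", "Written") to the name after it.
function extractCredits(string $text): array
{
    $credits = [];
    $pattern = '/(?<role>Directed|Written) by (?<name>[^.]+)\./';

    // PREG_SET_ORDER yields one array per match, with the named groups as keys.
    if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $credits[$match['role']] = trim($match['name']);
        }
    }

    return $credits;
}

$string = 'Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.';
print_r(extractCredits($string));
```

Because each match is handled separately, the function returns whatever credits it finds, and an empty array if the text contains none, rather than requiring both sentences to appear in a fixed order.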
