
PHP RegEx: How to extract certain words from a sentence?

I have a Wikitext like:

Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.

I am trying to write a regex that fetches only David Fincher and Jim Uhls; both names will vary depending on the URL. I made the following regex and it does work (after replacing the unwanted text). Is there a better way?

/(Directed by)([\w\s]+). (Written by)([\w\s]+). /g

(?:Directed|Written)\s*by\s matches "Directed by" or "Written by".

\K discards everything matched so far, so the reported match starts after it.

[^.]+ matches everything up to, but not including, the next dot.

Regex: /(?:Directed|Written)\s*by\s+\K[^.]+/g

Regex demo

<?php

ini_set('display_errors', 1);
$string='Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.';
preg_match_all("/(?:Directed|Written)\s*by\s+\K[^.]+/", $string,$matches);
print_r($matches);

Output:

Array
(
    [0] => Array
        (
            [0] => David Fincher
            [1] => Jim Uhls
        )

)
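
To see what \K contributes, here is a small sketch that runs the same pattern without \K, with \K, and with an ordinary capture group instead. The variable names ($withPrefix, $withK, $withGroup) are only illustrative and do not come from the answer above.

```php
<?php

$string = 'Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.';

// Without \K the "Directed by" / "Written by" prefix is part of the reported match.
preg_match_all('/(?:Directed|Written)\s*by\s+[^.]+/', $string, $withPrefix);

// With \K everything matched before it is dropped from the reported match.
preg_match_all('/(?:Directed|Written)\s*by\s+\K[^.]+/', $string, $withK);

// Alternative without \K: capture the name and read capture group 1.
preg_match_all('/(?:Directed|Written)\s*by\s+([^.]+)/', $string, $withGroup);

print_r($withPrefix[0]); // full matches, prefix included
print_r($withK[0]);      // names only
print_r($withGroup[1]);  // names only, taken from capture group 1
```

The second and third calls should give the same two names; which one to use is mostly a matter of taste, although the capture-group form also works in regex engines that do not support \K.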

Here's the cleanest I can think of:

<?php

// https://regex101.com/r/dxft9p/1

$string = 'Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.';
$regex = '#Directed by (?<director>\w+\s?\w+)\. Written by (?<author>\w+\s?\w+)#';
preg_match($regex, $string, $matches);

echo 'Director - '.$matches['director']."\n";
echo 'Author - '.$matches['author'];

See here for a working example: https://3v4l.org/AuDL0

When you write (?<somelabel> blah) inside the parentheses, you create a named capture group. Very handy!
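
Named groups can also be combined with preg_match_all. The following is a minimal sketch, assuming you want to keep track of which role each name belongs to and that a name may contain more than two words. The group names role and name and the $credits array are my own choices for illustration, not part of the answers above.

```php
<?php

$string = 'Directed by David Fincher. Written by Jim Uhls. Based on the novel by Chuck Palahniuk.';

// "role" and "name" are illustrative group names (an assumption, not from the original answers).
// [^.]+ takes everything up to the next dot, so names of any length are accepted.
$regex = '/(?<role>Directed|Written)\s+by\s+(?<name>[^.]+)/';

$credits = [];

// PREG_SET_ORDER returns one array per match, so each role stays paired with its name.
if (preg_match_all($regex, $string, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $match) {
        $credits[$match['role']] = trim($match['name']);
    }
}

print_r($credits);
```

Checking the return value of preg_match_all before reading the groups avoids undefined-index notices when a page has no "Directed by" or "Written by" sentence.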


 