Matching words from a file using regex in PHP

Question

I'm new to regex and PHP. I know this is quite simple, but I just can't get it. I have a file words.txt that contains:

happy
sad
laugh

I want to match this sentence against my words.txt:

I am happy

So far I've tried this, but it isn't valid because it reads the input as a whole sentence rather than as words (I haven't implemented the regex yet because I'm confused):

$input0 = "I am happy";
$handle = fopen('words.txt', 'r');
$valid = false;
while (($buffer = fgets($handle)) !== false) {
    if (strpos($buffer, $input0) !== false) { // here's the problem
        $valid = true;
        break;
    }
}
if ($valid == true) {
    // print the matched word
}
fclose($handle);

Can you help me? :(

Answer 1

Depending on your final goal, you may not even need a regexp here, since you want to match entire words with no variable part.

If you want to loop over your keywords, a simple str_replace() would do the job, for instance to replace each word with an emphasized one. A simple if (strpos($input0, $word) !== false) is enough to check whether a word is found in the sentence, and strpos() also gives you its position.
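The loop-based approach could be sketched as follows. This is only a minimal sketch: the helper name findWordsInSentence is hypothetical, and the word list is inlined (assuming one word per line in words.txt) so that the snippet runs without the file.

```php
<?php
// Hypothetical helper: returns the entries of $words that occur in $sentence.
function findWordsInSentence(string $sentence, array $words): array
{
    $found = [];
    foreach ($words as $word) {
        // Skip empty entries, then look for the word anywhere in the sentence.
        if ($word !== '' && strpos($sentence, $word) !== false) {
            $found[] = $word;
        }
    }
    return $found;
}

// With the real file, the list could be loaded with:
// $words = file('words.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$words = ['happy', 'sad', 'laugh'];

$found = findWordsInSentence('I am happy', $words);
print_r($found);
```

Note that strpos() looks for substrings, so "happy" would also be found inside "unhappy", and the comparison is case-sensitive.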

But if you want to avoid a loop, for faster results and especially if you have many words, preg_match_all() will do what you need, as said by Zanderwar. Here is an example:

$input0 = "I am happy but sometimes quite pretty sad. It depends but I prefer to be happy in general.\nMy paragraph also continue on multilines\nend it makes me laugh and rejoy. I am so happy. HAPPY?";
// $contents = file_get_contents('words.txt');
$contents = "happy\nsad\nlaugh";

// Turn the one-word-per-line list into an alternation: happy|sad|laugh.
// trim() drops a trailing newline, which would otherwise add an empty alternative.
$words_list = str_replace("\n", '|', trim($contents));

if (preg_match_all("~($words_list)~si", $input0, $matches))
{
    print_r($matches);
    // Do what you want
}

The i flag makes the match case-insensitive, if you need that.

The s flag makes . also match newline characters, which matters for multiline content. This pattern contains no ., so the flag is optional here.

[EDIT] to add more details on regexp

In the pattern you need a delimiter. It can be ~, because that character is very seldom used in the sentences and strings to match, so you won't need to escape / as you would with the / delimiter.

I am also joining your words like ~(sad|joy|happy)~, which is what you want if you need to capture the words. If you don't, use a non-capturing group like (?:sad|joy|happy).

The | means or.
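Because the words come from a file, they might contain characters that are special in a pattern, or the delimiter itself. A minimal sketch of guarding against that with preg_quote(), whose second argument also escapes the delimiter; the sample words and the sentence are invented for illustration.

```php
<?php
// Sample words, invented for illustration: "c++" contains regex
// metacharacters and "a~b" contains the ~ delimiter.
$words = ['happy', 'c++', 'a~b'];

// Escape each word, then join the escaped words with | (or).
$quoted = array_map(function ($word) {
    return preg_quote($word, '~');
}, $words);
$words_list = implode('|', $quoted);

$count = preg_match_all("~($words_list)~i", 'I am happy writing C++ and a~b', $matches);
print_r($matches[0]);
```

Without the escaping, the ~ inside a word would end the pattern early and the + signs would be read as quantifiers.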

You can try replacing the regex ~($words_list)~si with ~(?:$words_list)~si if you don't need capturing, and you don't. You will then have only one level of captures in the $matches array; position [0] is always the full match. Here you don't have more complex patterns to match, so there is no need to capture.
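The difference in the shape of $matches can be seen with a short comparison. This is a sketch with a shortened input sentence; the variable names $capturing and $nonCapturing are only illustrative.

```php
<?php
$input0 = "I am happy but sometimes sad. HAPPY?";
$words_list = "happy|sad|laugh";

// Capturing group: two levels, [0] for the full matches and [1] for group 1.
preg_match_all("~($words_list)~si", $input0, $capturing);

// Non-capturing group: only level [0], the full matches.
preg_match_all("~(?:$words_list)~si", $input0, $nonCapturing);

print_r($capturing);
print_r($nonCapturing);
```

In both cases the matched words are available in position [0]; the capturing version simply repeats them in position [1].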

Answer 1 (score 2, accepted, 2016-06-23 04:19:26)
