Match words in a file with regex in PHP
I'm new to regex and PHP. I know this is quite simple, but I just can't get it. I have a file, words.txt, that contains:
happy
sad
laugh
I want to match this sentence against the words in words.txt:
I am happy
So far I've tried this, but it doesn't work because it treats the input as a whole sentence rather than as individual words. (I haven't used a regex yet because I'm confused.)
$input0 = "I am happy";
$handle = fopen('words.txt', 'r');
$valid = false;
while (($buffer = fgets($handle)) !== false) {
    if (strpos($buffer, $input0) !== false) { // here's the problem
        $valid = TRUE;
        break;
    }
}
if ($valid == TRUE) {
    // print the matched word
}
fclose($handle);
Can you help me? :(
Depending on your final goal, you may not even need a regexp here, since you want to match whole words with no variable part.
If you want to loop over your keywords, a simple str_replace() would do the job, for instance to replace each word with an emphasized version, or a simple if (strpos($input0, $word) !== false) to just check whether the word occurs in the sentence and get its position.
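A minimal sketch of the loop approach, assuming the word list has already been read into an array. The inline $words array stands in for words.txt, and $found is a name introduced here for illustration. Keep in mind that strpos() is case-sensitive and matches substrings, so "happy" would also be found inside "unhappy"; stripos() is the case-insensitive variant.

```php
<?php
$input0 = "I am happy";

// Inline stand-in for the contents of words.txt (one word per line).
$words = ["happy", "sad", "laugh"];

// Map each word found in the sentence to its byte offset.
$found = [];
foreach ($words as $word) {
    $pos = strpos($input0, $word);
    if ($pos !== false) {   // strict check: offset 0 is a valid match
        $found[$word] = $pos;
    }
}

print_r($found);
```

The strict `!== false` comparison matters because a word at the very start of the sentence has position 0, which a loose comparison would treat as "not found".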
But if you want to avoid a loop, for faster results and especially if you have many words, preg_match_all() will do what you need, as Zanderwar said. Here is an example:
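$input0 = "I am happy but sometimes quite pretty sad. It depends but I prefer to be happy in general.\nMy paragraph also continue on multilines\nend it makes me laugh and rejoy. I am so happy. HAPPY?";
// $contents = file_get_contents('words.txt');
$contents = "happy\nsad\nlaugh";
// Split on any line ending and drop empty lines, so a trailing newline
// in the file does not produce an empty alternative that matches everywhere.
$words = preg_split('~\R~', $contents, -1, PREG_SPLIT_NO_EMPTY);
// Escape regex metacharacters (and the ~ delimiter) in each word, then join with |.
$words_list = implode('|', array_map(function ($w) { return preg_quote($w, '~'); }, $words));
if (preg_match_all("~($words_list)~si", $input0, $matches))
{
    print_r($matches);
    // Do what you want
}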
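If only whole words should count, one possible variation wraps the alternation in \b word boundaries so that "happy" is not reported inside "unhappy". This is a sketch under assumptions rather than a definitive version: it writes a temporary stand-in for words.txt so that it runs on its own, and $path, $alternatives and $found are illustrative names.

```php
<?php
// Hypothetical stand-in for words.txt, written to a temp file so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'words');
file_put_contents($path, "happy\nsad\nlaugh\n");

// Read one word per line, without line endings and without empty lines.
$words = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
unlink($path);

// Escape each word for use inside a ~-delimited pattern, then join with |.
$alternatives = implode('|', array_map(function ($w) {
    return preg_quote($w, '~');
}, $words));

$input0 = "I am happy, not unhappy";

// \b on both sides requires a word boundary, so "unhappy" does not match.
preg_match_all("~\\b(?:$alternatives)\\b~i", $input0, $matches);

// Lowercase and de-duplicate the matched words.
$found = array_values(array_unique(array_map('strtolower', $matches[0])));

print_r($found);
```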
The i flag makes the match case-insensitive, if you need that.
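The s flag (PCRE_DOTALL) makes . also match newline characters. The pattern here contains no ., so the flag has no effect and can be dropped; a subject that spans several lines is matched either way.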
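To see what the two flags actually change, here is a small sketch with made-up inputs ($text and the variable names are illustrative). It counts matches with and without i, and then shows that s only matters when the pattern contains a dot.

```php
<?php
$text = "I am so happy. HAPPY?";

// preg_match_all() returns the number of full matches.
$caseSensitive   = preg_match_all('~happy~', $text, $m1);   // only the lowercase occurrence
$caseInsensitive = preg_match_all('~happy~i', $text, $m2);  // both occurrences

// The s flag changes how "." behaves: without it, "." does not match a newline.
$twoLines   = "happy\nsad";
$withoutS   = preg_match('~happy.sad~', $twoLines);   // "." cannot cross the newline
$withS      = preg_match('~happy.sad~s', $twoLines);  // "." now matches the newline

// A plain alternation finds words on any line, with or without s.
$alternation = preg_match_all('~happy|sad~', $twoLines, $m3);

var_dump($caseSensitive, $caseInsensitive, $withoutS, $withS, $alternation);
```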
[EDIT] Adding more details on the regexp:
In the pattern you need a delimiter, which can be ~ because it is very seldom used in the sentences and strings you match, so you won't need to escape / as you would when using / as the delimiter.
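A short sketch of why the delimiter choice matters, using a made-up string that contains slashes (the URL and the word "a~b" are purely illustrative). It also shows preg_quote() with its second argument, which escapes the delimiter in case a word from the file happens to contain it.

```php
<?php
$subject = "see http://example.com/docs/index";

// With "/" as the delimiter, every literal slash in the pattern must be escaped.
$slashDelimiter = preg_match('/\/docs\//', $subject);

// With "~" as the delimiter, slashes can be written as-is.
$tildeDelimiter = preg_match('~/docs/~', $subject);

// If a word may contain the delimiter itself, preg_quote() can escape it too.
$word = "a~b"; // hypothetical word containing the delimiter
$pattern = '~' . preg_quote($word, '~') . '~';
$quoted = preg_match($pattern, "x a~b y");

var_dump($slashDelimiter, $tildeDelimiter, $pattern, $quoted);
```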
I am also joining your words into a pattern like ~(sad|joy|happy)~ if you want to capture the words. If you don't, you need a non-capturing group like (?:sad|joy|happy).
The | means "or".
You can replace the regex ~($words_list)~si with ~(?:$words_list)~si if you don't need capturing, and here you don't. You will then have only one level in the $matches array: position [0], which always holds the full matches. Here you don't have more complex patterns to match, so there is no need to capture.
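The difference in the shape of $matches can be checked with a small sketch (the sentence and the variable names $capturing and $nonCapturing are illustrative; the default PREG_PATTERN_ORDER is assumed):

```php
<?php
$text = "happy and sad";

// Capturing group: $capturing[0] holds the full matches, $capturing[1] the group's captures.
preg_match_all('~(happy|sad)~', $text, $capturing);

// Non-capturing group: only $nonCapturing[0] (the full matches) is filled.
preg_match_all('~(?:happy|sad)~', $text, $nonCapturing);

print_r($capturing);
print_r($nonCapturing);
```

With a single group wrapping the whole pattern, [0] and [1] carry the same strings, which is why the capturing version adds nothing in this case.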