PHP: How to extract part of a given string?

I'm writing a search engine for my site and need to extract chunks of text that contain a given keyword plus a few words around it, for the search result list. I ended up with something like this:


/**
 * This function returns part of the original text with
 * the searched term and a few words around the searched term
 * @param string $text Original text
 * @param string $word Searched term
 * @param int $maxChunks Number of chunks returned
 * @param int $wordsAround Number of words before and after searched term
 */
public static function searchTerm($text, $word=null, $maxChunks=3, $wordsAround=3) {
        $word = trim($word);
        if(empty($word)) {
            return NULL;
        }
        $words = explode(' ', $word); // extract single words from searched phrase
        $text  = strip_tags($text);  // clean up the text
        $whack = array(); // chunk buffer
        $cycle = 0; // successful matches counter
        foreach($words as $word) {
            $match = array();
            // there are named parameters 'pre', 'term' and 'pos'
            if(preg_match("/(?P<pre>\w+){0,$wordsAround} (?P<term>$word) (?P<pos>\w+){0,$wordsAround}/", $text, $match)) {
                $cycle++;
                $whack[] = $match['pre'] . ' ' . $word . ' ' . $match['pos'];
                if($cycle == $maxChunks) break;
            }
        }
        return implode(' | ', $whack);
    }
This function does not work, but you can see the basic idea. Any suggestions on how to improve the regular expression are welcome!

Never, never inject user content into the pattern of a regular expression without first sanitizing the input with preg_quote:

http://us3.php.net/manual/en/function.preg-quote.php
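
To illustrate the advice above, here is a minimal sketch of how the search word could be escaped with preg_quote before it is placed in the pattern. The function name extractChunks and the exact pattern are assumptions made for illustration, not the asker's final code. The pattern also groups each surrounding word together with its separator, which is one possible way to capture "a few words around" the term.

```php
<?php
/**
 * Hypothetical helper (name and pattern are illustrative assumptions):
 * returns up to $maxChunks snippets, each containing one search word
 * with at most $wordsAround words on either side.
 */
function extractChunks(string $text, string $phrase, int $maxChunks = 3, int $wordsAround = 3): string
{
    $phrase = trim($phrase);
    if ($phrase === '') {
        return '';
    }

    $text   = strip_tags($text);   // drop HTML markup before matching
    $n      = max(0, $wordsAround);
    $chunks = [];

    foreach (preg_split('/\s+/', $phrase) as $word) {
        // Escape regex metacharacters and the "/" delimiter in user input.
        $quoted = preg_quote($word, '/');

        // pre:  up to $n words, each followed by its separator
        // term: the escaped search word
        // pos:  up to $n words, each preceded by its separator
        // i = case-insensitive, u = treat pattern and subject as UTF-8
        $pattern = "/(?P<pre>(?:\\w+\\W+){0,$n})(?P<term>$quoted)(?P<pos>(?:\\W+\\w+){0,$n})/iu";

        if (preg_match($pattern, $text, $match)) {
            $chunks[] = trim($match['pre'] . $match['term'] . $match['pos']);
            if (count($chunks) >= $maxChunks) {
                break;
            }
        }
    }

    return implode(' | ', $chunks);
}
```

With this sketch, a search for "a.b" no longer matches "axb", because the dot is escaped and treated as a literal character. Note that the term can still match inside a longer word; adding word boundaries is a possible refinement if that is not wanted.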

Why reinvent the wheel here? Doesn't Google have the best search engine? I would look at their appliance.

