
How to implement search on a static site?

There is an old static site with 50 HTML pages. The question is: how can I implement a reasonably fast search? Which way should I look? I wrote a PHP script that simply searches for text in the files, but it works very slowly. Are there methods for indexing pages or something like that?

<?php

ini_set('max_execution_time', 900);

if(!isset($_GET['s'])) {
    die('You must define a search term!');
}

$search_in = array('html', 'htm');
$search_dir = '.';
$countWords = 15;

$files = list_files($search_dir);
$search_results = array();
foreach($files as $file){
    $contents = file_get_contents($file);
    preg_match_all("/\<p\>(.*)".$_GET['s']."(.*)\<\/p\>/i", $contents, $matches, PREG_SET_ORDER);
    foreach($matches as $match){
        $match[1] = trim_result($match[1]);
        $match[2] = trim_result($match[2], true);
        $match[1] .= '<span style="background: #ffff00;">';
        $match[2] = '</span>'.$match[2];

        preg_match("/\<title\>(.*)\<\/title\>/", $contents, $matches2);
        $search_results[] = array($file, $match[1].$_GET['s'].$match[2], $matches2[1]);
    }
}

?>

    <html>
    <head>
        <title>Search results</title>
    </head>
    <body>
    <?php foreach($search_results as $result) :?>
        <div>
            <h3><a href="<?php echo $result[0]; ?>"><?php echo $result[2]; ?></a></h3>
            <p><?php echo $result[1]; ?></p>
        </div>
    <?php endforeach; ?>
    </body>
    </html>

<?php
function list_files($dir){
    global $search_in;

    $result = array();
    if(is_dir($dir)){
        if($dh = opendir($dir)){
            while (($file = readdir($dh)) !== false) {
                if(!($file == '.' || $file == '..')){
                    $file = $dir.'/'.$file;
                    if(is_dir($file) && $file != './.' && $file != './..'){
                        $result = array_merge($result, list_files($file));
                    }
                    else if(!is_dir($file)){
                        if(in_array(get_file_extension($file), $search_in)){
                            $result[] = $file;
                        }
                    }
                }
            }
            closedir($dh); // release the directory handle
        }
    }
    return $result;
}

function get_file_extension($filename){
    $result = '';
    $parts = explode('.', $filename);
    if(is_array($parts) && count($parts) > 1){
        $result = end($parts);
    }
    return $result;
}

function trim_result($text, $start = false){
    global $countWords; // defined at the top of the script; this declaration was missing
    $words = explode(' ', strip_tags($text)); // split() was removed in PHP 7
    if($start){
        $words = array_slice($words, 0, $countWords);
    }
    else{
        $start = count($words) - $countWords;
        $words = array_slice($words, ($start < 0 ? 0 : $start), $countWords);
    }
    return implode(' ', $words);
}

?>

The best way to speed up the search is:

1. Parse all files with a DOM parser and extract the content.

2. Write this content to an SQLite database (for only 50 pages you don't need MySQL).

3. Serve the live search with simple SQL WHERE statements, as in the sketch below.
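
A minimal sketch of that pipeline, assuming PHP's bundled DOMDocument and the PDO SQLite driver; the search.sqlite file and the pages table are illustrative names, and for brevity it only scans the top-level directory:

<?php
// Build step (run once, or whenever pages change): extract each page's text
// with DOMDocument and store it in SQLite via PDO.
$pdo = new PDO('sqlite:' . __DIR__ . '/search.sqlite');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('DROP TABLE IF EXISTS pages');
$pdo->exec('CREATE TABLE pages (url TEXT, title TEXT, content TEXT)');

$insert = $pdo->prepare('INSERT INTO pages (url, title, content) VALUES (?, ?, ?)');
foreach (glob('*.html') as $file) {
    $dom = new DOMDocument();
    @$dom->loadHTMLFile($file);                       // tolerate sloppy HTML
    $title = $dom->getElementsByTagName('title')->item(0);
    $body  = $dom->getElementsByTagName('body')->item(0);
    $insert->execute(array(
        $file,
        $title ? trim($title->textContent) : $file,
        $body ? trim(preg_replace('/\s+/', ' ', $body->textContent)) : '',
    ));
}

// Live search: a single prepared query replaces the per-request file scan.
$q = isset($_GET['s']) ? $_GET['s'] : die('You must define a search term!');
$stmt = $pdo->prepare('SELECT url, title FROM pages WHERE content LIKE ?');
$stmt->execute(array('%' . $q . '%'));
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    echo '<h3><a href="' . htmlspecialchars($row['url']) . '">'
       . htmlspecialchars($row['title']) . '</a></h3>';
}

With the index rebuilt on each deploy, each request costs one indexed query instead of regexing every file.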

This isn't something you're going to solve well with a script that searches at request time.

You're going to want to pre-parse the pages into something that can be searched through quickly.

A simple method would be to parse it all into a text or JSON file. You can then load that one file, search for your string, and handle the results accordingly.
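
A rough sketch of the JSON-file variant (index.json and its field names are made up for illustration): build the file once, then every search is one file read plus an in-memory scan:

<?php
// Build step: dump every page's title and visible text into one JSON file.
$index = array();
foreach (glob('*.html') as $file) {
    $dom = new DOMDocument();
    @$dom->loadHTMLFile($file);                       // tolerate sloppy HTML
    $title = $dom->getElementsByTagName('title')->item(0);
    $body  = $dom->getElementsByTagName('body')->item(0);
    $index[] = array(
        'url'   => $file,
        'title' => $title ? trim($title->textContent) : $file,
        'text'  => $body ? preg_replace('/\s+/', ' ', $body->textContent) : '',
    );
}
file_put_contents('index.json', json_encode($index));

// Search step: one file read, then a case-insensitive scan in memory.
$q = isset($_GET['s']) ? $_GET['s'] : die('You must define a search term!');
foreach (json_decode(file_get_contents('index.json'), true) as $page) {
    if (stripos($page['text'], $q) !== false) {
        echo '<h3><a href="' . htmlspecialchars($page['url']) . '">'
           . htmlspecialchars($page['title']) . '</a></h3>';
    }
}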

A more elegant method would be to use a SQL database (MySQL, SQLite, SQL Server, etc.) or a NoSQL database (Mongo, Cassandra, etc.) to store the extracted text and then run queries against it.

Probably the best solution, though, would be to use Solr for proper full-text search. It will give the best results (and a lot of fine tuning), but may be overkill for your needs.
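
For completeness, querying Solr from PHP can be one HTTP call to its select endpoint. This sketch assumes a local Solr instance with an already-populated core named pages, a content field in your schema, and allow_url_fopen enabled:

<?php
// Query Solr's HTTP JSON API; the core name "pages" and the "content"
// field are assumptions about your setup, not Solr defaults.
$q   = urlencode($_GET['s']);
$url = 'http://localhost:8983/solr/pages/select?q=content:' . $q . '&wt=json';

$response = json_decode(file_get_contents($url), true);
foreach ($response['response']['docs'] as $doc) {
    echo htmlspecialchars($doc['id']), "<br>\n"; // "id" is Solr's usual uniqueKey
}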
