
MySQL and PHP search script with word list and links

I'm working on a search form/script for my website.

To start, every word used in the content of my website is stored in a MySQL table called words, which looks something like this:

id |  word | title_count | content_count | article_count | photo_count | video_count |
---+-------+-------------+---------------+---------------+-------------+-------------+
 1 | hello |           3 |             1 |             0 |           1 |           0 |
 2 |  what |           1 |             4 |             1 |           0 |           0 |

The word and id fields are UNIQUE. The title_count and content_count fields store how many times the word was used in a title or in main content. The other _count fields just say how many times the word was used in an article/photo/video.

I'm not sure if all these count fields will be useful for a search function, but I thought they might come in handy.

Then I have multiple linking tables, one for each _count field of the words table, that look like this:

id |  word_id | 
---+----------+
43 |        2 |
 7 |        1 |
 7 |        2 |

These tables are called word_link_title, word_link_content, word_link_article, word_link_photo, word_link_video, etc. The id field stores the id of the article/photo/video it links to, and word_id stores the id of the linked word.

Now that I have all of that set up, I'm kind of stuck. I don't have a clear idea of how to rank relevant content based on all these numbers and the search terms.

I plan to build a search results page with multiple tabs: one that shows all the results of the search, and others that separate them into articles/photos/videos.

I have no idea if I'm actually on the right path to get something working. I hope someone can help me.

It all depends on what you want. If you want to suggest results, you might want to look into some well-known string similarity algorithms: Jaro-Winkler is good for short words and Levenshtein is good for short words in long text. You can also use PHP's similar_text function to refine the matches after running those algorithms.
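As a rough illustration of the suggestion idea, here is a minimal sketch that uses PHP's built-in levenshtein() to find close candidates and similar_text() to break ties. The function name suggestWord, the $maxDistance threshold of 2, and the sample word list are assumptions made up for this example; Jaro-Winkler is not built into PHP, so it is not shown.

```php
<?php
/**
 * Return the known word closest to $input, or null when no word is
 * within $maxDistance edits. Hypothetical helper, not from the original post.
 */
function suggestWord(string $input, array $knownWords, int $maxDistance = 2): ?string
{
    $best = null;
    $bestDistance = PHP_INT_MAX;
    $bestPercent = -1.0;

    foreach ($knownWords as $word) {
        // Edit distance: number of insertions, deletions and substitutions.
        $distance = levenshtein($input, $word);
        if ($distance > $maxDistance) {
            continue; // too different to be a useful suggestion
        }

        // similar_text() writes a similarity percentage into $percent.
        similar_text($input, $word, $percent);

        // Prefer the smaller distance; on a tie, prefer the higher percentage.
        if ($distance < $bestDistance
            || ($distance === $bestDistance && $percent > $bestPercent)) {
            $best = $word;
            $bestDistance = $distance;
            $bestPercent = $percent;
        }
    }

    return $best;
}

// Sample words taken from the table in the question.
var_dump(suggestWord('helo', ['hello', 'what'])); // string(5) "hello"
var_dump(suggestWord('xyz', ['hello', 'what']));  // NULL
```

The threshold keeps unrelated words from being offered as suggestions; it would probably need tuning against real search terms.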

I posted a bunch of search scripts here (Jaro-Winkler) and here when I built a search engine project, if you want to check it out.

As for the different counts, why not? You can use the number of occurrences to favor some keywords. But be careful which words you insert into your database! You don't want words like 'the' or 'it', or any other common word, skewing the results.
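One way to keep common words out of the table is to filter them before counting. The sketch below assumes a tiny, made-up stop-word list (STOP_WORDS) and a hypothetical helper extractKeywords; a real list would be much longer and language-specific.

```php
<?php
// Assumed, deliberately short stop-word list for illustration only.
const STOP_WORDS = ['the', 'it', 'a', 'an', 'and', 'of', 'to', 'in', 'is'];

/**
 * Split text into lowercase words, drop stop words, and count the rest.
 * Returns an array of word => number of occurrences.
 */
function extractKeywords(string $text): array
{
    // strtolower() only handles ASCII; the regex splits on anything
    // that is not a letter, a digit or an apostrophe.
    $tokens = preg_split(
        "/[^\p{L}\p{N}']+/u",
        strtolower($text),
        -1,
        PREG_SPLIT_NO_EMPTY
    );

    $keywords = array_diff($tokens, STOP_WORDS);

    return array_count_values($keywords);
}

print_r(extractKeywords('Hello, what is the hello page?'));
// Array ( [hello] => 2 [what] => 1 [page] => 1 )
```

The resulting counts could then feed the title_count and content_count columns, depending on where the text came from.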

EDIT: of course, this means the search runs in PHP, which has the major drawback of needing to select a large number of keywords (if not all of them) from the database. I ended up with a maximum estimated search time of 0.04 seconds on a database with over 3000 words, so it seems OK =)
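For the ranking part of the question, a simple starting point could be to score each item by the matched words it links to, giving title matches more weight than content matches. The sketch below works on in-memory arrays that mimic word_link_title and word_link_content; the weights (3 and 1), the function name scoreItems, and the content rows are assumptions for illustration, not something from the original setup.

```php
<?php
// Hypothetical copies of the linking tables: each row is [item id, word_id].
// The title rows reuse the sample rows from the question; the content rows are made up.
$wordLinkTitle   = [[43, 2], [7, 1], [7, 2]];
$wordLinkContent = [[43, 1], [12, 2]];

// Assumed weights: a match in a title counts more than one in the body.
const TITLE_WEIGHT = 3;
const CONTENT_WEIGHT = 1;

/**
 * Score items by the searched word ids they link to.
 * Returns item id => score, highest score first.
 */
function scoreItems(array $searchWordIds, array $titleLinks, array $contentLinks): array
{
    $scores = [];
    $sources = [[$titleLinks, TITLE_WEIGHT], [$contentLinks, CONTENT_WEIGHT]];

    foreach ($sources as [$links, $weight]) {
        foreach ($links as [$itemId, $wordId]) {
            if (in_array($wordId, $searchWordIds, true)) {
                $scores[$itemId] = ($scores[$itemId] ?? 0) + $weight;
            }
        }
    }

    arsort($scores); // sort by score, descending, keeping item ids as keys
    return $scores;
}

// Search for word ids 1 ("hello") and 2 ("what").
$ranked = scoreItems([1, 2], $wordLinkTitle, $wordLinkContent);
print_r($ranked);
// Array ( [7] => 6 [43] => 4 [12] => 1 )
```

With a real database, the same aggregation could probably be pushed into MySQL with a GROUP BY over the linking tables, which would avoid loading every row into PHP.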

