
Fast check if strings contain specific words

I have a bunch of strings in a database, and I'm hoping to use some kind of data structure on them instead of doing an SQL "WHERE 'ab' LIKE 'a%'" kind of search.

(In a real-world example there can be up to 5000 of these)
A - Let's say I have the words/needles: for, in, like, release
Note: these are always the same. The list can grow, but existing entries don't change

(In a real-world example there can be up to 50 of these)
B - Then I have the other words/haystacks: for people, in magazine, date of release, daily news
Note: these are dynamic, they are always different

I'd like to know a good way to find/remove all the entries from B that start or end with any of the words from A

So the ones I would remove from the example would be: for people, in magazine, date of release
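To pin down the expected behaviour, the requirement can be written as a plain nested loop. This is only a reference sketch, not a fast solution: it assumes whole-word matching (so "format" would not count as starting with "for") and single spaces between words, and the function name startsOrEndsWithNeedle is made up for illustration.

```php
<?php
// Reference filter: drop every haystack whose first or last word is a needle.
// Assumptions (not stated in the question): whole-word matching, single-space
// separators. The function name is hypothetical.

function startsOrEndsWithNeedle(string $haystack, array $needles): bool
{
    foreach ($needles as $needle) {
        // A haystack that is exactly one needle both starts and ends with it.
        if ($haystack === $needle) {
            return true;
        }
        $prefix = $needle . ' ';
        $suffix = ' ' . $needle;
        // Prefix test: compare only the first strlen($prefix) bytes.
        if (strncmp($haystack, $prefix, strlen($prefix)) === 0) {
            return true;
        }
        // Suffix test: compare the tail of the haystack with " needle".
        if (strlen($haystack) >= strlen($suffix)
            && substr($haystack, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

$needles   = ['for', 'in', 'like', 'release'];
$haystacks = ['for people', 'in magazine', 'date of release', 'daily news'];

// Keep only the haystacks that do not start or end with a needle.
$kept = array_values(array_filter(
    $haystacks,
    function (string $h) use ($needles): bool {
        return !startsOrEndsWithNeedle($h, $needles);
    }
));

print_r($kept); // only 'daily news' is left
```

Every haystack is compared against every needle here, so the work grows with the product of the two list sizes. That makes it a reasonable baseline to time against the LIKE query.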

I'm happy even with a generic idea that I can implement in PHP

PS: I might go back to MySQL if all the suggested ideas are slower than a MySQL LIKE search, so I'd prefer something faster, or at least as fast as MySQL

One solution I could think of would be to split the entries from B into words
for, people, in, magazine, date, of, release

Mark those that exist in A
for, in, release

This way we would end up with much smaller data, and it would be fine to just do a strpos on them in a foreach loop
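A variation on the split-and-mark idea, sketched under some assumptions: if matching is by whole word, the needles can be turned into array keys once, and then only the first and last word of each haystack need a lookup, with no strpos pass at all. The names buildNeedleIndex and filterHaystacks are hypothetical, and needles that contain spaces are not handled.

```php
<?php
// Sketch: index the needles once, then inspect only the first and last word
// of each haystack. Assumes whole-word matching and whitespace-separated
// words. Function names are made up for illustration.

function buildNeedleIndex(array $needles): array
{
    // array_flip turns the values into keys, so isset() is a hash lookup.
    return array_flip($needles);
}

function filterHaystacks(array $haystacks, array $needleIndex): array
{
    $kept = [];
    foreach ($haystacks as $haystack) {
        // Split on runs of whitespace and drop empty pieces.
        $words = preg_split('/\s+/', $haystack, -1, PREG_SPLIT_NO_EMPTY);
        if (empty($words)) {
            // Nothing to match against, so keep the entry unchanged.
            $kept[] = $haystack;
            continue;
        }
        $first = $words[0];
        $last  = $words[count($words) - 1];
        // Drop the haystack if either end is a known needle.
        if (isset($needleIndex[$first]) || isset($needleIndex[$last])) {
            continue;
        }
        $kept[] = $haystack;
    }
    return $kept;
}

$index  = buildNeedleIndex(['for', 'in', 'like', 'release']);
$result = filterHaystacks(
    ['for people', 'in magazine', 'date of release', 'daily news'],
    $index
);

print_r($result); // only 'daily news' is left
```

The cost per haystack is one split plus two lookups, so it should not depend on how many needles there are. The index only has to be rebuilt when list A grows.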

Let me know if you know of a better way.

