
Best way to check if external data is already in our MySQL database

We are looking for the best way to develop a database search routine that uses as few resources as possible in a PHP/MySQL environment.

We process an external feed of information that tends to change subtly or gain new status values from time to time. This limits our ability to add our own numeric keys and search with conventional SQL.

We are thinking of using MD5 to create a unique string, so instead of searching for...

WHERE DATE = '12/12/2012 09:00' 
AND TYPE = 'new alert' 
AND loc = 'rear door' 
AND subtype = 'pir hit' 
AND lat = 39.3343 
AND lon = 145.234 
AND current_status = 'active' 
AND Support = 'en-route';

we create an MD5 (e.g. ef6d3c25ac9362413fed2b4d3f65962a) out of the fields that we are interested in (e.g. 12/12/2012 09:00~new alert~rear door~pir hit~-39.3343~145.234~active~en-route), and then we can search the database for just this MD5 rather than the separate fields.
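To make the idea concrete, here is a minimal sketch of the fingerprint-and-lookup approach. It is written in Python rather than PHP (PHP's built-in md5() plays the same role), and it uses an in-memory SQLite database purely as a stand-in for MySQL so the example runs on its own. The table name `jobs`, the column name `fingerprint`, and the helper names are assumptions made for illustration, not part of our actual schema.

```python
import hashlib
import sqlite3

FIELD_SEPARATOR = "~"  # same separator as in the example string above


def fingerprint(fields):
    """Return the MD5 hex digest of the fields joined with '~'."""
    joined = FIELD_SEPARATOR.join(str(f) for f in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Assumption: SQLite stands in for MySQL; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs ("
    " id INTEGER PRIMARY KEY,"
    " fingerprint TEXT NOT NULL UNIQUE)"  # UNIQUE also creates an index
)


def is_known(conn, fp):
    """Single indexed lookup instead of comparing eight separate columns."""
    row = conn.execute(
        "SELECT 1 FROM jobs WHERE fingerprint = ?", (fp,)
    ).fetchone()
    return row is not None


def remember(conn, fp):
    """Store the fingerprint; duplicates are silently ignored."""
    conn.execute("INSERT OR IGNORE INTO jobs (fingerprint) VALUES (?)", (fp,))


job = ["12/12/2012 09:00", "new alert", "rear door", "pir hit",
       "-39.3343", "145.234", "active", "en-route"]
fp = fingerprint(job)

if not is_known(conn, fp):
    remember(conn, fp)
```

One thing to watch: the hash only matches if every field is formatted exactly the same way each time, so a subtle change in the feed (extra whitespace, different case, a differently rounded coordinate) produces a different MD5. If such changes should be ignored, the fields would need to be normalised before hashing.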

We would also be interested in using a file holding the list of the most recent MD5s rather than interrogating the database constantly, as we may have 1100+ jobs in the feed at most. More often it is around 60 or so jobs.
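The file-based variant could look roughly like the sketch below, again in Python for illustration. It assumes one MD5 per line in a plain text file; the function names and the file layout are hypothetical. The hashes are loaded into a set once per feed run, so each membership check happens in memory and the database is only touched for jobs that turn out to be new.

```python
import os
import tempfile


def load_hashes(path):
    """Read one hash per line into a set; a missing file means no hashes yet."""
    try:
        with open(path, encoding="ascii") as fh:
            return {line.strip() for line in fh if line.strip()}
    except FileNotFoundError:
        return set()


def save_hashes(path, hashes):
    """Write to a temp file, then rename, so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="ascii") as fh:
        fh.write("\n".join(sorted(hashes)))
    os.replace(tmp_path, path)


def split_feed(feed_hashes, known):
    """Return (new, unchanged) hashes for the current feed."""
    feed = set(feed_hashes)
    return feed - known, feed & known
```

At 32 characters per MD5, even 1100 jobs come to roughly 35 KB, so holding the whole list in memory is not a concern. After each run, saving only the hashes seen in the current feed keeps the file limited to the most recent jobs.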

We are interested in your thoughts on what the best solution is, and your reasons for it.

I'd go for Apache Solr as a solution. Faceted search would suit all your needs here, and replicating/indexing your data wouldn't take much effort. We've implemented this engine in our company's project, searching over name/date/characteristics/vendor/distributor/etc., and it works like a charm. That said, an MD5 over a glued-together string was the solution there for some time. Anyway, it depends on how much time you have and how well your current solution handles the situation.


 