Is there a better way to get data from two tables at once with Sphinx/MySQL?
Before asking this question, it is important to understand what I am actually doing.
The best comparison to the feature I am implementing would be Facebook's search feature. When you begin typing, a drop-down list appears with various search results. At the top you will find your friends whose names match your search, then other people who match, then pages, events, etc.
My situation is similar; however, I only want to search for two things: users and documents (named ripples in the code below).
I have this working fine. Please bear with me while I talk through the logic of this feature in my case:
At the same time, and for every keypress, an ajax request is fired to the script below, which does the following:
And finally, any returned results are appended to the autocomplete list.
This is an example of the PHP function that performs the Sphinx lookups and gets the data from the database.
public function search()
{
    $this->disableLayout();
    $request = new Request();
    $params = $request->getParams(GET);

    //Perform sphinx textsearch
    include('/usr/local/lib/php/sphinxapi.php');
    $sphinx = new \SphinxClient();
    $sphinx->setMatchMode(SPH_MATCH_ANY);
    $sphinx->SetLimits(0, 4);
    $mysqlconn = mysql_connect("127.0.0.1:9306") or die ("Couldn't connect to MySQL.");
    $users = $sphinx->Query($params['data']['q'], "users");
    $ripples = $sphinx->Query($params['data']['q'], "ripples");

    /*
     * USERS
     */
    //Loop through users and only collect IDs that are not already present
    if (!empty($users["matches"])) {
        $ids = "";
        foreach ($users['matches'] as $id => $data) {
            if (!isset($params['data']['e'][$id])) {
                //Add the separator only when an ID is actually appended,
                //so skipped IDs cannot leave stray commas in the IN() list
                if ($ids > "") {
                    $ids .= ",";
                }
                $ids .= $id;
            }
        }
        //If there are any remaining IDs, collect the data from the database
        if (!empty($ids)) {
            $userdataquery = "select users.userid, users.firstname, users.lastname
                from tellycards_user_data users
                where userid IN($ids)
                ";
            $query = new Query($userdataquery);
            $usersoutput = $query->fetchAll();
        }
    }

    /*
     * RIPPLES
     */
    //Loop through ripples and collect IDs
    if (!empty($ripples["matches"])) {
        $rippleids = "";
        foreach ($ripples['matches'] as $id => $data) {
            if ($rippleids > "") {
                $rippleids .= ",";
            }
            $rippleids .= $id;
        }
        //If there are any IDs, collect the data from the database
        if (!empty($rippleids)) {
            $rippledataquery = "select ripples.id, ripples.name, ripples.screenshot
                from tellycards_ripples ripples
                where id IN($rippleids)
                ";
            $query = new Query($rippledataquery);
            $ripplesoutput = $query->fetchAll();
        }
    }

    //Return both result sets as JSON
    header('Content-type: text/json');
    echo json_encode(array(
        'users' => (!empty($usersoutput)) ? $usersoutput : null,
        'ripples' => (!empty($ripplesoutput)) ? $ripplesoutput : null
    ));
} }
You might ask why we are doing the initial friends lookup and not just using Sphinx for everything. With the method above, the user gets instant feedback while typing because the array of friends is stored client side, whereas despite the fantastic speed of Sphinx there will inevitably be some lag due to the HTTP request. In practice it works fantastically, and incidentally it appears to be the method that Facebook uses as well.
There is also a lot of JavaScript code preventing unnecessary lookups; the returned data gets added to the cache pile, etc., so that future searches do not require hitting Sphinx or the database.
Now, finally, onto my actual question.
This current server-side function bothers me a lot. Right now there are two searches being performed by Sphinx and two queries being performed by MySQL. How can I collate all this into one Sphinx query and one MySQL query? Is there any way at all? (Please bear in mind that documents and users may share the same PK IDs, as they are in two completely different tables in MySQL and are currently spread across two separate indexes.) Or is there any way to combine the two MySQL queries to make them more efficient than having two separate selects?
Or alternatively, given the simplicity of the queries, am I best keeping them separate as above? (Both are indexed primary key queries.)
I guess what I am asking for is any recommendations or advice.
Any commentary is very welcome.
You can't really get away with not having two MySQL queries. Well, you could, either by just combining them into one with UNION, or by creating a new combined 'table' (either a view or a materialized view), but I really don't think it's worth the effort. Two queries are perfectly fine; as you say, they are indexed.
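For completeness, the UNION option could be sketched roughly as below. This is only an illustration under assumptions: `buildCombinedQuery` is a hypothetical helper, the `source` discriminator column and the `one`/`two` aliases are made up so both halves have the same shape, and `UNION ALL` is used instead of plain `UNION` because the discriminator already keeps the rows distinct.

```php
<?php
/**
 * Hypothetical helper: builds a single query covering both ID lists.
 * Table and column names are taken from the question; the 'source'
 * column and the one/two aliases are assumptions for illustration.
 */
function buildCombinedQuery(array $userIds, array $rippleIds): ?string
{
    // Force every ID to an integer so the IN() lists are safe to inline.
    $userIds   = array_map('intval', $userIds);
    $rippleIds = array_map('intval', $rippleIds);

    $parts = [];
    if ($userIds) {
        $parts[] = "SELECT 'user' AS source, userid AS id, firstname AS one, lastname AS two"
            . " FROM tellycards_user_data WHERE userid IN (" . implode(',', $userIds) . ")";
    }
    if ($rippleIds) {
        $parts[] = "SELECT 'ripple' AS source, id, name AS one, screenshot AS two"
            . " FROM tellycards_ripples WHERE id IN (" . implode(',', $rippleIds) . ")";
    }

    // Nothing to fetch: let the caller skip the database round trip.
    return $parts ? implode(' UNION ALL ', $parts) : null;
}
```

The caller would then split the returned rows on `source` before encoding them as JSON, which is part of why this saves little over two plain primary key lookups.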
You could use one Sphinx index (and hence one search query) by creating a new combined index. Because you say your keys are not unique, you would have to create a new synthetic key.
For example:
sql_query = SELECT userid*2 AS id, 1 AS table_id, firstname AS one, lastname as two FROM tellycards_user_data \
UNION \
SELECT (id*2)+1 as id, 2 AS table_id, name AS one, screenshot AS two FROM tellycards_ripples
sql_attr_uint = table_id
This gives you a fake key and an attribute to identify which table the result came from. You can use this to get back to the original table. (There are many other ways of doing the same thing.)
This allows you to run one query and get combined results.
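Mapping a combined result back to its source row might look like the sketch below. It assumes the key scheme from the sql_query above (users get even IDs, ripples get odd IDs); `decodeSyntheticId` is a hypothetical helper name, and the same information could be read from the table_id attribute instead.

```php
<?php
/**
 * Hypothetical helper: reverses the synthetic key used in the combined index.
 * Users were indexed as userid*2 (even), ripples as (id*2)+1 (odd).
 */
function decodeSyntheticId(int $docId): array
{
    return [
        // Even document IDs came from the users table, odd ones from ripples.
        'table' => ($docId % 2 === 0) ? 'tellycards_user_data' : 'tellycards_ripples',
        // Integer division drops the +1 that was added for ripples.
        'id'    => intdiv($docId, 2),
    ];
}
```

For instance, document 10 maps to user 5 and document 11 maps to ripple 5, so the two tables can keep overlapping primary keys.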
... BUT I'm not convinced it's a good idea, because if the results are asymmetric, you may miss results. Say there are 20 matching results from one table and 10 from another, and you show the top 10 results. Because of the limit, the results from the second table could well be hidden below those of the first table (an extreme example; in reality, hopefully they would be intermingled). Two separate queries allow you to guarantee getting SOME results from each table.
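The extreme case can be made concrete with a small simulation. This is purely illustrative: `countByTable` is a hypothetical helper, and the ranked list of table_id values is made-up data standing in for a combined result set where every match from table 1 happens to outrank every match from table 2.

```php
<?php
/**
 * Hypothetical helper: counts how many of the top $limit combined matches
 * belong to each table, given the table_id of every match in rank order.
 */
function countByTable(array $rankedTableIds, int $limit): array
{
    $counts = [1 => 0, 2 => 0];
    foreach (array_slice($rankedTableIds, 0, $limit) as $tableId) {
        $counts[$tableId]++;
    }
    return $counts;
}

// Made-up ranking: 20 matches from table 1 ahead of 10 matches from table 2.
$ranked = array_merge(array_fill(0, 20, 1), array_fill(0, 10, 2));

// With a limit of 10, every visible slot goes to table 1 and none to table 2.
$top = countByTable($ranked, 10);
```

With two separate queries, each limited on its own, both tables would contribute rows no matter how the relevance scores compare.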
... so after all that: stick with what you've got. It's fine.
You can store and retrieve all data about users and documents in Sphinx, so no MySQL is needed.
Use SphinxQL, not the API (much better and easier to get stuff done -> http://sphinxsearch.com/docs/current.html#sphinxql-reference )
Notice: don't forget to set all textual fields that you want to retrieve data back from as sql_field_string in the sphinx.conf source.
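A rough sketch of what that could look like from PHP follows. It rests on several assumptions: searchd listens for SphinxQL on 127.0.0.1:9306 (the port used in the question), firstname and lastname are declared as sql_field_string in the users source, and PDO's emulated prepared statements (the pdo_mysql default) are in use so the search term is quoted client side. `buildSphinxSelect` and `searchSphinx` are hypothetical helper names.

```php
<?php
/**
 * Hypothetical helper: builds the SphinxQL text. Index and column names are
 * not escaped here, so they must come from trusted code, never from user input.
 */
function buildSphinxSelect(string $index, array $columns, int $limit): string
{
    return sprintf(
        'SELECT %s FROM %s WHERE MATCH(:q) LIMIT %d',
        implode(', ', $columns),
        $index,
        $limit
    );
}

/**
 * Hypothetical helper: runs the search over an open SphinxQL connection and
 * returns the stored columns directly, with no follow-up MySQL query.
 */
function searchSphinx(PDO $sphinx, string $index, array $columns, string $term, int $limit = 4): array
{
    $stmt = $sphinx->prepare(buildSphinxSelect($index, $columns, $limit));
    $stmt->execute([':q' => $term]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Usage (not executed here, since it needs a running searchd):
// $sphinx = new PDO('mysql:host=127.0.0.1;port=9306');
// $users  = searchSphinx($sphinx, 'users', ['id', 'firstname', 'lastname'], $q);
```

Note that MATCH() interprets its argument as a full-text query expression, so operator characters typed by the user may need extra escaping beyond the SQL quoting shown here.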