Read multiple records given array of record IDs from the database efficiently
If you have an array of record IDs in your application code, what is the best way to read the records from the database?
$idNumsIWant = array(2,4,5,7,9,23,56);
Obviously looping over each ID is bad because you do n queries:
foreach ($idNumsIWant as $memID) {
$DBinfo = mysql_fetch_assoc(mysql_query("SELECT * FROM members WHERE mem_id = '$memID'"));
echo "{$DBinfo['fname']}\n";
}
So, perhaps it is better to use a single query?
$sqlResult = mysql_query("SELECT * FROM members WHERE mem_id IN (".join(",",$idNumsIWant).")");
while ($DBinfo = mysql_fetch_assoc($sqlResult))
echo "{$DBinfo['fname']}\n";
But does this method scale when the array has 30,000 elements?
How do you tackle this problem efficiently?
The best approach depends partly on the number of IDs you have in your array (you obviously don't want to send a 50MB SQL query to your server, even though technically it might be capable of dealing with it without too much trouble), but mostly on how you're going to deal with the resulting rows.
If the number of IDs is fairly low (let's say a few thousand at most), a single query with a WHERE clause using the IN syntax will be perfect. Your SQL query will be short enough to be transferred reliably, efficiently and quickly to the DB server. This method is perfect for a single thread looping through the resulting records.
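As a minimal sketch of the single-query approach, the IN list can be sent through a prepared statement instead of being joined into the SQL string. This assumes PDO rather than the mysql_* functions used in the question; `inPlaceholders` and `fetchMembersInOneQuery` are hypothetical names, and `$pdo` is assumed to be an already open connection.

```php
<?php
// Hypothetical helper: builds "?,?,?" for a prepared IN (...) clause.
function inPlaceholders(array $ids): string
{
    if (count($ids) === 0) {
        // An empty IN () is a syntax error in most databases, so refuse it early.
        throw new InvalidArgumentException('IN () needs at least one ID');
    }
    return implode(',', array_fill(0, count($ids), '?'));
}

// Assumes $pdo is an open PDO connection and the members table from the question.
function fetchMembersInOneQuery(PDO $pdo, array $ids): array
{
    $sql = 'SELECT * FROM members WHERE mem_id IN (' . inPlaceholders($ids) . ')';
    $stmt = $pdo->prepare($sql);
    // One bound value per placeholder, in order.
    $stmt->execute(array_values($ids));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

Binding the values also avoids having to escape each ID by hand.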
If the number of IDs is really big, I would suggest you split the ID array into several groups and run more than one query, each with one group of IDs. It may be a little heavier for the DB server, but on the application side you can spawn several threads and deal with the multiple recordsets as soon as they arrive, in parallel.
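The splitting step could look roughly like this. It is a sequential sketch only: the batch size of 1000 is an arbitrary assumption, `batchIds` and `fetchMembersInBatches` are hypothetical names, `$pdo` is assumed to be an open PDO connection, and handling the batches in parallel is left out.

```php
<?php
// Hypothetical helper: splits the ID list into batches of at most $batchSize.
function batchIds(array $ids, int $batchSize = 1000): array
{
    return array_chunk(array_values($ids), $batchSize);
}

// Assumes $pdo is an open PDO connection; runs one IN (...) query per batch.
function fetchMembersInBatches(PDO $pdo, array $ids, int $batchSize = 1000): array
{
    $rows = [];
    foreach (batchIds($ids, $batchSize) as $batch) {
        $marks = implode(',', array_fill(0, count($batch), '?'));
        $stmt = $pdo->prepare("SELECT * FROM members WHERE mem_id IN ($marks)");
        $stmt->execute($batch);
        // Collect the rows of this batch before moving on to the next one.
        foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
            $rows[] = $row;
        }
    }
    return $rows;
}
```

With 30,000 IDs and the assumed batch size, this issues 30 queries instead of 30,000.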
Both methods will work.
Cliff notes: in this kind of situation, focus on how the data is used, as long as data extraction isn't too big of a bottleneck. And profile your app!
My thoughts:
The first method is too costly in terms of processing and disk reads.
The second method is more efficient and you don't have to worry much about query size limit (but check it anyway).
When I have to deal with that kind of situation, I see at least three or four possible solutions. With IN(), keep in mind that there is a limit on the amount of data you can put in it, and that a very big list might not be good performance-wise.
Also, to make it easier to change the fetching logic, I would write a function that takes the list of IDs and returns the list of data corresponding to them.
This way, you always call this function the same way and always get the same data, without having to worry about how that data is fetched. This lets you change the fetching method if needed (if you find a better way some day) without breaking anything: HOW the function works will change, but as its interface (input/output) remains the same, nothing changes for the rest of your code :-)
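One way to sketch such a function is to keep the interface fixed (IDs in, rows out) and pass the fetching strategy in as a callable. Both `getMembersByIds` and the `$fetchRows` callable are hypothetical; keying the result by `mem_id` is an assumption made here for convenience, not something the answer specifies.

```php
<?php
// Hypothetical wrapper: callers pass IDs and get rows keyed by mem_id.
// $fetchRows is an assumed callable(array $ids): array that does the real DB work,
// so the fetching method can be swapped without touching the callers.
function getMembersByIds(array $ids, callable $fetchRows): array
{
    $ids = array_values(array_unique($ids));
    if ($ids === []) {
        return []; // nothing to fetch, so skip the database entirely
    }
    $byId = [];
    foreach ($fetchRows($ids) as $row) {
        $byId[$row['mem_id']] = $row;
    }
    return $byId;
}
```

The rest of the code only ever calls `getMembersByIds`, so replacing a single IN() query with batches or a temp table later means changing the callable and nothing else.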
If it were me and I had that large a list of values for the IN clause, I would use a stored proc with a variable containing the values I wanted, use a function in it to load them into a temp table, and then join to it. Depending on the size of the values you want to send, you might need to split them up into multiple input variables to process. Is there any way the values could be permanently stored in the database (if they are often queried on)? And how is the user going to pick out 30,000 values? Surely he or she isn't going to type them all in. So there is probably a better way to query the table, based on a join and a where clause.
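The temp-table-and-join idea can be sketched from the application side as well. Note that this swaps the stored procedure for plain statements issued from PHP, assumes MySQL syntax for the temporary table, and uses hypothetical names (`wanted_ids`, `buildTempInsertSql`, `fetchMembersViaTempTable`); `$pdo` is assumed to be an open PDO connection.

```php
<?php
// Hypothetical helper: builds a multi-row INSERT for a temp table named wanted_ids.
function buildTempInsertSql(int $rowCount): string
{
    if ($rowCount < 1) {
        throw new InvalidArgumentException('Need at least one row to insert');
    }
    return 'INSERT INTO wanted_ids (mem_id) VALUES '
        . implode(',', array_fill(0, $rowCount, '(?)'));
}

// Assumes $pdo is an open PDO connection to MySQL and the members table from the question.
function fetchMembersViaTempTable(PDO $pdo, array $ids, int $batchSize = 1000): array
{
    $pdo->exec('CREATE TEMPORARY TABLE wanted_ids (mem_id INT PRIMARY KEY)');
    // De-duplicate first: the primary key would reject repeated IDs.
    $unique = array_values(array_unique($ids));
    foreach (array_chunk($unique, $batchSize) as $batch) {
        $pdo->prepare(buildTempInsertSql(count($batch)))->execute($batch);
    }
    // The join replaces the long IN (...) list.
    $stmt = $pdo->query(
        'SELECT m.* FROM members m JOIN wanted_ids w ON w.mem_id = m.mem_id'
    );
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    $pdo->exec('DROP TEMPORARY TABLE wanted_ids');
    return $rows;
}
```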
Use a StringTokenizer to split the string into tokens; that will make it easier for you to handle data with multiple values.