PHP PDO fetch() loop dies after processing part of large dataset
I have a PHP script which processes a "large" dataset (about 100K records) from a PDO query into a single collection of objects, in a typical loop:
while ($record = $query->fetch()) {
    $obj = new Thing($record);
    /* do some processing */
    $list[] = $obj;
    $count++;
}
error_log('Processed '.$count.' records');
This loop processes about 50% of the dataset and then inexplicably breaks.
Things I have tried:
- memory_get_peak_usage() consistently outputs about 63MB before the loop dies. The memory limit is 512MB, set through php.ini.
- set_time_limit() to increase script execution time to 1 hour (3600 seconds). The loop breaks long before that, and I don't see the usual error in the log for this one.
- Setting PDO::MYSQL_ATTR_USE_BUFFERED_QUERY to false to avoid buffering the entire dataset.
- Checking $query->errorInfo() immediately after the loop breaks. This was no help, as the error code was "00000".
Other weird behavior:
- With ini_set('memory_limit', '1024MB'), the loop actually dies earlier than with a smaller memory limit, at about 20% progress.
I am doing this all locally using MAMP PRO, if that makes any difference.
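For reference, this is roughly how those settings are applied in the script (a simplified sketch, not the real code: the DSN, credentials, table name, and exact placement of the calls are placeholders):

set_time_limit(3600);                 // allow up to 1 hour of execution
// ini_set('memory_limit', '1024MB'); // the variant that died even earlier; 512MB normally comes from php.ini

$pdo = new PDO($dsn, $user, $pass);   // placeholder connection details (MAMP PRO, local MySQL)
$pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false); // unbuffered query

$query = $pdo->query('SELECT * FROM records'); // placeholder; the real query selects the ~100K-row dataset

$list = [];
$count = 0;
while ($record = $query->fetch()) {
    $obj = new Thing($record);
    /* do some processing */
    $list[] = $obj;
    $count++;
}

error_log('Processed '.$count.' records');
error_log('Peak memory: '.memory_get_peak_usage());   // consistently ~63MB before the loop dies
error_log(print_r($query->errorInfo(), true));        // "00000" (no error) after the break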
Is there something else that could be consistently breaking this loop that I haven't checked? Is this simply not a viable strategy for processing this many records?
After using a batching strategy (20K increments), I have started to see a MySQL error consistently around the third batch: "MySQL server has gone away"; possibly a symptom of a long-running unbuffered query.
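The batching looks roughly like this (a minimal sketch; the table name, ordering column, and LIMIT/OFFSET paging are placeholders rather than the exact code):

$batchSize = 20000;
$list = [];
$count = 0;

for ($offset = 0; ; $offset += $batchSize) {
    // %d casts keep the interpolated values as plain integers
    $sql = sprintf('SELECT * FROM records ORDER BY id LIMIT %d OFFSET %d', $batchSize, $offset);
    $query = $pdo->query($sql);

    $batchCount = 0;
    while ($record = $query->fetch()) {
        $obj = new Thing($record);
        /* do some processing */
        $list[] = $obj;
        $count++;
        $batchCount++;
    }

    if ($batchCount < $batchSize) {
        break; // last (partial) batch reached
    }
}
error_log('Processed '.$count.' records');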
If you really need to process 100K records on the fly, you should do the processing in SQL and fetch the result as you need it - it should save a lot of time. But you probably can't do that for some reason. You always process all the rows from the statement, so use fetchAll once - and leave MySQL alone after that, like this:
$records = $query->fetchAll();
foreach ($records as $record) {
    $obj = new Thing($record);
    /* do some processing */
    $list[] = $obj;
    $count++;
}
error_log('Processed '.$count.' records');
Also, select only the rows that you will use. If this does not help, you can try this: Setting a connect timeout with PDO.
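For example, a connect timeout can be passed as a driver option when creating the connection (a minimal sketch with placeholder DSN and credentials; PDO::ATTR_TIMEOUT is given in seconds and, for the MySQL driver, applies to establishing the connection):

$pdo = new PDO(
    'mysql:host=localhost;dbname=mydb;charset=utf8mb4', // placeholder DSN
    'user',                                             // placeholder username
    'password',                                         // placeholder password
    [PDO::ATTR_TIMEOUT => 60]                           // timeout in seconds
);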