Memory leak when executing Doctrine query in loop

I'm having trouble locating the cause of a memory leak in my script. I have a simple repository method which increments a 'count' column in my entity by X amount:

public function incrementCount($id, $amount)
{
    $query = $this
        ->createQueryBuilder('e')
        ->update('MyEntity', 'e')
        ->set('e.count', 'e.count + :amount')
        ->where('e.id = :id')
        ->setParameter('id', $id)
        ->setParameter('amount', $amount)
        ->getQuery();

    $query->execute();
}

The problem is, if I call this in a loop, the memory usage balloons on every iteration:

$entityManager = $this->getContainer()->get('doctrine')->getManager();
$myRepository = $entityManager->getRepository(MyEntity::class);
while (true) {
    $myRepository->incrementCount("123", 5);
    $entityManager->clear();
    gc_collect_cycles();
}

What am I missing here? I've tried ->clear(), as per Doctrine's advice on batch processing. I even tried gc_collect_cycles(), but the issue remains.

I'm running Doctrine 2.4.6 on PHP 5.5.

I just ran into the same issue. These are the things that fixed it for me:

--no-debug

As the OP mentioned in their answer, setting --no-debug (ex: php bin/console <my_command> --no-debug) is crucial for performance/memory in Symfony console commands. This is especially true when using Doctrine: without it, Doctrine will go into debug mode, which consumes a huge amount of additional memory (that increases on each iteration). See the Symfony docs for more info.

--env=prod

You should also always specify the environment. By default, Symfony uses the dev environment for console commands. The dev environment usually isn't optimized for memory, speed, CPU, etc. If you want to iterate over thousands of items, you should probably be using the prod environment (ex: php bin/console <my_command> --env prod). See the Symfony docs for more info.

Tip: I created an environment called console that I specifically configured for running console commands. Here is info about how to create additional Symfony environments.

php -d memory_limit=YOUR_LIMIT

If running a big update, you should probably choose how much memory is acceptable for it to consume. This is especially important if you think there might be a leak. You can specify the memory for the command by using php -d memory_limit=x (ex: php -d memory_limit=256M). Note: you can set the limit to -1 (usually the default for the PHP CLI) to let the command run with no memory limit, but this is obviously dangerous.

A Well Formed Console Command For Batch Processing

A well formed console command for running a big update using the above tips would look like:

php -d memory_limit=256M bin/console <acme>:<your_command> --env=prod --no-debug

Use Doctrine's IterableResult

Another huge one when using Doctrine's ORM in a loop is to use Doctrine's IterableResult (see the Doctrine Batch Processing docs). This won't help in the example provided, but usually when doing processing like this, it is over results from a query.

Flush Periodically

If part of what you are doing is making changes to the data, you should flush periodically instead of on each iteration. Flushing is expensive and slow. The less often you flush, the faster your command will finish. Keep in mind, however, that Doctrine will hold the unflushed data in memory. So the less often you flush, the more memory you will need.

You can use something like the following to flush every 100 iterations:

if ($count % 100 === 0) {
    $this->em->flush();
}

Also make sure to flush again at the end of your loop (to flush the last < 100 entries).
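To make the "flush every N, then flush the remainder" pattern concrete, here is a minimal sketch that runs without Doctrine. FakeEntityManager and processInBatches() are hypothetical names made up for illustration; the fake only counts flushes, so treat this as a way to check the loop logic, not as real ORM code.

```php
<?php
// Hypothetical stand-in for the EntityManager so the sketch runs without Doctrine.
final class FakeEntityManager
{
    public int $flushes = 0;

    public function persist(object $entity): void
    {
        // A real EntityManager would schedule the entity for insertion here.
    }

    public function flush(): void
    {
        $this->flushes++;
    }
}

// Persist every item, flushing once per full batch and once more for any remainder.
function processInBatches(FakeEntityManager $em, iterable $items, int $batchSize = 100): int
{
    $count = 0;
    foreach ($items as $item) {
        $em->persist($item);
        $count++;
        if ($count % $batchSize === 0) {
            $em->flush();
        }
    }

    // Flush the last, partial batch (skipped when the total is an exact multiple).
    if ($count % $batchSize !== 0) {
        $em->flush();
    }

    return $count;
}

$em = new FakeEntityManager();
$items = array_map(fn (int $i) => new stdClass(), range(1, 250));
processInBatches($em, $items, 100);
echo $em->flushes, PHP_EOL; // 250 items in batches of 100: two full batches plus one remainder flush
```

Counting from 1 before the modulo check avoids an immediate (and useless) flush on the very first item, which is what happens if the counter starts at 0.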

Clear the EntityManager

You may also want to clear after you flush:

$this->em->flush();
$this->em->clear();  // Detach ALL objects from Doctrine.

Or

$this->em->flush();
$this->em->clear(MyEntity::class); // Detach all MyEntity from Doctrine.
$this->em->clear(MyRelatedEntity::class); // Detach all MyRelatedEntity from Doctrine.

Output the memory usage as you go

It can be really helpful to keep track of how much memory your command is consuming while it is running. You can do that by outputting the value returned by PHP's built-in memory_get_usage() function.

$output->writeln(memory_get_usage());

Example

$memUse = round(memory_get_usage() / 1000000, 2).'MB';
$this->output->writeln('Processed '.$i.' of '.$totalCount.' (mem: '.$memUse.')');
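If you log memory in several places, it may be worth wrapping the formatting in a small helper. The sketch below is one way to do it; formatMemory() and memoryReport() are hypothetical names, and reporting the peak via memory_get_peak_usage() is my own addition to what the snippet above prints.

```php
<?php
// Format a byte count as mebibytes with two decimals.
function formatMemory(int $bytes): string
{
    return number_format($bytes / 1048576, 2, '.', '') . ' MiB';
}

// Build a progress line that shows both current and peak memory usage.
function memoryReport(int $done, int $total): string
{
    return sprintf(
        'Processed %d of %d (mem: %s, peak: %s)',
        $done,
        $total,
        formatMemory(memory_get_usage()),
        formatMemory(memory_get_peak_usage())
    );
}

echo memoryReport(1000, 4000000), PHP_EOL;
```

Inside a command you would pass the result to $output->writeln() instead of echo. If the current figure keeps climbing between batches while the work per batch is constant, something is still holding references.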

Roll Your Own Batches

It may also be helpful to roll your own batches. You can do this by using a start and limit, just like you would for pagination. I was able to process 4 million rows using only 90 MB of RAM doing this.

Here's some example code:


protected function execute(InputInterface $input, OutputInterface $output) {
    /* ... */
    $totalCount = $this->getTotalCount();
    $batchSize = 10000;
    $i = 0;
    while ($i < $totalCount) {
        $i = $this->processBatch($i, $batchSize, $totalCount);
    }
    /* ... */
}

private function processBatch(int $start, int $limit, int $totalCount): int {
    /* @var $q \Doctrine\ORM\Query */
    $q = $this->em->createQueryBuilder()
        ->select('e')
        ->from('AcmeExampleBundle:MyEntity', 'e')
        ->setFirstResult($start)
        ->setMaxResults($limit)
        ->getQuery();

    /* @var $iterableResult \Doctrine\ORM\Internal\Hydration\IterableResult */
    $iterableResult = $q->iterate(null, \Doctrine\ORM\Query::HYDRATE_SIMPLEOBJECT);

    $i = $start;
    foreach ($iterableResult as $row) {
        /* @var $myEntity \App\Entity\MyEntity */
        $myEntity = $row[0];

        $this->processOne($myEntity);

        if (0 === ($i % 1000)) {
            $memUse = round(memory_get_usage() / 1000000, 2).'MB';
            $this->output->writeln('Processed '.$i.' of '.$totalCount.' (mem: '.$memUse.')');
        }
        $this->em->detach($row[0]);
        $i++;
    }

    return $i;
}

private function processOne(MyEntity $myEntity): void {
    // Do entity processing here.
}

private function getTotalCount(): int {
    /* @var $q \Doctrine\ORM\Query */
    $q = $this->em
        ->createQueryBuilder()
        ->select('COUNT(e.id)')
        ->from('AcmeExampleBundle:MyEntity', 'e')
        ->getQuery();

    // The scalar may come back as a string, so cast it to match the int return type.
    $count = (int) $q->getSingleScalarResult();

    return $count;
}

Good luck!

I resolved this by adding --no-debug to my command. It turns out that in debug mode, the profiler was storing information about every single query in memory.

Doctrine keeps logs of any query you make. If you make lots of queries (which normally happens in loops), Doctrine can cause a huge memory leak.

You need to disable the Doctrine SQL Logger to overcome this.

I recommend doing this only for the loop part.

Before the loop, get the current logger:

$sqlLogger = $em->getConnection()->getConfiguration()->getSQLLogger();

And then disable the SQL Logger:

$em->getConnection()->getConfiguration()->setSQLLogger(null);

Do the loop here: foreach() / while() / for()

After the loop ends, put back the Logger:

$em->getConnection()->getConfiguration()->setSQLLogger($sqlLogger);
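One thing to watch with the save/disable/restore steps above: if the loop throws, the logger never gets put back. A try/finally takes care of that. The sketch below is self-contained and uses a hypothetical FakeConfiguration in place of the real Doctrine configuration object; withTemporaryValue() is also a made-up helper name, not part of Doctrine.

```php
<?php
// Hypothetical stand-in for the Doctrine configuration object, so this runs without Doctrine.
final class FakeConfiguration
{
    private $logger = 'debug-logger';

    public function getSQLLogger()
    {
        return $this->logger;
    }

    public function setSQLLogger($logger = null): void
    {
        $this->logger = $logger;
    }
}

// Swap a setting for the duration of $work and restore it afterwards, even on exceptions.
function withTemporaryValue(callable $get, callable $set, $temporary, callable $work)
{
    $previous = $get();
    $set($temporary);
    try {
        return $work();
    } finally {
        $set($previous);
    }
}

$config = new FakeConfiguration();
$seenInsideLoop = 'unset';

try {
    withTemporaryValue(
        [$config, 'getSQLLogger'],
        [$config, 'setSQLLogger'],
        null,
        function () use ($config, &$seenInsideLoop) {
            $seenInsideLoop = $config->getSQLLogger(); // logger is disabled in here
            throw new RuntimeException('loop failed');
        }
    );
} catch (RuntimeException $e) {
    // The failure is reported, but the original logger is already back in place.
}

var_dump($config->getSQLLogger() === 'debug-logger');
```

With the real thing you would pass [$config, 'getSQLLogger'] and [$config, 'setSQLLogger'] where $config is $em->getConnection()->getConfiguration(), assuming your Doctrine version still offers those two methods.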

For me it was clearing Doctrine, or as the documentation says, detaching all entities:

$this->em->clear(); //Here em is the entity manager.

So inside my loop I flush every 1000 iterations and detach all entities (I don't need them anymore):

    foreach ($reader->getRecords() as $position => $value) {
        $this->processValue($value, $position);
        if($position % 1000 === 0){
            $this->em->flush();
            $this->em->clear();
        }
        $this->progress->advance();
    }

Hope this helps.

PS: here's the documentation.

You're wasting memory on each iteration. A much better way would be to prepare the query once and swap the arguments many times. For example:

class MyEntityRepository extends EntityRepository
{
    // Cached query, built once and reused with new parameters on each call.
    private $updateQuery = null;

    public function incrementCount($id, $amount)
    {
        if ($this->updateQuery === null) {
            $this->updateQuery = $this->createQueryBuilder('e')
                ->update('MyEntity', 'e')
                ->set('e.count', 'e.count + :amount')
                ->where('e.id = :id')
                ->getQuery();
        }

        $this->updateQuery
            ->setParameter('id', $id)
            ->setParameter('amount', $amount)
            ->execute();
    }
}

As you mentioned, you can employ batch processing here, but try this out first and see how well (if at all) it performs...

I had similar issues with a memory leak. I'm running Doctrine in a Symfony 5.2 project. More specifically, I built a never-ending Command which processes entries from one table, retrieves entries from another table, and creates 2 new entries in other tables. (Event Processing)

I solved my leakage problems in two steps.

  1. I use --no-debug when running the command (as already suggested by Jonathan)
  2. I added $this->entityManager->clear(); at the end of the loop

In order to see and identify the leaks, I used the following line to output the current memory usage:

$output->writeln('Memory Usage in MB: ' . memory_get_usage() / 1024 / 1024);

Maybe this helps anyone still fighting with leaks.

I encountered the same issue, and disabling the query cache helped me.

$query = $this
    ->createQueryBuilder('e')
    ->update('MyEntity', 'e')
    ->set('e.count', 'e.count + :amount')
    ->where('e.id = :id')
    ->setParameter('id', $id)
    ->setParameter('amount', $amount)
    ->getQuery()
    ->useQueryCache(false); // <-- this line
