Flushing the cache for benchmarking in PostgreSQL 9.1

Original post: 2012-03-28 12:41:43. Tags: query-optimization, benchmarking, postgresql-9.1

I am performing some benchmarking tasks using PostgreSQL 9.1 running on Debian Linux. I would like to benchmark a workload of queries that share a common part. Before running each query I restart the database and execute the following command:

echo 3 > /proc/sys/vm/drop_caches

aiming at dropping both the shared memory and the OS cache. However, I have noticed that if I run the same query workload in a different order, I get different query response times. I suspect that the query optimizer somehow either 'remembers' how to efficiently execute the common query parts or reuses some previously cached results.

Do you have any ideas on how to work around this issue? I would like to get roughly the same response times regardless of query ordering. Note that I am parsing the EXPLAIN output to extract the actual running times.

1 Answer

The first thing which comes to mind is that autovacuum (a background maintenance task in PostgreSQL: http://www.postgresql.org/docs/current/interactive/routine-vacuuming.html#AUTOVACUUM ) may be doing some work which re-populates your cache in hard-to-predict ways. You can disable it, but be aware that this can lead to bloat, to bad statistics that cause bad plan choices, and to additional work being pushed onto front-end processes -- so it is generally not recommended. Another way to approach this would be to run VACUUM FREEZE ANALYZE after loading your data, to put everything into a well-maintained state, then stop PostgreSQL, flush your OS cache, start PostgreSQL up again, and do your benchmark.
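The sequence above could be scripted roughly as follows. This is only a sketch under assumptions that are not in the original post: the data directory path, the database name "bench", and the availability of pg_ctl, psql, and sudo on PATH are all hypothetical (on Debian the cluster is often managed through wrapper scripts instead). By default the script only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch: bring PostgreSQL and the OS page cache to a cold state before a benchmark run.
# Hypothetical values (not from the original post): PGDATA path, database name.
PGDATA="${PGDATA:-/var/lib/postgresql/9.1/main}"
DBNAME="${DBNAME:-bench}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

prepare_once() {
    # Run once after loading the data, so later runs trigger less maintenance work.
    run psql -d "$DBNAME" -c "VACUUM FREEZE ANALYZE"
}

cold_start() {
    # Stopping the server discards the contents of shared_buffers.
    run pg_ctl -D "$PGDATA" -m fast -w stop
    # Write dirty pages to disk first; only clean pages can be dropped.
    run sync
    # Drop the page cache plus dentries and inodes (needs root).
    run sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
    run pg_ctl -D "$PGDATA" -w start
}

prepare_once
cold_start
```

Calling cold_start before every query, and prepare_once only once after the load, keeps the maintenance work out of the timed runs.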

Another possible source of issues is checkpoints; you should make sure you have checkpoint_segments configured high enough to avoid forcing frequent checkpoints, and you should consider the checkpoint_timeout setting in terms of when the checkpoints will occur during your benchmark.
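One way to check whether a checkpoint fell inside a timed run is to compare the checkpoint counters in the pg_stat_bgwriter view before and after the query. The sketch below assumes a hypothetical database name "bench", a hypothetical query file, and psql on PATH. The statistics are reported with a short delay, so treat the result as an indication rather than an exact measurement.

```shell
#!/bin/sh
# Sketch: detect whether a checkpoint ran while a benchmark query was executing.
# Hypothetical value (not from the original post): database name.
DBNAME="${DBNAME:-bench}"

checkpoint_count() {
    # Total checkpoints so far: timed ones plus those requested (e.g. by WAL volume).
    psql -d "$DBNAME" -At -c \
        "SELECT checkpoints_timed + checkpoints_req FROM pg_stat_bgwriter"
}

checkpoints_during() {
    # Usage: checkpoints_during BEFORE AFTER
    # Prints how many checkpoints happened between the two snapshots.
    echo $(( $2 - $1 ))
}

# Example flow, commented out because it needs a running server
# (query1.sql is a hypothetical file name):
# before=$(checkpoint_count)
# psql -d "$DBNAME" -f query1.sql
# after=$(checkpoint_count)
# echo "checkpoints during run: $(checkpoints_during "$before" "$after")"
```

If the difference is non-zero for some runs and zero for others, checkpoint activity is a plausible contributor to the timing differences.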

It's also possible that a RAID controller card or hard drive may be caching enough to matter -- I don't know whether flushing the OS cache clears those, but I doubt it.

In general, keep in mind that PostgreSQL ships with settings designed to let the database start up and function on a smallish laptop -- optimal performance generally requires some tuning, so unless your benchmarks are testing the effects of different configuration settings, you might want to review the overall configuration before benchmarking.