Flushing the cache for benchmarking in PostgreSQL 9.1

I am running some benchmarks on PostgreSQL 9.1 on Debian Linux. I would like to benchmark a workload of queries that share a common part. Before running each query I restart the database and execute the following command:

echo 3 > /proc/sys/vm/drop_caches

aiming to clear both PostgreSQL's shared memory and the OS cache. However, I have noticed that if I run the same query workload in a different order, I get different query response times. I suspect that either the query optimizer somehow 'remembers' how to execute the common query parts efficiently, or some previously cached results are being reused.

Do you have any ideas how to work around this issue? I would like to get roughly the same response times regardless of query ordering. Note that I am parsing the EXPLAIN output to extract the actual running times.

The first thing that comes to mind is that autovacuum (a background maintenance task in PostgreSQL: http://www.postgresql.org/docs/current/interactive/routine-vacuuming.html#AUTOVACUUM ) may be doing some work which is re-populating your cache in hard-to-predict ways. You can disable it, but be aware that this can lead to bloat, bad statistics leading to bad plan choice, and pushing additional work onto front-end processes -- so it is generally not recommended. Another way to approach this would be to run VACUUM FREEZE ANALYZE after loading your data, which puts everything into a well-maintained state; then stop PostgreSQL, flush your OS cache, start PostgreSQL again, and run your benchmark.
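
The sequence described above (settle the tables once, then stop the server, flush the OS cache, and start again before each query) could be scripted roughly as follows. This is only a sketch under assumptions: the data directory path, the database name `bench`, and the direct use of `pg_ctl` are placeholders (a Debian installation may manage the server through its own wrapper scripts instead), and the helper names `run`, `prepare_tables`, and `cold_restart` are made up for illustration. The `sync` call is an addition, included because `drop_caches` only discards clean pages. The script defaults to a dry run that just prints the commands.

```shell
#!/bin/sh
# Sketch: bring PostgreSQL to a cold-cache state before a benchmark query.
# PGDATA_DIR and DB_NAME are assumed values; adjust them for your installation.
PGDATA_DIR="${PGDATA_DIR:-/var/lib/postgresql/9.1/main}"
DB_NAME="${DB_NAME:-bench}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run once after loading the data: freeze tuples and refresh statistics.
prepare_tables() {
    run psql -d "$DB_NAME" -c "VACUUM FREEZE ANALYZE"
}

# Run before each query: the restart empties shared buffers,
# and drop_caches empties the OS page cache.
cold_restart() {
    run pg_ctl -D "$PGDATA_DIR" -m fast -w stop      # as the postgres OS user
    run sync                                         # write dirty pages out first
    run sh -c 'echo 3 > /proc/sys/vm/drop_caches'    # needs root
    run pg_ctl -D "$PGDATA_DIR" -w start             # as the postgres OS user
}
```

Call `prepare_tables` once after the load and `cold_restart` before each query, with `DRY_RUN=0` once the printed commands look right. Note that the `pg_ctl` steps and the `drop_caches` step need different privileges, so in practice they may have to be split between the postgres user and root.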

Another possible source of issues can be checkpoints; you should make sure you have checkpoint_segments configured high enough to avoid forcing frequent checkpoints, and you should consider the checkpoint_timeout setting in terms of when the checkpoints will occur during your benchmark.
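
One way to see whether checkpoints are landing inside a benchmark run is to turn on `log_checkpoints` in postgresql.conf and count the checkpoint entries in the server log afterwards. The helper below is a rough sketch, not an official tool: the function name `count_checkpoints` is made up, the log file location depends on your installation, and the patterns assume the server writes lines containing "checkpoint starting:" followed by `xlog` (forced by WAL volume) or `time` (triggered by checkpoint_timeout), so check them against your own log before relying on the counts.

```shell
#!/bin/sh
# Sketch: count checkpoints recorded in a PostgreSQL server log.
# Assumes log_checkpoints = on; the log message wording is an assumption.

# Usage: count_checkpoints /path/to/postgresql.log
# Prints: "<forced by WAL volume> <triggered by timeout>"
count_checkpoints() {
    logfile="$1"
    # grep -c prints 0 when nothing matches, so both values are always numeric.
    by_xlog=$(grep -c 'checkpoint starting:.*xlog' "$logfile")
    by_time=$(grep -c 'checkpoint starting:.*time' "$logfile")
    echo "$by_xlog $by_time"
}
```

If the first number is above zero for a benchmark run, checkpoint_segments is probably too low for that workload; the second number reflects how checkpoint_timeout lines up with the length of the run.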

It's also possible that a RAID controller card or hard drive may be caching enough to matter -- I don't know whether flushing the OS cache clears those, but I doubt it.

In general, keep in mind that PostgreSQL ships with settings designed to let the database start up and function on a smallish laptop -- optimal performance generally requires some tuning, so unless your benchmarks are testing the effects of different configuration settings, you might want to review your overall configuration before benchmarking.
