
Fine-tuning an Oracle query with a pipelined function

Originally posted 2014-02-20 19:31:33. Tags: sql, oracle, oracle-apex, database-performance, pipelined-function

I have a query (it powers an Oracle Application Express report) that our users tell me is executing "slowly", or at an unacceptable speed. I wasn't given an actual load time for the page, and the query is the only thing on the page.

The query involves many tables and references a pipelined function, which identifies the user currently logged in to our website and returns a custom "table" of the records they have permission to access, based on our custom security scheme.

My main question is about Oracle's caching of queries and how it could be affected by our setup.

When I took the query out of the webpage and ran it in SQL Developer (manually specifying a user ID to simulate a logged-in website user), the run time went from 71 seconds to 19 seconds to 0.5 seconds across successive runs. Clearly, Oracle is using its caching mechanism to make subsequent runs faster.

How is this affected by the following?

The fact that different users will get different tables from the pipelined function (the same columns, just a different number of rows and different values in the rows). Does the pipelining prevent caching from working? Am I only seeing caching because I'm running a very isolated test?
Furthermore, is caching easily influenced by the number of people using the system? I'm not sure how much can get cached. If we have 50 concurrent users accessing different parts of the website and loading different queries all day long, is it likely that Oracle won't be able to cache many (or any) of them because it is constantly seeing requests for different queries?

Sorry my question isn't very technical.

I'm a developer who has been asked to help out with what seems to be a DBA question.

Also, this is complicated because I can't really determine what the actual load times are, since our users don't report that level of detail.

Any thoughts on:

how I can determine if this query is actually slow?
what the average processing time would be?
how to proceed with fine-tuning if it is a problem?

Thanks!

1 Answer

It doesn't sound like this has anything to do with APEX, pipelined table functions, or query caching. It sounds like you are describing the effects of plain old data caching (most likely at the database level, but potentially at the operating system and disk subsystem layers as well).

As a very basic overview, data is stored in rows, rows are stored in blocks (most commonly 8 KB in size), blocks are stored in extents (generally a few MB in size), and extents roll up to segments (such as a table). Oracle maintains a buffer cache where the most recently accessed blocks are stored. When you run a query, Oracle figures out which blocks it needs to read in order to get your data (this is what the query plan determines). It then checks whether those blocks are in the buffer cache or whether they have to be read from disk. Reading a block from cache is much more efficient than reading it off disk, since RAM is much faster than disk. If you run the same query with the same set of bind variable values multiple times in a row, you'll be accessing the same set of blocks each time, but more and more of the blocks you care about are going to be in the cache. So you'd generally expect to see faster performance the second and third time you run the query.
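
To make the warm-up effect concrete, here is a toy simulation in Python. It is only a sketch of a least-recently-used block cache, not Oracle's actual buffer cache algorithm; the class name, the capacity of 1000 blocks, and the 500 blocks the "query" touches are all made-up assumptions for illustration.

```python
from collections import OrderedDict


class BufferCache:
    """Toy LRU cache of block ids; a stand-in for a database buffer cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def read(self, block_id):
        """Return True on a cache hit, False if the block came from 'disk'."""
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as most recently used
            return True
        self.blocks[block_id] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return False


def run_query(cache, block_ids):
    """Read every block the 'query' needs and count the simulated disk reads."""
    return sum(1 for b in block_ids if not cache.read(b))


cache = BufferCache(capacity=1000)  # hypothetical cache size, in blocks
needed = list(range(500))           # hypothetical: the query touches 500 blocks
disk_reads = [run_query(cache, needed) for _ in range(3)]
print(disk_reads)  # [500, 0, 0]
```

A real system rarely flips this cleanly from cold to warm, because other sessions compete for the same cache, but the direction of the effect is the same: repeated runs need fewer reads from disk.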

If you run the query with a different set of bind variable values, and that second set causes Oracle to access many of the same blocks, those executions will benefit from the data the prior test cached. Otherwise, you'd be back to square one, potentially reading all the data you need off disk. Most likely, you'll see some combination of the two.
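
The effect of overlapping block sets can be sketched the same way. This is a simplified model that ignores eviction entirely; the three block ranges are invented to stand in for three different bind variable values (for example, three different users) and are not taken from any real system.

```python
def disk_reads(cached, needed):
    """Count blocks in `needed` that are not cached yet, then cache them."""
    misses = needed - cached
    cached |= needed  # in-place update: the caller's set now holds these blocks
    return len(misses)


cached = set()
user_a = set(range(0, 500))      # hypothetical blocks read for one bind value
user_b = set(range(300, 800))    # a second value that shares blocks 300-499
user_c = set(range(2000, 2500))  # a third value that shares nothing

reads = [disk_reads(cached, blocks) for blocks in (user_a, user_b, user_c)]
print(reads)  # [500, 300, 500]
```

The second run benefits from the 200 blocks it shares with the first, while the third starts cold, which is the "combination of the two" you would expect across a mix of users.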

Remember as well that it is not just Oracle that is caching data. Frequently, the operating system will be caching the most active pieces of the underlying Oracle data files, and the I/O subsystem will be caching the most recently accessed data as well. So even if Oracle decides that it needs to go out and fetch a block because it is not in the database's buffer cache, the file system or the I/O subsystem may have cached that data, so the request may not require an actual physical read from disk. These other caches behave similarly: running the same query multiple times in a row is likely to make them "warm" and improve the performance of the later runs.
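
The layering can be sketched as a chain of lookups. Again, this is only an illustrative model: the layer names and the `read_block` helper are assumptions made for the example, and real operating system and storage caches are far more involved.

```python
def read_block(block_id, layers):
    """Return the name of the layer that served the block, warming the layers above it."""
    served_by, depth = "disk", len(layers)
    for i, (name, store) in enumerate(layers):
        if block_id in store:
            served_by, depth = name, i
            break
    for _, store in layers[:depth]:
        store.add(block_id)  # each layer the request passed through keeps a copy
    return served_by


# Hypothetical layer names, ordered from closest to the database outward.
layers = [("buffer cache", set()), ("OS file cache", set()), ("storage cache", set())]

sources = [read_block(7, layers)]      # cold: nothing holds block 7 yet
layers[0][1].discard(7)                # pretend the database aged the block out
sources.append(read_block(7, layers))  # the database misses, the OS cache answers
sources.append(read_block(7, layers))  # now back in the buffer cache
print(sources)  # ['disk', 'OS file cache', 'buffer cache']
```

The middle read is the case described above: the database counts it as a read from outside its own cache, yet no physical disk access happens.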