Large SQL dataset query using Java
I have the following configuration:
Basically, what I want to do is run a SELECT with a WHERE clause on a table. The problem is that the table has about 700M rows and the query takes a really long time.
Can you give me some pointers on where to optimize the query, or on which techniques I can use to improve performance?
Thanks.
Using indexes is the standard technique for dealing with this problem. As requested, here are some pointers that should get you started:
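As a rough illustration of adding an index from Java, here is a minimal JDBC sketch. The class name `IndexSetup`, the index name, and the column names are all hypothetical, and the plain `CREATE INDEX name ON table (columns)` form is assumed to be accepted by your database; check your vendor's documentation for options such as included columns or online builds.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class IndexSetup {

    /**
     * Builds a plain CREATE INDEX statement.
     * The identifiers are concatenated, not bound, so they must come from
     * trusted code and never from user input.
     */
    public static String createIndexSql(String indexName, String table, String... columns) {
        return "CREATE INDEX " + indexName + " ON " + table
                + " (" + String.join(", ", columns) + ")";
    }

    /** Executes the statement on an open connection (driver and URL are up to you). */
    public static void createIndex(Connection conn, String indexName, String table,
                                   String... columns) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute(createIndexSql(indexName, table, columns));
        }
    }
}
```

The columns to index are normally the ones that appear in the WHERE clause and in JOIN conditions. On a table of this size, building the index can itself take a long time, so it is probably best done outside peak hours.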
The first thing I do in this case is determine whether the problem is the amount of data being returned (an I/O issue). A simple, non-scientific way to do this is to change your query to return just the count:
select count(*) --just return a count, no data!
from MyTable
inner join MyOtherTable on ...
where ...
If this runs very quickly, it tells you your indexes are in order (assuming no sub-selects in your WHERE clause). If not, then you need to work on indexes, the WHERE clause, or the query construction itself (the JOINs being done, etc.).
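Since the question is about Java, the same check can be timed from the application side. This is only a sketch: `CountProbe`, `Timed`, and `runCount` are hypothetical names, and the COUNT(*) query text is whatever you derived from your real query.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.function.Supplier;

public class CountProbe {

    /** Value returned by a task together with the wall-clock time it took. */
    public static final class Timed<T> {
        public final T value;
        public final long elapsedMillis;

        Timed(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Runs the task once and measures elapsed wall-clock time in milliseconds. */
    public static <T> Timed<T> time(Supplier<T> task) {
        long start = System.nanoTime();
        T value = task.get();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Timed<>(value, elapsedMillis);
    }

    /** Executes a COUNT(*) query and returns the single number it produces. */
    public static long runCount(Connection conn, String countSql) {
        try (PreparedStatement ps = conn.prepareStatement(countSql);
             ResultSet rs = ps.executeQuery()) {
            if (!rs.next()) {
                throw new IllegalStateException("count query returned no row");
            }
            return rs.getLong(1);
        } catch (SQLException e) {
            // Wrapped so the method can be used inside a Supplier lambda.
            throw new IllegalStateException("count query failed", e);
        }
    }
}
```

A call such as `CountProbe.time(() -> CountProbe.runCount(conn, countSql))` gives one data point. Like the manual check, it is non-scientific: caching on the database side can make a second run look much faster than the first.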
Once that is satisfactory, add your SELECT clause back in. If it is slow, you are going to have to look at your data access pattern:
I would run Profiler to find the exact query that is being generated. ORMs can create less-than-optimal queries. Once you know the query, you can run it in SSMS and see the execution plan. This will give you clues as to where you have performance problems.
Several things that can cause performance problems:
There's more (after all, whole very long books are written on this subject), but that should be enough to get you started on where to look.
You should provide indexes for the columns you often use to restrict the result. The other thing is pagination of the result set.
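One way to paginate a very large table is keyset pagination: each page asks for rows whose key is greater than the last key already seen, so the database can seek through an index instead of skipping rows. The sketch below is an illustration, not the answer's own method. `KeysetPager`, the table name, and the key column are hypothetical, the key is assumed to be a unique, indexed numeric column, and the row cap relies on JDBC's `setMaxRows` rather than a dialect-specific clause such as LIMIT or TOP.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class KeysetPager {

    /**
     * Builds the page query. Identifiers are concatenated, not bound,
     * so they must come from trusted code.
     */
    public static String pageSql(String table, String keyColumn) {
        return "SELECT " + keyColumn + " FROM " + table
                + " WHERE " + keyColumn + " > ? ORDER BY " + keyColumn;
    }

    /** Returns up to pageSize keys that come after afterKey, in ascending order. */
    public static List<Long> fetchPage(Connection conn, String table, String keyColumn,
                                       long afterKey, int pageSize) throws SQLException {
        List<Long> keys = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(pageSql(table, keyColumn))) {
            ps.setLong(1, afterKey);
            ps.setMaxRows(pageSize);   // cap the rows handed back by the driver
            ps.setFetchSize(pageSize); // hint: fetch the page in one round trip
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    keys.add(rs.getLong(1));
                }
            }
        }
        return keys;
    }
}
```

To walk the table, pass the last key of each page as `afterKey` for the next call and stop when a page comes back empty. How efficiently `setMaxRows` is applied depends on the driver, so a dialect-specific row limit in the SQL itself may perform better on your database.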
Regardless of the specific DB, I would do the following: