Method to not parse the whole table for an SQL Query in Java - JDBC

Let's say we have a very large table, and we have queries of the following form (this is only an example):

SELECT personID FROM people WHERE birthYear>2010 LIMIT 50

I want to maximize the performance of getting the results of that query. The problem is that the database will parse the whole table to find the tuples that match the condition and then return the first 50. That is a problem if we have a database with millions or billions of tuples.

Is there a way in Java - JDBC or SQL to not parse the whole table? For example, could it parse the table progressively and return the first 50 rows that match the condition, or parse the first 1000 rows, return all of those that match, and keep fetching more results when the user clicks a "Show More" button?

Thank you for your time.

The problem is not real. Here's an analysis of what might happen:

SELECT personID FROM people WHERE birthYear>1900 LIMIT 50
SELECT personID FROM people WHERE birthYear>2010 LIMIT 50

Case 1: No index on birthYear:

  • 1900: It will scan the table only until 50 rows match the WHERE clause. These are likely to be the first 50 rows.
  • 2010: It will scan most or all of the table unless it is a catalog of young kids. So, it might need to read all the rows to find 50.
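
To make the early stop in Case 1 concrete, here is a toy in-memory simulation. It is only a sketch: the class name, the array standing in for the table, and the birth-year distribution are all made up for illustration, and a real engine reads pages from disk rather than looping over an array. The point it shows is that an unindexed scan with LIMIT stops once enough rows have matched, so the cost depends on how dense the matches are.

```java
import java.util.ArrayList;
import java.util.List;

final class LimitScanDemo {
    /** Result of a simulated scan: matched IDs plus how many rows were examined. */
    static final class ScanResult {
        final List<Integer> personIds = new ArrayList<>();
        int rowsExamined = 0;
    }

    // Walks the rows in storage order and stops as soon as `limit` rows match,
    // mimicking "WHERE birthYear > minYear LIMIT limit" without an index.
    static ScanResult scan(int[] birthYears, int minYear, int limit) {
        ScanResult result = new ScanResult();
        for (int personId = 0; personId < birthYears.length; personId++) {
            if (result.personIds.size() >= limit) {
                break; // enough matches, no need to look at the rest of the table
            }
            result.rowsExamined++;
            if (birthYears[personId] > minYear) {
                result.personIds.add(personId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical table: 1,000,000 rows, birth years cycling through 1950..2019.
        int[] birthYears = new int[1_000_000];
        for (int i = 0; i < birthYears.length; i++) {
            birthYears[i] = 1950 + (i % 70);
        }
        System.out.println("rows examined for >1900: " + scan(birthYears, 1900, 50).rowsExamined);
        System.out.println("rows examined for >2010: " + scan(birthYears, 2010, 50).rowsExamined);
        System.out.println("rows examined for >2019: " + scan(birthYears, 2019, 50).rowsExamined);
    }
}
```

With this made-up data, the >1900 scan stops after the first 50 rows, the >2010 scan has to read further because matches are sparser, and the >2019 scan reads every row because fewer than 50 rows match.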

Case 2: An index starting with birthYear:

It will jump into the middle of the index to the first entry with birthYear >1900 (or >2010), then grab the next 50 rows (or fewer). For each of those rows, it will reach into the table for personID.

Case 3: INDEX(birthYear, personID):

As with case 2, but it does not need to "reach into the table". This is because personID is part of the index.

Only in case 1, and only if fewer than 50 rows have birthYear >1900 (seems unlikely), will it scan the entire table. Cases 2 and 3 stop promptly at 50.
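
For the "Show More" part of the question, one common approach is keyset pagination: remember the last row shown and ask for the rows that sort after it. This is not something the analysis above prescribes, so treat the following as a hedged sketch. It assumes an index like the one in case 3, that personID is an integer key, and that the database accepts a bound parameter for LIMIT (MySQL and PostgreSQL do). The class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class ShowMorePager {
    /** One row of a page; the last row of a page doubles as the cursor for the next call. */
    static final class Person {
        final long personId;
        final int birthYear;

        Person(long personId, int birthYear) {
            this.personId = personId;
            this.birthYear = birthYear;
        }
    }

    // Resume strictly after the last (birthYear, personID) pair already shown.
    // The explicit ORDER BY makes "the next 50" well defined across calls.
    static final String PAGE_SQL =
        "SELECT personID, birthYear FROM people "
      + "WHERE birthYear > ? "
      + "AND (birthYear > ? OR (birthYear = ? AND personID > ?)) "
      + "ORDER BY birthYear, personID LIMIT ?";

    /** Pass last == null for the first page, then the last row of the previous page. */
    static List<Person> nextPage(Connection conn, int minYear, Person last, int pageSize)
            throws SQLException {
        // With no cursor yet, the second condition collapses to birthYear > minYear.
        int lastYear = (last == null) ? minYear : last.birthYear;
        long lastId = (last == null) ? Long.MIN_VALUE : last.personId;

        List<Person> page = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(PAGE_SQL)) {
            ps.setInt(1, minYear);
            ps.setInt(2, lastYear);
            ps.setInt(3, lastYear);
            ps.setLong(4, lastId);
            ps.setInt(5, pageSize);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    page.add(new Person(rs.getLong("personID"), rs.getInt("birthYear")));
                }
            }
        }
        return page;
    }
}
```

Each click on "Show More" would call nextPage with the last Person of the previous page. Because every call starts from an index position instead of skipping rows with OFFSET, later pages should cost about the same as the first one.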

Two things you have to do in this case:

  1. Create an index on birthYear and personID
  2. Partition the table by year
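
As a rough sketch of step 1 from JDBC: the index name below is made up, and the statement assumes the people table from the question. Partitioning (step 2) is left out because its DDL syntax differs from one database to another.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class PeopleIndexSetup {
    // Hypothetical index name. birthYear comes first so the range filter can seek into the index;
    // personID is included so the query can be answered from the index alone.
    static final String CREATE_INDEX_SQL =
        "CREATE INDEX idx_people_birthyear_personid ON people (birthYear, personID)";

    /** Runs the DDL once, for example from a migration step. */
    static void createIndex(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.executeUpdate(CREATE_INDEX_SQL);
        }
    }
}
```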
