
SQLite uses autoindex instead of my own index

I have a problem with SQLite's autoindex on a table with a UNIQUE column. I created the table as shown below.

c.execute('''CREATE TABLE user(
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    password TEXT NOT NULL,
    name TEXT NOT NULL,
    );'''
)
c.execute('CREATE INDEX USR on user(email, password);')

But when I check with EXPLAIN QUERY PLAN, SQLite uses the autoindex it created itself. How can I make it use my own index instead of its autoindex? Here is what I tried:

c.execute('EXPLAIN QUERY PLAN SELECT id, name FROM social WHERE email = "a@a.com" AND password = 'password'')

And the result is:

(0, 0, 0, 'SEARCH TABLE social USING INDEX sqlite_autoindex_user_1 (email=?)')

In your case, I think "sqlite_autoindex_user_1" is the index SQLite uses to implement the declared UNIQUE constraint on "email". Despite the name, it's an internal index, not an automatic index.

Do not confuse automatic indexes with the internal indexes (having names like "sqlite_autoindex_table_N") that are sometimes created to implement a PRIMARY KEY constraint or UNIQUE constraint. The automatic indexes described here exist only for the duration of a single query, are never persisted to disk, and are only visible to a single database connection. Internal indexes are part of the implementation of PRIMARY KEY and UNIQUE constraints, are long-lasting and persisted to disk, and are visible to all database connections. The term "autoindex" appears in the names of internal indexes for legacy reasons and does not indicate that internal indexes and automatic indexes are related.

Source
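A minimal sketch of how to see the internal index for yourself, assuming an in-memory database and the question's schema (with the trailing comma removed). Internal indexes are listed in sqlite_master alongside explicit ones, but their sql column is NULL because no CREATE INDEX statement produced them.

```python
import sqlite3

# Assumption: throwaway in-memory database using the question's schema.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE user(
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    password TEXT NOT NULL,
    name TEXT NOT NULL
)""")
conn.execute("CREATE INDEX USR ON user(email, password)")

# Map each index name on the table to the SQL text that created it.
indexes = dict(conn.execute(
    "SELECT name, sql FROM sqlite_master "
    "WHERE type = 'index' AND tbl_name = 'user'"
))

for name, sql in indexes.items():
    # The internal index backing the UNIQUE constraint has sql = None.
    print(name, "->", sql)
```

The index backing the UNIQUE constraint shows up as sqlite_autoindex_user_1 even though it was never created explicitly, and it is stored in the database file like any other index.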

The query optimizer decided that using the index on "email" would be fastest. It's probably right.


To see how SQLite might use your covering index "medp", build a test table like this.

create table social (
  id integer primary key, 
  name text not null,    -- no UNIQUE constraint for testing
  tampil integer not null
);

create index medp on social (name, tampil);

Insert a million rows if you want to.

analyze social;
explain query plan select * from social where name = 'facebook' and tampil = 6;
0|0|0|SEARCH TABLE social USING COVERING INDEX medp (name=? AND tampil=?)

Due to the UNIQUE constraint, the database knows that the lookup on the email column can return at most one row. This means that only a single password value needs to be checked, and that can be done just as easily by looking at the table row that has already been fetched.

The two-column index needs more space than the single-column one, so it would be slower to load from disk.

To force the database to use your index, you could use the INDEXED BY clause, but this would not improve performance.
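As a rough illustration of INDEXED BY, the sketch below compares the plan SQLite picks by default with the plan it produces when the index is named explicitly. It assumes an in-memory database, the question's schema with the table called user, and a small hypothetical helper named plan that returns the detail text of each EXPLAIN QUERY PLAN row.

```python
import sqlite3

# Assumption: throwaway in-memory database using the question's schema.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE user(
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    password TEXT NOT NULL,
    name TEXT NOT NULL
)""")
conn.execute("CREATE INDEX USR ON user(email, password)")


def plan(sql):
    """Hypothetical helper: return the detail text of each plan row."""
    # The detail string is the last column of every EXPLAIN QUERY PLAN row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# String literals use single quotes, as standard SQL expects.
where = "WHERE email = 'a@a.com' AND password = 'password'"

# Let the planner choose freely.
default_plan = plan("SELECT id, name FROM user " + where)

# Name the index explicitly; the statement fails to prepare if the
# planner cannot use USR for this query.
forced_plan = plan("SELECT id, name FROM user INDEXED BY USR " + where)

print("default:", default_plan)
print("forced: ", forced_plan)
```

The exact wording of the plan text varies between SQLite versions, so treat the printed strings as diagnostics rather than a stable format.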

I see a few problems here.

  1. The first SQL statement (CREATE TABLE ...) is malformed due to an extra comma between the last column and the closing parenthesis.

  2. The third SQL statement (EXPLAIN QUERY PLAN SELECT ...) uses double quotes around the specified email. SQLite will allow this for backward compatibility reasons, but it is not recommended.

  3. The string holding the third SQL statement is not quoted properly in any language I'm familiar with.

  4. Finally, the table name social is not defined in the limited schema you gave, so really we have no way of knowing what real indexes are available.

If we assume you meant "user" when you typed "social", then, as CL. said, the "autoindex" in use guarantees that the email is unique, so nothing else is needed. SQLite might have used your explicit index as a covering index if you didn't need the name column as well, but since your explicit index does not include the name column (which the select statement requires), it deems the autoindex best. Additionally, the autoindex is almost certainly better for this case: the smaller index (email only, versus email and password) means there will likely be fewer reads of btree pages while looking up the requested email address.
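The covering-index point can be checked with a short sketch. It assumes an in-memory database, the question's schema with the table called user, and a hypothetical plan helper; INDEXED BY is used only to pin the query to USR so that the two plans are directly comparable.

```python
import sqlite3

# Assumption: throwaway in-memory database using the question's schema.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE user(
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    password TEXT NOT NULL,
    name TEXT NOT NULL
)""")
conn.execute("CREATE INDEX USR ON user(email, password)")


def plan(sql):
    """Hypothetical helper: return the detail text of each plan row."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


where = "WHERE email = 'a@a.com' AND password = 'password'"

# id is the rowid, which every index entry carries, so USR alone
# can answer this query without touching the table.
covering = plan("SELECT id FROM user INDEXED BY USR " + where)

# name is not stored in USR, so the table row must be fetched as well.
not_covering = plan("SELECT id, name FROM user INDEXED BY USR " + where)

print(covering)
print(not_covering)
```

Only the first plan should mention a covering index. Once name is requested, every candidate index needs a table lookup anyway, which removes the one advantage the two-column index could have had.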

From the documentation:

Choosing between multiple indices

(...)

When faced with a choice of two or more indices, SQLite tries to estimate the total amount of work needed to perform the query using each option. It then selects the option that gives the least estimated work.

To help the optimizer get a more accurate estimate of the work involved in using various indices, the user may optionally run the ANALYZE command. The ANALYZE command scans all indices of database where there might be a choice between two or more indices and gathers statistics on the selectiveness of those indices. The statistics gathered by this scan are stored in special database tables whose names all begin with "sqlite_stat". The content of these tables is not updated as the database changes so after making significant changes it might be prudent to rerun ANALYZE. The results of an ANALYZE command are only available to database connections that are opened after the ANALYZE command completes.

(...)

So, you can run the ANALYZE command to rescan the indices, but this does not guarantee that the optimizer will prefer your index.
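For reference, here is a small sketch of running ANALYZE and reading back what it stored. It assumes an in-memory database, the question's schema with the table called user, and 100 made-up rows; sqlite_stat1 is the statistics table that ANALYZE fills in.

```python
import sqlite3

# Assumption: throwaway in-memory database using the question's schema.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE user(
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    password TEXT NOT NULL,
    name TEXT NOT NULL
)""")
conn.execute("CREATE INDEX USR ON user(email, password)")

# Made-up sample data so that ANALYZE has something to measure.
conn.executemany(
    "INSERT INTO user(email, password, name) VALUES (?, ?, ?)",
    [(f"user{i}@example.com", "secret", f"User {i}") for i in range(100)],
)

# Gather statistics for every index in the database.
conn.execute("ANALYZE")

# Each row of sqlite_stat1 describes one index: the stat string starts
# with the row count, followed by average rows per distinct key prefix.
stats = dict(conn.execute(
    "SELECT idx, stat FROM sqlite_stat1 WHERE tbl = 'user'"
))

for idx, stat in stats.items():
    print(idx, "->", stat)
```

These numbers only feed the planner's cost estimates. They do not change which index can answer the query with the least work.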

To force a given index to be used you can use the INDEXED BY phrase. From the documentation:

The INDEXED BY phrase forces the SQLite query planner to use a particular named index on a DELETE, SELECT, or UPDATE statement. The INDEXED BY phrase is an SQLite extension and is not portable to other SQL database engines.
