
Does SQLite3 support indexing? How can I speed up a query?

I am executing this query:

// Adjacent string literals are concatenated, so the SQL can span several lines.
NSString *querySQL = [NSString stringWithFormat:
    @"SELECT DISTINCT P1.ID_RUTA_PARADAS "
    @"FROM FastParadas AS P1 "
    @"WHERE P1.ID_ESTACION_INIT <= %d AND "
    @"%d <= P1.ID_ESTACION_END "
    @"INTERSECT "
    @"SELECT DISTINCT P2.ID_RUTA_PARADAS "
    @"FROM FastParadas AS P2 "
    @"WHERE P2.ID_ESTACION_INIT <= %d AND "
    @"%d <= P2.ID_ESTACION_END",
    (int)estacionOrigen.ID_Estacion, (int)estacionOrigen.ID_Estacion,
    (int)estacionDestino.ID_Estacion, (int)estacionDestino.ID_Estacion];

I want to speed it up. I tried creating some indexes, but there was no improvement. Does SQLite3 support indexes?

The database has 3900+ rows, and this query has to be repeated 1800+ times in less than a second.

"The database has 3900+ rows, and this query has to be repeated 1800+ times in less than a second."

No. Not going to happen outside of a machine with fantastically HUGE memory bandwidth using a highly optimized algorithm that scans the data in memory.

In any situation like this, it is critical that you design the data model such that this kind of query simply isn't necessary. 3900+ rows is really not that much, but 1800+ queries against that data is a hell of a lot.

Your best bet is to pursue a schema that eliminates the need for the 1800+ queries/second or, worst case, design the app such that the 1800+ queries/second are done behind a progress bar or something.
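One way to read the advice above is to fetch the 3900+ rows once and answer the repeated lookups in memory instead of issuing a query each time. The following is only a minimal sketch of that idea in Python: the column types, the demo rows, and the helper names load_intervals, routes_covering, and routes_between are assumptions for illustration, not part of the original schema or code.

```python
import sqlite3


def load_intervals(conn):
    """Read the whole table once as (route_id, first_station, last_station)."""
    return conn.execute(
        "SELECT ID_RUTA_PARADAS, ID_ESTACION_INIT, ID_ESTACION_END "
        "FROM FastParadas"
    ).fetchall()


def routes_covering(intervals, station):
    """Route ids whose [INIT, END] range contains the station."""
    return {rid for rid, init, end in intervals if init <= station <= end}


def routes_between(intervals, origin, destination):
    """Same result as the INTERSECT query, computed without touching SQLite."""
    return routes_covering(intervals, origin) & routes_covering(intervals, destination)


# Hypothetical demo data; the real table layout may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FastParadas ("
    " ID_RUTA_PARADAS INTEGER,"
    " ID_ESTACION_INIT INTEGER,"
    " ID_ESTACION_END INTEGER)"
)
conn.executemany(
    "INSERT INTO FastParadas VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 5, 20), (3, 15, 30)],
)

intervals = load_intervals(conn)
print(sorted(routes_between(intervals, 6, 9)))  # [1, 2]
```

Whether this is fast enough on the target device would have to be measured; the point is only that the table is read once rather than 1800+ times.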

Besides the points from @bbum and @ipmcc regarding physical limitations, you won't have much luck with indexes in theory either. Put into natural language, what you are looking for is the ID_RUTA_PARADAS entry of all tuples whose ID_ESTACION_INIT is less than or equal to a given value and whose ID_ESTACION_END is greater than or equal to it.

How could an index help with that?

(1) Say you have an index on ID_ESTACION_INIT that supports range queries. You could get the rowids of all rows satisfying ID_ESTACION_INIT <= %d relatively fast. But then you still have to fetch all those rows to find out whether they also satisfy %d <= P1.ID_ESTACION_END.

(2) Say you have an index on ID_ESTACION_INIT and one on ID_ESTACION_END, both supporting range queries. Then each index could return the rowids of the rows satisfying its predicate, and the rowids returned by both indexes could be used to fetch ID_RUTA_PARADAS.

The problem with both of these approaches is that they require random access to disk, which only makes sense for small result sets (i.e. if few rows satisfy the predicates). For bigger cardinalities (I think I heard of >= 5%, but that might also have been just an example) your database system would go for a table scan to find all tuples, which means your index does not help.
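To see which of these strategies SQLite actually picks, you can ask it for the query plan. The sketch below uses Python's sqlite3 module with synthetic rows and two hypothetical index names (idx_init, idx_end); the schema types and the data are assumptions, and the plan text depends on the SQLite version and the collected statistics, so no particular output is claimed. In the plan, a line starting with SCAN means a full table scan, while SEARCH ... USING INDEX means an index is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FastParadas ("
    " ID_RUTA_PARADAS INTEGER,"
    " ID_ESTACION_INIT INTEGER,"
    " ID_ESTACION_END INTEGER)"
)
# Synthetic rows, only so the planner has statistics to work with.
conn.executemany(
    "INSERT INTO FastParadas VALUES (?, ?, ?)",
    [(i, i % 100, i % 100 + i % 17) for i in range(3900)],
)
# Hypothetical index names, one per range predicate.
conn.execute("CREATE INDEX idx_init ON FastParadas (ID_ESTACION_INIT)")
conn.execute("CREATE INDEX idx_end ON FastParadas (ID_ESTACION_END)")
conn.execute("ANALYZE")

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT ID_RUTA_PARADAS FROM FastParadas "
    "WHERE ID_ESTACION_INIT <= ? AND ? <= ID_ESTACION_END",
    (50, 50),
).fetchall()

# The human-readable description is the last column of each plan row.
for row in plan:
    print(row[-1])
```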

Here is a SQLFiddle to play around with indexes, and maybe also with other DBMSs: http://sqlfiddle.com/#!5/d1a86/2

(In fact, a clustered index could help to read fewer non-qualifying tuples, but SQLite does not support them; see "sqlite: Fastest way to get all rows (consecutive disk access)".)

In this query, the INTERSECT already takes care of removing duplicates, so you don't need the DISTINCT. The following query might be even faster:

SELECT DISTINCT ID_RUTA_PARADAS
FROM FastParadas
WHERE %d BETWEEN ID_ESTACION_INIT AND ID_ESTACION_END
  AND %d BETWEEN ID_ESTACION_INIT AND ID_ESTACION_END

However, a range query like this cannot be easily optimized with normal indexes. You should change your database to use a one-dimensional R-tree index, in which case 1800 queries/s might be possible.
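A one-dimensional R-tree could be set up roughly as follows. This is a sketch under several assumptions: the SQLite library behind Python's sqlite3 module was built with the R*Tree module, the virtual table name ParadasRange and its column names are made up for illustration, and the lookup follows the single-row form of the rewritten query (one row must cover both stations). The integer variant rtree_i32 is used so that station ids are stored as integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FastParadas ("
    " ID_RUTA_PARADAS INTEGER,"
    " ID_ESTACION_INIT INTEGER,"
    " ID_ESTACION_END INTEGER)"
)
# Hypothetical demo rows: (route id, first station, last station).
conn.executemany(
    "INSERT INTO FastParadas VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 5, 20), (3, 15, 30)],
)

# One dimension means one (min, max) pair next to the integer key.
# The key is the rowid of the matching FastParadas row.
conn.execute(
    "CREATE VIRTUAL TABLE ParadasRange "
    "USING rtree_i32(id, estacion_init, estacion_end)"
)
conn.execute(
    "INSERT INTO ParadasRange "
    "SELECT rowid, ID_ESTACION_INIT, ID_ESTACION_END FROM FastParadas"
)

QUERY = """
SELECT DISTINCT P.ID_RUTA_PARADAS
FROM ParadasRange AS R
JOIN FastParadas AS P ON P.rowid = R.id
WHERE R.estacion_init <= :lo AND R.estacion_end >= :hi
"""


def routes_between(conn, origin, destination):
    """Routes with a row whose range contains both stations."""
    lo, hi = sorted((origin, destination))
    rows = conn.execute(QUERY, {"lo": lo, "hi": hi}).fetchall()
    return sorted(r[0] for r in rows)


print(routes_between(conn, 6, 9))  # [1, 2]
```

Both stations lying inside [INIT, END] is the same as INIT <= min(origin, destination) and max(origin, destination) <= END, which is why the function sorts the two ids before binding them.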
