
Android - Fastest way to search data in SQLite database

I have an image processing app. My app stores the already processed images in a database. Every time the user opens the app, the app starts to check the database to see what photos have already been processed. With my code this process is taking around 10-20 seconds, which for my needs is a lot of time. The database only has one column, the path of the image. I take the full image list from the phone and then search every item of the list in the database.

My code is as follows:

public static ArrayList<String> getAlreadyProcessedPhotos(Context context, ArrayList<String> photos, SQLiteDatabase db)
{
    ArrayList<String> notAlreadyProcessedPhotos = new ArrayList<>();

    for(String path : photos)
    {
        File imgFile = new File(path);

        if (!Utils.isAlreadyProcessed(context, imgFile, db))
        {
            notAlreadyProcessedPhotos.add(path);
        }
    }

    return notAlreadyProcessedPhotos;
}

public static boolean isAlreadyProcessed(Context context, File imgFile, SQLiteDatabase photosDb) {
    if(photosDb == null || !photosDb.isOpen())
        photosDb = new DatabaseHelper(context).getReadableDatabase();

    String searchQuery = "SELECT * FROM " + DatabaseHelper.TABLE_NAME + " WHERE " + DatabaseHelper.PATH_COLUMN + "=?";

    Cursor cursor = photosDb.rawQuery(searchQuery, new String[] {imgFile.getAbsolutePath()});
    boolean result = cursor.moveToFirst();
    cursor.close();

    return result;
}

For each file that you want to check, you are executing a separate SQLite query. No wonder it's slow! If there are 100 files, you need to run 100 queries. This can be done with one simple query instead. You just need to combine your two methods into one:

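public static ArrayList<String> getAlreadyProcessedPhotos(Context context, ArrayList<String> photos, SQLiteDatabase db)
{
   if (db == null || !db.isOpen())
       db = new DatabaseHelper(context).getReadableDatabase();

   // Quote each path as a SQL string literal, doubling any embedded single quotes.
   ArrayList<String> preProc = new ArrayList<>();
   for (String item : photos) {
       preProc.add("'" + item.replace("'", "''") + "'");
   }
   String inClause = TextUtils.join(",", preProc);

   // One query: fetch every path from the list that is already in the database.
   String searchQuery = "SELECT " + DatabaseHelper.PATH_COLUMN + " FROM " + DatabaseHelper.TABLE_NAME + " WHERE " + DatabaseHelper.PATH_COLUMN + " IN (" + inClause + ")";
   HashSet<String> processed = new HashSet<>();
   Cursor cursor = db.rawQuery(searchQuery, null);
   while (cursor.moveToNext())
   {
        processed.add(cursor.getString(0));
   }
   cursor.close();

   // Whatever was not found in the database still needs processing.
   ArrayList<String> notAlreadyProcessedPhotos = new ArrayList<>();
   for (String path : photos) {
       if (!processed.contains(path))
           notAlreadyProcessedPhotos.add(path);
   }
   return notAlreadyProcessedPhotos;
}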

This runs a single query no matter how many photos there are. I don't know where your photos ArrayList comes from, but I get the feeling there is room for further optimization there as well.
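A variation on the same idea, in case the photo list is long enough to make the query string unwieldy: read every stored path once into a HashSet and do the comparison in memory. The sketch below shows only the in-memory step in plain Java, so it runs without Android. The class and method names are made up for illustration, and processedPaths stands in for whatever the cursor loop collected.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ProcessedFilter {
    // Returns the photos whose paths are NOT in processedPaths, keeping input order.
    // processedPaths is assumed to hold the paths read from the database.
    static List<String> filterUnprocessed(List<String> photos, Set<String> processedPaths) {
        List<String> unprocessed = new ArrayList<>();
        for (String path : photos) {
            if (!processedPaths.contains(path)) { // average O(1) lookup per photo
                unprocessed.add(path);
            }
        }
        return unprocessed;
    }
}
```

Each lookup in a HashSet takes constant time on average, so the cost grows linearly with the number of photos rather than with photos times table rows.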

The answer to almost all SQL (SQLite, MySQL, ...) speed issues is to create an index on the table. See: https://www.sqlite.org/lang_createindex.html My guess is that you're doing a full table scan for every imgFile you look up, and that is as slow as it gets.
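As a rough sketch of what that could look like here, the helper below builds the CREATE INDEX statement for a given table and column. The helper itself and the names "photos" and "path" are assumptions for illustration; on Android the resulting string would typically be passed to SQLiteDatabase.execSQL, for example from the DatabaseHelper's onCreate or onUpgrade.

```java
class PhotoIndexSql {
    // Builds the DDL for an index on a single column.
    // IF NOT EXISTS makes the statement safe to run more than once.
    static String createIndexSql(String table, String column) {
        return "CREATE INDEX IF NOT EXISTS idx_" + table + "_" + column
                + " ON " + table + "(" + column + ")";
    }
}
```

If the path column is already declared PRIMARY KEY or UNIQUE, SQLite maintains an index for it automatically and a separate one should not be needed.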

Other things you can do (but they won't help nearly as much as an index):
1) Since you are not using the row returned from SQLite, change your SQL to 'SELECT count(*) FROM ...', which returns an integer that is greater than zero if the path is present.

2) Add a LIMIT clause to the select statement: "SELECT ... LIMIT 1;". This allows SQLite to return as soon as the first matching record is found.
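One way to combine both suggestions is to select a constant instead of the row and stop at the first match. This is only a sketch; the helper name and the "photos"/"path" names are placeholders. The string it returns could be used with rawQuery and the path as the bind argument, just like in the question's isAlreadyProcessed, where cursor.moveToFirst() then tells you whether a row exists.

```java
class ExistsQuery {
    // Builds a query that returns at most one row containing the constant 1,
    // so SQLite can stop at the first match and never reads the row's columns.
    static String existsSql(String table, String column) {
        return "SELECT 1 FROM " + table + " WHERE " + column + " = ? LIMIT 1";
    }
}
```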

You already got the answers in the comments too!

First is the loop issue, as e4c5 suggested. Of course, fixing that will give a huge boost.

Second, replace SELECT * FROM table with SELECT field1WhatIreallyNeed, field2WhatIreallyNeed FROM table.

Adding an index on the fields used in the WHERE clause helps too.

I have integrated sqlite3 through the NDK, so it is called from C there and is even faster, but that is only worth it if you have close to 1 million records in one table.

The best answer is in the comments: you don't need a database for this! Skipping it would be the fastest option. Think about how the database is read and where it is stored: in a file too, just with additional constraints and with parsing and processing overhead.

I need the database, because my app overwrites the original photo, so the photo always exists.

No, you don't need database for this!

There are ETags.

There is file metadata.

You can store downloaded_timestamp and processed_timestamp in a separate file and calculate from them whether a photo needs to be processed or not. That will take milliseconds, not 10-20 seconds.

So, drop your database and use a simple file, and read the data from that file all at once, not line by line.
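A minimal sketch of that file-based approach in plain Java follows. Everything in it is an assumption made for illustration: the ProcessedLog class, the one-entry-per-line "path<TAB>processedTimestampMillis" format, and the rule that a photo needs processing when it was never logged or was modified after the logged time. Note that java.nio.file is only available on Android from API level 26.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

class ProcessedLog {
    // Reads the whole log in one call and returns path -> processed timestamp (millis).
    static Map<String, Long> load(Path logFile) {
        Map<String, Long> processed = new HashMap<>();
        if (!Files.exists(logFile)) {
            return processed; // nothing processed yet
        }
        try {
            for (String line : Files.readAllLines(logFile, StandardCharsets.UTF_8)) {
                int tab = line.lastIndexOf('\t');
                if (tab < 0) continue; // skip malformed lines
                processed.put(line.substring(0, tab), Long.parseLong(line.substring(tab + 1)));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return processed;
    }

    // Appends one entry once a photo has been processed.
    static void markProcessed(Path logFile, String photoPath, long processedAtMillis) {
        String line = photoPath + "\t" + processedAtMillis + "\n";
        try {
            Files.write(logFile, line.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if the photo was never logged or was modified after the logged time.
    static boolean needsProcessing(Map<String, Long> processed, String photoPath, long lastModifiedMillis) {
        Long processedAt = processed.get(photoPath);
        return processedAt == null || lastModifiedMillis > processedAt;
    }
}
```

Because the app overwrites the original photo, the logged timestamp would have to be taken at or after the overwrite (for example from the file's own last-modified time); otherwise every processed photo would look modified again.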

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 