SQLite: Difference between java/jdbc-sqlite and python/sqlite3

I'm currently working on a desktop application implemented in Java 7 that, among other things, manages a large number of data records in an SQL database. There are only a few tables, but they contain a large number of records. I need to run complex queries on a single table, but no complex join operations.

So far, I have been working with Postgres. But since it's a single-user desktop application, I also thought about using SQLite (needless to say, that would also reduce the complexity of the setup). So I wrote a simple Python script to run some performance tests. What surprised me was, first, how well SQLite actually performed and, second, that the query response time was much shorter in Python than in Java.

A common scenario is selecting a batch of records according to a list of ids. In Python, I used the following code to test the response time:

import random
import time

# db is an open sqlite3 connection; MAX_INDEX and PAGE_SIZE are set beforehand.
rand_selection = ','.join(str(int(random.random() * MAX_INDEX)) for _ in range(PAGE_SIZE))
start = time.time()
c = db.cursor()
res = c.execute("SELECT * FROM bigtable WHERE id IN (" + rand_selection + ")")
reslist = [str(t) for t in res]
c.close()
print(time.time() - start)

This gives me an elapsed time of about 5 ms for MAX_INDEX=111000 and PAGE_SIZE=100.
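For anyone who wants to repeat this kind of measurement without the original database, here is a minimal, self-contained sketch. It is not the script used above: it assumes an in-memory database, a much smaller table (ROW_COUNT is an arbitrary choice), and bound parameters instead of string concatenation. The helper names build_test_db and timed_select are made up for illustration.

```python
import random
import sqlite3
from timeit import default_timer

COL_COUNT = 20      # prop0 .. prop19, matching the bigtable layout in the question
ROW_COUNT = 10000   # assumption: far smaller than the real table, to keep the sketch fast
PAGE_SIZE = 100


def build_test_db():
    """Create an in-memory database with a bigtable-like schema (hypothetical helper)."""
    db = sqlite3.connect(":memory:")
    cols = ", ".join("prop%d REAL" % i for i in range(COL_COUNT))
    db.execute("CREATE TABLE bigtable (id INTEGER PRIMARY KEY ASC, %s)" % cols)
    marks = ", ".join("?" * COL_COUNT)
    rows = ([random.random() for _ in range(COL_COUNT)] for _ in range(ROW_COUNT))
    # NULL for the INTEGER PRIMARY KEY lets SQLite assign ids 1..ROW_COUNT.
    db.executemany("INSERT INTO bigtable VALUES (NULL, %s)" % marks, rows)
    db.commit()
    return db


def timed_select(db, ids):
    """Run one IN-list query with bound parameters; return (seconds, rows)."""
    sql = "SELECT * FROM bigtable WHERE id IN (%s)" % ", ".join("?" * len(ids))
    start = default_timer()
    rows = db.execute(sql, ids).fetchall()
    return default_timer() - start, rows


if __name__ == "__main__":
    db = build_test_db()
    for run in range(5):
        ids = [random.randint(1, ROW_COUNT) for _ in range(PAGE_SIZE)]
        elapsed, rows = timed_select(db, ids)
        print("run %d: %.3f ms, %d rows" % (run, elapsed * 1000.0, len(rows)))
    db.close()
```

Because the random ids may repeat, a run can return fewer than PAGE_SIZE rows. Timings from an in-memory table are not comparable to the on-disk numbers reported here; the sketch only shows the shape of the measurement.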

Well, great. Now, let's move to Java: I use the jdbc-sqlite driver. I performed the exact same query on the exact same table, and the query time is always around 200 ms, which is unacceptable for my use case.

Am I missing something?

I know that this is a very general question. But maybe someone has some experience with jdbc-sqlite and knows what's going on …

[Edit]: Using timeit.default_timer() as suggested (thanks, Martijn Pieters) gives me similar results.

[Edit2]: Following CL's suggestion, I wrote a simplified version of the Java code. Using this code, I could not reproduce the results; the response times were about the same as for the Python code. However, I tested this on a different machine, using a different JDK (OpenJDK 7 vs. Oracle JDK 7). Admittedly, my other test code most probably has some issues.

[Edit 2013-08-16]: I have now performed the same test using the original setup. I also compared it to Postgres.

Model Name:    MacBook Pro
Model Identifier: MacBookPro5,5
Processor Name:   Intel Core 2 Duo
Processor Speed:  2.53 GHz
Memory: 8GB
OS-Version: 10.8.4
Java:
Java(TM) SE Runtime Environment (build 1.7.0_21-b12)
Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)

The test code (please excuse the sloppy coding …):

package ch.dsd;

import java.sql.*;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class Main {

    private static int COL_COUNT = 20;
    private static int TESTRUNS = 20;
    private static int INDEX_COUNT = 64;
    /*
    CREATE TABLE bigtable ( id INTEGER PRIMARY KEY ASC, prop0 real, prop1 real, ... , prop19 real );
     */
    static class Entity {
        private long id;
        private ArrayList<Double> properties = new ArrayList<Double>(COL_COUNT);

        public Entity() {
            for( int i = 0; i < COL_COUNT; i++) {
                properties.add(0.0);
            }
        }

        public long getId() {
            return id;
        }

        public void setId(long id) {
            this.id = id;
        }

        public void setProperty(int idx, double prop) {
            properties.set(idx, prop);
        }

        public double getProperty(int idx) {
            return properties.get(idx);
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder();
            for( double prop: properties ) {
                sb.append(prop);
                sb.append(",");
            }
            sb.delete(sb.length()-1, sb.length());
            return sb.toString();
        }
    }

    private static String placeholders( int n ) {
        StringBuilder sb = new StringBuilder();
        if( n > 0 ) {
            sb.append("?");
            for( int i = 1; i < n; i++ )
                sb.append(",?");
            return sb.toString();
        }
        return "";
    }

    private static void setRandomIdcs( PreparedStatement ps, int start, int stop, int max ) throws SQLException {
        for( int i = start; i <= stop; i++ ) {
            ps.setLong(i, (long) ((double) max * Math.random()));
        }
    }

    private static void setRandomValues( PreparedStatement ps, int start, int stop ) throws SQLException {
        for( int i = start; i <= stop; i++ ) {
            ps.setDouble(i, Math.random());
        }
    }


    private static void readFromResultSet( ResultSet rs, List<Entity> lst ) throws SQLException {
        while(rs.next()) {
            final Entity e = new Entity();
            e.setId(rs.getLong(1));
            for( int i = 0; i < COL_COUNT; i++ )
                e.setProperty(i, rs.getDouble(i+2));
            lst.add(e);
        }
    }

    public static void performTest(Connection c) throws SQLException {
        final PreparedStatement ps = c.prepareStatement("SELECT * FROM bigtable WHERE id in ("+placeholders(INDEX_COUNT)+")");
        ArrayList<Entity> entities = new ArrayList<Entity>();
        for( int i = 0; i < TESTRUNS; i++ ) {
            setRandomIdcs( ps, 1, INDEX_COUNT, 1000000 ); // there are one million entries stored in the test table
            long start = System.currentTimeMillis();
            final ResultSet rs = ps.executeQuery();
            readFromResultSet(rs, entities);
            // System.out.println(entities.get(INDEX_COUNT-1));
            System.out.println("Time used:" + (System.currentTimeMillis() - start));
            System.out.println("Items read:" + entities.size());
            rs.close();
            entities.clear();
        }
        ps.close();
    }

    public static void createPSQLTable(Connection c) throws SQLException {
        final String create_stmt = "CREATE TABLE IF NOT EXISTS bigtable (id SERIAL PRIMARY KEY, " +
                "prop0 double precision,prop1 double precision,prop2 double precision,prop3 double precision,prop4 double precision,prop5 double precision,prop6 double precision,prop7 double precision,prop8 double precision,prop9 double precision,prop10 double precision,prop11 double precision,prop12 double precision,prop13 double precision,prop14 double precision,prop15 double precision,prop16 double precision,prop17 double precision,prop18 double precision,prop19 double precision)";
        final PreparedStatement ps = c.prepareStatement(create_stmt);
        ps.executeUpdate();
        ps.close();
    }

    public static void loadPSQLTable( Connection c ) throws SQLException {
        final String insert_stmt = "INSERT INTO bigtable VALUES (default, " + placeholders(20) + ")";
        final PreparedStatement ps = c.prepareStatement(insert_stmt);
        for( int i = 0; i < 1000000; i++ ) {
            setRandomValues(ps, 1, 20);
            ps.executeUpdate();
        }
        ps.close();
        c.commit();
    }

    public static void main(String[] args) {
        Connection c = null;
        try {
            Class.forName("org.sqlite.JDBC");
            c = DriverManager.getConnection("jdbc:sqlite:/Users/dsd/tmp/sqlitetest/testdb.db");
            c.setAutoCommit(false);
            performTest(c);
            c.close();
            System.out.println("POSTGRES");
            System.out.println("========");
            final Properties props = new Properties();
            props.setProperty("user", "dsd");
            c = DriverManager.getConnection("jdbc:postgresql:testdb", props);
            c.setAutoCommit(false);
            createPSQLTable(c);
            // loadPSQLTable(c);
            performTest(c);
            c.close();
        } catch ( Exception e ) {
            System.err.println( e.getClass().getName() + ": " + e.getMessage() );
            System.exit(1);
        }
    }
}

Results:

Time used:348
Items read:64
Time used:407
Items read:64
Time used:259
Items read:64
Time used:341
Items read:64
Time used:325
Items read:64
Time used:145
Items read:64
Time used:70
Items read:64
Time used:98
Items read:64
Time used:91
Items read:64
Time used:134
Items read:64
Time used:68
Items read:64
Time used:51
Items read:64
Time used:51
Items read:64
Time used:51
Items read:64
Time used:55
Items read:64
Time used:67
Items read:64
Time used:56
Items read:64
Time used:90
Items read:64
Time used:56
Items read:64
Time used:51
Items read:64
POSTGRES
========
Time used:75
Items read:64
Time used:58
Items read:64
Time used:31
Items read:64
Time used:26
Items read:64
Time used:34
Items read:64
Time used:6
Items read:64
Time used:5
Items read:64
Time used:4
Items read:64
Time used:5
Items read:64
Time used:6
Items read:64
Time used:5
Items read:64
Time used:6
Items read:64
Time used:4
Items read:64
Time used:28
Items read:64
Time used:3
Items read:64
Time used:4
Items read:64
Time used:4
Items read:64
Time used:4
Items read:64
Time used:3
Items read:64
Time used:5
Items read:64
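The listing is easier to compare once it is condensed. Since main() runs performTest on the SQLite connection first, the block before the POSTGRES header is the SQLite run. The sketch below copies the printed timings and reports medians for the first and second half of each series. The split after 10 runs is an arbitrary assumption, meant only to separate the slower early runs from the later ones; it does not say why the early runs are slower.

```python
from statistics import median

# Timings in milliseconds, copied from the output above.
sqlite_ms = [348, 407, 259, 341, 325, 145, 70, 98, 91, 134,
             68, 51, 51, 51, 55, 67, 56, 90, 56, 51]
postgres_ms = [75, 58, 31, 26, 34, 6, 5, 4, 5, 6,
               5, 6, 4, 28, 3, 4, 4, 4, 3, 5]

WARMUP_RUNS = 10  # assumption: treat the first half of each series as warm-up


def summarize(label, samples):
    """Print and return (median of early runs, median of later runs)."""
    early = median(samples[:WARMUP_RUNS])
    late = median(samples[WARMUP_RUNS:])
    print("%s: early median %.1f ms, later median %.1f ms" % (label, early, late))
    return early, late


if __name__ == "__main__":
    summarize("SQLite (JDBC)", sqlite_ms)
    summarize("PostgreSQL (JDBC)", postgres_ms)
```

Even in the later runs, the SQLite timings through JDBC stay roughly an order of magnitude above both the Postgres timings and the roughly 5 ms measured from Python.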

Python is written in C and has the SQLite library, also written in C, linked in.

There is no marshaling of data or conversion between formats, as both Python and the underlying SQLite library use the same data types and encodings, which are native to whatever platform they are compiled on.

Java (the JVM is also written in C/C++, but ...), on the other hand, uses specific platform-independent data types; in particular, all strings are Unicode. In order to communicate with the underlying SQLite library, the Java library must use JNI, which (usually) involves some conversion of data types and character encodings. This can be very CPU-intensive, particularly when converting C strings to Unicode and back again.
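To get a rough feel for what such a transcoding step costs, the following sketch times a UTF-8 to UTF-16 round trip in Python. It is only a stand-in for the kind of conversion described above, not a measurement of JNI or of the JDBC driver, and the sample text and repeat count are made up. Note also that the test table in the question holds only numeric columns, so string conversion may matter little for that particular benchmark.

```python
from timeit import default_timer


def utf8_to_utf16_roundtrip(utf8_bytes):
    """Decode UTF-8, re-encode as UTF-16, then convert back to UTF-8."""
    utf16_bytes = utf8_bytes.decode("utf-8").encode("utf-16-le")
    return utf16_bytes.decode("utf-16-le").encode("utf-8")


if __name__ == "__main__":
    # Made-up sample text with a few non-ASCII characters.
    sample = ("row value \u00e9\u00fc " * 1000).encode("utf-8")
    start = default_timer()
    for _ in range(1000):
        utf8_to_utf16_roundtrip(sample)
    elapsed = default_timer() - start
    print("1000 round trips of %d bytes: %.1f ms" % (len(sample), elapsed * 1000.0))
```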

Having said all that, I have used the sqliteJDBC jar a lot and never really noticed any performance issues.

You might try looking at JavaDB (a.k.a. Derby) as an embedded Java database. It is written in pure Java, uses "native" Java encodings, and is "zero maintenance".
