Grails save() does not respect the flush option

I'm using Grails as a poor man's ETL tool for migrating some relatively small db objects from one database to another. I have a controller that reads data from one db (MySQL) and writes it into another (PostgreSQL). They use similar domain objects, but not exactly the same ones, due to limitations in the multi-datasource support in Grails 2.1.x.

Below you'll see my controller and service code:

import grails.converters.JSON

class GeoETLController {

    def zipcodeService

    def migrateZipCode() {
        def zc = zipcodeService.readMysql()
        zipcodeService.writePgSql(zc)

        // render the result map as a JSON response
        render([success: true] as JSON)
    }
}

And the service:

class ZipcodeService {

    def sessionFactory
    def propertyInstanceMap = org.codehaus.groovy.grails.plugins.DomainClassGrailsPlugin.PROPERTY_INSTANCE_MAP

    def readMysql() {
        def zipcode_mysql = Zipcode.list();
        println("read, " + zipcode_mysql.size());
        return zipcode_mysql;
    }

    def writePgSql(zipcodes) {

        List<PGZipcode> zips = new ArrayList<PGZipcode>();
        println("attempting to save, " + zipcodes.size());
        zipcodes.each({ Zipcode zipcode ->
            // copy the properties of the MySQL-backed object onto the PostgreSQL-backed one
            PGZipcode zipcode_pg = new PGZipcode(zipcode.properties);

            if (!zipcode_pg.save(flush:false)) {
                zipcode_pg.errors.each {
                    println it
                }
            }
            zips.add(zipcode_pg)
            if (zips.size() % 100 == 0) {
                println("gorm begin" + new Date());
                // clear session here.
                this.cleanUpGorm();
                println("gorm complete" + new Date());

            }

        });
        //Save remaining
        this.cleanUpGorm();
        println("Final ." + new Date());
    }

    def cleanUpGorm() {
        def session = sessionFactory.currentSession
        session.flush()
        session.clear()
        propertyInstanceMap.get().clear()
    }
}

Much of this is taken from my own code and then tweaked to try to get performance similar to that described in http://naleid.com/blog/2009/10/01/batch-import-performance-with-grails-and-mysql/

So, in reviewing my code, whenever zipcode_pg.save() is invoked, an insert statement is created and sent to the database. Good for db consistency, bad for bulk operations.

What is the cause of my instant flushes? (Note: my DataSource.groovy and Config.groovy files have NO relevant changes.) At this rate, it takes about 7 seconds to process each batch of 100 (roughly 14 inserts per second), which is a long time when you are dealing with tens of thousands of rows.

Appreciate the suggestions.

NOTE: I considered using a pure ETL tool, but with so much domain and service logic already built, I figured using Grails would be a good reuse of resources. However, I didn't expect bulk operations to perform this poorly.

Without seeing your domain objects, this is just a hunch, but I might try specifying validate:false as well in your save() call. save() calls validate() unless you tell Grails not to. For example, if you have a unique constraint on any field in your PGZipcode domain object, the database has to be consulted for every new record in order to perform a proper validation. Other constraints might require DBMS queries as well, but unique is the only one that comes to mind right now.

From Grails Persistence: Transaction Write-Behind

Hibernate caches database updates where possible, only actually pushing the changes when it knows that a flush is required, or when a flush is triggered programmatically. One common case where Hibernate will flush cached updates is when performing queries since the cached information might be included in the query results. But as long as you're doing non-conflicting saves, updates, and deletes, they'll be batched until the session is flushed.
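
As a rough sketch of the validate:false suggestion above: the method below is meant to sit in the question's ZipcodeService, and its name (saveWithoutValidation) is made up for illustration. I haven't seen your constraints, so whether it is safe to skip validation is an assumption you would need to check.

```groovy
// Sketch only: Zipcode and PGZipcode are the domain classes from the question;
// saveWithoutValidation is a hypothetical helper name, not an existing API.
def saveWithoutValidation(zipcodes) {
    zipcodes.each { Zipcode zipcode ->
        PGZipcode zipcode_pg = new PGZipcode(zipcode.properties)
        // validate: false skips the constraints block, so errors will not be populated;
        // flush: false asks GORM not to flush the session as part of this save
        zipcode_pg.save(flush: false, validate: false)
    }
}
```

If the inserts stop going out one by one with this change, a constraint on PGZipcode was the likely trigger, and you can then decide whether to validate in a separate pass.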


Alternately, you might try setting the Hibernate session's flush mode explicitly:

// requires: import org.hibernate.FlushMode
sessionFactory.currentSession.setFlushMode(FlushMode.MANUAL)

I'm under the impression that the default flush mode is AUTO, in which case Hibernate decides for itself when to flush, for example before running a query.
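
Put together with the batching from the question, the manual flush mode idea might look roughly like the sketch below. The method name writePgSqlManualFlush is hypothetical, and it assumes that the injected sessionFactory is the one backing PGZipcode, which I can't confirm from the question given the multi-datasource setup.

```groovy
import org.hibernate.FlushMode

class ZipcodeService {

    def sessionFactory   // assumed to be the session factory that PGZipcode is mapped to

    // Hypothetical variant of the question's writePgSql
    def writePgSqlManualFlush(zipcodes) {
        def session = sessionFactory.currentSession
        def previousMode = session.flushMode
        // With MANUAL, Hibernate should only flush when flush() is called explicitly
        session.setFlushMode(FlushMode.MANUAL)
        try {
            zipcodes.eachWithIndex { Zipcode zipcode, int i ->
                new PGZipcode(zipcode.properties).save(flush: false)
                // push a batch of 100 to the database, then empty the session
                if ((i + 1) % 100 == 0) {
                    session.flush()
                    session.clear()
                }
            }
            // flush whatever is left over from the last partial batch
            session.flush()
            session.clear()
        } finally {
            // restore the original mode so other code using this session is unaffected
            session.setFlushMode(previousMode)
        }
    }
}
```

Restoring the previous flush mode in the finally block is a design choice rather than a requirement; it just keeps the change local to the import.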
