简体   繁体   中英

How to improve Neo4j 2.0 cypher/ExecutionResult performance under heavy load?

Background: We are noticing a drop in performance while retrieving data out of an ExecutionResult as number of concurrent threads go up. Our production application has 200 worker threads, uses Neo4j 2.0.0 Community in embedded mode. eg in milli-seconds.

  1. Threads: 1, Cypher time: 0, Extraction time: 188
  2. Threads: 10, Cypher time: 1, Extraction time: 188
  3. Threads: 50, Cypher time: 1, Extraction time: 2481
  4. Threads: 100, Cypher time: 1, Extraction time: 4466

Sample output of program (filtering results for 1 of the threads):

2013-12-23 14:39:31,137 [main] INFO  net.ahm.graph.CypherLab  - >>>>>>>>>>>>>>>>>>>>>>>>>>>>> NUMBER OF PARALLEL CYPHER EXECUTIONS: 1
2013-12-23 14:39:31,137 [main] INFO  net.ahm.graph.CypherLab  - >>>> STARTED GRAPHDB
2013-12-23 14:39:39,203 [main] INFO  net.ahm.graph.CypherLab  - >>>> CREATED NODES
2013-12-23 14:39:43,510 [main] INFO  net.ahm.graph.CypherLab  - >>>> WARMED UP
2013-12-23 14:39:43,510 [pool-1-thread-1] INFO  net.ahm.graph.CypherLab  - >>>> CYPHER TOOK: 0 m-secs
2013-12-23 14:39:43,698 [pool-1-thread-1] INFO  net.ahm.graph.CypherLab  - >>>> GETTING RESULTS TOOK: 188 m-secs
2013-12-23 14:39:43,698 [pool-1-thread-1] INFO  net.ahm.graph.CypherLab  - >>>> CYPHER RETURNED ROWS: 50000
2013-12-23 14:39:43,698 [Thread-4] INFO  net.ahm.graph.CypherLab  - ### GRAPHDB SHUTDOWNHOOK INVOKED !!!



2013-12-23 14:40:10,470 [main] INFO  net.ahm.graph.CypherLab  - >>>>>>>>>>>>>>>>>>>>>>>>>>>>> NUMBER OF PARALLEL CYPHER EXECUTIONS: 10
...
2013-12-23 14:40:23,985 [pool-1-thread-1] INFO  net.ahm.graph.CypherLab  - >>>> CYPHER TOOK: 1 m-secs
2013-12-23 14:40:25,219 [pool-1-thread-1] INFO  net.ahm.graph.CypherLab  - >>>> GETTING RESULTS TOOK: 188 m-secs
2013-12-23 14:40:25,219 [pool-1-thread-1] INFO  net.ahm.graph.CypherLab  - >>>> CYPHER RETURNED ROWS: 50000
2013-12-23 14:40:25,234 [Thread-4] INFO  net.ahm.graph.CypherLab  - ### GRAPHDB SHUTDOWNHOOK INVOKED !!!


2013-12-23 14:41:28,850 [main] INFO  net.ahm.graph.CypherLab  - >>>>>>>>>>>>>>>>>>>>>>>>>>>>> NUMBER OF PARALLEL CYPHER EXECUTIONS: 50
...
2013-12-23 14:41:41,781 [pool-1-thread-1] INFO  net.ahm.graph.CypherLab  - >>>> CYPHER TOOK: 1 m-secs
2013-12-23 14:41:45,720 [pool-1-thread-1] INFO  net.ahm.graph.CypherLab  - >>>> GETTING RESULTS TOOK: 2481 m-secs
2013-12-23 14:41:45,720 [pool-1-thread-1] INFO  net.ahm.graph.CypherLab  - >>>> CYPHER RETURNED ROWS: 50000
2013-12-23 14:41:46,855 [Thread-4] INFO  net.ahm.graph.CypherLab  - ### GRAPHDB SHUTDOWNHOOK INVOKED !!!


2013-12-23 14:44:09,267 [main] INFO  net.ahm.graph.CypherLab  - >>>>>>>>>>>>>>>>>>>>>>>>>>>>> NUMBER OF PARALLEL CYPHER EXECUTIONS: 100
...
2013-12-23 14:44:22,077 [pool-1-thread-1] INFO  net.ahm.graph.CypherLab  - >>>> CYPHER TOOK: 1 m-secs
2013-12-23 14:44:30,915 [pool-1-thread-1] INFO  net.ahm.graph.CypherLab  - >>>> GETTING RESULTS TOOK: 4466 m-secs
2013-12-23 14:44:30,915 [pool-1-thread-1] INFO  net.ahm.graph.CypherLab  - >>>> CYPHER RETURNED ROWS: 50000
2013-12-23 14:44:31,680 [Thread-4] INFO  net.ahm.graph.CypherLab  - ### GRAPHDB SHUTDOWNHOOK INVOKED !!!

Test Program:

package net.ahm.graph;

import java.io.File;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.log4j.Logger;
import org.neo4j.cypher.javacompat.ExecutionEngine;
import org.neo4j.cypher.javacompat.ExecutionResult;
import org.neo4j.graphdb.DynamicLabel;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;
import org.neo4j.graphdb.factory.GraphDatabaseSettings;
import org.neo4j.graphdb.schema.IndexDefinition;
import org.neo4j.graphdb.schema.Schema;
import org.neo4j.kernel.impl.util.FileUtils;
import org.neo4j.kernel.impl.util.StringLogger;

public class CypherLab {
    private static final Logger LOG = Logger.getLogger(CypherLab.class);
    private final static int CONCURRENCY = 100;

    public static void main(String[] args) throws Exception {
        FileUtils.deleteRecursively(new File("graphdb"));
        final GraphDatabaseService graphDb = new GraphDatabaseFactory().newEmbeddedDatabaseBuilder("graphdb")
                .setConfig(GraphDatabaseSettings.use_memory_mapped_buffers, "true").setConfig(GraphDatabaseSettings.cache_type, "strong")
                .newGraphDatabase();
        registerShutdownHook(graphDb);
        LOG.info(">>>>>>>>>>>>>>>>>>>>>>>>>>>>> NUMBER OF PARALLEL CYPHER EXECUTIONS: " + CONCURRENCY);
        LOG.info(">>>> STARTED GRAPHDB");
        createIndex("Parent", "name", graphDb);
        createIndex("Child", "name", graphDb);
        try (Transaction tx = graphDb.beginTx()) {
            Node parent = graphDb.createNode(DynamicLabel.label("Parent"));
            parent.setProperty("name", "parent");
            for (int i = 0; i < 50000; i++) {
                Node child = graphDb.createNode(DynamicLabel.label("Child"));
                child.setProperty("name", "child" + i);
                parent.createRelationshipTo(child, RelationshipTypes.PARENT_CHILD);
            }
            tx.success();
        }
        LOG.info(">>>> CREATED NODES");
        final ExecutionEngine engine = new ExecutionEngine(graphDb, StringLogger.SYSTEM);
        for (int i = 0; i < 10; i++) {
            try (Transaction tx = graphDb.beginTx()) {
                ExecutionResult result = engine.execute("match (n:Parent)-[:PARENT_CHILD]->(m:Child) return n.name, m.name");
                for (Map<String, Object> row : result) {
                    assert ((String) row.get("n.name") != null);
                    assert ((String) row.get("m.name") != null);
                }
                tx.success();
            }
        }
        LOG.info(">>>> WARMED UP");
        ExecutorService es = Executors.newFixedThreadPool(CONCURRENCY);
        final CountDownLatch cdl = new CountDownLatch(CONCURRENCY);
        for (int i = 0; i < CONCURRENCY; i++) {
            es.execute(new Runnable() {
                @Override
                public void run() {
                    try (Transaction tx = graphDb.beginTx()) {
                        long time = System.currentTimeMillis();
                        ExecutionResult result = engine.execute("match (n:Parent)-[:PARENT_CHILD]->(m:Child) return n.name, m.name");
                        LOG.info(">>>> CYPHER TOOK: " + (System.currentTimeMillis() - time) + " m-secs");
                        int count = 0;
                        time = System.currentTimeMillis();
                        for (Map<String, Object> row : result) {
                            assert ((String) row.get("n.name") != null);
                            assert ((String) row.get("m.name") != null);
                            count++;
                        }
                        LOG.info(">>>> GETTING RESULTS TOOK: " + (System.currentTimeMillis() - time) + " m-secs");
                        tx.success();
                        LOG.info(">>>> CYPHER RETURNED ROWS: " + count);
                    } catch (Throwable t) {
                        LOG.error(t);
                    } finally {
                        cdl.countDown();
                    }
                }
            });
        }
        cdl.await();
        es.shutdown();
    }

    private static void createIndex(String label, String propertyName, GraphDatabaseService graphDb) {
        IndexDefinition indexDefinition;
        try (Transaction tx = graphDb.beginTx()) {
            Schema schema = graphDb.schema();
            indexDefinition = schema.indexFor(DynamicLabel.label(label)).on(propertyName).create();
            tx.success();
        }
        try (Transaction tx = graphDb.beginTx()) {
            Schema schema = graphDb.schema();
            schema.awaitIndexOnline(indexDefinition, 10, TimeUnit.SECONDS);
            tx.success();
        }
    }

    private static void registerShutdownHook(final GraphDatabaseService graphDb) {
        Runtime.getRuntime().addShutdownHook(new Thread() {
            @Override
            public void run() {
                LOG.info("### GRAPHDB SHUTDOWNHOOK INVOKED !!!");
                graphDb.shutdown();
            }
        });
    }

    private enum RelationshipTypes implements RelationshipType {
        PARENT_CHILD
    }
}

Should be better when this commit is merged in . Which will be released as part of 2.0.1 There are still some other, smaller choke points.

Can you try to limit your webserver-threads to the # of cores times (or # of cores * 2) ? And see if that helps?

My understanding is that after warmup and having the hot dataset in the cache it is only CPU bound, not I/O bound anymore for reads. So you starve CPUs and workers with too many threads.

If I run your test with 8 and 100 cores I get these distributions for executing the query and fetching 50k results:

  • 8 threads: 50% percentile of 500ms and 90% 650ms
  • 100 threads: 50% percentile of 2600 ms and 90% at 6000ms

Code and detailed Histograms: https://gist.github.com/jexp/a164f6cf9686b8125872

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM