This query loads the 1 million ratings from the GroupLens MovieLens dataset. I have already created nodes for users and movies, and am now connecting them with rating relationships:
LOAD CSV FROM "file:///ratings.csv" AS row FIELDTERMINATOR ';'
MERGE (u:User {userID: toInt(row[0])})
MERGE (m:Movie {movieID: toInt(row[1])})
MERGE (u)-[r:RATING {value: toInt(row[3])}]->(m)
This query takes a very long time with 2 GB of RAM allocated to the JVM (laptop, 4 GB RAM), although it runs reasonably fast with 4-6 GB (desktop). I also have indexes on :User(userID) and :Movie(movieID).
The profile of this query looks like this:

[query profile screenshot]

The number of db hits looks excessive, and I think this query can be optimized.
(Follow-up question): How can I run the optimized Cypher query in neo4j-shell? Is this the correct syntax?
start [CYPHER_QUERY] ;
Try USING PERIODIC COMMIT: http://neo4j.com/docs/stable/query-periodic-commit.html

Also, consider using CREATE instead of MERGE on the last line, the one that creates the relationship, since I'm assuming ratings aren't repeated in your .csv file. MERGE must first check whether a matching relationship already exists, which is expensive at this scale; CREATE skips that check entirely.
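Putting both suggestions together, the query from the question might be rewritten as follows. This is a sketch: the file path and column indices are copied from the question, the batch size of 10000 is an arbitrary starting point, and toInt is the pre-3.1 name of what later versions call toInteger.

```cypher
// Commit every 10 000 rows instead of building one huge transaction,
// which keeps memory use bounded on a small heap.
USING PERIODIC COMMIT 10000
LOAD CSV FROM "file:///ratings.csv" AS row FIELDTERMINATOR ';'
// MERGE on the indexed ID properties is an index lookup, so these stay cheap.
MERGE (u:User {userID: toInt(row[0])})
MERGE (m:Movie {movieID: toInt(row[1])})
// CREATE avoids the relationship-existence check that MERGE performs per row.
CREATE (u)-[r:RATING {value: toInt(row[3])}]->(m)
```

As for the follow-up: neo4j-shell takes plain Cypher with no `start` prefix, terminated by a semicolon; if I recall correctly it can also read a script from a file via `neo4j-shell -file <script.cql>`.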