Writing a pyspark reduced dataframe to Neo4j using py2neo

I have a PySpark DataFrame with two columns, col1 and col2. The DataFrame was reduced on col1, so col2 holds a list of Rows. I want to write this DataFrame to Neo4j using py2neo. How should I write and format the Cypher query string? If I were writing the DataFrame with the Spark connector, my query would look like this:

query_sparkneo4j_connector = "MERGE (d:Node1 {Node1: event.col1}) \
        FOREACH (i in event.col2 | \
            CREATE (c:Node2 {Prop1: i.xx, Prop2: i.yy}) \
            CREATE (c)-[:Rel1]->(d));"

I tried two approaches, but neither works.

Approach1:

query1_py2neo = '''MERGE (d:Node1 {{Node1: '{col1val}'}})
        FOREACH (i in {col2val} |
            CREATE (c:Node2 {{Prop1: i.xx, Prop2: i.yy}})
            CREATE (c)-[:Rel1]->(d));'''

for row in df.collect():
    col1_val = row["col1_name"]
    col2_val = row["col2_name"] #this is a list of Row type
    graph.run(query1_py2neo.format(col1val=col1_val, col2val=col2_val))

This gives the error below:

py2neo.errors.ClientError: [Statement.SyntaxError] Variable `xx` not defined (line 2, column 31 (offset: 71))
"        FOREACH (i in [Row(xx='somevalue', yy='someothervalue')] |"

Approach2:

query2_py2neo = '''UNWIND $batch as row
        MERGE (d:Node1 {{Node1: row.col1_name}})
        FOREACH (i in row.certificates |
            CREATE (c:Node2 {{Prop1: i.xx, Prop2: i.yy}})
            CREATE (c)-[:Rel1]->(d));'''

graph.run(query2_py2neo, batch=df1)

This gives the error below:

TypeError: Values of type <class 'pyspark.sql.dataframe.DataFrame'> are not supported

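The problem in both approaches seems to be the type of the value you pass for the list. Approach 1 formats the Python repr of the list (Row(xx='somevalue', ...)) into the query text, which is not valid Cypher, and Approach 2 passes a Spark DataFrame, which py2neo cannot serialize as a parameter. If you convert your data so that col1val is a plain value and col2val is a list of dicts, you can pass both as query parameters and use the following query: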

MERGE (d:Node1 {Node1: $col1val})
WITH d
UNWIND $col2val AS property_dict
CREATE (c:Node2)
SET c = property_dict
CREATE (c)-[:Rel1]->(d)
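
One way to do that conversion is sketched below. It assumes the rows come from df.collect(), that the column names are col1_name and col2_name as in Approach 1, and that graph is a py2neo Graph you have already connected (connection details are not shown in the question). The helper names row_to_params and write_rows are hypothetical, chosen only for this illustration.

```python
# Cypher from the answer above; $col1val and $col2val are query parameters,
# so no str.format() call and no doubled braces are needed.
QUERY = """
MERGE (d:Node1 {Node1: $col1val})
WITH d
UNWIND $col2val AS property_dict
CREATE (c:Node2)
SET c = property_dict
CREATE (c)-[:Rel1]->(d)
"""


def row_to_params(row, key_col="col1_name", list_col="col2_name"):
    """Turn one collected Row into plain-Python query parameters.

    The column names are assumptions taken from Approach 1.
    """
    # Row.asDict(recursive=True) also converts the nested Rows in the
    # list column into dicts, which py2neo can send as parameters.
    data = row.asDict(recursive=True)
    return {"col1val": data[key_col], "col2val": data[list_col]}


def write_rows(graph, rows):
    """Run the query once per row; `graph` is assumed to be a py2neo Graph."""
    for row in rows:
        # Graph.run accepts query parameters as keyword arguments.
        graph.run(QUERY, **row_to_params(row))
```

With a connected graph this would be called as write_rows(graph, df.collect()). Note that SET c = property_dict stores the dict keys as property names, so the Node2 nodes get properties xx and yy rather than Prop1 and Prop2; rename the keys before sending them if the original names matter.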
