Create an AWS Neptune graph from a raw CSV

I have seen many tutorials on loading CSV data in the Gremlin format, with separate vertices and edges, into AWS Neptune. For a number of reasons, I cannot create vertex and edge files for data loading. Instead, I have just a raw CSV file where each row is a record (e.g., a person).

How can I create nodes and relationships from each row of the raw CSV in Neptune, using the notebook interface?

Since you mentioned wanting to do this in the notebooks, the examples below are all run from inside a Jupyter notebook. I don't have the data sets you mentioned to hand, so let's make a simple one in a notebook cell:

%%bash
echo "code,city,region
AUS,Austin,US-TX
JFK,New York,US-NY" > test.csv

We can then generate the openCypher CREATE steps for the nodes contained in that CSV file using a simple cell such as:

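import csv

# Build one openCypher CREATE clause per CSV row.
# Every column becomes a string property on an Airport node.
# Note: values containing double quotes would break the generated query.
with open('test.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile, escapechar="\\")
    query = ""
    for row in reader:
        props = ", ".join(f'{k}:"{v}"' for k, v in row.items())
        query += f"CREATE (:Airport {{{props}}})\n"
print(query)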

Which yields:

CREATE (:Airport {code:"AUS", city:"Austin", region:"US-TX"})
CREATE (:Airport {code:"JFK", city:"New York", region:"US-NY"})
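One thing to watch with the string-building approach above is that a value containing a double quote or a backslash would produce an invalid query. Below is a minimal sketch of one way to guard against that, assuming backslash escapes are accepted inside openCypher string literals. The helper names `cypher_string` and `row_to_create` and the sample data are made up for illustration, and column names are still assumed to be valid property keys.

```python
import csv
import io

def cypher_string(value):
    """Return value as a double-quoted openCypher string literal."""
    # Escape backslashes first so the escapes added for quotes are not doubled.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def row_to_create(row, label="Airport"):
    """Build one CREATE clause from a dict of column name -> value."""
    props = ", ".join(f"{key}:{cypher_string(val)}" for key, val in row.items())
    return f"CREATE (:{label} {{{props}}})"

# Hypothetical sample data; the second row contains embedded double quotes.
sample = 'code,city\nAUS,Austin\nXXX,"The ""Big"" City"\n'
statements = [row_to_create(r) for r in csv.DictReader(io.StringIO(sample))]
print("\n".join(statements))
```

The list of statements can be joined with newlines to form the same kind of `query` string used in the rest of this answer.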

Finally, let's have the notebook's %%oc cell magic run that query for us:

ipython = get_ipython()
magic = ipython.run_cell_magic
magic(magic_name = "oc", line='', cell=query)
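For a larger CSV file, sending every CREATE clause in a single query may become unwieldy. Here is a rough sketch of splitting the generated clauses into smaller batches before submitting them. The function name `batch_statements` and the batch size are arbitrary choices for illustration, not anything Neptune prescribes.

```python
def batch_statements(statements, batch_size=100):
    """Yield query strings, each holding at most batch_size statements."""
    for start in range(0, len(statements), batch_size):
        yield "\n".join(statements[start:start + batch_size])

# Hypothetical statements standing in for the generated CREATE clauses.
statements = [f'CREATE (:Airport {{code:"A{i}"}})' for i in range(5)]
batches = list(batch_statements(statements, batch_size=2))
print(len(batches))

# Inside a notebook, each batch could then be submitted with the same
# call used above:
# for q in batches:
#     get_ipython().run_cell_magic("oc", "", q)
```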

To verify that the query worked:

%%oc
MATCH (a:Airport)
RETURN a.code, a.city

which returns:

    a.code     a.city
1   AUS        Austin
2   JFK        New York

There are many ways you could do this, but this is a simple way if you want to stay inside the notebooks. Given your question does not have a lot of detail or an example of what you have tried so far, hopefully this gives you some pointers.
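The question also asks about relationships, and the same row-by-row idea can be extended to cover them. As a sketch only, the cell below builds one query per row that links each airport to a node for its region. The `Region` label, its `name` property, and the `IN_REGION` relationship type are invented for this example, and values are assumed to contain no double quotes or backslashes.

```python
import csv
import io

def row_to_relationship_query(row):
    """Build one openCypher query linking an airport to its region."""
    # MERGE reuses a node or relationship if it already exists,
    # so each region is created only once across all rows.
    return (
        f'MERGE (a:Airport {{code:"{row["code"]}"}})\n'
        f'MERGE (r:Region {{name:"{row["region"]}"}})\n'
        f'MERGE (a)-[:IN_REGION]->(r)'
    )

# Same content as the test.csv file created earlier.
sample = "code,city,region\nAUS,Austin,US-TX\nJFK,New York,US-NY\n"
queries = [row_to_relationship_query(r) for r in csv.DictReader(io.StringIO(sample))]
print(queries[0])
```

Because every query reuses the variable names `a` and `r`, each one would need to be submitted as its own query rather than concatenated into one.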

I'm also trying to figure this out. Do you have a working solution now? Thanks in advance.

Anita


 