
Load CSV data from AWS S3 with DSE Graph Loader

I have data on AWS S3 (in CSV format) and I want to load it into DSE Graph using the Graph Loader. I have searched but found nothing on this topic. Is this possible with the DSE Graph Loader?

Here's what the mapping script looks like when the Graph Loader reads from CSV files:

https://docs.datastax.com/en/latest-dse/datastax_enterprise/graph/dgl/dglCSV.html

Here's an HDFS example (also with CSV files). S3 should be similar: just swap the dfs_uri value.

// Configure the data loader to create the schema
config create_schema: true, load_new: true, preparation: true

// Define the data input sources.
// dfs_uri specifies the URI of the HDFS directory in which the files are stored.
dfs_uri = 'hdfs://host:port/path/'
authorInput = File.csv(dfs_uri + 'author.csv.gz').gzip().delimiter('|')

// Specify which data source to load with which mapper (defined inline).
// The opening brace must stay on the same line as asVertices so that Groovy
// passes the closure to it.
load(authorInput).asVertices {
    label "author"
    key "name"
}

// graphloader call (run from the shell, not part of the mapping script)
./graphloader myMap.groovy -graph testHDFS -address localhost

// Start the Gremlin console and check the data
bin/dse gremlin-console
:remote config reset g testHDFS.g
schema.config().option('graph.schema_mode').set('Development')
g.V().hasLabel('author')
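
For S3, the same mapping script might look like the sketch below. It only changes the dfs_uri value, as suggested above. The bucket name `my-bucket` and the `path/` prefix are hypothetical placeholders, and the sketch assumes the loader accepts an `s3://` URI in the same position as the `hdfs://` one. How AWS credentials are supplied to the loader is not shown here; check the Graph Loader documentation for your DSE version.

```groovy
// Sketch only: same mapping as the HDFS example, with dfs_uri pointed at S3.
config create_schema: true, load_new: true, preparation: true

// ASSUMPTION: 'my-bucket' and 'path/' are placeholders for your own bucket
// and key prefix, and the loader accepts an s3:// URI here like hdfs://.
dfs_uri = 's3://my-bucket/path/'
authorInput = File.csv(dfs_uri + 'author.csv.gz').gzip().delimiter('|')

// Map each CSV row to an "author" vertex keyed by the "name" column.
load(authorInput).asVertices {
    label "author"
    key "name"
}
```

The graphloader invocation and the Gremlin console check stay the same as in the HDFS example; only the graph name passed to -graph would change if you load into a different graph.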
