
DDL statements and data imports using REST APIs

I'm new to ArangoDB and I'm considering using it for a small application.

Because the application will be built on a no-code platform, access to ArangoDB will only happen through REST API calls; there will be no drivers for direct access.

Users of this application will need to be able to create collections and import CSV files using the application front end.

I've seen that one can issue AQL commands through the REST API. But AQL has no data-definition functionality, so I cannot CREATE a collection (or, in SQL terms, DROP or ALTER one). Similarly, there is no AQL command to import a CSV file.

Q1: Can I issue REST API calls to create/alter/drop collections and to import vertices and edges stored in CSV files?

Q2: If the answer to Q1 is affirmative, is there any documentation that explains how to do all that?

Many thanks in advance!

It's good that you're looking at that design; the key advantage of having a REST API front end to your data layer is that it lets you validate the data going into it.

ArangoDB supports this out of the box with its Foxx microservice feature.

A Foxx service is able to mount a REST API endpoint (very similar to how Express does it in Node.js) and then you can write code in Foxx to perform any level of checks on the incoming data that you want.
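For orientation, a minimal Foxx service looks roughly like the sketch below; the route path is made up for the example:

```js
// index.js of a Foxx service -- a minimal sketch, the route name is illustrative
'use strict';
const createRouter = require('@arangodb/foxx/router');
const router = createRouter();
module.context.use(router);

// Trivial endpoint so the front end can verify the mount point is reachable
router.get('/ping', function (req, res) {
  res.json({ ok: true });
})
.summary('Health check');
```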

Because Foxx is written in JavaScript (a slightly different flavour than what is in the latest Node.js, e.g. it does not support promises or async/await), you can write code that performs security checks, schema validation, management of the underlying ArangoDB collections, and even the launching of out-of-band processes that react to the commands it receives.

You can have Foxx load a CSV file by accessing the file system, but I would strongly recommend you don't follow that path, because a Foxx service's lifetime is managed by ArangoDB, not by your code.

It is much better to write a Foxx microservice that exposes an API able to take a JSON payload representing the data you want to import. Foxx can then perform data validation, determine the data type of each field, respond to missing values, and optionally reject data that is not structured correctly.
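To make that concrete, here is a rough sketch of such an import endpoint: it creates the target collection if it does not exist, then inserts each row, rejecting rows that are missing a required value. The joi schema, the `id` field, and the route path are assumptions you would adapt to your own data:

```js
'use strict';
const joi = require('joi');
const { db } = require('@arangodb');
const createRouter = require('@arangodb/foxx/router');
const router = createRouter();
module.context.use(router);

router.post('/import/:collection', function (req, res) {
  const name = req.pathParams.collection;
  // Create the collection on first use -- this covers the "DDL" part of the question
  let col = db._collection(name);
  if (!col) {
    col = db._createDocumentCollection(name);
  }
  const rejected = [];
  let inserted = 0;
  req.body.forEach(function (row, i) {
    // Example check: refuse rows that are missing a value we consider mandatory
    if (row.id === undefined || row.id === null) {
      rejected.push({ index: i, reason: 'missing id' });
      return;
    }
    col.insert(row);
    inserted++;
  });
  res.json({ inserted: inserted, rejected: rejected });
})
.body(joi.array().items(joi.object()).required(), 'Rows parsed from the CSV, sent as an array of JSON objects.')
.summary('Import a batch of rows into a collection');
```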

You haven't mentioned how large your CSV files are, or how well known the structure/schema of the data within them is.

You also need to think about how long (in milliseconds) it would take to parse, format, error check, and insert the data you're sending.

Think about how your data will be stored, e.g. will one CSV row become one document in a collection? Will you know the unique key of each CSV row? How will you handle updates when duplicate rows are identified?
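One common answer, sketched under the assumption that one CSV column (called `id` here) is unique per row and that your ArangoDB version supports the `overwriteMode` insert option (3.7 or later): map that column to `_key` so a re-imported row updates the existing document instead of creating a duplicate.

```js
// Inside the import loop: one CSV row becomes one document,
// keyed by a column we assume is unique across the file.
const doc = Object.assign({}, row, { _key: String(row.id) });

// 'update' merges a re-imported row into the existing document
// instead of failing with a unique-constraint violation.
col.insert(doc, { overwriteMode: 'update' });
```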

If the data is in the kilobytes and takes less than a second to process, just send it as a JSON payload to the endpoint.

If the data is in the megabytes, then have the client code slice the data and send it to the REST API in blocks that are efficient for your use case (e.g. each call should take less than a second to process).
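On the client side that could look like the sketch below; it runs outside Foxx, so async/await is available there. The batch size, the service URL, and the absence of authentication headers are all assumptions to adjust:

```js
// Client-side sketch: send parsed rows to the Foxx endpoint in fixed-size batches.
// The mount path '/my-service/import/people' is illustrative.
async function importInBatches(rows, batchSize = 500) {
  for (let i = 0; i < rows.length; i += batchSize) {
    const batch = rows.slice(i, i + batchSize);
    const response = await fetch('https://db.example.com/_db/mydb/my-service/import/people', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(batch)
    });
    if (!response.ok) {
      throw new Error('Batch starting at row ' + i + ' failed: ' + response.status);
    }
  }
}
```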

If the data is so large that it takes > 30 seconds to send all the data, or your end users are giving up before the data is in, then you'll probably need another endpoint between your client and Foxx that is responsible for saving the data to disk and spawning workers to then slice and feed that data to Foxx.

The more complex your data model becomes, the more you have to deal with, so keep it simple to start with.

Small CSV files can be sent to a Foxx Microservice in a JSON payload, and Foxx will handle the validation and database updates from there.
