
Is it possible to import line-wise JSON into OrientDB using its ETL tool?

I have a bunch of files (~10 GB each) in which each line is a single JSON object. I want to import them in streaming mode, but it looks like this is not supported right now (OrientDB v2.2.12). Are there any workarounds? What is the recommended approach in this case?

It looks like each line of JSON can be converted to an ODocument in a code block:

{
    "code": {
        "language": "Javascript",
        "code": "(new com.orientechnologies.orient.core.record.impl.ODocument()).fromJSON(input);"
    }
}

If you encounter an error like:

Error in Pipeline execution: com.orientechnologies.orient.core.exception.OSerializationException: Found invalid } character at position 112 of text

then make sure the extractor's multiLine option is set to false.

"extractor": {
    "row": {
        "multiLine": false
    }
}
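
For context, the two fragments above could be combined into one ETL configuration roughly as follows. This is only a sketch, not a tested setup: the source and loader sections are not part of the original answer, the file path, database URL, and class name are placeholder values, and it assumes the code snippet is placed in the transformers list so that each extracted line arrives as input.

```json
{
    "source": {
        "file": { "path": "/path/to/linewise.json" }
    },
    "extractor": {
        "row": {
            "multiLine": false
        }
    },
    "transformers": [
        {
            "code": {
                "language": "Javascript",
                "code": "(new com.orientechnologies.orient.core.record.impl.ODocument()).fromJSON(input);"
            }
        }
    ],
    "loader": {
        "orientdb": {
            "dbURL": "plocal:/path/to/placeholder-db",
            "dbType": "document",
            "class": "PlaceholderRecord"
        }
    }
}
```

With multiLine set to false, the row extractor should hand each line to the transformer on its own, so every line has to be one complete JSON object.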
