
OrientDB ETL with JSON data file

I am not seeing good documentation on how to use the OrientDB ETL function to load a JSON data file.

I am running this command: ./oetl.sh ../template_etl.json
The contents of template_etl.json look like this:

{
    "config": {
        "log": "debug"
    },
    "begin": [
    ],
    "source" : {
        "file": {"path": "../repos.json", "lock" : true }
    },
    "extractor" : {
        "row": {}
    },
    "transformers" : [
        {"json"},
        { "vertex": { "class": "V" } }
    ],
    "loader" : {
        "orientdb": {
            "dbURL": "plocal../databases/template",
            "dbUser": "admin",
            "dbPassword": "admin",
            "dbAutoCreate": true,
            "tx": false,
            "batchCommit": 1000,
            "dbType": "graph"
        }
    }
}

I adapted this from a CSV example at https://www.udemy.com/orientdb-getting-started/#/lecture/1998370 where this line: {"json"}, was originally: {"csv": {"separator": ",", "multiValue": "NULL", "skipFrom": 1, "skipTo": 1 } },

The error I am getting is:

orientdb-community-2.0/bin$ ./oetl.sh ../template_etl.json

OrientDB etl v.2.0 (build @BUILD@) www.orientechnologies.com
Exception in thread "main" com.orientechnologies.orient.core.exception.OSerializationException: Error on unmarshalling JSON content for record: "config": {
        "log": "debug"
    },
    "begin": [
    ],
    "source" : {
        "file": {"path": "../repos.json", "lock" : true }
    },
    "extractor" : {
        "row": {}
    },
    "transformers" : [
        {"json"},
        { "vertex": { "class": "V" } }
    ],
    "loader" : {
        "orientdb": {
            "dbURL": "plocal../databases/template",
            "dbUser": "admin",
            "dbPassword": "admin",
            "dbAutoCreate": true,
            "tx": false,
            "batchCommit": 1000,
            "dbType": "graph"
        }
    }

    at   com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerJSON.fromString(ORecordSerializerJSON.java:304)
        at com.orientechnologies.orient.core.record.ORecordAbstract.fromJSON(ORecordAbstract.java:165)
        at com.orientechnologies.orient.core.record.impl.ODocument.fromJSON(ODocument.java:1712)
        at com.orientechnologies.orient.etl.OETLProcessor.main(OETLProcessor.java:147)
    Caused by: com.orientechnologies.orient.core.exception.OSerializationException: Error on unmarshalling JSON content: wrong format ""json"". Use <field> : <value>
        at com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerJSON.fromString(ORecordSerializerJSON.java:181)
        at com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerJSON.getValueAsRecord(ORecordSerializerJSON.java:595)
        at com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerJSON.getValueAsObjectOrMap(ORecordSerializerJSON.java:565)
        at com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerJSON.getValue(ORecordSerializerJSON.java:413)
        at com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerJSON.parseCollection(ORecordSerializerJSON.java:677)
        at com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerJSON.getValueAsEmbeddedCollection(ORecordSerializerJSON.java:659)
        at com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerJSON.getValueAsCollection(ORecordSerializerJSON.java:638)
        at com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerJSON.getValue(ORecordSerializerJSON.java:415)
        at com.orientechnologies.orient.core.serialization.serializer.record.string.ORecordSerializerJSON.fromString(ORecordSerializerJSON.java:249)
        ... 3 more

I'm hoping there is a way to load a JSON data file directly into OrientDB.

Thanks

The JSON is not valid. Try validating it with www.jsonlint.com. Try replacing:

{"json"},

With:

{"json": {} },
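The same syntax check can be done locally instead of through a website. The following is a minimal sketch using Python's standard json module; the helper name is_valid_json is made up for illustration. It only confirms that the fragment parses as JSON, not that the ETL tool recognizes the component.

```python
import json

# The transformer entry from the question and the proposed replacement,
# written without the trailing comma that separates list items.
broken = '{"json"}'
fixed = '{"json": {}}'


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(broken))  # False: an object needs "key": value pairs
print(is_valid_json(fixed))   # True
```

This matches the message in the stack trace: wrong format ""json"". Use <field> : <value>.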

I'm no expert like Lvca, but your source file has a .json extension, which means your extractor must be replaced with "json": {}. There is no "json" transformer.

"extractor" : {
    "json": {}
},
"transformers" : [
    { "vertex": { "class": "V" } }
],

http://orientdb.com/docs/last/Transformer.html
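Putting that change into the full file from the question might look like the sketch below. It embeds the whole configuration as a string and parses it with Python's json module, so a syntax slip shows up before oetl.sh is run. All values are copied from the question; only the extractor and transformers sections differ. This has not been run against OrientDB, so treat it as a starting point rather than a confirmed working configuration.

```python
import json

# Full ETL configuration from the question with the "json" extractor
# swapped in and the invalid {"json"} transformer entry removed.
CONFIG_TEXT = """
{
    "config": {
        "log": "debug"
    },
    "begin": [
    ],
    "source": {
        "file": {"path": "../repos.json", "lock": true}
    },
    "extractor": {
        "json": {}
    },
    "transformers": [
        {"vertex": {"class": "V"}}
    ],
    "loader": {
        "orientdb": {
            "dbURL": "plocal../databases/template",
            "dbUser": "admin",
            "dbPassword": "admin",
            "dbAutoCreate": true,
            "tx": false,
            "batchCommit": 1000,
            "dbType": "graph"
        }
    }
}
"""
# Note: dbURL is copied verbatim from the question. OrientDB storage URLs
# are normally written as plocal:<path>, so that value is worth checking.

config = json.loads(CONFIG_TEXT)  # raises json.JSONDecodeError on bad syntax

# Each transformer is an object with a single key naming the component.
transformer_names = [next(iter(entry)) for entry in config["transformers"]]

print(list(config["extractor"]))  # ['json']
print(transformer_names)          # ['vertex']
```

If the text parses, it can be saved as template_etl.json and passed to ./oetl.sh as in the question.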
