Importing OPoint data into OrientDB 2.2.x using ETL from a CSV file

This is related to my earlier questions:

  1. Spatial query with sub-select (I figured this one out)
  2. OrientDB spatial query to find all pairs within X km of each other (still looking for a useful answer)

In response to (2), I am looking at modifying my Nazca geoglyph dataset to use WKT, to be consistent with the newer OrientDB 2.2.x Spatial Index functionality.

My input CSV file, nazca_lines_wkt.csv, is this:

Name,Location
Hummingbird,POINT(-75.148892 -14.692131)
Monkey,POINT(-75.138532 -14.706940)
Condor,POINT(-75.126208 -14.697444)
Spider,POINT(-75.122381 -14.694145)
Spiral,POINT(-75.122746 -14.688277)
Hands,POINT(-75.113881 -14.694459)
Tree,POINT(-75.114520 -14.693898)
Astronaut,POINT(-75.079755 -14.745222)
Dog,POINT(-75.130788 -14.706401)
Wing,POINT(-75.100385 -14.680309)
Parrot,POINT(-75.107498 -14.689463)
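
As a sanity check before importing, a minimal Python sketch along these lines could confirm that every Location value is a well-formed 2D WKT POINT. It is not part of the ETL workflow; the regular expression and the names `parse_wkt_point` and `check_rows` are assumptions made up for illustration.

```python
import csv
import io
import re

# Assumed pattern: "POINT(x y)" or "POINT (x y)" with signed decimal numbers.
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)

def parse_wkt_point(text):
    """Return (x, y) as floats, or raise ValueError if text is not a 2D WKT POINT."""
    match = _POINT_RE.match(text)
    if match is None:
        raise ValueError(f"not a WKT POINT: {text!r}")
    return float(match.group(1)), float(match.group(2))

def check_rows(csv_text):
    """Parse CSV text with a Name,Location header into a list of (name, x, y)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["Name"], *parse_wkt_point(row["Location"])) for row in reader]

sample = (
    "Name,Location\n"
    "Hummingbird,POINT(-75.148892 -14.692131)\n"
    "Monkey,POINT (-75.138532 -14.706940)\n"
)
rows = check_rows(sample)
for name, x, y in rows:
    print(name, x, y)
```

A malformed value such as `POINT(1,2)` raises `ValueError`, so bad rows surface before the import is attempted.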

I create an empty PLOCAL database, nazca-wkt.orientdb, and define a GeoGlyphWKT vertex class:

CREATE DATABASE PLOCAL:nazca-wkt.orientdb admin admin plocal graph

CREATE CLASS GeoGlyphWKT EXTENDS V

CREATE PROPERTY GeoGlyphWKT.Name      STRING
CREATE PROPERTY GeoGlyphWKT.Location  EMBEDDED OPoint
CREATE PROPERTY GeoGlyphWKT.Tag       EMBEDDEDSET STRING

I have two .json files that I use for the oetl script:

nazca_lines_wkt.json

{
    "config": {
        "log": "info",
        "fileDirectory": "./",
        "fileName": "nazca_lines_wkt.csv"
    }
}

commonGeoGlyphWKT.json

{
    "begin": [ { "let": { "name": "$filePath",  "expression": "$fileDirectory.append($fileName )" } } ],
    "config": { "log": "debug" },
    "source": { "file": { "path": "$filePath" } },
    "extractor":
        {
        "csv": { "ignoreEmptyLines": true,
                 "nullValue": "N/A",
                 "separator": ",",
                 "columnsOnFirstLine": true,
                 "dateFormat": "yyyy-MM-dd"
               }
        },
    "transformers": [ { "vertex": { "class": "GeoGlyphWKT" } } ],
    "loader": {
        "orientdb": {
            "dbURL": "plocal:nazca-wkt.orientdb",
            "dbType": "graph",
            "batchCommit": 1000
        }
    }
}

I run oetl using this command:

$ oetl.sh commonGeoGlyphWKT.json nazca_lines_wkt.json

but this fails with the following output:

$ oetl.sh commonGeoGlyphWKT.json nazca_lines_wkt.json
OrientDB etl v.2.2.13 (build 2.2.x@r90d7caa1e4af3fad86594e592c64dc1202558ab1; 2016-11-15 12:04:05+0000) www.orientdb.com
BEGIN ETL PROCESSOR
[file] INFO Reading from file ./nazca_lines_wkt.csv with encoding UTF-8
Started execution with 1 worker threads
Error in Pipeline execution: com.orientechnologies.orient.core.exception.OValidationException: impossible to convert value of field "Location"
    DB name="nazca-wkt.orientdb"
ETL process has problem: java.util.concurrent.ExecutionException: com.orientechnologies.orient.core.exception.OValidationException: impossible to convert value of field "Location"
    DB name="nazca-wkt.orientdb"
END ETL PROCESSOR
+ extracted 9 rows (0 rows/sec) - 9 rows -> loaded 0 vertices (0 vertices/sec) Total time: 16ms [0 warnings, 1 errors]

I'm sure it's something silly that I'm missing... Has anyone been able to import CSV files containing WKT strings for points, polygons, etc. using ETL?

Any help is appreciated!

This is working for me: read both CSV columns as strings and build the geometry with St_GeomFromText in a command transformer.

commonGeoGlyphWKT.json

{
  "source": { "file": { "path": "./nazca_lines_wkt.csv" } },
  "extractor": { "csv": {
    "separator": ",",
    "columns": ["Name:String","Location:String"] } },
  "transformers": [
    { "command": { "command": "INSERT INTO GeoGlyphWKT(Name,Location) values('${input.Name}', St_GeomFromText('${input.Location}'))"} }
  ],
  "loader": {
    "orientdb": {
        "dbURL": "plocal:/home/ivan/OrientDB/db_installati/enterprise/orientdb-enterprise-2.2.13/databases/stack40982509-spatial",
        "dbUser": "admin",
        "dbPassword": "admin",
        "dbType": "graph",
        "batchCommit": 1000
    }
  }
}
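
To make the command transformer concrete, the Python sketch below renders the SQL text that the template above would produce for one CSV row. It only mimics the `${input.…}` substitution as plain string replacement, which is an assumption for illustration and not how ETL is implemented; `render_insert` is a hypothetical name.

```python
import re

# Template copied from the "command" transformer in the configuration above.
TEMPLATE = (
    "INSERT INTO GeoGlyphWKT(Name,Location) "
    "values('${input.Name}', St_GeomFromText('${input.Location}'))"
)

def render_insert(template, row):
    """Replace each ${input.<field>} placeholder with the row's value as plain text."""
    return re.sub(r"\$\{input\.(\w+)\}", lambda m: row[m.group(1)], template)

row = {"Name": "Hummingbird", "Location": "POINT(-75.148892 -14.692131)"}
sql = render_insert(TEMPLATE, row)
print(sql)
```

Because the values are pasted between single quotes, a name containing an apostrophe would presumably need escaping before it reaches the template.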

nazca_lines_wkt.csv

Name,Location
Hummingbird,POINT (-75.148892 -14.692131)
Monkey,POINT (-75.138532 -14.706940)
Condor,POINT(-75.126208 -14.697444)
Spider,POINT(-75.122381 -14.694145)
Spiral,POINT(-75.122746 -14.688277)
Hands,POINT(-75.113881 -14.694459)
Tree,POINT(-75.114520 -14.693898)
Astronaut,POINT(-75.079755 -14.745222)
Dog,POINT(-75.130788 -14.706401)
Wing,POINT(-75.100385 -14.680309)
Parrot,POINT(-75.107498 -14.689463)

[ivan@canemagico-pc bin]$ ./oetl.sh commonGeoGlyphWKT2.json

OrientDB etl v.2.2.13 (build 2.2.x@r90d7caa1e4af3fad86594e592c64dc1202558ab1; 2016-11-15 12:04:05+0000) www.orientdb.com
[csv] INFO column types: {Name=STRING, Location=STRING}
BEGIN ETL PROCESSOR
[file] INFO Reading from file ./nazca_lines_wkt.csv with encoding UTF-8
Started execution with 1 worker threads
[orientdb] INFO committing
END ETL PROCESSOR
+ extracted 11 rows (0 rows/sec) - 11 rows -> loaded 11 vertices (0 vertices/sec) Total time: 244ms [0 warnings, 0 errors]

orientdb {db=stack40982509-spatial}> select from GeoGlyphWKT

+----+-----+-----------+-----------+-----------------------+
|#   |@RID |@CLASS     |Name       |Location               |
+----+-----+-----------+-----------+-----------------------+
|0   |#25:0|GeoGlyphWKT|Hummingbird|OPoint{coordinates:[2]}|
|1   |#25:1|GeoGlyphWKT|Spiral     |OPoint{coordinates:[2]}|
|2   |#25:2|GeoGlyphWKT|Dog        |OPoint{coordinates:[2]}|
|3   |#26:0|GeoGlyphWKT|Monkey     |OPoint{coordinates:[2]}|
|4   |#26:1|GeoGlyphWKT|Hands      |OPoint{coordinates:[2]}|
|5   |#26:2|GeoGlyphWKT|Wing       |OPoint{coordinates:[2]}|
|6   |#27:0|GeoGlyphWKT|Condor     |OPoint{coordinates:[2]}|
|7   |#27:1|GeoGlyphWKT|Tree       |OPoint{coordinates:[2]}|
|8   |#27:2|GeoGlyphWKT|Parrot     |OPoint{coordinates:[2]}|
|9   |#28:0|GeoGlyphWKT|Spider     |OPoint{coordinates:[2]}|
|10  |#28:1|GeoGlyphWKT|Astronaut  |OPoint{coordinates:[2]}|
+----+-----+-----------+-----------+-----------------------+

11 item(s) found. Query executed in 0.013 sec(s).
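
The `OPoint{coordinates:[2]}` values in the result are embedded documents holding a two-element coordinates list. As a rough sketch, assuming the document shape is `{"@class": "OPoint", "coordinates": [x, y]}` with x (longitude) first as in WKT, the equivalent document for one row could be built as follows. The function name `wkt_point_to_opoint` is hypothetical.

```python
import json

def wkt_point_to_opoint(wkt):
    """Convert 'POINT(x y)' into an OPoint-style dict (assumed document shape)."""
    text = wkt.strip()
    if not text.upper().startswith("POINT") or not text.endswith(")"):
        raise ValueError(f"not a WKT POINT: {wkt!r}")
    # Take the text between the first "(" and the final ")".
    body = text[text.index("(") + 1 : -1]
    x, y = (float(part) for part in body.split())
    return {"@class": "OPoint", "coordinates": [x, y]}

doc = wkt_point_to_opoint("POINT (-75.148892 -14.692131)")
print(json.dumps(doc))
```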
