
Compare CSV file with Neo4j results

I have a task to compare an Oracle export (a CSV-like format, but with a delimiter other than a comma) with a Neo4j export.

One Oracle CSV file (which can have millions of rows) is formatted like this:

OBJECT_ID|'¦'|NAME|'¦'|SITE_LOCATION|'¦'|PARENT_ID|'¦'|LOCATION_CODE
9144735089013188062|¦|00|¦|9144735080313909184|¦|9144735085613119290|¦|O2GB

Here OBJECT_ID is unique and the data is sorted by it.
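As a minimal sketch of reading this format in Java, the lines can be split with a regular expression that accepts both the quoted header delimiter (|'¦'|) and the plain data delimiter (|¦|). The class and method names here (OracleLineParser, split, toRecord) are hypothetical, and the sketch assumes that field values never contain the delimiter themselves.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class OracleLineParser {
    // \u00A6 is the broken bar character. The optional quotes let one
    // pattern match both |'¦'| (header line) and |¦| (data lines).
    private static final Pattern DELIMITER = Pattern.compile("\\|'?\u00A6'?\\|");

    // Splits one line into its fields; limit -1 keeps trailing empty fields.
    static String[] split(String line) {
        return DELIMITER.split(line, -1);
    }

    // Pairs the header names with the values of one data line.
    static Map<String, String> toRecord(String[] header, String line) {
        String[] values = split(line);
        if (values.length != header.length) {
            throw new IllegalArgumentException(
                "Expected " + header.length + " fields but found " + values.length);
        }
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            record.put(header[i], values[i]);
        }
        return record;
    }
}
```

Calling split on the header line once and then toRecord on each data line gives one small map per row, so only the current row needs to be held in memory.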

  1. My first approach was to create a similar CSV from the Neo4j database, using some Java code that saves the results of a Cypher query in a Map<String, Map<String, String>> variable, like:

{"loc1"={ObjectId="9144735079813886326", NAME="locationName", SITE_LOCATION="Location", ParentId="9144735080313909184"}, "loc2"={ObjectId="9144735079813886326", NAME="locationName", SITE_LOCATION="Location", ParentId="9144735080313909184"}}

and export it to a CSV.

Then I have to load both CSVs back into Java in order to compare them and create some kind of report that names the key whenever the values from the two CSVs do not match.

  2. The second approach I can think of is to load the Oracle CSV into a Map<String, Map<String, String>> or a similar data type and compare it with my Cypher results, thus skipping the Neo4j-to-CSV conversion.
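For data that fits in memory, comparing two such maps and collecting the names of the mismatching fields could look roughly like the sketch below. RecordDiff and its marker strings are hypothetical, and the sketch assumes both sides already use the same field names (for example, ObjectId has been renamed to OBJECT_ID) and are keyed by the same identifier.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper: reports which fields differ for each id.
final class RecordDiff {
    static Map<String, Set<String>> diff(Map<String, Map<String, String>> oracle,
                                         Map<String, Map<String, String>> neo4j) {
        Map<String, Set<String>> report = new TreeMap<>();
        // Visit every id that appears on either side.
        Set<String> ids = new TreeSet<>(oracle.keySet());
        ids.addAll(neo4j.keySet());
        for (String id : ids) {
            Map<String, String> left = oracle.get(id);
            Map<String, String> right = neo4j.get(id);
            if (left == null) {
                report.put(id, Collections.singleton("<missing in Oracle>"));
                continue;
            }
            if (right == null) {
                report.put(id, Collections.singleton("<missing in Neo4j>"));
                continue;
            }
            // Compare field by field; a field absent on one side counts as a mismatch.
            Set<String> fields = new TreeSet<>(left.keySet());
            fields.addAll(right.keySet());
            Set<String> mismatched = new TreeSet<>();
            for (String field : fields) {
                if (!Objects.equals(left.get(field), right.get(field))) {
                    mismatched.add(field);
                }
            }
            if (!mismatched.isEmpty()) {
                report.put(id, mismatched);
            }
        }
        return report;
    }
}
```

The returned map contains only the ids that differ, each with the set of field names whose values do not match, which is the information the report needs.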

Would it be possible to read both sources in parallel, line by line, into a similar Map, without loading both CSVs into memory at the same time?
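Because the Oracle export is sorted by OBJECT_ID, one way to avoid holding both data sets in memory is a merge-style comparison that keeps only one row from each side at a time. The sketch below works on plain iterators, so either side could be backed by a file reader or a query cursor. It rests on several assumptions: both sources yield rows in ascending numeric id order (the Neo4j side would need a matching ORDER BY), ids are decimal integers, and field names are already normalized. SortedStreamCompare and its row helper are hypothetical names.

```java
import java.math.BigInteger;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical helper: walks two id-sorted sources in lockstep.
final class SortedStreamCompare {
    // Convenience factory for an (id, fields) pair.
    static Map.Entry<String, Map<String, String>> row(String id, Map<String, String> fields) {
        return new AbstractMap.SimpleEntry<>(id, fields);
    }

    static List<String> compare(Iterator<Map.Entry<String, Map<String, String>>> oracle,
                                Iterator<Map.Entry<String, Map<String, String>>> neo4j) {
        List<String> report = new ArrayList<>();
        Map.Entry<String, Map<String, String>> left = next(oracle);
        Map.Entry<String, Map<String, String>> right = next(neo4j);
        while (left != null || right != null) {
            // An exhausted side is treated as "greater" so the other side drains.
            int order = left == null ? 1
                      : right == null ? -1
                      : compareIds(left.getKey(), right.getKey());
            if (order < 0) {
                report.add(left.getKey() + ": missing in Neo4j");
                left = next(oracle);
            } else if (order > 0) {
                report.add(right.getKey() + ": missing in Oracle");
                right = next(neo4j);
            } else {
                if (!left.getValue().equals(right.getValue())) {
                    report.add(left.getKey() + ": values differ");
                }
                left = next(oracle);
                right = next(neo4j);
            }
        }
        return report;
    }

    private static <T> T next(Iterator<T> source) {
        return source.hasNext() ? source.next() : null;
    }

    // Assumption: ids are decimal integers and both sides are sorted numerically.
    private static int compareIds(String a, String b) {
        return new BigInteger(a).compareTo(new BigInteger(b));
    }
}
```

For millions of rows the report entries could be written to a file as they are found instead of being collected in a list; the list is used here only to keep the sketch short.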

What would be the best approach for this?

Maybe you should load the CSV into a relational database (perhaps the Oracle instance you already have) and compare using SQL queries?

Are the JSON files the same? You could just use a diff tool like Meld.
