
Compare CSV file with Neo4j results

I have a task to compare an Oracle export (a CSV-like format, but not comma-delimited) with a Neo4j export.

One Oracle CSV file (which can have millions of rows) looks like this:

OBJECT_ID|'¦'|NAME|'¦'|SITE_LOCATION|'¦'|PARENT_ID|'¦'|LOCATION_CODE
9144735089013188062|¦|00|¦|9144735080313909184|¦|9144735085613119290|¦|O2GB

Here OBJECT_ID is unique and data is sorted by it.
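As a minimal sketch of reading one data row of this format in Java: the class and method names below (`OracleExportLine`, `splitFields`, `toRecord`) are hypothetical, and the code assumes data rows use the literal three-character delimiter `|¦|` shown in the sample row. The sample header is written with `|'¦'|` instead, so the sketch takes the column names as an explicit array rather than parsing the header line.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class OracleExportLine {
    // Assumed delimiter of data rows: |¦| (the broken bar is U+00A6).
    // Pattern.quote is needed because '|' is a regex metacharacter.
    private static final Pattern DELIMITER = Pattern.compile(Pattern.quote("|\u00A6|"));

    // Splits one data row into its fields; limit -1 keeps trailing empty fields.
    static String[] splitFields(String line) {
        return DELIMITER.split(line, -1);
    }

    // Pairs each field with its column name, preserving column order.
    static Map<String, String> toRecord(String[] header, String line) {
        String[] values = splitFields(line);
        if (values.length != header.length) {
            throw new IllegalArgumentException(
                "Expected " + header.length + " fields but found " + values.length);
        }
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            record.put(header[i], values[i]);
        }
        return record;
    }
}
```

When reading the file, the charset matters because `¦` is not an ASCII character; the reader has to be opened with the encoding the Oracle export was actually written in.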

  1. My first approach was to build a similar CSV from the Neo4j database, using some Java code to save the results of a Cypher query in a Map<String, Map<String, String>> variable, like:

{"loc1"={ObjectId="9144735079813886326", NAME="locationName", SITE_LOCATION="Location", ParentId="9144735080313909184"}, "loc2"={ObjectId="9144735079813886326", NAME="locationName", SITE_LOCATION="Location", ParentId="9144735080313909184"}}

and export it to a CSV.

Then I have to load both CSVs back into Java to compare them and produce a report that names each key whose value differs between the two CSVs.

  2. The second approach I can think of is to load the Oracle CSV into a Map<String, Map<String, String>> or a similar data type and compare it with my Cypher results, thus skipping the Neo4j-to-CSV conversion.
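For the report step, comparing one Oracle record with one Neo4j record could look roughly like the sketch below. `RecordDiff` and `mismatchedKeys` are hypothetical names, and the sketch assumes both maps already use the same key spelling; in the samples above the Oracle header says `OBJECT_ID` and `PARENT_ID` while the Cypher result says `ObjectId` and `ParentId`, so the keys would need to be normalized first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeSet;

final class RecordDiff {
    // Returns, in alphabetical order, every key whose value differs between
    // the two records. A key present on only one side counts as a mismatch.
    static List<String> mismatchedKeys(Map<String, String> oracle, Map<String, String> neo4j) {
        TreeSet<String> keys = new TreeSet<>(oracle.keySet());
        keys.addAll(neo4j.keySet());
        List<String> mismatches = new ArrayList<>();
        for (String key : keys) {
            // Objects.equals treats a missing (null) value as different from any present value.
            if (!Objects.equals(oracle.get(key), neo4j.get(key))) {
                mismatches.add(key);
            }
        }
        return mismatches;
    }
}
```

Calling this once per OBJECT_ID gives the list of key names to put in the report for that row.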

Would it be possible to read both CSVs in parallel, line by line, into a similar Map or some other structure, without loading both CSVs into memory at the same time?

What would be the best approach for this?
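Because OBJECT_ID is unique and the Oracle export is sorted by it, one option is a streaming merge that keeps only the current line of each file in memory. The sketch below is not a definitive implementation and rests on several assumptions: the Neo4j export is also sorted ascending by the same id (for example through an `ORDER BY` in the Cypher query), both sides use the same `|¦|` delimiter and column order, ids sort numerically, and header lines have already been skipped. `SortedExportComparer` and its methods are hypothetical names.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class SortedExportComparer {
    // Assumed delimiter of data rows: |¦| (the broken bar is U+00A6).
    private static final String DELIMITER = "|\u00A6|";

    // Extracts the leading OBJECT_ID of a row as a number.
    static BigInteger idOf(String line) {
        int end = line.indexOf(DELIMITER);
        return new BigInteger(end < 0 ? line : line.substring(0, end));
    }

    private static String next(Iterator<String> lines) {
        return lines.hasNext() ? lines.next() : null;
    }

    // Walks both sorted line streams in lockstep and reports differing ids.
    static List<String> compare(Iterator<String> oracle, Iterator<String> neo4j) {
        List<String> report = new ArrayList<>();
        String left = next(oracle);
        String right = next(neo4j);
        while (left != null && right != null) {
            int order = idOf(left).compareTo(idOf(right));
            if (order == 0) {
                // Same id on both sides: compare the full rows.
                if (!left.equals(right)) {
                    report.add("DIFFERENT " + idOf(left));
                }
                left = next(oracle);
                right = next(neo4j);
            } else if (order < 0) {
                // The Oracle id is smaller, so Neo4j has no row for it.
                report.add("ONLY_IN_ORACLE " + idOf(left));
                left = next(oracle);
            } else {
                report.add("ONLY_IN_NEO4J " + idOf(right));
                right = next(neo4j);
            }
        }
        // Drain whichever side still has rows.
        for (; left != null; left = next(oracle)) {
            report.add("ONLY_IN_ORACLE " + idOf(left));
        }
        for (; right != null; right = next(neo4j)) {
            report.add("ONLY_IN_NEO4J " + idOf(right));
        }
        return report;
    }
}
```

The iterators can come from `BufferedReader.lines().iterator()` or `Files.lines(path).iterator()`, which read lazily. With millions of rows and many differences the in-memory report list could itself grow large, so writing each report entry straight to a file may be preferable.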

Maybe you should load the CSV into any relational database (possibly the Oracle instance you already have) and compare using SQL queries?

Are the JSON files the same? You could just use a diff tool like Meld

