How to create Neo4j relationships between nodes in the Yelp dataset

I am new to Neo4j. I am trying to load the Yelp dataset into Neo4j. I am mainly interested in three of the JSON files it provides:

user.json

{
    "user_id": "-lGwMGHMC_XihFJNKCJNRg",
    "name": "Gabe",
    "review_count": 277,
    "yelping_since": "2014-10-31",
    "friends": ["Oa84FFGBw1axX8O6uDkmqg", "SRcWERSl4rhm-Bz9zN_J8g", "VMVGukgapRtx3MIydAibkQ", "8sLNQ3dAV35VBCnPaMh1Lw", "87LhHHXbQYWr5wlo5W7_QQ"],
    "useful": 45,
    "funny": 4,
    "cool": 55,
    "fans": 17,
    "elite": [],
    "average_stars": 4.72,
    "compliment_hot": 5,
    "compliment_more": 1,
    "compliment_profile": 0,
    "compliment_cute": 1,
    "compliment_list": 0,
    "compliment_note": 11,
    "compliment_plain": 20,
    "compliment_cool": 15,
    "compliment_funny": 15,
    "compliment_writer": 1,
    "compliment_photos": 8
}

I have omitted several entries from the friends array to keep the output readable.

business.json

{
    "business_id": "YDf95gJZaq05wvo7hTQbbQ",
    "name": "Richmond Town Square",
    "neighborhood": "",
    "address": "691 Richmond Rd",
    "city": "Richmond Heights",
    "state": "OH",
    "postal_code": "44143",
    "latitude": 41.5417162,
    "longitude": -81.4931165,
    "stars": 2.0,
    "review_count": 17,
    "is_open": 1,
    "attributes": {
        "RestaurantsPriceRange2": 2,
        "BusinessParking": {
            "garage": false,
            "street": false,
            "validated": false,
            "lot": true,
            "valet": false
        },
        "BikeParking": true,
        "WheelchairAccessible": true
    },
    "categories": ["Shopping", "Shopping Centers"],
    "hours": {
        "Monday": "10:00-21:00",
        "Tuesday": "10:00-21:00",
        "Friday": "10:00-21:00",
        "Wednesday": "10:00-21:00",
        "Thursday": "10:00-21:00",
        "Sunday": "11:00-18:00",
        "Saturday": "10:00-21:00"
    }
}

review.json

{
    "review_id": "VfBHSwC5Vz_pbFluy07i9Q",
    "user_id": "-lGwMGHMC_XihFJNKCJNRg",
    "business_id": "YDf95gJZaq05wvo7hTQbbQ",
    "stars": 5,
    "date": "2016-07-12",
    "text": "My girlfriend and I stayed here for 3 nights and loved it.",
    "useful": 0,
    "funny": 0,
    "cool": 0
}

As the samples show, a user and a business are linked through review.json: each review carries both a user_id and a business_id. How can I create a relationship between a user and a business using review.json?

I have also seen Mark Needham's tutorial on loading Stack Overflow data, but in that case a relationship file with sample data already existed. Do I need to build a similar file? If so, how should I approach it? Or is there another way to build the relationship between users and businesses?

How you do this depends very much on your data model, but you don't need a separate relationship file. You could do three imports:

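// Create users (assumes each user appears only once in the file)
CALL apoc.load.json('file:///c://temp//SO//user.json') YIELD value AS user
CREATE (u:User)
// Copy every JSON key onto the node; this works because user.json has no nested maps
SET u = user

Then add the businesses: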

CALL apoc.load.json('file:///c://temp//SO//business.json') YIELD value AS business
CREATE (b:Business {
            business_id     : business.business_id,
            name            : business.name,
            neighborhood    : business.neighborhood,
            address         : business.address,
            city            : business.city,
            state           : business.state,
            postal_code     : business.postal_code,
            latitude        : business.latitude,
            longitude       : business.longitude,
            stars           : business.stars,
            review_count    : business.review_count,
            is_open         : business.is_open,
            categories      : business.categories
        })

For the businesses, we can't just do SET b = business, because the JSON contains nested maps (attributes and hours) and a node property can't hold a map. So you need to decide whether you want that data; if you do, you will have to go down a different route.
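One possible route, if you only need to keep the nested data rather than query inside it, is to store each nested map as a JSON string. This is a minimal sketch that assumes the Business nodes from the query above already exist; the property names attributes_json and hours_json are made up for illustration, and businesses that lack these keys are not handled specially.

```cypher
// Second pass over business.json: attach the nested maps as JSON strings
CALL apoc.load.json('file:///c://temp//SO//business.json') YIELD value AS business
MATCH (b:Business {business_id: business.business_id})
// apoc.convert.toJson serialises a map to a string, which is a valid property type
SET b.attributes_json = apoc.convert.toJson(business.attributes),
    b.hours_json      = apoc.convert.toJson(business.hours)
```

If you want to filter on individual attributes in Cypher, separate properties or separate nodes would probably serve better than a JSON string.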

Lastly, the reviews, which is where we join it all up.

CALL apoc.load.json('file:///c://temp//SO//review.json') YIELD value AS review
CREATE (r:Review)
SET r = review
WITH r
// Match the user who wrote the review
MATCH (u:User {user_id: r.user_id})
CREATE (u)-[:HAS_REVIEW]->(r)
WITH r, u
// Match the reviewed business, then link it to the user and to the review
MATCH (b:Business {business_id: r.business_id})
// MERGE so a user with several reviews of one business gets a single HAS_REVIEWED relationship
MERGE (u)-[:HAS_REVIEWED]->(b)
CREATE (b)-[:HAS_REVIEW]->(r)
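The two MATCH lookups in the review import run once per review, so they are likely to be slow on the full dataset unless user_id and business_id are indexed. A sketch of indexes you could create before running the review import, assuming Neo4j 4.0 or later (the index names are arbitrary):

```cypher
// Neo4j 4.0+ syntax; on 3.x the equivalent is: CREATE INDEX ON :User(user_id)
CREATE INDEX user_id_index FOR (u:User) ON (u.user_id);
CREATE INDEX business_id_index FOR (b:Business) ON (b.business_id);
```

Depending on your client, you may need to run the two statements one at a time.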

Obviously, change the labels and relationship types to whatever suits your model. The queries may also need tuning depending on the size of the data; for large files you might need apoc.periodic.iterate to run the import in batches.
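As a rough sketch of the batched version, the review import could be wrapped in apoc.periodic.iterate like this. The batch size of 1000 is an arbitrary starting point, not a recommendation, and parallel is left off because concurrent batches touching the same User or Business nodes could contend for locks.

```cypher
CALL apoc.periodic.iterate(
  // First statement: stream the reviews from the file
  "CALL apoc.load.json('file:///c://temp//SO//review.json') YIELD value AS review RETURN review",
  // Second statement: runs for each review, committed in batches
  "CREATE (r:Review)
   SET r = review
   WITH r
   MATCH (u:User {user_id: r.user_id})
   CREATE (u)-[:HAS_REVIEW]->(r)
   WITH r, u
   MATCH (b:Business {business_id: r.business_id})
   MERGE (u)-[:HAS_REVIEWED]->(b)
   CREATE (b)-[:HAS_REVIEW]->(r)",
  {batchSize: 1000, parallel: false}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages
```

Check errorMessages in the result, since failed batches are reported there rather than aborting the whole call.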

APOC is available as a Neo4j plugin if you don't already have it (and you should use it!)
