Neo4j Cypher query from training

I just finished the training at http://www.neo4j.org/learn/online_course and had a couple of questions about the lab answers.

The first is from the Advanced Graph Lab in Lesson 2. (No answer was given, and it doesn't verify in the graph widget thingie.)

The question is: Recommend 3 actors that Keanu Reeves should work with (but hasn't). The hint is that you should basically pick the three People that have ACTED_IN relationships with Movies that Keanu hasn't also ACTED_IN.

The graph has Person nodes and Movie nodes with ACTED_IN relationships and DIRECTED relationships.

I came up with this:

MATCH (a:Person)-[:ACTED_IN]->(movie:Movie)
WHERE NOT (:Person {name:"Keanu Reeves"})-[:ACTED_IN]->(movie)
RETURN a, count(movie)
ORDER BY count(movie) DESC
LIMIT 3

but I couldn't tell whether that actually excluded actors who had been in the same movie as Keanu Reeves, or just excluded Keanu's movies (the actors that were returned hadn't been in Keanu's movies, but they might have been returned anyway).
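
One way to check, sketched here rather than taken from the course material: the WHERE clause above is evaluated once per (a, movie) pair, so it drops the movies Keanu acted in but keeps his co-stars' other movies. Listing Keanu's co-stars and comparing them with the query's results shows whether any slipped through. The `title` property on Movie is an assumption about the training dataset.

```cypher
// List everyone who shares at least one movie with Keanu Reeves.
// Assumption: Movie nodes carry a `title` property.
MATCH (:Person {name: "Keanu Reeves"})-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(costar:Person)
RETURN costar.name, collect(m.title) AS sharedMovies
ORDER BY costar.name
```

Any name that appears both in this list and in the result of the query above is someone Keanu has already worked with.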

I've found two solutions so far.

1: Recommend the busiest actors Keanu Reeves has not acted with.

MATCH (p:Person)-[:ACTED_IN]->(m)
WHERE p.name <> 'Keanu Reeves'
AND NOT (p)-[:ACTED_IN]->()<-[:ACTED_IN]-(:Person{name:'Keanu Reeves'})
RETURN p.name, count(m) AS rating
ORDER BY count(m) DESC
LIMIT 3;

Which yields

p.name          | rating
--------------------------
Tom Hanks       | 12
Meg Ryan        | 5
Cuba Gooding Jr.| 4

2: Recommend the actors Keanu Reeves's co-stars have collaborated with the most.

MATCH (f:Person)-[:ACTED_IN]->(m)<-[:ACTED_IN]-(c:Person),
(k:Person{name:'Keanu Reeves'})
WHERE c.name <> 'Keanu Reeves'
AND (f)-[:ACTED_IN]->()<-[:ACTED_IN]-(k)
AND NOT (c)-[:ACTED_IN]->()<-[:ACTED_IN]-(k)
RETURN c.name, count(c) AS  Rating
ORDER BY Rating desc
LIMIT 3;

Which yields

c.name          | Rating
--------------------------
Danny DeVito    | 2
J.T. Walsh      | 2
Tom Hanks       | 2

Today I came across this question, and here is what I did:

MATCH (keanu:Person)-[:ACTED_IN]->(movie),
      (playedwith:Person)-[:ACTED_IN]->(movie), 
      (playedwith)-[t:ACTED_IN]->(othermovie),
      (other:Person)-[:ACTED_IN]->(othermovie)
WHERE keanu.name = "Keanu Reeves"
      AND NOT (other)-[:ACTED_IN]->(movie)
      AND NOT (keanu)-[:ACTED_IN]->(othermovie)
RETURN other.name
      ,collect(DISTINCT othermovie)
      ,collect(DISTINCT playedwith)
      ,count(DISTINCT playedwith)
ORDER BY count(DISTINCT playedwith) DESC
LIMIT 3

I don't like it much because there are so many DISTINCTs, but here is the result:

other.name    | collect(DISTINCT othermovie) | collect(DISTINCT playedwith)        | count(DISTINCT playedwith)
-----------------------------------------------------------------------------------------------------------------------------
Tom Hanks     | ["Cloud Atlas",              | ["Hugo Weaving","Charlize Theron"]  | 2
              |  "That Thing You Do"]        |
Tom Cruise    | ["A Few Good Men"]           | ["Jack Nicholson"]                  | 1
Robin Williams| ["The Birdcage"]             | ["Gene Hackman"]                    | 1

So I found 2 different ways that seem good. The first one finds the people with the most co-stars (the most "ACTED_IN the same movie" paths to other people), restricted to people that Keanu Reeves has no "ACTED_IN the same movie" relationship with.

The second also finds people who haven't ACTED_IN a Movie with Keanu Reeves, but orders them by who has worked in the most movies.

Of course, it would be easiest to have created a "WORKED_WITH" relationship between all the actors that share this relationship and then searched for everyone Keanu hasn't WORKED_WITH, but that defeats the fun of the question I guess.
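
As a rough sketch of that idea: WORKED_WITH is a hypothetical relationship type that does not exist in the training graph, and the first statement below would modify the data. It links every pair of co-stars; the second statement then ranks people with no WORKED_WITH link to Keanu by the number of movies they acted in.

```cypher
// Statement 1 (run separately): link every pair of actors who share a movie.
// Assumption: WORKED_WITH is a made-up relationship type for illustration.
// The undirected MERGE creates at most one relationship per pair.
MATCH (a:Person)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(b:Person)
MERGE (a)-[:WORKED_WITH]-(b);

// Statement 2: people Keanu has not WORKED_WITH, busiest first.
MATCH (keanu:Person {name: 'Keanu Reeves'}), (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p <> keanu
AND NOT (keanu)-[:WORKED_WITH]-(p)
RETURN p.name, count(m) AS movies
ORDER BY movies DESC
LIMIT 3;
```

The trade-off is that the shortcut relationships have to be kept in sync whenever new ACTED_IN relationships are added.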

First solution that's pretty simple and seems pretty accurate:

MATCH (a:Person {name:"Keanu Reeves"})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(b:Person)
WITH collect(b.name) AS FoF
MATCH (c:Person)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(d:Person)
WHERE not c.name IN FoF AND c.name <> "Keanu Reeves"
RETURN distinct c.name, count(distinct d)
ORDER BY count(distinct d) desc
limit 3

It returns:

c.name          | count(distinct d)
-------------------------------
Tom Hanks       |    34
Cuba Gooding Jr.|    24
Tom Cruise      |    23

where count(distinct d) is the number of people c has acted with.


Edited to add:

After reading another answer, I used its much more streamlined query approach to come up with this:

MATCH (a:Person)-[:ACTED_IN]->()<-[:ACTED_IN]-(b:Person)
WHERE a.name <> 'Keanu Reeves'
// Use an anonymous node here: reusing b would only drop the rows where b is Keanu
AND NOT (a)-[:ACTED_IN]->()<-[:ACTED_IN]-(:Person {name:'Keanu Reeves'})
RETURN a.name, count(DISTINCT b) AS Rating
ORDER BY Rating DESC
LIMIT 3

which returns the same thing as above.


Alternatively, I used this for the people that have worked in the most movies:

MATCH (a:Person {name:"Keanu Reeves"})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(b:Person)
WITH collect(b.name) AS FoF
MATCH (c:Person)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(d:Person)
WHERE not c.name IN FoF AND c.name <> "Keanu Reeves"
RETURN distinct c.name, count(distinct m)
ORDER BY count(distinct m) desc
limit 3

which returns:

c.name           |  count(distinct m)
-------------------------------------------
Tom Hanks        |  11
Meg Ryan         |  5
Cuba Gooding Jr. |  4

where count(distinct m) is the number of movies they've worked in with at least one other actor.
