Friends recommendation algorithm with O(n) time complexity in JavaScript

const data = [
  {
    name: 'Bob',
    friends: ['Alice', 'Eve', 'Charlie']
  },
  {
    name: 'Alice',
    friends: ['Bob', 'Dan']
  },
  {
    name: 'Dan',
    friends: ['Alice', 'Eve']
  },
  {
    name: 'Charlie',
    friends: ['Bob']
  }
];

function get_top_k_recommended_friends(name, cutoff) {

}

get_top_k_recommended_friends('Alice', 2) // [Eve, Charlie]
get_top_k_recommended_friends('Alice', 1) // [Eve]

This function takes two arguments (user_id, cutoff_k) and returns a list of recommended new friends for that user_id. The task describes the IDs as integers, although the sample data above uses names. A recommended friend is defined as follows:

A recommended friend is someone who is not yet a friend of user_id but shares at least one mutual friend with user_id.

For example, assume Alice is friends with Bob and Bob is friends with Eve but Alice is not friends with Eve. So when you call get_recommended_friends(Alice), you get Eve. You also get Alice if you call get_recommended_friends(Eve). 

If Bob also is friends with Charlie but Alice is not friends with Charlie, then calling get_recommended_friends(Alice) should yield [Eve, Charlie].

Two IMPORTANT requirements for writing get_recommended_friends are:

  1. The returned list of recommended friends must be sorted in descending order of the number of mutual friends each one has with the requested user.
  2. Only the top k recommended friends should be returned (k is the cutoff).

Based on the provided data calling get_top_k_recommended_friends(Alice, 2) should yield [Eve, Charlie] where Eve is ordered before Charlie as Eve is friends with two of Alice's friends (Bob and Dan) and Charlie is only friends with one of Alice's friends (Bob). get_top_k_recommended_friends(Alice, 1) will yield [Eve].

Any node in the graph at distance 2 from the root (the original user) is connected to at least one node at distance 1, and is therefore a recommended friend.

(e.g. Eve and Charlie are at distance 2 from Alice: Eve is connected to Bob and Dan, and Charlie is connected to Bob, all of whom are at distance 1.)
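To make the graph view concrete, here is a minimal sketch that builds an adjacency map from the sample data and lists the nodes at distance 2 from a user. The helper names buildGraph and distanceTwo are hypothetical, and the sketch assumes friendship is symmetric, so Eve gets an entry even though she has no record of her own.

```javascript
// Sample data from the question.
const data = [
  { name: 'Bob', friends: ['Alice', 'Eve', 'Charlie'] },
  { name: 'Alice', friends: ['Bob', 'Dan'] },
  { name: 'Dan', friends: ['Alice', 'Eve'] },
  { name: 'Charlie', friends: ['Bob'] }
];

// Hypothetical helper: build an undirected adjacency map (name -> Set of names).
// Assumption: friendship is symmetric, so every listed edge is added both ways.
function buildGraph(records) {
  const graph = new Map();
  const link = (a, b) => {
    if (!graph.has(a)) graph.set(a, new Set());
    graph.get(a).add(b);
  };
  for (const { name, friends } of records) {
    for (const friend of friends) {
      link(name, friend);
      link(friend, name);
    }
  }
  return graph;
}

// Hypothetical helper: collect the nodes at distance exactly 2 from `user`.
function distanceTwo(graph, user) {
  const direct = graph.get(user) ?? new Set();
  const result = new Set();
  for (const f1 of direct) {
    for (const f2 of graph.get(f1) ?? []) {
      // Skip the user and anyone who is already a direct friend.
      if (f2 !== user && !direct.has(f2)) result.add(f2);
    }
  }
  return result;
}

const graph = buildGraph(data);
console.log([...distanceTwo(graph, 'Alice')]); // Eve and Charlie
```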

Therefore, for each node at distance 2 (f2), you can count how many nodes at distance 1 it is connected to (mutual_count).

get_top_k_recommended_friends(user, cutoff)
  for f1 in friends[user]:
    for f2 in friends[f1] - friends[user] - {user}: // friends of f1 who are neither user nor already friends of user
      // f1 is a mutual friend of user and f2, so f2 is a recommended friend
      // each pair (f1, f2) is reached exactly once, so no visited marking is needed
      mutual_count[f2] += 1 // missing entries start at 0

  return k_largest(mutual_count, cutoff)
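A minimal JavaScript sketch of the mutual-count idea follows. For each distance-2 candidate it counts how many of the user's friends list that candidate, which is the quantity mutual_count is meant to hold. It reads the friend lists exactly as given in the sample data, and for brevity it ranks candidates with a plain sort rather than a selection algorithm, so the final step costs O(c log c) for c candidates.

```javascript
// Sample data from the question.
const data = [
  { name: 'Bob', friends: ['Alice', 'Eve', 'Charlie'] },
  { name: 'Alice', friends: ['Bob', 'Dan'] },
  { name: 'Dan', friends: ['Alice', 'Eve'] },
  { name: 'Charlie', friends: ['Bob'] }
];

function get_top_k_recommended_friends(name, cutoff) {
  // Index the friend lists by name, as Sets, for O(1) average membership checks.
  const friends = new Map(data.map((p) => [p.name, new Set(p.friends)]));
  const direct = friends.get(name) ?? new Set();

  // mutualCount: candidate -> number of the user's friends who know the candidate.
  const mutualCount = new Map();
  for (const f1 of direct) {
    for (const f2 of friends.get(f1) ?? []) {
      // Skip the user and anyone who is already a direct friend.
      if (f2 === name || direct.has(f2)) continue;
      mutualCount.set(f2, (mutualCount.get(f2) ?? 0) + 1);
    }
  }

  // Assumption: a simple descending sort stands in for k_largest here.
  return [...mutualCount.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, cutoff)
    .map(([candidate]) => candidate);
}

console.log(get_top_k_recommended_friends('Alice', 2)); // [ 'Eve', 'Charlie' ]
console.log(get_top_k_recommended_friends('Alice', 1)); // [ 'Eve' ]
```

Eve is counted twice (through Bob and Dan) and Charlie once (through Bob), which gives the ordering the question asks for.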

Checking whether an element exists in a set takes O(1) on average. The algorithm only visits nodes at distance 1 and 2 from the root (and their edges), so everything before return k_largest(mutual_count, cutoff) should be O(n + m), where n and m are the numbers of those nodes and edges, respectively.

For k_largest, you can use the quickselect algorithm to find the kth largest value, then filter out and sort the k largest values, which should have an average complexity of O(n + k log k).
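One way k_largest could look in JavaScript is sketched below. The names kLargest, quickselect and partition are hypothetical, and the sketch assumes the counts arrive as a Map from candidate name to mutual-friend count. Quickselect with a random pivot moves the k largest entries to the front in O(n) average time, and only those k entries are then sorted.

```javascript
// Hypothetical helper: return the keys of the k largest counts, highest first.
// Assumption: `counts` is a Map of candidate name -> mutual-friend count.
function kLargest(counts, k) {
  const entries = [...counts.entries()]; // [name, count] pairs
  if (k <= 0) return [];
  if (k < entries.length) {
    quickselect(entries, k - 1); // the k largest now occupy indices 0..k-1
  }
  return entries
    .slice(0, k)
    .sort((a, b) => b[1] - a[1]) // O(k log k)
    .map(([name]) => name);
}

// Rearrange `arr` in place so that the (target + 1) entries with the largest
// counts occupy indices 0..target, in arbitrary order.
function quickselect(arr, target) {
  let lo = 0;
  let hi = arr.length - 1;
  while (lo < hi) {
    const p = partition(arr, lo, hi);
    if (p === target) return;
    if (p < target) lo = p + 1;
    else hi = p - 1;
  }
}

// Lomuto partition in descending order around a randomly chosen pivot.
// Returns the final index of the pivot.
function partition(arr, lo, hi) {
  const pivotIndex = lo + Math.floor(Math.random() * (hi - lo + 1));
  [arr[pivotIndex], arr[hi]] = [arr[hi], arr[pivotIndex]];
  const pivot = arr[hi][1];
  let store = lo;
  for (let i = lo; i < hi; i++) {
    if (arr[i][1] > pivot) {
      [arr[i], arr[store]] = [arr[store], arr[i]];
      store++;
    }
  }
  [arr[store], arr[hi]] = [arr[hi], arr[store]];
  return store;
}

console.log(kLargest(new Map([['Eve', 2], ['Charlie', 1]]), 1)); // [ 'Eve' ]
```

Candidates with equal counts may come back in either order, because the pivot is chosen at random.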
