My data is like this:
movies = [
"movie 1",
"movie 2",
"movie 3",
"movie 4",
"movie 5",
"movie 6",
"movie 7",
"movie 8",
"movie 9",
"movie 10",
"movie 11",
"movie 12",
"movie 13",
"movie 14",
"movie 15",
]
list_of_tuples = [
("movie 1", "movie 3"),
("movie 3", "movie 6"),
("movie 6", "movie 9"),
("movie 9", "movie 12"),
("movie 12", "movie 15"),
("movie 2", "movie 4"),
("movie 4", "movie 7"),
("movie 8", "movie 10"),
("movie 10", "movie 5"),
("movie 14", "movie 13"),
("movie 11", "movie 13"),
]
Output should be like this:
result_dict = {'movie 1' : ['movie 1' , 'movie 3', 'movie 6', 'movie 9', 'movie 12', 'movie 15'],
'movie 2' : ['movie 2', 'movie 4', 'movie 7'],
'movie 3' : ['movie 1' , 'movie 3', 'movie 6', 'movie 9', 'movie 12', 'movie 15'],
....}
Here elements in tuples are same so 'movie 1' is similar to 'movie 3' and 'movie 3' is similar to 'movie 6' and 'movie 6' is to 'movie 9' and 'movie 9' to 'movie 12' and 'movie 12' to ' movie 15'.
I want to get a dictionary which has all the similar items as values.
I have tried like this, but I am not getting result:
result_dict = {movie : list() for movie in movies}
for tup in list_of_tuples:
mov1, mov2 = tup
result_dict[mov1].append(mov2)
result_dict[mov2].append(mov1)
for x in result_dict[mov2]:
if x not in result_dict[mov1]:
result_dict[mov1].append(x)
for x in result_dict[mov1]:
if x not in result_dict[mov2]:
result_dict[mov2].append(x )
Please help me transform this with minimum time complexity.
Thanks in advance.
Thanks to @James Lin for helping to get this result, I am posting below how the code looks.
relationships = []
relationship = set()
for tuple_data in list_of_tuples:
tuple_data = set(tuple_data)
if tuple_data.intersection(relationship):
relationship |= tuple_data
else:
# broken link
relationship = set()
relationship |= tuple_data
relationships.append(relationship)
for idx in range(len(relationships)):
relationships[idx] = list(relationships[idx])
result_dict = {movie : list() for movie in movies}
for key in result_dict.keys():
for item in relationships:
if key in item:
result_dict[key] = item
and Output is:
{'movie 1': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 2': ['movie 7', 'movie 4', 'movie 2'], 'movie 3': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 4': ['movie 7', 'movie 4', 'movie 2'], 'movie 5': ['movie 10', 'movie 5', 'movie 8'], 'movie 6': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 7': ['movie 7', 'movie 4', 'movie 2'], 'movie 8': ['movie 10', 'movie 5', 'movie 8'], 'movie 9': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 10': ['movie 10', 'movie 5', 'movie 8'], 'movie 11': ['movie 14', 'movie 11', 'movie 13'], 'movie 12': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 13': ['movie 14', 'movie 11', 'movie 13'], 'movie 14': ['movie 14', 'movie 11', 'movie 13'], 'movie 15': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3']}
Please help me in understanding the complexity of this whole process. It would be also great to get it optimized.
Thanks
Assuming your relationships are ordered top down, your description is not exactly clear, I am going to give a try to give you some hint:
You need to loop through the list_of_tuples
to build the relationships between each element
relationships = []
relationship = set()
for tuple_data in list_of_tuples:
tuple_data = set(tuple_data)
if tuple_data.intersection(relationship):
relationship |= tuple_data
else:
# broken link
relationship = tuple_data
relationships.append(relationship)
print(relationships)
This will print out:
[{'movie 15', 'movie 12', 'movie 6', 'movie 9', 'movie 3', 'movie 1'}, {'movie 2', 'movie 7', 'movie 4'}, {'movie 8', 'movie 5', 'movie 10'}, {'movie 11', 'movie 14', 'movie 13'}]
From this list you will be able generate your desired dictionary.
UPDATE: use set() to solve movie 11 relate to movie 13
UPDATE: you can first try to profile your code, eg. _ldap.get_option(_ldap.OPT_API_INFO) is slow after upgrading to MacOS Mojave
You can use defaultdict
to get this done.
from collections import defaultdict
list_of_tuples = [
("movie 1", "movie 3"),
("movie 3", "movie 6"),
("movie 6", "movie 9"),
("movie 9", "movie 12"),
("movie 12", "movie 15"),
("movie 2", "movie 4"),
("movie 4", "movie 7"),
("movie 8", "movie 10"),
("movie 10", "movie 5"),
("movie 14", "movie 13"),
("movie 11", "movie 13"),
]
result_dict = defaultdict(list)
for k ,v in list_of_tuples:
#for value in the tuple, find out if this is already part of
#the existing dictionary. If yes, get the key so you can
#append to the key else start a new key item
a = ''.join([x for x, y in result_dict.items() for z in y if z == k])
#if found, above list comprehension will result in 1 element
if a = '' : #if not found, then create a new list for key
result_dict[k].append(v)
else: # value is part of a key list, so append value to key list
result_dict[a].append(v)
result_dict = dict(result_dict)
print (result_dict)
The output of the above code is:
{'movie 1': ['movie 3', 'movie 6', 'movie 9', 'movie 12', 'movie 15'], 'movie 2': ['movie 4', 'movie 7'], 'movie 8': ['movie 10', 'movie 5'], 'movie 14': ['movie 13'], 'movie 11': ['movie 13']}
Is this what you are looking for
You can always call dict(list_of_tuples)
to get a corresponding dictionary for these tuples.
I don't know if this is THE most efficient timewise, but I get what you're trying to get in ~ O(n) with the following code:
from collections import defaultdict
movie_dict = dict(list_of_tuples)
index = defaultdict(list)
for key, value in movie_dict.items():
index[key] += [value]
index[value] += [key]
The output is:
defaultdict(list,
{'movie 1': ['movie 3'],
'movie 3': ['movie 1', 'movie 6'],
'movie 6': ['movie 3', 'movie 9'],
'movie 9': ['movie 6', 'movie 12'],
'movie 12': ['movie 9', 'movie 15'],
'movie 15': ['movie 12'],
'movie 2': ['movie 4'],
'movie 4': ['movie 2', 'movie 7'],
'movie 7': ['movie 4'],
'movie 8': ['movie 10'],
'movie 10': ['movie 8', 'movie 5'],
'movie 5': ['movie 10'],
'movie 14': ['movie 13'],
'movie 13': ['movie 14', 'movie 11'],
'movie 11': ['movie 13']})
ETA: This gives you an index by movie of what it's similar to. If you want the movie equivalence classes, so to speak, you'll need to do some set operations. I'll add more info shortly.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.