Python: list of tuples to dictionary whose values are lists of all similar items in the list of tuples

My data looks like this:

movies = [
    "movie 1",
    "movie 2",
    "movie 3",
    "movie 4",
    "movie 5",
    "movie 6",
    "movie 7",
    "movie 8",
    "movie 9",
    "movie 10",
    "movie 11",
    "movie 12",
    "movie 13",
    "movie 14",
    "movie 15",
]
list_of_tuples = [
    ("movie 1", "movie 3"),
    ("movie 3", "movie 6"),
    ("movie 6", "movie 9"),
    ("movie 9", "movie 12"),
    ("movie 12", "movie 15"),
    ("movie 2", "movie 4"),
    ("movie 4", "movie 7"),
    ("movie 8", "movie 10"),
    ("movie 10", "movie 5"),
    ("movie 14", "movie 13"),
    ("movie 11", "movie 13"),
]

The output should look like this:

result_dict = {'movie 1' : ['movie 1' , 'movie 3', 'movie 6', 'movie 9', 'movie 12', 'movie 15'],
               'movie 2' : ['movie 2', 'movie 4', 'movie 7'],
               'movie 3' : ['movie 1' , 'movie 3', 'movie 6', 'movie 9', 'movie 12', 'movie 15'],
                ....}

Here the two elements of each tuple are similar, so 'movie 1' is similar to 'movie 3', 'movie 3' is similar to 'movie 6', 'movie 6' to 'movie 9', 'movie 9' to 'movie 12', and 'movie 12' to 'movie 15'.

I want to get a dictionary whose values are lists of all the similar items.

I have tried it like this, but I am not getting the expected result:

result_dict = {movie : list() for movie in movies}

for tup in list_of_tuples:
  mov1, mov2 = tup

  result_dict[mov1].append(mov2)
  result_dict[mov2].append(mov1)

  for x in result_dict[mov2]:
    if x not in result_dict[mov1]:
      result_dict[mov1].append(x)

  for x in result_dict[mov1]:
    if x not in result_dict[mov2]:
      result_dict[mov2].append(x)

Please help me do this transformation with minimum time complexity.

Thanks in advance.

Thanks to @James Lin for helping me get this result. Below is how the code looks now.


relationships = []
relationship = set()
for tuple_data in list_of_tuples:
    tuple_data = set(tuple_data)
    if tuple_data.intersection(relationship):
       relationship |= tuple_data
    else:
       # broken link
       relationship = set()
       relationship |= tuple_data
       relationships.append(relationship)

for idx in range(len(relationships)):
  relationships[idx] = list(relationships[idx])



result_dict = {movie : list() for movie in movies}

for key in result_dict.keys():
  for item in relationships:
    if key in item:
      result_dict[key] = item

The output is:

{'movie 1': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 2': ['movie 7', 'movie 4', 'movie 2'], 'movie 3': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 4': ['movie 7', 'movie 4', 'movie 2'], 'movie 5': ['movie 10', 'movie 5', 'movie 8'], 'movie 6': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 7': ['movie 7', 'movie 4', 'movie 2'], 'movie 8': ['movie 10', 'movie 5', 'movie 8'], 'movie 9': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 10': ['movie 10', 'movie 5', 'movie 8'], 'movie 11': ['movie 14', 'movie 11', 'movie 13'], 'movie 12': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 13': ['movie 14', 'movie 11', 'movie 13'], 'movie 14': ['movie 14', 'movie 11', 'movie 13'], 'movie 15': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3']}

Please help me understand the complexity of this whole process. It would also be great to get it optimized.

Thanks

Your description is not entirely clear, so assuming your relationships are ordered top-down, I'll try to give you some hints:

You need to loop through list_of_tuples to build the relationships between the elements:

relationships = []
relationship = set()
for tuple_data in list_of_tuples:
    tuple_data = set(tuple_data)
    if tuple_data.intersection(relationship):
       relationship |= tuple_data
    else:
       # broken link
       relationship = tuple_data
       relationships.append(relationship)

print(relationships)

This will print out:

[{'movie 15', 'movie 12', 'movie 6', 'movie 9', 'movie 3', 'movie 1'}, {'movie 2', 'movie 7', 'movie 4'}, {'movie 8', 'movie 5', 'movie 10'}, {'movie 11', 'movie 14', 'movie 13'}]

From this list you will be able to generate your desired dictionary.

UPDATE: use set() so that movie 11 is related to movie 13.

UPDATE: you can first try to profile your code, e.g. as described in "_ldap.get_option(_ldap.OPT_API_INFO) is slow after upgrading to MacOS Mojave".

You can use defaultdict to get this done.

from collections import defaultdict

list_of_tuples = [
    ("movie 1", "movie 3"),
    ("movie 3", "movie 6"),
    ("movie 6", "movie 9"),
    ("movie 9", "movie 12"),
    ("movie 12", "movie 15"),
    ("movie 2", "movie 4"),
    ("movie 4", "movie 7"),
    ("movie 8", "movie 10"),
    ("movie 10", "movie 5"),
    ("movie 14", "movie 13"),
    ("movie 11", "movie 13"),
]

result_dict = defaultdict(list)

for k, v in list_of_tuples:

    # For the first value in the tuple, find out if it is already part of
    # a list in the existing dictionary. If yes, get that key so you can
    # append to it; otherwise start a new key.

    a = ''.join([x for x, y in result_dict.items() for z in y if z == k])

    # If found, the list comprehension above yields exactly one element.

    if a == '':  # not found, so create a new list for this key
        result_dict[k].append(v)

    else:  # k is already in some key's list, so append v to that list
        result_dict[a].append(v)

result_dict = dict(result_dict)
print(result_dict)

The output of the above code is:

{'movie 1': ['movie 3', 'movie 6', 'movie 9', 'movie 12', 'movie 15'], 'movie 2': ['movie 4', 'movie 7'], 'movie 8': ['movie 10', 'movie 5'], 'movie 14': ['movie 13'], 'movie 11': ['movie 13']}

Is this what you are looking for?

You can always call dict(list_of_tuples) to get a corresponding dictionary for these tuples.
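A small sketch of what that call does, and one caveat worth knowing. The `pairs` list below is a made-up example (it is not the question's data) in which 'movie 1' appears twice as a first element; `dict()` keeps only the last value for a repeated key.

```python
# Hypothetical pairs, chosen so that one first element repeats.
pairs = [("movie 1", "movie 3"), ("movie 3", "movie 6"), ("movie 1", "movie 9")]

as_dict = dict(pairs)
print(as_dict)  # {'movie 1': 'movie 9', 'movie 3': 'movie 6'}
```

With the question's list_of_tuples every first element is unique, so nothing is lost there, but that is a property of this particular data rather than of `dict()`.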

I don't know if this is the most time-efficient approach, but the following code gets what you're after in roughly O(n):

from collections import defaultdict

movie_dict = dict(list_of_tuples)

index = defaultdict(list)
for key, value in movie_dict.items():
    index[key] += [value]
    index[value] += [key]

The output is:

defaultdict(list,
            {'movie 1': ['movie 3'],
             'movie 3': ['movie 1', 'movie 6'],
             'movie 6': ['movie 3', 'movie 9'],
             'movie 9': ['movie 6', 'movie 12'],
             'movie 12': ['movie 9', 'movie 15'],
             'movie 15': ['movie 12'],
             'movie 2': ['movie 4'],
             'movie 4': ['movie 2', 'movie 7'],
             'movie 7': ['movie 4'],
             'movie 8': ['movie 10'],
             'movie 10': ['movie 8', 'movie 5'],
             'movie 5': ['movie 10'],
             'movie 14': ['movie 13'],
             'movie 13': ['movie 14', 'movie 11'],
             'movie 11': ['movie 13']})

ETA: This gives you an index, by movie, of what each one is similar to. If you want the movie equivalence classes, so to speak, you'll need to do some set operations. I'll add more info shortly.
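One possible way to get from such an index to the equivalence classes, sketched here as a breadth-first search over the index rather than as the set operations mentioned above. It assumes the question's list_of_tuples and builds the index directly from the pairs; the numeric sort key is only an illustrative choice for stable ordering.

```python
from collections import defaultdict, deque

list_of_tuples = [
    ("movie 1", "movie 3"), ("movie 3", "movie 6"), ("movie 6", "movie 9"),
    ("movie 9", "movie 12"), ("movie 12", "movie 15"), ("movie 2", "movie 4"),
    ("movie 4", "movie 7"), ("movie 8", "movie 10"), ("movie 10", "movie 5"),
    ("movie 14", "movie 13"), ("movie 11", "movie 13"),
]

# Undirected adjacency index: each movie maps to its direct neighbours.
index = defaultdict(list)
for a, b in list_of_tuples:
    index[a].append(b)
    index[b].append(a)

result_dict = {}
for start in list(index):
    if start in result_dict:
        continue  # already placed in a class found earlier
    # Breadth-first search collects everything reachable from `start`.
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in index[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    members = sorted(seen, key=lambda name: int(name.split()[-1]))
    for movie in seen:
        result_dict[movie] = members

print(result_dict["movie 11"])  # ['movie 11', 'movie 13', 'movie 14']
```

Every movie and every pair is visited a constant number of times, so the traversal is linear in movies plus pairs, with the sort adding a log factor per class. Unlike a single pass over the tuples, it does not depend on the order in which the pairs are listed. Movies that appear in no tuple are not in the index and would need to be added separately.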
