Compare each element of a dictionary with its other elements to find similar elements

Question

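Edited version: I have a CSV file with two columns and 16,000 rows. I want to check each cell (address) against all the other addresses to find the unique ones and put them into a separate dictionary (which in turn holds ID and Address as keys and values). My CSV file looks like the sample below. I think it is a delimiter-separated file, but I am not sure about that or how to check it.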

    ID     Address
    111    abcd
    112    def
    122    ghi
    113    gkl
    132    mno
    123    abc
    131    lnoghi
    134    mko
    135    mnoe
    136    dfo

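I think I need to load it as a dictionary, then take one key and its value, compare it with the rest of the keys, and put it into a new list/dict if it is unique. Will there be any problem if the same or a similar element is repeated several times? Could you help me with this? If you know a better approach than loading it as a dictionary, I would be glad to hear it.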

Thanks

Answer 1

As @RoadRunner suggested, you can do the following, assuming you have already read the CSV into two lists:

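# Note: ID has one more entry than Names; zip() stops at the shorter list,
# so the last ID (131) is ignored.
ID = [111, 112, 122, 113, 132, 123, 131]
Names = ['abc', 'def', 'ghi', 'mno', 'abc', 'mno']

# Group the IDs under each name
dictionary = {}
for name, id_ in zip(Names, ID):
    dictionary.setdefault(name, []).append(id_)

print(dictionary)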
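
Once the IDs are grouped by name like this, the unique addresses the question asks for are the names whose list holds exactly one ID. A minimal sketch, assuming a dictionary shaped like the one built above; the helper name split_unique is hypothetical, and the unique entries are keyed by ID because the question asks for ID and Address as keys and values:

```python
def split_unique(grouped):
    """Split a name -> list-of-IDs mapping into unique and repeated names."""
    unique = {}      # ID -> name, for names that occur exactly once
    repeated = {}    # name -> all IDs that share that name
    for name, ids in grouped.items():
        if len(ids) == 1:
            unique[ids[0]] = name
        else:
            repeated[name] = ids
    return unique, repeated


# Same shape as the dictionary built in the answer above.
grouped = {'abc': [111, 132], 'def': [112], 'ghi': [122], 'mno': [113, 123]}

unique, repeated = split_unique(grouped)
print(unique)    # {112: 'def', 122: 'ghi'}
print(repeated)  # {'abc': [111, 132], 'mno': [113, 123]}
```

Because every name is visited once, this stays linear in the number of rows, so repeated elements cause no problem even with 16,000 rows.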

Answer 2

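Since there can be several identical names, each with its own unique ID, you can build a dictionary with the names as keys and lists of IDs as values. Here is a sample function I wrote a while ago: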

from collections import defaultdict

def read_file(filename):

    # create the dictionary of lists
    data = defaultdict(list)

    # read the file
    with open(filename) as file:

        # skip headers
        next(file)

        # go over each line
        for line in file:

            # split lines on whitespace
            items = line.split()

            ids, name = int(items[0]), items[1]

            # append ids with name
            data[name].append(ids)

    return data
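
The function above splits on whitespace, which matches the sample in the question. If the file turns out to be comma-, semicolon-, or tab-delimited instead, the standard library's csv.Sniffer can guess the delimiter from a sample of the text. A rough sketch, assuming the two-column layout from the question; read_delimited and the in-memory comma-delimited sample are hypothetical, and sniffing is a heuristic that can fail on unusual files:

```python
import csv
import io
from collections import defaultdict


def read_delimited(stream, candidates=",;\t"):
    """Guess the delimiter, then group IDs by name (hypothetical helper)."""
    sample = stream.read(1024)
    stream.seek(0)
    # Sniffer guesses the dialect; it raises csv.Error if it cannot decide.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)

    data = defaultdict(list)
    reader = csv.reader(stream, dialect)
    next(reader)  # skip the header row
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        data[row[1].strip()].append(int(row[0]))
    return dialect.delimiter, data


# Assumed comma-delimited version of the first rows from the question.
sample_file = io.StringIO("ID,Address\n111,abcd\n112,def\n122,ghi\n")
delimiter, data = read_delimited(sample_file)
print(repr(delimiter))  # ','
print(dict(data))       # {'abcd': [111], 'def': [112], 'ghi': [122]}
```

With a real file, open it with open(filename, newline='') as the csv module documentation recommends, and pass the file object in place of sample_file.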

Calling read_file on your data creates a dictionary like this:

>>> print(dict(read_file("yourdata.txt")))
{'mno': [132, 131], 'ghi': [122], 'def': [112], 'gkl': [113], 'abc': [111, 123]}

You can then simply look up the key (name) whose IDs you want to compare.
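
Looking up a key only finds exact matches. For the "similar elements" in the title, one option from the standard library is difflib.get_close_matches, which ranks strings by a similarity ratio. The question does not define "similar", so the 0.8 cutoff and the helper name similar_names below are assumptions for illustration; the data is the sample table from the question, in the shape read_file returns:

```python
from difflib import get_close_matches


def similar_names(data, name, cutoff=0.8):
    """Return the other names in data whose similarity to name is >= cutoff."""
    others = [key for key in data if key != name]
    # n=len(others) asks for every match above the cutoff, best first.
    return get_close_matches(name, others, n=len(others), cutoff=cutoff)


data = {
    'abcd': [111], 'def': [112], 'ghi': [122], 'gkl': [113], 'mno': [132],
    'abc': [123], 'lnoghi': [131], 'mko': [134], 'mnoe': [135], 'dfo': [136],
}

print(data['abc'])                 # exact lookup: [123]
print(similar_names(data, 'abc'))  # ['abcd']
print(similar_names(data, 'mno'))  # ['mnoe']
```

Comparing every name against every other name grows quadratically, so for 16,000 rows this means on the order of a hundred million comparisons; grouping exact duplicates first, as above, reduces the number of distinct names that need a fuzzy comparison.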

2 answers

Answer 1
1 2017-11-28 06:58:57

Answer 2
1 2017-11-28 06:59:37
