Change a tuple within a list of tuples

I am reading in data from multiple Excel files and writing it back to an aggregated Excel file.

So I have this output, which represents the relations of multiple entities within my company (entity-ID) with other companies (debitor-name):

debitor_list = [
    ("1", "X AG"),
    ("1", "X AG"),
    ("1", "Z AG"),
    ("2", "X AG"),
    ("2", "X AG"),
    ("3", "LOL AG"),
    ("1", "Z AG"), 
    ("1", "HS AG"),
    ("2", "hs ag")
]

The structure of each tuple in this list is as follows:

('entity-ID', 'debitor-name')

In addition, I have a list with the real names of (and information about) the debitors:

real_file = ["LOLLIPOP AG", "HS AG", "X AG", "Z AG"]

Then I check for similarities between the debitor names in debitor_list and real_file, so I can replace each one with the real name:

import difflib as dif

for deb in debitor_list:
    for cam in real_file:
        if deb[1] != cam:
            sequence = dif.SequenceMatcher(
                isjunk=None,
                a=deb[1].lower(),
                b=cam.lower()
            )
            match = sequence.ratio() * 100
            if (match >= 80):
                print(deb[1], cam, match)
                debitor_list.append((deb[0], cam))

Output:

hs ag HS AG 100.0

How can I delete the ("2", "hs ag") tuple?

Either you replace the whole list, or you replace the element in place with some simple logic; see the two options below.

Note that tuples are immutable, but the list itself is not, so a list element can be replaced with a new tuple.
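To illustrate that note, here is a minimal sketch with a shortened, made-up two-element list (the names pairs and tuple_error are only for this example). Assigning into the tuple fails, while rebinding the list slot to a new tuple works:

```python
# A shortened stand-in for debitor_list (illustrative data only).
pairs = [("1", "HS AG"), ("2", "hs ag")]

tuple_error = None
try:
    pairs[1][1] = "HS AG"  # try to change the tuple itself
except TypeError as exc:
    tuple_error = str(exc)  # tuples do not support item assignment

# Build a new tuple and assign it to the same position in the list.
entity_id, _ = pairs[1]
pairs[1] = (entity_id, "HS AG")

print(tuple_error)
print(pairs)
```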

import difflib as dif

debitor_list = [
    ("1", "X AG"),
    ("1", "X AG"),
    ("1", "Z AG"),
    ("2", "X AG"),
    ("2", "X AG"),
    ("3", "LOL AG"),
    ("1", "Z AG"),
    ("1", "HS AG"),
    ("2", "hs ag"),
]

real_file = ["LOLLIPOP AG", "HS AG", "X AG", "Z AG"]


# Option 1: build and return a new list; d_list itself is not modified.
def fix_stuff(d_list, c_list):
    result = []
    for deb in d_list:
        repl_val = None
        for cam in c_list:
            if deb[1] != cam:
                sequence = dif.SequenceMatcher(
                    isjunk=None, a=deb[1].lower(), b=cam.lower()
                )
                match = sequence.ratio() * 100
                if match >= 80:
                    repl_val = cam
        if repl_val:
            result.append((deb[0], repl_val))
        else:
            result.append(deb)
    return result


print(debitor_list)
new_deb_list = fix_stuff(debitor_list, real_file)
print(new_deb_list)


# Option 2: replace matching entries of debitor_list in place, by index.
for idx, deb in enumerate(debitor_list):
    for cam in real_file:
        if deb[1] != cam:
            sequence = dif.SequenceMatcher(isjunk=None, a=deb[1].lower(), b=cam.lower())
            match = sequence.ratio() * 100
            if match >= 80:
                debitor_list[idx] = (deb[0], cam)
print(debitor_list)

Output:

[('1', 'X AG'), ('1', 'X AG'), ('1', 'Z AG'), ('2', 'X AG'), ('2', 'X AG'), ('3', 'LOL AG'), ('1', 'Z AG'), ('1', 'HS AG'), ('2', 'hs ag')]
[('1', 'X AG'), ('1', 'X AG'), ('1', 'Z AG'), ('2', 'X AG'), ('2', 'X AG'), ('3', 'LOL AG'), ('1', 'Z AG'), ('1', 'HS AG'), ('2', 'HS AG')]
[('1', 'X AG'), ('1', 'X AG'), ('1', 'Z AG'), ('2', 'X AG'), ('2', 'X AG'), ('3', 'LOL AG'), ('1', 'Z AG'), ('1', 'HS AG'), ('2', 'HS AG')]

The if repl_val check determines whether the value needs to be replaced. Since repl_val is set to None at the start of each iteration of the outer for loop, if repl_val is only true if a match assigned it a value during the inner loop.
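One property of this pattern is that, when several names reach the threshold, the last one found wins. A possible variant, sketched here under the assumption that the closest name is the one wanted, keeps track of the best score instead. The helper best_match and its threshold parameter are hypothetical names, not part of the code above, and unlike the original loop it does not skip names that are already identical (they simply map to themselves):

```python
import difflib


def best_match(name, candidates, threshold=80.0):
    """Return the candidate most similar to name, or None if none reaches threshold."""
    best_name = None
    best_score = threshold
    for candidate in candidates:
        # Case-insensitive similarity as a percentage, as in the loops above.
        matcher = difflib.SequenceMatcher(None, name.lower(), candidate.lower())
        score = matcher.ratio() * 100
        if score >= best_score:
            best_name, best_score = candidate, score
    return best_name


real_file = ["LOLLIPOP AG", "HS AG", "X AG", "Z AG"]
hs_result = best_match("hs ag", real_file)
lol_result = best_match("LOL AG", real_file)
print(hs_result, lol_result)
```

Because the helper returns None when nothing matches, a caller can test the result with is not None, which is slightly stricter than a plain truthiness check.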

As for result: when using the function, we do not modify the incoming lists; we return a new list, result, instead.

As for the second way (which is likely the better one): thanks to enumerate, we get an index (idx) for each list element as well as its value (deb). That allows assigning directly to the original list by its index, so the original list is modified in place.
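The practical difference between the two ways shows up when another name refers to the same list. The sketch below uses a small made-up list and a hypothetical dictionary, mapping, that stands in for the fuzzy match; with_real_names and fix_in_place are illustrative names only:

```python
def with_real_names(pairs, mapping):
    """Return a new list; the list passed in is left untouched."""
    return [(entity_id, mapping.get(name, name)) for entity_id, name in pairs]


def fix_in_place(pairs, mapping):
    """Rewrite matching entries of the given list by index."""
    for idx, (entity_id, name) in enumerate(pairs):
        if name in mapping:
            pairs[idx] = (entity_id, mapping[name])


mapping = {"hs ag": "HS AG"}  # hypothetical lookup standing in for the fuzzy match
debitor_list = [("1", "HS AG"), ("2", "hs ag")]
alias = debitor_list  # a second name for the same list object

new_list = with_real_names(debitor_list, mapping)
before = list(alias)  # snapshot: the original list is still unchanged here
print(before)
print(new_list)

fix_in_place(debitor_list, mapping)
print(alias)  # the change is visible through every name for the list
```

Assigning to an existing index while iterating with enumerate is safe because the length of the list does not change, unlike appending to the list inside the loop.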
