Change a tuple within a list of tuples
I am reading in data from multiple Excel files and writing them back to an aggregated Excel file.
So I have this output, and it represents the relations of multiple entities within my company (entity-ID) with other companies (debitor-name):
debitor_list = [
    ("1", "X AG"),
    ("1", "X AG"),
    ("1", "Z AG"),
    ("2", "X AG"),
    ("2", "X AG"),
    ("3", "LOL AG"),
    ("1", "Z AG"),
    ("1", "HS AG"),
    ("2", "hs ag")
]
The tuple structure within this list is the following:
('entity-ID', 'debitor-name')
In addition, I have a list which represents the real names and information about debitors:
real_file = ["LOLLIPOP AG", "HS AG", "X AG", "Z AG"]
Then I am checking for similarities between the debitor names in debitor_list and real_file to replace them with the real name:
import difflib as dif

for deb in debitor_list:
    for cam in real_file:  # the list of real names defined above
        if deb[1] != cam:
            sequence = dif.SequenceMatcher(
                isjunk=None,
                a=deb[1].lower(),
                b=cam.lower()
            )
            match = sequence.ratio() * 100
            if match >= 80:
                print(deb[1], cam, match)
                # Appends a corrected tuple, but the old one stays in the list
                debitor_list.append((deb[0], cam))
Output:
hs ag HS AG 100.0
How can I delete the ("2", "hs ag") tuple?
Either you replace the whole list, or you replace the element in place with some simple logic; see the two options below.
Note that tuples are immutable, but the list itself is not.
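To make the point about immutability concrete, here is a minimal sketch. The names pairs and error_message are made up for illustration and are not part of the question's code. Assigning to a field of a tuple fails, while rebinding the list slot to a new tuple works.

```python
# Illustrative data only; `pairs` and `error_message` are hypothetical names.
pairs = [("1", "HS AG"), ("2", "hs ag")]

error_message = None
try:
    # Tuples are immutable: item assignment raises TypeError.
    pairs[1][1] = "HS AG"
except TypeError as exc:
    error_message = str(exc)

# The list is mutable, so the slot can be rebound to a brand-new tuple.
pairs[1] = (pairs[1][0], "HS AG")

print(error_message)
print(pairs)
```

Both options below rely on the second form: they never change a tuple, they only put a new tuple where the old one was.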
import difflib as dif

debitor_list = [
    ("1", "X AG"),
    ("1", "X AG"),
    ("1", "Z AG"),
    ("2", "X AG"),
    ("2", "X AG"),
    ("3", "LOL AG"),
    ("1", "Z AG"),
    ("1", "HS AG"),
    ("2", "hs ag"),
]
real_file = ["LOLLIPOP AG", "HS AG", "X AG", "Z AG"]


# Option 1: build and return a new list, leaving the inputs untouched
def fix_stuff(d_list, c_list):
    result = []
    for deb in d_list:
        repl_val = None  # reset for every tuple
        for cam in c_list:
            if deb[1] != cam:
                sequence = dif.SequenceMatcher(
                    isjunk=None, a=deb[1].lower(), b=cam.lower()
                )
                match = sequence.ratio() * 100
                if match >= 80:
                    repl_val = cam
        if repl_val:
            result.append((deb[0], repl_val))
        else:
            result.append(deb)
    return result


print(debitor_list)
new_deb_list = fix_stuff(debitor_list, real_file)
print(new_deb_list)

# Option 2: replace the matching tuples in place, by index
for idx, deb in enumerate(debitor_list):
    for cam in real_file:
        if deb[1] != cam:
            sequence = dif.SequenceMatcher(isjunk=None, a=deb[1].lower(), b=cam.lower())
            match = sequence.ratio() * 100
            if match >= 80:
                debitor_list[idx] = (deb[0], cam)
print(debitor_list)
Output:
[('1', 'X AG'), ('1', 'X AG'), ('1', 'Z AG'), ('2', 'X AG'), ('2', 'X AG'), ('3', 'LOL AG'), ('1', 'Z AG'), ('1', 'HS AG'), ('2', 'hs ag')]
[('1', 'X AG'), ('1', 'X AG'), ('1', 'Z AG'), ('2', 'X AG'), ('2', 'X AG'), ('3', 'LOL AG'), ('1', 'Z AG'), ('1', 'HS AG'), ('2', 'HS AG')]
[('1', 'X AG'), ('1', 'X AG'), ('1', 'Z AG'), ('2', 'X AG'), ('2', 'X AG'), ('3', 'LOL AG'), ('1', 'Z AG'), ('1', 'HS AG'), ('2', 'HS AG')]
The if repl_val checks if the value needs to be replaced. Since the variable repl_val gets set to None at the start of each iteration of the outer for loop, if repl_val will only be true if it was changed during the inner loop.
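The reset-to-None pattern can be seen in isolation in the following simplified sketch. It is an assumption-laden illustration, not the answer's code: the function normalize and its variable names are hypothetical, and it uses an exact case-insensitive comparison instead of SequenceMatcher to keep the example short.

```python
def normalize(names, canonical):
    """Return names with case-insensitive matches swapped for the canonical spelling."""
    fixed = []
    for name in names:
        replacement = None  # reset at the start of every outer iteration
        for candidate in canonical:
            if name != candidate and name.lower() == candidate.lower():
                replacement = candidate
                break  # the first hit is enough in this simplified sketch
        # `is not None` states the intent explicitly; a plain truthiness
        # test would also skip falsy values such as an empty string.
        fixed.append(replacement if replacement is not None else name)
    return fixed


normalized = normalize(["HS AG", "hs ag", "LOL AG"], ["HS AG", "X AG"])
print(normalized)
```

Without the reset, a replacement found for one name would leak into the following names that have no match of their own.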
As for using result: when using the function, we're not modifying the incoming lists, but we return a new list, result.
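The same return-a-new-list idea can also be sketched with difflib.get_close_matches from the standard library, which wraps the SequenceMatcher ratio check. This is an alternative under stated assumptions, not the method used above: the function canonicalize, the helper dictionary by_lower, and the sample data are hypothetical, and the cutoff of 0.8 mirrors the 80 percent threshold from the question.

```python
import difflib


def canonicalize(pairs, real_names, cutoff=0.8):
    """Return a new list of tuples; `pairs` itself is left untouched."""
    # Map lowercase spellings back to the canonical names.
    by_lower = {name.lower(): name for name in real_names}
    result = []
    for entity_id, name in pairs:
        # Best match at or above the cutoff, or an empty list if there is none.
        hits = difflib.get_close_matches(
            name.lower(), list(by_lower), n=1, cutoff=cutoff
        )
        result.append((entity_id, by_lower[hits[0]] if hits else name))
    return result


sample = [("1", "HS AG"), ("2", "hs ag"), ("3", "LOL AG")]
cleaned = canonicalize(sample, ["LOLLIPOP AG", "HS AG", "X AG", "Z AG"])
print(cleaned)
print(sample)  # unchanged
```

Because the caller keeps the original list, the cleaned and uncleaned data can be compared before the aggregated file is written.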
As for the second way to do this (and that is likely the better way): due to the usage of enumerate we get an index (idx) for each list element, as well as the value deb. It allows for directly assigning to the original list by its index, so it's a direct modification of the original list.
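What "direct modification" means in practice can be shown with a small sketch. The names records and alias are illustrative only, and upper() stands in for the similarity lookup: every other reference to the same list sees the change, because no new list is created.

```python
# Illustrative names; upper() is a stand-in for the real-name lookup.
records = [("1", "HS AG"), ("2", "hs ag")]
alias = records  # a second reference to the same list object

for idx, (entity_id, name) in enumerate(records):
    # Replacing a slot by index is safe while iterating,
    # because the length of the list does not change.
    records[idx] = (entity_id, name.upper())

print(alias)
print(alias is records)
```

This is also why the in-place variant avoids the problem from the question: nothing is appended, so no stale tuple is left behind to delete.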