How to find exact matches between strings in sublists
I have two lists composed of sublists, something like:
conopt = [["element1","element2"],["bla"]]
mat = [["element1","elementA"],["bla & etc"]]
From that data I want to fill a matrix with dimensions len(conopt) x len(mat). Since Python doesn't have a built-in matrix type, I'll use another list of sublists:
finalmat = [ ["-","X",...,"X"],[...]]
I want finalmat to have an "X" where there is a full match (when a subelement of conopt equals a subelement of mat) and a "-" where there isn't. However, I only care about the first full match for each pair of sublists. Whether a sublist of conopt has one match or several in a sublist of mat, the result should be the same: a single "X".
I've tried the following:
for i in mat:
    for j in conopt:
        for item in j:
            if item in i:
                finalmat[mat.index(i)][conopt.index(j)] = "X"
However, the result is not correct: I checked some data points by hand and they don't match.
Some extra (less important) info:
The string elements are composed of letters, numbers, spaces (only between words) and the characters "#" and "&".
The sublists have an arbitrary number of strings.
This data comes from an Excel file. I extracted it manually and modified it to match Python syntax.
The output (finalmat) is going back to Excel. I'm doing this step manually because this is a one-off task and I don't want to complicate my code even more.
mat.index(i) and conopt.index(j) search for the first sublist equal to i or j, so they return the wrong position whenever a sublist is repeated. I suggest using enumerate, which gives you each index directly. Also make sure you initialize finalmat correctly:
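# One row per sublist of mat, one column per sublist of conopt, all "-" to start
finalmat = [["-" for j in range(len(conopt))] for i in range(len(mat))]
for indexmat, itemmat in enumerate(mat):
    for indexconopt, itemconopt in enumerate(conopt):
        for item in itemconopt:
            # "in" on a list compares whole strings, so only exact matches count
            if item in itemmat:
                finalmat[indexmat][indexconopt] = "X"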
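The same idea can also be sketched as a small function and run against the sample data from the question. This is only an illustration, not the one right way to do it: the function name build_match_matrix is made up for this example, and it assumes, as the code above does, that rows correspond to sublists of mat and columns to sublists of conopt.

```python
def build_match_matrix(conopt, mat):
    """Return a len(mat) x len(conopt) grid of "X" / "-" markers.

    build_match_matrix is a hypothetical helper name used for illustration.
    """
    finalmat = []
    for mat_sublist in mat:
        # Set membership compares whole strings, so "bla" does not match "bla & etc".
        mat_items = set(mat_sublist)
        row = [
            # any() stops at the first exact match, so extra matches change nothing.
            "X" if any(item in mat_items for item in conopt_sublist) else "-"
            for conopt_sublist in conopt
        ]
        finalmat.append(row)
    return finalmat


conopt = [["element1", "element2"], ["bla"]]
mat = [["element1", "elementA"], ["bla & etc"]]
finalmat = build_match_matrix(conopt, mat)
for row in finalmat:
    print(" ".join(row))
```

Because no index lookups are involved, repeated sublists each get their own row or column instead of collapsing onto the first occurrence.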