Unexpected output in for loop - Python

I have this list:

t=[['universitario de deportes'],['lancaster'],['universitario de'],['juan aurich'],['muni'],['juan']]

I want to reorder the list according to the Jaccard distance. If I reorder t, the expected output should be:

[['universitario de deportes'],['universitario de'],['lancaster'],['juan aurich'],['juan'],['muni']]

The code for the Jaccard distance is working OK, but the rest of the code doesn't give the expected output. The code is below:

def jack(a,b):
    x=a.split()
    y=b.split()
    k=float(len(set(x)&set(y)))/float(len((set(x) | set(y))))
    return k
t=[['universitario de deportes'],['lancaster'],['universitario de'],['juan aurich'],['muni'],['juan']]

import copy as cp


b=cp.deepcopy(t)

c=[]

while (len(b)>0):
    c.append(b[0][0])
    d=b[0][0]
    del b[0]
    for m in range (0 , len(b)+1):
        if m > len(b):
            break
            if jack(d,b[m][0])>0.3:
                c.append(b[m][0])
                del b[m]

Unfortunately, the output is just the same list, unchanged:

print c
['universitario de deportes', 'lancaster', 'universitario de', 'juan aurich', 'muni', 'juan']

EDIT:

I tried to correct my code, but it didn't work either, although I got a little closer to the expected output:

t=[['universitario de deportes'],['lancaster'],['universitario de'],['juan aurich'],['muni'],['juan']]

import copy as cp


b=cp.deepcopy(t)

c=[]

while (len(b)>0):
    c.append(b[0][0])
    d=b[0][0]
    del b[0]
    for m in range(0,len(b)-1):
        if jack(d,b[m][0])>0.3:
            c.append(b[m][0])
            del b[m]

The "close" output is:

['universitario de deportes', 'universitario de', 'lancaster', 'juan aurich', 'muni', 'juan']

Second edit:

Finally, I came up with a solution that is computationally quite fast. I'll use the code to order 60 thousand names. The code is below:

t=['universitario de deportes','lancaster','lancaste','juan aurich','lancaster','juan','universitario','juan franco']

import copy as cp


b=cp.deepcopy(t)

c=[]

while (len(b)>0):
    c.append(b[0])
    e=b[0]
    del b[0]
    for val in b:
        if jack(e,val)>0.3:
            c.append(val)
            b.remove(val)

print c
['universitario de deportes', 'universitario', 'lancaster', 'lancaster', 'lancaste', 'juan aurich', 'juan', 'juan franco']

Firstly, I'm not sure why you've got everything in single-item lists, so I suggest flattening it out first:

t = [l[0] for l in t]

This gets rid of the extra zero indices everywhere, and means you only need shallow copies (as strings are immutable).
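A small sketch of what the flattening buys you, assuming the same t as in the question. The names flat and work are made up for illustration only:

```python
# Same nested list as in the question
t = [['universitario de deportes'], ['lancaster'], ['universitario de'],
     ['juan aurich'], ['muni'], ['juan']]

# Flatten: pull the single string out of each inner list
flat = [inner[0] for inner in t]

# A shallow copy is enough now, because the elements are immutable strings
work = list(flat)
del work[0]  # mutating the copy...

print(flat[0])    # ...leaves the original list untouched
print(len(work))  # the copy is one item shorter
```

With the nested version, a shallow copy would still share the inner lists, which is presumably why deepcopy was used in the first place.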

Secondly, the last three lines of your code never run:

if m > len(b):
    break # nothing after this will happen
    if jack(d,b[m][0])>0.3:
        c.append(b[m][0])
        del b[m]
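To see the effect in isolation, here is a tiny sketch (the list reached is hypothetical, only there to record what executes). Anything placed after the break inside the same if block is dead code: when the condition is false the whole block is skipped, and when it is true the loop exits first. Dedenting by one level makes the statement run:

```python
reached = []  # hypothetical log of which statements actually executed

for m in range(3):
    if m > 5:
        break
        reached.append('nested')  # dead code: after break, inside the same block
    # dedented one level: runs on every iteration that does not break
    reached.append('dedented')

print(reached)  # 'nested' never shows up
```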

I think what you want is:

out = [] # this will be the sorted list
for index, val1 in enumerate(t): # work through each item in the original list
    if val1 not in out: # if we haven't already put this item in the new list
        out.append(val1) # put this item in the new list
    for val2 in t[index+1:]: # search the rest of the list
        if val2 not in out: # if we haven't already put this item in the new list
            if jack(val1, val2) > 0.3: # and the new item is close to the current item
                out.append(val2) # add the new item too

This gives me

out == ['universitario de deportes', 'universitario de', 
      'lancaster', 'juan aurich', 'juan', 'muni']

I would generally recommend using better variable names than a, b, c, etc.
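Putting the pieces together with more descriptive names, roughly as a sketch rather than a definitive version: jaccard_similarity and group_similar are names chosen here, not taken from the question, and the 0.3 threshold is the one used above. Strictly speaking, what jack computes is the Jaccard similarity (1.0 for identical word sets); the distance would be one minus that. The guard for an empty union is an added assumption, since the original would raise ZeroDivisionError on two empty strings.

```python
def jaccard_similarity(a, b):
    """Word-level Jaccard similarity: |A & B| / |A | B| (same value as jack)."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    if not union:  # assumption: two empty strings count as not similar
        return 0.0
    return float(len(words_a & words_b)) / len(union)


def group_similar(names, threshold=0.3):
    """Reorder names so each one is followed by the later names close to it."""
    ordered = []
    for index, current in enumerate(names):
        if current not in ordered:  # not placed yet
            ordered.append(current)
        for candidate in names[index + 1:]:  # search the rest of the list
            if (candidate not in ordered
                    and jaccard_similarity(current, candidate) > threshold):
                ordered.append(candidate)
    return ordered


t = [['universitario de deportes'], ['lancaster'], ['universitario de'],
     ['juan aurich'], ['muni'], ['juan']]
names = [inner[0] for inner in t]  # flatten the single-item lists
print(group_similar(names))
```

Note that the not in ordered checks collapse exact duplicates, and each check scans the whole list, so for something like 60 thousand names it may be worth tracking already-placed names in a set as well.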
