
Comparing value in a list to all other values

I have a list of latitudes, lats. I am trying to compare each latitude to every other latitude and find every combination of list items that fall within 0.01 of each other. The code I currently have does just that; however, it also compares each list value to itself.

lats = [79.826, 79.823, 79.855, 79.809]

for i in lats:
    for j in lats:
        if (i - 0.1) <= j <= (i + 0.1):
            print(str(i) +" and "+ str(j))

This returns the output:

79.826 and 79.826
79.826 and 79.823
79.826 and 79.855
79.826 and 79.809
79.823 and 79.826
79.823 and 79.823
79.823 and 79.855
79.823 and 79.809
79.855 and 79.826
79.855 and 79.823
79.855 and 79.855
79.855 and 79.809
79.809 and 79.826
79.809 and 79.823
79.809 and 79.855
79.809 and 79.809

You are implicitly computing a Cartesian product; you could have written

import itertools

for i, j in itertools.product(lats, repeat=2):
    if i - 0.1 <= j <= i + 0.1:
        ...

instead. What you want, though, are the 2-element combinations from the list:

for i, j in itertools.combinations(lats, 2):
    ...

For iterating over the combinations of lats, the itertools solution should be the preferred way, but you may be interested in a way of coding this "by hand". Assuming that what you really want is any two lats in any order, with no pair repeated, you can simply progressively restrict the second loop:

for i, x in enumerate(lats):
    for y in lats[i + 1:]:
        ...

Also, the condition as currently written is more complex than it needs to be. What you really want is for the two values x and y to be no more than some value d apart, hence you could write the condition:

(x - d) <= y <= (x + d)

as:

abs(x - y) <= d
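Putting the restricted second loop and the abs condition together, a minimal sketch could look like the following. The function name close_pairs_by_hand and the parameter d are chosen here only for illustration.

```python
def close_pairs_by_hand(values, d):
    """Return each unordered pair (x, y) of entries that are at most d apart."""
    pairs = []
    for i, x in enumerate(values):
        # Only look at entries after position i, so no pair is repeated
        # and no entry is compared with itself.
        for y in values[i + 1:]:
            if abs(x - y) <= d:
                pairs.append((x, y))
    return pairs


lats = [79.826, 79.823, 79.855, 79.809]
print(close_pairs_by_hand(lats, 0.01))
```

With a tolerance of 0.01 only one pair of the sample list qualifies; with 0.1 all six pairs do.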

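Just add and i != j (note that this compares values, so it also skips two different list entries that happen to hold the same latitude):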

lats = [79.826, 79.823, 79.855, 79.809]

for i in lats:
    for j in lats:
        if (i - 0.1) <= j <= (i + 0.1) and i != j:
            print(str(i) +" and "+ str(j))

outputs:

79.826 and 79.823
79.826 and 79.855
79.826 and 79.809
79.823 and 79.826
79.823 and 79.855
79.823 and 79.809
79.855 and 79.826
79.855 and 79.823
79.855 and 79.809
79.809 and 79.826
79.809 and 79.823
79.809 and 79.855
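If the list may contain the same latitude more than once, comparing positions rather than values keeps those pairs. The following is only a sketch: the function name pairs_excluding_self and the sample list with a repeated value are made up for illustration.

```python
def pairs_excluding_self(values, d):
    """Return ordered pairs of entries at most d apart, skipping only same-position pairs."""
    pairs = []
    for a, x in enumerate(values):
        for b, y in enumerate(values):
            # Compare positions, not values, so equal latitudes stored
            # at different positions are still reported.
            if a != b and abs(x - y) <= d:
                pairs.append((x, y))
    return pairs


lats_with_repeat = [79.826, 79.823, 79.826]  # hypothetical data with a repeated value
for x, y in pairs_excluding_self(lats_with_repeat, 0.01):
    print(f"{x} and {y}")
```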

There is this terse version using itertools.combinations and abs:

from itertools import combinations
lats = [79.826, 79.823, 79.855, 79.809]
# keep only the pairs whose values differ by at most 0.01
print([c for c in combinations(lats, 2) if abs(c[0] - c[1]) <= 0.01])

which gives:

[(79.826, 79.823)]

Or with the formatting:

from itertools import combinations
lats = [79.826, 79.823, 79.855, 79.809]
close_lats = [c for c in combinations(lats, 2) if abs(c[0] - c[1]) <= 0.01]
for combo in close_lats:
    print(f"{combo[0]} and {combo[1]}")

giving:

79.826 and 79.823

As an aside, your question says you seek those that are within 0.01 of each other, but your code sample seems to look for those within 0.1 of each other.
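To see how much the choice of tolerance matters for the sample list, here is a small sketch that takes the tolerance as a parameter. The helper name close_pairs and the parameter name tol are assumptions made for this example.

```python
from itertools import combinations


def close_pairs(values, tol):
    """Return the unordered pairs of entries that differ by at most tol."""
    return [(a, b) for a, b in combinations(values, 2) if abs(a - b) <= tol]


lats = [79.826, 79.823, 79.855, 79.809]
print(len(close_pairs(lats, 0.01)))  # tolerance stated in the question
print(len(close_pairs(lats, 0.1)))   # tolerance used in the question's code
```

For this list the first call finds a single pair, while the second accepts all six possible pairs.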

For efficiency you can use one of the combinatoric iterators from itertools (depending on what you want the final result to be) and isclose from the math module:

from itertools import permutations
from math import isclose

lats = [79.826, 79.823, 79.855, 79.809]

for l1, l2 in permutations(lats, r=2):
    # abs_tol is an absolute difference; rel_tol would scale with the values
    if isclose(l1, l2, abs_tol=0.01):
        print(f"{l1} and {l2}")

Output:

79.826 and 79.823
79.823 and 79.826
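One thing to keep in mind with math.isclose is that rel_tol is a relative tolerance (a fraction of the larger value), whereas abs_tol is an absolute difference. A short sketch on two values from the list, which are 0.017 apart:

```python
from math import isclose

a, b = 79.826, 79.809  # 0.017 apart

# rel_tol=0.01 allows a difference of 1% of the larger value (about 0.8 here).
by_relative = isclose(a, b, rel_tol=0.01)
# abs_tol=0.01 allows a difference of at most 0.01, whatever the magnitude.
by_absolute = isclose(a, b, abs_tol=0.01)

print(by_relative)  # True
print(by_absolute)  # False
```

For a "within 0.01 of each other" test on latitudes, abs_tol is probably the parameter that matches the intent.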

I think you should change your algorithm: first, to solve your problem and avoid reporting each pair twice (e.g. 79.826 and 79.823 as well as 79.823 and 79.826); and second, to improve performance by reducing the complexity from O(n^2) to O(n log(n)) for sorting the list, plus the time needed to report the matching pairs.

It's best to sort your list of lats and use two pointers to track the lower and upper bounds of the run of items that fall within 0.1 of each other.

Here is the code:

lats = [79.826, 79.823, 79.855, 79.809]
lats.sort()

i = 0
j = 1
while j < len(lats):
    if lats[j] - lats[i] <= 0.1:
        print(lats[i: j], lats[j])
        j += 1
    else:
        i += 1

Output:

[79.809] 79.823
[79.809, 79.823] 79.826
[79.809, 79.823, 79.826] 79.855
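If you want explicit pairs rather than a window per element, the same two-pointer idea can be sketched as below. The function name close_pairs_sorted and the parameter tol are illustrative, and a non-negative tol is assumed.

```python
def close_pairs_sorted(values, tol):
    """Return each pair (low, high) from the sorted values with high - low <= tol."""
    s = sorted(values)
    pairs = []
    lo = 0
    for hi in range(1, len(s)):
        # Move the lower pointer forward until s[lo] is within tol of s[hi].
        # The loop stops at lo == hi at the latest, because the difference is then 0.
        while s[hi] - s[lo] > tol:
            lo += 1
        # Every entry in s[lo:hi] is within tol of s[hi].
        for k in range(lo, hi):
            pairs.append((s[k], s[hi]))
    return pairs


lats = [79.826, 79.823, 79.855, 79.809]
print(close_pairs_sorted(lats, 0.01))
```

The lower pointer never moves backwards, so apart from sorting and from writing out the pairs themselves, the scan is linear in the length of the list.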

If you sort your list first, you can make the comparison much more efficient, because you can break out of the inner loop as soon as a comparison fails: all subsequent values will be even larger.

lats = [79.809, 79.823, 79.826, 79.855]
lats_sorted = sorted(lats)
for index, lat1 in enumerate(lats_sorted[:-1]):
    for lat2 in lats_sorted[index+1:]:
        if (lat2 - lat1 ) < 0.1:
            print(str(lat1) + " and " + str(lat2))
        else:
            break
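If only the number of close pairs is needed, the sorted list also allows a binary search instead of an inner loop. This is a sketch using the standard bisect module; the function name count_close_pairs is made up for this example, and because it compares against x + tol rather than a difference, borderline pairs may be rounded differently than in the loop above.

```python
from bisect import bisect_right


def count_close_pairs(values, tol):
    """Count unordered pairs whose sorted values satisfy high <= low + tol."""
    s = sorted(values)
    total = 0
    for i, x in enumerate(s):
        # Index just past the last entry that is <= x + tol.
        end = bisect_right(s, x + tol)
        # Entries at positions i+1 .. end-1 form a close pair with x.
        total += end - (i + 1)
    return total


lats = [79.809, 79.823, 79.826, 79.855]
print(count_close_pairs(lats, 0.01))
print(count_close_pairs(lats, 0.1))
```

Each lookup takes O(log n), so counting stays at O(n log n) no matter how many pairs match.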

I made a small runtime comparison for large lists (5000 elements):

import itertools
import time

import numpy as np


def func1(lats):
    pairs = []
    lats_sorted = sorted(lats)
    for index, lat1 in enumerate(lats_sorted[:-1]):
        for lat2 in lats_sorted[index+1:]:
            if lat2 - lat1 <= 0.1:
                pairs.append((lat1, lat2))
            else:
                break
    return pairs


def func2(lats):
    pairs = []
    for i in lats:
        for j in lats:
            if (i - 0.1) <= j <= (i + 0.1):
                pairs.append((i, j))
    return pairs


def func3(lats):
    pairs = []
    for i, j in itertools.combinations(lats, 2):
        if (i - 0.1) <= j <= (i + 0.1):
            pairs.append((i, j))
    return pairs

def func4(lats):
    pairs = []
    for i in lats:
        for j in lats:
            if (i - 0.1) <= j <= (i + 0.1) and i != j:
                pairs.append((i, j))
    return pairs


lats = np.random.randint(0, 100000, 5000) / 1000

print(lats)

func_list = [func1, func2, func3, func4]

for func in func_list:

    start = time.time()
    pairs = func(lats)
    end = time.time()
    print(f"{func.__name__}: time = {end - start} s, pair count = {len(pairs)}")

The output is:

[79.759 45.091 19.409 ... 24.691  5.114 64.561]
func1: time = 0.033899545669555664 s, pair count = 24972
func2: time = 6.784521102905273 s, pair count = 55155
func3: time = 2.624063491821289 s, pair count = 25077
func4: time = 6.442306041717529 s, pair count = 49929

showing that my proposed algorithm (func1) is much faster than the others. The slight count difference between func1 and func3 (the itertools solution) seems to be a numerical precision issue.
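To illustrate how such a precision effect can arise, here is a small sketch with two hypothetical values that are exactly 0.1 apart on paper. The difference form (as in func1) and the range form (as in func3) are rounded differently and can disagree at the boundary; comparing in integer thousandths avoids this, assuming the inputs have at most three decimals. The helper name within_thousandths is made up for this example.

```python
def within_thousandths(a, b, tol_thousandths):
    """Compare in integer thousandths; assumes inputs have at most 3 decimals."""
    return abs(round(a * 1000) - round(b * 1000)) <= tol_thousandths


i, j = 0.3, 0.4  # hypothetical values, exactly 0.1 apart on paper

difference_form = (j - i) <= 0.1               # condition style used in func1
range_form = (i - 0.1) <= j <= (i + 0.1)       # condition style used in func3
integer_form = within_thousandths(i, j, 100)   # 100 thousandths == 0.1

print(difference_form, range_form, integer_form)
```

Here 0.4 - 0.3 evaluates to slightly more than 0.1 in binary floating point, so the difference form rejects a pair that the integer comparison accepts.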
