Python floating point determinism

The code below (to compute cosine similarity), when run repeatedly on my computer, will output 1.0, 0.9999999999999998, or 1.0000000000000002. When I take out the normalize function, it only returns 1.0. I thought floating point operations were supposed to be deterministic. What would be causing this in my program if the same operations are being applied to the same data on the same computer each time? Is it maybe something to do with where on the stack the normalize function is being called? How can I prevent this?

#! /usr/bin/env python3

import math

def normalize(vector):
    sum = 0
    for key in vector.keys():
        sum += vector[key]**2
    sum = math.sqrt(sum)
    for key in vector.keys():
        vector[key] = vector[key]/sum
    return vector

dict1 = normalize({"a":3, "b":4, "c":42})
dict2 = dict1

n_grams = list(list(dict1.keys()) + list(dict2.keys()))
numerator = 0
denom1 = 0
denom2 = 0

for n_gram in n_grams:
    numerator += dict1[n_gram] * dict2[n_gram]
    denom1 += dict1[n_gram]**2
    denom2 += dict2[n_gram]**2

print(numerator/(math.sqrt(denom1)*math.sqrt(denom2)))

Floating-point math may be deterministic, but the ordering of dictionary keys is not.

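When you call .keys(), the order of the resulting keys is potentially random. On Python versions before 3.7, dicts do not guarantee insertion order, and string hashes are randomized per process by default (since Python 3.3), so the key order can differ from one run to the next.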

Thus the order of the math operations inside your loops is also potentially random. While any single floating-point operation is deterministic, floating-point addition is not associative, so the result of a series of operations depends heavily on the order in which they are performed.
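To illustrate the point about ordering, here is a minimal sketch that adds the same three numbers in two different orders. The helper name sum_in_order and the sample values are made up for this illustration and are not part of the original question.

```python
def sum_in_order(values):
    """Add the values left to right, exactly as a plain for-loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

# Same three numbers, two different orders (illustrative values only).
forward = sum_in_order([0.1, 0.2, 0.3])
backward = sum_in_order([0.3, 0.2, 0.1])

print(forward)              # 0.6000000000000001
print(backward)             # 0.6
print(forward == backward)  # False
```

Each individual addition is rounded the same way every time; only the order differs, and that is enough to change the last bit of the result.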

You could enforce a consistent order by sorting your key lists.
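One way the sorting suggestion might look is sketched below. This is a rough rewrite under assumptions rather than a definitive fix: the cosine_similarity function and the choice to return a new dict from normalize are mine, not the asker's. Every loop visits keys through sorted(), so the sequence of floating-point operations is the same on every run.

```python
import math

def normalize(vector):
    """Return a new unit-length dict, visiting keys in sorted order."""
    keys = sorted(vector)
    norm = math.sqrt(sum(vector[k] ** 2 for k in keys))
    return {k: vector[k] / norm for k in keys}

def cosine_similarity(v1, v2):
    """Cosine similarity with every sum taken over sorted keys."""
    shared = sorted(set(v1) & set(v2))
    numerator = sum(v1[k] * v2[k] for k in shared)
    denom1 = sum(v1[k] ** 2 for k in sorted(v1))
    denom2 = sum(v2[k] ** 2 for k in sorted(v2))
    return numerator / (math.sqrt(denom1) * math.sqrt(denom2))

# Two dicts with the same contents but different insertion order.
first = normalize({"a": 3, "b": 4, "c": 42})
second = normalize({"c": 42, "a": 3, "b": 4})

print(cosine_similarity(first, second))
```

The printed value may still differ from 1.0 in the last digit, but it should be the same digit on every run, because the order of operations no longer depends on how the dict happens to be laid out.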

