Indexing float values in Python

I have a list of floats generated from a data structure which is a list of dictionaries - i.e. I've iterated over the whole list and selected certain values from each dictionary. Now, I want to actually do something with these data points, for which I need some reference to their original position. I tried to simply use the data point as a key, but after trying and failing I did some digging and realized that floats aren't precisely represented due to the way computers work.

So, what I need is some way to assign a unique value to each dictionary in the list, e.g.:

list = [...]
vallist = []
index = {}
for i in range(0, len(list)):
    value = i+0.123
    vallist.append(value)
    index[value] = i

Except I evidently need to assign each value a unique item to be able to point back to its position in the list object. I'm imagining I could possibly create a new object called "valuelist" or something and then int over that, but this seems like something that probably has an obvious workaround that I'm just too thick to figure out.

To reiterate, what I want is a way to make the values point back to their original position in the list - in my data structure, my list contains a ton of dictionaries, and the way I handle it is somewhat more complicated, so I'm sort of stuck with my possibly impractical structure.

Thanks!

Firstly, let's address the problems posed by using floating point.

"floats aren't precisely represented due to the way computers work."

Floating point numbers are represented precisely in computers: each float is an exact binary value. There are, however, some limitations:

  • Resolution is finite. It's impossible to represent an irrational number in finite memory, and a typical 64-bit float carries only about 15 to 17 significant decimal digits.
  • Some decimal (base 10) numbers have no exact representation in binary. For example, 0.1 cannot be represented exactly in base 2. Running "{0:.20f}".format(0.1) in Python returns 0.10000000000000000555.
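As a small illustration of how this affects dictionary keys: a float is a perfectly valid key, but a lookup only succeeds when the lookup value is exactly the same float as the stored one. The names below are made up for the example.

```python
# Hypothetical index: a float key maps to some payload.
index = {0.3: "third entry"}

# The same literal produces the same float, so this lookup succeeds.
direct = 0.3 in index

# 0.1 + 0.2 is mathematically 0.3, but the computed float differs slightly.
computed = (0.1 + 0.2) in index

print(direct)            # True
print(computed)          # False
print(repr(0.1 + 0.2))   # 0.30000000000000004
```

So float keys are only fragile when the key used for lookup is recomputed rather than reused.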

Now, depending on the source of your numbers, and the kind of computations you want to perform, there are different possible solutions for indexing them.

For numbers that can be described precisely in base 10, you can use a Decimal. This represents numbers in base 10 exactly:

>>> from decimal import Decimal
>>> "{0:.20f}".format(Decimal('0.1'))
'0.10000000000000000000'

If you're dealing exclusively with rational numbers (even those without an exact decimal representation), you can use fractions.
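A minimal sketch of the fractions approach, using the standard library's `fractions.Fraction`. The dictionary contents and the position `7` are invented for illustration.

```python
from fractions import Fraction

# One third has no finite decimal expansion, but Fraction stores it exactly.
third = Fraction(1, 3)
total = third + third + third   # exactly 1, with no rounding error

# Fractions compare and hash by value, so an equal value computed
# in a different way still finds the same dictionary key.
index = {Fraction(3, 10): 7}    # 7 is a made-up list position
key = Fraction(1, 10) + Fraction(2, 10)
position = index[key]

print(total == 1)   # True
print(position)     # 7
```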

Note that if you use decimals or fractions, you'll need to use them as early as possible in your processing. Converting from a float to a decimal/fraction in the late stages defeats their purpose - you can't get data that isn't there:

>>> "{0:.20f}".format(Decimal('0.1'))
'0.10000000000000000000'
>>> "{0:.20f}".format(Decimal(0.1))
'0.10000000000000000555'

Also, using decimals or fractions comes at a significant performance penalty. For serious number crunching you'll want to use floats, or even integers, in their place.

Finally, if your numbers are irrational, or if you're getting indexing mishaps even while using decimals or fractions, your best choice is probably indexing rounded versions of the numbers. Use buckets if necessary. collections.defaultdict may be useful for this.
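A rough sketch of the rounded-key idea, assuming each dictionary stores its float under a key called "value" and that six decimal places are enough for the data. Both are assumptions, not something stated in the question.

```python
from collections import defaultdict

# Hypothetical data: a list of dictionaries, each holding a float under "value".
records = [{"value": 0.1 + 0.2}, {"value": 0.3}, {"value": 1.25}]

PRECISION = 6  # assumed number of decimal places that matter for this data

# Map each rounded value to the list positions that produced it.
buckets = defaultdict(list)
for position, record in enumerate(records):
    buckets[round(record["value"], PRECISION)].append(position)

# Round the lookup value the same way before using it as a key.
matches = buckets[round(0.3, PRECISION)]
print(matches)  # [0, 1]
```

Two values that sit on opposite sides of a rounding boundary can still land in different buckets, so checking the neighbouring bucket may be needed when that matters.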

You could also keep a tree, or use binary search over a list with a custom comparison function, but you won't have O(1) lookup.
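One possible shape for the binary search variant, using the standard `bisect` module. Instead of a custom comparison function, this sketch compares against a tolerance after the search. The data, the helper name `find_position` and the default tolerance are all assumptions.

```python
from bisect import bisect_left

# Hypothetical (value, original position) pairs, sorted by value.
pairs = sorted((v, i) for i, v in enumerate([2.5, 0.1 + 0.2, 1.25]))
values = [v for v, _ in pairs]

def find_position(target, tolerance=1e-9):
    """Return the original position of the value nearest to target, or None."""
    k = bisect_left(values, target)
    # The nearest value sits either just before or at the insertion point.
    candidates = [j for j in (k - 1, k) if 0 <= j < len(values)]
    if not candidates:
        return None
    best = min(candidates, key=lambda j: abs(values[j] - target))
    if abs(values[best] - target) <= tolerance:
        return pairs[best][1]
    return None

print(find_position(0.3))  # 1
print(find_position(2.0))  # None
```

Each lookup costs O(log n) rather than O(1), which is the trade-off mentioned above.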

If I understand correctly, you have generated a list of floats, each one from one of the dicts in the original list. Instead of generating a list of floats, why not generate a list of 2-tuples, each holding the float and its corresponding dictionary-list index...
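A minimal sketch of the 2-tuple suggestion, assuming each dictionary keeps its float under a key called "value" (the real field name isn't given in the question):

```python
# Hypothetical data standing in for the list of dictionaries.
records = [{"value": 0.123}, {"value": 1.123}, {"value": 2.123}]

# Carry the original position along with each float instead of
# trying to recover it from the float afterwards.
pairs = [(record["value"], position) for position, record in enumerate(records)]

# Example use: find the largest value and go back to its dictionary.
largest_value, position = max(pairs)
original = records[position]

print(position)  # 2
```

Because the position travels with the value, no float ever has to be used as a dictionary key.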
