
Does extra data reduce performance in Python objects?

I could probably run experiments to measure this myself, but I'm hoping somebody can provide a detailed answer.

Suppose I have two Python arrays:

small_array = [{"x": i, "foo": "hamburger"} for i in range(0,10000)]
big_array = [{"x": i, "foo": "hamburger", "bar": "hotdog"} for i in range(0,10000)]

My question is: will basic array operations (such as iteration or accessing by index) that involve only the "x" key be faster for small_array compared to big_array?

I'm asking because I often find myself building a complex data structure X on which I will perform expensive operations A and B, where the overlap between the attributes of X used by A and B is small. So I'm wondering if there are performance advantages to separating X into Y and Z, so that A can operate on Y and B can operate on Z.

Iteration and indexing speed will be the same for each list (and, for future Google searches: they are lists, not arrays).
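If you want to check that empirically, a minimal timeit sketch along these lines will do (the numbers are illustrative and will vary by machine and Python version):

import timeit

setup = (
    'small_array = [{"x": i, "foo": "hamburger"} for i in range(10000)]\n'
    'big_array = [{"x": i, "foo": "hamburger", "bar": "hotdog"} for i in range(10000)]'
)

# Full pass over each list, touching only the "x" key; the extra
# "bar" key in big_array's dicts makes no measurable difference.
for name in ("small_array", "big_array"):
    t = timeit.timeit(f"for d in {name}: d['x']", setup=setup, number=1000)
    print(name, round(t, 3))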

The performance difference on reasonably-sized (1000 or fewer items) dicts is statistically insignificant. There may be some degenerate cases where many of the keys hash to the same value, but the hash randomization found in modern versions of Python sweeps this under the rug.
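The dict lookup itself can be isolated the same way; on a typical CPython build the two-key and three-key dicts come out essentially identical (again, just a sketch):

import timeit

for setup in ('d = {"x": 1, "foo": "hamburger"}',
              'd = {"x": 1, "foo": "hamburger", "bar": "hotdog"}'):
    # A single key lookup, repeated; both dicts are far below the size
    # where hash collisions or table resizing could matter.
    t = timeit.timeit('d["x"]', setup=setup, number=1_000_000)
    print(setup, "->", round(t, 3))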

As for the list itself, each element is only a reference to another object, so the size of the underlying object does not affect list operations.
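You can see this directly with sys.getsizeof, which reports only the list object itself (the array of pointers), not the dicts it refers to:

import sys

small_array = [{"x": i, "foo": "hamburger"} for i in range(10000)]
big_array = [{"x": i, "foo": "hamburger", "bar": "hotdog"} for i in range(10000)]

# Both lists hold 10000 references, so the lists themselves are the
# same size no matter how large the referenced dicts are.
print(sys.getsizeof(small_array))
print(sys.getsizeof(big_array))  # same value as the line above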
