
Numpy getting in the way of int -> float type casting

Apologies in advance - I seem to have a very fundamental misunderstanding that I can't clear up. I have a FourVector class with variables for ct and the position vector. I'm writing code to perform an x-direction Lorentz boost. The problem I'm running into is that, as it's written below, ct comes back with a proper float value, but x does not. Messing around, I find that tempx is a float, but assigning tempx to r[0] does not turn that element into a float; instead it is rounded down to an int. I have previously posted a question on mutability vs immutability, and I suspect this is the issue. If so, I clearly have a deeper misunderstanding than expected. Regardless, I have a couple of questions:

1a) If I instantiate a with a = FourVector(ct=5,r=[55,2.,3]), then type(a._r[0]) returns numpy.float64 as opposed to numpy.int32. What is going on here? I expected just a._r[1] to be a float, and instead it changes the type of the whole list?

1b) How do I get the above behaviour (the whole list being floats) without having to instantiate the variables as floats? I read up on the documentation and have tried various methods, like using astype(float), but everything I do seems to keep it as an int. Again, I think this is the mutable/immutable problem I'm having.

2) I had thought that, in the tempx=... line, multiplying by 1.0 would convert it to a float, as this appears to be the reason ct converts to a float, but for some reason it doesn't. Perhaps for the same reason as the others?

import numpy as np

class FourVector():
    def __init__(self, ct=0, x=0, y=0, z=0, r=[]):
        self._ct = ct
        self._r = np.array(r)
        if r == []:
            self._r = np.array([x,y,z])

    def boost(self, beta):
        gamma=1/np.sqrt(1-(beta ** 2))
        tempct=(self._ct*gamma-beta*gamma*self._r[0])
        tempx=(-1.0*self._ct*beta*gamma+self._r[0]*gamma)
        self._ct=tempct
        print(type(self._r[0]))
        self._r[0]=tempx.astype(float)
        print(type(self._r[0]))

a = FourVector(ct=5,r=[55,2,3])
b = FourVector(ct=1,r=[4,5,6])
print(a._r)
a.boost(.5)
print(a._r)

All your problems are indeed related.

A numpy array is an array that holds objects efficiently. It does this by requiring these objects to be of the same type, like strings (of equal length) or integers or floats. It can then easily calculate just how much space each element needs and how many bytes it must "jump" to access the next element (we call these the "strides").
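To make the "same type, fixed size" idea concrete, here is a small sketch that inspects an array's element size and strides. The dtype is given explicitly so the numbers do not depend on the platform, and the variable name is only for illustration.

```python
import numpy as np

# Explicit dtype so the sizes below are the same on every platform.
grid = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(grid.itemsize)  # bytes per element: 8
print(grid.strides)   # bytes to step to the next row / next column: (24, 8)
```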

When you create an array from a list, numpy will try to determine a suitable data type ("dtype") from that list, to ensure all elements can be represented well. Only when you specify the dtype explicitly will it skip that educated guess.

Consider the following example:

>>> import numpy as np
>>> integer_array = np.array([1,2,3])  # pass in a list of integers
>>> integer_array
array([1, 2, 3])
>>> integer_array.dtype
dtype('int64')

As you can see, on my system it returns a data type of int64, which represents integers using 8 bytes. It chooses this because:

  1. numpy recognizes that all elements of the list are integers
  2. my system is a 64-bit system
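The default integer type depends on the platform and the numpy version, which is presumably why the question shows numpy.int32 while the output here shows int64. A minimal sketch, in case you want to check the default on your machine or pin the width yourself (the variable names are illustrative):

```python
import numpy as np

inferred = np.array([1, 2, 3])                # dtype guessed from the list
pinned = np.array([1, 2, 3], dtype=np.int64)  # dtype fixed explicitly

# The guessed dtype is numpy's default integer type on this machine.
print(inferred.dtype == np.dtype(int))  # True
print(pinned.dtype)                     # int64
```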

Now consider an attempt at changing that array:

>>> integer_array[0] = 2.4  # attempt to put a float in an array with dtype int
>>> integer_array # it is automatically converted to an int!
array([2, 2, 3])
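One detail about that cast: the fractional part is dropped (truncation toward zero) rather than rounded down, which only becomes visible with negative values. A small sketch with made-up values:

```python
import numpy as np

values = np.array([1, 2, 3])
values[0] = 2.9    # stored as 2, not rounded to 3
values[1] = -2.9   # stored as -2, not floored to -3
print(values)      # [ 2 -2  3]
```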

As you can see, once the datatype of an array has been set, values assigned to it are automatically cast to that datatype. Let's now consider what happens when you pass in a list that has at least one float:

>>> float_array = np.array([1., 2,3])
>>> float_array
array([ 1.,  2.,  3.])
>>> float_array.dtype
dtype('float64')

Once again, numpy determines a suitable datatype for this array.

Blindly attempting to change the datatype of an array is not wise:

>>> integer_array.dtype = np.float32
>>> integer_array
array([  2.80259693e-45,   0.00000000e+00,   2.80259693e-45,
         0.00000000e+00,   4.20389539e-45,   0.00000000e+00], dtype=float32)

Those numbers are gibberish, you might say. That's because numpy reinterprets the memory of that array as 4-byte floats (someone skilled could convert the numbers to their binary representation and from there recover the original integer values).
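The same reinterpretation can be done without touching the original array's dtype by using view(), which makes it easier to see that no values are converted; the same bytes are just read differently. A sketch, with the dtype pinned to int64 so the element counts are predictable:

```python
import numpy as np

ints = np.array([2, 2, 3], dtype=np.int64)

# Same 24 bytes, read as six 4-byte floats instead of three 8-byte ints.
as_floats = ints.view(np.float32)
print(as_floats.shape)  # (6,)

# Reading those bytes as int64 again recovers the original values.
print(as_floats.view(np.int64))  # [2 2 3]
```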

If you want to cast, you'll have to do it explicitly, and numpy will return a new array:

>>> integer_array.dtype = np.int64 # go back to the previous interpretation
>>> integer_array
array([2, 2, 3])
>>> integer_array.astype(np.float32)
array([ 2.,  2.,  3.], dtype=float32)
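Note that astype() does not modify the array it is called on, so the result has to be kept (or assigned back to the same name). A short sketch of that, with illustrative names:

```python
import numpy as np

ints = np.array([2, 2, 3])
floats = ints.astype(float)  # new array with dtype float64

floats[0] = 2.4              # the fraction survives in the float copy
print(floats[0])             # 2.4
print(ints)                  # [2 2 3], the original is untouched
```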

Now, to address your specific questions:

1a) If I instantiate a with a = FourVector(ct=5,r=[55,2.,3]), then type(a._r[0]) returns numpy.float64 as opposed to numpy.int32. What is going on here? I expected just a._r[1] to be a float, and instead it changes the type of the whole list?

That's because numpy has to determine a single datatype for the entire array (unless you use a structured array), ensuring all elements fit in that datatype. Only then can numpy iterate over the elements of that array efficiently.

1b) How do I get the above behaviour (the whole list being floats) without having to instantiate the variables as floats? I read up on the documentation and have tried various methods, like using astype(float), but everything I do seems to keep it as an int. Again, I think this is the mutable/immutable problem I'm having.

Specify the dtype when you are creating the array. In your code, that would be:

self._r = np.array(r, dtype=float)  # the np.float alias was removed in NumPy 1.24
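Put together, a version of the class from the question with the dtype set at creation might look like the sketch below. It uses the builtin float as the dtype. Using None instead of [] as the default for r is my own addition, not something the fix depends on, and the names new_ct and new_x are likewise just illustrative.

```python
import numpy as np

class FourVector:
    def __init__(self, ct=0, x=0, y=0, z=0, r=None):
        self._ct = ct
        # dtype=float makes every component a float64, whatever was passed in.
        self._r = np.array([x, y, z] if r is None else r, dtype=float)

    def boost(self, beta):
        gamma = 1 / np.sqrt(1 - beta ** 2)
        # Both new values are computed from the old ones before assigning.
        new_ct = gamma * (self._ct - beta * self._r[0])
        new_x = gamma * (self._r[0] - beta * self._ct)
        self._ct = new_ct
        self._r[0] = new_x  # stays a float because the array is float64

a = FourVector(ct=5, r=[55, 2, 3])
a.boost(0.5)
print(a._r.dtype)  # float64
print(a._r[0])     # roughly 60.62 rather than 60
```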

2) I had thought that, in the tempx=... line, multiplying by 1.0 would convert it to a float, as this appears to be the reason ct converts to a float, but for some reason it doesn't. Perhaps for the same reason as the others?

That is true. Try printing the datatype of tempx; it should be a float. However, later on, you are reinserting that value into the array self._r, which has an integer dtype. And as you saw previously, that will cast the float back to an integer type.
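You can see both halves of this with the numbers from the question (ct=5, r=[55,2,3], beta=0.5), outside of the class. A minimal sketch:

```python
import numpy as np

ct, beta = 5, 0.5
r = np.array([55, 2, 3])  # integer dtype, as in the question
gamma = 1 / np.sqrt(1 - beta ** 2)

tempx = -1.0 * ct * beta * gamma + r[0] * gamma
print(type(tempx))  # <class 'numpy.float64'>, so the arithmetic is fine

r[0] = tempx        # cast to the array's integer dtype on assignment
print(r[0])         # 60, the fractional part (about .62) is gone
```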
