np.array([5.3, 1.2, 76.1, 'Alice', 'Bob', 'Claire'])
I'd like to know why this gives a dtype of <U32, but the following code gives a dtype of <U6.
np.array(['Alice', 'Bob', 'Claire', 5.3, 1.2, 76.1])
Numpy tries to be efficient when storing data by calculating how much space it will take to store each element.
import numpy as np
a = np.array([5.3, 1.2, 76.1, 'Alice', 'Bob', 'Claire'])
b = np.array(['Alice', 'Bob', 'Claire', 5.3, 1.2, 76.1])
print(a.dtype, b.dtype)
# output: <U32 <U6
Numpy sees 5.3 and starts building a float array; when it then reaches 'Alice', it has to promote the float type to a string type. Promoting a 64-bit float yields a 32-character data-type (<U32), wide enough to hold the text form of any float64. Per the NumPy documentation, a data-type object describes:
- Type of the data (integer, float, Python object, etc.)
- Size of the data (how many bytes are in, e.g., the integer)
- Byte order of the data (little-endian or big-endian)
- If the data type is a structured data type, an aggregate of other data types (e.g., describing an array item consisting of an integer and a float):
  - what are the names of the "fields" of the structure, by which they can be accessed,
  - what is the data-type of each field, and
  - which part of the memory block each field takes.
- If the data type is a sub-array, what is its shape and data type.
When it sees the other strings in the array, they all fit within the 32-character data-type, so it doesn't have to be changed.
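A minimal sketch of that promotion (the printed dtype assumes a typical little-endian build):

```python
import numpy as np

# Mixing a float with strings forces promotion to a string dtype.
# float64 promotes to <U32: 32 characters is enough to hold the
# text form of any 64-bit float, even though str(5.3) is far shorter.
a = np.array([5.3, 1.2, 76.1, 'Alice', 'Bob', 'Claire'])
print(a.dtype)        # <U32
print(len(str(5.3)))  # 3 -- the rendered value itself is much shorter
```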
Now, consider the second example. Numpy sees 'Alice' and the other strings and picks a data-type wide enough for the longest one, 'Claire' (six characters). Numpy continues along and sees 5.3, whose string form '5.3' is only three characters, so it also fits into the 6-character data-type. No upgrading is required.
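The width logic above can be checked in plain Python by measuring the rendered form of each element (note that the exact dtype an array coercion produces can vary across NumPy versions, so this only sketches the length arithmetic):

```python
# When strings come first, the width follows the rendered elements:
# the longest is 'Claire' (6 characters), while str(5.3) is only 3.
elements = ['Alice', 'Bob', 'Claire', 5.3, 1.2, 76.1]
widths = [len(str(e)) for e in elements]
print(max(widths))  # 6 -> matching the observed <U6
```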
Similarly, running np.array(['Alice', 'Bob', 'Claire', 5.3, 1.2, 76.1, 'Bobby', 2.3000000000001]) results in a <U15: Numpy sees 2.3000000000001, finds that its string form (15 characters) does not fit in the data-type it has been using, and upgrades it.
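The 15 here is just the length of the number's string form:

```python
# The data-type is widened to fit the longest rendered element:
# str(2.3000000000001) is 15 characters long.
s = str(2.3000000000001)
print(s, len(s))  # 2.3000000000001 15
```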
https://numpy.org/devdocs/reference/arrays.dtypes.html#arrays-dtypes