python numpy structured array issue

I'm relatively new to numpy. I have imported data from a .csv file with dates in the format YYYY,MM,DD plus some other fields. I would like to put everything into one array, with the dates in a proper datetime format. This is my code:

na_trades = np.zeros((number_of_orders,), dtype = ('datetime64,a5,a5,i4'))
for row in range(number_of_orders):
    order = na_trades_csv[row]
    order_date = dt.datetime(order[0],order[1],order[2])
    order_date64 =  np.datetime64(order_date)
    na_trades[row] = (order_date64,order[3],order[4],order[5])

But I'm getting the error ValueError: error setting an array element with a sequence. Any idea why that is? Thanks in advance for your help!

Using numpy version 1.6.2, dtype = 'datetime64,a5,a5,i4' does not result in the intended dtype:

In [36]: na_trades = np.zeros((number_of_orders,), dtype = 'datetime64,a5,a5,i4')
In [37]: na_trades
Out[37]: array([1970-01-01 00:00:00], dtype=datetime64[us])

This looks like a bug to me, though I could be wrong. Try instead:

na_trades = np.empty(number_of_orders,
                     dtype = [
                         ('dt', 'datetime64'),
                         ('foo','a5'),
                         ('bar', 'a5'),
                         ('baz', 'i4')])
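
To show how the loop from the question fits together with a named-field dtype, here is a minimal sketch. The sample rows standing in for na_trades_csv are invented for illustration, and two choices are assumptions that differ from the original: the date unit is pinned as 'datetime64[D]', and 'U5' (5-character text) is used instead of 'a5' so the text fields hold ordinary Python 3 strings.

```python
import datetime as dt
import numpy as np

# Hypothetical stand-in for the rows read from the .csv file:
# (year, month, day, symbol, action, quantity)
na_trades_csv = [
    (2011, 1, 10, "AAPL", "Buy", 1500),
    (2011, 1, 13, "AAPL", "Sell", 1500),
    (2011, 1, 13, "IBM", "Buy", 4000),
]
number_of_orders = len(na_trades_csv)

# Explicit field names, one (name, type) pair per column.
# Assumption: day precision and 5-character unicode text fields.
trade_dtype = np.dtype([
    ("dt", "datetime64[D]"),
    ("foo", "U5"),
    ("bar", "U5"),
    ("baz", "i4"),
])
na_trades = np.empty(number_of_orders, dtype=trade_dtype)

for row, order in enumerate(na_trades_csv):
    order_date = dt.date(order[0], order[1], order[2])
    # Each record is assigned as a tuple with one item per field.
    na_trades[row] = (np.datetime64(order_date), order[3], order[4], order[5])
```

After the loop, na_trades["dt"] gives all the dates as a datetime64 array and na_trades["baz"] gives the integer column.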

This is because in numpy arrays (unlike Python lists) you cannot assign a sequence to a single element of the array. Python lists are heterogeneous (i.e. different elements can be of different types) and don't really care what you throw into them, whereas numpy arrays have a specific type. You're trying to set the type to be a composite type (i.e. something with a datetime, two strings and an int), but numpy is ignoring everything after the datetime64 in your dtype string because your syntax is a little off.
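
The difference described above can be illustrated with a small sketch. The array and field names here are made up for the example; it only contrasts a plain integer array, a Python list, and a structured array with two integer fields.

```python
import numpy as np

# A plain integer array holds exactly one number per element,
# so assigning a two-item sequence to one slot is rejected.
plain = np.zeros(3, dtype="i4")
try:
    plain[0] = (1, 2)
    plain_error = None
except (TypeError, ValueError) as exc:
    plain_error = exc

# A Python list accepts any object in any slot.
mixed = [0, 0, 0]
mixed[0] = (1, 2)

# A structured array element is a record, so a tuple with one item
# per field is a valid value for a single slot.
rec = np.zeros(3, dtype=[("a", "i4"), ("b", "i4")])
rec[0] = (1, 2)
```

If the dtype collapses to a plain datetime64, as in the question, the assignment of a four-item tuple falls into the first case rather than the third.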

Try the following:

z = np.zeros((5,), dtype = np.dtype([('time','datetime64'),('year','a5'),('month','a5'),('day','i4')]))

This creates an array whose elements are of type numpy.void and act like a dictionary. E.g. you can then do the following:

>>> z[0]
(datetime.datetime(1970, 1, 1, 0, 0), '', '', 0)

>>> z[0]['time']
1970-01-01 00:00:00

>>> z[0][0]
1970-01-01 00:00:00
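
Beyond indexing a single record, a field name can also be used on the whole array. The following is a rough sketch under a few assumptions: it uses 'datetime64[D]' and 'U5' rather than the exact type strings above, and the values written into the first record are invented.

```python
import numpy as np

z = np.zeros(5, dtype=[("time", "datetime64[D]"), ("year", "U5"),
                       ("month", "U5"), ("day", "i4")])
z[0] = (np.datetime64("2012-09-01"), "2012", "09", 1)

field_names = z.dtype.names          # field names in declaration order
times = z["time"]                    # one field across all records (a view)
first = z[0]                         # a single record
first_time_by_name = first["time"]   # access by field name
first_time_by_pos = first[0]         # access by field position

# Because `times` is a view, writing through it updates `z` as well.
times[1] = np.datetime64("2000-01-01")
```

Selecting a column this way avoids looping over records when only one field is needed.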
