Construct huge numpy array with pytables

I generate feature vectors for examples from a large amount of data, and I would like to store them incrementally while I am reading the data. The feature vectors are numpy arrays. I do not know the number of numpy arrays in advance, and I would like to store/retrieve them incrementally.

Looking at pytables, I found two options:

  1. Arrays: They require a predetermined size, and I am not sure how computationally efficient appending is.
  2. Tables: The column types do not support lists or arrays.

If it is a plain numpy array, you should probably use Extendable Arrays (EArray): http://pytables.github.io/usersguide/libref/homogenous_storage.html#the-earray-class
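As a rough illustration of the EArray suggestion, here is a minimal sketch that appends one feature vector at a time without knowing the final count. The vector length `N_FEATURES`, the node name `features`, the file name, and the `feature_vectors` generator are all assumptions made for the example, not part of the question.

```python
import os
import tempfile

import numpy as np
import tables

N_FEATURES = 8  # assumed fixed length of each feature vector


def feature_vectors(n_rows):
    """Hypothetical stand-in for the code that reads data and builds vectors."""
    for i in range(n_rows):
        yield np.full(N_FEATURES, i, dtype=np.float64)


def store_incrementally(path, vectors):
    """Append vectors one at a time to an EArray; return the row count."""
    with tables.open_file(path, mode="w") as h5:
        # A 0 in the shape marks the dimension that is allowed to grow.
        earray = h5.create_earray(
            h5.root,
            "features",
            atom=tables.Float64Atom(),
            shape=(0, N_FEATURES),
        )
        for vec in vectors:
            # append() expects the same rank as the array, so add a leading axis.
            earray.append(vec[np.newaxis, :])
        return int(earray.nrows)


path = os.path.join(tempfile.mkdtemp(), "features.h5")
n_stored = store_incrementally(path, feature_vectors(100))

# Read back a slice; only the requested rows are loaded from disk.
with tables.open_file(path, mode="r") as h5:
    first_rows = h5.root.features[:3]
    total_rows = int(h5.root.features.nrows)
```

Appending in larger batches (several rows per `append` call) is usually cheaper than one row at a time, if the reading loop allows it.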

If you have a numpy structured array, you should use a Table.
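For the structured-array case, a Table can be created from a numpy dtype and grown batch by batch. The sketch below assumes a made-up record layout (`label` plus a 4-element `features` field), a node name `records`, and a hypothetical `make_batch` helper; the real layout would come from your own data.

```python
import os
import tempfile

import numpy as np
import tables

# Assumed record layout: an integer label and a fixed-length feature vector.
row_dtype = np.dtype([("label", np.int32), ("features", np.float64, (4,))])


def make_batch(start, size):
    """Hypothetical helper that builds a structured array of `size` records."""
    batch = np.zeros(size, dtype=row_dtype)
    batch["label"] = np.arange(start, start + size)
    # Fill each feature vector with its label value (broadcast over 4 columns).
    batch["features"] = batch["label"][:, None]
    return batch


table_path = os.path.join(tempfile.mkdtemp(), "records.h5")

with tables.open_file(table_path, mode="w") as h5:
    # The table description is taken directly from the numpy dtype.
    table = h5.create_table(h5.root, "records", description=row_dtype)
    for start in range(0, 30, 10):
        table.append(make_batch(start, 10))
    table.flush()

with tables.open_file(table_path, mode="r") as h5:
    records = h5.root.records.read()
    n_records = int(h5.root.records.nrows)
```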

Can't you just store them in an array? Your code should be a loop that grabs things from the data and generates each example. Create an array outside the loop and append each vector to it for storage!

array = []
for row in file:
    # your code that creates `vector` from `row` goes here
    array.append(vector)

Then, after you have gone through the whole file, you have an array with all of your generated vectors! Hopefully that is what you need; you were a bit unclear... next time please provide some code.

Oh, and you did say you wanted pytables, but I don't think it's necessary, especially because of the limitations you mentioned.
