Creating multiple arrays within a for loop (Python)

I'm currently having an issue with NumPy arrays. If this question has already been asked elsewhere, I apologize, but I feel that I have looked everywhere.

My initial issue was that I was attempting to create an array and fill it with multiple sets of station data of different sizes. Since I cannot fill the same array with data sets that vary in size, I decided I need to create a new array for each station data set by defining the array inside the for loop I'm using to iterate through each station data set. The problem with this is that, while looping through, each data set overwrites the previous one, so only the result of the final loop iteration is returned.

Then I tried using + and then join to build a new name for each array, but it turns out that is not allowed when defining arrays. Below is the version of the program in which each data array overwrites the previous one. Note that not all the code is included and that this is part of a function definition.

for k in range(len(stat_id)):

    ## NOTE - more code precedes this final portion of the for loop, but was
    ## not included as it is unrelated to the issue at hand.

    # Bring all the data into one big array.
    metar_dat = np.zeros((len(stat_id),len(temp),7), dtype='object')
    for i in range(len(temp)):
        metar_dat[k,i] = np.dstack((stat_id[k], yr[i], month[i], day[i], time[i], temp[i], dwp[i]))
    #print np.shape(metar_dat[k])
    #print metar_dat[k]

#print np.shape(metar_dat) # Confirm success with shape read.
return metar_dat

Upon running this function and printing the returned array, I get this (two empty arrays and a final filled array):

[[[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
..., 
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]]

[[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
..., 
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]
[0 0 0 ..., 0 0 0]]

[[\TZR 2015 7 ..., 2342 58 48]
[\TZR 2015 7 ..., 2300 59 47]
[\TZR 2015 7 ..., 2200 60 48]
..., 
[\TZR 2015 7 ..., 0042 56 56]
[\TZR 2015 7 ..., 0022 56 56]
[\TZR 2015 7 ..., 0000 56 56]]]

My question is this:

How can I create an array for each set of station data such that I do not overwrite any previous data?

Or

How can I create a single array that contains data sets with varying numbers of rows?

I am still new to Python (and new to posting here) and any ideas would be much appreciated.

You're resetting your array to zeros inside your k loop on every iteration. Create it with zeros (or with empty, if all elements get filled, as in your case) once, outside your nested loop, and you should be fine:

metar_dat = np.empty((len(stat_id),len(temp),7), dtype='object')
for k in range(len(stat_id)):
    for i in range(len(temp)):
        metar_dat[k,i] = np.dstack((stat_id[k], yr[i], month[i], day[i], time[i], temp[i], dwp[i]))
return metar_dat

You get a metar_dat array that is mostly 0 because it is the one you created at the last k iteration. It is len(stat_id) long (in the first dimension), but you only inserted data for the last k. You threw away the results for the earlier values of k.
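The difference between allocating inside and outside the loop can be reproduced with a small sketch. The station ids and temperatures below are made-up stand-ins, and the two function names are hypothetical; only the placement of the np.zeros call differs between them.

```python
import numpy as np

# Hypothetical stand-in data: 3 stations, 4 observations each.
stat_id = ["AAA", "BBB", "CCC"]
temp = [58, 59, 60, 56]


def fill_inside_loop():
    """Re-allocates the array for every k, so earlier stations are lost."""
    for k in range(len(stat_id)):
        out = np.zeros((len(stat_id), len(temp), 2), dtype=object)
        for i in range(len(temp)):
            out[k, i, 0] = stat_id[k]
            out[k, i, 1] = temp[i]
    return out


def fill_outside_loop():
    """Allocates once, so every station's rows survive."""
    out = np.zeros((len(stat_id), len(temp), 2), dtype=object)
    for k in range(len(stat_id)):
        for i in range(len(temp)):
            out[k, i, 0] = stat_id[k]
            out[k, i, 1] = temp[i]
    return out


buggy = fill_inside_loop()
fixed = fill_outside_loop()
print(buggy[0, 0])  # still zeros: only the last station was kept
print(fixed[0, 0])  # first station's data is present
```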

I would suggest collecting the data in a dictionary rather than an object array.

metar_dat = dict()  # dictionary rather than object array
for id in stat_id:
    # Bring all the data for this station into one array.
    data = np.column_stack([yr, month, day, time, temp, dwp])
    # should produce a (len(temp), 6) integer array,
    # or a float array if one or more of the inputs is float
    metar_dat[id] = data

If len(temp) varies for each id, you can't make a meaningful 3D array with shape (len(stat_id), len(temp), 7) unless you pad every one to the same maximum length. When thinking about arrays, think rectangles, not ragged lists.
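As a rough illustration of the padding option, here is a minimal sketch that pads shorter series with NaN so they fit one rectangular array. The series values and variable names are invented for the example, and NaN padding assumes the data can be stored as floats.

```python
import numpy as np

# Hypothetical temperature series of different lengths, one per station.
series = [np.array([58.0, 59.0, 60.0]), np.array([61.0, 62.0])]

max_len = max(len(s) for s in series)

# Rectangular array pre-filled with NaN; shorter rows keep NaN at the end.
padded = np.full((len(series), max_len), np.nan)
for row, s in enumerate(series):
    padded[row, :len(s)] = s

# NaN-aware reductions skip the padding.
station_means = np.nanmean(padded, axis=1)
print(padded.shape)
print(station_means)
```

The cost of this layout is that every later computation has to remember the padding, which is part of why a dictionary of separate arrays is often simpler.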

A Python dictionary is a much better way of collecting information by some sort of unique id.
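A small sketch of the dictionary approach with stations that have different numbers of rows. The raw dictionary, its station ids, and its column names are assumptions made up for the example; in the real program each station's columns would come from the parsing code that was left out of the question.

```python
import numpy as np

# Hypothetical per-station observations with different row counts.
raw = {
    "TZR": {"yr": [2015, 2015, 2015], "temp": [58, 59, 60], "dwp": [48, 47, 48]},
    "ABC": {"yr": [2015, 2015], "temp": [61, 62], "dwp": [50, 51]},
}

metar_dat = {}
for station, cols in raw.items():
    # One 2D array per station; row counts are free to differ.
    metar_dat[station] = np.column_stack([cols["yr"], cols["temp"], cols["dwp"]])

shapes = {station: data.shape for station, data in metar_dat.items()}
print(shapes)
```

Each station is then looked up by its id, for example metar_dat["TZR"][:, 1] for that station's temperature column.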

Object arrays let you generalize the concept of numeric arrays, but they don't give much added power compared to lists or dictionaries. You can't, for example, add values across the 'id' dimension.

You need to describe what you hope to do with this data once you collect it. That will help guide our recommendations regarding the data representation.

There are other ways of defining the data structure for each id. It looks like yr, time, and temp are equal-length arrays. If they are all numbers, they could be collected into an array with 6 columns. If it is important to keep some columns as integers while others are floats (or even strings), you could use a structured array.
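For reference, a structured array with mixed column types might look roughly like this. The field names, types, and values are illustrative assumptions, not the asker's actual layout.

```python
import numpy as np

# Hypothetical record layout: integer year, string time, float temperature.
obs_dtype = np.dtype([("yr", "i4"), ("time", "U4"), ("temp", "f4")])

obs = np.zeros(3, dtype=obs_dtype)
obs["yr"] = [2015, 2015, 2015]
obs["time"] = ["2342", "2300", "0042"]  # kept as strings, so leading zeros survive
obs["temp"] = [58.0, 59.0, 56.0]

print(obs["time"])         # one column, selected by field name
print(obs[0])              # one record
print(obs["temp"].mean())  # numeric fields still support array math
```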

Structured arrays are often produced by reading column data from a CSV file. Some columns will hold string data (ids), others integers or even dates, others floats. np.genfromtxt is a good tool for loading that sort of file.
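A minimal sketch of loading such a file with np.genfromtxt. The CSV content and column names here are invented, and an in-memory io.StringIO stands in for a real file path.

```python
import io

import numpy as np

# Hypothetical CSV content; a real call would pass a filename instead.
csv_text = io.StringIO(
    "station,yr,month,day,time,temp,dwp\n"
    "TZR,2015,7,1,2342,58.0,48.0\n"
    "TZR,2015,7,1,2300,59.0,47.0\n"
)

# names=True takes field names from the header row;
# dtype=None lets NumPy infer a type for each column.
data = np.genfromtxt(csv_text, delimiter=",", names=True, dtype=None, encoding="utf-8")

print(data.dtype.names)
print(data["station"])
print(data["temp"])
```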

You might also take a look at this post:

How can I make multiple empty arrays in python?

Look up list comprehensions:

listOfLists = [[] for i in range(N)]

Now listOfLists has N empty lists in it.
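A short sketch of why the list comprehension is the usual way to do this: each inner list is a separate object, whereas the superficially similar [[]] * N repeats one list N times. The value of N and the appended number are arbitrary.

```python
N = 3

# Comprehension: N independent empty lists.
list_of_lists = [[] for _ in range(N)]
list_of_lists[0].append(58)

# Multiplication: N references to the same single list.
aliased = [[]] * N
aliased[0].append(58)

print(list_of_lists)  # only the first list changed
print(aliased)        # every entry shows the appended value
```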
