
Dynamically creating arrays for multiple datasets

This is a quality-of-life question that I feel has an answer, but I can't find it (maybe I'm using the wrong search terms).

Essentially, I have multiple sets of large data files that I would like to analyze. This involves reading each data file and storing it as an array (of variable length).

So far I have been doing:

import numpy as np

input1 = np.genfromtxt('data1.dat')
input2 = np.genfromtxt('data2.dat')

and so on. I was wondering whether there is a way to dynamically assign an array to each of these datasets. Since the files can be read dynamically with a for loop,

for i in range(2):
    # each pass overwrites the previous array, so only the last file is kept
    input = np.genfromtxt('data%i.dat' % i)

I was hoping to combine the above to create a bunch of arrays (input1, input2, etc.) without typing out genfromtxt multiple times. Surely there is a way to do this if I had 100 datasets (aptly named data0, data1, etc.) to import.

One solution I can think of is to create a function:

import numpy as np

def input(a):
    return np.genfromtxt('data%i.dat'%a)

But obviously I would prefer to keep the arrays in memory rather than regenerate them on every call, and I would be extremely grateful to know whether this is possible in Python.

You can store your arrays in either a dict or a list:

Option 1

Using a dict:

data = {}
for i in range(2):
    # key each array by name: 'input0', 'input1', ...
    data['input{}'.format(i)] = np.genfromtxt('data{}.dat'.format(i))

You can access each array by key, for example data['input0'].
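As a rough, self-contained sketch of the dict approach: the helper name load_as_dict, the temporary directory, and the two small sample files written with np.savetxt are assumptions made only so the example runs on its own; they are not part of the original question.

```python
import os
import tempfile

import numpy as np


def load_as_dict(directory, count):
    """Load data0.dat .. data{count-1}.dat into a dict keyed 'input0', 'input1', ..."""
    data = {}
    for i in range(count):
        path = os.path.join(directory, 'data{}.dat'.format(i))
        data['input{}'.format(i)] = np.genfromtxt(path)
    return data


# Hypothetical sample files of different lengths, one value per line.
tmpdir = tempfile.mkdtemp()
np.savetxt(os.path.join(tmpdir, 'data0.dat'), np.array([1.0, 2.0, 3.0]))
np.savetxt(os.path.join(tmpdir, 'data1.dat'), np.array([4.0, 5.0]))

data = load_as_dict(tmpdir, 2)
print(sorted(data.keys()))   # ['input0', 'input1']
print(data['input1'].shape)  # (2,)
```

Each file is read once; after that, data['input0'] and friends are ordinary in-memory arrays, which is what the question asks for.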


Option 2

Using a list:

data = []
for i in range(2):
    # arrays are appended in file order, so data[i] holds data{i}.dat
    data.append(np.genfromtxt('data{}.dat'.format(i)))

Alternatively, using a list comprehension:

data = [np.genfromtxt('data{}.dat'.format(i)) for i in range(2)]

You can also use map. In Python 2 it returns a list; in Python 3 it returns an iterator, so wrap it in list():

data = list(map(lambda x: np.genfromtxt('data{}.dat'.format(x)), range(2)))

Now you can access each array by index, for example data[0].
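A minimal sketch of working with the list version, assuming two small sample files created here with np.savetxt in a temporary directory (both are illustrative assumptions, not files from the question). It also shows the list(map(...)) form, which behaves the same on Python 2 and Python 3.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample files so the sketch runs on its own.
tmpdir = tempfile.mkdtemp()
np.savetxt(os.path.join(tmpdir, 'data0.dat'), np.array([1.0, 2.0, 3.0]))
np.savetxt(os.path.join(tmpdir, 'data1.dat'), np.array([4.0, 5.0]))

paths = [os.path.join(tmpdir, 'data{}.dat'.format(i)) for i in range(2)]

# list(...) forces the reads to happen now; on Python 3 a bare map is lazy.
arrays = list(map(np.genfromtxt, paths))

first = arrays[0]                             # array loaded from data0.dat
lengths = [len(a) for a in arrays]            # arrays may differ in length
means = [float(a.mean()) for a in arrays]     # one analysis result per dataset
print(lengths)  # [3, 2]
print(means)    # [2.0, 4.5]
```

A list keeps the datasets in file order and makes it easy to loop over all of them, while a dict is handy when you want names such as 'input0' instead of positions.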
