How to convert 2D arrays in dictionary into one single array?
I have the following code:
import random
import numpy as np
import pandas as pd

num_seq = 100
len_seq = 20
nts = 4
sequences = np.random.choice(nts, size=(num_seq, len_seq), replace=True)
sequences = np.unique(sequences, axis=0)  # sorts the sequences

d = {}
pr = 5
for i in range(num_seq):
    globals()['seq_' + str(i)] = np.tile(sequences[i, :], (pr, 1))
    d['seq_' + str(i)] = np.tile(sequences[i, :], (pr, 1))

pool = np.empty((0, len_seq), dtype=int)
for i in range(num_seq):
    pool = np.concatenate((pool, eval('seq_' + str(i))))
I want to convert the dictionary d into a NumPy array (or a dictionary with just one entry). My code works, producing pool. However, at bigger values for num_seq, len_seq and pr, it takes a very long time.
The execution time is critical, thus my question: is there a more efficient way of doing this?
Here is a list of important points:
- np.concatenate runs in O(n), so your second loop is running in O(n^2) time. You can append the values to a list and np.vstack all of them at the end (in O(n) time);
- globals() is slow and known to be a bad practice (because it can easily break your code in nasty ways);
- eval(...) is slow too and also unsafe, so avoid it.
Here is an example of faster code (in replacement of the second loop):
pool = np.vstack([d[f'seq_{i}'] for i in range(num_seq)])
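
For completeness, here is a minimal, self-contained sketch that puts the points above together: it builds d with a dict comprehension instead of globals(), then stacks everything once with np.vstack. The fixed seed, the n_rows variable and the pool_direct shortcut are my own additions for illustration, not part of the original code. The np.repeat line assumes you only need pool and not the dictionary itself, so check that it fits your use case.

```python
import numpy as np

num_seq, len_seq, nts, pr = 100, 20, 4, 5

# Fixed seed only so that this sketch is reproducible (assumption, not in the question).
rng = np.random.default_rng(0)
sequences = rng.integers(0, nts, size=(num_seq, len_seq))
sequences = np.unique(sequences, axis=0)  # sorts the rows and drops duplicates

# np.unique may return fewer than num_seq rows, so use the actual row count.
n_rows = len(sequences)

# Build the dictionary without globals(): each entry is one row repeated pr times.
d = {f"seq_{i}": np.tile(sequences[i], (pr, 1)) for i in range(n_rows)}

# Stack once at the end instead of calling np.concatenate inside a loop.
pool = np.vstack([d[f"seq_{i}"] for i in range(n_rows)])

# Possible shortcut if the dictionary is not needed: repeat each row pr times.
pool_direct = np.repeat(sequences, pr, axis=0)

assert pool.shape == (n_rows * pr, len_seq)
assert np.array_equal(pool, pool_direct)
```

Both versions copy each element once, so the cost grows linearly with the size of pool rather than quadratically.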