How can I create np.ndarray from four arrays of different lengths without using loops?

I have 5 arrays, from which I need to create two large arrays of shape (1000, 5000) to use in regression analysis. I'm currently using the following code.

# data shapes are as follows
# bbeta.shape  == (5000, 2)
# rand1.shape  == (1000,)
# rand2.shape  == (1000,)
# rand3.shape  == (1000,)
# depths.shape == (1000,)

# creating x and y for regression
y = np.empty((1000, 5000))  # preallocate the outputs before filling them
x = np.empty((1000, 5000))
for i in np.arange(0,1000,1):
    for j in np.arange(0,5000,1):
        y[i,j] = np.log((rand1[i] * (depths[i]/3500)**bbeta[j,0]))
        x[i,j] = np.log(rand2[i] + (bbeta[j,1] * rand3[i]))

This code takes ~30 seconds to run, which is fine I guess, but I need to run it over 1000 times to find the bootstrap standard error, which means my code will currently take > 8 hours. I've cut down the number of created datasets significantly (from bbeta.shape = (17801, 2)), but it's not sufficient.

Ideally I'd like to stick with numpy (or at least use something that is fast to convert to and from numpy), as the rest of my code is written using numpy.

I wondered if anyone knew of a way to do the same thing as above but faster, as I'm aware that loops aren't computationally efficient. I've looked through Stack Overflow but I couldn't find an answer to this question (at least not one I could recognise as helping me).

Sorry if any of this isn't clear - I'm dyslexic and new-ish to Python. Any help anyone can give me would be greatly appreciated.

You could take advantage of broadcasting and do the following:

y2 = np.power((depths / 3500)[:, np.newaxis], bbeta[:, 0])
y2 = np.log(rand1[:, np.newaxis] * y2)

x2 = bbeta[:, 1] * rand3[:, np.newaxis]
x2 = np.log(rand2[:, np.newaxis] + x2)

print(np.allclose(y, y2))
print(np.allclose(x, x2))

Output

True
True
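
To see why this works, here is a minimal sketch of the broadcasting rule on tiny arrays. The names `col`, `row` and `result` are made up for illustration: indexing with `[:, np.newaxis]` turns a 1-D array of length n into an (n, 1) column, and combining that with a 1-D array of length m gives an (n, m) result, just as (1000, 1) combined with (5000,) gives (1000, 5000) above.

```python
import numpy as np

# A 1-D array of length 3 turned into a column of shape (3, 1)
col = np.arange(3.0)[:, np.newaxis]
# A plain 1-D array of shape (2,); it is treated as a row of shape (1, 2)
row = np.array([10.0, 20.0])

# (3, 1) combined with (2,) broadcasts to (3, 2):
# result[i, j] == col[i, 0] + row[j], with no explicit loop
result = col + row

print(col.shape, row.shape, result.shape)
```

Each element of `result` pairs one entry of `col` with one entry of `row`, which is exactly what the nested `for i` / `for j` loops in the question compute one element at a time.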

The test data was generated with the following code:

bbeta = np.random.random((5000, 2))
rand1 = np.random.random((1000))
rand2 = np.random.random((1000))
rand3 = np.random.random((1000))
depths = np.random.random((1000))
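
For anyone who wants to check the equivalence end to end, the pieces above can be put together as a single script. This is only a sketch under some assumptions: the function names `build_xy_loop` and `build_xy_broadcast` are hypothetical, the sizes are shrunk (50 rows, 20 parameter sets) so the loop version finishes quickly, and the inputs are drawn from the range [0.1, 1.0) to keep every logarithm finite.

```python
import numpy as np


def build_xy_loop(bbeta, rand1, rand2, rand3, depths):
    """Reference version: the nested loops from the question."""
    n, m = rand1.shape[0], bbeta.shape[0]
    y = np.empty((n, m))
    x = np.empty((n, m))
    for i in range(n):
        for j in range(m):
            y[i, j] = np.log(rand1[i] * (depths[i] / 3500) ** bbeta[j, 0])
            x[i, j] = np.log(rand2[i] + bbeta[j, 1] * rand3[i])
    return x, y


def build_xy_broadcast(bbeta, rand1, rand2, rand3, depths):
    """Vectorized version: (n, 1) columns broadcast against (m,) rows."""
    y = np.power((depths / 3500)[:, np.newaxis], bbeta[:, 0])
    y = np.log(rand1[:, np.newaxis] * y)
    x = bbeta[:, 1] * rand3[:, np.newaxis]
    x = np.log(rand2[:, np.newaxis] + x)
    return x, y


# Small illustrative sizes (assumption); the question uses n=1000, m=5000
n, m = 50, 20
rng = np.random.default_rng(0)
bbeta = rng.uniform(0.1, 1.0, size=(m, 2))
rand1 = rng.uniform(0.1, 1.0, size=n)
rand2 = rng.uniform(0.1, 1.0, size=n)
rand3 = rng.uniform(0.1, 1.0, size=n)
depths = rng.uniform(0.1, 1.0, size=n)

x_loop, y_loop = build_xy_loop(bbeta, rand1, rand2, rand3, depths)
x_vec, y_vec = build_xy_broadcast(bbeta, rand1, rand2, rand3, depths)

print(np.allclose(x_loop, x_vec), np.allclose(y_loop, y_vec))
```

Wrapping the broadcast version in a function also makes it easy to call once per bootstrap resample, since each call only needs the resampled input arrays.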
