What is the most efficient way to loop through a 2D array in Python?

I am new to Python and machine learning, and I could not find a good approach online. I have a large 2D array (distance_matrix.shape = (47, 1328624)). I wrote the code below, but it takes too long to run; the nested for loops are what make it slow.

import pandas as pd

distance_matrix = [
    [0.21218192, 0.12845819, 0.54545613, 0.92464129, 0.12051526, 0.0870853],
    [0.2168166, 0.11174682, 0.58193855, 0.93949729, 0.08060061, 0.11963891],
    [0.23996999, 0.17554854, 0.60833433, 0.93914766, 0.11631545, 0.2036373],
]

iskeleler = pd.DataFrame({
    'lat':[40.992752,41.083202,41.173462],
    'lon':[29.023165,29.066652,29.088163],
    'name':['Kadıköy','AnadoluHisarı','AnadoluKavağı']
}, dtype=str)

# Nested Python loops over every element: slow for a (47, 1328624) array
for i in range(len(distance_matrix)):
    for j in range(len(distance_matrix[0])):
        if distance_matrix[i][j] < 1:
            # Note: this assignment overwrites 'Address' on every match
            # rather than accumulating a sum
            iskeleler.loc[i,'Address'] = distance_matrix[i][j]

print(iskeleler)

To explain, I am sharing the first 5 rows of my array and showing my dataframe.

[Images: İskeleler dataframe; distance_matrix]

The "İskeleler" dataframe has 47 rows. For each row i, I want to look at all the values in row i of distance_matrix, add up the ones that are less than 1, and store that sum in the 'Address' column of row i of "İskeleler". For example, for the first row in the distance_matrix image, I want to compute 0.21218192 + 0.12845819 + 0.54545613 + ... and put the result in the 'Address' column of the corresponding row of the İskeleler dataframe.

My intent is to loop through distance_matrix and find the values that are smaller than 1. The code takes too long. How can I do this faster?

I think you mean this:

import numpy as np

# Set up some dummy data in range 0..100
distance = np.random.rand(47,1328624) * 100.0

# Boolean mask of all values < 1
mLessThan1 = distance<1

# Sum elements <1 across rows 
result = np.sum(distance*mLessThan1, axis=1)

That takes 168 ms on my Mac.

In [47]: %timeit res = np.sum(distance*mLessThan1, axis=1)
168 ms ± 914 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
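
To connect this back to the dataframe in the question, here is a minimal sketch that writes the per-row sums into the 'Address' column. It uses the three sample rows from the question rather than the full 47-row array, and it swaps the mask multiplication for np.where, which gives the same sums for finite values. The variable name row_sums is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample values copied from the question (3 rows instead of the full 47)
distance_matrix = np.array([
    [0.21218192, 0.12845819, 0.54545613, 0.92464129, 0.12051526, 0.0870853],
    [0.2168166, 0.11174682, 0.58193855, 0.93949729, 0.08060061, 0.11963891],
    [0.23996999, 0.17554854, 0.60833433, 0.93914766, 0.11631545, 0.2036373],
])

iskeleler = pd.DataFrame({
    'lat': [40.992752, 41.083202, 41.173462],
    'lon': [29.023165, 29.066652, 29.088163],
    'name': ['Kadıköy', 'AnadoluHisarı', 'AnadoluKavağı'],
}, dtype=str)

# Keep values < 1, replace everything else with 0, then sum along each row
row_sums = np.where(distance_matrix < 1, distance_matrix, 0.0).sum(axis=1)

# One vectorized assignment; row i of the matrix maps to row i of the dataframe
iskeleler['Address'] = row_sums

print(iskeleler)
```

This assumes the dataframe has exactly as many rows as distance_matrix, in the same order; pandas raises a ValueError if the lengths differ.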
