Sorting through very large numpy arrays

Question

I'm new to Python (3.6) and currently using it for scientific data analysis. I searched but couldn't find anything helpful, as I'm unsure of the terms used to describe this type of sorting. I have a numpy array (list_a) of length ~4,000,000 with values from 0 to ~40,000,000. I also have a second array of the same length (list_b) with values between 0 and 1000, where each element corresponds to the element at the same index in the first array. I'm looking to create a new numpy array (list_c) with values from list_a that have their corresponding value in list_b outside of a defined timegate.

import numpy as np
def function(list_a,list_b,timegate=(0,200)):
    list_c = np.array([])
    for x in range(len(list_a)):
        if list_a[x] in range (timegate[0],timegate[1]):
            list_c=np.append(list_c, list_b[x])
    return list_c

I have written this function, which works for much shorter arrays; however, as the arrays get longer it slows down considerably. I'm looking for a way to speed this process up if possible. Any help would be greatly appreciated.

Thanks,

Jordan

Answer 1

You can easily vectorize your operation with a boolean mask instead of looping and calling np.append on every match:

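def function(a, b, gate=(0, 200)):
    # Boolean mask: True where a lies within the gate (both bounds inclusive,
    # whereas range() in the question excludes gate[1]).
    return b[(a >= gate[0]) & (a <= gate[1])]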
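
If you need exactly the same boundary behaviour as the loop in the question, or the values outside the gate as the question text describes, a variant along the following lines may help. This is only a sketch: the name select_by_gate and the inside flag are made up for illustration, and it assumes the gated array holds integer values, so that a half-open comparison matches the range() check.

```python
import numpy as np

def select_by_gate(a, b, gate=(0, 200), inside=True):
    """Return the values of b whose paired value in a lies in [gate[0], gate[1]).

    Hypothetical helper for illustration; with inside=False the selection
    is inverted and the values outside the gate are returned instead.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    # Half-open interval, mirroring range(gate[0], gate[1]) for integer data.
    mask = (a >= gate[0]) & (a < gate[1])
    # ~mask flips the boolean mask, selecting everything outside the gate.
    return b[mask] if inside else b[~mask]

# Small made-up arrays, just to show the boundary handling.
a = np.array([5, 150, 200, 4000, 199])
b = np.array([10, 20, 30, 40, 50])

inside = select_by_gate(a, b)                  # array([10, 20, 50])
outside = select_by_gate(a, b, inside=False)   # array([30, 40])
```

The mask is built in a single pass over the array and the result is allocated once, whereas np.append copies the whole result array on every call, which is most likely why the loop slows down so much as the arrays grow.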

solution1
0 2018-03-09 15:06:04