
Faster way to iteratively replace values in relatively large NumPy array

I have a relatively large NumPy array of shape (1212, 1612) that contains OBJECTID values, each corresponding to a unique segment of an RGB image. I also have a separate pandas DataFrame of 107305 rows that lists every OBJECTID value together with its vegetation type class, as produced by a Random Forest classification. I want to replace each OBJECTID value in the NumPy array with the vegetation type class that corresponds to that OBJECTID.

The NumPy array ('array') looks like this:

[1,1,1,1,2,2,2,2,3,3,...
 1,1,1,1,1,2,2,2,3,3,...
 1,1,2,4,4,4,2,2,3,3,...] # values 1-4 correspond to OBJECTID

The pandas DataFrame ('vegdata') looks like this:

VEG_TYPE    OBJECTID
Shrub (S)   1
Grass (G)   2
Moss  (M)   3
Grass (G)   4
...   ...   ...

What I want the data to eventually look like is as follows:

[S,S,S,S,G,G,G,G,M,M,...
 S,S,S,S,S,G,G,G,M,M,...
 S,S,G,G,G,G,G,G,M,M,...]

What I am currently doing is:

for row in vegdata.itertuples():
    np.where(array[[array == row.OBJECTID]], row.VEG_TYPE,
             array[[array == row.OBJECTID]])

This snippet works, but it is very slow: it takes approximately 30 seconds per 1000 rows of the vegdata DataFrame, and I have 107305 rows (about ten times as many for some images). I have not been able to find a suitable alternative so far, so I am wondering whether anyone knows a faster way to do this.

I am new to StackOverflow, so I tried to make my question as clear as possible by providing some code snippets. If anything is unclear, please let me know. Thanks so much in advance!

Here you go:

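import numpy as np

VEG_TYPE = ['Shrub (S)', 'Grass (G)', 'Moss  (M)', 'Grass (G)']
OBJECTID = [1, 2, 3, 4]

# Map each OBJECTID to its vegetation type.
mapping = dict(zip(OBJECTID, VEG_TYPE))

# Stand-in for the segment array: random OBJECTIDs in 1..4.
input_array = np.random.randint(1, 5, (10, 10))

# Output array of fixed-width Unicode strings (up to 100 characters).
out = np.empty(input_array.shape, dtype=np.dtype('U100'))
for key, val in mapping.items():
    # The boolean mask assigns the label to every matching cell at once.
    out[input_array == key] = val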
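
Note that this loop still makes one full pass over the array per OBJECTID, so with 107305 IDs it may remain slow. Below is a minimal sketch of an alternative that does the whole replacement in a single indexing step. It assumes the OBJECTIDs are non-negative integers and that the largest one is small enough to allocate a lookup table of that length. The function name `relabel_with_lookup` and the `fill` parameter are made up for illustration and are not part of the original post.

```python
import numpy as np
import pandas as pd


def relabel_with_lookup(id_array, vegdata, fill=""):
    """Replace every OBJECTID in id_array with its VEG_TYPE in one indexing pass.

    Assumes OBJECTID values are non-negative integers (hypothetical helper).
    """
    ids = vegdata["OBJECTID"].to_numpy()
    labels = np.asarray(vegdata["VEG_TYPE"].tolist(), dtype=str)

    # Lookup table indexed directly by OBJECTID. IDs that do not occur in
    # vegdata keep the fill value (truncated to the widest label if longer).
    size = max(int(ids.max()), int(id_array.max())) + 1
    lut = np.full(size, fill, dtype=labels.dtype)
    lut[ids] = labels

    # Integer-array indexing maps the whole image at once.
    return lut[id_array]


vegdata = pd.DataFrame({
    "VEG_TYPE": ["Shrub (S)", "Grass (G)", "Moss  (M)", "Grass (G)"],
    "OBJECTID": [1, 2, 3, 4],
})
array = np.array([[1, 1, 2, 2, 3],
                  [1, 4, 4, 2, 3]])

result = relabel_with_lookup(array, vegdata)
```

Here `result` has the same shape as `array` and holds the vegetation type strings. If the OBJECTIDs were very large or sparse, a table of length max(OBJECTID) + 1 could waste memory, and a different mapping strategy would likely be a better fit.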

