
How to efficiently mutate a certain number of values in an array?

Given an initial 2-D array:

initial = [
 [0.6711999773979187, 0.1949000060558319],
 [-0.09300000220537186, 0.310699999332428],
 [-0.03889999911189079, 0.2736999988555908],
 [-0.6984000205993652, 0.6407999992370605],
 [-0.43619999289512634, 0.5810999870300293],
 [0.2825999855995178, 0.21310000121593475],
 [0.5551999807357788, -0.18289999663829803],
 [0.3447999954223633, 0.2071000039577484],
 [-0.1995999962091446, -0.5139999985694885],
 [-0.24400000274181366, 0.3154999911785126]]

The goal is to multiply some randomly chosen values inside the array by a random percentage. Let's say only 3 random values get scaled by a random multiplier; we should get something like this:

output = [
 [0.6711999773979187, 0.52],
 [-0.09300000220537186, 0.310699999332428],
 [-0.03889999911189079, 0.2736999988555908],
 [-0.6984000205993652, 0.6407999992370605],
 [-0.43619999289512634, 0.5810999870300293],
 [0.84, 0.21310000121593475],
 [0.5551999807357788, -0.18289999663829803],
 [0.3447999954223633, 0.2071000039577484],
 [-0.1995999962091446, 0.21],
 [-0.24400000274181366, 0.3154999911785126]]

I've tried doing this:

import random
import numpy as np

def mutate(array2d, num_changes):
    row, col = array2d.shape  # use the argument, not the global `initial`
    for _ in range(num_changes):
        rand_row = np.random.randint(row)
        rand_col = np.random.randint(col)
        cell_value = array2d[rand_row][rand_col]
        array2d[rand_row][rand_col] = random.uniform(0, 1) * cell_value
    return array2d

That works for 2-D arrays, but there's a chance that the same value is mutated more than once =(

I also don't think it's efficient, and it only works on 2-D arrays.

Is there a way to do such a "mutation" for an array of any shape, and more efficiently?

There's no restriction on which values the "mutation" can pick, but the number of mutated values must be exactly the number the user specifies.

One fairly simple way would be to work with a raveled view of the array. You can generate all your numbers at once that way, and make it easier to guarantee that you won't process the same index twice in one call:

def mutate(array_anyd, num_changes):
    raveled = array_anyd.reshape(-1)
    indices = np.random.choice(raveled.size, size=num_changes, replace=False)
    values = np.random.uniform(0, 1, size=num_changes)
    raveled[indices] *= values

I use array_anyd.reshape(-1) in favor of array_anyd.ravel() because, according to the docs, the former is less likely to make an inadvertent copy.
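To make the view-versus-copy distinction concrete, here is a small sketch that uses np.shares_memory to check whether the flattened result still refers to the original data. The array names are made up for illustration.

```python
import numpy as np

a = np.arange(12.0).reshape(3, 4)

# C-contiguous array: reshape(-1) returns a view, so in-place edits reach `a`.
contiguous_is_view = np.shares_memory(a, a.reshape(-1))

# Transposed array: flattening it in C order cannot be done without copying.
transposed_is_view = np.shares_memory(a.T, a.T.reshape(-1))

# Strided 1-D slice: reshape(-1) can stay a view, while ravel() returns a
# contiguous array and therefore copies.
s = a[0, ::2]
strided_reshape_is_view = np.shares_memory(s, s.reshape(-1))
strided_ravel_is_view = np.shares_memory(s, s.ravel())

print(contiguous_is_view, transposed_is_view)
print(strided_reshape_is_view, strided_ravel_is_view)
```

When the check is False, multiplying the flattened array in place would modify only the copy and leave the input untouched.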

There is of course still such a possibility. You can add an extra check to write back if you need to. A more efficient way would be to use np.unravel_index to avoid creating a view to begin with:

def mutate(array_anyd, num_changes):
    # Pick distinct flat positions, then convert them to N-d index tuples
    indices = np.random.choice(array_anyd.size, size=num_changes, replace=False)
    indices = np.unravel_index(indices, array_anyd.shape)
    values = np.random.uniform(0, 1, size=num_changes)
    array_anyd[indices] *= values

There is no need to return anything because the modification is done in-place. Conventionally, such functions do not return anything. See for example list.sort vs sorted.
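As a quick check of the np.unravel_index approach, the sketch below applies it to a non-contiguous 3-D view and counts how many elements of the underlying array changed. The test array, the seed, and the variable names are assumptions chosen for illustration; the input values are nonzero so that every scaled element actually differs from its original.

```python
import numpy as np

def mutate(array_anyd, num_changes):
    # Distinct flat positions, converted to a tuple of per-axis index arrays
    flat = np.random.choice(array_anyd.size, size=num_changes, replace=False)
    idx = np.unravel_index(flat, array_anyd.shape)
    array_anyd[idx] *= np.random.uniform(0, 1, size=num_changes)

np.random.seed(0)  # only to make the run repeatable
base = np.arange(1.0, 25.0).reshape(2, 3, 4)
original = base.copy()

view = base.transpose(2, 0, 1)  # non-contiguous view of `base`
mutate(view, 5)

# Indexing the view directly writes through to `base`, with no flattening step.
num_changed = np.count_nonzero(base != original)
print(num_changed)
```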

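A different solution is to use np.random.shuffle instead of np.random.choice. It works on an array of any shape. Note that, unlike the in-place versions above, it returns a new array, and each selected value is scaled by a factor of 1 + u, with u drawn from [0, 1), rather than by u itself.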

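def mutate(arrayIn, num_changes):
    # One multiplier slot per element; zero means "leave unchanged"
    mult = np.zeros(arrayIn.size)
    mult[:num_changes] = np.random.uniform(0, 1, num_changes)
    # Shuffling scatters the nonzero multipliers to random positions
    np.random.shuffle(mult)
    mult = mult.reshape(arrayIn.shape)
    # Builds a new array; the caller's array is not modified
    arrayIn = arrayIn + mult * arrayIn
    return arrayIn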
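The shuffle version adds mult * arrayIn to the original, so the selected values grow by a random fraction. If they should instead be multiplied by the random factor itself, as in the question, one possible variant fills the multiplier array with ones. This is only a sketch; the name mutate_scaled and the test data are made up for illustration.

```python
import numpy as np

def mutate_scaled(array_in, num_changes):
    # A multiplier of one leaves an element unchanged
    mult = np.ones(array_in.size)
    mult[:num_changes] = np.random.uniform(0, 1, num_changes)
    # Scatter the random multipliers to random positions
    np.random.shuffle(mult)
    # Returns a new array; the input is left untouched
    return array_in * mult.reshape(array_in.shape)

np.random.seed(1)  # only to make the run repeatable
arr = np.arange(1.0, 13.0).reshape(3, 4)
result = mutate_scaled(arr, 3)
num_changed = np.count_nonzero(result != arr)
print(num_changed)
```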
