Randomly selecting a % of elements in a string and changing the value

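I have an array of string values and need to loop over them, randomly replacing 5% of the characters in each element: a 1 should be flipped to 0, and a 0 should be flipped to 1.

The array of string values looks like this: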

['10011000000100000000011101100010001000110111101100100101100000111010000110011111001101001110100111110110000110001001010001010001110000000000000111110000111100010011101001011001111111011010001001100100110110000001000001010100111111110010011001100001001100011001111010010011000101000101111001100000110011101100000010110111011000010001111011010000111010100001101011000110111000000010000000111100010100100110101101111011001010000001110010110100011110000010001101110001101000100011001110101000100011111010',
 '11010000000110110011011110111010011011111010000101101101111010101000100000001010011100011011101111001000100000011110000011001100101011100111111001001101111110001101001100010111000100100010010111001110010110010101010110100000110011011110100110010011110000101001111111001001001101011000111001101101011000111101010010000011001001011110011010101111110001010100001011000001011110001011100100010011001101111100001111101000000010001010001100001010000010000000000001010101001110110111000010010001001001010101',
 '10011000000100000000011101100010001000110111101100100101100000111010000110011111001101001110100111110110000110001001010001010001110000000000000111110000111100010011101001011001111111011010001001000001011010100011000101000001100101000101010000001100111100101000011010000001011000000000000000011010100111100111010001111010000101100101010000110011111011111110100011111000001110111111001011011111101011110100000011101101101110010101001010100110111010000111000000111000110010110110001101111010011110000111',
 '11010000000110110011011110111010011011111010000101101101111010101000100000001010011100011011101111001000100000011110000011001100101011100111111001001101111110001101001100010111000100100010010111001110010110010101010110100000110011011110100110010011110000101001111111001001001101011000111001101101011000111101010010011100101111001010010000010010101101001000001111010110000111110100100001101101111011110101001000001101101100110110001110011000010000110110011100100001001101011010101100010011110111000000']

In effect, 5% of the values in each string should change from 0 to 1 or vice versa.

Try this loop:

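import random

# l is the list of binary strings from the question
for idx, i in enumerate(l):
    y = list(i)  # strings are immutable, so edit a list of characters
    # pick 5% of the positions, without repetition
    for x in random.sample(range(len(i)), (len(i) * 5) // 100):
        y[x] = str(abs(int(y[x]) - 1))  # '1' -> '0', '0' -> '1'
    l[idx] = ''.join(y)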

This flips 1 to 0 and vice versa, and only for 5% of the characters in each string.
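One way to check that claim is to wrap the loop in a function and count the positions that differ afterwards. This is only a minimal sketch: the names flip_percent and hamming are made up for illustration and do not appear in the answer.

```python
import random


def flip_percent(s, percent=5):
    """Return a copy of the binary string s with percent% of its characters flipped."""
    chars = list(s)
    k = (len(s) * percent) // 100
    # random.sample returns distinct positions, so each one is flipped exactly once
    for pos in random.sample(range(len(s)), k):
        chars[pos] = '0' if chars[pos] == '1' else '1'
    return ''.join(chars)


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


original = '10' * 250  # 500 characters, like the strings in the question
flipped = flip_percent(original)
print(hamming(original, flipped))  # 25, i.e. 5% of 500
```

Because the positions are distinct, the number of differing characters is always exactly (len(s) * percent) // 100.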

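Use random.choices to get 5% of the indices. Note that random.choices samples with replacement, so the same index can be picked more than once; random.sample returns distinct indices.

import random
[random.choices(range(len(s)), k=int(len(s) * 0.05)) for s in arr]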
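The comprehension above only produces lists of indices; it does not flip anything yet. The following sketch shows one possible way to apply such indices and how sampling with and without replacement can differ. The helper flip_at is a hypothetical name introduced here, not part of the answer.

```python
import random


def flip_at(s, indices):
    """Flip the bits of s at the given positions (each position at most once)."""
    wanted = set(indices)  # duplicates collapse, so no bit is flipped back
    return ''.join(
        ('0' if c == '1' else '1') if i in wanted else c
        for i, c in enumerate(s)
    )


s = '0' * 500
k = int(len(s) * 0.05)

with_replacement = random.choices(range(len(s)), k=k)   # indices may repeat
without_replacement = random.sample(range(len(s)), k)   # indices are distinct

print(flip_at(s, with_replacement).count('1'))     # at most k
print(flip_at(s, without_replacement).count('1'))  # exactly k
```

If exactly 5% of the characters must change, random.sample is the safer choice of the two.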

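To generate the new array of strings efficiently, you have to avoid rebuilding the string on every modification (strings are immutable, so each change creates a new string). I therefore offer my solution (method2), which could be optimized further if needed.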
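As a side illustration of the immutability point, and not the method benchmarked in this answer, the bits can also be flipped in a single mutable buffer. The sketch below assumes the strings contain only the ASCII characters '0' and '1'; flip_with_bytearray is a hypothetical name.

```python
import random


def flip_with_bytearray(s, percent=0.05):
    """Flip a fraction `percent` of the bits of s using one mutable buffer."""
    buf = bytearray(s, 'ascii')  # mutable copy; no new string per change
    k = int(percent * len(buf))
    for pos in random.sample(range(len(buf)), k):
        # ord('0') == 48 and ord('1') == 49 differ only in the lowest bit,
        # so XOR with 1 turns '0' into '1' and '1' into '0'
        buf[pos] ^= 1
    return buf.decode('ascii')


print(flip_with_bytearray('1' * 200).count('0'))  # number of flipped bits
```

Only one new string is created per input, at the final decode step.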

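Methods 1 and 2 are very close; the only difference is that method 2 uses a list comprehension instead of a for loop.

Method 3 is slower because generating a modified string of length 500 takes less time than starting the worker processes. For much longer strings, however, this approach may be the fastest.

Method 4 is from U9-Forward.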

#!/usr/bin/env python3
import random
import timeit
from typing import List
from multiprocessing.pool import Pool
from statistics import mean


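def get_partial_str(my_binary_string: str, mutated_bit):
    """Yield the pieces of the new string, left to right."""
    start = 0
    for index, bit_value in mutated_bit:
        # Unchanged slice up to the mutated position, then the replacement bit.
        yield my_binary_string[start:index] + str(bit_value)
        start = index + 1
    # Remainder after the last mutated position (the whole string if none).
    yield my_binary_string[start:]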


def replace_x_percent(my_binary_string: str, percent: float):
    nb__bit_to_replace = int(percent * len(my_binary_string))
    # Sorted, distinct positions so the string can be rebuilt left to right.
    index_to_mutate = sorted(
        random.sample(range(len(my_binary_string)), nb__bit_to_replace))
    # Compare against the character '1' (not the integer 1) to flip each bit.
    mutated_bit = ((x, 0) if my_binary_string[x] == '1' else (x, 1)
                   for x in index_to_mutate)
    return ''.join(get_partial_str(my_binary_string, mutated_bit))


def method1(arr: List[str]):
    for i, my_binary_string in enumerate(arr):
        arr[i] = replace_x_percent(my_binary_string, 0.05)
    return arr


def method2(arr: List[str]):
    arr = [replace_x_percent(my_binary_string, 0.05)
           for my_binary_string in arr]
    return arr


def method3(arr: List[str]):
    with Pool(processes=4) as pool:
        arr = pool.starmap(replace_x_percent, ((my_binary_string, 0.05)
                                               for my_binary_string in arr))
    return arr


def method4(arr: List[str]):
    for idx, i in enumerate(arr):
        y = list(i)
        # Flip 5% of the positions, chosen without repetition.
        for x in random.sample(range(len(i)), (len(i) * 5) // 100):
            y[x] = str(abs(int(y[x]) - 1))
        arr[idx] = ''.join(y)
    return arr


if __name__ == '__main__':
    arr = [
        '10011000000100000000011101100010001000110111101100100101100000111010000110011111001101001110100111110110000110001001010001010001110000000000000111110000111100010011101001011001111111011010001001100100110110000001000001010100111111110010011001100001001100011001111010010011000101000101111001100000110011101100000010110111011000010001111011010000111010100001101011000110111000000010000000111100010100100110101101111011001010000001110010110100011110000010001101110001101000100011001110101000100011111010',
        '11010000000110110011011110111010011011111010000101101101111010101000100000001010011100011011101111001000100000011110000011001100101011100111111001001101111110001101001100010111000100100010010111001110010110010101010110100000110011011110100110010011110000101001111111001001001101011000111001101101011000111101010010000011001001011110011010101111110001010100001011000001011110001011100100010011001101111100001111101000000010001010001100001010000010000000000001010101001110110111000010010001001001010101',
        '10011000000100000000011101100010001000110111101100100101100000111010000110011111001101001110100111110110000110001001010001010001110000000000000111110000111100010011101001011001111111011010001001000001011010100011000101000001100101000101010000001100111100101000011010000001011000000000000000011010100111100111010001111010000101100101010000110011111011111110100011111000001110111111001011011111101011110100000011101101101110010101001010100110111010000111000000111000110010110110001101111010011110000111',
        '11010000000110110011011110111010011011111010000101101101111010101000100000001010011100011011101111001000100000011110000011001100101011100111111001001101111110001101001100010111000100100010010111001110010110010101010110100000110011011110100110010011110000101001111111001001001101011000111001101101011000111101010010011100101111001010010000010010101101001000001111010110000111110100100001101101111011110101001000001101101100110110001110011000010000110110011100100001001101011010101100010011110111000000']

    print( 'Starting the benchmark:' )
    t1 = mean(timeit.repeat('method1(arr)', number=1, repeat=10, globals=globals()))
    print('- method 1: {:.5f}'.format(t1))
    t2 = mean(timeit.repeat('method2(arr)', number=1, repeat=10, globals=globals()))
    print('- method 2: {:.5f}'.format(t2))
    t3 = mean(timeit.repeat('method3(arr)', number=1, repeat=10, globals=globals()))
    print('- method 3: {:.5f}'.format(t3))
    t4 = mean(timeit.repeat('method4(arr)', number=1, repeat=10, globals=globals()))
    print('- method 4: {:.5f}'.format(t4))

Results:

Starting the benchmark:
- method 1: 0.00038
- method 2: 0.00034
- method 3: 0.11711
- method 4: 0.00207
