
Replace characters at a list of indices in a Python string/list

I have a dictionary as follows:

s_dict = {'s' : 'ATGCGTGACGTGA'}

I want to change the string stored under the key 's' at positions 4, 6, 7 and 10 (0-based) to h, k, p and r. The changes are given as 'position_character' strings:

pos_change = {'s' : ['4_h', '6_k', '7_p', '10_r']}

The only way I can think of is a loop:

for key in s_dict:
    for position in pos_change[key]:
        pos = int(position.split("_")[0])
        char = position.split("_")[1]
        l = list(s_dict[key])
        l[pos]= char
        s_dict[key] = "".join(l)

Output:

s_dict = {'s': 'ATGChTkpCGrGA'}

This works fine, but my actual s_dict file is about 1.5 GB. Is there a faster way of replacing a list of characters at specific indices in a string or list?

Thanks!

As an alternative, you can use s_dict['s'] = '%s%s%s' % (s_dict['s'][:pos], char, s_dict['s'][pos+1:]) instead of converting to a list and joining:

In [1]: s_dict = {'s' : 'ATGCGTGACGTGA' * 10}
   ...: pos_change = {'s' : ['4_h', '6_k', '7_p', '10_r']}
   ...: 
   ...: def list_join():
   ...:     for key in s_dict:
   ...:         for position in pos_change[key]:
   ...:             pos = int(position.split("_")[0])
   ...:             char = position.split("_")[1]
   ...:             l = list(s_dict[key])
   ...:             l[pos]= char
   ...:             s_dict[key] = "".join(l)
   ...: 
   ...: def by_str():
   ...:     for key in s_dict:
   ...:         for position in pos_change[key]:
   ...:             pos = int(position.split("_")[0])
   ...:             char = position.split("_")[1]
   ...:             values = s_dict[key][:pos], char, s_dict[key][pos+1:]
   ...:             s_dict[key] = '%s%s%s' % values
   ...:             

In [2]: %timeit list_join()
11.7 µs ± 191 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [3]: %timeit by_str()
4.29 µs ± 46.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
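Note that both functions timed above still copy the whole string once per replacement, so the cost grows with (string length × number of edits). A minimal sketch of a variant that converts each string to a list only once, applies every edit in place, and joins once at the end; the function name apply_changes is hypothetical, and the 'position_character' format is assumed to be exactly as in the question:

```python
def apply_changes(s_dict, pos_change):
    """Return a new dict with every 'index_char' edit applied to each string."""
    result = {}
    for key, text in s_dict.items():
        chars = list(text)  # one copy per string, not one per edit
        for item in pos_change.get(key, []):
            pos, char = item.split("_", 1)  # '10_r' -> ('10', 'r')
            chars[int(pos)] = char
        result[key] = "".join(chars)  # one join per string
    return result


s_dict = {'s': 'ATGCGTGACGTGA'}
pos_change = {'s': ['4_h', '6_k', '7_p', '10_r']}
new_dict = apply_changes(s_dict, pos_change)
print(new_dict)  # {'s': 'ATGChTkpCGrGA'}
```

With many edits per string this should scale much better than either timed version, but it has not been benchmarked here, so measure it on your own data.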

Here is my take on your interesting problem:

s_dict = {'s' : 'ATGCGTGACGTGA'}    
pos_change = {'s' : ['4_h', '6_k', '7_p', '10_r']}

# First, convert `pos_change` into something easier to use
pos_change = {k: dict(x.split('_') for x in v) for k, v in pos_change.items()}
print(pos_change)  # {'s': {'4': 'h', '6': 'k', '7': 'p', '10': 'r'}}

# and then... 
for k, v in pos_change.items():
  temp = set(map(int, v))
  s_dict[k] = ''.join([x if i not in temp else pos_change[k][str(i)] for i, x in enumerate(s_dict[k])])

print(s_dict)  # {'s': 'ATGChTkpCGrGA'}
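The comprehension above visits every character of the string, and a list of characters holds one pointer per character (typically 8 bytes on 64-bit CPython), which matters for inputs in the gigabyte range. If the data is ASCII-only, which is an assumption that holds for the sample sequence but may not hold for yours, a bytearray gives a mutable buffer of one byte per character. A rough sketch, with replace_at as a hypothetical helper name:

```python
def replace_at(text, edits):
    """Apply 'index_char' edits to an ASCII-only string via a bytearray."""
    buf = bytearray(text, "ascii")  # raises UnicodeEncodeError on non-ASCII input
    for item in edits:
        pos, char = item.split("_", 1)
        buf[int(pos)] = ord(char)  # char is assumed to be a single ASCII character
    return buf.decode("ascii")


s_dict = {'s': 'ATGCGTGACGTGA'}
pos_change = {'s': ['4_h', '6_k', '7_p', '10_r']}
for key, edits in pos_change.items():
    s_dict[key] = replace_at(s_dict[key], edits)

print(s_dict)  # {'s': 'ATGChTkpCGrGA'}
```

Only the edited positions are touched, so the work per string is one copy in, one copy out, plus one assignment per edit.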


 