
Insert multiple substrings into a Python string by index 'at the same time'

Suppose I have a string

a = 'The dog in the street.' (so len(a)=22).
     01234567...  (just adding indices for extra illustration)

Now I want to change that string to include some arbitrary words in arbitrary places, say, from the (arbitrarily sized) dict:

d = {
        'w1': {'begin':'0', 'end':'3', 'w':'BIG'},
        'w2': {'begin':'4', 'end':'7', 'w':'BARKED'}
    }

where wx contains info about a word to insert, with the fields meaning:

  • begin: the start index of the word we want to insert after (inclusive)

  • end: the end index of the word we want to insert after (exclusive)

  • w: the word to insert

So 'applying' the dict d to string a, we would get:

a = 'TheBIGdogBARKEDin the street.'
     0123456789...

Note that, though I have ordered the dictionary values here so that the words to be inserted are in left-to-right order, this is not always the case.

I was initially trying to do this with something like:

for word in d.values():
    # 'end' is stored as a string, so convert it before slicing
    insertion_loc = int(word['end'])
    a = "{}{}{}".format(a[:insertion_loc], word['w'], a[insertion_loc:])

But each iteration changes the total length of the string, so the begin and end indices are no longer valid for the next word in the dict. The only other way that immediately comes to mind is to calculate a new offset for each insertion, based on the lengths of the substrings already inserted and on whether the current word goes before or after them (which seems like it would look a bit ugly).
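The offset bookkeeping described above need not be very ugly if the insertions are processed in left-to-right order, because every earlier insertion then lies before the current one and a single running total is enough. The following is only a minimal sketch; the function name `insert_with_offsets` is made up for illustration, and it assumes `end` is the insertion point and that pure insertion (keeping the original spaces) is wanted.

```python
def insert_with_offsets(text, insertions):
    """Insert each entry's 'w' at its original 'end' index, tracking the shift."""
    shift = 0  # total length of everything inserted so far
    # Sort by 'end' so all earlier insertions sit to the left of the current one.
    for entry in sorted(insertions.values(), key=lambda e: int(e['end'])):
        pos = int(entry['end']) + shift
        text = text[:pos] + entry['w'] + text[pos:]
        shift += len(entry['w'])
    return text


a = 'The dog in the street.'
# Deliberately listed out of left-to-right order.
d = {
    'w2': {'begin': '4', 'end': '7', 'w': 'BARKED'},
    'w1': {'begin': '0', 'end': '3', 'w': 'BIG'},
}
result = insert_with_offsets(a, d)
print(result)  # TheBIG dogBARKED in the street.
```

Because this inserts rather than replaces, the spaces of the original string are kept in the result.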

Is there another way to do this? Thanks.

You can insert from the end of the string toward the front, so that you never have to account for the shifted indices.
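A minimal sketch of that idea, assuming `end` is the insertion index: sorting the entries by `end` in descending order means each insertion only moves characters to its right, which have already been handled. The function name `insert_from_end` is hypothetical.

```python
def insert_from_end(text, insertions):
    """Apply insertions right to left so earlier indices stay valid."""
    ordered = sorted(insertions.values(), key=lambda e: int(e['end']), reverse=True)
    for entry in ordered:
        pos = int(entry['end'])
        # Only text to the right of pos shifts; smaller indices are untouched.
        text = text[:pos] + entry['w'] + text[pos:]
    return text


a = 'The dog in the street.'
d = {
    'w1': {'begin': '0', 'end': '3', 'w': 'BIG'},
    'w2': {'begin': '4', 'end': '7', 'w': 'BARKED'},
}
result = insert_from_end(a, d)
print(result)  # TheBIG dogBARKED in the street.
```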

You can use re to find the characters that occur at each d[word]['end'] index, replace them with "{}" placeholders, and then use str.format to fill those placeholders with the desired 'w' values:

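import re

s = "The dog.\n01234567"
d = {
    'w1': {'begin':'0', 'end':'3', 'w':'BIG'},
    'w2': {'begin':'7', 'end':'7', 'w':'BARKED'}
}
# Order the entries by position so the words fill the placeholders left to right.
entries = sorted(d.values(), key=lambda e: int(e['end']))
# Match the character found at each 'end' index (escaped, since '.' is special in a regex).
pattern = '|'.join(re.escape(s[int(e['end'])]) for e in entries)
final_s = re.sub(pattern, "{}", s).format(*(e['w'] for e in entries))
print(final_s)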

Output:

TheBIGdogBARKED
01234567
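One caveat: the regex approach replaces every occurrence of the matched characters, not just the ones at the given indices. On the question's full string 'The dog in the street.', both end indices hold a space, so all four spaces would become placeholders while only two words are supplied. An index-only alternative, sketched here under the assumption that `end` marks the insertion point, is to cut the string at each index and join the pieces in a single pass. The name `insert_by_slicing` is invented for this example.

```python
def insert_by_slicing(text, insertions):
    """Build the result in one pass from slices of the original string."""
    pieces = []
    prev = 0  # start of the slice not yet copied
    for entry in sorted(insertions.values(), key=lambda e: int(e['end'])):
        pos = int(entry['end'])
        pieces.append(text[prev:pos])  # original text up to the insertion point
        pieces.append(entry['w'])      # the inserted word
        prev = pos
    pieces.append(text[prev:])         # whatever remains after the last insertion
    return ''.join(pieces)


a = 'The dog in the street.'
d = {
    'w1': {'begin': '0', 'end': '3', 'w': 'BIG'},
    'w2': {'begin': '4', 'end': '7', 'w': 'BARKED'},
}
result = insert_by_slicing(a, d)
print(result)  # TheBIG dogBARKED in the street.
```

All indices refer to the original string here, so no offsets need adjusting, and the string is assembled once instead of being rebuilt on every insertion.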
