
Python - replace multiple matches in a string with different replacements

I have two text files and want to replace the XXX placeholders in the first file with the actual matches from the second file, in the order given in the second file.

The first text is a file with multiple lines and multiple placeholders in one line.

The European Union consists of the following states XXX, XXX, XXX, XXX, XXX, .... The three biggest nations within the European Union are XXX, XXX, XXX.

The second file is a list with one match per line:

Poland
Netherlands
Denmark
Spain
Italy
Germany
France

I'd like to have it replaced as follows:

The European Union consists of the following states Poland, Netherlands, Denmark, Spain, Italy, .... The three biggest nations within the European Union are Germany, France, XXX.

So far I've got this coded:

import re

file1 = open("text.txt")
file2 = open("countries.txt")
output = open("output.txt", "w")
countrylist = []

# Read one country name per line, dropping the trailing newline
for line in file2:
    countrylist.append(line.strip())

j = 0
for line in file1:
    if "XXX" in line:
        # re.sub replaces every XXX in the line with the same country
        line = re.sub("XXX", countrylist[j], line)
        j = j + 1
    output.write(line)
    output.flush()
output.close()

My problem is that the regular expression replacement is applied not only to the first occurrence but to every occurrence in the line. So my output looks like this right now:

The European Union consists of the following states Poland, Poland, Poland, Poland, Poland, .... The three biggest nations within the European Union are Netherlands, Netherlands, Netherlands.

How can I match every single occurrence of XXX to one line of my country list?

Thanks for any help!

In the re module, .sub(replacement, string[, count=0]) takes a count argument; count=1 should replace only the first occurrence.
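A minimal sketch of how that could look, assuming the country names contain neither the placeholder itself nor backslashes (the helper name replace_in_order is made up for this illustration):

```python
import re

def replace_in_order(text, replacements, placeholder="XXX"):
    """Replace placeholders one at a time, each with the next replacement."""
    pattern = re.compile(re.escape(placeholder))
    for value in replacements:
        # count=1 limits the substitution to the first remaining match
        text, replaced = pattern.subn(value, text, count=1)
        if replaced == 0:
            # No placeholders left, so the remaining values are unused
            break
    return text

print(replace_in_order("XXX, XXX and XXX", ["Poland", "Netherlands"]))
```

Each pass rescans the string from the start, so this is simple rather than fast; for long texts a callback passed to sub avoids the repeated scans.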

You can pass a function to sub, which is then called once for each match:

import re

countries = ['Poland', 'Netherlands', 'Denmark', 'Spain', 'Italy']

# The default argument is evaluated once, so every call shares one iterator.
def f(match, countriesIter=iter(countries)):
    return next(countriesIter)

line = "The European Union consists of the following states XXX, XXX, XXX, XXX, XXX"

print(re.compile('XXX').sub(f, line))

This will print:

The European Union consists of the following states Poland, Netherlands, Denmark, Spain, Italy
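The same idea can be stretched over several lines, as in the question, by sharing one iterator across all of them. The following is only a sketch: fill_placeholders and next_name are hypothetical names, the lines are held in a list instead of being read from text.txt and countries.txt, and leaving a placeholder untouched once the names run out is an assumption based on the desired output shown in the question.

```python
import re

def fill_placeholders(lines, names, placeholder="XXX"):
    """Replace each placeholder, across all lines, with the next name."""
    names_iter = iter(names)
    pattern = re.compile(re.escape(placeholder))

    def next_name(match):
        # Once the names are used up, keep the placeholder as it is
        return next(names_iter, match.group(0))

    return [pattern.sub(next_name, line) for line in lines]

text_lines = [
    "The European Union consists of the following states XXX, XXX, XXX, XXX, XXX, ....",
    "The three biggest nations within the European Union are XXX, XXX, XXX.",
]
country_names = ["Poland", "Netherlands", "Denmark", "Spain", "Italy", "Germany", "France"]

for line in fill_placeholders(text_lines, country_names):
    print(line)
```

Creating the iterator inside the function, rather than as a default argument, means each call starts again from the first name.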

Depending on your knowledge it might be better to use a global counter to step through the list of country names:

count = 0
def f(match):
  global count
  result = countries[count]
  count += 1
  return result

This is less elegant but much easier to understand if you have little experience with Python internals such as iterators and generators.
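If the global variable is a concern, the counter can also live in a small callable object. This is just one possible sketch; the class name Replacer is made up for the example, and keeping the placeholder when the list is exhausted is an assumption rather than part of the answer above.

```python
import re

class Replacer:
    """Callable that hands out replacement values in order."""

    def __init__(self, values):
        self.values = list(values)
        self.count = 0

    def __call__(self, match):
        if self.count >= len(self.values):
            # Nothing left to hand out: keep the matched text unchanged
            return match.group(0)
        result = self.values[self.count]
        self.count += 1
        return result

replacer = Replacer(["Poland", "Netherlands"])
result = re.sub("XXX", replacer, "states XXX, XXX, XXX")
print(result)
```

The counter works exactly like the global one, but each Replacer instance keeps its own position in the list.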
