
Regex to find only pairs of numbers and then concatenate value

I have a data set with records like the following

Tenochtitlan 1519
Tetzcoco 20
Tlacopan 21

I need a regex that returns only the numbers that are exactly two digits long (i.e. 20 and 21 in the example above), ultimately so I can add a prefix to those numbers and end up with:

Tenochtitlan 1519
Tetzcoco 1520
Tlacopan 1521

I've tried this, just having trouble with the match (matching '15' from the first record) and then getting the match as a string output:

list = ["Tenochtitlan 1519","Tetzcoco 20","Tlacopan 21"]
    
for x in list:
     m = re.compile("(\d\D*?){2}")
     match_val = m.search(x)
     concat = "15" + str(match_val)
     re.sub(str(match_val), x, concat)

for x in list:
    print(x)
     
 

Result -

Tenochtitlan 1519
Tetzcoco 20
Tlacopan 21

First, str(match_val) is not doing what you think it's doing. From the debugger:

(Pdb) str(match_val)
"<re.Match object; span=(13, 15), match='15'>"
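To get the matched text rather than the match object's repr, call .group() on the match. A minimal sketch using the regex from the question (the name matched_text is only illustrative; note that search() returns None when nothing matches, so real code would check for that first):

```python
import re

match_val = re.search(r"(\d\D*?){2}", "Tenochtitlan 1519")

# str() gives the repr of the match object, not the matched text
print(str(match_val))

# .group() (same as .group(0)) returns the text that matched
matched_text = match_val.group()
print(matched_text)  # 15
```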

Secondly, the value of x is never changed. Python strings are immutable, so re.sub() returns a new string and leaves the original untouched. Demonstrating in IPython:

In [1]: import re

In [2]: x = "string"

In [3]: re.sub("ing", "ingthing", x)
Out[3]: 'stringthing'

In [4]: x
Out[4]: 'string'

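You'll also run into difficulty replacing the original value in a for ... in loop: assigning to x inside the loop only rebinds the loop variable and leaves the list itself unchanged, so you need to build a new list or assign back by index.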
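A small sketch of the loop-variable problem, using str.upper() as a stand-in transformation (the names records and unchanged are illustrative, not from the question):

```python
records = ["Tetzcoco 20", "Tlacopan 21"]

# Rebinding the loop variable does not touch the list
for x in records:
    x = x.upper()
unchanged = list(records)  # still the original strings

# Assigning back by index does update the list
for i, x in enumerate(records):
    records[i] = x.upper()

print(unchanged)  # ['Tetzcoco 20', 'Tlacopan 21']
print(records)    # ['TETZCOCO 20', 'TLACOPAN 21']
```

Building a new list, as done further down, avoids mutating the input at all.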

Third, you've got your arguments to sub() in the wrong order. It goes: regex, replacement string, original string, i.e. re.sub(pattern, repl, string).
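One way to keep the order straight is to pass the arguments by keyword; pattern, repl and string are the parameter names in re.sub's signature. A short sketch reusing the strings from the IPython session:

```python
import re

# Keyword arguments make the role of each value explicit
result = re.sub(pattern="ing", repl="ingthing", string="string")
print(result)  # stringthing
```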

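Fourth, your original regex is unusual and probably not matching what you expect. (\d\D*?){2} matches a digit optionally followed by non-digits, twice, anywhere in the string, which is why it finds the '15' inside '1519'. \s\d\d$ or \s\d{2}$ (whitespace followed by exactly two digits at the end of the string) is probably closer to what you want.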
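To see the difference, here is a sketch that runs both patterns over the sample records. The helper first_match is hypothetical, written only for this comparison:

```python
import re

records = ["Tenochtitlan 1519", "Tetzcoco 20", "Tlacopan 21"]

original = re.compile(r"(\d\D*?){2}")  # regex from the question
anchored = re.compile(r"\s\d{2}$")     # whitespace + two digits at the end

def first_match(pattern, text):
    """Return the matched text, or None if the pattern does not match."""
    m = pattern.search(text)
    return m.group() if m else None

original_hits = [first_match(original, r) for r in records]
anchored_hits = [first_match(anchored, r) for r in records]

print(original_hits)  # ['15', '20', '21']
print(anchored_hits)  # [None, ' 20', ' 21']
```

The anchored pattern skips the four-digit record because the whitespace must be followed by exactly two digits and then the end of the string.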

One way to do this would be to use a capture group (parentheses) and a backreference (a backslash and a digit) to do the substitution all in one go:

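import re

# Renamed from "list" so the built-in list type is not shadowed
records = ["Tenochtitlan 1519", "Tetzcoco 20", "Tlacopan 21"]
new_list = []

for x in records:
    # Match whitespace plus exactly two digits at the end of the string;
    # \1 reinserts the captured digits after the "15" prefix.
    new_list.append(re.sub(r'\s(\d\d)$', r' 15\1', x))

for x in new_list:
    print(x)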

Output:

Tenochtitlan 1519
Tetzcoco 1520
Tlacopan 1521
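If the prefix needs to vary, the same substitution could be wrapped in a small function. This is only a sketch: add_prefix and its prefix parameter are hypothetical names, and a replacement function is used instead of a backreference string so the prefix is never interpreted as part of the replacement template.

```python
import re

# Whitespace followed by exactly two digits at the end of the string
TWO_DIGITS_AT_END = re.compile(r"\s(\d{2})$")

def add_prefix(records, prefix="15"):
    """Return a new list with the prefix added to trailing two-digit numbers."""
    return [
        TWO_DIGITS_AT_END.sub(lambda m: f" {prefix}{m.group(1)}", record)
        for record in records
    ]

fixed = add_prefix(["Tenochtitlan 1519", "Tetzcoco 20", "Tlacopan 21"])
print(fixed)  # ['Tenochtitlan 1519', 'Tetzcoco 1520', 'Tlacopan 1521']
```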


 