I have the following string:
str = "MMX Lions Television Inc"
And I need to convert it into:
conv_str = "2010 Lions Television Inc"
I have the following function to convert a roman numeral into its integer equivalent:
numeral_map = zip(
    (1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1),
    ('M', 'CM', 'D', 'CD', 'C', 'XC', 'L', 'XL', 'X', 'IX', 'V', 'IV', 'I')
)
def roman_to_int(n):
    n = unicode(n).upper()
    i = result = 0
    for integer, numeral in numeral_map:
        while n[i:i + len(numeral)] == numeral:
            result += integer
            i += len(numeral)
    return result
How would I use re.sub to get the correct string here?
(Note: I tried using the regex described here: "How do you match only valid roman numerals with a regular expression?" but it was not working.)
Always try the Python Package Index when looking for a common function or library.
It lists several modules related to the keyword 'roman'.
For example, 'romanclass' has a class that implements the conversion. Quoting the documentation:
So a programmer can say:
>>> import romanclass as roman
>>> two = roman.Roman(2)
>>> five = roman.Roman('V')
>>> print (two+five)
and the computer will print:
VII
re.sub() can accept a function as the replacement. The function receives a single argument, the Match object, and should return the replacement string. You already have a function that converts a Roman numeral string to an int, so this won't be difficult.
In your case you would want a function like this:
def roman_to_int_repl(match):
    return str(roman_to_int(match.group(0)))
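As a minimal illustration of the callable form on its own, here is a sketch that has nothing to do with Roman numerals. The double_repl name and the sample string are made up for this example, and it is written for Python 3:

```python
import re

def double_repl(match):
    # match.group(0) is the exact text that the pattern matched
    return str(int(match.group(0)) * 2)

# re.sub calls double_repl once per match and splices in whatever it returns
doubled = re.sub(r'\d+', double_repl, "3 apples and 12 pears")
print(doubled)  # 6 apples and 24 pears
```

The replacement function must return a string, which is why the result is wrapped in str().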
Now you can modify the regex from the question you linked so that it will find matches within a larger string:
import re
s = "MMX Lions Television Inc"
regex = re.compile(r'\b(?=[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b')
print(regex.sub(roman_to_int_repl, s))
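For anyone on Python 3, here is a self-contained sketch of the same idea. It assumes two small changes to the question's converter: unicode() becomes str(), and the zip(...) is replaced by a tuple of pairs, because in Python 3 zip returns a one-shot iterator that would be exhausted after the first call. The NUMERAL_MAP and ROMAN_RE names are only for this sketch:

```python
import re

# Largest values first, so subtractive pairs such as 'CM' are tried before 'C'.
NUMERAL_MAP = (
    (1000, 'M'), (900, 'CM'), (500, 'D'), (400, 'CD'),
    (100, 'C'), (90, 'XC'), (50, 'L'), (40, 'XL'),
    (10, 'X'), (9, 'IX'), (5, 'V'), (4, 'IV'), (1, 'I'),
)

def roman_to_int(n):
    n = str(n).upper()
    i = result = 0
    for integer, numeral in NUMERAL_MAP:
        # Consume this numeral as many times as it appears at position i
        while n[i:i + len(numeral)] == numeral:
            result += integer
            i += len(numeral)
    return result

ROMAN_RE = re.compile(
    r'\b(?=[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b'
)

def roman_to_int_repl(match):
    return str(roman_to_int(match.group(0)))

converted = ROMAN_RE.sub(roman_to_int_repl, "MMX Lions Television Inc")
print(converted)  # 2010 Lions Television Inc
```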
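Note that this regex can also match an empty string at the start of a word that is made up only of the letters MDCLXVI but is not a valid numeral, such as "LLC", which turns it into "0LLC". Here is a version of the regex that leaves "LLC" alone: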
regex = re.compile(r'\b(?!LLC)(?=[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b')
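You could also use the original regex with a modified replacement function. For "LLC" the regex matches an empty string rather than the whole word, so the function also has to leave empty matches untouched:
def roman_to_int_repl(match):
    numeral = match.group(0)
    # valid numerals to keep as words, e.g. "MIX"; add any others you don't want to replace
    exclude = set(["MIX"])
    # an empty match would otherwise be replaced by "0"
    if not numeral or numeral in exclude:
        return numeral
    return str(roman_to_int(numeral))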
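Another way to sidestep the "LLC" problem, sketched here for Python 3 under my own assumptions, is to split the work in two: a loose regex finds every word made of Roman numeral letters, and the strict pattern validates the whole word inside the replacement function. The names CANDIDATE_RE, STRICT_RE, roman_value and replace_valid_numerals are hypothetical, and roman_value is a compact stand-in for the question's roman_to_int that assumes uppercase input:

```python
import re

# Loose pattern: any whole word built only from Roman numeral letters
CANDIDATE_RE = re.compile(r'\b[MDCLXVI]+\b')
# Strict pattern: must match the entire word for it to count as a numeral
STRICT_RE = re.compile(r'M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})')
VALUES = {'I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000}

def roman_value(numeral):
    total = 0
    for i, ch in enumerate(numeral):
        value = VALUES[ch]
        # A smaller symbol directly before a larger one is subtracted (IV, CM, ...)
        if i + 1 < len(numeral) and value < VALUES[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total

def replace_valid_numerals(text, exclude=()):
    def repl(match):
        word = match.group(0)
        # Keep excluded words and anything that is not a well-formed numeral
        if word in exclude or not STRICT_RE.fullmatch(word):
            return word
        return str(roman_value(word))
    return CANDIDATE_RE.sub(repl, text)

print(replace_valid_numerals("Acme LLC MCMXCIX"))                  # Acme LLC 1999
print(replace_valid_numerals("MIX Studios MMX", exclude={"MIX"}))  # MIX Studios 2010
```

Because the loose pattern requires at least one letter, it never produces an empty match, so no stray "0" is inserted. A standalone word "I" would still be converted to 1 unless it is passed in exclude.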