
How to replace multiple case variants of a string with its lowercase form in Python?

I need to write a function that converts every case variant of a string to lowercase.

For example, a paragraph contains the word 'something' in different forms such as 'Something', 'SomeThing', 'SOMETHING', and 'SomeTHing'. I need to convert all of these to the lowercase 'something'.

How can I write a function that does this replacement?

You can split your paragraph into words, then use the slugify module to generate a slug of each word and compare it with "something". If there is a match, replace the word with its lowercase form.

In [1]: text = "This paragraph contains Something, SOMETHING, AND SomeTHing"

In [2]: from slugify import slugify

In [3]: for word in text.split(" "): # Split the text on spaces and iterate through the words
   ...:     if slugify(word) == "something": # The slug drops punctuation and case, so "Something," matches (Python 2 needs slugify(unicode(word)))
   ...:         text = text.replace(word, word.lower())

In [4]: text
Out[4]: 'This paragraph contains something, something, AND something'

Split the text into single words and check whether each word, written in lower case, is "something". If it is, replace the word with "something":

if word.lower() == "something":
    text = text.replace(word, "something")

To learn how to split text into words, see this question.
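The split-and-compare idea can be sketched as a complete function. This is a minimal sketch, assuming words are separated by whitespace. A plain split leaves punctuation attached to words ("Something,"), so the sketch strips surrounding punctuation before comparing. The name lowercase_word and its parameters are illustrative and not part of the original answer.

```python
import string


def lowercase_word(text, target="something"):
    """Replace case variants of `target` found as words in `text`.

    Hypothetical helper for illustration; assumes `target` is lowercase.
    """
    for word in text.split():
        # Strip surrounding punctuation so "Something," compares as "Something".
        core = word.strip(string.punctuation)
        if core.lower() == target:
            # str.replace swaps every occurrence of this exact variant.
            text = text.replace(core, target)
    return text


print(lowercase_word("This paragraph contains Something, SOMETHING, AND SomeTHing"))
```

Because str.replace works on substrings, a variant that also appears inside a longer word (for example "Somethings") would be changed there too.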

Another way is to slide a nine-character window (the length of "something") across the text and check whether the lowercased window equals "something":

text = "Many words: SoMeThInG, SOMEthING, someTHing"
for n in range(len(text)-8):
    if text[n:n+9].lower() == "something": # check whether "something" is here
        text = text.replace(text[n:n+9], "something")

print(text)
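The window width does not have to be hardcoded. Here is a rough sketch of the same sliding-window idea that derives the width from the target and rebuilds the string by slicing, so only the matched position changes. The function name lowercase_window is an assumption for illustration, and the target is assumed to be lowercase already.

```python
def lowercase_window(text, target="something"):
    """Slide a window of len(target) over `text` and lowercase each match.

    Hypothetical helper for illustration; assumes `target` is lowercase.
    """
    size = len(target)
    for n in range(len(text) - size + 1):
        window = text[n:n + size]
        if window.lower() == target:
            # The replacement has the same length as the window,
            # so the later indices stay valid.
            text = text[:n] + target + text[n + size:]
    return text


print(lowercase_window("Many words: SoMeThInG, SOMEthING, someTHing"))
```

Like the loop above, this matches "something" anywhere, including inside longer words.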

You can also use re.findall to split the paragraph into words and punctuation, then replace every case variant of "something" with the lowercase version:

import re

text = "Something, Is: SoMeThInG, SOMEthING, someTHing."

to_replace = "something"

words_punct = re.findall(r"[\w']+|[.,!?;: ]", text)

new_text = "".join(to_replace if x.lower() == to_replace else x for x in words_punct)

print(new_text)

Which outputs:

something, Is: something, something, something.

Note: re.findall relies on a hardcoded regular expression. Any character in your text that the pattern above does not match (for example a newline, a hyphen or a quotation mark) is silently dropped when the pieces are joined, so extend the character classes as needed.
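One way to avoid listing every punctuation character is to substitute in place with re.sub and the re.IGNORECASE flag instead of tokenizing. This is a different technique from the re.findall answer above, shown as a sketch only. The function name lowercase_all is illustrative, and the target is assumed to be a plain lowercase word so that the \b word boundaries behave as expected.

```python
import re


def lowercase_all(text, target="something"):
    """Replace every whole-word, case-insensitive match of `target`.

    Hypothetical helper for illustration; assumes `target` is a plain word.
    """
    # re.escape guards against regex metacharacters in `target`;
    # \b keeps longer words such as "Somethings" untouched.
    pattern = re.compile(r"\b" + re.escape(target) + r"\b", re.IGNORECASE)
    # A function replacement avoids backslash processing of `target`.
    return pattern.sub(lambda match: target, text)


print(lowercase_all("Something, Is: SoMeThInG, SOMEthING, someTHing."))
```

Text that the pattern does not match is left exactly as it was, so no characters are lost.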
