
Replace characters in a string with whitespaces

I am writing a simple Python script that retrieves the latest tweet of any Twitter user (in this case BBC) and uses the built-in text-to-speech system on Mac to read out the content of that tweet.

Everything is running as it should, but there are certain things I want to improve. For instance, if a tweet contains the character "#", the computer will speak this as "number". For example, if the tweet were to read "#BBC covers the latest news", the computer speaks "number BBC covers the latest news".

I have declared a string to hold the content of the tweet, and wish to find a way to replace unwanted characters with white spaces. So far, I have the following:

for char in data_content: #data_content is the string holding the tweet
    if char in "#&/": # does not replace #
        mod_data = data_content.replace(char, '')
print(mod_data)
system('say ' + mod_data)

This seems to be working correctly with the "/" character, but does not replace the "#" character. So, any help on this matter is very much appreciated!

PS I have tried replacing the "#" character alone, in which case I get the desired result. However, when I try to provide a series of characters to replace, it only replaces the "/" character.

Thanks!

Every pass of your loop calls replace() on the original data_content and overwrites mod_data with the result, so mod_data only ever reflects the last replacement made.

Say your string is "#BBC covers the latest issues with G&F. See bbc.co.uk/gf"

The first character from your list to be found is the #, so:

mod_data = "BBC covers the latest issues with G&F. See bbc.co.uk/gf"

Next the & is found, but replace() is again called on data_content, so the change you made earlier is discarded and you get:

mod_data = "#BBC covers the latest issues with GF. See bbc.co.uk/gf"

The same happens when the / is found and you get:

mod_data = "#BBC covers the latest issues with G&F. See bbc.co.ukgf"

That's why it looks like it is only working for the /.
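To see this happen, here is a small sketch that runs the loop from the question on the example string and records mod_data after each replacement. The history list is only added here for illustration and is not part of the original script.

```python
data_content = "#BBC covers the latest issues with G&F. See bbc.co.uk/gf"

mod_data = data_content
history = []  # illustration only: keeps every intermediate value of mod_data

for char in data_content:
    if char in "#&/":
        # replace() starts again from the untouched data_content each time,
        # so the previous result stored in mod_data is thrown away.
        mod_data = data_content.replace(char, '')
        history.append(mod_data)

for step in history:
    print(step)
```

Each printed line is missing exactly one of the three characters, and the last line (the one without the /) is what ends up in mod_data.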

You can simply do what you want using regular expressions like this:

import re
from os import system  # needed for the system() call below

string = "#BBC covers the latest issues with G&F. See bbc.co.uk/gf"
mod_data = re.sub(r"[#&/]", " ", string)  # replace each #, & or / with a space
print(mod_data)
system('say ' + mod_data)
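One thing to be aware of with system('say ' + mod_data) is that the whole line goes through the shell, so a tweet containing an apostrophe or a semicolon could break the command. A possible variation, sketched below, passes the text to say as a separate argument through subprocess.run. The helper names clean_tweet and build_say_command are made up for this example, and collapsing the extra spaces is an optional step that the answer above does not do.

```python
import re
import shutil
import subprocess


def clean_tweet(text):
    """Replace #, & and / with spaces, then collapse runs of whitespace."""
    spaced = re.sub(r"[#&/]", " ", text)
    return " ".join(spaced.split())


def build_say_command(text):
    """Return the argument list for the macOS 'say' command."""
    return ["say", text]


tweet = "#BBC covers the latest issues with G&F. See bbc.co.uk/gf"
mod_data = clean_tweet(tweet)
print(mod_data)

# Only try to speak when the 'say' executable is actually available.
if shutil.which("say"):
    # A list of arguments is passed directly, without going through the shell.
    subprocess.run(build_say_command(mod_data), check=True)
```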

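I have an additional suggestion. Since replace() acts on every occurrence of a character in the string, you don't need to loop over every character of the tweet. Loop over the characters you want to remove instead, and keep reassigning the same variable so that each replacement builds on the previous one (pass ' ' instead of '' as the second argument if you want a space in place of each character):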

mod_data = data_content
for char in "#&/":
    mod_data = mod_data.replace(char, '')
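If you would rather avoid the loop entirely, str.translate can handle all the characters in one pass. This is only a sketch of an alternative, not something either answer above proposed; the table names are chosen for this example.

```python
data_content = "#BBC covers the latest issues with G&F. See bbc.co.uk/gf"

# Third argument of str.maketrans lists characters to delete outright.
delete_table = str.maketrans("", "", "#&/")
removed = data_content.translate(delete_table)

# A dict maps each unwanted character to a single space instead.
space_table = str.maketrans({char: " " for char in "#&/"})
spaced = data_content.translate(space_table)

print(removed)
print(spaced)
```

The first result matches what the replace() loop above produces with ''. The second keeps word boundaries intact, which may sound better when read aloud.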


 