Using Regex to remove everything except words, digits and spaces.
This is the function I defined:
import re

def remove(text):
    return re.sub(r'[^\w\d\s]', '', text)
Is there anything extra in the pattern, or anything I missed?
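\w already matches letters ([A-Za-z]), digits (\d), and the underscore (_), so the \d in your character class is redundant and underscores are kept. In Python 3, with a str pattern, \w also matches non-ASCII letters and digits.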
If you also want to strip underscores, try this code, which uses a different regex:
def remove(text):
    return re.sub(r'[^A-Za-z\d\s]+', '', text)
Tell me if it's not working.
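To make the difference between the two character classes concrete, here is a minimal sketch that runs both on the same input. The function names and the sample string are made up for illustration and are not from the question.

```python
import re

def remove_keep_word_chars(text):
    # \w already covers letters, digits and "_", so a separate \d is not needed.
    return re.sub(r'[^\w\s]', '', text)

def remove_ascii_only(text):
    # Explicit ASCII letters: underscores and non-ASCII letters are removed too.
    return re.sub(r'[^A-Za-z\d\s]+', '', text)

sample = 'snake_case café 42!'  # hypothetical sample input
print(remove_keep_word_chars(sample))  # snake_case café 42
print(remove_ascii_only(sample))       # snakecase caf 42
```

Which one is appropriate depends on whether underscores and accented letters count as "words" for your data.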
Your approach will work. For example:
import re

text = ' !"(/£hello world1!!!!%"& '

def remove(text):
    return re.sub(r'[^\w\d\s]', '', text)

print(remove(text))
Your output will be:
>>> hello world1
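One detail that is easy to miss in the printed result: whitespace is kept by the pattern, so the spaces at both ends of the input survive. A small sketch using repr() makes this visible; calling strip() afterwards is an optional extra step, not part of the original function.

```python
import re

def remove(text):
    # Same pattern as in the question: drop anything that is not \w, \d or \s.
    return re.sub(r'[^\w\d\s]', '', text)

cleaned = remove(' !"(/£hello world1!!!!%"& ')
print(repr(cleaned))          # ' hello world1 '
print(repr(cleaned.strip()))  # 'hello world1'
```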