
How can I remove the non-alphanumeric (English) characters in a series containing strings while retaining spaces?

Currently, I have:

[re.sub(r'\W', '', i) for i in training_data.loc[:, 'Text']]

However, with this the Hindi characters remain and all the spaces are removed. Any ideas?

A negated character class might help:

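import re
import string

# re.escape stops characters such as ] and \ in string.printable from being read as regex syntax.
# string.printable includes punctuation, so '#$' is kept here; only the non-ASCII text is removed.
re.sub(f'[^{re.escape(string.printable)}]', '', 'asdf #$שדגכ')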
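
For the question as asked (drop everything except English letters, digits, and spaces), the same negation idea can be narrowed to an explicit whitelist. In Python 3, `\W` is Unicode-aware for `str` patterns, so Hindi letters count as word characters and a space does not, which is why the original attempt kept the former and removed the latter. The following is only a minimal sketch: the helper name `keep_ascii_alnum_and_spaces` and the sample strings are made up for illustration and stand in for the `Text` column.

```python
import re

# Negated character class: matches anything that is NOT an ASCII letter, digit, or space.
NON_ASCII_ALNUM_OR_SPACE = re.compile(r"[^A-Za-z0-9 ]")


def keep_ascii_alnum_and_spaces(text: str) -> str:
    """Drop every character except A-Z, a-z, 0-9, and the space character."""
    return NON_ASCII_ALNUM_OR_SPACE.sub("", text)


# Hypothetical sample rows standing in for training_data.loc[:, 'Text'].
texts = ["hello दुनिया world 42!", "asdf #$שדגכ"]
cleaned = [keep_ascii_alnum_and_spaces(t) for t in texts]
print(cleaned)
```

Removing characters can leave doubled or trailing spaces behind, so a follow-up such as `' '.join(s.split())` may be wanted. On a pandas Series, `training_data['Text'].str.replace(r'[^A-Za-z0-9 ]', '', regex=True)` should give the same result without the list comprehension.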

