
unwanted characters in pandas dataframe column

I want to delete the "\n" and "[" characters from the jobDescription column of a dataframe. I tried this code, but it is not working:

data['jobDescription'] = data['jobDescription'].str.replace(r'\n',' ', regex=True)

You can see the df in the picture below: [image: enter image description here]

How do I solve this problem? Thanks.

You can make use of Python regular expressions.

import re

data['jobDescription'] = data.jobDescription.apply(lambda x: ''.join(re.findall("[a-zA-Z0-9 ]", x)))

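The regex pattern keeps only letters, digits, and spaces. If you want to keep other symbols, add them to the character class. Note that newlines are dropped rather than replaced with a space, so words separated only by a newline end up joined together.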
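If the goal is only to strip "\n" and "[" while leaving other punctuation alone, a narrower pattern with str.replace may be enough. The sketch below is a guess at why the original attempt appeared to do nothing: it assumes the column might contain the two-character text backslash + n rather than a real newline, so it matches both. The sample dataframe is made up for illustration, and the astype(str) call assumes some cells may not be plain strings.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's dataframe.
data = pd.DataFrame({
    "jobDescription": [
        "[Develop software\nWrite tests",   # contains a real newline character
        "[Analyse data\\nBuild reports",    # contains a literal backslash followed by 'n'
    ]
})

# Match a literal backslash-n, a real newline, or an opening square bracket.
pattern = r"\\n|\n|\["

data["jobDescription"] = (
    data["jobDescription"]
    .astype(str)                            # assumption: cells may not all be strings
    .str.replace(pattern, " ", regex=True)  # swap each match for a single space
    .str.strip()                            # drop the leading space left by "["
)

print(data["jobDescription"].tolist())
```

To remove the closing bracket as well, add `|\]` to the pattern.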
