
How to remove non-specific char of a string/dataframe[i] in Python

In my data cleaning process I found some strings that contain stray single characters, which might bias my analysis,

e.g. 'hello please help r me with this s question'.

Until now I have only found tools to remove specific chars, like

char = 's'
def char_remover(text):
    # keep every character of text that is not the specific char
    spec_char = ''.join(i for i in text if i not in char)
    return spec_char

or the rsplit() and split() functions, which are good for deleting the first/last char of a string.

In the end, I want to code a function that removes all single chars (whitespace char whitespace) from my string/dataframe.

My own thoughts on that question:

def spec_char_remover(text):
    # iterating over a string yields single characters, so len(i) is always 1
    spec_char_rem = ''.join(i for i in text if not len(i) <= 1)
    return spec_char_rem

But that obviously didn't work.

Thanks in advance.

You could use regex:

>>> import re
>>> s = 'hello please help r me with this s question'
>>> re.sub(' . ', ' ', s)
'hello please help me with this question'

"." in regex matches any character except a newline, so " . " matches any single character surrounded by spaces. You could also use "\s.\s" to match any character surrounded by any whitespace.
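One caveat with ' . ': each match consumes the spaces around it, so it skips a single char at the very start or end of the string, and it misses every second one when several single chars follow each other. Below is a rough sketch of two ways around that. The function names remove_single_chars and remove_single_chars_split are just made up for this example, and both variants assume that any whitespace-delimited token of length 1 should go.

```python
import re

# (?<!\S) and (?!\S) mean "not preceded / not followed by a non-whitespace
# character", so they also hold at the start and end of the string and do
# not consume the surrounding whitespace.
_SINGLE_CHAR = re.compile(r'(?<!\S)\S(?!\S)')


def remove_single_chars(text):
    """Regex variant: drop isolated single characters, then tidy whitespace."""
    without_singles = _SINGLE_CHAR.sub('', text)
    # split() + join() collapses the doubled spaces left behind by the removal
    return ' '.join(without_singles.split())


def remove_single_chars_split(text):
    """Plain-Python variant: keep only whitespace-separated tokens longer than 1."""
    return ' '.join(word for word in text.split() if len(word) > 1)


s = 'hello please help r me with this s question'
print(remove_single_chars(s))                   # hello please help me with this question
print(remove_single_chars_split('a b c word'))  # word
```

Both variants also drop legitimate one-letter words such as "a" or "I", as well as lone digits and punctuation, so check whether that is acceptable for your data. For a dataframe, something like df['text'].map(remove_single_chars) should apply it to a column of strings (df and 'text' are placeholder names here).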


 