
How to use re.sub in a pandas DataFrame

def not_value(x):
    if type(x) == str:
        re.sub(r'(\s+)', np.nan, x)
    else:
        pass

df_copy=df.copy()
df_copy.astype(str).applymap(lambda x: not_value(x))

I have checked that the values in the DataFrame are strings, but I always get TypeError: decoding to str: need a bytes-like object, float found. What is the problem?

Thank you for giving me an answer.
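
The error most likely comes from the second argument of re.sub: the replacement has to be a string (or a function), and np.nan is a float. Note also that not_value never returns anything, so applymap would fill every cell with None even if the replacement were valid. Below is a minimal sketch that reproduces the failure and shows one possible fix; the helper names try_sub and blank_to_nan are hypothetical and not part of the original question:

```python
import re
import numpy as np

def try_sub(repl):
    """Return the name of the exception re.sub raises for this replacement, or None."""
    try:
        re.sub(r'\s+', repl, 'a b')
    except TypeError as exc:
        return type(exc).__name__
    return None

error_with_float = try_sub(np.nan)  # a float replacement is rejected
error_with_str = try_sub('')        # a string replacement is accepted

def blank_to_nan(x):
    """Hypothetical helper: NaN for whitespace-only strings, x unchanged otherwise."""
    if isinstance(x, str) and re.fullmatch(r'\s+', x):
        return np.nan
    return x
```

A function like blank_to_nan could be passed to applymap in place of the lambda, because it returns a value for every cell instead of calling re.sub with a non-string replacement.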

If you just want to replace the values in a particular string column with np.nan when the string consists entirely of whitespace, you can do the following. Adjust the regular expression if the match should not be limited to all-whitespace strings:

import pandas as pd
import re
import numpy as np

d = {'col1': [1, 2], 'col2': [3, 4], 'col3': ['s ', '  ']}

df = pd.DataFrame(data=d)

spaces = df['col3'].str.contains(r'^\s+$')
df.loc[spaces, 'col3'] = np.nan
df

Result:

   col1  col2 col3
0     1     3   s 
1     2     4  NaN
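
If the same cleanup is needed across every column rather than just col3, DataFrame.replace with regex=True may be a simpler route. This is a sketch under the assumption that whitespace-only strings should become NaN everywhere; the variable name cleaned is only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4], 'col3': ['s ', '  ']})

# Replace whitespace-only strings with NaN in all columns.
# The regex is only applied to string values, so the numeric
# columns are left as they are.
cleaned = df.replace(r'^\s+$', np.nan, regex=True)
```

Unlike the applymap approach in the question, this does not need astype(str), so existing numbers and missing values are not turned into strings first.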
