How to use re.sub in a pandas DataFrame

Question

def not_value(x):
    if type(x) == str:
        re.sub(r'(\s+)', np.nan, x)
    else:
        pass

df_copy=df.copy()
df_copy.astype(str).applymap(lambda x: not_value(x))

I have checked that the values in the DataFrame are strings, but the code always raises TypeError: decoding to str: need a bytes-like object, float found. What is the problem?

Thank you for giving me an answer.

Answer 1

If you just want to replace the values in a given string column with np.nan when the string consists entirely of whitespace, you can do the following. Adjust the regular expression if the match should not be limited to all-whitespace strings:

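import numpy as np
import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4], 'col3': ['s ', '  ']}

df = pd.DataFrame(data=d)

# Boolean mask: True where col3 consists only of whitespace.
# The pattern is a raw string so that \s is not treated as a string escape.
spaces = df['col3'].str.contains(r'^\s+$')
# Overwrite only the matching rows with NaN.
df.loc[spaces, 'col3'] = np.nan
df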

Result:

   col1  col2 col3
0     1     3   s 
1     2     4  NaN
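
As for the error itself: the TypeError in the question most likely comes from passing np.nan (a float) as the replacement argument of re.sub, which expects a string or a function. The function in the question also never returns a value, so every cell would become None even without the error. Below is a minimal sketch of two ways to apply the same idea to the whole DataFrame. It assumes the goal is to turn whitespace-only strings into NaN; the helper name blank_to_nan and the variable names are hypothetical, not taken from the question.

```python
import re

import numpy as np
import pandas as pd


def blank_to_nan(value):
    """Return np.nan for whitespace-only strings, else the value unchanged."""
    # re.sub() needs a string (or function) replacement, so test the value
    # and return np.nan explicitly instead of passing np.nan to re.sub().
    if isinstance(value, str) and re.fullmatch(r"\s+", value):
        return np.nan
    return value


df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4], "col3": ["s ", "  "]})

# Element-wise version: map the helper over every column.
# (Series.map is used here because DataFrame.applymap is deprecated
# in recent pandas releases.)
cleaned_elementwise = df.apply(lambda column: column.map(blank_to_nan))

# Vectorized version: with regex=True, DataFrame.replace applies the
# pattern to string cells and leaves numeric cells alone.
cleaned_vectorized = df.replace(r"^\s+$", np.nan, regex=True)

print(cleaned_vectorized)
```

Note that neither version calls astype(str) first: doing so would turn existing NaN values into the literal string 'nan', which is probably not what is wanted here.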

solution1
0 ACCEPTED 2021-04-16 17:44:19
