Pandas DataFrame - Replace all cell values subject to a regex condition

Question

I am working on a problem where a column contains a few values that are just repetitions of ".", e.g. "....." or ".............".

I want to use .loc to replace all such values with np.NaN, and a regex to identify any cell value made up of one or more repetitions of ".".

So I used the code below in Python:

energy.loc[bool(re.match('.+', energy['Energy Supply'])),'Energy Supply']=np.NaN

Please help

Answer 1

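You need to escape the dot as shown below, because an unescaped dot stands for any character and the plus sign means one or more. Give it a try :)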

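# re.match takes a single string, not a Series, so apply the escaped pattern through the .str accessor
energy.loc[energy['Energy Supply'].str.match(r'\.+', na=False), 'Energy Supply'] = np.nan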
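
To see why the escape matters, here is a small sketch that uses only the standard re module on single strings. The sample values are made up for illustration and are not from the question's data.

```python
import re

samples = ["test", ".....", ".5"]  # hypothetical values for illustration

# Unescaped: '.' matches any character, so every non-empty string matches.
unescaped = [bool(re.match('.+', s)) for s in samples]

# Escaped: '\.' matches a literal dot; re.match only anchors at the start,
# so a string that merely begins with a dot still matches.
escaped = [bool(re.match(r'\.+', s)) for s in samples]

# fullmatch requires the whole string to consist of dots.
only_dots = [bool(re.fullmatch(r'\.+', s)) for s in samples]

print(unescaped)
print(escaped)
print(only_dots)
```

The unescaped pattern flags all three samples, which is why the original attempt would have blanked the whole column even if re.match accepted a Series.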

Answer 2

You could make use of str.contains to check for a dot, and escape it to match it literally.

You don't need the + quantifier here: str.contains succeeds as soon as the pattern matches anywhere in the string, so matching a single dot is sufficient.

import pandas as pd
import numpy as np

data = [
    "test",
    "test.",
    "..."
]
energy = pd.DataFrame(data, columns=["Energy Supply"])
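# Build a boolean mask of rows containing a literal dot, then overwrite them.
# np.nan is used because the np.NaN alias was removed in NumPy 2.0.
energy.loc[energy['Energy Supply'].str.contains(r'\.'), 'Energy Supply'] = np.nan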
print(energy)

Output

  Energy Supply
0          test
1           NaN
2           NaN
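
Note that str.contains flags any cell with a dot anywhere in it, which is why "test." becomes NaN in the output above. If the goal is to blank only the cells made up entirely of dots, as the question describes, one possible variant uses str.fullmatch (available in pandas 1.1 and later). This is a sketch on made-up sample data, not part of the original answer; only the column name is taken from the question.

```python
import numpy as np
import pandas as pd

# Made-up sample data; "test." and "12.5" contain a dot but are not dot-only.
energy = pd.DataFrame(
    {"Energy Supply": ["test", "test.", "...", "12.5", "............."]}
)

# True only where the entire cell is one or more literal dots.
# na=False treats missing values as non-matches so the mask stays boolean.
only_dots = energy["Energy Supply"].str.fullmatch(r"\.+", na=False)

energy.loc[only_dots, "Energy Supply"] = np.nan
print(energy)
```

With fullmatch the + quantifier is needed again, because a bare \. would match only cells that are exactly one character long.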

2 answers

solution1
0 2021-07-17 10:37:41

solution2
0 2021-07-17 11:52:37
