Python remove part of the string from column in a dataframe

Question

Hi, I am working in Python. I created a dataframe from a CSV file. One text column, "name", contains the pattern ' (some_number%)' in various places, for example:

"145 wefwignweon (100%) , 1rberbebe (50%) , vwrbvwrbe (100%) , 140 ewggrrwrg"

I need to delete every occurrence of this pattern from the column, such as ' (100%)' and ' (50%)'. The percentage values vary from entry to entry.

import pandas as pd

path_to_dir="/Users/user/Documents/file/"
name='owner.csv'
df_owner = pd.read_csv(path_to_dir+name, encoding='windows-1252') 
#df_owner["name"] =  df_owner["name"] drop where says => (' (@some_number%)')

How can I write a regular expression that removes every ' (some_number%)' occurrence from the name column of the df_owner dataframe?

Regards

Answer 1

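You can use the regular expression \(\d+%\) with str.replace to remove each match from the column (filtering with ~ and str.contains would instead drop every row that contains a match):

df_owner['name'] = df_owner['name'].str.replace(r' \(\d+%\)', '', regex=True)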
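As a quick check of this pattern against the sample string from the question, here is a minimal sketch. The two-row dataframe is made up for illustration and stands in for the one read from owner.csv.

```python
import pandas as pd

# Hypothetical stand-in for the dataframe loaded from the CSV file.
df_owner = pd.DataFrame({
    "name": [
        "145 wefwignweon (100%) , 1rberbebe (50%) , vwrbvwrbe (100%) , 140 ewggrrwrg",
        "no percentage here",
    ]
})

# Remove every " (<digits>%)" occurrence; rows are kept, only the text changes.
df_owner["name"] = df_owner["name"].str.replace(r" \(\d+%\)", "", regex=True)

result = df_owner["name"].tolist()
print(result)
```

Rows without a match pass through unchanged, so the dataframe keeps its original length.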

Answer 2

Matching a number of up to three digits gives r'\d{1,3}'.

You also want the surrounding parentheses, which have to be escaped. The percent sign sits inside them and needs no escaping, so the pattern becomes r'\(\d{1,3}%\)'. You can then replace each occurrence of that regex with the empty string using lambda x: re.sub(r'\(\d{1,3}%\)', '', x). You might also want to add the leading space to the regex.
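The re.sub approach can be sketched with the standard library alone. The helper name strip_percent_tags and the optional leading space in the pattern are assumptions added for illustration, not part of the answer.

```python
import re

# Optional leading space, "(", one to three digits, "%", ")".
PERCENT_TAG = re.compile(r" ?\(\d{1,3}%\)")

def strip_percent_tags(text):
    """Return text with every '(NN%)' tag removed (hypothetical helper)."""
    return PERCENT_TAG.sub("", text)

sample = "145 wefwignweon (100%) , 1rberbebe (50%) , vwrbvwrbe (100%) , 140 ewggrrwrg"
cleaned = strip_percent_tags(sample)
print(cleaned)
```

On a dataframe this could be applied as df_owner['name'].apply(strip_percent_tags). Missing values (NaN) are not strings and would make re raise a TypeError, so they would need to be filled or skipped first.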
