Pandas/Python function str.contains returns an error

I am trying to write a function that I feed my dataframe into. Its purpose is to categorize account postings as either "accept" or "ignore".

The problem is that for some accounts I only need to look for part of a text string. If I do that outside a function it works, but inside a function I get an error.

So this works fine:

ekstrakt.query("Account== 'Car_sales'").Tekst.str.contains("Til|Fra", na=False)

But this doesn't:

def cleansing(df):

    if df['Account'] == 'Car_sales':
        if df.Tekst.str.contains("Til|Fra", na=False)  : return 'Ignore'

ekstrakt['Ignore'] = ekstrakt.apply(cleansing, axis = 1)

It results in an error: "AttributeError: 'str' object has no attribute 'str'"

I need the "cleansing" function to take more arguments afterwards, but I am struggling to get past this first part.

A function passed to apply with axis=1 processes each row separately, so inside it the Tekst value is a plain Python string, and column-wise pandas methods such as str.contains cannot be used on it.
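To make the cause of the AttributeError concrete, here is a small sketch that records what the function actually receives when apply is called with axis=1. The two-row frame and the names inspect_row and seen_types are made up for illustration and are not part of the question.

```python
import pandas as pd

# Hypothetical two-row frame, only for illustration
df = pd.DataFrame({'Account': ['car', 'Car_sales'],
                   'Tekst': ['Til', 'Franz']})

seen_types = []

def inspect_row(row):
    # With axis=1, `row` is a Series holding one row,
    # so row.Tekst is a single Python str, not a column
    seen_types.append((type(row).__name__, type(row.Tekst).__name__))
    # A plain str has no `.str` accessor, hence the AttributeError
    return hasattr(row.Tekst, 'str')

has_str_accessor = df.apply(inspect_row, axis=1)
print(seen_types)
print(has_str_accessor.tolist())
```

The .str accessor exists only on a Series (a whole column), which is why the same expression works in the query version but fails inside the row-wise function.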

One possible solution is to create the new column with numpy.where, chaining the two boolean masks with & (element-wise AND):

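import numpy as np
import pandas as pd

df = pd.DataFrame({'Account':['car','Car_sales','Car_sales','Car_sales'],
                   'Tekst':['Til','Franz','Text','Tilled']})

# m1: rows that belong to the Car_sales account
m1 = df['Account'] == 'Car_sales'
# m2: rows whose text contains "Til" or "Fra" (missing values count as no match)
m2 = df.Tekst.str.contains("Til|Fra", na=False)
# 'Ignore' where both masks are True, otherwise 'Accept'
df['new'] = np.where(m1 & m2, 'Ignore', 'Accept')
print(df)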
     Account   Tekst     new
0        car     Til  Accept
1  Car_sales   Franz  Ignore
2  Car_sales    Text  Accept
3  Car_sales  Tilled  Ignore
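Since the question mentions that more rules will be added later, the mask approach could be extended by building one mask per rule and combining them with | (element-wise OR). This is only a sketch: the 'Bank' account and its rule are hypothetical and stand in for whatever the additional conditions turn out to be.

```python
import numpy as np
import pandas as pd

# Sample data; the 'Bank' row is a made-up example, not from the question
df = pd.DataFrame({'Account': ['car', 'Car_sales', 'Car_sales', 'Bank'],
                   'Tekst': ['Til', 'Franz', 'Text', None]})

# Rule from the question: Car_sales rows whose text contains "Til" or "Fra"
car_rule = (df['Account'] == 'Car_sales') & df['Tekst'].str.contains('Til|Fra', na=False)
# Hypothetical second rule: Bank rows with no text at all
bank_rule = (df['Account'] == 'Bank') & df['Tekst'].isna()

# A row is ignored if any rule matches, otherwise it is accepted
df['new'] = np.where(car_rule | bank_rule, 'Ignore', 'Accept')
print(df)
```

Each rule stays a separate, readable mask, and the whole column is still computed without a Python-level loop over rows.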

If you need to do the processing in a function, you can use the in operator combined with or, because inside the function you are working with scalars:

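def cleansing(x):
    # x is a single row (a Series), so x.Tekst is a plain Python string
    if x['Account'] == 'Car_sales':
        # skip missing values before testing for substrings
        if pd.notna(x.Tekst):
            if ('Til' in x.Tekst) or ('Fra' in x.Tekst):
                return 'Ignore'
    # rows that do not match fall through and get None


df['Ignore'] = df.apply(cleansing, axis=1)

print(df)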
     Account   Tekst     new  Ignore
0        car     Til  Accept    None
1  Car_sales   Franz  Ignore  Ignore
2  Car_sales    Text  Accept    None
3  Car_sales  Tilled  Ignore  Ignore
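If the row-wise function should keep the original "Til|Fra" pattern and return 'Accept' instead of None for non-matching rows, a variant using the standard re module might look like the following. The names PATTERN and Status, and the None value in the sample data, are assumptions added for illustration.

```python
import re
import pandas as pd

# Sample data with one missing text value (an added assumption)
df = pd.DataFrame({'Account': ['car', 'Car_sales', 'Car_sales', 'Car_sales'],
                   'Tekst': ['Til', 'Franz', None, 'Tilled']})

# Same alternation as in the str.contains call, compiled once
PATTERN = re.compile('Til|Fra')

def cleansing(row):
    text = row['Tekst']
    # pd.notna guards against missing values before the regex search
    if row['Account'] == 'Car_sales' and pd.notna(text) and PATTERN.search(text):
        return 'Ignore'
    # explicit default so unmatched rows are labelled rather than None
    return 'Accept'

df['Status'] = df.apply(cleansing, axis=1)
print(df)
```

For large frames the mask-based version is usually the faster choice, since apply with axis=1 calls the Python function once per row.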
