
Applying a function to every row in a dataframe column in Pandas

I have a text based column in a dataframe similar to the following format:

  Text
0 I am me
1 I am not you
2 I will be him

on which I am trying to run a string function that removes everything from the last space onward (including the space). For example, 'I am me' would become 'I am'.

Code:

df['Text'] = df['Text'].apply(lambda x: x.str.split(' ').str[:-1].str.join(' '))

However, this gives the error:

AttributeError: 'str' object has no attribute 'str'

I don't quite understand this, as the function works on its own; it only seems to fail when I apply it to a specific column of a dataframe. (I may well be wrong about this...)

When you're working with vanilla strings, you call the string methods directly. When working with pandas columns directly, use the str accessor methods.

Case 1
As mentioned in my comment, use the str methods:

df

            Text
0        I am me
1   I am not you
2  I will be him    

df['Text'] = df['Text'].str.split().str[:-1].str.join(' ')

        Text
0       I am
1   I am not
2  I will be
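As a possible variation on Case 1 (not part of the original answer), the same result can be sketched with `str.rsplit`, which splits from the right and can stop after one split. The sample frame below just rebuilds the data from the question. Note the behavior differs for rows that contain no space: this version keeps the whole string, while the split/slice/join version returns an empty string.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({'Text': ['I am me', 'I am not you', 'I will be him']})

# Split once at the last space, then keep the part before it.
trimmed = df['Text'].str.rsplit(pat=' ', n=1).str[0]
print(trimmed.tolist())
```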

Case 2
Alternatively, when working with apply on a single column, the lambda receives a string (not a pd.Series), so .str accessor methods aren't involved.
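A minimal sketch of Case 2, assuming the same sample data as in the question. It records the type of each value the function receives, which shows why `x.str` fails inside the lambda, and then does the trimming with plain Python string methods.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({'Text': ['I am me', 'I am not you', 'I will be him']})

# Series.apply calls the function once per element, so each x is a
# plain Python str, which has no .str attribute.
seen_types = set()
df['Text'].apply(lambda x: seen_types.add(type(x).__name__))
print(seen_types)

# Use ordinary string methods inside the lambda instead.
trimmed = df['Text'].apply(lambda x: ' '.join(x.split()[:-1]))
print(trimmed.tolist())
```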

I think you want to rewrite the pandas string functions, which are a bit slower but support NaN values:

df['Text'] = df['Text'].str.split().str[:-1].str.join(' ')

into plain Python string functions:

df['Text'] = df['Text'].apply(lambda x: ' '.join(x.split(' ')[:-1]))
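To illustrate the NaN point, here is a small sketch using a made-up Series with one missing value. The helper name `drop_last_word` is hypothetical and only used for illustration. The `.str` chain leaves the missing value as NaN, whereas a bare lambda would fail on it because NaN is a float, so the apply version needs an explicit guard.

```python
import numpy as np
import pandas as pd

# Illustrative data with a missing value (not from the question).
s = pd.Series(['I am me', np.nan, 'I will be him'])

# The .str accessor skips missing values and leaves them as NaN.
via_str = s.str.split().str[:-1].str.join(' ')

def drop_last_word(x):
    """Hypothetical helper: trim the last word, pass non-strings through."""
    if not isinstance(x, str):
        return x  # NaN is a float, so return it unchanged
    return ' '.join(x.split(' ')[:-1])

via_apply = s.apply(drop_last_word)

print(via_str.tolist())
print(via_apply.tolist())
```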
