
Apply a user defined function to all rows of a specific column in python dataframe

I am having a hard time applying a user-defined function to a specific column of a pandas dataframe. The dataframe is as follows:

Year    state   Narrative
----------------------------------
2015      WV   a roof fall occurred at 10:05 am at 10+50 entry 6 in 8lms mmu 010, .. more text
2016      AL   a rib rolled out striking him on his left foot resulting ...... more text
2017      CO   a non-injury mountain bump occurred inby the 5n longwall. additional ... more text

I want to predict the type of ground failure from "Narrative" and add the result to the dataframe as a new column, as shown further below. The prediction is made by looking for keywords in the narrative. For example, if the narrative includes any of ['roof fall', 'roof broke', 'rock fell from the top'], the predicted ground fall should be "roof fall".

This is the user-defined function I wrote, but it does not work:

def predict_groundFall(narrative):
    fall_dict = {'roof fall': ['Roof fall', 'roof broke', 'rock fell from the top'],
                 'rib fall': ['rib fall ', 'rib rolled', 'rib dislodged'],
                 'outburst': ['outburst', 'bounce', 'rockburst']}
    for key, values in fall_dict.iteritems():
        if values in narrative:
            return key
            break
df['predicted_failure'] = df.apply( lambda row:  predict_groundFall( row['Narrative']), axis=1)

This is what I want to achieve: a new column that predicts the failure from the narrative.

Year    state   Narrative                                        predicted_failure
------------------------------------------------------------- ---------------------
2015      WV   a roof fall occurred ....... more text....                roof fall
2016      AL   a rib rolled out striking ......more text ....             rib fall
2017      CO   a non-injury mountain ....... more text....                 outburst

I am new to Python, so I would appreciate help fixing the code. A better way to achieve my goal would also be welcome. Thank you in advance.

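Your function fails for two reasons. First, values in narrative tests whether a whole list is contained in a string, which raises a TypeError; each keyword has to be checked separately. Second, dict.iteritems() exists only in Python 2. Try the following instead, which checks each keyword on its own and ignores case: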

def predict_groundFall(narrative):
    fall_dict = {'roof fall': ['Roof fall', 'roof broke', 'rock fell from the top'],
                 'rib fall': ['rib fall ', 'rib rolled', 'rib dislodged'],
                 'outburst': ['outburst', 'bounce', 'rockburst']}
    for key in fall_dict:
        if any(v.lower() in narrative.lower() for v in fall_dict[key]):
            return key

Then change your column assignment to the following:

df['predicted_failure'] = df["Narrative"].apply(lambda x: predict_groundFall(x))
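The matching logic can be checked on its own before it is applied to the dataframe. The following is a minimal sketch, not the answerer's exact code: the sample narratives are shortened versions of the rows in the question, and the keyword 'bump' is an assumption added here, because none of the original "outburst" keywords appear in the third narrative.

```python
# Keyword lists as in the answer above, lowercased once up front.
# ASSUMPTION: 'bump' is not in the original lists; it is added so that the
# third sample narrative ("mountain bump") receives a label.
FALL_KEYWORDS = {
    'roof fall': ['roof fall', 'roof broke', 'rock fell from the top'],
    'rib fall': ['rib fall', 'rib rolled', 'rib dislodged'],
    'outburst': ['outburst', 'bounce', 'rockburst', 'bump'],
}


def predict_groundFall(narrative):
    """Return the first label whose keywords appear in the narrative, else None."""
    text = narrative.lower()
    for label, keywords in FALL_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return label
    return None


# Shortened sample narratives based on the rows shown in the question.
samples = [
    'a roof fall occurred at 10:05 am at 10+50 entry 6 in 8lms mmu 010',
    'a rib rolled out striking him on his left foot',
    'a non-injury mountain bump occurred inby the 5n longwall',
]
labels = [predict_groundFall(s) for s in samples]
print(labels)

# The check in the question put a list on the left of `in` and a string on
# the right, which Python rejects with a TypeError.
try:
    ['roof fall'] in samples[0]
    raised_type_error = False
except TypeError:
    raised_type_error = True
print(raised_type_error)
```

A narrative that matches no keyword returns None, which shows up as a missing value in the new column.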

I think the problem is in your apply call.

Change this line:

df['predicted_failure'] = df.apply( lambda row: predict_groundFall( row['Narrative']), axis=1)

to

df['predicted_failure'] = df.Narrative.apply(predict_groundFall)

This sends each value of Narrative to your custom function and populates the new column with the value the function returns.
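Since the question also asks for a better method, one possible alternative is to skip the per-row Python function and let pandas do the matching with Series.str.contains. This is only a sketch under assumptions: the dataframe is rebuilt from shortened sample rows, a fourth row without keywords is added for illustration, and 'bump' is an extra keyword that is not in the original lists.

```python
import re

import pandas as pd

# ASSUMPTION: 'bump' is added so the "mountain bump" row is labelled.
FALL_KEYWORDS = {
    'roof fall': ['roof fall', 'roof broke', 'rock fell from the top'],
    'rib fall': ['rib fall', 'rib rolled', 'rib dislodged'],
    'outburst': ['outburst', 'bounce', 'rockburst', 'bump'],
}

# Shortened sample rows; the last row is hypothetical and matches nothing.
df = pd.DataFrame({
    'Year': [2015, 2016, 2017, 2018],
    'state': ['WV', 'AL', 'CO', 'KY'],
    'Narrative': [
        'a roof fall occurred at 10:05 am at 10+50 entry 6 in 8lms mmu 010',
        'a rib rolled out striking him on his left foot',
        'a non-injury mountain bump occurred inby the 5n longwall',
        'routine equipment inspection completed',
    ],
})

df['predicted_failure'] = None
for label, keywords in FALL_KEYWORDS.items():
    # One case-insensitive pattern per label; re.escape keeps keywords literal.
    pattern = '|'.join(re.escape(keyword) for keyword in keywords)
    matches = df['Narrative'].str.contains(pattern, case=False, regex=True, na=False)
    # Only fill rows that have no label yet, so the first matching label wins.
    unlabeled = df['predicted_failure'].isna()
    df.loc[matches & unlabeled, 'predicted_failure'] = label

print(df[['Year', 'state', 'predicted_failure']])
```

As with the apply version, the first label in the dictionary wins when a narrative matches more than one group, and rows without any keyword stay empty.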
