I am having a hard time applying a user-defined function to a specific column of a pandas dataframe. The dataframe is as follows:
Year state Narrative
----------------------------------
2015 WV a roof fall occurred at 10:05 am at 10+50 entry 6 in 8lms mmu 010, .. more text
2016 AL a rib rolled out striking him on his left foot resulting ...... more text
2017 CO a non-injury mountain bump occurred inby the 5n longwall. additional ... more text
I want to predict the type of ground failure from "Narrative" and add it to the dataframe as a new column, as shown below. I predict the ground fall by looking for keywords in the narrative. For example, if the narrative includes any of the phrases ['roof fall', 'roof broke', 'rock fell from the top'], the ground fall prediction should be "roof fall".
This is the user defined function that I generated, but it is not working.
def predict_groundFall(narrative):
    fall_dict = {'roof fall': ['Roof fall', 'roof broke', 'rock fell from the top'],
                 'rib fall': ['rib fall ', 'rib rolled', 'rib dislodged'],
                 'outburst': ['outburst', 'bounce', 'rockburst']}
    for key, values in fall_dict.iteritems():
        if values in narrative:
            return key
            break
df['predicted_failure'] = df.apply( lambda row: predict_groundFall( row['Narrative']), axis=1)
This is what I want to achieve: a new column that predicts the failure from the narrative.
Year state Narrative predicted_failure
------------------------------------------------------------- ---------------------
2015 WV a roof fall occurred ....... more text.... roof fall
2016 AL a rib rolled out striking ......more text .... rib fall
2017 CO a non-injury mountain ....... more text.... outburst
I am new to Python, so I hope you can help me fix the code and make it work. A better method to achieve my goal would also be highly appreciated. Thank you in advance.
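Your function isn't working because "values in narrative" tests whether a whole list is contained in a string, which raises a TypeError; each keyword has to be checked on its own. Also, dict.iteritems() exists only in Python 2. Try the following, which checks the keywords one by one and ignores case: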
def predict_groundFall(narrative):
    fall_dict = {'roof fall': ['Roof fall', 'roof broke', 'rock fell from the top'],
                 'rib fall': ['rib fall ', 'rib rolled', 'rib dislodged'],
                 'outburst': ['outburst', 'bounce', 'rockburst']}
    for key in fall_dict:
        if any(v.lower() in narrative.lower() for v in fall_dict[key]):
            return key
Then change your column assignment to the following:
df['predicted_failure'] = df["Narrative"].apply(lambda x: predict_groundFall(x))
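For reference, here is a minimal end-to-end sketch of this approach that can be run as is. The sample rows are shortened copies of the narratives in the question. The name predict_ground_fall, the "unknown" fallback label, and the cleaned-up keyword list (lowercased, trailing space removed from 'rib fall ') are assumptions made for illustration, not part of the original code.

```python
import pandas as pd

# Keyword lists from the question, lowercased and without the trailing
# space in 'rib fall ' (an assumption, so that "rib fall." also matches).
FALL_KEYWORDS = {
    "roof fall": ["roof fall", "roof broke", "rock fell from the top"],
    "rib fall": ["rib fall", "rib rolled", "rib dislodged"],
    "outburst": ["outburst", "bounce", "rockburst"],
}


def predict_ground_fall(narrative, default="unknown"):
    """Return the first label whose keywords appear in the narrative."""
    text = str(narrative).lower()  # str() guards against non-string cells
    for label, keywords in FALL_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return label
    return default  # hypothetical fallback label instead of None


# Shortened sample data based on the rows shown in the question.
df = pd.DataFrame({
    "Year": [2015, 2016, 2017],
    "state": ["WV", "AL", "CO"],
    "Narrative": [
        "a roof fall occurred at 10:05 am at 10+50 entry 6",
        "a rib rolled out striking him on his left foot",
        "a non-injury mountain bump occurred inby the 5n longwall",
    ],
})

df["predicted_failure"] = df["Narrative"].apply(predict_ground_fall)
print(df[["Year", "state", "predicted_failure"]])
```

Note that the visible part of the 2017 narrative contains none of the listed "outburst" keywords, so that row gets the fallback label here. If such rows should be classified as outbursts, a keyword such as 'bump' would probably have to be added to the list.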
I think the problem is in your apply call. Change this line:
df['predicted_failure'] = df.apply( lambda row: predict_groundFall( row['Narrative']), axis=1)
to:
df['predicted_failure'] = df.Narrative.apply(predict_groundFall)
This sends each value of Narrative to your custom function and populates the new column with the value that function returns.
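Since the question also asks for a better method, one possible alternative is to skip the Python-level function and let pandas do the matching with Series.str.contains, then pick the label with numpy.select. This is only a sketch under assumptions: the sample narratives are shortened from the question, and the "unknown" default label is made up for rows that match no keyword.

```python
import re

import numpy as np
import pandas as pd

# Keyword lists exactly as posted in the question.
fall_dict = {
    "roof fall": ["Roof fall", "roof broke", "rock fell from the top"],
    "rib fall": ["rib fall ", "rib rolled", "rib dislodged"],
    "outburst": ["outburst", "bounce", "rockburst"],
}

# Shortened sample narratives based on the rows shown in the question.
narratives = pd.Series([
    "a roof fall occurred at 10:05 am at 10+50 entry 6",
    "a rib rolled out striking him on his left foot",
    "a non-injury mountain bump occurred inby the 5n longwall",
])

# One boolean mask per label: True where any keyword of that label occurs.
# re.escape keeps keywords literal; case=False ignores case; na=False
# treats missing narratives as "no match".
conditions = [
    narratives.str.contains(
        "|".join(re.escape(keyword) for keyword in keywords),
        case=False,
        na=False,
    )
    for keywords in fall_dict.values()
]

# np.select takes the first label whose mask is True for each row.
predicted = np.select(conditions, list(fall_dict.keys()), default="unknown")
print(predicted)
```

As with the apply version, the first matching label wins when a narrative contains keywords from more than one group, so the order of the dictionary entries matters.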