I have the following dataframe:
df = pd.DataFrame([['50030', '36 @ 3159 W/270, LWD[GR,RES,PWD] @ 4015', '3159'],
['50030', '36 @ 3159 W/270, LWD[GR,RES,PWD] @ 4015', '3994'],
['50030', '36 @ 3159 W/270, LWD[GR,RES,PWD] @ 4015', '5401'],
['50030', '26 @ 3994, LWD[GR,RES,PWD] @ 5430, 20 @ 5401', '3159'],
['50030', '26 @ 3994, LWD[GR,RES,PWD] @ 5430, 20 @ 5401', '3994'],
['50030', '26 @ 3994, LWD[GR,RES,PWD] @ 5430, 20 @ 5401', '5401']],
columns = ["WKEY", "Description", "DEPTH"])
I want to compare the value in the DEPTH column (which is a string value) to the string in the Description column only for the same row . A new column called "Compare" would have a yes or no depending on if it exists.
Based on this post: python pandas - Check if partial string in column exists in other column , I tried this code:
df['Compare'] = df['DEPTH'].apply(lambda x: 'Yes' if df['Description'].str.contains(x).any() else 'No')
However, it returns all rows with Yes:
It should be Yes, No, No, No, Yes, Yes.
The ultimate plan is to delete all rows which do not have the value in the DEPTH column also listed somewhere in the Description column in the same row.
I feel like I'm one step away from getting this to work, so any direction would be appreciated.
Thanks!
In your case
df['Compare'] = df.apply(lambda x: 'Yes' if x['DEPTH'] in x['Description'] else 'No',axis=1)
df
Out[133]:
WKEY Description DEPTH Compare
0 50030 36 @ 3159 W/270, LWD[GR,RES,PWD] @ 4015 3159 Yes
1 50030 36 @ 3159 W/270, LWD[GR,RES,PWD] @ 4015 3994 No
2 50030 36 @ 3159 W/270, LWD[GR,RES,PWD] @ 4015 5401 No
3 50030 26 @ 3994, LWD[GR,RES,PWD] @ 5430, 20 @ 5401 3159 No
4 50030 26 @ 3994, LWD[GR,RES,PWD] @ 5430, 20 @ 5401 3994 Yes
5 50030 26 @ 3994, LWD[GR,RES,PWD] @ 5430, 20 @ 5401 5401 Yes
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.