I have two dataframes (df1 and df2). For each row in df2, I want to take its ID and Query, find that Query string in df1 for the same ID, and fetch all the rows from df1 for that ID that come after the row where the Query string was found. The only condition is that the rows displayed should have 'Support Team' in the 'Sent by' column.
I tried something like df1 = df1.loc['can we find tigers in Amazon forest?':]
but I am getting a KeyError. Can anyone help me with this?
Note: the index in df1 is not sorted, as the dataframes are grouped by ID.
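For context on the error: .loc selects by index label, and the query text is a value in the Query column rather than a label in the index, so the label lookup fails. A minimal sketch of a position-based alternative follows, assuming a made-up four-row frame with an unsorted integer index like df1; the names df, mask, pos and after are illustrative only.

```python
import pandas as pd

# Hypothetical four-row frame with an unsorted integer index, as in df1
df = pd.DataFrame(
    {
        "ID": [76649, 76649, 76649, 76649],
        "Query": [
            "yes hi",
            "this is Fred from support team",
            "can we find tigers in Amazon forest?",
            "yes tigers can be found there",
        ],
        "Sent by": ["Support Team", "Support Team", "Jack", "Support Team"],
    },
    index=[3, 10, 5, 6],
)

query = "can we find tigers in Amazon forest?"

# Boolean mask marking the row(s) whose Query column equals the string
mask = (df["Query"] == query).to_numpy()

if mask.any():
    pos = mask.argmax()          # position of the first match
    after = df.iloc[pos + 1:]    # every row that comes after it, by position
else:
    after = df.iloc[0:0]         # no match: empty frame with the same columns

print(after)
```

Because iloc works on row positions, the unsorted index labels do not matter here. This sketch ignores the ID and 'Sent by' conditions, which still have to be applied separately.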
df1 =
Index  ID     Query                                 Sent by
0      76649  Hi                                    Jack
2      76649  Anyone there                          Jack
3      76649  yes hi                                Support Team
10     76649  this is Fred from support team        Support Team
5      76649  can we find tigers in Amazon forest?  Jack
6      76649  yes tigers can be found there         Support Team
7      76649  contact forest dept for more          Support Team
13     76649  thanks for reaching out               Support Team
9      67209  Hello                                 Bianca
4      67209  Anyone there                          Bianca
11     67209  Hi this is Jim from support team      Support Team
12     67209  can we find lions in Amazon forest?   Bianca
8      67209  yes lions can be found there          Support Team
14     67209  contact forest dept for more          Support Team
15     67209  thanks for reaching out               Support Team
16     67209  sure that helps..thank you            Bianca
df2 =
Index  Query                                   ID
0      can we find tigers in Amazon forest?    76649
2      can we find lions in Amazon forest?     67209
3      can we find elephant in Amazon forest?  77832
Expected output:
76649  yes tigers can be found there  Support Team
76649  contact forest dept for more   Support Team
76649  thanks for reaching out        Support Team
67209  yes lions can be found there   Support Team
67209  contact forest dept for more   Support Team
67209  thanks for reaching out        Support Team
I don't know if it's the most elegant way to do it, but this would be my attempt. If something is not clear, please ask for clarification.
import pandas as pd

# Just example dfs for testing ('Sent by' is included because the final filter needs it)
df1 = pd.DataFrame({'Query': ['this is Fred from support team', 'can we find tigers in Amazon forest?', 'yes tigers can be found there', 'can we find lions in Amazon forest?', 'yes lions can be found there'],
                    'ID': [1, 1, 1, 2, 2],
                    'Sent by': ['Support Team', 'Jack', 'Support Team', 'Bianca', 'Support Team']})
df2 = pd.DataFrame({'Query': ['can we find tigers in Amazon forest?', 'can we find lions in Amazon forest?'],
                    'ID': [1, 2]})
# Reset the index so it is sorted and we can take every row with a greater index
# (the old index is kept as an 'index' column)
df1.reset_index(inplace=True)
# Collect the matching slices of df1 here
parts = []
# Iterate over df2
for idx, row in df2.iterrows():
    # Find the index of the row in df1 with matching string and ID
    index = df1.index[(df1['Query'] == row['Query']) & (df1['ID'] == row['ID'])]
    # index can hold zero, one or several labels, so check it is consistent with our logic
    # If the string was not found, skip this row of df2
    if len(index) == 0:
        continue
    # You can add code here for the case where the same string appears more than once with the same ID
    elif len(index) > 1:
        print("Oopsi, your string seems to appear more often with the same ID!")
    else:
        # Keep the rows with the same ID that come after the match
        parts.append(df1[(df1['ID'] == row['ID']) & (df1.index > index[0])])
# DataFrame.append was removed in pandas 2.0, so combine the slices with pd.concat
output = pd.concat(parts) if parts else df1.iloc[0:0]
# Filter by support team
output = output[output['Sent by'] == 'Support Team']
print(output)
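The same idea can also be sketched without an explicit loop over df2. The version below is only one possible approach, not the method used above: it flags the rows of df1 whose (ID, Query) pair appears in df2, then uses a per-ID running count to keep everything after the first flagged row. The function name replies_after_query and its sender parameter are made up for illustration, and the frames are rebuilt from the data in the question.

```python
import pandas as pd

# df1 and df2 rebuilt from the question, including the unsorted index
df1 = pd.DataFrame(
    {
        "ID": [76649] * 8 + [67209] * 8,
        "Query": [
            "Hi", "Anyone there", "yes hi", "this is Fred from support team",
            "can we find tigers in Amazon forest?", "yes tigers can be found there",
            "contact forest dept for more", "thanks for reaching out",
            "Hello", "Anyone there", "Hi this is Jim from support team",
            "can we find lions in Amazon forest?", "yes lions can be found there",
            "contact forest dept for more", "thanks for reaching out",
            "sure that helps..thank you",
        ],
        "Sent by": [
            "Jack", "Jack", "Support Team", "Support Team",
            "Jack", "Support Team", "Support Team", "Support Team",
            "Bianca", "Bianca", "Support Team", "Bianca",
            "Support Team", "Support Team", "Support Team", "Bianca",
        ],
    },
    index=[0, 2, 3, 10, 5, 6, 7, 13, 9, 4, 11, 12, 8, 14, 15, 16],
)
df2 = pd.DataFrame(
    {
        "Query": [
            "can we find tigers in Amazon forest?",
            "can we find lions in Amazon forest?",
            "can we find elephant in Amazon forest?",
        ],
        "ID": [76649, 67209, 77832],
    },
    index=[0, 2, 3],
)


def replies_after_query(df1, df2, sender="Support Team"):
    """Hypothetical helper: rows of df1 sent by `sender` after a df2 query."""
    # Set of (ID, Query) pairs we are looking for
    keys = set(zip(df2["ID"], df2["Query"]))
    # True for every df1 row whose (ID, Query) pair is one of those keys
    is_query = pd.Series(
        [pair in keys for pair in zip(df1["ID"], df1["Query"])],
        index=df1.index,
    )
    # Running count of flagged rows within each ID, following df1's row order
    seen = is_query.astype(int).groupby(df1["ID"]).cumsum()
    # Keep rows after the first flagged row, excluding flagged rows themselves
    keep = (seen > 0) & ~is_query & (df1["Sent by"] == sender)
    return df1[keep]


result = replies_after_query(df1, df2)
print(result)
```

This relies on row order rather than index values, so the unsorted index does not need to be reset. Unlike the loop, it does not warn when a query occurs more than once for an ID; it simply keeps everything after the first occurrence.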