
Filter dataframe1 with values from dataframe2 and select all rows in dataframe 1 after a particular row value in Python

I have two dataframes (df1 and df2). For each row in df2, I want to take its ID and Query, find that Query string among the df1 rows with the same ID, and fetch all df1 rows for that ID that come after the row where the Query string was found. The only condition is that the rows returned must have 'Support Team' in the 'Sent by' column.

I tried something like df1 = df1.loc['can we find tigers in Amazon forest?':] but I get a KeyError. Can anyone help me with this?

Note: The index in df1 is not sorted, because the dataframes are grouped by ID.

df1 =

Index  ID        Query                                    Sent by
0    76649  Hi                                           Jack
2    76649  Anyone there                                 Jack
3    76649  yes hi                                    Support Team
10   76649  this is Fred from support team            Support Team
5    76649  can we find tigers in Amazon forest?        Jack
6    76649  yes tigers can be found there             Support Team
7    76649  contact forest dept for more              Support Team
13   76649  thanks for reaching out                   Support Team
9    67209  Hello                                      Bianca
4    67209  Anyone there                              Bianca
11   67209  Hi this is Jim from support team          Support Team
12   67209  can we find lions in Amazon forest?       Bianca
8    67209  yes lions  can be found there             Support Team
14   67209  contact forest dept for more              Support Team
15   67209  thanks for reaching out                   Support Team
16   67209  sure that helps..thank you                Bianca

df2 =

Index        Query                                         ID
0      can we find tigers in Amazon forest?               76649
2      can we find lions in Amazon forest?                67209
3      can we find elephant in Amazon forest?             77832

Output expected:

76649  yes tigers can be found there             Support Team
76649  contact forest dept for more              Support Team
76649  thanks for reaching out                   Support Team
67209  yes lions  can be found there             Support Team
67209  contact forest dept for more              Support Team
67209  thanks for reaching out                   Support Team

I don't know if this is the most elegant way to do it, but this would be my attempt. If something is not clear, please ask for clarification.

import pandas as pd

# Small example dfs for testing (they include 'Sent by', which the final filter needs)
df1 = pd.DataFrame({'Query': ['this is Fred from support team', 'can we find tigers in Amazon forest?', 'yes tigers can be found there', 'can we find lions in Amazon forest?','yes lions can be found there'],
                    'ID':   [1,1,1,2,2],
                    'Sent by': ['Support Team', 'Jack', 'Support Team', 'Bianca', 'Support Team']})
df2 = pd.DataFrame({'Query': ['can we find tigers in Amazon forest?', 'can we find lions in Amazon forest?'],
                    'ID':   [1,2]})

# Reset the index so that row position (not the unsorted original label)
# decides which rows come "after" the match; the old labels are kept in an 'index' column
df1.reset_index(inplace=True)

# Collect the matching slices of df1 here
pieces = []

# Iterate over df2
for idx, row in df2.iterrows():

    # Find the index of the row in df1 that matches both the string and the ID
    index = df1.index[(df1['Query'] == row['Query']) & (df1['ID'] == row['ID'])]

    # index can hold zero, one or several labels, so check it against our logic
    # String not found for this ID: nothing to output
    if len(index) == 0:
        continue

    # Add code here for the case where the same string appears more than once with the same ID
    elif len(index) > 1:
        print("Oopsi, your string seems to appear more often with the same ID!")

    else:

        # Keep every row of the same ID that comes after the matched row
        pieces.append(df1[(df1['ID'] == row['ID']) & (df1.index > index[0])])

# DataFrame.append was removed in pandas 2.0, so combine the slices with pd.concat;
# fall back to an empty frame if no query from df2 was found
output = pd.concat(pieces) if pieces else df1.iloc[0:0]

# Filter by support team
output = output[output['Sent by'] == 'Support Team']
print(output)
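
As a side note, the df1.loc['can we find tigers in Amazon forest?':] attempt from the question fails because .loc slices by index label, so pandas looks for that string in the index rather than in the 'Query' column. If the loop over df2 becomes slow on larger data, the same selection can probably be done without a loop. The following is only a sketch under a few assumptions: it uses the data from the question, the helper name replies_after_query is made up for illustration, and if a query occurs more than once for the same ID it simply starts from the first occurrence.

```python
import pandas as pd

# Data from the question; the index labels of df1 are deliberately unsorted
df1 = pd.DataFrame(
    {
        "ID": [76649] * 8 + [67209] * 8,
        "Query": [
            "Hi", "Anyone there", "yes hi", "this is Fred from support team",
            "can we find tigers in Amazon forest?", "yes tigers can be found there",
            "contact forest dept for more", "thanks for reaching out",
            "Hello", "Anyone there", "Hi this is Jim from support team",
            "can we find lions in Amazon forest?", "yes lions  can be found there",
            "contact forest dept for more", "thanks for reaching out",
            "sure that helps..thank you",
        ],
        "Sent by": [
            "Jack", "Jack", "Support Team", "Support Team",
            "Jack", "Support Team", "Support Team", "Support Team",
            "Bianca", "Bianca", "Support Team", "Bianca",
            "Support Team", "Support Team", "Support Team", "Bianca",
        ],
    },
    index=[0, 2, 3, 10, 5, 6, 7, 13, 9, 4, 11, 12, 8, 14, 15, 16],
)
df2 = pd.DataFrame(
    {
        "Query": [
            "can we find tigers in Amazon forest?",
            "can we find lions in Amazon forest?",
            "can we find elephant in Amazon forest?",
        ],
        "ID": [76649, 67209, 77832],
    },
    index=[0, 2, 3],
)


def replies_after_query(df1, df2, sender="Support Team"):
    """Hypothetical helper: rows of df1 sent by `sender` after the df2 query, per ID."""
    # Unique (ID, Query) pairs, so the left merge below keeps one row per df1 row
    keys = df2[["ID", "Query"]].drop_duplicates()
    # A left merge keeps the row order of df1; '_merge' tells us which rows matched df2
    flagged = df1.reset_index(drop=True).merge(
        keys, on=["ID", "Query"], how="left", indicator=True
    )
    is_query = flagged["_merge"] == "both"
    # Running count of matches within each ID: positive from the matched row onwards
    seen = is_query.astype(int).groupby(flagged["ID"]).cumsum()
    # Rows after the match (not the match itself) that were sent by the support team
    mask = (seen > 0) & ~is_query & (flagged["Sent by"] == sender)
    # Apply the mask by position so the original index labels of df1 are kept
    return df1[mask.to_numpy()]


result = replies_after_query(df1, df2)
print(result)
```

Because the mask is built by row position, the unsorted index of df1 does not matter here, and the ID from df2 that has no match in df1 (77832) simply contributes no rows.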
