
How to compare each row of a data frame with all rows of the data frame?

My data frame consists of a single column in which each row is a list. I want to compare each row with all other rows to find out whether any list in that column is a subset of another, and I want to print those subsets. Can you suggest code for that?

I am assuming the index is numeric from 0 to N-1 and that you are using pandas. If that is not the case, change the df.drop line to df.drop(df.index[item]), which drops the row by position rather than by label. I store each row in a variable, then remove that row from the dataframe so the row can be compared against all the remaining rows. In the example given, I use a column in my dataframe ("Identifier") to check for matches between my row of interest and all others. You can insert your own logic after splitting the row from the dataframe. I hope this helps.

for item in range(len(df)):
    ## Split the row from the dataframe
    row_of_interest = df.iloc[item]
    ## drop() removes by index label, so this assumes a default 0..N-1 index
    df_without_row = df.drop(item)
    ## Compare the row's characteristics with every other row
    ## "Identifier" is the column that I want to compare
    for j in range(len(df_without_row)):
        if df_without_row.iloc[j]["Identifier"] == row_of_interest["Identifier"]:
            pass  ## do something ...

    ## Keep Row of interest or other rows
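
For the subset question specifically, a minimal sketch might look like the following. It assumes the single column is called "items" (a hypothetical name), that each cell holds a list of hashable values, and that "subset" means every element of one list also appears in another row's list, ignoring order and duplicates. The helper name find_subsets and the sample data are made up for illustration.

```python
import pandas as pd


def find_subsets(df, column):
    """Return (subset_list, superset_list) pairs for every row whose list
    is a subset of another row's list. Order and duplicates are ignored."""
    lists = df[column].tolist()
    # Convert each list to a set once so the comparisons stay cheap.
    sets = [set(values) for values in lists]
    pairs = []
    for i, a in enumerate(sets):
        for j, b in enumerate(sets):
            # Skip comparing a row with itself; `<=` is the subset test.
            if i != j and a <= b:
                pairs.append((lists[i], lists[j]))
    return pairs


# Hypothetical example data: one column, each row is a list.
df = pd.DataFrame({"items": [[1, 2, 3], [1, 2], [4], [3, 4, 5]]})

for subset, superset in find_subsets(df, "items"):
    print(subset, "is a subset of", superset)
```

With `<=`, two rows holding the same elements are reported as subsets of each other; switching to `<` would report proper subsets only. The nested loop compares every pair of rows, so the cost grows quadratically with the number of rows.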


 