How to compare each row of a data frame with all rows of the data frame?

Question

My data frame consists of only one column, where each row is a list. I want to compare each row with all the other rows to find out whether any list in that column is a subset of another, and I want to print those subsets. Can you suggest code for that?

Answer 1

I am assuming the index is numeric, running from 0 to N-1, and that you are using pandas. If that is not the case, change the df.drop line to df.drop(df.index[item]). I store each row in a variable, then drop that row from the dataframe so that the row can be compared against all the others. In the example, I use a column in my dataframe ("Identifier") to check for matches between the row of interest and every other row. You can insert your own logic after splitting the row from the dataframe. I hope this helps.

for item in range(len(df)):
    ## Split Row from Dataframe
    row_of_interest = df.iloc[item]
    df_without_row = df.drop(item)
    ## Perform comparison of row characteristics
    ## "Identifier" is the column that I want to compare
    for j in range(len(df_without_row)):
        if df_without_row.iloc[j]["Identifier"] == row_of_interest["Identifier"]:
            pass  ## replace with your own logic

    ## Keep Row of interest or other rows
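
For the subset check asked about in the question, one possible sketch is below. It is not part of the loop above. It assumes a single column of lists, here given the hypothetical name "items", with made-up example data. Each list is converted to a set so that the `<=` operator can test for subsets. Note that two rows with the same elements will be reported as subsets of each other.

```python
import pandas as pd

# Hypothetical example data; the column name "items" is an assumption.
df = pd.DataFrame({"items": [[1, 2, 3], [1, 2], [3], [4, 5], [1, 2, 3, 4, 5]]})


def find_subsets(frame, column):
    """Map each row label to the labels of the other rows whose list is a subset of it."""
    sets = frame[column].apply(set)  # convert each list to a set once
    result = {}
    for i, current in sets.items():
        # `other <= current` is True when every element of `other` is in `current`
        result[i] = [j for j, other in sets.items() if j != i and other <= current]
    return result


subsets = find_subsets(df, "items")
for i, labels in subsets.items():
    for j in labels:
        print(f"row {j} {df.at[j, 'items']} is a subset of row {i} {df.at[i, 'items']}")
```

This compares every pair of rows, so the cost grows quadratically with the number of rows. That is probably acceptable for small frames but may be slow for large ones.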

solution1
0 2019-09-09 18:11:40
