My data frame consists of only one column, where each row is a list. I want to compare each row with all the other rows to find out whether any of the other lists in that column are subsets of it, and I want to print those subsets. Can you suggest code for that?
I am assuming the index is numeric, running from 0 to N-1, and that you are using pandas. If that is not the case, change the df.drop line to df.drop(df.index[item]). I store each row in a variable, then drop that row so it can be compared against the rest of the dataframe. In the example given, I use a column in my dataframe ("Identifier") to check for matches between my row of interest and all the others. You can insert your own logic after splitting the row from the dataframe. I hope this helps.
for item in range(len(df)):
    ## Split Row from Dataframe
    row_of_interest = df.iloc[item]
    df_without_row = df.drop(item)
    ## Perform Comparison of Row Characteristics
    ## Identifier is a column that I want to compare
    for j in range(len(df_without_row)):
        if df_without_row.iloc[j]["Identifier"] == row_of_interest["Identifier"]:
            pass  # do something ...
    ## Keep Row of interest or other rows
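For the subset check asked about in the question, the equality test in the loop above could be swapped for a set comparison. The following is only a minimal sketch under some assumptions: the column name "items", the sample data, and the helper find_subsets are all hypothetical, the index labels are assumed to be unique, and the list elements are assumed to be hashable. It converts each list to a set once, then uses Python's `<=` operator on sets, which tests "is a subset of".

```python
import pandas as pd

# Hypothetical sample data; the column name "items" is an assumption.
df = pd.DataFrame({"items": [[1, 2, 3], [1, 2], [4], [2, 3], [4, 5]]})


def find_subsets(frame, column):
    """Map each row label to the labels of the other rows whose list is a subset of it."""
    # Convert every list to a set once, so it is not rebuilt inside the loops.
    sets = frame[column].map(set)
    result = {}
    for label, current in sets.items():
        result[label] = [
            other_label
            for other_label, other in sets.items()
            # Skip the row itself; `other <= current` is the subset test.
            if other_label != label and other <= current
        ]
    return result


subsets = find_subsets(df, "items")

# Print every (subset, superset) pair that was found.
for label, matches in subsets.items():
    for match in matches:
        print(f"row {match} {df.at[match, 'items']} is a subset of "
              f"row {label} {df.at[label, 'items']}")
```

Note that with `<=` two rows holding the same elements count as subsets of each other, and an empty list counts as a subset of every row; use `<` instead if only proper subsets are wanted. The comparison is pairwise, so the running time grows quadratically with the number of rows.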