Comparing two pandas DataFrames of different sizes

Question

I want to compare two DataFrames that contain only 1s and 0s. I use nested for loops to check every element. Where a cell in out is 1 and the cell with the same labels in df is also 1, I want to replace it in out with the letter d. Where the two cells differ, I want to write the letter i into out. This code is too slow. How can I make it more efficient? Note that df is 420x420 and out is 410x410.

a1=out.columns.values
a2=df.columns.values
b1=out.index.values
b2=df.index.values

for a in a1:
    for b in b1:
        for c in a2:
            for d in b2:
                if a == c and b == d:
                    if out.loc[b,a] == 1 and df.loc[d,c]==1:
                        out.loc[b,a] = "d"
                    elif out.loc[b,a] != df.loc[d,c]:
                        out.loc[d,c] = "i"
                else:
                    pass

A small example for better understanding: Dataframe df

1	2	3	4
1	0	1	1
2	1	0	0
3	1	0	0
4	0	0	0

Dataframe out

1	2	3	4
1	0	1	1
2	1	0	1
3	1	1	0
4	0	0	0

The resulting DataFrame out should look like this:

1	2	3	4
1	0	d	d
2	d	0	i
3	d	i	0
4	0	0	0

Answer 1

I created your DataFrames like this:

import numpy as np
import pandas as pd

# df creation
data1 = [
    [1, 0, 1, 1],
    [2, 1, 0, 0],
    [3, 1, 0, 0],
    [4, 0, 0, 0]
]

df = pd.DataFrame(data1, columns=[1, 2, 3, 4])

1	2	3	4
1	0	1	1
2	1	0	0
3	1	0	0
4	0	0	0

# df_out creation
data2 = [
    [1, 0, 1, 1],
    [2, 1, 0, 1],
    [3, 1, 1, 0],
    [4, 0, 0, 0]
]

df_out = pd.DataFrame(data2, columns=[1, 2, 3, 4])

1	2	3	4
1	0	1	1
2	1	0	1
3	1	1	0
4	0	0	0


# Then I used np.where on all columns that the two DataFrames share.
intersected_columns = set(df.columns).intersection(df_out.columns)

for col in intersected_columns:
    if col != 1:  # I assume the first column is the index
        df_out[col] = np.where(
            (df[col] == 1) & (df_out[col] == 1),  # first condition
            "d",  # used where the first condition is true
            np.where(  # otherwise, apply the second condition
                df[col] != df_out[col],
                "i",
                df_out[col],
            ),
        )

The output looks like this:
|   1 | 2   | 3   | 4   |
|----:|:----|:----|:----|
|   1 | 0   | d   | d   |
|   2 | d   | 0   | i   |
|   3 | d   | i   | 0   |
|   4 | 0   | 0   | 0   |
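Since the question mentions frames of different sizes (420x420 and 410x410), here is a minimal sketch of a loop-free variant that only touches the row and column labels present in both frames. It rests on a few assumptions that the question does not confirm: the frames are matched by label rather than by position, the labels are unique, and cells of out whose labels do not exist in df should stay unchanged. The function name mark_matches and the sample labels are made up for illustration.

```python
import numpy as np
import pandas as pd


def mark_matches(out: pd.DataFrame, df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `out` with 'd'/'i' markers on the labels shared with `df`."""
    # Labels that exist in both frames; all other cells of `out` are left as they are.
    rows = out.index.intersection(df.index)
    cols = out.columns.intersection(df.columns)

    # Same rows/columns in the same order, as plain NumPy arrays.
    a = out.loc[rows, cols].to_numpy()
    b = df.loc[rows, cols].to_numpy()

    # Object dtype so that a cell can hold either a number or a letter.
    block = a.astype(object)
    block[(a == 1) & (b == 1)] = "d"  # 1 in both frames
    block[a != b] = "i"               # values differ

    result = out.astype(object)       # astype returns a copy, `out` is not modified
    result.loc[rows, cols] = block
    return result


# Hypothetical sample data: `df` has one extra row (5) and one extra column ("e").
df = pd.DataFrame(
    [[0, 1, 1, 1], [1, 0, 0, 1], [1, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1]],
    index=[1, 2, 3, 4, 5],
    columns=["a", "b", "c", "e"],
)
out = pd.DataFrame(
    [[0, 1, 1], [1, 0, 1], [1, 1, 0], [0, 0, 0]],
    index=[1, 2, 3, 4],
    columns=["a", "b", "c"],
)

result = mark_matches(out, df)
print(result)
```

Working on the whole shared block at once avoids both the nested loops from the question and the per-column loop, and the untouched zeros keep their numeric type instead of being converted to strings.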

solution1
0 2022-05-09 11:05:46
