Cartesian product of two DataFrames in Python

Question

I have two DataFrames. How can I remove the duplicate rows from their Cartesian product?

**DF1:**

    Index    Name
    0        xyz
    1        abc
    2        def

**DF2:**

    Index    Name
    0        xyz
    1        abc
    2        xyz

**Expected Output**

    (0,0),**(0,2)**
    (1,1)

I want to pair only the indexes whose Name values are the same, but I don't want to display repeated combinations. That is, in the Cartesian product, the pairs (0,2) and (2,0) give me the same result, so I want to show only one of them.

Updated:

I already have the Cartesian DataFrame as input, which is (0,0),(0,2),(1,1),(2,0).

From this input DataFrame, I want to remove the duplicate (2,0). The DataFrame has around 100 rows, so I want to loop through it as well.

Answer 1

Assuming df1 and df2 each have a single column "Name", that "Index" is the index, and that you want a list of tuples of the matching indexes, as shown in the question, you can do:

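    import pandas as pd

    df1 = pd.DataFrame({'Name': ['xyz', 'abc', 'def']})
    df2 = pd.DataFrame({'Name': ['xyz', 'abc', 'xyz']})
    # reset_index() turns each index into a column named 'index';
    # the merge then suffixes them as 'index_x' (df1) and 'index_y' (df2)
    df3 = df1.reset_index().merge(df2.reset_index(), on='Name', how='inner')
    list_of_tuples = [tuple(item) for item in df3[['index_x', 'index_y']].values]
    list_of_tuples
    # OUTPUT: [(0, 0), (0, 2), (1, 1)]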

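If "Index" is a regular column rather than the index, just drop the reset_index() calls. The merged columns are then named 'Index_x' and 'Index_y' instead of 'index_x' and 'index_y'.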
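For the updated question, where the pairs already exist as a DataFrame, one possible approach is to sort each pair so that (2,0) and (0,2) look the same, then keep only the first occurrence. This is a minimal sketch, not a definitive solution: the column names `i` and `j` and the variable names are assumptions, since the question does not show the input's column names.

```python
import numpy as np
import pandas as pd

# Hypothetical input: the pairs listed in the question's update.
# The column names 'i' and 'j' are assumed for illustration.
pairs = pd.DataFrame({'i': [0, 0, 1, 2], 'j': [0, 2, 1, 0]})

# Sort the two values within each row, so (2, 0) becomes (0, 2).
ordered = np.sort(pairs[['i', 'j']].to_numpy(), axis=1)

# Flag rows whose sorted pair has already appeared in an earlier row.
is_repeat = pd.DataFrame(ordered, index=pairs.index).duplicated()

# Keep the first occurrence of each unordered pair, in original order.
unique_pairs = pairs[~is_repeat]

result = list(unique_pairs.itertuples(index=False, name=None))
print(result)
```

With the sample input this keeps (0,0), (0,2) and (1,1) and drops (2,0). No explicit Python loop over the rows is needed, so the same code should work unchanged for a DataFrame of around 100 rows.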

1 answers

solution1
1 2019-12-02 06:49:07
