I have two dataframes. How can I remove the duplicate rows from their Cartesian product?
**DF1:**
Index Name
0 xyz
1 abc
2 def
**DF2:**
Index Name
0 xyz
1 abc
2 xyz
**Expected Output**
(0,0),**(0,2)**
(1,1)
I want to combine only the indexes whose Name values are the same, but I don't want to display repeated combinations. That is, in the Cartesian product, (0,2) and (2,0) give the same result, so I want to show only one of them.
Update:
I already have the Cartesian dataframe as input, containing (0,0), (0,2), (1,1), (2,0).
From this input dataframe, I want to remove the duplicate (2,0). The dataframe has around 100 rows, so I want to loop through them as well.
Assuming df1 and df2 have a single column "Name" and that "Index" is the index, and that you want a list of tuples with the matching indexes, as they appear in the question, you can do:
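import pandas as pd

df1 = pd.DataFrame({'Name': ['xyz', 'abc', 'def']})
df2 = pd.DataFrame({'Name': ['xyz', 'abc', 'xyz']})
# reset_index() turns each index into an 'index' column; the merge suffixes them _x and _y
df3 = df1.reset_index().merge(df2.reset_index(), on='Name', how='inner')
# .tolist() converts the NumPy integers to plain Python ints
list_of_tuples = [tuple(item) for item in df3[['index_x', 'index_y']].values.tolist()]
list_of_tuples
# OUTPUT: [(0, 0), (0, 2), (1, 1)]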
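And if "Index" is a regular column rather than the index, just drop the reset_index() calls; the merged columns are then named Index_x and Index_y instead of index_x and index_y.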
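For the updated question, where the Cartesian pairs already exist and only the mirrored duplicate (2,0) needs to be removed, one possible approach is to sort each pair and drop the repeats. This is a minimal sketch, assuming the pairs are stored in a dataframe with two integer columns; the names `pairs`, `i` and `j` are hypothetical and not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical input: the Cartesian pairs listed in the question's update.
pairs = pd.DataFrame({'i': [0, 0, 1, 2], 'j': [0, 2, 1, 0]})

# Sort the two values within each row, so (2, 0) and (0, 2) share the key (0, 2).
key = pd.DataFrame(np.sort(pairs[['i', 'j']].to_numpy(), axis=1), index=pairs.index)

# Keep only the first row seen for each unordered pair.
deduped = pairs[~key.duplicated()]

unique_pairs = list(zip(deduped['i'], deduped['j']))
print(unique_pairs)
```

This works on the whole dataframe at once, so no explicit loop over the roughly 100 rows should be needed.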