Pandas dataframe join on all elements

Question

I am trying to join two dataframes in pandas. The first dataframe has a column of URLs that contains duplicates (the same values repeating).

The second dataframe contains some properties of those URLs, but each URL appears only once, with no duplicates.

I want to map those properties back onto the first dataframe, so that every occurrence of a URL gets its properties.

Example: Dataframe1:

Dataframe2:

ResultDataframe:

How can this be achieved? Which join, concatenate, or merge method should I use to combine the dataframes across all rows?

The dataframes above are just an example; the actual second dataframe has 300+ unique URLs, and the first dataframe has 10000+ rows.

I have tried an inner join and an outer join, but neither works.

Answer 1

Here's a working example that should be directly applicable.

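import pandas as pd

# Left frame: the key column "A" contains repeated values
df = pd.DataFrame(list(zip([1, 2, 3, 2, 3, 1], [7, 8, 9, 10, 11, 12])), columns=["A", "B"])
print(df)

# Right frame: one row per unique key, with its property in "X"
df2 = pd.DataFrame(list(zip([1, 2, 3], ["foo", "baz", "bar"])), columns=["A", "X"])
print(df2)

# Index the right frame by the key, then join it onto the left frame's "A" column.
# join defaults to a left join, so all six rows of df are kept.
df3 = df.join(df2.set_index("A"), on="A")
print(df3)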

In your case you would use something like dataframe1.join(dataframe2.set_index("url"), on="url")
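The same lookup can also be written with merge, which may be easier to read when the key is an ordinary column in both frames. This is only a sketch: the column names "url", "visits" and "category" and all the values are made up for illustration, since the question's real tables are not shown.

```python
import pandas as pd

# Hypothetical data: the column names and values are assumptions for illustration.
visits = pd.DataFrame({
    "url": ["a.com", "b.com", "a.com", "c.com", "b.com"],
    "visits": [10, 20, 30, 40, 50],
})
props = pd.DataFrame({
    "url": ["a.com", "b.com", "c.com"],
    "category": ["news", "shop", "blog"],
})

# how="left" keeps every row of `visits`, duplicates included.
# validate="many_to_one" raises a MergeError if `props` has duplicate URLs.
result = visits.merge(props, on="url", how="left", validate="many_to_one")
print(result)
```

If the property columns come back as NaN, the keys probably do not match exactly (for example trailing whitespace or different letter case), so it is worth comparing the two "url" columns before joining.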
