
Merging two pandas dataframes many-to-one

How do I merge the following datasets:

df = A
date abc
1    a
1    b
1    c
2    d
2    dd
3    ee
3    df

df = B
date ZZZ
1    a
2    b
3    c

I want to get something like this:

date abc  ZZZ
1    a     a
1    b     a
1    c     a
2    d     b
2    dd    b
3    ee    c
3    df    c

I tried this code:

aa = pd.merge(A, B, left_on="date", right_on="date", how="left", validate="m:1")

But I get the following error:

TypeError: merge() got an unexpected keyword argument 'validate'

I updated pandas (with conda update pandas), but I still get the same error.

Please advise me on this issue.

According to the df.merge docs, the validate argument was added in pandas 0.21.0. The TypeError means the pandas version that actually runs your code is older than that, so you need to upgrade it.
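Since the error persisted after conda update pandas, it may help to confirm which pandas the interpreter really imports. The snippet below is a minimal sketch: supports_validate is a name made up for illustration, and the idea that the update went into a different environment than the one running the script is only a guess, not something the question confirms.

```python
import pandas as pd

# Show the version and install location of the pandas that is actually imported.
print(pd.__version__)
print(pd.__file__)

# Compare the leading "major.minor" part of the version against 0.21,
# the release in which merge() gained the validate argument.
major, minor = (int(part) for part in pd.__version__.split(".")[:2])
supports_validate = (major, minor) >= (0, 21)  # hypothetical helper name
print(supports_validate)
```

If the printed version is below 0.21.0, or the path points to an unexpected environment, the update probably did not reach the interpreter you are using.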

As @DeepSpace mentioned, you may need to upgrade pandas.

To replicate the check in earlier versions, you can do something like this:

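import pandas as pd

# Many-to-one case: every key in df2 (the "one" side) appears only once.
df1 = pd.DataFrame(index=['a', 'a', 'b', 'b', 'c'])
df2 = pd.DataFrame(index=['a', 'b', 'c'])

left_keys = set(df1.index)  # build the lookup set once
x = [i for i in df2.index if i in left_keys]  # df2 keys that will be matched
len(x) == len(set(x))  # True


# Violation: 'a' is duplicated in df2, so the merge is no longer many-to-one.
df1 = pd.DataFrame(index=['a', 'a', 'b', 'b', 'c'])
df2 = pd.DataFrame(index=['a', 'b', 'c', 'a'])

left_keys = set(df1.index)
y = [i for i in df2.index if i in left_keys]
len(y) == len(set(y))  # False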
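Applied to the data from the question, the same idea could look like the sketch below. The frames A and B are rebuilt by hand from the tables in the question, and the is_unique check is a substitute for validate="m:1", not the code pandas itself runs. Note that it looks at every key in B, whereas the list-based check above only looks at keys that also occur in the left frame.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
A = pd.DataFrame({"date": [1, 1, 1, 2, 2, 3, 3],
                  "abc": ["a", "b", "c", "d", "dd", "ee", "df"]})
B = pd.DataFrame({"date": [1, 2, 3], "ZZZ": ["a", "b", "c"]})

# Manual stand-in for validate="m:1": keys on the right side must be unique.
if not B["date"].is_unique:
    raise ValueError("merge keys are not unique in the right dataset")

# Left merge: every row of A receives the single matching ZZZ value from B.
aa = pd.merge(A, B, on="date", how="left")
print(aa)
```

This works without the validate argument, so it should also run on pandas versions older than 0.21.0.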
