
Merging a pandas DataFrame with a Series

I have this df:

             cnpj
0  33062217000185
1  82645144000160

I run a function that creates two different Series:

for i in df.cnpj:
    s = peer_comparison(i)
    df = df.merge(s.to_frame().T, how='left', on='cnpj')

In the first iteration of the for loop, the output Series looks like this:

s (first round):

A                                  N/A
B                                  N/A
C                                  N/A
cnpj                    33062217000185

The merged dataframe looks like this:

             cnpj   A       B     C
0  33062217000185   N/A   N/A   N/A 
1  82645144000160   NaN   NaN   NaN 

In the second iteration, the Series looks like this:

s (second round):

A                                  N/A
B                                  N/A
C                                  N/A
cnpj                    82645144000160

But the merge result gets messy, with duplicated columns:

             cnpj   A_x   B_x  C_x  A_y  B_y  C_y
0  33062217000185   N/A   N/A  N/A  NaN  NaN  NaN
1  82645144000160   NaN   NaN  NaN  N/A  N/A  N/A

If I try to change the merging using df.merge(s.to_frame().T.astype({'cnpj' : 'int'}), how='left',on='cnpj').fillna('') I get the following error:

ValueError: entry not a 2- or 3- tuple

Could anyone help?

Setup

df = pd.DataFrame({'cnpj': [33062217000185, 82645144000160]})
print(df)
             cnpj
0  33062217000185
1  82645144000160

s = pd.Series(['N/A', 'N/A', 'N/A', 33062217000185], index=['A', 'B', 'C', 'cnpj'])
print(s)
A                  N/A
B                  N/A
C                  N/A
cnpj    33062217000185
dtype: object

Use df.merge, converting s to a DataFrame and transposing it in the process.

df.merge(s.to_frame().T\
      .astype({'cnpj' : 'int'}), how='left').fillna('')
             cnpj    A    B    C
0  33062217000185  N/A  N/A  N/A
1  82645144000160  
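
The single merge above handles one Series. In the loop from the question, the _x/_y columns most likely appear because df already has columns A, B and C by the second iteration, and merge adds suffixes to overlapping non-key columns. One way around that, sketched below, is to collect every Series into one frame first and merge only once. The peer_comparison defined here is a hypothetical stand-in, since the real function is not shown in the question.

```python
import pandas as pd

def peer_comparison(cnpj):
    # Hypothetical stand-in: the real function is not shown in the question.
    return pd.Series(['N/A', 'N/A', 'N/A', cnpj], index=['A', 'B', 'C', 'cnpj'])

df = pd.DataFrame({'cnpj': [33062217000185, 82645144000160]})

# Build one Series per cnpj, then stack them as rows of a single frame.
rows = [peer_comparison(c) for c in df['cnpj']]
peers = pd.DataFrame(rows).astype({'cnpj': 'int64'})

# Merge once: A, B and C exist only on the right side, so no suffixes are added.
merged = df.merge(peers, how='left', on='cnpj')
print(merged)
```

Casting cnpj to int64 keeps the key dtype the same on both sides of the merge, as in the astype call above.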

Following some of @COLDSPEED's tips and using concat instead of merge or join, it finally worked.

peers=peer_comparison(df.cnpj[0])
for i in df.cnpj[1:]:
    peers2=peer_comparison(i,base_year)
    peers=pd.concat([peers,peers2],axis=1)

df=peers.T
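
For reference, the same concat idea can be written as a self-contained sketch that builds all the Series first and concatenates them in a single call. As before, peer_comparison is a hypothetical stand-in for the real function, and the single-argument signature is an assumption.

```python
import pandas as pd

def peer_comparison(cnpj):
    # Hypothetical stand-in: the real function and its arguments are not shown.
    return pd.Series(['N/A', 'N/A', 'N/A', cnpj], index=['A', 'B', 'C', 'cnpj'])

df = pd.DataFrame({'cnpj': [33062217000185, 82645144000160]})

# Concatenate all Series side by side (one column per cnpj) in one call.
peers = pd.concat([peer_comparison(c) for c in df['cnpj']], axis=1)

# Transpose so each Series becomes a row, and reset to a clean 0..n-1 index.
result = peers.T.reset_index(drop=True)
print(result)
```

Calling concat once on a list avoids re-copying the growing frame on every loop iteration.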


 