
Pandas inner join with axis=1

I was working with an inner join using concat in pandas, with the two DataFrames below:

import pandas as pd

df1 = pd.DataFrame([['a',1],['b',2]], columns=['letter','number'])

df3 = pd.DataFrame([['c',3,'cat'],['d',4,'dog']],
                    columns=['letter','number','animal'])

pd.concat([df1,df3], join='inner')

The output is below:

  letter  number
0      a       1
1      b       2
0      c       3
1      d       4

But with axis=1, the output is as below:

pd.concat([df1,df3], join='inner', axis=1)

  letter  number letter  number animal
0      a       1      c       3    cat
1      b       2      d       4    dog

Why is it showing the animal column when doing an inner join with axis=1?

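In pandas.concat(), the axis argument sets the direction in which the DataFrames are concatenated, and join is applied to the labels on the other axis.

axis=0 // stack rows (default value); join is applied to the column labels
axis=1 // place DataFrames side by side; join is applied to the index (row) labels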

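When you concatenated df1 and df3 with the default axis=0, pandas stacked the rows, and the inner join kept only the columns present in both DataFrames (letter and number). Thus the output is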

  letter  number
0      a       1
1      b       2
0      c       3
1      d       4

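But when you used axis=1, pandas placed the DataFrames side by side and applied the inner join to the index, not to the columns. Both DataFrames have the index labels 0 and 1, so every row and every column is kept. That's why the output is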

  letter  number letter  number animal
0      a       1      c       3    cat
1      b       2      d       4    dog
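As a minimal sketch of the point above, the snippet below rebuilds df1 and df3 from the question and checks which labels survive the inner join for each axis. The variable names stacked and side_by_side are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([['a', 1], ['b', 2]], columns=['letter', 'number'])
df3 = pd.DataFrame([['c', 3, 'cat'], ['d', 4, 'dog']],
                   columns=['letter', 'number', 'animal'])

# axis=0: rows are stacked, so join='inner' keeps only the shared column labels.
stacked = pd.concat([df1, df3], join='inner', axis=0)
print(list(stacked.columns))

# axis=1: frames sit side by side, so join='inner' keeps only the shared index labels.
# Column labels are not compared at all, which is why 'animal' is still there.
side_by_side = pd.concat([df1, df3], join='inner', axis=1)
print(list(side_by_side.columns))
print(list(side_by_side.index))
```

With axis=1 the repeated letter and number labels are kept as duplicate column names, since the join never looks at them.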

EDIT:

You asked: "But inner join only joins the same columns, right? Then why is it showing the 'animal' column?"

Because with axis=1 the join works on the index only, and right now both DataFrames have the same 2 index labels (0 and 1), so nothing is dropped.

To illustrate, I have added another row to df3. Let's suppose df3 is

   0  1     2
0  c  3   cat
1  d  4   dog
2  e  5  bird

Now, if you concat df1 and df3:

pd.concat([df1,df3], join='inner', axis=1)

  letter  number  0  1    2
0      a       1  c  3  cat
1      b       2  d  4  dog

pd.concat([df1,df3], join='outer', axis=1)

  letter  number  0  1     2
0      a     1.0  c  3   cat
1      b     2.0  d  4   dog
2    NaN     NaN  e  5  bird

As you can see, with the inner join only indexes 0 and 1 are in the output, but with the outer join all the indexes are in the output, with NaN filled in where df1 has no matching row.
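The three-row example can be reproduced with a short sketch like the one below. It assumes df3 is built without column names, which is what the 0, 1, 2 headers in the printed frame suggest. The names inner and outer are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([['a', 1], ['b', 2]], columns=['letter', 'number'])

# Assumption: df3 has default integer column labels (0, 1, 2) and three rows.
df3 = pd.DataFrame([['c', 3, 'cat'], ['d', 4, 'dog'], ['e', 5, 'bird']])

# Inner join on the index: only labels present in both frames (0 and 1) remain.
inner = pd.concat([df1, df3], join='inner', axis=1)
print(inner)

# Outer join on the index: label 2 is kept and df1's columns are NaN there.
outer = pd.concat([df1, df3], join='outer', axis=1)
print(outer)
```

The number column turns into floats in the outer result because NaN cannot be stored in an integer column.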

The default value of axis is 0, so in the first concat call the concatenation happens along the rows. When you set axis=1, the operation is similar to

    df1.merge( df3, how="inner", left_index=True, right_index=True)
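A small sketch comparing the two calls on the original two-row DataFrames may help show what "similar" means here. The values match, but merge renames overlapping columns with its default _x and _y suffixes, while concat keeps the duplicate names. The names merged and concatenated are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([['a', 1], ['b', 2]], columns=['letter', 'number'])
df3 = pd.DataFrame([['c', 3, 'cat'], ['d', 4, 'dog']],
                   columns=['letter', 'number', 'animal'])

# Index-on-index inner merge; overlapping column names get the _x / _y suffixes.
merged = df1.merge(df3, how='inner', left_index=True, right_index=True)
print(list(merged.columns))

# concat with axis=1 aligns on the index too, but leaves column names untouched.
concatenated = pd.concat([df1, df3], join='inner', axis=1)
print(list(concatenated.columns))

# Same rows and values in both results; only the column labels differ.
print(merged.values.tolist() == concatenated.values.tolist())
```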
