
Join two DataFrames in pandas based on a third DataFrame

I have three pandas DataFrames with different dimensions. The first frame is as follows:

df1.head(4)

col1   col2   col3
  a      b     c
  d      e     f
  g      h     i
  j      k     l

The second frame is as follows:

df2.head(4)

col4   col5   col6
  m      n      o
  p      q      r
  s      t      u
  v      w      x

The third DataFrame holds combinations of col3 from df1 and col6 from df2. It looks like this:

df3.head(3)

col3    col6
  c       r
  i       u
  f       x

Now I want to combine all three DataFrames based on the combinations in df3's columns col3 and col6. The resulting DataFrame should look like this:

final_df.head(3)

col1    col2    col4    col5    col3    col6
  a      b       p        q       c       r
  g      h       s        t       i       u
  d      e       v        w       f       x

I have tried the following code

import pandas as pd

df4 = pd.merge(df1, df3, on='col3')
final_df = pd.merge(df4, df2, on='col6')

but got the following memory error:

MemoryError: Unable to allocate 1.79 GiB for an array with shape (2, 120193432) and data type int64
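
For reference, the size in the error message can be reproduced by hand: an int64 value takes 8 bytes, so an array of shape (2, 120193432) needs 2 × 120193432 × 8 bytes. A quick check (the variable names are only illustrative):

```python
# Shape and dtype are taken from the MemoryError message above.
n_rows, n_cols = 2, 120193432
bytes_per_int64 = 8

total_bytes = n_rows * n_cols * bytes_per_int64
total_gib = total_bytes / 2**30  # 1 GiB = 2**30 bytes

print(f"{total_gib:.2f} GiB")  # 1.79 GiB
```

An array that long suggests the merge is trying to build something on the order of 120 million rows. That often happens when the join keys are not unique, but this is an assumption about the real data; the sample frames do not show it.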

Is there a more efficient way to do this?

The above works fine on my side without a memory error. I am running a computer with 8 GB of RAM and 32-bit Python.
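
A minimal reproduction, assuming the real frames look like the samples in the question (the four- and three-row frames below are copied from it, nothing more):

```python
import pandas as pd

# Sample frames copied from the question.
df1 = pd.DataFrame({"col1": list("adgj"), "col2": list("behk"), "col3": list("cfil")})
df2 = pd.DataFrame({"col4": list("mpsv"), "col5": list("nqtw"), "col6": list("orux")})
df3 = pd.DataFrame({"col3": list("cif"), "col6": list("rux")})

# Same two-step merge as in the question.
df4 = pd.merge(df1, df3, on="col3")
final_df = pd.merge(df4, df2, on="col6")

# Reorder the columns to match the layout the question asks for.
final_df = final_df[["col1", "col2", "col4", "col5", "col3", "col6"]]
print(final_df)
```

On this data the result has three rows, one per combination in df3. The rows follow the order of df1 rather than df3, because an inner merge keeps the order of the left frame's keys.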

  • Give your computer more memory to work with
  • Check for and close apps with high memory (RAM) usage (mostly Chrome tabs)
  • Limit the number of rows if your real DataFrames are larger than the samples shown above
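
If the real DataFrames are much larger than the samples, it may also be worth checking whether the join keys repeat, since every repeated key multiplies the number of output rows. The following is a rough sketch under that assumption (the question does not confirm it); `estimate_merge_rows` and the toy frames are hypothetical names used only for illustration:

```python
import pandas as pd

def estimate_merge_rows(left: pd.DataFrame, right: pd.DataFrame, key: str) -> int:
    """Row count an inner merge on `key` would produce, without doing the merge."""
    left_counts = left[key].value_counts()
    right_counts = right[key].value_counts()
    # For each key present in both frames, the merge emits
    # (occurrences on the left) * (occurrences on the right) rows.
    # Note: value_counts() drops NaN keys, whereas merge matches NaN with NaN.
    common = left_counts.index.intersection(right_counts.index)
    return int((left_counts.loc[common] * right_counts.loc[common]).sum())

# Hypothetical toy data with repeated keys, to show the blow-up.
left = pd.DataFrame({"col3": ["c", "c", "c", "f"], "col1": [1, 2, 3, 4]})
right = pd.DataFrame({"col3": ["c", "c", "f"], "col6": ["r", "u", "x"]})

estimated = estimate_merge_rows(left, right, "col3")
actual = len(left.merge(right, on="col3"))
print(estimated, actual)  # 3*2 + 1*1 = 7 rows from 4 and 3 input rows

# merge() can also refuse to run when the right-hand keys are not unique.
try:
    left.merge(right, on="col3", validate="many_to_one")
except pd.errors.MergeError as exc:
    print("right-hand keys are not unique:", exc)
```

If the estimate is far larger than either input, removing unintended duplicates (for example with `drop_duplicates()`) before merging is usually cheaper than adding memory.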
