
updating a data frame using the values of another data frame - python

Let's say I have two data frames, df1 and df2:

import numpy as np
import pandas as pd

c1 = np.repeat(['a', 'b'], [8, 8], axis=0)
c2 = list('xxxxyyyyxxxxyyyy')
c3 = ['G1','G1','G2','G2','G1','G1','G2','G2','G1','G1','G2','G2','G1','G1','G2','G2']
c4 = [1, 2] * 8
val1 = np.random.rand(16)  # random values, so the numbers differ on every run
df1 = pd.DataFrame({'c1': c1, 'c2': c2, 'c3': c3, 'c4': c4, 'val': val1})

df2 = pd.DataFrame({'c1':['a','b','a','b'],'c2':['x','x','y','y'],'val2':[100,90,221,92]})

How can I use df2 to add a column to df1 containing the val2 value that matches each row's c1 and c2? The output should look like:

   c1   c2   c3   c4   val    val2
0  a    x    G1   1    0.67   100
1  a    x    G1   2    0.36   100
2  a    x    G2   1    0.12   100
3  a    x    G2   2    0.31   100
4  a    y    G1   1    0.56   221
5  a    y    G1   2    0.92   221
6  a    y    G2   1    0.62   221
7  a    y    G2   2    0.99   221
8  b    x    G1   1    0.73   90
9  b    x    G1   2    0.56   90
10 b    x    G2   1    0.22   90
11 b    x    G2   2    0.91   90
12 b    y    G1   1    0.34   92
13 b    y    G1   2    0.39   92
14 b    y    G2   1    0.78   92
15 b    y    G2   2    0.63   92

I think you can use merge, joining on the two columns the frames share (c1 and c2):

print(pd.merge(df1, df2, on=['c1', 'c2']))
   c1 c2  c3  c4       val  val2
0   a  x  G1   1  0.600033   100
1   a  x  G1   2  0.929101   100
2   a  x  G2   1  0.311034   100
3   a  x  G2   2  0.341437   100
4   a  y  G1   1  0.512890   221
5   a  y  G1   2  0.124317   221
6   a  y  G2   1  0.428409   221
7   a  y  G2   2  0.047169   221
8   b  x  G1   1  0.485116    90
9   b  x  G1   2  0.960812    90
10  b  x  G2   1  0.347445    90
11  b  x  G2   2  0.490705    90
12  b  y  G1   1  0.273342    92
13  b  y  G1   2  0.784263    92
14  b  y  G2   1  0.805600    92
15  b  y  G2   2  0.057058    92
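
One caveat with the call above: merge defaults to an inner join, so any df1 row whose (c1, c2) pair is missing from df2 would be dropped. The following is a minimal sketch of a more defensive variant, not part of the original answer. It uses a smaller stand-in for df1 with deterministic values (an assumption made so the result is reproducible) and relies on the how and validate parameters of DataFrame.merge.

```python
import numpy as np
import pandas as pd

# Simplified stand-in for df1: same keys, deterministic values instead of random ones.
df1 = pd.DataFrame({
    'c1': np.repeat(['a', 'b'], 8),
    'c2': list('xxxxyyyy') * 2,
    'val': np.arange(16, dtype=float),
})
df2 = pd.DataFrame({
    'c1': ['a', 'b', 'a', 'b'],
    'c2': ['x', 'x', 'y', 'y'],
    'val2': [100, 90, 221, 92],
})

# how='left' keeps every row of df1 in its original order; a (c1, c2) pair
# with no match in df2 would get NaN in val2 instead of being dropped.
# validate='many_to_one' raises MergeError if df2 contains duplicate keys,
# which would otherwise silently multiply rows.
result = df1.merge(df2, on=['c1', 'c2'], how='left', validate='many_to_one')
print(result)
```

To keep working with the name df1, assign the result back: df1 = df1.merge(...). merge returns a new frame and does not modify df1 in place.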
