
Creating a new data frame by extracting columns from one data frame based on the value of a column in another data frame

I have a data frame df1: [image: enter image description here]

I have another data frame df2: [image: enter image description here]

What I want is to merge these two in such a way that I get df3, as below:

[image: enter image description here]

I tried the below

df3 <- cbind(
  df1,
  t(df2[,df1$Pool])
)

The output that I get is:

[image: enter image description here]

I tried to generate some data frames that have the same structure as the one you put in the images, but I cannot reproduce the behavior you have.

For instance:

df1=data.frame(cus_id=sample(1:100, 6), pool=c("risk_7", "risk_5", "risk_5", "risk_5", "risk_6", "risk_5"))

df1
  cus_id   pool
1     45 risk_7
2     70 risk_5
3     16 risk_5
4     50 risk_5
5      4 risk_6
6     92 risk_5

Then I generated a matrix of random numbers with the same dimensions as your df2:

the_numbers=matrix(rnorm(7*12),nrow=12)
df2=data.frame(1:12, the_numbers)
colnames(df2)=c("month", paste0("risk_", 1:7) )
head(df2)
  month      risk_1     risk_2      risk_3      risk_4      risk_5     risk_6       risk_7
1     1 -1.59589907  0.1938683 -0.09493059  1.40832914 -0.90416011 -0.3109643 -0.006606488
2     2  0.18909151  0.4302865  0.02042977  0.19644788  0.70734127  0.7787293  0.205113981
3     3  0.08839232  0.2060647 -0.53347739  1.28622444  0.41543447 -0.1872887 -0.145648466
4     4 -0.80375643  0.6664508 -0.42402686 -1.11301225  0.09023515 -1.1364959  1.401697819
5     5 -0.59077651  0.4874996  0.49586008 -0.04683787  0.33829197  1.4111230 -1.869269180
6     6  0.32643525 -0.9703614 -1.30666881  2.21348141  0.42064366 -0.7783622 -1.107047330

Exactly as you did, I used cbind (although I wrapped the transposed result in as.data.frame, otherwise the numbers get cast as characters):

df3=cbind(df1, as.data.frame(t(df2[,df1$pool])))
head(df3)
         cus_id   pool           V1        V2         V3          V4        V5         V6         V7         V8         V9        V10        V11        V12
risk_7       45 risk_7 -0.006606488 0.2051140 -0.1456485  1.40169782 -1.869269 -1.1070473 -0.1663397 -0.8709996  2.7316702 -0.8464295 -0.5886028  0.6179186
risk_5       70 risk_5 -0.904160114 0.7073413  0.4154345  0.09023515  0.338292  0.4206437 -1.6158471  0.8545650  1.6074374 -0.8418230 -0.0808034 -0.6900303
risk_5.1     16 risk_5 -0.904160114 0.7073413  0.4154345  0.09023515  0.338292  0.4206437 -1.6158471  0.8545650  1.6074374 -0.8418230 -0.0808034 -0.6900303
risk_5.2     50 risk_5 -0.904160114 0.7073413  0.4154345  0.09023515  0.338292  0.4206437 -1.6158471  0.8545650  1.6074374 -0.8418230 -0.0808034 -0.6900303
risk_6        4 risk_6 -0.310964343 0.7787293 -0.1872887 -1.13649592  1.411123 -0.7783622  1.8605113 -0.3938183 -0.7335341  1.0610378  0.1573779 -0.1681913
risk_5.3     92 risk_5 -0.904160114 0.7073413  0.4154345  0.09023515  0.338292  0.4206437 -1.6158471  0.8545650  1.6074374 -0.8418230 -0.0808034 -0.6900303

I believe that there is something wrong with your df2, maybe you have overwritten it?
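
Another thing worth checking, as an assumption not confirmed by the question, is whether df1$pool is a factor rather than a character vector (the default for data.frame() in R versions before 4.0). Indexing with a factor may select columns by its integer codes instead of by name. The sketch below uses made-up data with the same shape as above, converts the pool column with as.character() before indexing, and tidies the result; the month_ column names and the seed are illustrative choices, not part of the original data.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Made-up data; pool is deliberately a factor to mimic older R defaults.
df1 <- data.frame(cus_id = c(45, 70, 16),
                  pool = factor(c("risk_7", "risk_5", "risk_5")))
df2 <- data.frame(month = 1:12, matrix(rnorm(7 * 12), nrow = 12))
colnames(df2) <- c("month", paste0("risk_", 1:7))

# Select columns by name: as.character() avoids indexing by factor codes.
pool_cols <- as.character(df1$pool)
monthly <- as.data.frame(t(df2[, pool_cols]))
colnames(monthly) <- paste0("month_", df2$month)  # hypothetical column names

df3 <- cbind(df1, monthly)
rownames(df3) <- NULL  # drop row names such as risk_5.1
```

Each row of df3 then holds the customer, its pool, and the twelve monthly values taken from the matching risk_ column of df2.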
