Select only the matching columns in 2 different data frames in R

Question

I have 106 columns in the first DF and 97 in the second, and I want to merge the two. For this I need identical columns in both DFs.

So how can I achieve the requirements listed below?

DF1: column names are A, B, C & D
DF2: column names are A, B & E

Can I select the following combinations of columns from the data frames?

1) Match in both, i.e. A & B
2) Extras in the second, i.e. E
3) Extras in the first, i.e. C & D

I tried different approaches, such as select() in dplyr with colnames(df1) == colnames(df2), and several other possibilities, but without any success.

Below are the column names of Dataframe 1:

  [1] "ï..Lan.ID"                 "NBFC"                      "Application.ID"
  [4] "Region"                    "Loan.City"                 "Loan.Type"                
  [7] "Loan.Scheme"               "Name"                      "Mobile.Number"            
 [10] "Loan.Status"               "Principal.Outstanding"     "Last.EMI"                 
 [13] "Next.EMI"                  "Next.Bullet.Month"         "Next.Bullet.Amount"       
 [16] "Sum.Instalment.Posted"     "Dues.Receipts"             "EMI.Due"                  
 [19] "All.Dues"                  "Instalment.Dues"           "Bullets.Overdue"          
 [22] "Loan.Quality"              "Sanctioned.Amount"         "Loan.Amount"              
 [25] "Tenure"                    "Completed.Tenure"          "Tenure.Left"              
 [28] "Personal.Email"            "Official.Email"            "No..Of.Late.Payments"     
 [31] "CRIF.Score"                "CIBIL.Score"               "No.of.Actions"            
 [34] "Fixed.Income"              "ECS.Customer.Name"         "ECS.Bank.Name"            
 [37] "ECS.Account.Number"        "Loan.Date"                 "Sanction.Month"           
 [40] "EMI.Start.Date"            "X1st.EMI.Month"            "End.Date"                 
 [43] "Home.Address"              "Permanent.Address"         "Employer.Name"            
 [46] "Company.MCA.ID"            "Business.Address"          "Reference.Details"        
 [49] "Nature.of.Business"        "Pan.Card"                  "Aadhar.UID"               
 [52] "Gender"                    "Educational.Qualification" "DOB"                      
 [55] "Marital.Status"            "Last.Payment.Date"         "Job.Type"                 
 [58] "Employment.Year"           "Cycle.Date"                "Age"                      
 [61] "relevant_pos"              "crif_active_accounts"      "crif_overdue_amt"         
 [64] "crif_current_outstanding"  "cibil_active_accounts"     "cibil_overdue_amt"        
 [67] "cibil_current_outstanding" "NACH.Status"               "Awarenss.Allocation"      
 [70] "Allocation.Date"           "Awareness.Data"            "Awareness.Brk.up"         
 [73] "Dec.19.EMI.Amount"         "Tenure.End"                "Dec.19.BKt"               
 [76] "DPD"                       "New.DPD"                   "DPD.Range.New"            
 [79] "New.Amount.Due"            "New.Total.Due"             "Loan.Slabs"               
 [82] "Last.Month.Bnc"            "X1st.EMI"                  "Dec.19.Bnc"               
 [85] "Dec.19.Non.Starter"        "Reason.of.Bnc"             "HNI"                      
 [88] "EMI.Due.1"                 "OS"                        "Advance.Paid"             
 [91] "Paid.Unpaid"               "Not.Allocated"             "Excess"                   
 [94] "CC.Take.Over...OD"         "Last.Month.delinq"         "Loan.Status.1"            
 [97] "CIBIL.Bracket"             "Salary.Bracket"            "DPD.1"                    
[100] "Reason.of.Default"         "Contactibility"            "Delinq"                   
[103] "PayTm.Industry"            "Industry"                  "Employer.Name.1"          
[106] "DELINQ.NON.DELINQ"

Dataframe 2:

 [1] "ï..Lan.ID"                 "NBFC"                      "Application.ID"
 [4] "Region"                    "Loan.City"                 "Loan.Type"                
 [7] "Loan.Scheme"               "Name"                      "Mobile.Number"            
[10] "Loan.Status"               "Principal.Outstanding"     "Last.EMI"                 
[13] "Next.EMI"                  "Next.Bullet.Month"         "Next.Bullet.Amount"       
[16] "Sum.Instalment.Posted"     "Dues.Receipts"             "EMI.Due"                  
[19] "All.Dues"                  "Instalment.Dues"           "Bullets.Overdue"          
[22] "Loan.Quality"              "Sanctioned.Amount"         "Loan.Amount"              
[25] "Tenure"                    "Completed.Tenure"          "Tenure.Left"              
[28] "Personal.Email"            "Official.Email"            "No..Of.Late.Payments"     
[31] "CRIF.Score"                "CIBIL.Score"               "No.of.Actions"            
[34] "Fixed.Income"              "ECS.Customer.Name"         "ECS.Bank.Name"            
[37] "ECS.Account.Number"        "Loan.Date"                 "Sanction.Month"           
[40] "EMI.Start.Date"            "X1st.EMI.Month"            "End.Date"                 
[43] "Home.Details"              "Permanent.Address.Details" "Employer.Name"            
[46] "Company.MCA.ID"            "Business.Details"          "Reference.Details"        
[49] "Nature.of.Business"        "Pan.Card"                  "Aadhar.UID"               
[52] "Gender"                    "Educational.Qualification" "DOB"                      
[55] "Marital.Status"            "Last.Payment.Date"         "Job.Type"                 
[58] "Employment.Year"           "Cycle.Date"                "Age"                      
[61] "relevant_pos"              "crif_active_accounts"      "crif_overdue_amt"         
[64] "crif_current_outstanding"  "cibil_active_accounts"     "cibil_overdue_amt"        
[67] "cibil_current_outstanding" "NACH.status"               "Awarenss.Allocation"      
[70] "Allocation.Date"           "Awareness.Data"            "Awareness.Brk.up"         
[73] "June.19.EMI.Amount"        "Tenure.End"                "June.BKt"                 
[76] "Loan.Slabs"                "Last.Month.Bnc"            "X1st.EMI"                 
[79] "June.19.Bnc"               "June.19.Non.Starter"       "Reason.of.Bnc"            
[82] "HNI"                       "EMI.Due.1"                 "OS"                       
[85] "Advance.Paid"              "PAID.Unpaid"               "Not.Allocated"            
[88] "Excess"                    "DPD"                       "CC.Take.Over"             
[91] "Last.Month.delinq"         "Loan.Status.1"             "CIBIL.Bracket"            
[94] "Salary.Bracket"            "DPD.1"                     "DELINQ.NON.DELINQ"        
[97] "Month"

The expected outcome here would be the names of the matching columns and the names of the unmatched columns in both DFs.

Answer 1

I think Sotos's comment provides the most elegant way to get the output you expect.

However, as an alternative, you can use %in%:

O1 = colnames(dfA)[colnames(dfA) %in% colnames(dfB)]

> O1
[1] "A" "B" "C"
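
Note that %in% compares names exactly, so names in the question's listings that differ only in letter case (for example NACH.Status and NACH.status, or Paid.Unpaid and PAID.Unpaid) would not be reported as matches. A minimal sketch of a case-insensitive comparison, assuming such pairs are meant to be the same column; cols1 and cols2 are hypothetical vectors holding a few names copied from the two listings:

```r
# A small subset of column names taken from the question's listings
cols1 <- c("NACH.Status", "Paid.Unpaid", "DPD", "Home.Address")
cols2 <- c("NACH.status", "PAID.Unpaid", "DPD", "Home.Details")

# Exact matching: only identically spelled names are found
exact <- cols1[cols1 %in% cols2]

# Case-insensitive matching: compare lower-cased copies,
# but return the names as they are spelled in the first vector
loose <- cols1[tolower(cols1) %in% tolower(cols2)]

print(exact)  # [1] "DPD"
print(loose)  # [1] "NACH.Status" "Paid.Unpaid" "DPD"
```

Whether such near-duplicates really describe the same variable is a judgment about the data, so it is worth checking the matched pairs by eye before renaming anything.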

However, your matching conditions 2) and 3) are a little confusing, because when you ask for:

2) Common in both and additional in 2nd, i.e. A, B & E

In my opinion, that corresponds to all the columns in the second dataset (colnames(dfB)).

3) Common in both and extras in first, i.e. A, B, C & D

And this corresponds to all the columns in the first dataset (colnames(dfA)).

Does that make sense to you? Did I miss something in your merging pattern?
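
If what is wanted is only the extras on each side, without the common columns, base R's set functions give all three groups directly. A minimal sketch, assuming column names like those of the example data in this answer (namesA and namesB are stand-ins for colnames(dfA) and colnames(dfB)):

```r
# Stand-ins for colnames(dfA) and colnames(dfB)
namesA <- c("A", "B", "C", "D")
namesB <- c("A", "B", "C", "E")

common <- intersect(namesA, namesB)  # present in both
only_A <- setdiff(namesA, namesB)    # extras in the first
only_B <- setdiff(namesB, namesA)    # extras in the second

print(common)  # [1] "A" "B" "C"
print(only_A)  # [1] "D"
print(only_B)  # [1] "E"
```

intersect() returns the same names as the %in% approach when the column names are unique, and setdiff() covers the two "unmatched" cases.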

Data

dfA = data.frame(matrix(sample(1:100, 16), ncol = 4, nrow = 4))
colnames(dfA) = LETTERS[1:4]

dfB = data.frame(matrix(sample(1:100, 16), ncol = 4, nrow = 4))
colnames(dfB) = LETTERS[c(1:3,5)]

> dfA
   A  B  C  D
1 75 66 17 89
2 46  7 27 38
3 97 26 47 31
4 32 20 71  2

> dfB
   A  B  C  E
1 94 70 18 16
2 69 57 29 60
3 53 50 25 96
4 37 51 64 75
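
Once the common names are known, they can be used to subset both data frames before combining them. A minimal sketch, assuming that "merge" in the question means stacking the rows of the two data frames; the set.seed() call is an addition here, used only to make the random example reproducible:

```r
set.seed(1)  # assumption: fixed seed only so the sketch is reproducible

# Same construction as the example data above
dfA <- data.frame(matrix(sample(1:100, 16), ncol = 4, nrow = 4))
colnames(dfA) <- LETTERS[1:4]
dfB <- data.frame(matrix(sample(1:100, 16), ncol = 4, nrow = 4))
colnames(dfB) <- LETTERS[c(1:3, 5)]

# Keep only the columns present in both, then stack the rows
common <- intersect(colnames(dfA), colnames(dfB))
combined <- rbind(dfA[, common, drop = FALSE],
                  dfB[, common, drop = FALSE])

print(dim(combined))  # 8 rows, 3 columns
```

drop = FALSE keeps the result a data frame even if only one column is shared. If the aim is a key-based join rather than row stacking, merge() with a by argument would be the tool instead.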

Solution 1
1 accepted 2019-12-11 16:18:05