Add columns from one R data table to another

Another R question. I have looked through the data.table vignettes and seen solutions like these:

Unfortunately, while they're close, I'm missing something in my understanding.

My initial data tables include one with results and another with standards. Several columns are common to the two tables. Here's a sample (both tables have more columns, but those are not shared between the two).

Results
ID     Region  Locale  Medium  Name        Method
3324   Agate   Zone C  water   Cadmium     Z
2432   Gneiss  Zone B  air     Calcium     R
2433   Agate   Zone A  water   Molybdenum  Q
78882  Agate   Zone D  water   Iron        M

Standards
ID     Region  Locale  Medium  Name     CoeffA  CoeffB
3214   Agate   Zone A  water   Cadmium  -.243   1.43
3324   Agate   Zone C  water   Cadmium  -.243   1.43
2432   Gneiss  Zone B  water   Calcium  .432    0.44
78882  Agate   Zone D  water   Iron     1.475   0
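
For anyone who wants to reproduce this, the sample rows above could be entered as follows. This is only a sketch: the column types (integer IDs, character for everything else) are assumptions, and only the columns shown in the sample are included.

```r
library(data.table)

# Sample rows copied from the tables above (column types are assumed)
Results <- data.table(
  ID     = c(3324L, 2432L, 2433L, 78882L),
  Region = c("Agate", "Gneiss", "Agate", "Agate"),
  Locale = c("Zone C", "Zone B", "Zone A", "Zone D"),
  Medium = c("water", "air", "water", "water"),
  Name   = c("Cadmium", "Calcium", "Molybdenum", "Iron"),
  Method = c("Z", "R", "Q", "M")
)

Standards <- data.table(
  ID     = c(3214L, 3324L, 2432L, 78882L),
  Region = c("Agate", "Agate", "Gneiss", "Agate"),
  Locale = c("Zone A", "Zone C", "Zone B", "Zone D"),
  Medium = c("water", "water", "water", "water"),
  Name   = c("Cadmium", "Cadmium", "Calcium", "Iron"),
  CoeffA = c(-0.243, -0.243, 0.432, 1.475),
  CoeffB = c(1.43, 1.43, 0.44, 0)
)

# Columns the two tables have in common
common_cols <- intersect(names(Results), names(Standards))
common_cols
```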

There are many more results than standards, and some results have no standards.

What I'd like to do is add the mathematical coefficient values of the standards table to the results table as new columns (Ca and Cb). Ultimately I'll use these to calculate comparative standard values.

Results
ID     Region  Locale  Medium  Name        Method  C-a    C-b
3324   Agate   Zone C  water   Cadmium     Z       -.243  1.43
2432   Gneiss  Zone B  air     Calcium     R       .432   0.44
2433   Agate   Zone A  water   Molybdenum  Q       NA     NA
78882  Agate   Zone D  water   Iron        M       1.475  0

I've tried the following without success:

  • Results[Standards] yields the standards values with result columns as NA
  • Standards[Results] yields the results values with standards columns as NA
  • merge(Results,Standards), after using setkey(c("ID","Region","Locale","Medium")) for the common key columns of Results and Standards, yields the standards values with result columns as NA

I would have thought that one of these syntaxes would definitely have yielded coefficient columns with values other than NA.

Any suggestions on where I should look or what I'm missing?

Thanks in advance for your kind assistance.

Try this; you can do it without setkey as follows:

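library(data.table)
# Keep only ID and the coefficient columns from Standards so that the other
# shared columns (Region, Locale, ...) are not duplicated with .x/.y suffixes
newResults <- merge(x = Results, y = Standards[, .(ID, CoeffA, CoeffB)], by = "ID", all.x = TRUE)
setnames(newResults, c("CoeffA", "CoeffB"), c("C-a", "C-b"))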

newResults
ID     Region   Locale    Medium    Name         Method      C-a        C-b
2432   Gneiss   Zone B    air       Calcium      R           .432       0.44
2433   Agate    Zone A    water     Molybdenum   Q           NA         NA
3324   Agate    Zone C    water     Cadmium      Z           -.243      1.43
78882  Agate    Zone D    water     Iron         M           1.475      0

If you don't want NAs, use either of the following:

newResults[is.na(newResults)] <- 0   # replace NA with zero
newResults[is.na(newResults)] <- "No value available"   # or: replace NA with text
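
The whole-table replacement above fills NAs in every column, not only the coefficients. A minimal sketch that limits the fill to the two coefficient columns, using data.table's set() in a loop; the small newResults table here is a cut-down stand-in for the merged result, not the full data.

```r
library(data.table)

# Cut-down stand-in for the merged table (illustrative values from the sample)
newResults <- data.table(
  ID    = c(2432L, 2433L, 3324L, 78882L),
  `C-a` = c(0.432, NA, -0.243, 1.475),
  `C-b` = c(0.44, NA, 1.43, 0)
)

# Replace NA with 0 in the coefficient columns only, modifying the table by reference
for (col in c("C-a", "C-b")) {
  set(newResults, i = which(is.na(newResults[[col]])), j = col, value = 0)
}

newResults
```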

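First off, setkey takes the table followed by unquoted column names, e.g. setkey(Results, ID, Region, Locale, Medium), so passing it a character vector as in the question does not work. To set the key from a character vector of column names, use setkeyv instead.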

setkeyv(Results,c("ID","Region","Locale","Medium"))
setkeyv(Standards,c("ID","Region","Locale","Medium"))

Then:

JoinedDT <- merge(Results,Standards, all.x = TRUE)

This will give NA in the coefficient columns of any Results row that has no matching Standards row. If there are multiple Standards rows for one Results row, you will get one row per matching Standards row in the resulting data table.

To set NA to 0:

JoinedDT[is.na(CoeffA), CoeffA := 0]
JoinedDT[is.na(CoeffB), CoeffB := 0]
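
As an alternative to merge(), data.table can also add the columns to Results by reference with an update join. The following is a minimal sketch, assuming ID alone identifies the matching standard; in the sample, ID 2432 has Medium "air" in Results but "water" in Standards, so joining on all four key columns would leave that row NA. The tables are abbreviated versions of the sample, and Ca and Cb are the column names mentioned in the question.

```r
library(data.table)

# Abbreviated versions of the sample tables
Results <- data.table(
  ID     = c(3324L, 2432L, 2433L, 78882L),
  Name   = c("Cadmium", "Calcium", "Molybdenum", "Iron"),
  Method = c("Z", "R", "Q", "M")
)
Standards <- data.table(
  ID     = c(3214L, 3324L, 2432L, 78882L),
  CoeffA = c(-0.243, -0.243, 0.432, 1.475),
  CoeffB = c(1.43, 1.43, 0.44, 0)
)

# For each Results row with a matching ID in Standards, copy the coefficients
# into new columns Ca and Cb; Results rows without a match stay NA.
Results[Standards, on = "ID", `:=`(Ca = i.CoeffA, Cb = i.CoeffB)]

Results
```

Unlike merge(), this keeps Results at its original number of rows and in its original order. If several Standards rows share an ID, only one of their values ends up in Results, so it is worth checking Standards for duplicates first.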
