Add columns from one R data table to another
Another R question. I have looked through the data.table vignettes and seen solutions to similar problems, but while they're close, I'm missing something in my understanding.
My initial data tables include one that contains results and another with standards. Several columns are common between the two tables. Here's a sample (more columns exist for both tables, but they are not common between the two).
Results
ID     Region  Locale  Medium  Name        Method
3324   Agate   Zone C  water   Cadmium     Z
2432   Gneiss  Zone B  air     Calcium     R
2433   Agate   Zone A  water   Molybdenum  Q
78882  Agate   Zone D  water   Iron        M
Standards
ID     Region  Locale  Medium  Name     CoeffA  CoeffB
3214   Agate   Zone A  water   Cadmium  -.243   1.43
3324   Agate   Zone C  water   Cadmium  -.243   1.43
2432   Gneiss  Zone B  water   Calcium  .432    0.44
78882  Agate   Zone D  water   Iron     1.475   0
There are many more results than standards, and some results have no standards.
What I'd like to do is add the mathematical coefficient values of the standards table to the results table as new columns (C-a and C-b). Ultimately I'll use these to calculate comparative standard values.
Results
ID     Region  Locale  Medium  Name        Method  C-a    C-b
3324   Agate   Zone C  water   Cadmium     Z       -.243  1.43
2432   Gneiss  Zone B  air     Calcium     R       .432   0.44
2433   Agate   Zone A  water   Molybdenum  Q       NA     NA
78882  Agate   Zone D  water   Iron        M       1.475  0
I've tried the following without success:
Results[Standards]
yields the standards values with result columns as NA.
Standards[Results]
yields the results values with standards columns as NA.
merge(Results,Standards)
after using setkey(c("ID","Region","Locale","Medium")) for the common key columns of Results and Standards, yields standards values with result columns as NA.
I would have thought that one of these syntaxes would definitely have yielded coefficient columns with values other than NA.
Any suggestions on where I should look or what I'm missing?
Thanks in advance for your kind assistance.
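Try this; you can do it without setkey, as follows:
library(data.table)
# Left join on ID, taking only the coefficient columns from Standards so the
# other shared columns (Region, Locale, Medium, Name) are not duplicated
newResults <- merge(x = Results, y = Standards[, .(ID, CoeffA, CoeffB)], by = "ID", all.x = TRUE)
setnames(newResults, c("CoeffA", "CoeffB"), c("C-a", "C-b"))
newResults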
ID     Region  Locale  Medium  Name        Method  C-a    C-b
2432   Gneiss  Zone B  air     Calcium     R       .432   0.44
2433   Agate   Zone A  water   Molybdenum  Q       NA     NA
3324   Agate   Zone C  water   Cadmium     Z       -.243  1.43
78882  Agate   Zone D  water   Iron        M       1.475  0
If you don't want NAs, replace them with one of the following:
newResults[is.na(newResults)] <- 0 # replace NA with zero
newResults[is.na(newResults)] <- "No value available" # or replace NA with text
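As an alternative to merge, a data.table update join can add the coefficient columns to Results in place, which also keeps the original row order. This is a minimal sketch, assuming the sample rows from the question (rebuilt here with only the columns needed, so the snippet runs on its own) and assuming that ID alone identifies the matching standard:

```r
library(data.table)

# Sample data rebuilt from the question (only the columns needed here)
Results <- data.table(
  ID     = c(3324, 2432, 2433, 78882),
  Region = c("Agate", "Gneiss", "Agate", "Agate"),
  Name   = c("Cadmium", "Calcium", "Molybdenum", "Iron"),
  Method = c("Z", "R", "Q", "M")
)
Standards <- data.table(
  ID     = c(3214, 3324, 2432, 78882),
  Region = c("Agate", "Agate", "Gneiss", "Agate"),
  CoeffA = c(-0.243, -0.243, 0.432, 1.475),
  CoeffB = c(1.43, 1.43, 0.44, 0)
)

# Update join: for each Results row, look up the Standards row with the same
# ID and copy its coefficients into two new columns. The i. prefix refers to
# columns of Standards. Results rows without a match are left as NA.
Results[Standards, on = "ID", c("C-a", "C-b") := .(i.CoeffA, i.CoeffB)]
Results
```

Because := modifies Results by reference, no new table is created and no renaming step is needed afterwards.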
First off, setkey takes the table as its first argument followed by unquoted column names, e.g. setkey(Results, ID, Region, Locale, Medium). To pass a character vector of column names, as in your attempt, use setkeyv instead.
setkeyv(Results,c("ID","Region","Locale","Medium"))
setkeyv(Standards,c("ID","Region","Locale","Medium"))
Then:
JoinedDT <- merge(Results,Standards, all.x = TRUE)
This will give NA coefficients in any Results row that does not have a matching Standards row. If there are multiple Standards rows for one Results row, you will get one row per matching Standards row in your resulting data table.
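To make the keying step concrete, here is a small sketch using the sample rows from the question (rebuilt inline so it runs standalone; the Name column of Standards is left out to keep the output free of duplicated columns). It shows both ways of setting a multi-column key and the resulting left join. Note that with all four columns in the key, ID 2432 comes back with NA coefficients for this sample, because its Medium is air in Results but water in Standards.

```r
library(data.table)

Results <- data.table(
  ID     = c(3324, 2432, 2433, 78882),
  Region = c("Agate", "Gneiss", "Agate", "Agate"),
  Locale = c("Zone C", "Zone B", "Zone A", "Zone D"),
  Medium = c("water", "air", "water", "water"),
  Name   = c("Cadmium", "Calcium", "Molybdenum", "Iron"),
  Method = c("Z", "R", "Q", "M")
)
Standards <- data.table(
  ID     = c(3214, 3324, 2432, 78882),
  Region = c("Agate", "Agate", "Gneiss", "Agate"),
  Locale = c("Zone A", "Zone C", "Zone B", "Zone D"),
  Medium = c("water", "water", "water", "water"),
  CoeffA = c(-0.243, -0.243, 0.432, 1.475),
  CoeffB = c(1.43, 1.43, 0.44, 0)
)

keys <- c("ID", "Region", "Locale", "Medium")
setkeyv(Results, keys)                         # character vector of column names
setkey(Standards, ID, Region, Locale, Medium)  # same key, unquoted column names

# Left join; by is spelled out here for clarity
JoinedDT <- merge(Results, Standards, by = keys, all.x = TRUE)
JoinedDT[, .(ID, Medium, CoeffA, CoeffB)]
```

If the coefficients should match regardless of Medium, as the desired output in the question suggests, leave Medium out of the join columns.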
To set NA to 0:
JoinedDT[is.na(CoeffA), CoeffA := 0]
JoinedDT[is.na(CoeffB), CoeffB := 0]
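When several numeric columns need the same treatment, data.table's setnafill (available in reasonably recent data.table versions) can fill them in one call. A minimal sketch, using a small made-up stand-in for JoinedDT:

```r
library(data.table)

# Hypothetical stand-in for the joined table
JoinedDT <- data.table(
  ID     = c(2432, 2433, 3324),
  CoeffA = c(0.432, NA, -0.243),
  CoeffB = c(0.44, NA, 1.43)
)

# Replace NA with 0 by reference, only in the named columns
setnafill(JoinedDT, type = "const", fill = 0, cols = c("CoeffA", "CoeffB"))
JoinedDT
```

Keep in mind that after filling, a missing standard can no longer be told apart from a genuine zero coefficient (Iron's CoeffB is 0 in the sample), so it may be safer to keep the NAs until the final calculation.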