简体   繁体   中英

Add columns from one R data table to another

Another R question. Have looked through the data.table vignettes and seen solutions like these:

But unfortunately while they're close, I'm somehow missing something in my understanding.

My initial data tables include one that includes results and another with standards. Several columns are common between the two tables. Here's a sample ( more columns exist for both tables, but they are not common between the two ).

Results
ID    Region   Locale    Medium    Name          Method
3324   Agate    Zone C    water     Cadmium        Z
2432   Gneiss   Zone B    air       Calcium        R
2433   Agate    Zone A    water     Molybdenum     Q
78882  Agate    Zone D    water     Iron           M

Standards
ID   Region   Locale   Medium     Name    CoeffA    CoeffB
3214  Agate   Zone A    water     Cadmium  -.243    1.43
3324  Agate   Zone C    water     Cadmium  -.243    1.43
2432  Gneiss  Zone B    water     Calcium  .432     0.44
78882 Agate   Zone D    water     Iron     1.475    0

There are many more results than standards and some results have no standards.

What I'd like to do is add the mathematical coefficient values of the standards table to the results table as new columns ( Ca and Cb ). Ultimately I'll use these to calculate comparative standard values.

Results
ID    Region   Locale    Medium    Name          Method      C-a         C-b
3324   Agate    Zone C    water     Cadmium      Z           -.243      1.43
2432   Gneiss   Zone B    air       Calcium      R           .432       0.44
2433   Agate    Zone A    water     Molybdenum   Q           NA         NA
78882  Agate    Zone D    water     Iron         M           1.475       0

I've tried the following without success:

  • Results[Standards] yields the standards values with result columns as NA
  • Standards[Results] yields the results values with standards columns as NA
  • merge(Results,Standards) after using setkey(c("ID","Region","Locale","Medium")) for the common key columns for Results and Standards , yields standards values with result columns as NA

I would have thought that one of these syntaxes would definitely have yielded coefficient columns with values other than NA .

Any suggestions on where I should look or what I'm missing?

Thanks in advance for your kind assistance.

Try this, you can perform it without setkey as follows

require(data.table)
newResults <- merge(x = Results, y = Standards, by = "ID", all.x = TRUE)
setnames(newResults,"CoeffA","C-a")
setnames(newResults,"CoeffB","C-b")

newResults
ID     Region   Locale    Medium    Name         Method      C-a        C-b
2432   Gneiss   Zone B    air       Calcium      R           .432       0.44
2433   Agate    Zone A    water     Molybdenum   Q           NA         NA
3324   Agate    Zone C    water     Cadmium      Z           -.243      1.43
78882  Agate    Zone D    water     Iron         M           1.475      0

If you don't want NAs:

newResults[is.na(newResults)] <- 0   #replace NA with Zero
newResults[is.na(newResults)] <- "No value available" #replace NA with Text 

First off, setkey can't be used for multiple variables, you need to use setkeyv instead.

setkeyv(Results,c("ID","Region","Locale","Medium"))
setkeyv(Standards,c("ID","Region","Locale","Medium"))

Then:

JoinedDT <- merge(Results,Standards, all.x = TRUE)

This will give an NA in any results row that does not have a Standards Row. If there are multiple Standards rows for one Results row, you will get two rows in your resulting data table.

To set NA to 0 :

JoinedDT[is.na(JoinedDT$CoeffA),CoeffA:= 0]
JoinedDT[is.na(JoinedDT$CoeffB),CoeffB:= 0]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM