I have a list of data frames, all with the same structure. I need to recode a variable in each data frame based on the value of another variable. I've found solutions here that have gotten me close, but after many hours, I'm still coming up short.
My data frames look like this:
$Test14
Class Total
1 201 1
2 203 14
3 204 3
4 205 7
5 206 7
6 207 1
7 211 2
8 212 1
9 213 16
10 288 27
11 299 9
12 517 1
13 592 2
14 593 8
Each Class code falls into a larger MajorClass category. I'm trying to attach another column with those MajorClass values so that I can put the data into plain English. So something like this:
$Test14
Class Total MajorClass
1 201 1 Reg Residential
2 203 14 Reg Residential
3 204 3 Reg Residential
4 205 7 Reg Residential
5 206 7 Reg Residential
6 207 1 Reg Residential
7 211 2 Reg Residential
8 212 1 Reg Residential
9 213 16 NonReg Residential
10 288 27 NonReg Residential
11 299 9 NonReg Residential
12 517 1 Commercial
13 592 2 Commercial
14 593 8 Industrial
My thought was to try and use lapply in lieu of a for loop to get the MajorClass for each row and then use a cbind to pull it all back together later. The closest I came was using the following code:
> MajorClass <- lapply(mydata, function(i) {
> i$MajorClass <- ""
> if (i$Class == '200' || i$Class == '202' || i$Class == '203' || i$Class == '204' || i$Class == '205' || i$Class == '206' || i$Class ==
> '207' || i$Class == '208' || i$Class == '209' || i$Class == '210' ||
> i$Class == '211' || i$Class == '212' || i$Class == '216' || i$Class ==
> '234' || i$Class == '278' || i$Class == '295')
> i$MajorClass <- "Reg Residential"
> else
> if (i$Class == '239' || i$Class == '240' || i$Class == '241' || i$Class == '201' || i$Class == '213' || i$Class == '224' || i$Class
> == '225' || i$Class == '236' || i$Class == '288' || i$Class == '290' || i$Class == '297' || i$Class == '299')
> i$MajorClass <- "NonReg Residential" ... and so on ...
But it returns only one value for the last record in each data frame. I've tried multiple variations on this, and have attempted to use a for loop, all to no avail. Also, my (limited) understanding is it's more efficient to use the apply functions instead of for loops.
Any help or pointing in the right direction would be greatly appreciated. As I said, I've searched a lot on this site and others and came close but not close enough. Thanks again!
What you are trying to do is match values from one table to another, which can be done easily with a join. A join matches the rows of two tables by a common (and equally named) column.
To do that you need a reference table, where each different class has its MajorClass associated. (I've generated some dummy data.)
#install.packages("dplyr")
library(dplyr)
# Dummy list holding one data frame
test <- list(test14 = data.frame(class = c("201", "203", "205"), total = c(1, 3, 7),
                                 stringsAsFactors = FALSE))
# Reference table: one row per class code with its MajorClass
reference_table <- data.frame(class = c("201", "202", "203", "204", "205"),
                              MajorClass = c("Reg", "Reg", "NonReg", "comercial", "comercial"),
                              stringsAsFactors = FALSE)
Now you can match it against each data frame by using lapply:
output.list <- lapply(test, function(x) left_join(x, reference_table, by="class"))
$test14
class total MajorClass
1 201 1 Reg
2 203 3 NonReg
3 205 7 comercial
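The reference table does not have to be typed row by row. As a rough sketch, it could be built from the code lists in the question's if statement. Only the two branches shown there are used, since the rest are elided; the names reg_codes and nonreg_codes and the sample input are made up for illustration. Codes missing from the reference table come back as NA rather than being dropped.

```r
library(dplyr)

# Class codes copied from the two branches shown in the question;
# the remaining categories are elided there, so they are not filled in here.
reg_codes <- c("200", "202", "203", "204", "205", "206", "207", "208",
               "209", "210", "211", "212", "216", "234", "278", "295")
nonreg_codes <- c("239", "240", "241", "201", "213", "224",
                  "225", "236", "288", "290", "297", "299")

# One row per class code, labelled with its MajorClass
reference_table <- data.frame(
  class = c(reg_codes, nonreg_codes),
  MajorClass = rep(c("Reg Residential", "NonReg Residential"),
                   times = c(length(reg_codes), length(nonreg_codes))),
  stringsAsFactors = FALSE
)

# Hypothetical input using a few codes from the question's example
test <- list(test14 = data.frame(class = c("201", "203", "517"),
                                 total = c(1, 14, 1),
                                 stringsAsFactors = FALSE))

# left_join keeps every row of x; unmatched codes get NA in MajorClass
output.list <- lapply(test, function(x) left_join(x, reference_table, by = "class"))
```

The join column needs the same name and a compatible type in both tables, so a numeric Class column in the real data would probably need renaming and as.character() first.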
Or collapse all the data frames of your list into one (you can do so if they have the same structure) and then match the whole table at once.
data <- bind_rows(test)
output <- left_join(data, reference_table, by="class")
class total MajorClass
(chr) (dbl) (chr)
1 201 1 Reg
2 203 3 NonReg
3 205 7 comercial
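One thing to watch when collapsing the list is that the combined table no longer shows which data frame each row came from. A possible way around that, sketched here with a second made-up data frame (test15) and a made-up column name (source), is the .id argument of bind_rows, followed by split() if the list shape is needed again.

```r
library(dplyr)

# test15 is hypothetical, added only so the list has two elements
test <- list(
  test14 = data.frame(class = c("201", "203"), total = c(1, 3),
                      stringsAsFactors = FALSE),
  test15 = data.frame(class = "205", total = 7,
                      stringsAsFactors = FALSE)
)
reference_table <- data.frame(class = c("201", "202", "203", "204", "205"),
                              MajorClass = c("Reg", "Reg", "NonReg", "comercial", "comercial"),
                              stringsAsFactors = FALSE)

# .id stores each list element's name in a new column called "source"
data <- bind_rows(test, .id = "source")
output <- left_join(data, reference_table, by = "class")

# Optionally go back to a list with one data frame per original element
output.list <- split(output, output$source)
```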
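I assume that you have a reference table as Vicent Boned describes. You can then use base R to do the job. Since test is a list of data frames, apply the recoding to each element with lapply:
test <- lapply(test, function(x) transform(x, MajorClass = factor(class,
  levels = reference_table$class, labels = reference_table$MajorClass)))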
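If a plain character column is preferred over a factor, a similar base R lookup can be sketched with match(), applied to each data frame in the list. The helper name add_major_class is made up for illustration, and the dummy data is the same as in the answer above.

```r
test <- list(test14 = data.frame(class = c("201", "203", "205"),
                                 total = c(1, 3, 7),
                                 stringsAsFactors = FALSE))
reference_table <- data.frame(class = c("201", "202", "203", "204", "205"),
                              MajorClass = c("Reg", "Reg", "NonReg", "comercial", "comercial"),
                              stringsAsFactors = FALSE)

# Hypothetical helper: look up each class in the reference table by position
add_major_class <- function(x) {
  idx <- match(x$class, reference_table$class)  # NA where a class has no entry
  x$MajorClass <- reference_table$MajorClass[idx]
  x
}

output.list <- lapply(test, add_major_class)
```

Because match() is vectorized, every row gets its own MajorClass in one step, with no if/else chain needed.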