I would like to use lapply to label the values of specific variables. I have found an example that gets me close (here), but I can't get it to work for only certain variables in the data set.
Working example:
library(tibble)

df1 <- tribble(
~var1, ~var2, ~var3, ~var4,
"1", "1", "1", "a",
"2", "2", "2", "b",
"3", "3", "3", "c"
)
Here is the code that seems like it should work, but doesn't:
df1["var1", "var2"] <- lapply(df1["var1", "var2"], factor,
levels=c(1,
2,
3),
labels = c("Agree",
"Neither Agree/Disagree",
"Disagree"))
The code runs, but gives the following output:
# A tibble: 4 x 4
var1 var2 var3 var4
* <chr> <chr> <chr> <chr>
1 1 1 1 a
2 2 2 2 b
3 3 3 3 c
4 <NA> <NA> <NA> <NA>
If I try with just one variable, it works:
df1["var1"] <- lapply(df1["var1"], factor,
levels=c(1,
2,
3),
labels = c("Agree",
"Neither Agree/Disagree",
"Disagree"))
It gives the following output (which is correct):
# A tibble: 3 x 4
var1 var2 var3 var4
<fctr> <chr> <chr> <chr>
1 Agree 1 1 a
2 Neither Agree/Disagree 2 2 b
3 Disagree 3 3 c
I have tried a lot of different ways to change the code to get it to work, but I just can't figure it out.
You were close. We need df1[c("var1", "var2")] to specify columns.
df1[c("var1", "var2")] <- lapply(df1[c("var1", "var2")], factor,
levels=c("1",
"2",
"3"),
labels = c("Agree",
"Neither Agree/Disagree",
"Disagree"))
df1
# # A tibble: 3 x 4
# var1 var2 var3 var4
# <fctr> <fctr> <chr> <chr>
# 1 Agree Agree 1 a
# 2 Neither Agree/Disagree Neither Agree/Disagree 2 b
# 3 Disagree Disagree 3 c
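For readers who want to check the result without tibble installed, here is a minimal base-R sketch of the same fix. It assumes a plain data.frame in place of the tibble from the question, and the names likert_cols, likert_levels and likert_labels are hypothetical helpers introduced only for this illustration.

```r
# Base-R stand-in for the tibble in the question (assumption: plain data.frame).
df1 <- data.frame(
  var1 = c("1", "2", "3"),
  var2 = c("1", "2", "3"),
  var3 = c("1", "2", "3"),
  var4 = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Hypothetical helper names, not part of the original post.
likert_cols   <- c("var1", "var2")
likert_levels <- c("1", "2", "3")
likert_labels <- c("Agree", "Neither Agree/Disagree", "Disagree")

# One character vector of column names selects both columns, list-style.
df1[likert_cols] <- lapply(df1[likert_cols], factor,
                           levels = likert_levels,
                           labels = likert_labels)

# Quick checks: only the chosen columns are factors, and no row was added.
sapply(df1, class)
nrow(df1)
```

Keeping the column names in one vector means the same selection is used on both sides of the assignment, which avoids the mismatch that caused the original problem.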
Your problem is arising because you're trying to subset your data.frame incorrectly.
In a data.frame or tbl, extracting using [ works in a couple of ways.
First, because the data is in a matrix-like rectangular form, you can use a [row, column] approach to get specific values. For example, to get a single value, you can do something like df1[2, 1].
Second, because a tbl / data.frame is a special type of list, if you don't supply a comma, it assumes you want entire list elements (that is, whole columns).
Thus, when you did ["var1", "var2"], it went into matrix subsetting mode and was looking for a row named "var1" (and a column named "var2"). It couldn't find that row, so it inserted a row of NA values in your dataset.
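The extra row can be reproduced in isolation. The sketch below is an illustration under assumptions: it uses a plain base-R data.frame named d rather than the tibble from the question, and it assigns a simple placeholder value "x" instead of the lapply() result, so tibble behaviour may differ in detail.

```r
# Small stand-in for the data in the question (assumption: base data.frame).
d <- data.frame(
  var1 = c("1", "2", "3"),
  var2 = c("1", "2", "3"),
  stringsAsFactors = FALSE
)

# List-style: a single index selects whole columns.
names(d[c("var1", "var2")])

# Matrix-style: row "var1", column "var2". There is no such row, so this is NA.
d["var1", "var2"]

# Assigning through the matrix-style index creates the missing row.
d2 <- d
d2["var1", "var2"] <- "x"   # "x" is a placeholder value for illustration
nrow(d2)                    # one more row than d
rownames(d2)                # the new row is named "var1"
```

The cells of the new row that were not assigned are filled with NA, which matches the stray fourth row in the question's output.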
Here's a small set of examples for you to experiment with.
Get rows 1:4 and columns 1:4
df <- mtcars[1:4, 1:4]
df
#                 mpg cyl disp  hp
# Mazda RX4      21.0   6  160 110
# Mazda RX4 Wag  21.0   6  160 110
# Datsun 710     22.8   4  108  93
# Hornet 4 Drive 21.4   6  258 110
Extract a single value using a [row, column] approach
df["Mazda RX4", "mpg"] # [row, column]
# [1] 21
Check whether a data.frame is a list
is.list(df)
# [1] TRUE
Convert a data.frame to a list and try to extract using [row, column].
L <- unclass(df)
L["Mazda RX4", "mpg"] # A list doesn't have `dim`s.
# Error in L["Mazda RX4", "mpg"] : incorrect number of dimensions
Providing just one value to extract from a data.frame or a list
df["mpg"] # Treats it as asking for a single element from a list
#                 mpg
# Mazda RX4      21.0
# Mazda RX4 Wag  21.0
# Datsun 710     22.8
# Hornet 4 Drive 21.4
L["mpg"]
# $mpg
# [1] 21.0 21.0 22.8 21.4
Providing a vector of values to extract
df[c("mpg", "hp")]
#                 mpg  hp
# Mazda RX4      21.0 110
# Mazda RX4 Wag  21.0 110
# Datsun 710     22.8  93
# Hornet 4 Drive 21.4 110
L[c("mpg", "hp")]
# $mpg
# [1] 21.0 21.0 22.8 21.4
#
# $hp
# [1] 110 110  93 110
Since a data.frame is a special type of list with dims, using an empty [, vals] would work
df[, c("mpg", "hp")]
#                 mpg  hp
# Mazda RX4      21.0 110
# Mazda RX4 Wag  21.0 110
# Datsun 710     22.8  93
# Hornet 4 Drive 21.4 110
Looking for a row that is not there would return NAs
df["not here", ]
#    mpg cyl disp hp
# NA  NA  NA   NA NA
Keeping those details in mind, your best approach is to just use the following (as suggested in @www's answer):
df1[c("var1", "var2")]