How do I convert all factor columns to numeric when their column names match any of a list of strings?
I have not found this exact issue on here yet. I have many columns, and for every column whose name matches ANY of a list of strings, I want to convert from factor -> character -> numeric.
Below is an example in which columns containing one of the strings are converted, followed by the two things I tried for multiple strings, both of which failed.
# Make fake data where every column is a factor. At the end I'd like to convert all factor columns that contain either "alcium" or "zinc" in the column name.
library(reshape2)
library(dplyr)  # needed for %>%, mutate_if() and mutate_at()
fake <-data.frame(id=c(1,1,1,2,2,2,3,3,3,1,1,1,2,2,2,3,3,3),
time=c(rep("Time1",9), rep("Time2",9)),
test=c("calcium","magnesium","zinc","calcium","magnesium","zinc","calcium","magnesium","zinc","calcium","magnesium","zinc","calcium","magnesium","zinc","calcium","magnesium","zinc"),
score=floor(runif(18, min=1, max=5)))
fake <- dcast(fake, id ~ time + test, value.var = "score")  # name the value column explicitly instead of letting dcast() guess it
fake <- fake %>% mutate_if(is.numeric,as.factor)
#This works, but only for columns containing one of the strings
fake <- fake %>% mutate_at(vars(contains('alcium')), function(x) as.numeric(as.character(x)))
#Now trying to convert all columns containing either "alcium" or "zinc"
fake <- fake %>% mutate_at(vars(contains('alcium'| 'zinc')), function(x) as.numeric(as.character(x)))
#gives an error
#2nd attempt:
strings <- c("alcium", "zinc")
fake <- fake %>% mutate_at(vars(contains(strings)), function(x) as.numeric(as.character(x)))
#gives an error
Using the select helper matches() instead of contains() lets you pass the strings collapsed into a regex-friendly pattern.
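library(dplyr)
strings <- c("alcium", "zinc")
# Collapse the strings into one regex, "alcium|zinc", and convert only the matching columns.
# Go through as.character() so the factor labels are converted, not the internal level codes.
fake %>%
  as_tibble() %>%
  mutate_at(vars(matches(paste0(strings, collapse = "|"))), function(x) as.numeric(as.character(x)))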
# A tibble: 3 x 8
  id    Time1_calcium `Time1_ma gnesium` Time1_magnesium Time1_zinc Time2_calcium Time2_magnesium Time2_zinc
  <fct>         <dbl> <fct>              <fct>                <dbl>         <dbl> <fct>                <dbl>
1 1                 2 NA                 4                        1             3 4                        1
2 2                 2 NA                 3                        2             1 1                        3
3 3                 1 3                  NA                       1             2 3                        2
I have updated your code a bit.
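For what it's worth, here is a minimal sketch of the same idea written with across(), assuming dplyr 1.0.0 or later is installed. The toy data frame and its values are made up for illustration. The labels are deliberately not 1, 2, 3 so that the difference between converting the labels and converting the level codes is visible.

```r
library(dplyr)

# Hypothetical toy data (not the asker's): every column is a factor.
toy <- data.frame(
  id              = factor(c(1, 2, 3)),
  Time1_calcium   = factor(c("10", "20", "40")),
  Time1_magnesium = factor(c("2", "3", "4")),
  Time1_zinc      = factor(c("5", "7", "9"))
)

strings <- c("alcium", "zinc")
pattern <- paste(strings, collapse = "|")  # one regex: "alcium|zinc"

# Convert only the columns whose names match the pattern,
# going factor -> character -> numeric.
converted <- toy %>%
  mutate(across(matches(pattern), ~ as.numeric(as.character(.x))))

# For comparison: as.numeric() applied directly to a factor
# returns the internal level codes, not the labels.
codes <- as.numeric(toy$Time1_calcium)
```

Here converted$Time1_calcium holds the labels as numbers, while codes holds only the positions of the levels. I believe recent tidyselect releases also let contains() take a character vector and use the union of the matches, so the second attempt in the question may work on an up-to-date install. Check ?contains for your version before relying on that.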
If you don't have too many, you could do them separately. Otherwise, I couldn't get multiple strings to work in a single call.
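# One mutate_at() call per string; as.character() first so the labels, not the level codes, are converted.
sofaWa <- fake %>%
  mutate_at(vars(contains('alcium')), function(x) as.numeric(as.character(x))) %>%
  mutate_at(vars(contains('zinc')), function(x) as.numeric(as.character(x)))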
This produces:
# A tibble: 3 x 7
  id    Time1_calcium Time1_magnesium Time1_zinc Time2_calcium Time2_magnesium Time2_zinc
  <fct>         <dbl> <fct>                <dbl>         <dbl> <fct>                <dbl>
1 1                 3 2                        1             1 2                        1
2 2                 2 1                        2             1 1                        2
3 3                 1 4                        2             2 3                        1
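If you would rather not depend on dplyr for this step, the same selection can be sketched in base R by running grepl() over the column names. This is only an illustration under assumed data: the toy data frame below is hypothetical and stands in for the real one.

```r
# Hypothetical toy data (not the asker's): every column is a factor.
toy <- data.frame(
  id              = factor(c(1, 2, 3)),
  Time1_calcium   = factor(c("10", "20", "40")),
  Time1_magnesium = factor(c("2", "3", "4")),
  Time1_zinc      = factor(c("5", "7", "9"))
)

strings <- c("alcium", "zinc")
pattern <- paste(strings, collapse = "|")  # one regex: "alcium|zinc"

# TRUE for columns that are factors AND whose name matches any of the strings.
target <- grepl(pattern, names(toy)) & vapply(toy, is.factor, logical(1))

# Replace just those columns, going factor -> character -> numeric.
toy[target] <- lapply(toy[target], function(x) as.numeric(as.character(x)))
```

Because everything is collapsed into one pattern, adding another string to the vector is enough to pick up more columns, with no extra call per string.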