How to convert all factor variables into numeric variables (in multiple data frames at once)?
I have n data frames, each corresponding to data from a city.
There are 3 variables per data frame and currently they are all factor variables.
I want to transform all of them into numeric variables.
I started by creating a vector with the names of all the data frames to use in a for loop.
cities <- as.vector(objects())
for ( i in cities){
i <- as.data.frame(lapply(i, function(x) as.numeric(levels(x))[x]))
}
Although the code runs and I get no error, I don't see any changes to my data frames: all three variables remain factor variables.
The strangest thing is that it works when I do them one by one (as below):
df <- as.data.frame(lapply(df, function(x) as.numeric(levels(x))[x]))
What you're essentially trying to do is change the type of each column to numeric if it is a factor. One approach using purrr would be:
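library(purrr)
# cities holds object names, so fetch the data frames themselves with mget() first.
# as.numeric() on a factor returns its level codes; go through as.character() to keep the values.
map(mget(cities), ~ modify_if(.x, is.factor, function(x) as.numeric(as.character(x))))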
Note that modify() in itself is like lapply(), but it doesn't change the underlying data structure of the objects you are modifying (in this case, data frames). modify_if() simply takes a predicate as an additional argument.
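To make this concrete, here is a minimal sketch on made-up data. The data frames `city_a` and `city_b`, their columns, and the helper `fct_to_num` are hypothetical and not taken from the question. The sketch also shows why the conversion should go through `as.character()`: calling `as.numeric()` directly on a factor returns the internal level codes rather than the values.

```r
library(purrr)

# Hypothetical toy data: numbers stored as factors
city_a <- data.frame(pop  = factor(c("10", "200", "35")),
                     area = factor(c("1.5", "2.5", "0.5")))
city_b <- data.frame(pop  = factor(c("7", "70")),
                     area = factor(c("3", "4")))
dfs <- list(city_a = city_a, city_b = city_b)

# Direct as.numeric() yields the level codes, not the original values
codes <- as.numeric(city_a$pop)

# Hypothetical helper: going through as.character() recovers the values
fct_to_num <- function(x) as.numeric(as.character(x))

# modify_if() converts only the factor columns and keeps each element a data frame
converted <- map(dfs, ~ modify_if(.x, is.factor, fct_to_num))

str(converted$city_a)
```

The result is a named list of data frames, not modified copies of the original objects in the workspace. If the individual objects are needed again, something like `list2env(converted, envir = globalenv())` should write them back, though keeping them in a list is often easier to work with.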
For anyone who's interested in my question, I worked out the answer:
for ( i in cities){
assign(i, as.data.frame(lapply(get(i), function(x) as.numeric(levels(x))[x])))
}
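The original loop most likely had no effect because `i` holds the name of a data frame as a character string, not the data frame itself, and assigning to `i` only rebinds the loop variable. `get()` looks the object up by name and `assign()` writes the result back under that name. Below is a minimal sketch of the same idea on made-up data. The objects `city_a` and `city_b` and the helper `fct_to_num` are hypothetical, and the sketch runs in a separate environment so that it does not touch the workspace.

```r
# Hypothetical helper: factor -> numeric via its level labels
fct_to_num <- function(x) as.numeric(levels(x))[x]

# Use a dedicated environment instead of the global workspace
env <- new.env()
env$city_a <- data.frame(pop  = factor(c("10", "200")),
                         area = factor(c("1.5", "2.5")))
env$city_b <- data.frame(pop  = factor(c("7", "70")),
                         area = factor(c("3", "4")))

cities <- ls(env)  # character vector of object names

for (i in cities) {
  # i is a string such as "city_a"; get() fetches the data frame it names
  df <- get(i, envir = env)
  # Convert only factor columns; df[] keeps the data frame structure
  df[] <- lapply(df, function(x) if (is.factor(x)) fct_to_num(x) else x)
  # assign() stores the converted data frame back under the same name
  assign(i, df, envir = env)
}

sapply(env$city_a, class)
```

The `is.factor()` guard is an addition here: it leaves columns that are already numeric or character untouched, which makes the loop safe to run more than once.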