
Change type of columns according to a vector in R

I'm trying to change column types in R.

I have 2 data frames:

  • The main one, df
  • Another data frame, tmp, which describes the column types of df (Format) and the type each column should be converted to (New_format)

Here is a reproducible example:

df <- data.frame(var1 = c("a", "b", "c"),
                 var2 = c(1,2,3), 
                 var3 = c("d", "e", "f"))

tmp <- data.frame(Variable = c("var1", "var2", "var3"), 
                  Format = c("character", "numeric", "character"),
                  New_format = c("character", "integer", "factor"))

I'd like to convert the type of each column where New_format differs from Format. I've tried for a while with lapply but did not manage to get it working.

It would be really nice if you have any idea :)

Thanks a lot!

You could set up a named mapping between the New_format values and the corresponding as.<value> functions, like this:

funcs <- list("character" = as.character, "integer" = as.integer, "factor" = as.factor)

Then, in a loop, call the matching function on each column that needs converting:

for (i in seq_len(nrow(tmp))) {
  # Only convert columns whose target type differs from the current one
  if (tmp[i, "Format"] != tmp[i, "New_format"]) {
    col <- tmp[i, "Variable"]
    df[[col]] <- funcs[[tmp[i, "New_format"]]](df[[col]])
  }
}
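Since the question mentions lapply, the same idea can also be sketched without an explicit loop. This is only one possible variant, not the answerer's code: it assumes every New_format value has a matching base R as.<value> function that match.fun() can find, and the name to_change is made up for illustration.

```r
df <- data.frame(var1 = c("a", "b", "c"),
                 var2 = c(1, 2, 3),
                 var3 = c("d", "e", "f"),
                 stringsAsFactors = FALSE)

tmp <- data.frame(Variable = c("var1", "var2", "var3"),
                  Format = c("character", "numeric", "character"),
                  New_format = c("character", "integer", "factor"),
                  stringsAsFactors = FALSE)

# Rows of tmp whose target type differs from the current type
# (to_change is a hypothetical helper name)
to_change <- tmp[tmp$Format != tmp$New_format, ]

# For each such column, look up as.<New_format> by name and apply it
df[to_change$Variable] <- Map(
  function(col, type) match.fun(paste0("as.", type))(col),
  df[to_change$Variable],
  to_change$New_format
)

str(df)
```

Looking the function up by name avoids maintaining the funcs list by hand, at the cost of failing for any New_format label that has no as.<value> counterpart.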

Use readr::type_convert()

library(tidyverse)

# Build a compact type string from the first letter of each New_format, e.g. "cif"
types <- paste(map_chr(tmp$New_format, ~ str_sub(., 1, 1)), collapse = "")

new_df <- type_convert(df, types, guess_integer = TRUE)

str(new_df)
'data.frame':   3 obs. of  3 variables:
 $ var1: chr  "a" "b" "c"
 $ var2: int  1 2 3
 $ var3: Factor w/ 3 levels "d","e","f": 1 2 3

This function requires the type specification to be passed either as a cols() specification or as a string in which each column's new type is given by a single letter (e.g. "c" for character, "f" for factor, and so on).

So either rename the New_format labels to their single-letter versions ("c", "i", "f"), or use str_sub and paste on tmp to extract the first letters, which is the form type_convert expects for its col_types argument.
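Taking the first letter works for the three labels used here, but it can pick the wrong code for other labels. As a rough alternative sketch, the labels can be translated through an explicit lookup vector in base R; the vector name codes and the set of labels it covers are assumptions for illustration, with the letters taken from readr's compact column specification.

```r
tmp <- data.frame(Variable = c("var1", "var2", "var3"),
                  Format = c("character", "numeric", "character"),
                  New_format = c("character", "integer", "factor"),
                  stringsAsFactors = FALSE)

# Hypothetical lookup from type label to readr's single-letter code
codes <- c(character = "c", integer = "i", factor = "f",
           numeric = "n", double = "d", logical = "l")

# Index by label, then collapse into one string for col_types
types <- paste(codes[tmp$New_format], collapse = "")
types
```

A label missing from codes yields NA in the string, which makes an unsupported type easy to spot before it reaches type_convert.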

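Note: Make sure to set guess_integer = TRUE; otherwise the column may come back as double even though integer was requested. Also keep in mind that type_convert() is documented as re-converting character columns, so a column that is already numeric (such as var2 here) may need to be converted to character first, or handled with as.integer() instead.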
