How to convert all columns where entries have length ≤1 to numeric?
I have a data frame with ~80 columns, and ~20-40 of those columns hold single-digit integers that were stored as characters. The other character columns are complete sentences (so, length >>> 1), and so get coerced to NA if I try mutate_if(is.character, as.numeric).
I would like to transform those columns efficiently, and based on this question, I was hoping for something like this:
df %>% map_if(is.character & length(.) <= 1, as.numeric)
However, that doesn't work. I'm hoping for a tidy solution, maybe using purrr?
The best function for these situations is type_convert(), from readr:
"[type_convert() re-converts character columns in a data frame], which is useful if you need to do some manual munging - you can read the columns in as character, clean it up with (eg) regular expressions and other transformations, and then let readr take another stab at parsing it."
So, all you need to do is add it at the end of your pipe:
df %>% ... %>% type_convert()
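As a minimal sketch of what this does, the example below runs type_convert() on a small, made-up data frame (the column names a and b are assumptions for illustration, not the asker's data). Note that type_convert() guesses from the values rather than from string length, so a character column of multi-digit numbers would be converted as well.

```r
library(readr)

# Hypothetical example data: one free-text column and one column of
# single digits stored as character
df <- data.frame(
  a = c("ab", "bc", "de", "de", "ef"),
  b = as.character(1:5),
  stringsAsFactors = FALSE
)

# Re-parse every character column; columns that do not parse as another
# type are left as character
converted <- type_convert(df)
str(converted)
```

readr also prints the column specification it guessed, which is a quick way to check that only the intended columns changed type.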
Alternatively, we can use type.convert from base R, which automatically detects each column's type from its values and converts it:
df[] <- type.convert(df, as.is = TRUE)
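A small sketch of the base R approach on made-up data (the data frame below is an assumption for illustration). With as.is = TRUE, columns that cannot be read as numbers or logicals stay character instead of becoming factors.

```r
# Hypothetical example data: one text column, one column of digits as character
df <- data.frame(
  a = c("ab", "bc", "de", "de", "ef"),
  b = as.character(1:5),
  stringsAsFactors = FALSE
)

# df[] <- ... keeps the data frame structure while replacing its columns
df[] <- type.convert(df, as.is = TRUE)

# Inspect the resulting column classes
sapply(df, class)
```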
If the constraint is to convert only the columns whose entries are all a single character:
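# TRUE for columns in which no entry is longer than one character
i1 <- !colSums(nchar(as.matrix(df)) > 1)
# as.is = TRUE keeps any non-numeric columns as character instead of factor
df[i1] <- type.convert(df[i1], as.is = TRUE)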
If we want to use tidyverse, there is parse_guess from readr:
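library(tidyverse)  # attaches dplyr and readr, so a separate library(readr) is not needed
df %>%
  # the predicate must be a formula (~) so that . refers to each column
  mutate_if(~ is.character(.) && all(nchar(.) == 1), parse_guess)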
You could check the nchar of each column in the mutate_if predicate:
library(dplyr)
df %>% mutate_if(~all(nchar(.) == 1) & is.character(.), as.numeric)
Using example data:
df <- data.frame(a = c("ab", "bc", "de", "de", "ef"),
b = as.character(1:5), stringsAsFactors = FALSE)
df1 <- df %>% mutate_if(~all(nchar(.) == 1) & is.character(.), as.numeric)
str(df1)
#'data.frame': 5 obs. of 2 variables:
# $ a: chr "ab" "bc" "de" "de" ...
# $ b: num 1 2 3 4 5
You could do the same with map_if as well; however, it returns a list, so you need to convert the result back to a data frame:
library(purrr)
df %>%
map_if(~all(nchar(.) == 1) & is.character(.), as.numeric) %>%
as.data.frame(., stringsAsFactors = FALSE)
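In dplyr 1.0.0 and later, the scoped verbs such as mutate_if are superseded by across(). The same idea could be sketched as follows; this is an alternative formulation and not part of the original answers, it assumes dplyr >= 1.0.0, and it uses <= 1 with na.rm = TRUE as one possible way to tolerate empty or missing entries.

```r
library(dplyr)

# Hypothetical example data, mirroring the example used above
df <- data.frame(
  a = c("ab", "bc", "de", "de", "ef"),
  b = as.character(1:5),
  stringsAsFactors = FALSE
)

# Convert only character columns whose entries are at most one character long
df2 <- df %>%
  mutate(across(
    where(~ is.character(.x) && all(nchar(.x) <= 1, na.rm = TRUE)),
    as.numeric
  ))

str(df2)
```

Unlike map_if, this returns a data frame directly, so no conversion step is needed afterwards.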