What is the most efficient way to correct text type data in a column?
fito <- c("forest", "savaaaana", "brae soil", "bare soil", "savanna", "froest")
id <- 1:6
df <- data.frame(fito = as.factor(fito), id = id)
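What is the smartest way to replace the mistyped values ("savaaaana", "brae soil", "froest") with the correct ones ("savanna", "bare soil", "forest")?
At the start I have six factor levels. Only three of them are correct.
How can I do this with the tidyverse package?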
You can try:
library(tidyverse)

# Collapse each misspelling into its correctly spelled level
df2 <- df %>%
  mutate(fito = fct_collapse(fito,
    savanna = c("savaaaana", "savanna"),
    `bare soil` = c("brae soil", "bare soil"),
    forest = c("forest", "froest")))
str(df2)
'data.frame': 6 obs. of 2 variables:
$ fito: Factor w/ 3 levels "bare soil","forest",..: 2 3 1 1 3 2
$ id : int 1 2 3 4 5 6
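To confirm that the collapse left only the three intended levels, it can help to inspect the factor directly. The following is a minimal sketch, assuming dplyr and forcats are installed; df is rebuilt from the question so the block runs on its own.

```r
library(dplyr)
library(forcats)

# Sample data from the question
df <- data.frame(
  fito = factor(c("forest", "savaaaana", "brae soil", "bare soil", "savanna", "froest")),
  id = 1:6
)

# Same collapse as in the answer above
df2 <- df %>%
  mutate(fito = fct_collapse(fito,
    savanna = c("savaaaana", "savanna"),
    `bare soil` = c("brae soil", "bare soil"),
    forest = c("forest", "froest")
  ))

levels(df2$fito)     # the levels that remain after collapsing
fct_count(df2$fito)  # number of rows per remaining level
```

If a misspelling was missed in the mapping, it would still show up here as an extra level.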
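There are a couple of ways to do this:
library(tidyverse)
old <- c("savaaaana", "brae soil", "froest")
new <- c("savanna", "bare soil", "forest")
# Named vector: names are the patterns (old), values are the replacements (new)
df %>%
  mutate(fito = factor(str_replace_all(fito, set_names(new, old))))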
fito id
1 forest 1
2 savanna 2
3 bare soil 3
4 bare soil 4
5 savanna 5
6 forest 6
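One thing to keep in mind with str_replace_all() is that each name in the vector is treated as a regular expression and matched against substrings, so a pattern could also alter a longer value that happens to contain it. The following is a sketch of an exact-match variant, assuming a named lookup vector; the names lookup and fixed are illustrative and not part of the original answer.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  fito = factor(c("forest", "savaaaana", "brae soil", "bare soil", "savanna", "froest")),
  id = 1:6
)

# Hypothetical lookup table: names are the misspellings, values the corrections
lookup <- c(savaaaana = "savanna", `brae soil` = "bare soil", froest = "forest")

fixed <- df %>%
  mutate(
    fito = as.character(fito),
    # lookup[fito] is NA wherever the value is not a known misspelling,
    # so coalesce() falls back to the original value in those rows
    fito = factor(coalesce(unname(lookup[fito]), fito))
  )

fixed
```

Only values that equal a misspelling in full are replaced; everything else passes through unchanged.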
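# Pass the new = old pairs to fct_recode() as named arguments
# Note: lift() is deprecated as of purrr 1.0.0 and may emit a warning
df %>%
  mutate(fito = lift(fct_recode)(as.list(set_names(old, new)), fito))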
fito id
1 forest 1
2 savanna 2
3 bare soil 3
4 bare soil 4
5 savanna 5
6 forest 6
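# Same idea: build the argument list and call fct_recode() with it
# Note: invoke() is deprecated as of purrr 1.0.0 and may emit a warning
df %>%
  mutate(fito = invoke(fct_recode, c(list(fito), as.list(set_names(old, new)))))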
fito id
1 forest 1
2 savanna 2
3 bare soil 3
4 bare soil 4
5 savanna 5
6 forest 6
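Since lift() and invoke() are deprecated in recent purrr releases, the same recoding can be written by splicing a named vector into fct_recode() with !!!. This is a minimal sketch, assuming dplyr and forcats are installed; df3 is an illustrative name, and base setNames() is used in place of set_names() so that purrr is not required.

```r
library(dplyr)
library(forcats)

# Sample data from the question
df <- data.frame(
  fito = factor(c("forest", "savaaaana", "brae soil", "bare soil", "savanna", "froest")),
  id = 1:6
)

old <- c("savaaaana", "brae soil", "froest")
new <- c("savanna", "bare soil", "forest")

# fct_recode() expects new_level = "old_level" pairs;
# !!! splices the named vector in as individual arguments
df3 <- df %>%
  mutate(fito = fct_recode(fito, !!!setNames(old, new)))

df3
levels(df3$fito)
```

Levels that are not named in the mapping, such as the already correct "forest", are left as they are.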