Remove NA from a dataframe column in R
I have a dataframe named Resultaat:
Cluster Number
W63 1020 NA NA NA 1100
W50 1020 NA 1240 NA NA
I want to remove all the NA values and keep the numbers. The columns are defined as character.
Expected output:
Cluster Number
W63 1020 1100
W50 1020 1240
I tried things like
gsub("^NA(?:\\s+NA)*\\b\\s*|\\s*\\bNA(?:\\s+NA)*$", "", Resultaat$Number)
and
Resultaat <- Resultaat[!is.na(Resultaat)]
but nothing works.
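A likely reason the is.na() attempt has no effect: the NA values here are the two letters "NA" inside a longer character string, not real missing values. The short sketch below illustrates this, assuming the data looks like the sample in the question (the data frame is rebuilt here so the snippet is self-contained; the name `flat` is only for illustration).

```r
# Rebuild the sample data from the question (assumed structure)
Resultaat <- data.frame(
  Cluster = c("W63", "W50"),
  Number  = c("1020 NA NA NA 1100", "1020 NA 1240 NA NA"),
  stringsAsFactors = FALSE
)

# Neither string is a missing value, so is.na() finds nothing to drop
is.na(Resultaat$Number)

# The text "NA" is present as a token inside each string
grepl("\\bNA\\b", Resultaat$Number, perl = TRUE)

# Indexing a data frame with a logical matrix flattens it to a plain
# vector of cell values, so the table structure is lost as well
flat <- Resultaat[!is.na(Resultaat)]
length(flat)
```

So the cleanup has to work on the text of each string rather than on missing values.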
Here is one option: read the column 'Number' with read.table and unite all the columns, dropping the NA elements with na.rm = TRUE
library(tidyr)
library(dplyr)
read.table(text = Resultaat$Number, header = FALSE, fill = TRUE) %>%
unite(Number, everything(), na.rm = TRUE, sep = " ") %>%
bind_cols(Resultaat[1], .)
-output
Cluster Number
1 W63 1020 1100
2 W50 1020 1240
Regarding the gsub, it can be
gsub("\\s+NA|NA\\s+|NA$|^NA", "", Resultaat$Number)
[1] "1020 1100" "1020 1240"
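The pattern above removes the letters NA wherever they follow or precede whitespace, so it would also trim a token that merely starts with NA (for example NAB). If that matters, a slightly stricter variant is sketched below. It is an illustration rather than part of the original answer, and the helper name `drop_na_tokens` is hypothetical.

```r
# Hypothetical helper: remove only whole-word "NA" tokens, then tidy spaces
drop_na_tokens <- function(x) {
  # \\b anchors the match to word boundaries so "NAB" is left alone
  no_na <- gsub("\\bNA\\b", "", x, perl = TRUE)
  # collapse the runs of spaces left behind and trim both ends
  trimws(gsub("\\s+", " ", no_na))
}

Number <- c("1020 NA NA NA 1100", "1020 NA 1240 NA NA")
cleaned <- drop_na_tokens(Number)
cleaned
```

For the sample data this should give the same two strings as the gsub call above.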
Or we may also use tidyverse methods, as in
library(dplyr)
library(tidyr)
library(stringr)
Resultaat %>%
separate_rows(Number) %>%
mutate(Number = na_if(Number, "NA")) %>% # column-wise na_if(); passing the whole data frame fails in recent dplyr
drop_na() %>%
group_by(Cluster) %>%
summarise(Number = str_c(Number, collapse = " "))
-output
# A tibble: 2 × 2
Cluster Number
<chr> <chr>
1 W50 1020 1240
2 W63 1020 1100
Resultaat <- structure(list(Cluster = c("W63", "W50"),
Number = c("1020 NA NA NA 1100",
"1020 NA 1240 NA NA")), class = "data.frame", row.names = c(NA,
-2L))
Assuming all numbers and NAs are space separated:
library("tidyverse")
Resultaat$Number <- Resultaat$Number %>%
str_split(pattern = " ") %>%
map_chr(~ paste(.x[.x != "NA"], collapse = " "))
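Here is a base R option with regmatches and the pattern [^(NA) ]+. Note that this is a character class: it matches runs of characters other than "(", "N", "A", ")" and space, which works here because the remaining values consist of digits only.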
transform(
Resultaat,
Number = sapply(
regmatches(
Number,
gregexpr("[^(NA) ]+", Number)
),
paste0,
collapse = " "
)
)
which gives
Cluster Number
1 W63 1020 1100
2 W50 1020 1240
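Because a character class such as [^(NA) ]+ excludes individual letters, it would also drop any N or A inside a value (for example A12 would become 12). If the values may contain letters, a token-based base R variant avoids this. The sketch below assumes whitespace-separated tokens; the helper name `keep_non_na` is hypothetical and not from the original answer.

```r
# Sample data from the question (assumed structure)
Resultaat <- data.frame(
  Cluster = c("W63", "W50"),
  Number  = c("1020 NA NA NA 1100", "1020 NA 1240 NA NA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split on whitespace, drop "NA" tokens, re-join
keep_non_na <- function(x) {
  tokens <- strsplit(x, "\\s+")
  vapply(
    tokens,
    function(tok) paste(tok[tok != "NA" & nzchar(tok)], collapse = " "),
    character(1)
  )
}

Resultaat$Number <- keep_non_na(Resultaat$Number)
Resultaat
```

This compares whole tokens against the string "NA", so values such as A12 are kept intact.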