Check if a column of a data frame has empty cells
I am checking whether my column (name) has any empty cells, but I am getting an error. Is there a solution?
This is what I am trying. Also, how can I handle cells that contain only spaces? I mean: strip the spaces from a cell if it has any, then check whether it is empty. I don't want to change the original name column; while checking, I just want to ignore spaces or NA and then check whether the cells are empty.
df8 <- data.frame(name=c("try,xab","xab,Lan","mhy,mun","vgtu,mmc","dgsy,aaf","kull,nnhu","hula,njam","mund,jiha","htfy,ntha","","sgyu,hytb","vdti,kula","mftyu,huta","","cday,bhsue","ajtu,nudj"),
email=c("xab.try@ybcd.com","Lan.xab@ybcd.com","tth.vgu@ybcd.com","mmc.vgtu@ybcd.com","aaf.dgsy@ybcd.com","nnhu.kull@ybcd.com","njam.hula@ybcd.com","jiha.mund@ybcd.com","ntha.htfy@ybcd.com","gydbt.bhr@ybcd.com","hytb.sgyu@ybcd.com","kula.vdti@ybcd.com","huta.mftyu@ybcd.com","ggat.khul@ybcd.com","bhsue.cday@ybcd.com","nudj.ajtu@ybcd.com"))
df8 <- df8 %>% mutate(is_blank_node = which(df8$name == "", arr.ind = TRUE),1 )
Error:
Error: Problem with mutate() input is_blank_node.
x Input is_blank_name can't be recycled to size 182753.
i Input is_blank_name is which(df$Name == "", arr.ind = TRUE).
i Input is_blank_name must be size 182753 or 1, not 0.
Expected output: (not preserved)
You don't need which at all. In fact, it causes the error here: which returns only the positions of the elements for which your test is TRUE, so for this example the result has length 2 (and, judging by the error message, length 0 in your real data), and that cannot be recycled to the number of rows. mutate can take the result of name == "" directly. dplyr also already knows that you are evaluating the column name within df8, so you can (and should) omit df8$:
df8 <- data.frame(name=c("try,xab","xab,Lan","mhy,mun","vgtu,mmc","dgsy,aaf","kull,nnhu","hula,njam","mund,jiha","htfy,ntha","","sgyu,hytb","vdti,kula","mftyu,huta","","cday,bhsue","ajtu,nudj"),
email=c("xab.try@ybcd.com","Lan.xab@ybcd.com","tth.vgu@ybcd.com","mmc.vgtu@ybcd.com","aaf.dgsy@ybcd.com","nnhu.kull@ybcd.com","njam.hula@ybcd.com","jiha.mund@ybcd.com","ntha.htfy@ybcd.com","gydbt.bhr@ybcd.com","hytb.sgyu@ybcd.com","kula.vdti@ybcd.com","huta.mftyu@ybcd.com","ggat.khul@ybcd.com","bhsue.cday@ybcd.com","nudj.ajtu@ybcd.com"))
library(tidyverse)
df8 %>%
mutate(is_blank_node = name == "")
#> name email is_blank_node
#> 1 try,xab xab.try@ybcd.com FALSE
#> 2 xab,Lan Lan.xab@ybcd.com FALSE
#> 3 mhy,mun tth.vgu@ybcd.com FALSE
#> 4 vgtu,mmc mmc.vgtu@ybcd.com FALSE
#> 5 dgsy,aaf aaf.dgsy@ybcd.com FALSE
#> 6 kull,nnhu nnhu.kull@ybcd.com FALSE
#> 7 hula,njam njam.hula@ybcd.com FALSE
#> 8 mund,jiha jiha.mund@ybcd.com FALSE
#> 9 htfy,ntha ntha.htfy@ybcd.com FALSE
#> 10 gydbt.bhr@ybcd.com TRUE
#> 11 sgyu,hytb hytb.sgyu@ybcd.com FALSE
#> 12 vdti,kula kula.vdti@ybcd.com FALSE
#> 13 mftyu,huta huta.mftyu@ybcd.com FALSE
#> 14 ggat.khul@ybcd.com TRUE
#> 15 cday,bhsue bhsue.cday@ybcd.com FALSE
#> 16 ajtu,nudj nudj.ajtu@ybcd.com FALSE
Created on 2020-09-17 by the reprex package (v0.3.0)
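To see why which() breaks inside mutate(), it can help to compare the two results on a small vector. The sketch below uses base R only, and the vector x is a made-up stand-in for the name column, not the asker's data.

```r
# Hypothetical stand-in for the name column (not the asker's data)
x <- c("try,xab", "", "mhy,mun", "", "ajtu,nudj")

# The comparison returns one TRUE/FALSE per element, so it has the
# same length as the column and can be stored as a new column.
is_blank <- x == ""
length(is_blank)   # 5

# which() keeps only the positions of the TRUE values, so its length
# depends on how many blanks there are.
blank_pos <- which(x == "")
blank_pos          # 2 4

# With no exact "" values the result is empty, which is consistent with
# the "must be size 182753 or 1, not 0" message in the question.
no_blank_pos <- which(c("a", " ", NA) == "")
length(no_blank_pos)   # 0
```

The positions from which() are still useful for subsetting, for example x[blank_pos], just not as a column of the same length as the data.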
TRUE and FALSE are basically equivalent to 1 and 0, just of logical instead of integer / numeric type. You can try this with TRUE * 1, which turns the logical into a numeric value. Or use as.integer directly.
To get around the problem of cells being filled only with whitespace or NA, you can also include extra steps. Since this is getting a bit verbose, we can wrap it in a function:
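# Return 1 for cells that are NA, empty, or whitespace-only; 0 otherwise
check_blank <- function(x) {
  x <- as.character(x)  # so factor columns are compared by label, not by code
  # Treat NA as "", trim leading/trailing whitespace, then compare with ""
  as.integer(trimws(ifelse(is.na(x), "", x)) == "")
}
df8 %>%
  mutate(is_blank_node = check_blank(name))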
#> name email is_blank_node
#> 1 try,xab xab.try@ybcd.com 0
#> 2 xab,Lan Lan.xab@ybcd.com 0
#> 3 mhy,mun tth.vgu@ybcd.com 0
#> 4 vgtu,mmc mmc.vgtu@ybcd.com 0
#> 5 dgsy,aaf aaf.dgsy@ybcd.com 0
#> 6 kull,nnhu nnhu.kull@ybcd.com 0
#> 7 hula,njam njam.hula@ybcd.com 0
#> 8 mund,jiha jiha.mund@ybcd.com 0
#> 9 htfy,ntha ntha.htfy@ybcd.com 0
#> 10 gydbt.bhr@ybcd.com 1
#> 11 sgyu,hytb hytb.sgyu@ybcd.com 0
#> 12 vdti,kula kula.vdti@ybcd.com 0
#> 13 mftyu,huta huta.mftyu@ybcd.com 0
#> 14 ggat.khul@ybcd.com 1
#> 15 cday,bhsue bhsue.cday@ybcd.com 0
#> 16 ajtu,nudj nudj.ajtu@ybcd.com 0
Created on 2020-09-17 by the reprex package (v0.3.0)
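The example data only contains exact empty strings, so it does not exercise the NA and whitespace handling. The sketch below runs a helper written along the same lines on a made-up vector. The name is_blank_cell is hypothetical, and the as.character() call is an assumption added so that factor columns also work. any() and sum() then answer the original question of whether the column contains blank cells at all.

```r
# Hypothetical helper: 1 for NA, "" or whitespace-only cells, else 0
is_blank_cell <- function(x) {
  x <- as.character(x)   # assumption: also accept factor columns
  as.integer(is.na(x) | trimws(x) == "")
}

# Made-up test values: normal, empty, spaces only, tab, NA
x <- c("try,xab", "", "   ", "\t", NA)
flags <- is_blank_cell(x)
flags             # 0 1 1 1 1

any(flags == 1)   # TRUE: the column has at least one blank cell
sum(flags)        # 4 blank cells in total
x                 # the input vector itself is unchanged
```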
Maybe you can try nchar, as below:
df8 %>%
mutate(is_blank_node = +(nchar(name)==0))
or nzchar:
df8 %>%
mutate(is_blank_node = +!nzchar(name))
which gives
> df8 %>%
+ mutate(is_blank_node = +(nchar(name)==0))
name email is_blank_node
1 try,xab xab.try@ybcd.com 0
2 xab,Lan Lan.xab@ybcd.com 0
3 mhy,mun tth.vgu@ybcd.com 0
4 vgtu,mmc mmc.vgtu@ybcd.com 0
5 dgsy,aaf aaf.dgsy@ybcd.com 0
6 kull,nnhu nnhu.kull@ybcd.com 0
7 hula,njam njam.hula@ybcd.com 0
8 mund,jiha jiha.mund@ybcd.com 0
9 htfy,ntha ntha.htfy@ybcd.com 0
10 gydbt.bhr@ybcd.com 1
11 sgyu,hytb hytb.sgyu@ybcd.com 0
12 vdti,kula kula.vdti@ybcd.com 0
13 mftyu,huta huta.mftyu@ybcd.com 0
14 ggat.khul@ybcd.com 1
15 cday,bhsue bhsue.cday@ybcd.com 0
16 ajtu,nudj nudj.ajtu@ybcd.com 0
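One thing worth checking before relying on nchar or nzchar is how they behave for NA and for cells that hold only spaces, since the question asks about both. The sketch below uses a made-up character vector; the last line is one possible adjustment, not part of the answer above.

```r
# Made-up values: normal, empty, spaces only, missing
x <- c("abc", "", "  ", NA)

nchar(x)            # 3 0 2 NA
nzchar(x)           # TRUE FALSE TRUE TRUE

+(nchar(x) == 0)    # 0 1 0 NA
+!nzchar(x)         # 0 1 0 0

# One possible adjustment: trim first and treat NA as blank
adjusted <- +(is.na(x) | !nzchar(trimws(x)))
adjusted            # 0 1 1 1
```

So with either function, whitespace-only cells are not flagged, and NA comes out as NA (nchar) or 0 (nzchar) unless it is handled explicitly.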
Base R solution, checking every column for empty strings:
data.frame(+(t(apply(df8, 1, `==`, ""))))
Base R, with the results column-bound to the original data.frame:
cbind(df8, setNames(data.frame(+(t(apply(df8, 1, `==`, "")))),
paste("empty", names(df8), sep = "_")))
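As a side note, the row-wise apply() is not strictly required, because comparing a whole data frame with "" already returns a logical matrix. The sketch below checks this on a made-up two-column data frame (small is a hypothetical name) and also shows colSums() as one way to count empty strings per column.

```r
# Made-up two-column data frame
small <- data.frame(a = c("x", "", "y"),
                    b = c("", "p", "q"),
                    stringsAsFactors = FALSE)

# Row-wise version, as in the answer above
via_apply <- data.frame(+(t(apply(small, 1, `==`, ""))))

# Comparing the whole data frame at once returns a logical matrix
via_ops <- data.frame(+(small == ""))

# Both hold the same 0/1 flags
same_flags <- all(as.matrix(via_apply) == as.matrix(via_ops))
same_flags   # TRUE

# Number of empty strings per column
empty_counts <- colSums(small == "")
empty_counts
```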
This will add a column with your desired values (1 or 0):
df8 <- df8 %>% mutate(is_blank_node = ifelse(name == "", 1, 0))
Update: added a line that trims leading and trailing whitespace from the column and then checks whether the cell is empty. Note that this overwrites the name column with the trimmed values.
df8 <- df8 %>%
mutate(name = trimws(name, which = "both")) %>%
mutate(is_blank_node = ifelse(name == "", 1, 0))
Update 2: This will give a '1' to any cell detected as blank or containing only spaces (no matter how many), and a '0' to anything else. This does not change the contents of the original column.
library(tidyverse)
df8 <- df8 %>% mutate(is_blank_node = ifelse(name == "" | str_detect(name, '^\\s*$'), 1, 0))
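The question also mentions NA. With str_detect, an NA cell yields NA rather than 1, so it may need separate handling. As a rough base R sketch of the same idea (no stringr needed), the example below uses grepl with a POSIX character class on a made-up vector; the explicit is.na() check is an assumption about how NA should be counted.

```r
# Made-up values: normal, empty, spaces only, missing
name <- c("try,xab", "", "    ", NA)

# grepl() returns FALSE for NA, so NA is handled explicitly;
# "^[[:space:]]*$" matches empty and whitespace-only strings.
is_blank_node <- as.integer(is.na(name) | grepl("^[[:space:]]*$", name))
is_blank_node   # 0 1 1 1

# The original values are untouched
name
```

Because the pattern already matches the empty string, a separate name == "" test is not needed in this version.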