Remove rows with NA values in a specific column

Question

I have a huge dataset of about 1.6 million rows, and the variable (column) I need to focus on is 'temperature'. The temperature column has many NA values, and the other columns have NA values throughout as well. I want to remove only the rows with NA values in the temperature column; I don't particularly care about the NA values in the other columns. How can I do this? If I end up needing to remove rows with NA values in more than just my temperature column (e.g. the depth column), how can I select two columns? This is my code:

otn <- tidync(filename, row.names=TRUE) %>% activate('D0')
glider_table <- hyper_tibble(otn)
attach(glider_table)
summary(temperature)
na.omit(glider_table)

na.omit() removes all rows with NA values regardless of which column they're in, so I need something more selective.

Answer 1

You can use the drop_na() function from the tidyr package. The first argument is the dataset, and the optional arguments after it name the specific columns to check; a row is dropped only if it has an NA in one of those columns. Like this: drop_na(dataset, column). To check more than one column, list them all: drop_na(dataset, column1, column2).
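To make the difference from na.omit() concrete, here is a minimal sketch. The small data frame is a made-up stand-in for glider_table (the values and the salinity column are assumptions for illustration, not the asker's data); only drop_na(), is.na() and complete.cases() are real functions.

```r
library(tidyr)

# Hypothetical stand-in for glider_table; values are invented for illustration
glider_table <- data.frame(
  temperature = c(10.2, NA, 11.5, 9.8),
  depth       = c(5, 12, NA, 20),
  salinity    = c(NA, 35.1, 35.0, NA)
)

# Drop only the rows where temperature is NA; NAs in other columns are kept
temp_only <- drop_na(glider_table, temperature)

# Drop rows where temperature OR depth is NA
temp_depth <- drop_na(glider_table, temperature, depth)

# Base R alternatives that need no extra package
temp_only_base  <- glider_table[!is.na(glider_table$temperature), ]
temp_depth_base <- glider_table[
  complete.cases(glider_table[, c("temperature", "depth")]),
]
```

Note that drop_na() returns a new table rather than modifying glider_table in place, so the result has to be assigned to a name, as above. The same is true of na.omit().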

Solution 1
1 2020-02-12 20:47:46