How to remove NA from a lazy query in R

Question

I need help with this database: https://www.kaggle.com/datasets/hugomathien/soccer. I want to use the preferred_foot column of the player_attributes table, together with dplyr's group_by and summarize. When I run this in R:

library(tidyverse)
library(DBI)

con <- DBI::dbConnect(RSQLite::SQLite(), "data/database.sqlite")
player_attributes <- tbl(con, "Player_Attributes")
Table_preferred_foot <- player_attributes %>%
  group_by(preferred_foot) %>%
  summarize(number_of_players = count(preferred_foot))
head(Table_preferred_foot)

I get the number of right-footed and left-footed players, and I also get a count of 0 for NA. But if I run:

player_attributes %>%
  group_by(preferred_foot) %>%
  count()

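I get the number of right-footed and left-footed players (the same numbers as before), but the count for NA is 836, which is the true number of NAs. How can I get the correct answer using summarize and group_by?

Also, is there a direct function to check whether a variable in a lazy query contains any NAs, and to remove NAs from a variable in a lazy query, as you would with a regular data frame? (Base functions like na.omit() don't work.)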

Answer 1

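You can group_by and summarise with n(), as in the first snippet. count() combines the two into a single line, as in the second snippet. You can filter out the NAs before counting, as in the third snippet.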

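library(tidyverse)

con <- DBI::dbConnect(RSQLite::SQLite(), "database.sqlite")

# Snippet 1: group_by + summarise, counting rows with n()
tbl(con, "Player_Attributes") %>%
  group_by(preferred_foot) %>%
  summarise(n = n())

# Snippet 2: count() does the grouping and counting in one step
tbl(con, "Player_Attributes") %>%
  count(preferred_foot)

# Snippet 3: drop rows where preferred_foot is NA, then count
tbl(con, "Player_Attributes") %>%
  filter(!is.na(preferred_foot)) %>%
  count(preferred_foot)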
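
As for why the original summarize gave 0: most likely count(preferred_foot) inside summarize is sent to the database as SQL COUNT(preferred_foot), which skips NULL values, while n() becomes COUNT(*) and counts every row in the group. Below is a minimal sketch of the three steps (count per group, check for NAs, remove NAs) that runs without the Kaggle file. The in-memory table and its five rows are made up for illustration, and the names players, by_foot, n_missing and without_na are hypothetical.

```r
library(dplyr)  # tbl() on a DBI connection also needs dbplyr installed

# Toy stand-in for Player_Attributes (made-up rows, not the Kaggle data)
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "Player_Attributes", data.frame(
  preferred_foot = c("right", "right", "left", NA, NA),
  stringsAsFactors = FALSE
))
players <- tbl(con, "Player_Attributes")

# n() counts rows, so the NA group gets its real size
by_foot <- players %>%
  group_by(preferred_foot) %>%
  summarise(number_of_players = n()) %>%
  collect()

# Check for missing values: count the rows where the column is NA
n_missing <- players %>%
  filter(is.na(preferred_foot)) %>%
  summarise(n = n()) %>%
  pull(n)

# Remove the missing rows before counting
without_na <- players %>%
  filter(!is.na(preferred_foot)) %>%
  count(preferred_foot) %>%
  collect()

DBI::dbDisconnect(con)
```

Everything before collect() or pull() stays lazy and runs inside the database, so filter(!is.na(...)) plays the role that na.omit() plays for a single column of a regular data frame.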

Solution 1 (0) 2022-05-21 10:15:43
