How to remove NAs from a lazy query in R

I need help with this database: https://www.kaggle.com/datasets/hugomathien/soccer. I want to find how many players are right-footed and how many are left-footed, using the preferred_foot column of the Player_Attributes table and dplyr's group_by and summarize. When I run this in R:

con <- DBI::dbConnect(RSQLite::SQLite(), "data/database.sqlite")
library(tidyverse)
library(DBI)
player_attributes<-tbl(con,"Player_Attributes")
Table_preferred_foot<- player_attributes %>%
  group_by(preferred_foot) %>%
  summarize(number_of_players=count(preferred_foot))
head(Table_preferred_foot)

I get the number of right- and left-footed players, and I also get that the number of NAs is 0. But if I run:

player_attributes %>%
  group_by(preferred_foot) %>%
  count()

I get the same numbers of right- and left-footed players as before, but the number of NAs is now 836, which is the real number of NAs. How can I get the correct answer using both summarize and group_by?

Also, is there a direct function to check whether a variable of a lazy query contains any NAs, and to remove NAs from it, as there is for regular data frames? (Basic functions like na.omit() do not work.)

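You can use group_by and summarise with n(), as in the first snippet below. count() combines the two steps into a single call, as in the second snippet. To drop the NAs, filter them out first, as in the third snippet. The code in the question reports 0 for the NA group because count(preferred_foot) inside summarise is sent to the database as SQL COUNT(preferred_foot), which skips NULL values, whereas n() becomes COUNT(*) and counts every row.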

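library(tidyverse)

con <- DBI::dbConnect(RSQLite::SQLite(), "database.sqlite")

# Snippet 1: group_by + summarise; n() counts every row per group, NA group included
tbl(con, "Player_Attributes") %>%
  group_by(preferred_foot) %>%
  summarise(n = n())

# Snippet 2: count() does the grouping and counting in one call
tbl(con, "Player_Attributes") %>%
  count(preferred_foot)

# Snippet 3: drop rows where preferred_foot is NA before counting
tbl(con, "Player_Attributes") %>%
  filter(!is.na(preferred_foot)) %>%
  count(preferred_foot)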
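
To see the difference without downloading the Kaggle file, here is a minimal sketch on an in-memory SQLite database. The table name toy and its five rows are made up for illustration and are not part of the soccer database; the sketch assumes the dplyr, dbplyr, DBI and RSQLite packages are installed. It also shows one possible way to check for NAs and to remove them, either inside the database with filter() or locally after collect().

```r
library(dplyr)
library(DBI)

# Hypothetical toy table standing in for Player_Attributes (assumption, not real data)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "toy", data.frame(
  preferred_foot = c("right", "right", "left", NA, NA)
))
toy <- tbl(con, "toy")

# count(col) inside summarise is passed to SQL as COUNT(col), which skips NULLs
by_col <- toy %>%
  group_by(preferred_foot) %>%
  summarise(n = count(preferred_foot)) %>%
  collect()

# n() is translated to COUNT(*), which counts every row in the group
by_row <- toy %>%
  group_by(preferred_foot) %>%
  summarise(n = n()) %>%
  collect()

# Print the SQL that dbplyr generates for the n() version
toy %>%
  group_by(preferred_foot) %>%
  summarise(n = n()) %>%
  show_query()

# Check for NAs: count the rows where the column is missing
n_missing <- toy %>%
  filter(is.na(preferred_foot)) %>%
  count() %>%
  pull(n)

# Remove NAs lazily, inside the database
toy_clean <- toy %>%
  filter(!is.na(preferred_foot))

# Or pull the data into R first; na.omit() works on the local tibble
local_clean <- toy %>%
  collect() %>%
  na.omit()

dbDisconnect(con)
```

Comparing by_col with by_row shows the NA group counted as 0 in the first and with its real row count in the second. Filtering inside the database is usually preferable for a large table, since collect() downloads every row before na.omit() runs.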
