Return most common value in column by group, replace null in that column with that value

Question

I'd like to replace the na values in my df column with the most common value by group

#Ex:

df <- data.frame(Home_Abbr = c('PHI', 'PHI', 'DAL', 'PHI'),
                 Home_City = c('Philadelphia', 'Philadelphia', 'Dallas', NULL))

#Desired Result

Home_Abbr   Home_City

PHI         Philadelphia
PHI         Philadelphia
DAL         Dallas
PHI         Philadelphia

Here is what I've tried so far:

df <- df %>%
  group_by(Home_Abbr) %>%
  mutate(Home_City = names(which.max(table(Home_City))))

But when I run this I get a 'Can't combine NULL and non NULL results' Error.

Answer 1

We can use Mode function

 Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

and then replace

library(dplyr)
df %>% 
  group_by(Home_Abbr) %>%
  mutate(Home_City = replace(Home_City, is.na(Home_City), 
      Mode(Home_City))) %>%
  ungroup

-output

# A tibble: 4 × 2
  Home_Abbr Home_City   
  <chr>     <chr>       
1 PHI       Philadelphia
2 PHI       Philadelphia
3 DAL       Dallas      
4 PHI       Philadelphia

data

df <- structure(list(Home_Abbr = c("PHI", "PHI", "DAL", "PHI"), Home_City = c("Philadelphia", 
"Philadelphia", "Dallas", NA)), class = "data.frame", row.names = c(NA, 
-4L))

Return most common value in column by group, replace null in that column with that value

Question

1 answers

solution1
2 ACCPTED 2022-05-16 19:55:50

data

Return most common value in column by group, replace null in that column with that value

Question

1 answers

solution1 2 ACCPTED 2022-05-16 19:55:50

data

solution1
2 ACCPTED 2022-05-16 19:55:50