
Count factor levels inside another factor - R

Assume that we have a data frame with hundreds of observations of cars. Each observation can be grouped by model, brand and country.

How can we count how many models of cars are produced in each country?

I tried:

janitor::tabyl(data, country, model)

But this gives me the number of observations for each model per country. What I want instead is, for each country, the number of distinct models and a list of those models.

Example of the desired output:

Country  n    model
Italy    4    Punto, Panda, Mito, Giulietta
Germany  3    Polo, Golf, X5 

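You can do this with dplyr: group by country, count the distinct models with n_distinct(), and collapse the unique model names into one string with toString().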

df <- data.frame(
  Country = c("Italy", "Italy", "Italy", "Italy", "Germany", "Germany", "Germany"),
  Model   = c("Punto", "Panda", "Mito", "Mito", "Polo", "Golf", "Golf")
)

library(dplyr)

# For each country: count the distinct models and collapse them into one string
df %>%
  group_by(Country) %>%
  summarise(n = n_distinct(Model),
            model = toString(unique(Model)), .groups = 'drop')

# A tibble: 2 x 3
  Country     n model             
  <chr>   <int> <chr>             
1 Germany     2 Polo, Golf        
2 Italy       3 Punto, Panda, Mito

Created on 2021-05-06 by the reprex package (v2.0.0)
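
If you would rather avoid the dplyr dependency, the same grouping can be sketched in base R. This is only a minimal illustration on the same toy data, not part of the original answer; the names models_by_country and result are made up for the example.

```r
# Same toy data as above
df <- data.frame(
  Country = c("Italy", "Italy", "Italy", "Italy", "Germany", "Germany", "Germany"),
  Model   = c("Punto", "Panda", "Mito", "Mito", "Polo", "Golf", "Golf")
)

# Split the models by country, then keep each model once per country
# (models_by_country is a hypothetical helper name)
models_by_country <- lapply(split(df$Model, df$Country), unique)

# One row per country: number of distinct models and a comma-separated list
result <- data.frame(
  Country = names(models_by_country),
  n       = unname(lengths(models_by_country)),
  model   = unname(vapply(models_by_country, toString, character(1)))
)

print(result)
```

split() orders the groups by the sorted country names, so the rows come out in the same order as in the dplyr result.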

