Count factor levels inside another factor - R

Assume that we have a data frame with hundreds of observations of cars. Each observation can be grouped by model, brand and country.

How can we count how many models of cars are produced in each country?

I tried:

janitor::tabyl(data, country, model)

But that gives me the number of observations for each model per country. What I am looking for is, for each country, the number of distinct models and a list of those models.

Example:

Country  n    model
Italy    4    Punto, Panda, Mito, Giulietta
Germany  3    Polo, Golf, X5 
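
To make the difference concrete: a two-way tabulation counts rows per country/model pair, whereas the number of distinct models per country is the number of nonzero cells in each row of that table. Below is a minimal base R sketch of that idea. The data frame `cars_df` and its values are made up for illustration and are not the question's actual data.

```r
# Illustrative data only (hypothetical, not the question's data set)
cars_df <- data.frame(
  country = c("Italy", "Italy", "Italy", "Germany", "Germany"),
  model   = c("Punto", "Punto", "Panda", "Polo", "Golf")
)

# Two-way tabulation: one cell per country/model pair, holding the row count
tab <- table(cars_df$country, cars_df$model)

# A model belongs to a country when its cell is nonzero, so counting
# nonzero cells per row gives the number of distinct models per country
n_models <- rowSums(tab > 0)
n_models
```

Here "Punto" appears twice for Italy but contributes only one nonzero cell, so it is counted once.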

You can do it with dplyr:

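# Sample data: some models appear more than once per country
df <- data.frame(
  Country = c("Italy", "Italy", "Italy", "Italy", "Germany", "Germany", "Germany"),
  Model   = c("Punto", "Panda", "Mito", "Mito", "Polo", "Golf", "Golf")
)

library(dplyr)

# For each country: count the distinct models and list them in one string
df %>%
  group_by(Country) %>%
  summarise(n = n_distinct(Model),
            model = toString(unique(Model)), .groups = 'drop')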

# A tibble: 2 x 3
  Country     n model             
  <chr>   <int> <chr>             
1 Germany     2 Polo, Golf        
2 Italy       3 Punto, Panda, Mito

Created on 2021-05-06 by the reprex package (v2.0.0)
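
If dplyr is not available, roughly the same result can be sketched in base R. This is only one possible approach and not part of the answer above. It reuses the answer's sample data, and the names `distinct_pairs`, `by_country` and `result` are made up for illustration.

```r
# Same sample data as in the dplyr answer
df <- data.frame(
  Country = c("Italy", "Italy", "Italy", "Italy", "Germany", "Germany", "Germany"),
  Model   = c("Punto", "Panda", "Mito", "Mito", "Polo", "Golf", "Golf")
)

# Drop duplicate Country/Model pairs so each model counts once per country
distinct_pairs <- unique(df[, c("Country", "Model")])

# Split the model names into one character vector per country
by_country <- split(distinct_pairs$Model, distinct_pairs$Country)

# One row per country: number of distinct models and a comma-separated list
result <- data.frame(
  Country = names(by_country),
  n       = lengths(by_country),
  model   = vapply(by_country, toString, character(1)),
  row.names = NULL
)
result
```

The `unique()` step plays the role of `n_distinct()` and `unique(Model)` in the dplyr version: duplicates are removed once up front, so both the count and the list are built from the same deduplicated pairs.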
