
How do I loop through the elements of a list as a part of dplyr's filter capability and eventually print variance in R?

I have a dataframe called diamondsData that I would like to manipulate in such a way that I can calculate the variance in carat size for each type of cut in R.

I want to get the variance in carat size for each type of cut, e.g. Ideal, Premium, and so on.

I can easily do it manually as follows:

diamondsData %>% 
  filter(cut=="Fair") %>% 
  select(carat) %>% 
  var()

I can change "Fair" in filter(cut=="Fair") to "Ideal" and so on, but I would like to automate this. I tried creating a list and looping through it like this:

y <- list("Ideal", "Premium", "Good", "Very Good", "Fair")

for(x in y){
  print(
    diamondsData %>% 
      filter(cut==x) %>% 
      select(carat) %>% 
      var()
  )
}

but this doesn't work. This resulted in:

[Image: the result of my loop attempt]

I am working in R. Any suggestions?

Use group_by instead of looping:

diamondsData %>%
  group_by(cut) %>%
  select(carat) %>%
  summarize(variability = var(carat))

It should give you a dataframe containing the variance of carat within each cut.

Edited to replace %>% var() with %>% summarise(....).
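As a rough check on the grouped approach, the sketch below runs it on a small made-up tibble and compares one group against the manual filter-based calculation. The tibble `toy` and its values are hypothetical stand-ins for diamondsData, not the real dataset, and the redundant select() step is left out.

```r
library(dplyr)

# Hypothetical stand-in for diamondsData (made-up values)
toy <- tibble(
  cut   = c("Fair", "Fair", "Fair", "Ideal", "Ideal", "Ideal"),
  carat = c(1.0, 2.0, 3.0, 0.5, 0.5, 1.1)
)

# One row per cut, holding the variance of carat within that group
by_cut <- toy %>%
  group_by(cut) %>%
  summarise(variability = var(carat))

# The manual calculation for a single cut, for comparison
fair_manual <- toy %>%
  filter(cut == "Fair") %>%
  pull(carat) %>%
  var()

print(by_cut)
print(fair_manual)
```

The "Fair" row of by_cut should agree with fair_manual, which suggests the grouped summary can replace one filter call per cut.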

The issue is that var works most naturally on a vector, while select returns a data.frame/tibble as output (even when only one column is selected), so var() returns a 1x1 covariance matrix rather than a plain number. We can use pull to extract the column as a vector:

for(x in y){
  print(
    diamondsData %>% 
      filter(cut==x) %>% 
      pull(carat) %>% 
      var(., na.rm = TRUE)
  )
}
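To make the select versus pull difference concrete, here is a minimal sketch on a made-up tibble. The names `toy`, `cuts`, and `variances` are hypothetical, and the sapply call is only one possible way to collect the per-cut results into a named vector instead of printing them inside a loop.

```r
library(dplyr)

# Hypothetical stand-in for diamondsData (made-up values)
toy <- tibble(
  cut   = c("Fair", "Fair", "Fair", "Ideal", "Ideal", "Ideal"),
  carat = c(1.0, 2.0, 3.0, 0.5, 0.5, 1.1)
)

# select() keeps a one-column tibble, so var() returns a 1x1 matrix
via_select <- toy %>% filter(cut == "Fair") %>% select(carat) %>% var()

# pull() extracts a plain numeric vector, so var() returns a single number
via_pull <- toy %>% filter(cut == "Fair") %>% pull(carat) %>% var()

print(is.matrix(via_select))
print(is.matrix(via_pull))

# Collect one variance per cut into a named numeric vector
cuts <- c("Fair", "Ideal")
variances <- sapply(cuts, function(x) {
  toy %>% filter(cut == x) %>% pull(carat) %>% var(na.rm = TRUE)
})
print(variances)
```

Storing the results in a named vector keeps them available for later steps, which printing inside the loop does not.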
