Loop over grouped columns in R and apply a function

Question

Hello everyone, I need help looping over a dataframe by groups defined by several columns.

Here is an example dataframe:

  Group       Species Values
1    G1 Cattus_cattus   Val1
2    G1 Cattus_cattus   Val2
3    G1 Cattus_cattus   Val3
4    G2   Canis_lupus   Val4
5    G2   Canis_lupus   Val5
6    G3  Griseus_lupa   Val6
7    G4  Griseus_lupa   Val7

I would like to:

1 - loop over the groups defined by df$Group and df$Species

2 - take the df$Values of each group and store them as a vector

3 - pass that vector to a function called afunction

4 - open a treefile with anotherfunction, where the treefile's name is the df$Group name

5 - get the output value of afunction and add it to a new_column

So here is an example of what the code should do:

The first group is G1, Cattus_cattus:

  Group       Species Values
1    G1 Cattus_cattus   Val1
2    G1 Cattus_cattus   Val2
3    G1 Cattus_cattus   Val3

Then I open the treefile with treefile <- anotherfunction("G1")

Then I generate the output value with output_value <- afunction(treefile, c("Val1","Val2","Val3"))

Say that output_value = 30,

so I add 30 to the df:

  Group       Species Values new_column
1    G1 Cattus_cattus   Val1 30
2    G1 Cattus_cattus   Val2 30
3    G1 Cattus_cattus   Val3 30

If there is only one row within the group, then I do nothing and add an NA.

Note that these are of course nonexistent functions, so you cannot reproduce the example.

At the end we should get something like this (the new_column values here are made up):

  Group       Species Values new_column
1    G1 Cattus_cattus   Val1 30
2    G1 Cattus_cattus   Val2 30
3    G1 Cattus_cattus   Val3 30
4    G2   Canis_lupus   Val4 21
5    G2   Canis_lupus   Val5 21
6    G3  Griseus_lupa   Val6 NA
7    G4  Griseus_lupa   Val7 NA

Does someone have an idea, please? So far I know how to loop over a dataframe using a for loop, but here I do not know how to deal with groups composed of 2 columns.

Data

structure(list(Group = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 4L
), .Label = c("G1", "G2", "G3", "G4"), class = "factor"), Species = structure(c(2L, 
2L, 2L, 1L, 1L, 3L, 3L), .Label = c("Canis_lupus", "Cattus_cattus", 
"Griseus_lupa"), class = "factor"), Values = structure(1:7, .Label = c("Val1", 
"Val2", "Val3", "Val4", "Val5", "Val6", "Val7"), class = "factor")), class = "data.frame", row.names = c(NA, 
-7L))

Answer 1

You can try something like this:

library(dplyr)
library(purrr)

df %>%
  group_by(Group) %>%
  summarise(treefile = anotherfunction(first(Group)), 
            Values = list(Values)) %>%
  mutate(new_column = map2_dbl(treefile, Values, afunction))

This would give you a summarised dataframe. To get the same number of rows back, you can left_join the result with df by Group.
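The summarise-then-join idea can be sketched end to end as follows. This is only a sketch under assumptions: the anotherfunction() and afunction() bodies below are hypothetical stand-ins (the real ones are not shown in the question), the grouping uses both Group and Species as the question asks, and groups with a single row are skipped and get NA.

```r
library(dplyr)
library(purrr)

# Example data from the question (character columns instead of factors)
df <- data.frame(
  Group   = c("G1", "G1", "G1", "G2", "G2", "G3", "G4"),
  Species = c("Cattus_cattus", "Cattus_cattus", "Cattus_cattus",
              "Canis_lupus", "Canis_lupus", "Griseus_lupa", "Griseus_lupa"),
  Values  = paste0("Val", 1:7),
  stringsAsFactors = FALSE
)

# Hypothetical stand-ins for the functions that are not shown in the question
anotherfunction <- function(group_name) paste0(group_name, ".tree")
afunction <- function(treefile, values) as.numeric(nchar(treefile) + length(values))

# One row per (Group, Species), with that group's Values kept as a list element
per_group <- df %>%
  group_by(Group, Species) %>%
  summarise(Values = list(Values), .groups = "drop") %>%
  mutate(new_column = map2_dbl(Group, Values, function(g, v) {
    if (length(v) < 2) return(NA_real_)  # single-row group: do nothing, keep NA
    afunction(anotherfunction(g), v)
  }))

# Join the per-group result back to recover the original number of rows
result <- left_join(df, select(per_group, -Values), by = c("Group", "Species"))
```

Calling anotherfunction() inside the mapped function, rather than in summarise(), means no treefile is opened for groups that are skipped.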

Answer 2

Here is how you do it:

anotherfunction = function(x){
  #do something with your treefile
  ifelse("Val2" %in% x, 30, ifelse("Val4" %in% x, 21, NA))
}

library(dplyr)

df %>%
  group_by(Group) %>%
  mutate(new_column = anotherfunction(Values))

You did not give a lot of information about anotherfunction(), so I used an ugly nested ifelse() to mimic its behavior.

The key is that mutate() will use only the Values inside each group.
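As a rough illustration of that per-group behavior, the sketch below groups by both Group and Species and returns NA for single-row groups. The two function bodies are hypothetical placeholders, not the asker's real functions, and n_in_group is a helper column added only to make the group sizes visible.

```r
library(dplyr)

# Example data from the question (character columns instead of factors)
df <- data.frame(
  Group   = c("G1", "G1", "G1", "G2", "G2", "G3", "G4"),
  Species = c("Cattus_cattus", "Cattus_cattus", "Cattus_cattus",
              "Canis_lupus", "Canis_lupus", "Griseus_lupa", "Griseus_lupa"),
  Values  = paste0("Val", 1:7),
  stringsAsFactors = FALSE
)

# Hypothetical placeholders for the functions that are not shown in the question
anotherfunction <- function(group_name) paste0(group_name, ".tree")
afunction <- function(treefile, values) as.numeric(nchar(treefile) + length(values))

result <- df %>%
  group_by(Group, Species) %>%
  mutate(
    # n() and Values refer to the current group only
    n_in_group = n(),
    # the scalar result is recycled to every row of the group
    new_column = if (n() > 1) {
      afunction(anotherfunction(first(Group)), Values)
    } else {
      NA_real_
    }
  ) %>%
  ungroup()
```

Because the if/else yields a single value per group, every row of a group receives the same new_column value, which matches the layout requested in the question.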

To explore this, you can try running the code with:

anotherfunction = function(x){browser()}


Answer 1: score 1, accepted, 2021-03-16 11:30:27
Answer 2: score 0, 2021-03-16 10:44:20
