Apply a function to chunks of a data frame

Question

I'm a C# programmer who's been asked to do some work in R. I need to figure out how to call a function multiple times, passing in 'chunks' of a data frame: the function should be called once for each distinct combination of values in the first two columns.

Here's what I mean:

Stratum<-c("FPN", "FPN", "FPN", "MPN", "MPN", "MPN")
Cal<-c("ynnn", "ynnn", "yynn", "ynnn", "ynnn", "yynn")
Band.1<-c(1,2,1,1,2,1)
Band.2<-c(2,3,2,2,3,2)
Regroup<-c("No","Yes","No","Yes","No","No")
decs.data<-data.frame(Stratum,Cal,Band.1,Band.2,Regroup,stringsAsFactors=FALSE)

Stratum  Cal Band.1 Band.2 Regroup
    FPN ynnn      1      2      No
    FPN ynnn      2      3     Yes
    FPN yynn      1      2      No
    MPN ynnn      1      2     Yes
    MPN ynnn      2      3      No
    MPN yynn      1      2      No

For the above data I'd call the function four times: once passing it all the rows of decs.data where Stratum="FPN" and Cal="ynnn", then where Stratum="FPN" and Cal="yynn", and so on.

The function won't operate on those rows; it uses them to determine which data file to load from disk and what to do with it.

How would I go about calling a function this way in R? I'm sure 'apply' must be involved but I'm struggling to figure out how.

UPDATE: I don't need all the rows in the data.frame as arguments to the function, just the matching ones (i.e. rows 1 & 2 for the 1st call, 3 for the 2nd, 4 & 5 for the 3rd and 6 for the 4th).

The function will load a data file based on the Stratum & Cal columns (e.g. FPN.ynnn.rdata) then decide how to process it based on the Band.1, Band.2 and Regroup columns.

Essentially, decs.data is not the data I want to manipulate but a decisions matrix defining which bands in which rdata files need to be regrouped.

Answer 1

You are looking for by. If you want to run your function on subsets of decs.data, using Stratum and Cal as the splitting variables, you can do:

by(decs.data, decs.data[c('Stratum','Cal')], myfunction)

where myfunction is your function (the name is a placeholder; function itself is a reserved word in R and cannot be passed as written). Each call receives a data frame containing only the rows for one Stratum/Cal combination.
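To make the by() call concrete, here is a minimal sketch using the decs.data from the question. The handler process_chunk is hypothetical: it only builds the file name that the question describes (e.g. FPN.ynnn.rdata) instead of actually loading anything, so treat it as an illustration of what each call receives rather than the real processing function.

```r
# Rebuild the decisions data frame from the question.
Stratum <- c("FPN", "FPN", "FPN", "MPN", "MPN", "MPN")
Cal     <- c("ynnn", "ynnn", "yynn", "ynnn", "ynnn", "yynn")
Band.1  <- c(1, 2, 1, 1, 2, 1)
Band.2  <- c(2, 3, 2, 2, 3, 2)
Regroup <- c("No", "Yes", "No", "Yes", "No", "No")
decs.data <- data.frame(Stratum, Cal, Band.1, Band.2, Regroup,
                        stringsAsFactors = FALSE)

# Hypothetical handler: 'chunk' is a data frame holding the rows for a
# single Stratum/Cal combination. Here it only derives the file name;
# a real handler might load that file and use Band.1, Band.2 and Regroup.
process_chunk <- function(chunk) {
  paste(chunk$Stratum[1], chunk$Cal[1], "rdata", sep = ".")
}

# One call per distinct Stratum/Cal combination (four calls for this data).
files <- by(decs.data, decs.data[c("Stratum", "Cal")], process_chunk)

# Number of decision rows handed to each call.
counts <- by(decs.data, decs.data[c("Stratum", "Cal")], nrow)
```

Printing files or counts shows one entry per Stratum/Cal pair. If the handler is only needed for its side effects, the returned object can simply be ignored.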

Solution 1
4 2013-12-12 12:07:45
