R: how do I programmatically loop through a split data frame

I have a data set where data for a series of lots is stored sequentially down a column, and multiple parameters are given for each lot (also sequentially). The file looks something like this:

LotID,Param,Nominal,Value
R0001,Len,1.2500,1.234
R0001,Dia,2.0000,1.979
R0002,Len,1.2500,1.252
R0002,Dia,2.0000,2.010

I'm able to drill down to the data I need by importing it into a data frame, splitting the data frame by LotID, and then splitting again by Param, which is great. Here is the code I am using for that:

myCapFull <- read.csv("capabilityFull.csv")
myCapSplit <- split(myCapFull, myCapFull$LotID)
myR0001 <- split(myCapSplit$R0001,myCapSplit$R0001$Param)
myR0001$Dia$Value # Returns 1.979

But what I want to do is use iter to iterate over each parameter of each lot, and I can't find a way to do that programmatically. I know how to write the code if I know all of the names in the LotID field, but that doesn't help inside a for/next loop. I have a feeling that I'm just missing one very simple command, and I've spent a lot of time searching but haven't found the answer. I'm new to R and this is really my first real-world application of it, so any help would be much appreciated.

If you don't know the values in LotID, you can access the data frames in your list with numeric indices:

> myCapSplit[[1]]
  LotID Param Nominal Value
1 R0001   Len    1.25 1.234
2 R0001   Dia    2.00 1.979
> 
> myCapSplit[[2]]
  LotID Param Nominal Value
3 R0002   Len    1.25 1.252
4 R0002   Dia    2.00 2.010
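
Building on the indexing idea above, here is a minimal sketch of looping over every lot and every parameter without knowing the names in advance. It uses names() on the split list instead of numeric indices, which should work the same way. The sample data is rebuilt inline so the snippet runs without capabilityFull.csv, and results and byParam are hypothetical names introduced only for illustration.

```r
# Rebuild the sample data inline so the sketch runs without the CSV file.
myCapFull <- read.csv(text = "LotID,Param,Nominal,Value
R0001,Len,1.2500,1.234
R0001,Dia,2.0000,1.979
R0002,Len,1.2500,1.252
R0002,Dia,2.0000,2.010", stringsAsFactors = FALSE)

# One data frame per lot, named after the lot.
myCapSplit <- split(myCapFull, myCapFull$LotID)

# results is a hypothetical container used only to collect the values.
results <- list()

for (lot in names(myCapSplit)) {
  # Split this lot's rows again, one data frame per parameter.
  byParam <- split(myCapSplit[[lot]], myCapSplit[[lot]]$Param)
  for (param in names(byParam)) {
    value <- byParam[[param]]$Value
    results[[paste(lot, param, sep = ".")]] <- value
    cat(lot, param, value, "\n")
  }
}
```

The key point is that [[ ]] accepts a variable holding a name, whereas $ needs the name written out literally, which is why myCapSplit$R0001 cannot be used inside a loop.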

Maybe you're looking for subset?

subset(myCapFull, Param=="Dia" & LotID == "R0001")
#   LotID Param Nominal Value
# 2 R0001   Dia       2 1.979    
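
The same subset call can probably be driven from a loop by replacing the literal strings with variables. The sketch below assumes the sample data from the question, rebuilt inline. The names combos, currentLot, and currentParam are hypothetical, chosen so they do not clash with any column name, since subset looks up names in the data frame's columns first.

```r
# Rebuild the sample data inline so the sketch runs without the CSV file.
myCapFull <- read.csv(text = "LotID,Param,Nominal,Value
R0001,Len,1.2500,1.234
R0001,Dia,2.0000,1.979
R0002,Len,1.2500,1.252
R0002,Dia,2.0000,2.010", stringsAsFactors = FALSE)

# One row per distinct lot/parameter pair that actually occurs in the data.
combos <- unique(myCapFull[, c("LotID", "Param")])

for (i in seq_len(nrow(combos))) {
  currentLot   <- combos$LotID[i]
  currentParam <- combos$Param[i]
  # Same filter as the literal example, with variables in place of strings.
  rows <- subset(myCapFull, LotID == currentLot & Param == currentParam)
  cat(currentLot, currentParam, rows$Value, "\n")
}
```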

Alternatively, you can look into the documentation for [.data.frame for more info on how to subset, or into the data.table, dplyr, or plyr packages for manipulation of data frames by groups (i.e. split-apply-combine analysis). For example, here we find the mean of each parameter across all lots with data.table:

library(data.table)
DT <- data.table(myCapFull)
DT[, mean(Value), by=Param]
#    Param     V1
# 1:   Len 1.2430
# 2:   Dia 1.9945
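
If installing an extra package is not convenient, the same per-parameter mean can likely be computed in base R with tapply. This is a swapped-in alternative to the data.table call above, not part of the original answer, and paramMeans is a hypothetical name. The sample data is rebuilt inline so the snippet is self-contained.

```r
# Rebuild the sample data inline so the sketch runs without the CSV file.
myCapFull <- read.csv(text = "LotID,Param,Nominal,Value
R0001,Len,1.2500,1.234
R0001,Dia,2.0000,1.979
R0002,Len,1.2500,1.252
R0002,Dia,2.0000,2.010", stringsAsFactors = FALSE)

# Apply mean() to Value within each group of Param; the result is a
# named vector with one entry per parameter.
paramMeans <- tapply(myCapFull$Value, myCapFull$Param, mean)

# The names of the result can be looped over like any other names.
for (param in names(paramMeans)) {
  cat(param, paramMeans[[param]], "\n")
}
```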
