What is the best way to apply a function to a list of lists in R? Specifically, when the internal variables all have the same names
Hi, I am fairly new to R and would appreciate any help on this one.
I have searched for similar questions, but unfortunately I don't really understand the solutions given.
My problem:
I have about 60 Excel sheets full of repeated testing data that I want to analyse and compare. These all have a similar structure and the same variable/column names, but the number of data points is different for each one. I have loaded these into R as a list of lists, and I want to perform a series of manipulations on each original data set once it is inside the list of lists. These manipulations would be identical, using the same variable names etc., but applied to different data sets.
As an example, say I wanted to calculate something based on the data and then add the result as a new variable inside the nested list.
A simplified version of my situation would be something like this.
###set up###
specimen1=list("Stress"=50:100,
               "Strain"=5:55) #represents my excel sheet imports
specimen2=list("Stress"=65:115,
               "Strain"=6.5:56.5) #simplified for brevity
specimen3=list("Stress"=42:92,"Strain"=4.2:54.2)
rate1=list(specimen1,specimen2,specimen3) #my list of lists
names(rate1)<-c("specimen 1","specimen 2","specimen 3") #set the names
####performing calculation and adding to the list entry###
#now I want to perform a calculation on each specimen and then add the result to that specimen
#I suspect the solution lies with the lapply family, something like this?
example_function<-function(Stress,Strain){
  E=Stress/Strain #performs calculation
  #but doesn't add the result to the list?
  rate1$specimen$E=E #something like this to add to the original data set?
  #but I don't understand how to change the indexing without using a for loop
}
lapply(rate1,example_function)
######### #########
What is the best way to apply a function to each element of a list of lists so that it adds a new variable to each of those list components?
I suspect that the solution to this will be simple?
If you are not tied to doing this in lists, you can bind all your lists together and do it in the resulting data.frame format using dplyr:
library(dplyr)
bind_rows(rate1, .id="specimen") %>%
mutate(E = Stress/Strain)
which produces
# A tibble: 153 x 4
  specimen   Stress Strain     E
  <chr>       <int>  <dbl> <dbl>
1 specimen 1     50      5 10
2 specimen 1     51      6  8.5
3 specimen 1     52      7  7.43
4 specimen 1     53      8  6.62
...
Using data.frames is usually the most straightforward way of doing things in R.
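If you would rather not depend on dplyr, the same idea can be sketched in base R. This is only an illustration under the assumption that every specimen holds equal-length Stress and Strain vectors, as in the question's example; the name all_specimens is made up here.

```r
# Example data rebuilt from the question so the snippet runs on its own
rate1 <- list(
  "specimen 1" = list(Stress = 50:100, Strain = 5:55),
  "specimen 2" = list(Stress = 65:115, Strain = 6.5:56.5),
  "specimen 3" = list(Stress = 42:92,  Strain = 4.2:54.2)
)

# Turn each specimen into a data.frame tagged with its name, then stack them
all_specimens <- do.call(
  rbind,
  lapply(names(rate1), function(nm) {
    data.frame(specimen = nm, as.data.frame(rate1[[nm]]))
  })
)

# One vectorised calculation covers every specimen at once
all_specimens$E <- all_specimens$Stress / all_specimens$Strain

head(all_specimens)
```

If a per-specimen list is needed again later, split(all_specimens, all_specimens$specimen) gives one data.frame per specimen.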
If you want to keep using lists, then because you have to add a new element to each one, it is possibly better to use a for loop instead of lapply. This is how you solve your particular issue using a loop.
# this will add the element E to each specimen in the list rate1
for (i in seq_along(rate1)) {
  rate1[[i]]$E <- rate1[[i]]$Stress / rate1[[i]]$Strain
}
This is the version with lapply; you can keep adding columns inside the list call of the function(li).
modified_rate1 <-
  lapply(rate1, function(li)
    list(
      Stress = li$Stress,
      Strain = li$Strain,
      E = li$Stress/li$Strain
    )
  )
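A possible variant, in case each specimen has more elements than you want to retype: modify the element inside the function and return it, so everything already in the specimen is carried over unchanged. This is a sketch on the question's example data, and add_E is a hypothetical helper name, not something from the original code.

```r
# Example data rebuilt from the question so the snippet runs on its own
rate1 <- list(
  "specimen 1" = list(Stress = 50:100, Strain = 5:55),
  "specimen 2" = list(Stress = 65:115, Strain = 6.5:56.5),
  "specimen 3" = list(Stress = 42:92,  Strain = 4.2:54.2)
)

# Hypothetical helper: takes ONE specimen, adds E, returns the whole specimen
add_E <- function(li) {
  li$E <- li$Stress / li$Strain
  li  # the modified copy must be returned, otherwise the change is lost
}

# lapply calls add_E once per specimen and keeps the list names
rate1 <- lapply(rate1, add_E)

str(rate1[["specimen 1"]])
```

The function receives each inner list as a single argument, which is why the attempt in the question, with separate Stress and Strain arguments, does not work with lapply. Assigning the result back to rate1 is also needed, because R functions work on copies rather than changing the original list in place.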
I think doing this in a data.frame is the way to go, but you should see what works better for your many other purposes.