Applying the same function to multiple dataframes in R
I am a new R user and I am having problems with my code. I have 16 different dataframes and I would like to apply the same function to each dataframe. Then, I want to put all the results in a new dataframe. I wrote this code and it works well:
df2012<-as.data.frame(cprop(wtd.table(database2012$year,database2012$nivvie_dec,weights=database2012$wprm),total=FALSE))
df2012$annee<-"2012"
df2011<-as.data.frame(cprop(wtd.table(database2011$year,database2011$nivvie_dec,weights=database2011$wprm),total=FALSE))
df2011$annee<-"2011"
df2010<-as.data.frame(cprop(wtd.table(database2010$year,database2010$nivvie_dec,weights=database2010$wprm),total=FALSE))
df2010$annee<-"2010"
df2009<-as.data.frame(cprop(wtd.table(database2009$year,database2009$nivvie_dec,weights=database2009$wprm),total=FALSE))
df2009$annee<-"2009"
df2008<-as.data.frame(cprop(wtd.table(database2008$year,database2008$nivvie_dec,weights=database2008$wprm),total=FALSE))
df2008$annee<-"2008"
df2007<-as.data.frame(cprop(wtd.table(database2007$year,database2007$nivvie_dec,weights=database2007$wprm),total=FALSE))
df2007$annee<-"2007"
df2006<-as.data.frame(cprop(wtd.table(database2006$year,database2006$nivvie_dec,weights=database2006$wprm),total=FALSE))
df2006$annee<-"2006"
df2005<-as.data.frame(cprop(wtd.table(database2005$year,database2005$nivvie_dec,weights=database2005$wprm),total=FALSE))
df2005$annee<-"2005"
df2004<-as.data.frame(cprop(wtd.table(database2004$year,database2004$nivvie_dec,weights=database2004$wprm),total=FALSE))
df2004$annee<-"2004"
df2003<-as.data.frame(cprop(wtd.table(database2003$year,database2003$nivvie_dec,weights=database2003$wprm),total=FALSE))
df2003$annee<-"2003"
df2002<-as.data.frame(cprop(wtd.table(database2002$year,database2002$nivvie_dec,weights=database2002$wprm),total=FALSE))
df2002$annee<-"2002"
df2001<-as.data.frame(cprop(wtd.table(database2001$year,database2001$nivvie_dec,weights=database2001$wprm),total=FALSE))
df2001$annee<-"2001"
df2000<-as.data.frame(cprop(wtd.table(database2000$year,database2000$nivvie_dec,weights=database2000$wprm),total=FALSE))
df2000$annee<-"2000"
df1999<-as.data.frame(cprop(wtd.table(database1999$year,database1999$nivvie_dec,weights=database1999$wprm),total=FALSE))
df1999$annee<-"1999"
df1998<-as.data.frame(cprop(wtd.table(database1998$year,database1998$nivvie_dec,weights=database1998$wprm),total=FALSE))
df1998$annee<-"1998"
df1997<-as.data.frame(cprop(wtd.table(database1997$year,database1997$nivvie_dec,weights=database1997$wprm),total=FALSE))
df1997$annee<-"1997"
df1996<-as.data.frame(cprop(wtd.table(database1996$year,database1996$nivvie_dec,weights=database1996$wprm),total=FALSE))
df1996$annee<-"1996"
df19962012<-rbind(df1996,df1997,df1998,df1999,df2000,df2001,df2002,df2003,df2004,df2005,df2006,df2007,df2008,df2009,df2010,df2011,df2012)
However, the code is long and I need to replicate it for other variables such as sex, educational level and family structure instead of year... I looked for a shorter version using lapply, but all my attempts failed. Does anyone know a way to shorten the code?
Thank you very much for your help!
Again, see my comment to generate a new example, but the following should get at the core elements of your question and is reproducible. Walk through each portion slowly to understand what's going on. In general, you should strive for DRY code when possible and get in the habit of writing small, simple functions any time you find yourself repeating lines of code:
Make two "fake" data.frames:
df1 <- data.frame(x = 1:10)
df2 <- data.frame(x = 11:20)
A simple "dummy" function h(x), or rather h(df), takes a data.frame and creates a new column y by taking the dataframe's existing x column and adding 10.
h <- function(df) {
  df$y <- df$x + 10
  df
}
Find all the objects whose names match the pattern df-any-number and store those names in dfs:
dfs <- ls(pattern = "df[0-9]")
dfs
Run lapply over dfs, looking the objects up by name (i.e. with mget), and apply the function h to each of them. Finally, rbind the results via do.call.
do.call(rbind, lapply(mget(dfs), h))
# x y
# df1.1 1 11
# df1.2 2 12
# df1.3 3 13
# df1.4 4 14
# df1.5 5 15
# df1.6 6 16
# df1.7 7 17
# df1.8 8 18
# df1.9 9 19
# df1.10 10 20
# df2.1 11 21
# df2.2 12 22
# df2.3 13 23
# df2.4 14 24
# df2.5 15 25
# df2.6 16 26
# df2.7 17 27
# df2.8 18 28
# df2.9 19 29
# df2.10 20 30
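If the data frames can be kept in a named list from the start, the ls/mget lookup is not needed. The following is only a sketch of that variant: df_list, the source column and combined are hypothetical names introduced for illustration, and h is the same dummy function as above.

```r
# Hypothetical stand-ins for the real data frames, stored in a named list
df_list <- list(
  df1 = data.frame(x = 1:10),
  df2 = data.frame(x = 11:20)
)

# Same dummy function as above: add a column y = x + 10
h <- function(df) {
  df$y <- df$x + 10
  df
}

# Apply h to each element and record which data frame each row came from
results <- lapply(names(df_list), function(nm) {
  out <- h(df_list[[nm]])
  out$source <- nm  # hypothetical identifier column
  out
})

# Stack the per-data-frame results into one data frame
combined <- do.call(rbind, results)
```

Carrying the origin in an explicit column tends to be easier to work with later than decoding it from row names such as df1.1.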
Some posts that will be helpful to guide your understanding:
For a list of dataframes:
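# cprop() and wtd.table() are assumed to come from the questionr package
yDF <- function(y) {
  # look up the data frame for year y by name, e.g. database1996
  db <- get(paste0("database", y))
  df <- as.data.frame(cprop(wtd.table(db$year, db$nivvie_dec, weights = db$wprm), total = FALSE))
  df$annee <- y
  df
}
years <- 1996:2012
L <- lapply(years, yDF)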
... normally I am not a friend of get(). You can also use rbind() to build one long dataframe:
DF <- yDF(1996)
for (y in 1997:2012) DF <- rbind(DF, yDF(y))
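The same idea can be sketched without get() and without growing DF inside a loop, by keeping the yearly data in a named list and stacking the results once. Everything below is an assumption made for illustration: databases, year_table and the toy data are hypothetical, and xtabs/prop.table is a base R stand-in for a weighted percentage table, not a drop-in replacement for cprop(wtd.table(...)). The var argument shows one way to reuse the function for other variables such as sex or educational level.

```r
# Hypothetical toy data: one small data frame per year, held in a named list
years <- 1996:1998
databases <- lapply(years, function(y) {
  data.frame(
    year       = y,
    nivvie_dec = rep(1:3, each = 4),
    wprm       = seq(0.5, 2, length.out = 12)
  )
})
names(databases) <- years

# Weighted percentage of each level of `var` for one year
# (base R stand-in; the original answer uses cprop() and wtd.table())
year_table <- function(y, var = "nivvie_dec") {
  db  <- databases[[as.character(y)]]
  # builds the formula wprm ~ <var>, so cell values are sums of weights
  tab <- xtabs(reformulate(var, response = "wprm"), data = db)
  out <- as.data.frame(prop.table(tab) * 100)
  out$annee <- y
  out
}

# Build every yearly table, then bind them in a single step
DF <- do.call(rbind, lapply(years, year_table))
```

Binding once at the end avoids copying the accumulated data frame on every iteration, which matters little for 17 years but adds up for longer loops.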
You can do something like complete_dataframe <- rbind(...) to combine all your data frames together, especially if they have a separate column that identifies each dataframe (here it will be annee). Then you can use either the data.table package or the dplyr package to apply a function over specific groups.
In dplyr, the workflow would be
complete_dataframe %>% group_by(annee) %>% mutate(new_var = somefunction(columns_to_pass_into_function))
to generate new variables, or
complete_dataframe %>% group_by(annee) %>% summarise(new_var = somefunction(columns_to_pass_into_function))
to create a summary table over the groups.
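To make the difference between the two verbs concrete, here is a small sketch on made-up data. It assumes the dplyr package is installed; the income column and the values are hypothetical and only serve to show the shape of the results.

```r
library(dplyr)

# Hypothetical combined data: two years, three rows each
complete_dataframe <- data.frame(
  annee  = rep(c("2011", "2012"), each = 3),
  income = c(10, 20, 30, 40, 50, 60)
)

# mutate(): keeps every row and adds the per-group value to each of them
with_mean <- complete_dataframe %>%
  group_by(annee) %>%
  mutate(mean_income = mean(income)) %>%
  ungroup()

# summarise(): collapses each group to a single row
by_year <- complete_dataframe %>%
  group_by(annee) %>%
  summarise(mean_income = mean(income))
```

Here with_mean still has six rows, while by_year has one row per value of annee.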