Add new variable to list of data frames with purrr and mutate() from dplyr

I know that there are many related questions here on SO, but I am looking for a purrr solution, please, not one from the apply family of functions or cbind/rbind (I want to take this opportunity to get to know purrr better).

I have a list of dataframes and I would like to add a new column to each dataframe in the list. The value of the column will be the name of the dataframe, i.e. the name of each element in the list.

There is something similar here, but it involves the use of a function and mutate_each(), whereas I need just mutate().

To give you an idea of the list (called comentarios), here is the first line of str() on the first element:

> str(comentarios[1])
List of 1
 $ 166860353356903_661400323902901:'data.frame':    13 obs. of  7 variables:

So I would like my new variable to contain 166860353356903_661400323902901 in all 13 rows of the result, as an ID for each dataframe.

What I am trying is:

dff <- map_df(comentarios, 
              ~ mutate(ID = names(comentarios)),
              .id = "Group"
              )

However, mutate() needs to be given the dataframe in order to work:

Error in mutate_(.data, .dots = lazyeval::lazy_dots(...)) : 
  argument ".data" is missing, with no default

It doesn't make sense to type in each name; I'd be straying into loop territory and losing the advantages of purrr (and R, more generally). If the list were smaller, I'd use reshape::merge_all(), but it has over 2000 elements. Thanks in advance for any help.

Edit: some data to make the problem reproducible, as per alistaire's comments:

# install.packages("tidyverse")
library(tidyverse)
df <- data_frame(one = rep("hey", 10), two = seq(1:10), etc = "etc")

list_df <- list(df, df, df, df, df)
names(list_df) <- c("first", "second", "third", "fourth", "fifth")
dfs <- map_df(list_df, 
              ~ mutate(id = names(list_df)),
              .id = "Group"
              )

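Your issue is that mutate() needs the data passed explicitly as its first argument when you're not using it in a pipe. Inside the formula, the current list element is available as .x. Note also that names(comentarios) is the whole vector of names, not the name of the current element. To pair each dataframe with its own name, I'd suggest iterating over the list and its names in parallel with map2_df: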

dff <- map2_df(comentarios, names(comentarios), ~ mutate(.x, ID = .y)) 
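If the aim is to keep a list of dataframes rather than one combined dataframe, purrr's imap() may be a closer fit: it passes each element as .x and that element's name as .y. A minimal sketch, assuming a small made-up list (toy_list and its contents are hypothetical, not the OP's data):

```r
library(purrr)
library(dplyr)

# Hypothetical stand-in for a named list of data frames
toy_list <- list(
  first  = data.frame(x = 1:3),
  second = data.frame(x = 4:5)
)

# imap() calls the formula with .x = the element and .y = its name
with_id <- imap(toy_list, ~ mutate(.x, ID = .y))

# The result is still a named list; each data frame now has an ID column
str(with_id)
```

Swapping imap() for map2_df() as above row-binds the pieces instead of returning a list.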

Using the OP's data, the answer would be:

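library(tidyverse)

# Example data: five copies of the same 10-row tibble
# (tibble() replaces the deprecated data_frame())
df <- tibble(one = rep("hey", 10), two = 1:10, etc = "etc")

list_df <- list(df, df, df, df, df)
dfnames <- c("first", "second", "third", "fourth", "fifth")

# .x is each data frame, .y the matching name; the results are row-bound
dfs <- list_df %>% map2_df(dfnames, ~ mutate(.x, name = .y))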
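When a single combined dataframe is the end goal and the list is named, the names can also be attached at bind time: dplyr's bind_rows() takes an .id argument that stores each element's name in a new column. A sketch on hypothetical toy data (toy_list is made up for illustration):

```r
library(dplyr)

# Hypothetical named list of data frames
toy_list <- list(
  first  = data.frame(one = "hey", two = 1:2),
  second = data.frame(one = "hey", two = 3:4)
)

# .id = "name" adds a column holding the list element names
combined <- bind_rows(toy_list, .id = "name")
combined
```

This skips mutate() entirely, so it only helps when the identifier is just the list name.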
