How to loop over the columns in a dataframe, apply spread, and create a new dataframe in R?
I have a dataframe which looks like this example, just much larger:
Name date var1 var2 var3
Peter 2020-03-30 0.4 0.5 0.2
Ben 2020-10-14 0.6 0.4 0.1
Mary 2020-12-06 0.7 0.2 0.9
I want to create a new dataframe for each variable (i.e., var1, var2, var3), which should look like this, e.g., for var1:
date Peter Ben Mary
2020-03-30 0.4 NA NA
2020-10-14 NA 0.6 NA
2020-12-06 NA NA 0.7
I can do it with spread for one variable at a time:
df_new <- tidyr::spread(df[, c("date", "Name", "var1")], Name, var1)
But I could not figure out how to loop it over all columns, as I am new to R.
Thank you!
First we want to create a list of data frames and then pivot each one:
library(tidyverse)
res_list = dat %>%
  pivot_longer(cols = contains("var")) %>%
  split(., .$name) %>%
  map(. %>% pivot_wider(names_from = "Name"))
$var1
# A tibble: 3 × 5
date name Peter Ben Mary
<date> <chr> <dbl> <dbl> <dbl>
1 2020-03-30 var1 0.4 NA NA
2 2020-10-14 var1 NA 0.6 NA
3 2020-12-06 var1 NA NA 0.7
$var2
# A tibble: 3 × 5
date name Peter Ben Mary
<date> <chr> <dbl> <dbl> <dbl>
1 2020-03-30 var2 0.5 NA NA
2 2020-10-14 var2 NA 0.4 NA
3 2020-12-06 var2 NA NA 0.2
$var3
# A tibble: 3 × 5
date name Peter Ben Mary
<date> <chr> <dbl> <dbl> <dbl>
1 2020-03-30 var3 0.2 NA NA
2 2020-10-14 var3 NA 0.1 NA
3 2020-12-06 var3 NA NA 0.9
Then you can access them like this:
res_list[["var1"]]
# A tibble: 3 × 5
date name Peter Ben Mary
<date> <chr> <dbl> <dbl> <dbl>
1 2020-03-30 var1 0.4 NA NA
2 2020-10-14 var1 NA 0.6 NA
3 2020-12-06 var1 NA NA 0.7
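The tibbles above still carry the helper column name, which the layout requested in the question does not have. A minimal sketch of one way to loop over the variable columns directly and get that layout, assuming the identifier columns are called Name and date as in the example (the data frame is rebuilt here only so the snippet runs on its own):

```r
library(tidyr)

# Example data copied from the question.
dat <- data.frame(
  Name = c("Peter", "Ben", "Mary"),
  date = as.Date(c("2020-03-30", "2020-10-14", "2020-12-06")),
  var1 = c(0.4, 0.6, 0.7),
  var2 = c(0.5, 0.4, 0.2),
  var3 = c(0.2, 0.1, 0.9)
)

# Assumption: every column other than Name and date is a variable to spread.
var_cols <- setdiff(names(dat), c("Name", "date"))

# For each variable, keep date, Name and that variable, then turn the
# values of Name into columns. The list is named after the variables.
res_list <- lapply(
  setNames(var_cols, var_cols),
  function(v) {
    pivot_wider(dat[, c("date", "Name", v)],
                names_from = "Name", values_from = all_of(v))
  }
)

res_list$var1
```

Because each subset contains only date, Name and one variable, no extra column needs to be dropped afterwards.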
We can do it this way: the beginning is similar to user438383's solution, but then we name each tibble in the list and save them to the global environment within the pipe. For this we need massign from the collapse package (thanks to @akrun, see "How to save each named tibble in a list, as a separate tibble or dataframe in one run"):
library(tidyverse)
library(collapse)
df %>%
  pivot_longer(cols = contains("var")) %>%
  group_split(name) %>%
  setNames(unique(df$Name)) %>%
  map(. %>% pivot_wider(names_from = Name)) %>%
  map(. %>% select(-name)) %>%
  massign(names(.), ., .GlobalEnv)
> Ben
# A tibble: 3 x 4
date Peter Ben Mary
<chr> <dbl> <dbl> <dbl>
1 2020-03-30 0.5 NA NA
2 2020-10-14 NA 0.4 NA
3 2020-12-06 NA NA 0.2
> Mary
# A tibble: 3 x 4
date Peter Ben Mary
<chr> <dbl> <dbl> <dbl>
1 2020-03-30 0.2 NA NA
2 2020-10-14 NA 0.1 NA
3 2020-12-06 NA NA 0.9
> Peter
# A tibble: 3 x 4
date Peter Ben Mary
<chr> <dbl> <dbl> <dbl>
1 2020-03-30 0.4 NA NA
2 2020-10-14 NA 0.6 NA
3 2020-12-06 NA NA 0.7
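Note that setNames(unique(df$Name)) labels the per-variable tibbles with the person names (Peter holds var1, Ben holds var2, Mary holds var3), which only lines up because the example has three people and three variables. A minimal sketch of a variant that names each tibble after its variable and uses base R's list2env in place of massign, assuming the same column names as the example; out_env is a hypothetical environment used here so the snippet does not touch the global environment:

```r
library(tidyr)

# Example data copied from the question (date kept as character,
# as in the output above).
df <- data.frame(
  Name = c("Peter", "Ben", "Mary"),
  date = c("2020-03-30", "2020-10-14", "2020-12-06"),
  var1 = c(0.4, 0.6, 0.7),
  var2 = c(0.5, 0.4, 0.2),
  var3 = c(0.2, 0.1, 0.9)
)

# Long format: pivot_longer() adds the columns `name` and `value`.
long <- pivot_longer(df, cols = contains("var"))

# split() names each piece after its variable (var1, var2, var3).
by_var <- split(long[, c("date", "Name", "value")], long$name)
by_var <- lapply(
  by_var,
  function(d) pivot_wider(d, names_from = "Name", values_from = "value")
)

# Hypothetical target environment; pass .GlobalEnv instead to create
# var1, var2 and var3 in the global environment as the answer does.
out_env <- new.env()
list2env(by_var, envir = out_env)

ls(out_env)
```

Naming by variable keeps working when the number of people differs from the number of variables, which is likely for the larger data described in the question.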