How to create a tibble with repeated and unrepeated measures of several variables in R?

I would like to create a tibble with a mix of repeated and unrepeated measures of variables, along with the dates when they were measured:

  • 3 variables (var1, var2, var3) that were measured 16 times during April, May and June at irregular intervals
  • 4 variables (var4, var5, var6, var7) that were measured once in July
  • 2 variables (var8, var9) that were also measured once in July

To create the tibble, I could write out vectors with every combination of variable, date and measurement, but I am wondering if there is a more efficient way to do this, since 3 variables were each measured 16 times. I've written this chunk of code with variables, dates and measurements to start with, but I'm stuck there. Any suggestions?

library (tidyverse)
variables <- c(var1, var2, var3, var4, var5, var6, var7, var8, var9)
mydates <- c(2013-04-15,
             2013-04-16,
             2013-04-17,
             2013-04-22,
             2013-04-25,
             2013-04-29,
             2013-05-02,
             2013-05-06,
             2013-05-09,
             2013-05-13,
             2013-05-16,
             2013-05-20,
             2013-05-23,
             2013-05-27,
             2013-05-30,
             2013-06-03,
             2013-07-04,  
             2013-07-08)
measurements <- c(3.2, 4.6, 1.1, 3.0, 3.6, 1.6, 1.4, 1.4, 4.8, 3.5, 4.0, 
2.7, 1.4, 2.9, 2.4, 3.6, 3.7, 4.3, 3.6, 3.5, 4.7, 1.8, 3.5, 2.4, 2.1, 1.2,
2.3, 3.9, 1.6, 2.8, 5.0, 2.4, 2.2, 2.9, 1.8, 1.7, 4.4, 3.9, 4.4, 2.6, 1.7, 
4.2, 3.4, 4.4, 4.7, 5.0, 3.0, 3.7, 2.1, 2.9, 4.5, 1.5, 2.2, 2.9)

tibble (variables, mydates, measurements)

I would like a tibble that looks like this, with my first 3 variables each repeated 16 times, my first 16 dates each repeated 3 times, and the measurements:

variables   mydates     measurements
var1        2013-04-15  3.2
var2        2013-04-15  4.6
var3        2013-04-15  1.1
var1        2013-04-16  3.0
var2        2013-04-16  3.6
var3        2013-04-16  1.6
var1        2013-04-17  1.4
var2        2013-04-17  1.4
var3        2013-04-17  4.8
...         ...         ...  # measurements for var1, var2, var3 were repeatedly taken during the 16 first dates in the vector mydates.
var4        2013-07-04  2.1
var5        2013-07-04  2.9
var6        2013-07-04  4.5
var7        2013-07-04  1.5
var8        2013-07-08  2.2
var9        2013-07-08  2.9

Here is a (somewhat 'dirty') alternative using base::expand.grid and lubridate. I converted your mydates vector into a Date object.

Once you have all combinations of variables and mydates, you can attach measurements and convert the result into a tibble using as_data_frame (deprecated in current tibble releases in favor of as_tibble, which behaves the same here).
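As a small illustration of why the measurements line up, here is a toy sketch of how expand.grid orders its rows. The values "a", "b", "c" and the name grid are made up for this example and are not part of the question's data. The first argument varies fastest, so each date is paired with every variable before moving on to the next date, which matches the order of the measurements vector.

```r
# Toy example: 3 hypothetical variables crossed with 2 dates
grid <- expand.grid(vars = c("a", "b", "c"),
                    dates = as.Date(c("2013-04-15", "2013-04-16")),
                    stringsAsFactors = FALSE)

# Rows come out as a, b, c for 2013-04-15, then a, b, c for 2013-04-16
print(grid)
```

The full code for the question's data relies on the same ordering.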

library(tidyverse)
library(lubridate)

variables <- c("var1", "var2", "var3", "var4", "var5", "var6", "var7", "var8", "var9")

mydates <- c("2013-04-15",
             "2013-04-16",
             "2013-04-17",
             "2013-04-22",
             "2013-04-25",
             "2013-04-29",
             "2013-05-02",
             "2013-05-06",
             "2013-05-09",
             "2013-05-13",
             "2013-05-16",
             "2013-05-20",
             "2013-05-23",
             "2013-05-27",
             "2013-05-30",
             "2013-06-03",
             "2013-07-04",  
             "2013-07-08") %>% 
  as_date()

measurements <- c(3.2, 4.6, 1.1, 3.0, 3.6, 1.6, 1.4, 1.4, 4.8, 3.5, 4.0, 
                  2.7, 1.4, 2.9, 2.4, 3.6, 3.7, 4.3, 3.6, 3.5, 4.7, 1.8, 3.5, 2.4, 2.1, 1.2,
                  2.3, 3.9, 1.6, 2.8, 5.0, 2.4, 2.2, 2.9, 1.8, 1.7, 4.4, 3.9, 4.4, 2.6, 1.7, 
                  4.2, 3.4, 4.4, 4.7, 5.0, 3.0, 3.7, 2.1, 2.9, 4.5, 1.5, 2.2, 2.9)



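# Build each block of variable/date combinations, stack them, then attach the measurements
mydata <- expand.grid(vars = variables[1:3],    # repeated measures, April to June
                      dates = mydates[month(mydates) < 7]) %>% 
  rbind(expand.grid(vars = variables[4:7],      # measured once, on 4 July
                    dates = mydates[month(mydates) == 7 & day(mydates) == 4])) %>% 
  rbind(expand.grid(vars = variables[8:9],      # measured once, on 8 July
                    dates = mydates[month(mydates) == 7 & day(mydates) == 8])) %>% 
  mutate(measures = measurements) %>% 
  as_tibble()  # as_data_frame() is deprecated in favor of as_tibble()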

And the output would be:

mydata

## A tibble: 54 x 3
#   vars  dates      measures
#   <fct> <date>        <dbl>
# 1 var1  2013-04-15      3.2
# 2 var2  2013-04-15      4.6
# 3 var3  2013-04-15      1.1
# 4 var1  2013-04-16      3  
# 5 var2  2013-04-16      3.6
# 6 var3  2013-04-16      1.6
# 7 var1  2013-04-17      1.4
# 8 var2  2013-04-17      1.4
# 9 var3  2013-04-17      4.8
#10 var1  2013-04-22      3.5
## ... with 44 more rows
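If you would rather avoid expand.grid, the same table can probably be built directly with rep(). This is only a sketch under the assumption that the measurements are ordered as in the question (var1, var2, var3 for each date in turn, then the July values). The names repeated_dates, july_dates and mydata2 are mine, not from the question.

```r
library(tibble)

# The 16 dates of the repeated measurements, copied from the question
repeated_dates <- as.Date(c(
  "2013-04-15", "2013-04-16", "2013-04-17", "2013-04-22",
  "2013-04-25", "2013-04-29", "2013-05-02", "2013-05-06",
  "2013-05-09", "2013-05-13", "2013-05-16", "2013-05-20",
  "2013-05-23", "2013-05-27", "2013-05-30", "2013-06-03"
))
# The two July dates: var4 to var7 on the first, var8 and var9 on the second
july_dates <- as.Date(c("2013-07-04", "2013-07-08"))

measurements <- c(3.2, 4.6, 1.1, 3.0, 3.6, 1.6, 1.4, 1.4, 4.8, 3.5, 4.0,
                  2.7, 1.4, 2.9, 2.4, 3.6, 3.7, 4.3, 3.6, 3.5, 4.7, 1.8, 3.5, 2.4, 2.1, 1.2,
                  2.3, 3.9, 1.6, 2.8, 5.0, 2.4, 2.2, 2.9, 1.8, 1.7, 4.4, 3.9, 4.4, 2.6, 1.7,
                  4.2, 3.4, 4.4, 4.7, 5.0, 3.0, 3.7, 2.1, 2.9, 4.5, 1.5, 2.2, 2.9)

mydata2 <- tibble(
  # var1..var3 cycle once per repeated date, then var4..var9 appear once each
  variables = c(rep(paste0("var", 1:3), times = length(repeated_dates)),
                paste0("var", 4:9)),
  # each repeated date covers 3 rows; the July dates cover 4 and 2 rows
  mydates = c(rep(repeated_dates, each = 3),
              rep(july_dates, times = c(4, 2))),
  measurements = measurements
)
```

Here variables stays a character column rather than a factor, and the column names follow the layout requested in the question.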
