
Align Disaggregated Data in R

I have an Excel data set that I would like to load into R. It has two variables, "weight" and "height", and each variable has its own date column recording when the value was measured. The height variable skips some dates, and so does the weight variable further down in the data. I am trying to create a consolidated data set in which weight and height are combined and aligned by date, with NA placed wherever a value isn't present. Are there any commands or functions that can help me do this? Thank you!

 obs     date   weight     date    height
  1   2010-10-04 52495  2010-10-04 11.6  
  2   2010-10-01 53000  2010-10-01 15.3
  3   2010-09-30 52916  2010-09-30 14.3
  4   2010-09-29 52785  2010-09-29 11.3
  5   2010-09-28 53348  2010-09-28 18.2
  6   2010-09-27 52885  2010-09-24 11.7
  7   2010-09-24 52174  2010-09-23 15.0
  8   2010-09-23 51461  2010-09-22 18.6
  9   2010-09-22 51286  2010-09-20 17.9
  10  2010-09-21 50968  
  11  2010-09-20 49250  

I'm assuming this question isn't about reading the data into R, but about processing it after it has been read. Nevertheless, reading the data with the arguments check.names = FALSE and fill = TRUE sets it up so that Reduce() can merge it: check.names = FALSE keeps both date columns named date, so merge() can join on that shared name, and fill = TRUE pads the rows that have no height entry.

First, simulate reading the data in.

temp <- read.table(header = TRUE, 
text = "obs date weight date height
1   2010-10-04 52495  2010-10-04 11.6
2   2010-10-01 53000  2010-10-01 15.3
3   2010-09-30 52916  2010-09-30 14.3
4   2010-09-29 52785  2010-09-29 11.3
5   2010-09-28 53348  2010-09-28 18.2
6   2010-09-27 52885  2010-09-24 11.7
7   2010-09-24 52174  2010-09-23 15.0
8   2010-09-23 51461  2010-09-22 18.6
9   2010-09-22 51286  2010-09-20 17.9
10  2010-09-21 50968
11  2010-09-20 49250
", fill = TRUE, check.names = FALSE)
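To see what the two arguments contribute, here is a small sketch using two rows adapted from the question's data (the object names txt, a and b are only for illustration). It compares the default column names with the check.names = FALSE names and shows how fill = TRUE pads the short row.

```r
# Two-row toy input: the second row stops after the weight column
txt <- "obs date weight date height
1 2010-09-22 51286 2010-09-20 17.9
2 2010-09-21 50968"

# Default (check.names = TRUE): the duplicate name is made unique ("date.1")
a <- read.table(text = txt, header = TRUE, fill = TRUE)
names(a)

# check.names = FALSE: both date columns keep the name "date"
b <- read.table(text = txt, header = TRUE, fill = TRUE, check.names = FALSE)
names(b)

# fill = TRUE padded the short row: blank second date, NA height
b[2, 4:5]
```

Without fill = TRUE, read.table() would stop with an error on the short row instead of padding it.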

Second, use Reduce() and merge().

Reduce(function(x, y) merge(x, y, all.x = TRUE), 
       list(temp[2:3], temp[4:5]))
#          date weight height
# 1  2010-09-20  49250   17.9
# 2  2010-09-21  50968     NA
# 3  2010-09-22  51286   18.6
# 4  2010-09-23  51461   15.0
# 5  2010-09-24  52174   11.7
# 6  2010-09-27  52885     NA
# 7  2010-09-28  53348   18.2
# 8  2010-09-29  52785   11.3
# 9  2010-09-30  52916   14.3
# 10 2010-10-01  53000   15.3
# 11 2010-10-04  52495   11.6
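One caveat with all.x = TRUE is that it keeps every weight date but would drop a height date that has no matching weight row, which the question suggests can happen further down the data. The sketch below uses a few toy rows (values borrowed from the question, with the unmatched height date as an assumed scenario) to compare the left join with a full join done after dropping the padded filler row.

```r
# Toy inputs: heights has a date with no weight, plus one padded filler row
weights <- data.frame(date = c("2010-09-22", "2010-09-21"),
                      weight = c(51286, 50968))
heights <- data.frame(date = c("2010-09-21", "2010-09-20", ""),
                      height = c(18.6, 17.9, NA))

# Left join: keeps both weight dates, silently drops 2010-09-20
left <- merge(weights, heights, all.x = TRUE)

# Drop the filler row first, then do a full outer join on date
heights_clean <- heights[!is.na(heights$height), ]
full <- merge(weights, heights_clean, all = TRUE)

left
full
```

If the weight dates are known to cover every height date, as in the sample shown in the question, the two joins give the same result.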
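
# Read the raw rows without a header; fill = TRUE pads the short rows
d <- read.table(header = FALSE, fill = TRUE, text = "1   2010-10-04 52495  2010-10-04 11.6
  2   2010-10-01 53000  2010-10-01 15.3
  3   2010-09-30 52916  2010-09-30 14.3
  4   2010-09-29 52785  2010-09-29 11.3
  5   2010-09-28 53348  2010-09-28 18.2
  6   2010-09-27 52885  2010-09-24 11.7
  7   2010-09-24 52174  2010-09-23 15.0
  8   2010-09-23 51461  2010-09-22 18.6
  9   2010-09-22 51286  2010-09-20 17.9
  10  2010-09-21 50968
  11  2010-09-20 49250")

# Columns 2-3 hold the weight series, columns 4-5 the height series
d1 <- d[2:3]
# Drop the padded rows (NA height) before taking the height columns
d2 <- d[!is.na(d[, 5]), ][4:5]

# Give both pieces a common key column name
names(d1) <- c('Date', 'val1')
names(d2) <- c('Date', 'val2')
# Full outer join on Date; unmatched dates get NA
m <- merge(d1, d2, by = 'Date', all = TRUE)

m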
##          Date  val1 val2
## 1  2010-09-20 49250 17.9
## 2  2010-09-21 50968   NA
## 3  2010-09-22 51286 18.6
## 4  2010-09-23 51461 15.0
## 5  2010-09-24 52174 11.7
## 6  2010-09-27 52885   NA
## 7  2010-09-28 53348 18.2
## 8  2010-09-29 52785 11.3
## 9  2010-09-30 52916 14.3
## 10 2010-10-01 53000 15.3
## 11 2010-10-04 52495 11.6
