R: How to combine rows of a data frame that share the same id and keep the newest non-NA value?

Question

Example data frame

date       name     speed  acceleration
1/1/17     bob      5      NA
1/1/15     george   5      NA
1/1/15     bob      NA     4
1/1/17     bob      4      NA

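I want to collapse all rows with the same name into a single row, keeping the newest non-NA value for the speed and acceleration columns.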

Desired output

date       name     speed  acceleration
1/1/17     bob      5      4
1/1/15     george   5      NA

Answer 1

You can do it like this:

library(dplyr)
library(lubridate)

input = read.table(text = 
 "date       name     speed  acceleration
  1/1/17     bob      5      NA
  1/1/15     george   5      NA
  1/1/15     bob      NA     4
  1/1/17     bob      4      NA",
  header = TRUE, stringsAsFactors = FALSE)

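output <- input %>%
  mutate(date = mdy(date)) %>% # or maybe dmy, depending on your date format
  group_by(name) %>%
  arrange(desc(date)) %>%      # newest rows first
  # first non-NA value of every column within each name;
  # across() needs dplyr >= 1.0, older versions: summarise_all(funs(na.omit(.)[1]))
  summarise(across(everything(), ~ na.omit(.x)[1]))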

output
# # A tibble: 2 × 4
#     name       date speed acceleration
#    <chr>     <date> <int>        <int>
# 1    bob 2017-01-01     5            4
# 2 george 2015-01-01     5           NA

Answer 2

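Here is an option using data.table. Convert the 'data.frame' to a 'data.table' (setDT(input)), order by 'date' in descending order after converting it to Date class, group by 'name', then loop over the columns (.SD) and take the first non-NA element of each.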

library(data.table)
library(lubridate)
setDT(input)[order(-mdy(date)), lapply(.SD, function(x) x[!is.na(x)][1]), name]
#     name   date speed acceleration
#1:    bob 1/1/17     5            4
#2: george 1/1/15     5           NA
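
For comparison, the same idea (sort newest first, then take the first non-NA value per column within each name) can be sketched in base R without extra packages. This is only a minimal sketch, not part of either answer: the helper name first_non_na is hypothetical, and the dates are assumed to be month/day/two-digit-year.

```r
# Hypothetical helper (not from the original answers): first non-NA element
# of a vector, or NA when every element is missing.
first_non_na <- function(x) x[!is.na(x)][1]

input <- read.table(text =
 "date       name     speed  acceleration
  1/1/17     bob      5      NA
  1/1/15     george   5      NA
  1/1/15     bob      NA     4
  1/1/17     bob      4      NA",
  header = TRUE, stringsAsFactors = FALSE)

# Assumption: dates are month/day/two-digit year.
input$date <- as.Date(input$date, format = "%m/%d/%y")

# Sort newest first so the first non-NA value in each column is the latest one.
sorted <- input[order(input$date, decreasing = TRUE), ]

# Split by name, collapse each group to one row, and stack the rows again.
result <- do.call(rbind, lapply(split(sorted, sorted$name), function(g) {
  as.data.frame(lapply(g, first_non_na), stringsAsFactors = FALSE)
}))
rownames(result) <- NULL
result
```

Because the helper indexes with [1], a column that is entirely NA within a group (acceleration for george) comes back as NA instead of raising an error, which is the same behaviour as na.omit(.)[1] and x[!is.na(x)][1] in the answers above.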
