Example data frame
date name speed acceleration
1/1/17 bob 5 NA
1/1/15 george 5 NA
1/1/15 bob NA 4
1/1/17 bob 4 NA
I want to condense all rows with the same name into one row and keep the newest non-na value for the speed and acceleration column.
Desired output
date name speed acceleration
1/1/17 bob 5 4
1/1/15 george 5 NA
You can do it this way:
library(dplyr)
library(lubridate)
input = read.table(text =
"date name speed acceleration
1/1/17 bob 5 NA
1/1/15 george 5 NA
1/1/15 bob NA 4
1/1/17 bob 4 NA",
header = TRUE, stringsAsFactors = FALSE)
output <- input %>%
mutate(date = mdy(date)) %>% # or maybe dmy, depending on your date format
group_by(name) %>%
arrange(desc(date)) %>%
summarise_all(funs(na.omit(.)[1]))
output
# # A tibble: 2 × 4
# name date speed acceleration
# <chr> <date> <int> <int>
# 1 bob 2017-01-01 5 4
# 2 george 2015-01-01 5 NA
Here is an option using data.table
. Convert the 'data.frame' to 'data.table' ( setDT(input)
), order
the 'date' after converting to Date
class, grouped by 'name', loop through the columns and get the first non-NA element
library(data.table)
library(lubridate)
setDT(input)[order(-mdy(date)), lapply(.SD, function(x) x[!is.na(x)][1]), name]
# name date speed acceleration
#1: bob 1/1/17 5 4
#2: george 1/1/15 5 NA
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.