I'm struggling to chain several steps in a tidyverse pipeline. I would first like to group a data frame by a few variables with group_by, then build a matrix from each group, and finally create a new variable holding a distance measurement (Mahalanobis) computed on that matrix.
Some sample data:
head(olddf)
## person place thing measurement1 measurement2
## 1 JohnSmith Paris a 3.2 4.4
## 2 JaneDoe Paris b 4.4 4.4
## 3 MaryJohnson London e 4.2 4.2
## 4 JohnSmith London d 4.1 3.9
## 5 JaneDoe Tokyo e 3.9 3.9
## 6 MaryJohnson Tokyo e 3.2 3.9
newdf <- olddf %>%
group_by(person, place, thing) %>%
m = data.matrix(???) %>% # what do I provide here? how to convert the data subset to a matrix? %>%
cov = var(m) %>%
mutate(dist=mahalanobis(m, cov))
It may be better to do a group split:
library(dplyr)
library(purrr)
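olddf %>%
  # one data frame per group; .keep = FALSE drops the grouping columns
  group_split(person, place, thing, .keep = FALSE) %>%
  map(~ {
    m <- data.matrix(.x)  # remaining measurement columns as a numeric matrix
    mahalanobis(m, center = FALSE, cov = var(m))  # squared distances, no centering
  }) %>%
  unlist() %>%
  # group_split() orders groups by key, so olddf must already be sorted the same way
  mutate(olddf, dist = .)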
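Because group_split() returns the groups sorted by key, the unlisted result only lines up with olddf if olddf is already in that order. One alternative is to compute the distances inside a grouped mutate(), so each value stays attached to its own row. The following is a minimal sketch, not a definitive answer. The data frame is made up for illustration (grouped by person only, with 10 rows per group so the covariance matrix is invertible), and it centers on the group means rather than using center = FALSE. Note that mahalanobis() returns squared distances.

```r
library(dplyr)

# Hypothetical data: 2 groups of 10 rows, deliberately not sorted by person
set.seed(1)
olddf <- data.frame(
  person       = rep(c("JohnSmith", "JaneDoe"), times = 10),
  measurement1 = rnorm(20, mean = 4, sd = 0.5),
  measurement2 = rnorm(20, mean = 4, sd = 0.5)
)

newdf <- olddf %>%
  group_by(person) %>%
  mutate(dist = {
    # inside a grouped mutate(), the columns hold only the current group's rows
    m <- cbind(measurement1, measurement2)
    mahalanobis(m, center = colMeans(m), cov = var(m))
  }) %>%
  ungroup()
```

Each group needs more rows than measurement columns and a non-singular covariance matrix; with one row per group, as in the six sample rows shown in the question, var(m) is all NA and mahalanobis() fails.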