dplyr manipulation rowwise grouping mutate
I have a dataset:
x <- data.frame(Postcode = c(1, 2, 3, 4, 5, 6),
Latitude = c(3.1, 3.2, 3.3, 3.3, 3.4, 3.4),
Longitude = c(100, 101, 102, 102, 103, 104),
Exposure = c(1, 2, 3, 4, 5, 6))
I am trying to manipulate the data in x so that it becomes:
x <- data.frame(Postcode = c(1, 2, 3, 4, 5, 6),
Latitude = c(3.1, 3.2, 3.3, 3.3, 3.4, 3.4),
Longitude = c(100, 101, 102, 102, 103, 104),
Exposure = c(1, 2, 3, 4, 5, 6),
coords = c("3.1, 100", "3.2, 101", "3.3, 102", "3.3, 102",
"3.4, 103", "3.4, 104"),
postcode = c("1", "2", "3,4", "3,4", "5", "6"),
exposure = c(1, 2, 7, 7, 5, 6))
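The new column postcode pastes together the Postcode values that share the same Latitude and Longitude. coords pastes Latitude and Longitude together, and exposure sums the Exposure of all rows that have the same coords (i.e. the same Latitude and Longitude).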
I can do this using the dplyr package and a for loop:
x <- mutate(x, coords = paste(Latitude, Longitude, sep = ", "))
x <- cbind(x, postcode = rep(0, nrow(x)), exposure = rep(0, nrow(x)))
for(i in unique(x$coords)){
x$postcode[x$coords == i] <- paste(x$Postcode[x$coords == i], collapse = ", ")
x$exposure[x$coords == i] <- sum(x$Exposure[x$coords == i])
}
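How can I do this using only the dplyr package, without the for loop? Or is there another approach that is more efficient than a for loop? My actual dataset is very large.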
library(dplyr)
library(tidyr) # unite() was used to join Lat, Lon
x %>% unite(coords, Latitude, Longitude, sep = ",", remove = FALSE) %>%
  # within each coords group: total Exposure and comma-separated Postcode list
  group_by(coords) %>% mutate(exposure = sum(Exposure), postcode = toString(Postcode))
Here is an approach using dplyr:
library(dplyr)
x %>%
group_by(coords = paste(Latitude, Longitude, sep = ", ")) %>%
mutate(postcode = toString(Postcode), exposure = sum(Exposure))
# Source: local data frame [6 x 7]
# Groups: coords [5]
#
# Postcode Latitude Longitude Exposure coords postcode exposure
# <dbl> <dbl> <dbl> <dbl> <chr> <chr> <dbl>
# 1 1 3.1 100 1 3.1, 100 1 1
# 2 2 3.2 101 2 3.2, 101 2 2
# 3 3 3.3 102 3 3.3, 102 3, 4 7
# 4 4 3.3 102 4 3.3, 102 3, 4 7
# 5 5 3.4 103 5 3.4, 103 5 5
# 6 6 3.4 104 6 3.4, 104 6 6
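Note that the printed result above is still grouped by coords. As a minimal, self-contained sketch (the name `res` is only illustrative), the same pipeline can be followed by `ungroup()` so that later operations are not applied per group:

```r
library(dplyr)

# Sample data from the question
x <- data.frame(Postcode = c(1, 2, 3, 4, 5, 6),
                Latitude = c(3.1, 3.2, 3.3, 3.3, 3.4, 3.4),
                Longitude = c(100, 101, 102, 102, 103, 104),
                Exposure = c(1, 2, 3, 4, 5, 6))

res <- x %>%
  # build the grouping key on the fly from the two coordinate columns
  group_by(coords = paste(Latitude, Longitude, sep = ", ")) %>%
  # mutate() keeps all six rows and repeats the group value on each of them
  mutate(postcode = toString(Postcode), exposure = sum(Exposure)) %>%
  # drop the grouping so `res` behaves like an ordinary data frame
  ungroup()
```

If one row per coordinate pair is enough, `summarise()` could be used in place of `mutate()`, but that changes the shape of the result compared with what the question asks for.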
We can use data.table to do this:
library(data.table)
# add coords by reference, then the per-coords Exposure total and Postcode list
setDT(x)[, coords := paste(Latitude, Longitude, sep = ",")
  ][, c("exposure", "postcode") := .(sum(Exposure), toString(Postcode)), by = coords]
x
# Postcode Latitude Longitude Exposure coords exposure postcode
#1: 1 3.1 100 1 3.1,100 1 1
#2: 2 3.2 101 2 3.2,101 2 2
#3: 3 3.3 102 3 3.3,102 7 3, 4
#4: 4 3.3 102 4 3.3,102 7 3, 4
#5: 5 3.4 103 5 3.4,103 5 5
#6: 6 3.4 104 6 3.4,104 6 6
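For comparison, the same grouped columns can also be sketched in base R with `ave()`, which applies a function within groups and returns a vector as long as its input. This is not taken from the answers above; it is only a rough alternative that avoids both the for loop and extra packages, and it has not been benchmarked on large data.

```r
# Sample data from the question
x <- data.frame(Postcode = c(1, 2, 3, 4, 5, 6),
                Latitude = c(3.1, 3.2, 3.3, 3.3, 3.4, 3.4),
                Longitude = c(100, 101, 102, 102, 103, 104),
                Exposure = c(1, 2, 3, 4, 5, 6))

# Grouping key: "Latitude, Longitude" as a single string
x$coords <- paste(x$Latitude, x$Longitude, sep = ", ")

# ave() evaluates FUN once per group and recycles the result over that group's rows.
# Postcode is converted to character first so the pasted result fits the vector type.
x$postcode <- ave(as.character(x$Postcode), x$coords, FUN = toString)
x$exposure <- ave(x$Exposure, x$coords, FUN = sum)
```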