Operations by group: a function within a function, with options, across multiple columns
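I want to perform a few similar operations across columns of a data frame I am working with. See the link for the data (not a large file), generated with dput().
I want to perform the following three operations on the columns cols = c("debt_GDP", "Top10"), grouped with group_by(year, country_name_iso3):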
inter = na.interpolation(cols, option = "spline")
sm=fitted(smooth.spline(cols_intep))
rollmean=rollmean(cols_intep,10, fill = NA)
Here is an example done one column at a time:
# interpolate missing values
df_us <-subset(pkt, country_name_iso3=="USA")
df_us <-droplevels(df_us)
df_us$debt_intep <-na.interpolation(df_us$debt_GDP, option = "spline")
df_us$top10_intep <-na.interpolation(df_us$Top10, option = "spline")
# smooth series with a smoothing spline
df_us$debt_sm <- fitted(smooth.spline(df_us$debt_intep))
df_us$top10_sm <- fitted(smooth.spline(df_us$top10_intep))
# rolling mean
df_us$debt_sm_rollmean<-rollmean(df_us$debt_sm,10, fill = NA)
df_us$top10_sm_rollmean<-rollmean(df_us$top10_sm,10, fill = NA)
I want exactly the same for each of the columns c("debt_GDP", "Top10"), grouped by c(year, country_name_iso3).
What is the most efficient code to achieve this?
library(tidyverse)
library(imputeTS)
library(zoo)
Use mutate_at three times:
read_delim('dput.df.txt', delim = ' ') %>%
  group_by(country_name_iso3) %>%
  mutate_at(.vars = c('debt_GDP', 'Top10'),
            .funs = list(inter = ~na_interpolation(., option = "spline"))) %>%
  mutate_at(.vars = c('debt_GDP_inter', 'Top10_inter'),
            .funs = list(sm = ~fitted(smooth.spline(.)))) %>%
  mutate_at(.vars = c('debt_GDP_inter_sm', 'Top10_inter_sm'),
            .funs = list(rollmean = ~rollmean(., 10, fill = NA)))
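In dplyr 1.0.0 and later, mutate_at is superseded by across(), so the same three steps can probably be written as below. This is only a sketch: the toy data frame and its values are made up to stand in for the real file, and the grouping is by country only, as in the code above.

```r
library(dplyr)
library(imputeTS)
library(zoo)

# Hypothetical stand-in for the real data: two countries, 30 years each
set.seed(1)
toy <- tibble(
  country_name_iso3 = rep(c("USA", "FRA"), each = 30),
  year              = rep(1981:2010, times = 2),
  debt_GDP          = cumsum(rnorm(60)) + 50,
  Top10             = cumsum(rnorm(60)) + 30
)
toy$debt_GDP[c(5, 12, 40)] <- NA   # punch a few holes to interpolate
toy$Top10[c(8, 33)] <- NA

res <- toy %>%
  group_by(country_name_iso3) %>%
  # step 1: spline interpolation of missing values
  mutate(across(c(debt_GDP, Top10),
                ~ na_interpolation(.x, option = "spline"),
                .names = "{.col}_inter")) %>%
  # step 2: smoothing spline on the interpolated series
  mutate(across(ends_with("_inter"),
                ~ fitted(smooth.spline(.x)),
                .names = "{.col}_sm")) %>%
  # step 3: centred 10-period rolling mean, padded with NA at the ends
  mutate(across(ends_with("_sm"),
                ~ rollmean(.x, 10, fill = NA),
                .names = "{.col}_rollmean")) %>%
  ungroup()
```

The .names argument reproduces the suffixes that mutate_at generates from a named list, so the resulting columns are debt_GDP_inter, debt_GDP_inter_sm, debt_GDP_inter_sm_rollmean and the Top10 equivalents.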
Or combine them into one function:
func <- function(x) {
  # interpolate gaps, smooth, then take a 10-period rolling mean
  inter <- na_interpolation(x, option = 'spline')
  sm <- fitted(smooth.spline(inter))
  rollmean(sm, 10, fill = NA)  # last expression is the return value
}
read_delim('dput.df.txt', delim = ' ') %>%
  group_by(country_name_iso3) %>%
  mutate_at(.vars = c('debt_GDP', 'Top10'), .funs = func)
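Note that passing a bare function to mutate_at overwrites debt_GDP and Top10 in place. If the original columns should be kept, wrapping the function in a named list should add suffixed columns instead. A minimal sketch, assuming a made-up data frame with columns a and b and a hypothetical helper smooth_pipeline (the same three steps, with the window width as a parameter):

```r
library(dplyr)
library(imputeTS)
library(zoo)

# Hypothetical helper: same three steps as func, window width k is adjustable
smooth_pipeline <- function(x, k = 10) {
  inter <- na_interpolation(x, option = "spline")
  sm <- fitted(smooth.spline(inter))
  rollmean(sm, k, fill = NA)
}

# Made-up data: two groups of 20 observations, with a few gaps
toy <- tibble(
  id = rep(c("A", "B"), each = 20),
  a  = c(1:20 + sin(1:20), 2 * (1:20) + cos(1:20)),
  b  = c(30 - 1:20 + cos(1:20), 10 + sin(1:20))
)
toy$a[c(3, 27)] <- NA
toy$b[c(9, 31)] <- NA

out <- toy %>%
  group_by(id) %>%
  # the list name becomes the suffix: a_rollmean, b_rollmean
  mutate_at(.vars = c("a", "b"),
            .funs = list(rollmean = ~ smooth_pipeline(.x, k = 5))) %>%
  ungroup()
```

Here a and b are left untouched, gaps included, and the processed series land in a_rollmean and b_rollmean.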
Plot the output:
read_delim('dput.df.txt', delim = ' ') %>%
  group_by(country_name_iso3) %>%
  mutate_at(.vars = c('debt_GDP', 'Top10'), .funs = func) %>%
  ggplot() +
  geom_line(aes(year, debt_GDP, color=country))