How to simplify correlation code in R?

Question

This is my df:

                  date                     z         x                    y 
   <dttm>                               <dbl>    <dbl>                <dbl> 
 1 2019-01-01 00:00:00                   1333  3339072.         456700000000 
 2 2019-02-01 00:00:00                    915  4567582.         904600000000 
 3 2019-03-01 00:00:00                   1433  7887962.         247900000000 
 4 2019-04-01 00:00:00                   1444  3454559.         905700000000 
 5 2019-05-01 00:00:00                   1231  9082390.         245600000000 
 6 2019-06-01 00:00:00                    346   781224.         346700000000

How can I simplify this code into a for loop?

df %>%
filter(year(df$date) == 2017) %>%
mutate(correlation = cor(x, y))

df %>%
filter(year(df$date) == 2018) %>%
mutate(correlation = cor(x, y))

df %>%
filter(year(df$date) == 2019) %>%
mutate(correlation = cor(x, y))

df %>%
filter(year(df$date) == 2020) %>%
mutate(correlation = cor(x, y))

This is what I have tried so far, but I get some NAs:

years <- c(2017, 2018, 2019, 2020)
for (y in years) {
  df %>%
    filter(date == y) %>%
    mutate(correlation = cor(x, y))
    print(df$correlation[y])
}

The output I want would look like this:

[1] 2017: 0.23
[1] 2018: -0.38
[1] 2019: 0.40
[1] 2020: 0.15

Answer 1

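To get the correlation by year, you will want to convert the dttm column into something that can be compared for equality by year. The year function from lubridate does this, and with it the code should work.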

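library(dplyr)
library(lubridate)

df$year <- year(df$date)

# Name the loop variable yr so it does not clash with the y column
for (yr in unique(df$year)) {
  r <- df %>%
    filter(year == yr) %>%
    summarise(correlation = cor(x, y)) %>%
    pull(correlation)
  print(paste0(yr, ": ", round(r, 2)))
}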
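
One pitfall is worth illustrating here: inside filter(), a bare name is looked up among the data frame's columns first, so a loop variable called y would be shadowed by the y column. Below is a minimal sketch, assuming a small made-up data frame (toy and its values are hypothetical, not the asker's data):

```r
library(dplyr)

# Hypothetical toy data, not the asker's data
toy <- tibble(
  year = c(2019, 2019, 2020, 2020),
  x    = c(1, 2, 3, 4),
  y    = c(2019, 5, 6, 7)   # the first y value happens to equal a year
)

y <- 2020  # a variable in the calling environment that shares a column name

# Here y resolves to the column toy$y, not to the outer variable,
# so this keeps only rows where year equals the y column
masked <- toy %>% filter(year == y)

# .env$y forces the lookup into the calling environment
explicit <- toy %>% filter(year == .env$y)

nrow(masked)    # 1
nrow(explicit)  # 2
```

Renaming the loop variable (for example to yr) avoids the clash without needing .env.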

Alternatively, we can be more concise and use group_by to do the following transformation.

yearDf <- df %>% 
  group_by(year) %>%
  summarize(correlation = cor(x, y))

print(yearDf)
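
To get from the grouped summary to the "year: value" lines requested in the question, something like the following might work. It is a sketch on made-up data spanning two years (toy, by_year and out are hypothetical names, and the numbers are not from the question):

```r
library(dplyr)
library(lubridate)

# Hypothetical data spanning two years (values are made up)
toy <- tibble(
  date = as.POSIXct(c("2019-01-01", "2019-02-01", "2019-03-01",
                      "2020-01-01", "2020-02-01", "2020-03-01"), tz = "UTC"),
  x = c(1, 2, 3, 1, 2, 3),
  y = c(2, 4, 6, 3, 2, 1)
)

# One row per year with its correlation
by_year <- toy %>%
  group_by(year = year(date)) %>%
  summarise(correlation = cor(x, y))

# Build one "year: correlation" string per row, rounded to two decimals
out <- sprintf("%d: %.2f", as.integer(by_year$year), by_year$correlation)
print(out)
```

In this toy data x and y are perfectly correlated in 2019 and perfectly anti-correlated in 2020, which makes the result easy to check by eye.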

Answer 2

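You can group_by year and compute the correlation of x and y for each year. Also, since the correlation for a year is a single number, it is better to use summarise rather than mutate, because mutate would repeat the same value on every row.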

library(dplyr)
library(lubridate)

df %>% group_by(year = year(date)) %>% summarise(correlation = cor(x, y))
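
The difference between the two verbs can be seen by comparing row counts. This is a minimal sketch on made-up data (toy, with_mutate and with_summarise are hypothetical names):

```r
library(dplyr)

# Hypothetical data: three rows per year
toy <- tibble(
  year = c(2019, 2019, 2019, 2020, 2020, 2020),
  x = c(1, 2, 3, 1, 2, 3),
  y = c(2, 4, 6, 3, 2, 1)
)

# mutate keeps every input row and repeats the group's value on each
with_mutate <- toy %>%
  group_by(year) %>%
  mutate(correlation = cor(x, y)) %>%
  ungroup()

# summarise collapses each group to a single row
with_summarise <- toy %>%
  group_by(year) %>%
  summarise(correlation = cor(x, y))

nrow(with_mutate)     # 6
nrow(with_summarise)  # 2
```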

2 solutions

Solution 1
2 2020-05-30 02:09:02

Solution 2
1 2020-05-30 01:53:59
