How do I take certain values from a dataframe, average them, and place them in a new dataframe in R?
How do you take average values by date/state in a dataframe?
I have a dataframe like this (head):
         state start_date   end_date    created_at cycle party answer candidate_name  pct survey_length
1      Florida 2020-11-02 2020-11-02 6/14/21 15:36  2020   REP  Trump   Donald Trump 48.0        0 days
2         Iowa 2020-11-01 2020-11-02 11/2/20 09:02  2020   REP  Trump   Donald Trump 48.0        1 days
3 Pennsylvania 2020-11-01 2020-11-02 11/2/20 12:49  2020   REP  Trump   Donald Trump 49.2        1 days
4      Florida 2020-11-01 2020-11-02 11/2/20 19:02  2020   REP  Trump   Donald Trump 48.2        1 days
5      Florida 2020-10-31 2020-11-02 11/4/20 09:17  2020   REP  Trump   Donald Trump 49.4        2 days
6       Nevada 2020-10-31 2020-11-02 11/4/20 10:38  2020   REP  Trump   Donald Trump 49.1        2 days
I want to take the average of the "pct" column for each month, by state. How would you do that? Would you use a for loop?
Here is a solution using group_by and summarize.
library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
# simulated data
df <- expand_grid(
  state = c("fl", "io", "pa", "nv"),
  start_date = seq(mdy("1/1/2022"), by = "day", length.out = 300)
) %>%
  mutate(pct = runif(nrow(.)))
# mean pct by month
df %>%
  mutate(mnth = floor_date(start_date, unit = "month")) %>%
  group_by(state, mnth) %>%
  summarize(pct = mean(pct), .groups = "drop")
#> # A tibble: 40 x 3
#> state mnth pct
#> <chr> <date> <dbl>
#> 1 fl 2022-01-01 0.443
#> 2 fl 2022-02-01 0.529
#> 3 fl 2022-03-01 0.570
#> 4 fl 2022-04-01 0.583
#> 5 fl 2022-05-01 0.477
#> 6 fl 2022-06-01 0.499
#> 7 fl 2022-07-01 0.497
#> 8 fl 2022-08-01 0.561
#> 9 fl 2022-09-01 0.467
#> 10 fl 2022-10-01 0.437
#> # ... with 30 more rows
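The answer above runs on simulated data. As a rough sketch, the same pipeline can be applied to the six rows shown in the question. Two assumptions are mine rather than the asker's: the month is taken from start_date (the question does not say which date column defines the month), and the names polls, monthly, and n_polls are made up for illustration.

```r
library(dplyr)
library(lubridate)

# The six rows from the question, keeping only the columns needed here
polls <- data.frame(
  state = c("Florida", "Iowa", "Pennsylvania", "Florida", "Florida", "Nevada"),
  start_date = ymd(c("2020-11-02", "2020-11-01", "2020-11-01",
                     "2020-11-01", "2020-10-31", "2020-10-31")),
  pct = c(48.0, 48.0, 49.2, 48.2, 49.4, 49.1)
)

# Assumption: bucket each poll by the month of its start_date
monthly <- polls %>%
  mutate(mnth = floor_date(start_date, unit = "month")) %>%
  group_by(state, mnth) %>%
  summarize(pct = mean(pct), n_polls = n(), .groups = "drop")

monthly
```

Keeping a count such as n_polls next to the mean makes it easy to see which state/month averages rest on a single poll.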
Created on 2022-03-14 by the reprex package (v2.0.1)
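On the question of whether a for loop is needed: it is not required even without the tidyverse. A minimal base R sketch using aggregate() is below. It again assumes the month comes from start_date, and the names polls and monthly_base are hypothetical.

```r
# The six rows from the question, keeping only the columns needed here
polls <- data.frame(
  state = c("Florida", "Iowa", "Pennsylvania", "Florida", "Florida", "Nevada"),
  start_date = as.Date(c("2020-11-02", "2020-11-01", "2020-11-01",
                         "2020-11-01", "2020-10-31", "2020-10-31")),
  pct = c(48.0, 48.0, 49.2, 48.2, 49.4, 49.1)
)

# Assumption: label each poll with the year-month of its start_date
polls$mnth <- format(polls$start_date, "%Y-%m")

# One mean per state/month combination, no explicit loop
monthly_base <- aggregate(pct ~ state + mnth, data = polls, FUN = mean)

monthly_base
```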