
R: Mutate Returning NA Values When Additional Variable in Select Statement

I'm using dplyr and zoo's rollmean to calculate a 13-week moving average and growth rates. The following works:

library(dplyr)
library(zoo)   # provides rollmean()

NEW_DATA <- DATA %>% 
    select(CAT, Inventory_Amount, Sales, Shipments, DATE)%>%
    group_by(CAT, DATE)%>%
    summarise(
            INVENTORY = sum(Inventory_Amount),
            SO = sum(Sales),
            SI = sum(Shipments)
    ) %>%
    arrange(CAT, DATE)%>%
    mutate(SO_13WK_AVG = rollmean(x = SO, 13, align = "right", fill = NA ),
           GROWTH = round(((SO - lag(SO, 52)) / lag(SO, 52)) *100,2))

This code adds two new columns: "SO_13WK_AVG" (the 13-week sales average) and "GROWTH" (the year-over-year growth rate for sales).

When I try to select an additional variable from the original data frame to include in the new summarized data frame, the values of the newly created variables all turn to NA. The following code generates NAs for SO_13WK_AVG and GROWTH (all I've done is select the "WK" variable and add it to the grouping):

NEW_DATA <- DATA %>% 
    select(CAT, Inventory_Amount, Sales, Shipments, DATE, WK)%>%
    group_by(CAT, DATE, WK)%>%
    summarise(
            INVENTORY = sum(Inventory_Amount),
            SO = sum(Sales),
            SI = sum(Shipments)
    ) %>%
    arrange(CAT, DATE)%>%
    mutate(SO_13WK_AVG = rollmean(x = SO, 13, align = "right", fill = NA ),
           GROWTH = round(((SO - lag(SO, 52)) / lag(SO, 52)) *100,2))

I searched Stack Overflow and found one thread that seems related:

Group/Mutate only returns NA and not an average

This thread suggests using na.rm = TRUE to remove NA values from the calculations. However, as far as I can tell, I don't have any missing values. Any help or commentary is appreciated.

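I just resolved a very similar issue. I can't quite tell whether this will fix yours without spending more time thinking about it, but I was grouping by the two variables that accounted for all of the variation across my data set (location and week). As a result, the rolling mean either could not be calculated or could only produce the fill values. Not grouping by "week" solved the issue.

Since "WK" is almost certainly 100% dependent on "DATE", I expect you have the same issue. Remember that summarise() drops only the last grouping variable. In your first version the data is grouped by CAT and DATE, so after summarise() it is grouped by CAT alone and the mutate() runs over each category's full series. In the second version the data is grouped by CAT, DATE and WK, so after summarise() it is still grouped by CAT and DATE. Each group then contains a single row, which is too short for a 13-row window or a 52-row lag, so both new columns come back as NA. Try grouping by WK before your summarise(), and then regrouping without WK or DATE before the mutate().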
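The regrouping suggested here can be sketched as follows. This is a minimal illustration rather than the asker's actual pipeline: the toy DATA frame, its values and the date range are assumptions made up for the example, and only the Sales column is summarised. It also assumes dplyr 1.0.0 or later for the .groups argument.

```r
library(dplyr)
library(zoo)

# Hypothetical toy data: 2 categories x 60 weekly dates, one row per pair.
set.seed(1)
DATA <- expand.grid(
  CAT  = c("A", "B"),
  DATE = seq(as.Date("2020-01-06"), by = "week", length.out = 60),
  stringsAsFactors = FALSE
)
DATA$WK    <- as.integer(format(DATA$DATE, "%U"))  # fully determined by DATE
DATA$Sales <- runif(nrow(DATA), 50, 150)

# With the default behaviour, summarise() peels off only the last grouping
# variable (WK), so the result is still grouped by CAT and DATE.
after_summarise <- DATA %>%
  group_by(CAT, DATE, WK) %>%
  summarise(SO = sum(Sales), .groups = "drop_last") %>%
  group_vars()

# Keep WK in the summary, drop all grouping, then regroup by CAT only so the
# rolling window and the lag run over each category's whole time series.
NEW_DATA <- DATA %>%
  group_by(CAT, DATE, WK) %>%
  summarise(SO = sum(Sales), .groups = "drop") %>%
  arrange(CAT, DATE) %>%
  group_by(CAT) %>%
  mutate(
    SO_13WK_AVG = rollmean(SO, 13, align = "right", fill = NA),
    GROWTH = round((SO - dplyr::lag(SO, 52)) / dplyr::lag(SO, 52) * 100, 2)
  ) %>%
  ungroup()
```

With this grouping, only the first 12 rows of each category have NA in SO_13WK_AVG and only the first 52 have NA in GROWTH, which is the expected warm-up for those window sizes. Calling ungroup() followed by group_by(CAT) after the original summarise() should have the same effect as .groups = "drop" here.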

(BTW, I'm sure you've figured something out, since this was almost two years ago, but I imagine others will encounter this as well, after all, that's why I came to this question.)

