Calculate weighted means for multiple grouping with different weightings in R

Question

I've gone through the many posts on SO trying to get my code to work but still have some errors. I'm trying to calculate weighted means for many columns based on different groupings. Specifically, I want to calculate the weighted mean of traits (in this case wingL, wingW, etc.) weighted by the value column.

Here is a sample dataset (because my matrix is HUGE) and some code:

> df
    year  site  Species  value  wingL   wingW  proL  proW
    2018  2     Aa       3.0    310.6   54.9   NA    1.1
    2017  2     Aa       1.0    310.6   54.9   NA    1.1
    2018  2     Bb       7.5    NA      20     3     1.0
    2017  2     Bb       5      NA      20     3     1.0
    2018  4     Aa       8      310.6   54.9   NA    1.1
    2017  4     Aa       6      310.6   54.9   NA    1.1
    2018  4     Cc       1      161.20  143.8  NA    NA
    2017  4     Cc       1      161.20  143.8  NA    NA
    2018  6     Aa       12     310.6   54.9   NA    1.1
    2018  6     Aa       9.5    310.6   54.9   NA    1.1
    2018  6     Cc       7      161.20  143.8  NA    NA
    2017  6     Cc       7      161.20  143.8  NA    NA

Here is my code:

dfnew <- setDT(df)[, lapply(.SD, function(x) weighted.mean(x, value)),
                       by = c("year", "Species"), .SDcols  = wingL:proW]

But all it does is delete the "value" column, which is what I want to use as my weights. Basically, I want to calculate the weighted mean across rows for columns wingL:proW. Then, once I have those data, I will eventually average across all species (Aa, Bb) at each site.

With the code below I was able to correctly create a new df with just one new column (wingL_wm), but I can't figure out how to scale this to the many columns I have:

dfnew <- df %>% 
          group_by(year, site) %>%
          summarise(wingL_wm = weighted.mean(wingL, value))

Hope that makes sense. Thanks for the help! Here is a generic desired output, where each "x" stands for a calculated weighted mean:

year  site  wingL_WM  wingW_WM  proL_WM  proW_WM
2018  2     x         x         x        x
2017  2     x         x         x        x
2018  4     x         x         x        x
2017  4     x         x         x        x
2018  6     x         x         x        x
2017  6     x         x         x        x

Answer 1

dfnew <- setDT(df)[, lapply(.SD, function(x) weighted.mean(x, value, na.rm = TRUE)),
                       by = c("year", "site"), .SDcols = wingL:proW]

I had to include the na.rm = TRUE argument! I think this gives the correct results. Thanks everyone for helping me think it through. I also had an error in the grouping (I was grouping by Species instead of site) and was overthinking it.
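To see why na.rm matters here, the following is a small base R sketch using only the two 2018 rows for site 2 from the sample data (species Aa and Bb). The variable names are illustrative and not part of the original code. With na.rm = TRUE, weighted.mean drops each missing trait value together with its weight; if every value in a group is missing, the result is NaN rather than an error.

```r
# Site 2, year 2018 from the sample data: species Aa and Bb
value <- c(3.0, 7.5)     # weights
wingL <- c(310.6, NA)    # Bb has no wingL measurement
wingW <- c(54.9, 20)     # both species measured

# A single NA in the trait makes the whole weighted mean NA
without_na_rm <- weighted.mean(wingL, value)

# na.rm = TRUE drops the NA and its weight, so only Aa contributes
with_na_rm <- weighted.mean(wingL, value, na.rm = TRUE)

# No missing values: (54.9 * 3 + 20 * 7.5) / (3 + 7.5)
wingW_wm <- weighted.mean(wingW, value)

# If every value in a group is missing, the result is NaN (0 / 0)
all_missing <- weighted.mean(c(NA_real_, NA_real_), c(8, 1), na.rm = TRUE)

print(c(without_na_rm, with_na_rm, wingW_wm, all_missing))
```

In the sample data this NaN case would apply to proL at sites 4 and 6, where neither species has a proL value.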

It does replace the original values, but I can live with that.
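If the concern is that the summary columns keep the trait names (wingL, wingW, ...) instead of the _WM names from the desired output, one possible alternative is to extend the dplyr attempt from the question. This is a minimal sketch, assuming dplyr 1.0.0 or later (across() and its .names argument are newer than this post), rebuilt on only the four site 2 rows of the sample data. It is not the accepted answer's method, just an equivalent formulation.

```r
library(dplyr)

# Four rows (site 2) of the sample data from the question
df <- data.frame(
  year    = c(2018, 2017, 2018, 2017),
  site    = c(2, 2, 2, 2),
  Species = c("Aa", "Aa", "Bb", "Bb"),
  value   = c(3, 1, 7.5, 5),
  wingL   = c(310.6, 310.6, NA, NA),
  wingW   = c(54.9, 54.9, 20, 20),
  proL    = c(NA, NA, 3, 3),
  proW    = c(1.1, 1.1, 1.0, 1.0)
)

dfnew <- df %>%
  group_by(year, site) %>%
  summarise(
    # Weighted mean of every trait column, weighted by value within each group;
    # .names appends _WM so the original column names are not reused
    across(wingL:proW,
           ~ weighted.mean(.x, value, na.rm = TRUE),
           .names = "{.col}_WM"),
    .groups = "drop"
  )

print(dfnew)
```

Because the result is assigned to dfnew, df itself is left unchanged, and the output has one row per year and site combination.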

solution1
1 ACCEPTED 2019-02-22 20:50:59