Summing multiple observation rows in R

I have a dataset with 4 observations for 90 variables. The observations are answers to a questionnaire on a scale from "completely agree" to "completely disagree", expressed as percentages. I want to sum the two positive observations (completely and somewhat agree) and the two negative ones (completely and somewhat disagree) for all variables. Is there a way to do this in R?

My dataset looks like this:

   Albania  Andorra  Azerbaijan  etc.
1  13.3     18.0     14.9        ...
2  56.3     45.3     27.2        ...
3  21.3     27.2     28.0        ...
4  8.9      9.4      5.2         ...     

And I want to sum rows 1+2 and 3+4 to look something like this:

   Albania  Andorra  Azerbaijan  etc.
1  69.6     63.3     42.1        ...
2  30.2     36.6     33.2        ...

I am new to R, so I don't know how to approach this. The answers to similar questions I found on this website and elsewhere either deal with character-type observations, have multiple rows for the same observation (with missing data), or combine all the rows into a single row. My problem falls into none of these categories; I just want to collapse some of the observations.

Since you only have four rows, it's probably easiest to just add the first two rows together and the second two rows together. You can use rbind to stick the two resulting rows together into the desired data frame:

rbind(df[1,] + df[2, ], df[3,] + df[4,])
#>   Albania Andorra Azerbaijan
#> 1    69.6    63.3       42.1
#> 3    30.2    36.6       33.2

Data taken from the question:

df <- structure(list(Albania = c(13.3, 56.3, 21.3, 8.9), Andorra = c(18, 
45.3, 27.2, 9.4), Azerbaijan = c(14.9, 27.2, 28, 5.2)), class = "data.frame", 
row.names = c("1", "2", "3", "4"))
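The result above keeps the row names of the left-hand operands (1 and 3). If that is unwanted, the rows can be relabelled afterwards. A minimal sketch, assuming the labels "agree" and "disagree", which are illustrative and not taken from the question:

```r
# Same data as in the question, rebuilt with data.frame() for readability
df <- data.frame(
  Albania    = c(13.3, 56.3, 21.3, 8.9),
  Andorra    = c(18.0, 45.3, 27.2, 9.4),
  Azerbaijan = c(14.9, 27.2, 28.0, 5.2)
)

# Sum rows 1+2 and rows 3+4, then stack the two one-row results
res <- rbind(df[1, ] + df[2, ], df[3, ] + df[4, ])

# Replace the inherited row names with hypothetical labels
rownames(res) <- c("agree", "disagree")
res
```

Naming the rows makes it harder to mix up the positive and negative totals later on, for example when plotting.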

Another option is to sum every two rows with rowsum, using gl with k = 2 to build the grouping factor:

rowsum(df, gl(n = nrow(df), k = 2, length = nrow(df)))
#>   Albania Andorra Azerbaijan
#> 1    69.6    63.3       42.1
#> 2    30.2    36.6       33.2

Created on 2023-01-06 with reprex v2.0.2
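To see what the grouping argument looks like, it may help to print it on its own. The sketch below also builds the same pairing with integer arithmetic, as one possible alternative to gl; the variable names grp_gl and grp_int are made up for this illustration:

```r
df <- data.frame(
  Albania    = c(13.3, 56.3, 21.3, 8.9),
  Andorra    = c(18.0, 45.3, 27.2, 9.4),
  Azerbaijan = c(14.9, 27.2, 28.0, 5.2)
)

# gl() repeats each level k = 2 times and truncates to `length` values,
# so for four rows the grouping values are 1 1 2 2
grp_gl <- gl(n = nrow(df), k = 2, length = nrow(df))
print(grp_gl)

# The same pairing computed from the row index: rows 1-2 -> 1, rows 3-4 -> 2
grp_int <- ceiling(seq_len(nrow(df)) / 2)

# Both grouping vectors lead to the same per-pair column sums
res_gl  <- rowsum(df, grp_gl)
res_int <- rowsum(df, grp_int)
res_int
```

Either form should extend to a longer data frame with an even number of rows, since each consecutive pair of rows gets its own group.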

Using dplyr

library(dplyr)
df %>%
  group_by(grp = gl(n(), 2, n())) %>%
  summarise(across(everything(), sum))

Output:

# A tibble: 2 × 4
  grp   Albania Andorra Azerbaijan
  <fct>   <dbl>   <dbl>      <dbl>
1 1        69.6    63.3       42.1
2 2        30.2    36.6       33.2
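The dplyr result carries the helper column grp, which the desired output in the question does not have. A minimal sketch of dropping it afterwards, assuming dplyr is installed and that the grouping column is no longer needed:

```r
library(dplyr)

df <- data.frame(
  Albania    = c(13.3, 56.3, 21.3, 8.9),
  Andorra    = c(18.0, 45.3, 27.2, 9.4),
  Azerbaijan = c(14.9, 27.2, 28.0, 5.2)
)

res <- df %>%
  # Pair up consecutive rows: grouping values 1 1 2 2
  group_by(grp = gl(n(), 2, n())) %>%
  # Sum every country column within each pair
  summarise(across(everything(), sum)) %>%
  # Remove the helper column so only the country columns remain
  select(-grp)
res
```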
