R: Aggregate error - ‘sum’ not meaningful for factors

I've seen many similar questions on the website but somehow could not find an answer to my problem. I have a data frame that looks like this:

head(ftrade)
   Imports Value Exports Value  balance nacer2
1          7349        185712   178363     01
2       4772816      99763470 94990654     01
3       4772816      99763470 94990654     01
4       4772816      99763470 94990654     01
5       1022528       7880815  6858287     01
6       8295652        215331 -8080321

I want to aggregate my data by nacer2, while summing the values. My expected output would be like this:

    Imports Value Exports Value  balance nacer2
1         50000        100000    50000     01
2         50000        100000    50000     02
3         50000        100000    50000     03
4         50000        100000    50000     04
5         50000        100000    50000     05

where the values in the first three columns are the sums of the original data. I ran the following:

ftrade <- do.call(data.frame, aggregate(cbind("Exports Value",
                                          "Imports Value",
                                           balance) ~ nacer2, 
                                           data = ftrade,
                                            sum))

which returns the error message: Error in Summary.factor(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, : 'sum' not meaningful for factors

All the answers I've seen on this forum state that it is because one of the variables is a factor, and so summing does not make sense. I've checked, and none of my variables are factors:

str(ftrade)
'data.frame':   11963 obs. of  4 variables:
 $ Imports Value: num  7349 4772816 4772816 4772816 1022528 ...
 $ Exports Value: num  185712 99763470 99763470 99763470 7880815 ...
 $ balance      : num  178363 94990654 94990654 94990654 6858287 ...
 $ nacer2       : chr  "01" "01" "01" "01" ...

Since I am aggregating over nacer2, it should not be a problem that it is a character. I've tried converting everything to numeric values again, but nothing seems to solve my issue. I am not sure I understand what is really happening here. Am I missing something?

Thank you for your help, Clement

If you really need to use spaces in your variable names (you probably don't) then you need to use backticks to refer to them:

names(mtcars)[1] <- 'm p g'

aggregate(cbind(disp, 'm p g') ~ vs, mtcars, sum)
 Error in Summary.factor(c(8L, 8L, 18L, 18L, 12L, 12L, 12L, 22L, 21L, 20L, : 'sum' not meaningful for factors
aggregate(cbind(disp, `m p g`) ~ vs, mtcars, sum)
  vs   disp   mpg
1  0 5528.7 299.1
2  1 1854.4 343.8

Regardless, my advice is to not use spaces in your variable names.
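
To see why the quoted version fails, it may help to look at what cbind() builds in each case. The sketch below uses a small made-up data frame (the name df and its values are assumptions, not the asker's data). A quoted name such as "Exports Value" is a string constant rather than a reference to the column, so cbind() returns a character matrix instead of a numeric one, and sum() cannot work on it. Whether R then reports the problem as a factor error or as a character error seems to depend on the R version, so the example only checks that an error is raised.

```r
# Toy stand-in for ftrade (hypothetical values).
df <- data.frame(nacer2  = c("01", "01", "02"),
                 balance = c(10, 20, 30))
df[["Exports Value"]] <- c(1, 2, 3)

# Quoted name: a string literal, recycled down the rows.
quoted <- with(df, cbind("Exports Value", balance))
# Backticked name: the actual numeric column.
backtick <- with(df, cbind(`Exports Value`, balance))

typeof(quoted)    # "character"
typeof(backtick)  # "double"

# The quoted form fails inside aggregate(); capture the error instead of stopping.
res <- tryCatch(
  aggregate(cbind("Exports Value", balance) ~ nacer2, data = df, FUN = sum),
  error = function(e) e
)
inherits(res, "error")  # TRUE

# The backticked form sums each column within each nacer2 group.
fixed <- aggregate(cbind(`Exports Value`, balance) ~ nacer2, data = df, FUN = sum)
print(fixed)
```

This also explains why str(ftrade) shows no factors: the columns themselves are fine, and the non-numeric values only appear inside the cbind() call.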

I was able to do what I wanted using dplyr:

library(dplyr)

# Sum balance within each nacer2 group
ftrade <- ftrade %>% 
  group_by(nacer2) %>%
  summarise(balance = sum(balance))

It does the job just fine, so I think we can consider the case closed. However, I am still curious to hear an explanation of what exactly was happening here.
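
Note that the snippet above only keeps balance, whereas the expected output in the question has all three value columns. A possible extension, sketched here on a small made-up data frame (toy and its values are assumptions), is to sum several columns at once with across(), which needs dplyr 1.0.0 or later. Column names containing spaces are written with backticks, as in the answer above.

```r
library(dplyr)

# Toy stand-in for ftrade (hypothetical values); check.names = FALSE keeps the spaces.
toy <- data.frame(
  `Imports Value` = c(1, 2, 3),
  `Exports Value` = c(10, 20, 30),
  balance         = c(9, 18, 27),
  nacer2          = c("01", "01", "02"),
  check.names = FALSE
)

# Sum all three value columns within each nacer2 group.
totals <- toy %>%
  group_by(nacer2) %>%
  summarise(across(c(`Imports Value`, `Exports Value`, balance), sum))

print(totals)
```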
