简体   繁体   中英

How to Sum Amounts Column Based On Conditions from Categorical Column in R

I have a dataframe in r called house_expenses that looks like this (2 columns: DESCRIPTION and AMOUNT):

DESCRIPTION             AMOUNT
-----------            ---------
COUCH                    $801.713

TV                       $4999.996

TV_MOUNT                 $575.867

ENTERTAINMENT_SYSTEM     $1102.392

MATTRESS                 $1225.893

BEDFRAME                 $356.789

PILLOWS                  $528.989

I would like to create two additional columns to the dataframe that has the sums and is rounded to 2 decimal places:

  1. LIVING_ROOM_COSTS = sum(round(COUCH, TV, TV_MOUNT, ENTERTAINMENT_SYSTEM), =2)
  2. BEDROOM_COSTS = sum(round(MATTRESS, BEDFRAME, PILLOWS), =2)

I have tried doing

house_expenses  <- house_expenses %>%

                   group_by(DESCRIPTION) %>%

                   mutate(LIVING_ROOM_COSTS  = sum(round(DESCRIPTION == "COUCH" &
                                                         DESCRIPTION == "TV" &
                                                         DESCRIPTION == "TV_MOUNT" &
                                                         DESCRIPTION == "ENTERTAINMENT_SYSTEM" , digits = 2)),
                    mutate(BEDROOM_COSTS = sum(round(DESCRIPTION == "MATTRESS" &
                                                     DESCRIPTION == "BEDFRAME" &
                                                     DESCRIPTION == "PILLOWS", digits = 2)))

But unfortunately this hasn't worked. Had anyone come across this before and know how to approach this problem?

To get the solution you want you have to do some subsetting, Description %in% c("COUCH", "TV","TV_MOUNT","ENTERTAINMENT_SYSTEM") Gets you the TRUE or FALSE according to the row, then you subset AMOUNT AMOUNT[Description %in% c("COUCH", "TV","TV_MOUNT","ENTERTAINMENT_SYSTEM")]

Then you wrap the values in a sum and round it:

df$LIVING_ROOM_COSTS = with(df, round(sum(Amount[Description %in% c("COUCH", "TV","TV_MOUNT","ENTERTAINMENT_SYSTEM")]), 2))
df$BEDROOM_COSTS = with(df, round(sum(Amount[Description %in% c("MATRESS", "BEDFRAME","PILLOWS")]), 2))

This gives us the data.frame of:

           Description   Amount LIVING_ROOM_COSTS BEDROOM_COSTS
1                COUCH  801.713           7479.97        885.78
2                   TV 4999.996           7479.97        885.78
3             TV_MOUNT  575.867           7479.97        885.78
4 ENTERTAINMENT_SYSTEM 1102.392           7479.97        885.78
5             MATTRESS 1225.893           7479.97        885.78
6             BEDFRAME  356.789           7479.97        885.78
7              PILLOWS  528.989           7479.97        885.78

Using with allows us to refer to column names without using $

The reason there wasn't an answer sooner enough is because the formatting given required extra work and humans are generally lazy.

If you had formatted your data.frame like this:

           Description   Amount
1                COUCH  801.713
2                   TV 4999.996
3             TV_MOUNT  575.867
4 ENTERTAINMENT_SYSTEM 1102.392
5             MATTRESS 1225.893
6             BEDFRAME  356.789
7              PILLOWS  528.989

Or like this using the function dput :

structure(list(Description = c("COUCH", "TV", "TV_MOUNT", "ENTERTAINMENT_SYSTEM", 
"MATTRESS", "BEDFRAME", "PILLOWS"), Amount = c(801.713, 4999.996, 
575.867, 1102.392, 1225.893, 356.789, 528.989)), class = "data.frame", row.names = c(NA, 
-7L))

It would have been answered swiftly.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM