简体   繁体   中英

Create a new column by aggregating multiple columns in R

Background

I have a dataset, df, where I would like to aggregate multiple columns and create a new column. I need to multiply Type, Span and Population columns and create a new Output column

ID       Status      Type     Span   State   Population

A        Yes         2        70%    Ga      10000

Desired output

ID        Status     Type      Span   State   Population   Output

A         Yes        2         70%    Ga      10000        14000      

dput

structure(list(ID = structure(1L, .Label = "A ", class = "factor"), 
Status = structure(1L, .Label = "Yes", class = "factor"), 
Type = 2L, Span = structure(1L, .Label = "70%", class = "factor"), 
State = structure(1L, .Label = "Ga", class = "factor"), Population = 10000L), class = "data.frame", 
row.names = c(NA, 
-1L))

This is what I have tried

 df %>% 
 mutate(Output = Type * Span * Population)

Here, we are creating a new column based on the inputs from different column. We can just use mutate to get the Span percent of Population and multiply by 'Type'. Note that 'Span' is not numeric, as it is having % , so we extract the numeric part with parse_number divide by 100, then multiply with Population along with the 'Type'

library(dplyr)
df %>%
  mutate(Output = Type * Population * readr::parse_number(as.character(Span))/100)
#   ID Status Type Span State Population Output
#1 A     Yes    2  70%    Ga      10000  14000

If the columns 'Type', 'Population' are not numeric, it is better to convert to numeric with as.numeric(as.character(df$Type)) and for 'Population' (assuming they are factor class). Another option is type.convert(df, as.is = TRUE) and then work on that modified class dataset

We can remove the '%' sign using sub , convert to numeric and multiply values.

This can be done in base R as:

df$output <- with(df, Type * as.numeric(sub('%', '', Span)) * Population/100)
df

#  ID Status Type Span State Population  output
#1 A     Yes    2  70%    Ga      10000   14000

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM