
How to add a new column with 6 categories?

library(tidyverse)
library(lubridate)
daily_aqi <- read_csv("data/akl-aqi19.csv")
aqi_cat <- fct_inorder(c("Good", "Moderate", "Unhealthy for Sensitive",
  "Unhealthy", "Very Unhealthy", "Hazardous"))
aqi_pal <- setNames(
  c("#00E400", "#FFFF00", "#FF7E00", "#FF0000", "#8F3F97", "#7E0023"),
  aqi_cat)

How do I do steps 1-3?

  1. add a new column that divides max_aqi into 6 categories
  2. add a column month by extracting the month from the date
  3. add a column mday by extracting the day from the date

Add 3 new columns to daily_aqi:

  1. aqi_cat: divide max_aqi into 6 categories:
       0 to 50: Good
       51 to 100: Moderate
       101 to 150: Unhealthy for Sensitive
       151 to 200: Unhealthy
       201 to 300: Very Unhealthy
       301 and higher: Hazardous
  2. month: extract month of the year
  3. mday: extract day of the month

Screenshot of the problem and the desired output:

https://i.stack.imgur.com/CacOe.png

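We can use cut to create the categories from the values in 'max_aqi'. By default cut uses right-closed intervals, so the breaks below give (-Inf, 50], (50, 100], (100, 150], (150, 200], (200, 300] and (300, Inf), which matches the ranges in the question. The 'month' and 'mday' columns can be created from the 'date' column using format.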

library(dplyr)
daily_aqi <- daily_aqi %>%
    mutate(aqi_cat = cut(max_aqi, breaks = c(-Inf, 50, 100, 150, 200, 300, Inf),
       labels = c("Good", "Moderate", "Unhealthy for Sensitive",
            "Unhealthy", "Very Unhealthy", "Hazardous")),
   month = format(date, '%b'), mday = as.integer(format(date, '%d')))

-output

daily_aqi
#         date max_aqi                 aqi_cat month mday
#1  2019-01-01      35                    Good   Jan    1
#2  2019-01-02       0                    Good   Jan    2
#3  2019-01-03      50                    Good   Jan    3
#4  2019-01-04      51                Moderate   Jan    4
#5  2019-01-05     101 Unhealthy for Sensitive   Jan    5
#6  2019-01-06     198               Unhealthy   Jan    6
#7  2019-01-07     201          Very Unhealthy   Jan    7
#8  2019-01-08     300          Very Unhealthy   Jan    8
#9  2019-01-09     301               Hazardous   Jan    9
#10 2019-01-10     350               Hazardous   Jan   10
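
Since the question already loads lubridate, its month() and mday() helpers could replace the two format() calls. This is only a sketch, assuming 'date' is of class Date as in the sample data; the name daily_aqi2 is made up for illustration. With label = TRUE, month() returns an ordered factor, so months would sort in calendar order rather than alphabetically.

```r
library(dplyr)
library(lubridate)

# Same values as the sample data in the 'data' section
daily_aqi <- data.frame(
  date = as.Date("2019-01-01") + 0:9,
  max_aqi = c(35, 0, 50, 51, 101, 198, 201, 300, 301, 350)
)

# daily_aqi2 is a hypothetical name used only for this sketch
daily_aqi2 <- daily_aqi %>%
  mutate(
    # label = TRUE gives abbreviated month names as an ordered factor
    month = month(date, label = TRUE),
    # day of the month as a number
    mday = mday(date)
  )

daily_aqi2
```

The month labels follow the current locale, just as format(date, '%b') does.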

Or use case_when

daily_aqi <- daily_aqi %>%
       mutate(aqi_cat = case_when(between(max_aqi, 0, 50)~ 'Good',
                  between(max_aqi, 51, 100)~ 'Moderate',
                  between(max_aqi, 101, 150) ~ 'Unhealthy for Sensitive',
                  between(max_aqi, 151, 200) ~ 'Unhealthy',
                  between(max_aqi, 201, 300) ~ 'Very Unhealthy',
                  max_aqi > 300 ~ 'Hazardous'),
              month = format(date, '%b'), mday = as.integer(format(date, '%d')))
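
Two things to be aware of with the case_when version: it returns a character column rather than a factor, and the integer bounds passed to between leave gaps, so a fractional value such as 50.5 matches no condition and becomes NA. A possible variant, sketched here with made-up values (the data frame x and the vector aqi_levels are hypothetical names), uses cumulative upper bounds and then converts to a factor with the levels in severity order:

```r
library(dplyr)

# Category labels in severity order (same labels as in the question)
aqi_levels <- c("Good", "Moderate", "Unhealthy for Sensitive",
                "Unhealthy", "Very Unhealthy", "Hazardous")

# Hypothetical values, including fractional ones, for illustration only
x <- data.frame(max_aqi = c(0, 50, 50.5, 150, 200.2, 301))

x <- x %>%
  mutate(
    # conditions are checked in order, so each one only needs an upper bound
    aqi_cat = case_when(
      max_aqi <= 50  ~ "Good",
      max_aqi <= 100 ~ "Moderate",
      max_aqi <= 150 ~ "Unhealthy for Sensitive",
      max_aqi <= 200 ~ "Unhealthy",
      max_aqi <= 300 ~ "Very Unhealthy",
      max_aqi > 300  ~ "Hazardous"   # a missing max_aqi stays NA
    ),
    # convert to a factor so the categories keep their severity order
    aqi_cat = factor(aqi_cat, levels = aqi_levels)
  )

x
```

This assigns the same categories as the cut approach, including for values that fall between the integer ranges.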

data

daily_aqi <- structure(list(date = structure(17897:17906, class = "Date"), 
    max_aqi = c(35, 0, 50, 51, 101, 198, 201, 300, 301, 350)), 
    class = "data.frame", row.names = c(NA, 
-10L))
