How to assign mutate and distinct to another variable in R?

I have a huge data set with a reading every 30 seconds. First I take the mean to get hourly data, then sum the hourly values to get daily data, and sum those again to get monthly data. I need to assign the result of the mutate pipeline to a new data set/variable called mE_131 so that I can plot the monthly values. I'm new to this, please help!

library(dplyr)
library(ggplot2)

data %>% # filtering 131 and 132
  select(time, Column3, m_Pm) %>%
  filter(Column3 %in% c("131", "132"))
data_131 <- filter(data, Column3 == "131")
data_132 <- filter(data, Column3 == "132")

data_131%>%
  mutate(datehour= format(time,"%Y-%m-%d %H"), date1= format(time,"%Y-%m-%d"), month=format(time,"%Y-%m")) %>% 
  group_by(datehour) %>% mutate(hourlyP=mean(m_Pm)) %>% distinct(datehour, .keep_all = TRUE) %>% 
  group_by(date1) %>% mutate(dailyP=sum(hourlyP)) %>% distinct(date1, .keep_all = TRUE) %>% 
  group_by(month) %>% summarise(monthlyP=sum(dailyP))

If your goal is to compare monthly data between Column3 == 131 and Column3 == 132, you don't necessarily need to create a separate dataset for each of them, although I will show you how to do that at the end.

First, let's create the required summary for both 131 and 132:

data <- data %>%
    filter(Column3 == "131" | Column3 == "132") %>% # keep only the required groups
    mutate(datehour = format(time, "%Y-%m-%d %H"), # keys for each aggregation level
           date1 = format(time, "%Y-%m-%d"),
           month = format(time, "%Y-%m")) %>%
    group_by(Column3, datehour) %>% # group by Column3 too, so 131 and 132 stay separate
    mutate(hourlyP = mean(m_Pm)) %>%
    distinct(datehour, .keep_all = TRUE) %>%
    group_by(Column3, date1) %>%
    mutate(dailyP = sum(hourlyP)) %>%
    distinct(date1, .keep_all = TRUE) %>%
    group_by(Column3, month) %>%
    summarise(monthlyP = sum(dailyP), .groups = "drop")

Note: I have written each step on a separate line to improve readability, but the logic is essentially the same as in your code above.
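As a side note, each mutate() + distinct() pair can probably be collapsed into a single summarise() call, which returns one row per group directly. The following is only a minimal sketch under stated assumptions: the `toy` data frame and its values are made up for illustration (two hours of 30-second readings per ID), and the column names time, Column3 and m_Pm are taken from the question.

```r
library(dplyr)

# Hypothetical toy data: 240 readings at 30-second spacing (two full hours)
# for each of the two IDs; m_Pm is constant per ID to keep the result obvious
times <- seq(as.POSIXct("2021-01-15 10:00:00", tz = "UTC"),
             by = "30 sec", length.out = 240)
toy <- data.frame(
  time    = rep(times, times = 2),
  Column3 = rep(c("131", "132"), each = 240),
  m_Pm    = rep(c(1, 2), each = 240)
)

monthly <- toy %>%
  mutate(datehour = format(time, "%Y-%m-%d %H"),
         date1    = format(time, "%Y-%m-%d"),
         month    = format(time, "%Y-%m")) %>%
  group_by(Column3, month, date1, datehour) %>%
  summarise(hourlyP = mean(m_Pm), .groups = "drop") %>%   # hourly mean
  group_by(Column3, month, date1) %>%
  summarise(dailyP = sum(hourlyP), .groups = "drop") %>%  # daily sum of hourly means
  group_by(Column3, month) %>%
  summarise(monthlyP = sum(dailyP), .groups = "drop")     # monthly sum of daily sums

print(monthly)
```

Because the result is assigned with <-, `monthly` is an ordinary data frame with one row per Column3 and month, so it can be reused for plotting or further calculations.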

Now, let's do the plotting:

data %>%
    ggplot(aes(x = month, y = monthlyP, fill = factor(Column3))) +
    geom_col(position = "dodge") # geom_col() plots the y values as given; this produces a plot similar to your example

If you insist on having a separate dataset for each value in Column3, you can simply use the assignment operator <- to create a new data frame as follows:

mE_131 <- data_131 %>%
   mutate(datehour= format(time,"%Y-%m-%d %H"), 
          date1= format(time,"%Y-%m-%d"),
          month=format(time,"%Y-%m")) %>% 
   group_by(datehour) %>%
   mutate(hourlyP=mean(m_Pm)) %>%
   distinct(datehour, .keep_all = TRUE) %>% 
   group_by(date1) %>%
   mutate(dailyP=sum(hourlyP)) %>%
   distinct(date1, .keep_all = TRUE) %>%
   group_by(month) %>% 
   summarise(monthlyP=sum(dailyP))

Then do the same thing to create mE_132. However, I don't recommend this, because the two data frames would have to be combined again before they can be plotted side by side.
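If you do go the separate-dataset route, one way to avoid copying the pipeline is to wrap it in a function and then stack the results for plotting. This is a rough sketch, not the only way to do it: `monthly_sum` is a hypothetical helper name, the two input data frames are toy stand-ins for data_131 and data_132, and the aggregation uses summarise() in place of mutate() + distinct().

```r
library(dplyr)

# Hypothetical helper: hourly mean -> daily sum -> monthly sum for one data frame
# (assumes columns named time and m_Pm, as in the question)
monthly_sum <- function(df) {
  df %>%
    mutate(datehour = format(time, "%Y-%m-%d %H"),
           date1    = format(time, "%Y-%m-%d"),
           month    = format(time, "%Y-%m")) %>%
    group_by(month, date1, datehour) %>%
    summarise(hourlyP = mean(m_Pm), .groups = "drop") %>%
    group_by(month, date1) %>%
    summarise(dailyP = sum(hourlyP), .groups = "drop") %>%
    group_by(month) %>%
    summarise(monthlyP = sum(dailyP), .groups = "drop")
}

# Toy stand-ins for data_131 and data_132: one hour of 30-second readings each
times <- seq(as.POSIXct("2021-01-15 10:00:00", tz = "UTC"),
             by = "30 sec", length.out = 120)
data_131 <- data.frame(time = times, m_Pm = 3)
data_132 <- data.frame(time = times, m_Pm = 5)

mE_131 <- monthly_sum(data_131)
mE_132 <- monthly_sum(data_132)

# Stack the two summaries; .id records which data frame each row came from
combined <- bind_rows("131" = mE_131, "132" = mE_132, .id = "Column3")
print(combined)
```

The `combined` data frame has the columns Column3, month and monthlyP, so it can be passed to the same ggplot call used for the single-dataset approach.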
