
sort by first group element dplyr

This probably has a simple answer, but it is proving surprisingly challenging. My data looks like the following: state-wise percentages over three years.

State<-c('Assam','Bihar','Chandigarh','Delhi','Goa')
2012<-c(96, 95, 94, 92, 99)
2013<-c(97, 97, 94, 93, 100)
2014<-c(97, 98, 96, 95, 100)

df<-data.frame(State, 2013, 2013, 2014)

I'm trying to group it by State, arrange the years within each state, and then arrange the state groups by ascending 2012 percentages. I also need a separate df that arranges the states by their 2014 percentages.

Here's what I have:

library(reshape2)
library(dplyr)

dfmelt<-melt(df, id = 'State')
colnames(dfmelt)<-c('State','Year','Percent')

dfmelt<-dfmelt %>% arrange(Percent) %>% group_by(State) %>% arrange(Year)

Tried a million combinations of last line and can't crack it. Have looked at similar questions but with no summarizing or mutation here, just pure rearranging, I'm stuck.

Ultimately I'm creating 2 dot plots, one ranking states on Y axis by 2012 %'s and one by 2014 %'s. Figure I need the dataframe in exact order for ggplot to do this, right? Let me know if I'm mistaken.

Thanks!

If your goal is to order the axis in a ggplot you can do it with your df as-is.

You just need to use reorder() (a base R function from the stats package) inside aes():

df <- data.frame("State" = c('Assam','Bihar','Chandigarh','Delhi','Goa'),
                 "2012" = c(96, 95, 94, 92, 99),
                 "2013" = c(97, 97, 94, 93, 100),
                 "2014" = c(97, 98, 96, 95, 100))
# data.frame() makes names syntactic by default, so the columns become X2012, X2013, X2014

library(ggplot2)

ggplot(data=df, aes(x=reorder(State, X2012), y=X2014)) +
    geom_bar(stat="identity")
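Since the question asks for dot plots with the states on the Y axis, here is a minimal sketch of how the same reorder() idea could be applied with geom_point(). The names levels_2012, levels_2014, plot_2012 and plot_2014 are just illustrative, and the axis labels are my own assumption:

```r
library(ggplot2)

# Same data as above; data.frame() renames "2012" to X2012, and so on
df <- data.frame("State" = c('Assam','Bihar','Chandigarh','Delhi','Goa'),
                 "2012" = c(96, 95, 94, 92, 99),
                 "2013" = c(97, 97, 94, 93, 100),
                 "2014" = c(97, 98, 96, 95, 100))

# reorder() returns a factor whose levels are sorted by the second argument;
# ggplot draws a discrete axis in factor-level order
levels_2012 <- levels(reorder(df$State, df$X2012))
levels_2014 <- levels(reorder(df$State, df$X2014))

# Dot plot 1: states on the y axis, ranked by their 2012 percentage
plot_2012 <- ggplot(df, aes(x = X2012, y = reorder(State, X2012))) +
  geom_point() +
  labs(x = "Percent (2012)", y = "State")

# Dot plot 2: states on the y axis, ranked by their 2014 percentage
plot_2014 <- ggplot(df, aes(x = X2014, y = reorder(State, X2014))) +
  geom_point() +
  labs(x = "Percent (2014)", y = "State")
```

Printing plot_2012 or plot_2014 draws the chart. The row order of df does not matter here, because the axis order comes from the factor levels rather than from the order of the rows.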

Your data creation code does not run: a bare number such as 2012 is not a valid R variable name, and 2013 is repeated in the data.frame() call.

Here is the code to generate that data:

State <- c('Assam','Bihar','Chandigarh','Delhi','Goa')
p2012 <- c(96, 95, 94, 92, 99)
p2013 <- c(97, 97, 94, 93, 100)
p2014 <- c(97, 98, 96, 95, 100)
df <- data.frame(State, p2012, p2013, p2014)
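If you would rather keep the plain year numbers as column names, one option (a sketch, not something the rest of this answer relies on) is to quote the names with backticks and pass check.names = FALSE. The name df_years is only for illustration:

```r
State <- c('Assam','Bihar','Chandigarh','Delhi','Goa')

# check.names = FALSE stops data.frame() from rewriting "2012" as "X2012"
df_years <- data.frame(State,
                       `2012` = c(96, 95, 94, 92, 99),
                       `2013` = c(97, 97, 94, 93, 100),
                       `2014` = c(97, 98, 96, 95, 100),
                       check.names = FALSE)

names(df_years)

# Non-syntactic names need backticks whenever they are accessed
df_years$`2012`
```

The trade-off is that every later reference to those columns needs backticks, which is why the p2012-style names are used in the rest of this answer.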

You can then do the following to get a long-format data frame in which the state groups are sorted by their 2012 percentage:

library(dplyr)
library(tidyr)
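df %>%
  gather(Year, Percentage, -State) %>%                      # wide to long: one row per State/Year
  group_by(State) %>%
  mutate(Percentage2012 = Percentage[Year == 'p2012']) %>%  # copy each state's 2012 value onto all of its rows
  ungroup() %>%
  arrange(Percentage2012, State, Year) %>%                  # order the groups by 2012 value, then years within a state
  select(-Percentage2012)                                   # drop the helper column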

Resulting data frame as follows:

Source: local data frame [15 x 3]

        State   Year Percentage
       (fctr) (fctr)      (dbl)
1       Delhi  p2012         92
2       Delhi  p2013         93
3       Delhi  p2014         95
4  Chandigarh  p2012         94
5  Chandigarh  p2013         94
6  Chandigarh  p2014         96
7       Bihar  p2012         95
8       Bihar  p2013         97
9       Bihar  p2014         98
10      Assam  p2012         96
11      Assam  p2013         97
12      Assam  p2014         97
13        Goa  p2012         99
14        Goa  p2013        100
15        Goa  p2014        100

Hope this helps. To create the 2014 data frame, use the same pipeline but take the helper column from the rows where Year == 'p2014' instead of 'p2012'.
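For completeness, the 2014 variant could look roughly like this. It is the same pipeline with the helper column taken from the p2014 rows; the names df2014 and Percentage2014 are just my choice:

```r
library(dplyr)
library(tidyr)

State <- c('Assam','Bihar','Chandigarh','Delhi','Goa')
p2012 <- c(96, 95, 94, 92, 99)
p2013 <- c(97, 97, 94, 93, 100)
p2014 <- c(97, 98, 96, 95, 100)
df <- data.frame(State, p2012, p2013, p2014)

df2014 <- df %>%
  gather(Year, Percentage, -State) %>%                      # wide to long
  group_by(State) %>%
  mutate(Percentage2014 = Percentage[Year == 'p2014']) %>%  # each state's 2014 value on all of its rows
  ungroup() %>%
  arrange(Percentage2014, State, Year) %>%                  # groups ordered by 2014 value
  select(-Percentage2014)                                   # drop the helper column

# Order in which the state groups appear
unique(as.character(df2014$State))
```

With this data the groups come out as Delhi, Chandigarh, Assam, Bihar, Goa, so Assam and Bihar swap places compared with the 2012 ordering.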

