
Sort by first group element (dplyr)

Probably a simple answer, but it is proving surprisingly challenging. My data looks like the following, state-wise percentages over three years:

State<-c('Assam','Bihar','Chandigarh','Delhi','Goa')
2012<-c(96, 95, 94, 92, 99)
2013<-c(97, 97, 94, 93, 100)
2014<-c(97, 98, 96, 95, 100)

df<-data.frame(State, 2013, 2013, 2014)

I'm trying to group it by State, arrange the years, and then arrange the state groups by ascending 2012 percentages. I also need a separate df that arranges the states by their 2014 percentages.

Here's what I have:

library(reshape2)
library(dplyr)

dfmelt<-melt(df, id = 'State')
colnames(dfmelt)<-c('State','Year','Percent')

dfmelt<-dfmelt %>% arrange(Percent) %>% group_by(State) %>% arrange(Year)

I've tried a million combinations of the last line and can't crack it. I have looked at similar questions, but with no summarizing or mutating here, just pure rearranging, I'm stuck.

Ultimately I'm creating 2 dot plots, one ranking states on the Y axis by 2012 percentages and one by 2014 percentages. I figure I need the data frame in that exact order for ggplot to do this, right? Let me know if I'm mistaken.

Thanks!

If your goal is to order the axis in a ggplot, you can do it with your df as-is.

You just need to use 'reorder' (a base R function) inside the aes() call in ggplot:

df <- data.frame("State" = c('Assam','Bihar','Chandigarh','Delhi','Goa'),
                 "2012" = c(96, 95, 94, 92, 99),
                 "2013" = c(97, 97, 94, 93, 100),
                 "2014" = c(97, 98, 96, 95, 100))

library(ggplot2)

ggplot(data=df, aes(x=reorder(State, X2012), y=X2014)) +
    geom_bar(stat="identity")
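
To see why this works without sorting the data frame, it may help to look at what reorder() returns. The following is a minimal sketch using the same data; the names by_2012 and by_2014 are only illustrative and are not part of the original answer.

```r
# Same data as above; data.frame() renames the columns "2012", "2013",
# "2014" to X2012, X2013, X2014 because names cannot start with a digit.
df <- data.frame("State" = c('Assam','Bihar','Chandigarh','Delhi','Goa'),
                 "2012" = c(96, 95, 94, 92, 99),
                 "2013" = c(97, 97, 94, 93, 100),
                 "2014" = c(97, 98, 96, 95, 100))

# reorder() returns a factor whose levels are sorted by the second argument.
# ggplot lays out a discrete axis in level order, not in row order.
by_2012 <- reorder(df$State, df$X2012)  # illustrative name
by_2014 <- reorder(df$State, df$X2014)  # illustrative name

levels(by_2012)
# "Delhi" "Chandigarh" "Bihar" "Assam" "Goa"
levels(by_2014)
# "Delhi" "Chandigarh" "Assam" "Bihar" "Goa"
```

Mapping reorder(State, X2014) to an axis inside aes() orders that axis by the 2014 values in the same way, so the second plot only needs a different column. Using geom_point() in place of geom_bar() should give a dot plot rather than bars.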

Your data creation code does not run, and you have 2013 repeated.

Here is the code to generate that data:

State <- c('Assam','Bihar','Chandigarh','Delhi','Goa')
p2012 <- c(96, 95, 94, 92, 99)
p2013 <- c(97, 97, 94, 93, 100)
p2014 <- c(97, 98, 96, 95, 100)
df <- data.frame(State, p2012, p2013, p2014)

You can then do the following to get a long-format data frame in which the state groups are sorted by their 2012 percentage:

library(dplyr)
library(tidyr)
df %>%
  gather(Year, Percentage, -State) %>%
  group_by(State) %>%
  mutate(Percentage2012 = Percentage[Year == 'p2012']) %>%
  ungroup() %>%
  arrange(Percentage2012, State, Year) %>%
  select(-Percentage2012)

The resulting data frame is as follows:

Source: local data frame [15 x 3]

        State   Year Percentage
       (fctr) (fctr)      (dbl)
1       Delhi  p2012         92
2       Delhi  p2013         93
3       Delhi  p2014         95
4  Chandigarh  p2012         94
5  Chandigarh  p2013         94
6  Chandigarh  p2014         96
7       Bihar  p2012         95
8       Bihar  p2013         97
9       Bihar  p2014         98
10      Assam  p2012         96
11      Assam  p2013         97
12      Assam  p2014         97
13        Goa  p2012         99
14        Goa  p2013        100
15        Goa  p2014        100

Hope this helps. You can create the 2014 data frame by modifying the code above slightly: use 'p2014' instead of 'p2012' in the mutate step.
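
For the 2014 ordering, a minimal sketch of that modification could look like the following. It assumes the same df as above; the names df2014 and Percentage2014 are chosen here for illustration only.

```r
library(dplyr)
library(tidyr)

# Same data as in the answer above
State <- c('Assam','Bihar','Chandigarh','Delhi','Goa')
p2012 <- c(96, 95, 94, 92, 99)
p2013 <- c(97, 97, 94, 93, 100)
p2014 <- c(97, 98, 96, 95, 100)
df <- data.frame(State, p2012, p2013, p2014)

# Identical pipeline, except the helper column holds each state's 2014 value
df2014 <- df %>%
  gather(Year, Percentage, -State) %>%                          # wide to long
  group_by(State) %>%
  mutate(Percentage2014 = Percentage[Year == 'p2014']) %>%      # per-state sort key
  ungroup() %>%
  arrange(Percentage2014, State, Year) %>%                      # order groups, then years
  select(-Percentage2014)                                       # drop the helper column

df2014
```

With this data the state groups should come out as Delhi, Chandigarh, Assam, Bihar, Goa, each with its three years in order.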
