How to visualize “stepwise” change of composition over time

I have a data.frame containing the distribution of parliamentary seats among parties for each election year. Eventually, I would like to obtain a graph similar to this one. I want to visualize the composition of the parliament over all the years, not only in the election years.

results<-structure(list(party = c("PARTY1", "PARTY1", "PARTY1", "PARTY1", "PARTY2", "PARTY2", 
"PARTY2", "PARTY2", "PARTY2", "PARTY2", "PARTY3", "PARTY3", "PARTY3", "PARTY3", "PARTY3", 
"PARTY3", "PARTY3", "PART4", "PART4", "PART4", "PART4"), year = c(1996, 
1998, 2000, 2010, 1996, 2000, 2002, 2006, 2010, 2014, 1996, 1998, 
2000, 2002, 2006, 2010, 2014, 2002, 2006, 2010, 2014), party.seats = c(8, 
6, 5, 3, 19, 8, 10, 9, 7, 10, 9, 4, 6, 5, 3, 4, 5, 3, 7, 8, 6
)), class = "data.frame", row.names = c(NA, -21L), .Names = c("party", 
"year", "party.seats"))

I am able to produce a bar chart, but it only shows the data for the election years and omits the years between elections.

library(ggplot2)
ggplot(data=results,aes(x=as.factor(year), y=party.seats, fill=party, label=party))+geom_bar(stat="identity")

I am also able to produce a ggplot chart with geom_area, but it is misleading: it suggests that the distribution of seats changes gradually during the years following an election (there is a slope, not a “step”).

ggplot(as.data.frame(xtabs(party.seats~year+party, results)), aes(x=as.Date(as.character(year), "%Y"), y = Freq, fill = party)) +  geom_area(position = "stack")

Any help? I am particularly wondering whether there is a (time-series related?) command that carries the results of an election year forward to all subsequent years until new elections are held. In other words, a command that treats the election result at time x as ongoing (filling the years in between) until new elections are held at time y.

I think that geom_step is what you are looking for. The simplest implementation will not stack the bars/areas up to the total number of seats allotted, though that may actually be better:

ggplot(data=results
      , aes(x=year
            , y=party.seats
            , col=party)) +
  geom_step()

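
One possible refinement of the step plot above, offered only as a sketch: geom_step draws the result of the last election as a bare vertical jump, because there is no later point to draw a horizontal segment to. Repeating each party's most recent result one year later gives that final step some width. The toy data frame and the names `tail_rows` and `extended` are assumptions made for illustration; they are not part of the question.

```r
library(dplyr)
library(ggplot2)

# Toy data, invented for illustration (not the question's data)
toy <- data.frame(
  party = c("A", "A", "B", "B"),
  year = c(2000, 2004, 2000, 2004),
  party.seats = c(5, 7, 6, 4)
)

# Repeat each party's most recent result one year after the last election,
# so the final step gets a horizontal segment
tail_rows <- toy %>%
  group_by(party) %>%
  slice_max(year, n = 1) %>%
  ungroup() %>%
  mutate(year = max(toy$year) + 1)

extended <- bind_rows(toy, tail_rows)

# Default direction "hv": hold the value until the next point, then jump
p <- ggplot(extended, aes(x = year, y = party.seats, col = party)) +
  geom_step()
```

Printing `p` draws the plot. How far to extend past the last election is a presentation choice; one year is used here only as an example.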

If you really want the fills, you can get them, though as in @Haboryme's answer you will need to generate all of the points between elections. Here, I use dplyr / tidyr to add a new data row for each day between elections (the resolution just needs to be fine enough that the "step" appears instantaneous on the final plot rather than spread over a full year), plus some rows after the most recent election so that those values actually show up. I then fill each party's seats forward until the next election and set the missing values to 0 for good measure (these are the years before the party had any seats).

Note that you could extend this to use the exact dates of the elections instead of just the years without needing to modify too much.

library(dplyr)
library(tidyr)
library(ggplot2)

results %>%
  complete(year = full_seq(c(min(year), max(year) + 1), 1/365), party) %>%
  group_by(party) %>%
  fill(party.seats) %>%
  replace_na(replace = list(party.seats = 0)) %>%
  ggplot(
    aes(x=year
        , y=party.seats
        , fill=party)) +
  geom_area(position = "stack")

This gives a stacked area chart in which the seat counts change in steps at each election.
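
The earlier remark about using exact election dates could look roughly like the sketch below. The dates, parties and seat counts are hypothetical and exist only to show the shape of the pipeline; the column name `date` is likewise an assumption. The idea is the same as above: expand to one row per day and party, then fill forward within each party.

```r
library(dplyr)
library(tidyr)

# Hypothetical election dates and seat counts, invented for illustration
toy <- data.frame(
  party = c("A", "B", "A"),
  date = as.Date(c("2000-03-01", "2000-03-01", "2000-03-04")),
  party.seats = c(5, 3, 7)
)

daily <- toy %>%
  # one row per day and party, plus two days after the last election
  complete(date = seq(min(date), max(date) + 2, by = "day"), party) %>%
  group_by(party) %>%
  # carry each party's last known result forward
  fill(party.seats) %>%
  ungroup() %>%
  # days before a party's first result count as 0 seats
  replace_na(list(party.seats = 0))
```

The resulting `daily` data frame can be passed to the same geom_area call, with `x = date` instead of `x = year`.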

I still prefer the lines, though, because it is easier to compare the parties with each other when they are not stacked. For example, from 2010 to 2014 it is difficult to tell from the area version whether party 2 or party 4 has more seats, but it is clear from the lines.

Another option is to create the complete data frame with all the missing years:

library(tidyverse)                      
library(zoo)
all_years=seq(min(results$year),max(results$year)) #get the sequence of all the years considered
filled=data.frame(party=rep(unique(results$party),each=length(all_years)), #build a df with the seq of years for each party
                  year=rep(all_years,length(unique(results$party))))

Then merge it with your data and fill the NA values (with 0 at the start of a series, otherwise with the most recent value):

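# all.y = TRUE keeps every party/year pair; the fill is done per party
df <- merge(results, filled, by = c("party", "year"), all.y = TRUE) %>%
  arrange(party, year) %>%
  group_by(party) %>%
  mutate(party.seats = na.locf(party.seats, na.rm = FALSE),  # carry the last result forward
         party.seats = coalesce(party.seats, 0)) %>%         # no seats yet -> 0
  ungroup()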
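
With this kind of fill it is worth checking that values are carried forward within each party only, and that years before a party's first result end up as 0. A minimal sketch on a toy data frame (party names and numbers are invented for illustration) that applies zoo's na.locf inside a grouped mutate:

```r
library(dplyr)
library(zoo)

# Toy data, invented for illustration: B has no result in its first year
toy <- data.frame(
  party = c("A", "A", "A", "B", "B", "B"),
  year = c(2000, 2001, 2002, 2000, 2001, 2002),
  party.seats = c(4, NA, NA, NA, 6, NA)
)

filled_toy <- toy %>%
  arrange(party, year) %>%
  group_by(party) %>%
  mutate(
    # na.rm = FALSE keeps leading NAs in place instead of dropping them
    party.seats = na.locf(party.seats, na.rm = FALSE),
    # whatever is still NA lies before the party's first result
    party.seats = coalesce(party.seats, 0)
  ) %>%
  ungroup()
```

Here party A keeps 4 seats in all three years, while party B gets 0 in 2000 and 6 afterwards; nothing leaks from A into B.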

Plot with geom_bar and width=1 to get something that looks continuous:

ggplot(data=df,aes(x=as.factor(year), y=party.seats, fill=party, label=party))+
  geom_bar(stat="identity",width = 1)

This gives a stacked bar chart with no gaps between the bars (the x axis needs some tweaking).
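
As for the x axis, one way it might be tweaked (a sketch, not part of the original answer) is to keep year numeric instead of converting it to a factor and to label only the election years. The toy data and the `election_years` vector are assumptions made for illustration.

```r
library(ggplot2)

# Toy data, invented for illustration: already filled for every year
toy <- data.frame(
  year = rep(2000:2005, times = 2),
  party = rep(c("A", "B"), each = 6),
  party.seats = c(4, 4, 4, 6, 6, 6, 5, 5, 5, 3, 3, 3)
)

# Hypothetical election years to use as axis breaks
election_years <- c(2000, 2003)

p <- ggplot(toy, aes(x = year, y = party.seats, fill = party)) +
  geom_col(width = 1) +                        # bars touch, so the chart looks continuous
  scale_x_continuous(breaks = election_years)  # tick marks only at election years
```

Printing `p` draws the chart. geom_col() is equivalent to geom_bar(stat="identity").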

You can also try a streamgraph (you get plotly-like mouse-hover tooltips as well):

library(dplyr)
library(streamgraph)
results %>%
  streamgraph("party", "party.seats", "year") %>%
  sg_axis_x(1, "year", "%Y") %>%
  sg_legend(TRUE, "party")

Or, with a zero offset and step interpolation:

results %>%
  streamgraph("party", "party.seats", "year", offset="zero", interpolate="step") %>%
  sg_axis_x(1, "year", "%Y") %>%
  sg_fill_brewer("PuOr")
