
R not plotting all lines/points

I have the following code:

library(purrr)
library(dplyr)
library(ineq)
library(ggplot2)
taxations <- c(0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1)
N = 1000000
gamma <- 1/12  # scale parameter
beta <- 0.95   # shape parameter
u <- runif(N)
v <- runif(N)
tau <- -gamma*log(u)*(sin(beta*pi)/tan(beta*pi*v)-cos(beta*pi))^(1/beta)

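make_lorenz_2 <- function(i, tau) {
  N <- length(tau)  # sample size, taken from the input instead of the global N
  # i times tau sorted in decreasing order, added element-wise to (1 - i) * tau
  transfers <- i * sort(tau, decreasing = TRUE)
  newtau <- ((1 - i) * tau) + transfers
  OX <- sort(newtau)
  CumWealth <- cumsum(OX) / sum(newtau)
  PoorPopulation <- seq_len(N) / N
  # population shares at which the Lorenz curve is sampled
  index <- c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,0.95,0.99,0.999,0.9999,0.99999,0.999999,1)*N
  QQth <- CumWealth[index]
  x <- PoorPopulation[index]
  data.frame(x, QQth, Gini(newtau))
}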


Lorenzdf1 <- purrr::map(taxations, make_lorenz_2, tau = tau) %>%
  setNames(taxations) %>%
  bind_rows(.id = "taxations")

Lorenzdf1
cols <- c("0" = "black", "0.1"="blue","0.2"="green","0.3"="red", "0.4" = "grey", "0.5" = "pink", "0.6"="yellow","0.7"="brown","0.8"="orange", "0.9" = "purple", "1" = "darkgreen")
g <- ggplot(data=Lorenzdf1, aes(x=x, y=QQth, colour = taxations)) +
  geom_point() + 
  geom_line() +
  ggtitle("Lorenz curves after flat taxation") + 
  xlab("Cumulative share of people from lowest to highest wealth") +
  ylab("Cumulative share of wealth") +
  scale_color_manual(name="Rate of taxation",values=cols)

This produces the linked image. Everything works perfectly, but I'm not sure whether some lines are not being plotted or whether they are just so close together that some don't show. If someone could clarify, that would really help.


All the lines are there, but several of them lie almost exactly on top of one another, so they overplot. Consider the points where x == 0.3:

> Lorenzdf1[which(Lorenzdf1$x==0.3),]

    taxations   x       QQth Gini.newtau.
3           0 0.3 0.02526036    0.7225117
19        0.1 0.3 0.03738772    0.6916739
35        0.2 0.3 0.04451107    0.6708302
51        0.3 0.3 0.04888652    0.6566574
67        0.4 0.3 0.05129901    0.6483685
83        0.5 0.3 0.05207639    0.6456338
99        0.6 0.3 0.05131045    0.6483590
115       0.7 0.3 0.04888973    0.6566413
131       0.8 0.3 0.04451143    0.6708137
147       0.9 0.3 0.03738566    0.6916665
163         1 0.3 0.02526036    0.7225117

You can see that the values are symmetric around a taxation of 0.5, which gives the maximum: the other rates pair up around it (0.4 with 0.6, 0.3 with 0.7, and so on), and the two rates in each pair have nearly identical QQth. Those pairs overplot, so you should see only 6 distinct lines when you zoom in around x = 0.3, which is in fact what you see. Use coord_cartesian() for the zoom: it zooms without dropping the data outside the window, so the lines don't lose their connection to the rest of the data:

# p is your plot (g in the question's code)
p + coord_cartesian(xlim=c(0.25,0.35), ylim=c(0.02,0.055))
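The pairing can also be checked numerically. The following is only a sketch: it uses the QQth values at x == 0.3 copied from the output above, and the names rate, tenths, pair and spread are made up for this illustration.

```r
# QQth at x == 0.3 for each tax rate, copied from the output above
rate <- c(0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1)
QQth <- c(0.02526036, 0.03738772, 0.04451107, 0.04888652, 0.05129901,
          0.05207639, 0.05131045, 0.04888973, 0.04451143, 0.03738566,
          0.02526036)

# Work in whole tenths to avoid floating-point surprises such as 1 - 0.9 != 0.1
tenths <- round(rate * 10)

# Rates r and 1 - r get the same label: the distance to the nearer end, in tenths
pair <- pmin(tenths, 10 - tenths)

# Largest difference in QQth within each pair (0.5 has no partner, so its spread is 0)
spread <- tapply(QQth, pair, function(v) diff(range(v)))

n_distinct_lines <- length(spread)  # number of visually separate lines to expect
max_spread <- max(spread)           # largest gap between two paired rates
```

Here n_distinct_lines is 6, and max_spread is tiny compared with the 0.035-wide y window used in the zoomed plot, which is why the paired lines cannot be told apart.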


If you want to tell the rates apart at all, you need to plot your data differently. The overlap pattern (a maximum at a taxation of 0.5, dropping off symmetrically in each direction) is even more apparent when you plot x=taxations and set color=factor(x). Note that you also need to set group=factor(x) so that the lines are drawn properly across the points:

ggplot(Lorenzdf1, aes(x=taxations, y=QQth, color=factor(x), group=factor(x))) +
 geom_line() + geom_point()
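The group aesthetic is needed because bind_rows(.id = "taxations") stores the rates as a character column, so ggplot2 treats the x axis as discrete and by default puts each rate into its own group. One alternative, sketched here on the x == 0.3 slice copied from the output above, is to convert the rates to numeric first. The names at_03, rate and p_rate are made up for this illustration.

```r
library(ggplot2)

# The x == 0.3 slice from the output above; taxations is kept as character,
# as it is after bind_rows(.id = "taxations")
at_03 <- data.frame(
  taxations = c("0", "0.1", "0.2", "0.3", "0.4", "0.5",
                "0.6", "0.7", "0.8", "0.9", "1"),
  QQth = c(0.02526036, 0.03738772, 0.04451107, 0.04888652, 0.05129901,
           0.05207639, 0.05131045, 0.04888973, 0.04451143, 0.03738566,
           0.02526036),
  stringsAsFactors = FALSE
)

# Numeric rates give a continuous x axis, so no explicit group is needed here
at_03$rate <- as.numeric(at_03$taxations)

p_rate <- ggplot(at_03, aes(x = rate, y = QQth)) +
  geom_line() +
  geom_point()
```

On the full data you would presumably still map color to factor(x) to get one line per population share; with a numeric x axis that mapping alone should be enough to define the groups.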

