Riverplot package in R - output plot covered in gridlines or outlines

I've made a Sankey diagram with the R riverplot package (v0.5). The output looks OK when small in RStudio, but when exported or zoomed in, the colours have dark outlines or gridlines.

The riverplot image linked here shows the problem.

I think it may be because the outlines of the shapes do not match the transparency I want to use for the fill?

I possibly need to find a way to get rid of outlines altogether (rather than make them semi-transparent), as I think they're also the reason why flows with a value of zero still show up as thin lines.

My code is here:

#loading packages
library(readr)
library("riverplot", lib.loc="C:/Program Files/R/R-3.3.2/library")
library(RColorBrewer)

#loading data
Cambs_flows <- read_csv("~/RProjects/Cambs_flows4.csv")

#defining the edges (plain data.frame with columns N1, N2, Value plus an ID)
edges <- as.data.frame(Cambs_flows)
names(edges) <- c("N1", "N2", "Value")
edges$ID <- seq_len(nrow(edges))

#defining the nodes
nodes <- data.frame(ID = c("Cambridge","S Cambs","Rest of E","Rest of UK","Abroad","to Cambridge","to S Cambs","to Rest of E","to Rest of UK","to Abroad"))
nodes$x = c(1,1,1,1,1,2,2,2,2,2)
nodes$y = c(1,2,3,4,5,1,2,3,4,5)

#picking colours
palette = paste0(brewer.pal(5, "Set1"), "90")

#plot styles
styles = lapply(nodes$y, function(n) {
  list(col = palette[n], lty = 0, textcol = "black")
})

#matching nodes to names
names(styles) = nodes$ID

#defining the river
r <- makeRiver( nodes, edges,
                node_labels = c("Cambridge","S Cambs","Rest of E","Rest of UK","Abroad","to Cambridge","to S Cambs","to Rest of E","to Rest of UK","to Abroad"),
                node_styles = styles)

#Plotting
plot( r, plot_area = 0.9)

And my data is here:

dput(Cambs_flows)
structure(list(N1 = c("Cambridge", "Cambridge", "Cambridge", 
"Cambridge", "Cambridge", "S Cambs", "S Cambs", "S Cambs", "S Cambs", 
"S Cambs", "Rest of E", "Rest of E", "Rest of E", "Rest of E", 
"Rest of E", "Rest of UK", "Rest of UK", "Rest of UK", "Rest of UK", 
"Rest of UK", "Abroad", "Abroad", "Abroad", "Abroad", "Abroad"
), N2 = c("to Cambridge", "to S Cambs", "to Rest of E", "to Rest of UK", 
"to Abroad", "to Cambridge", "to S Cambs", "to Rest of E", "to Rest of UK", 
"to Abroad", "to Cambridge", "to S Cambs", "to Rest of E", "to Rest of UK", 
"to Abroad", "to Cambridge", "to S Cambs", "to Rest of E", "to Rest of UK", 
"to Abroad", "to Cambridge", "to S Cambs", "to Rest of E", "to Rest of UK", 
"to Abroad"), Value = c(0L, 1616L, 2779L, 13500L, 5670L, 2593L, 
0L, 2975L, 4742L, 1641L, 2555L, 3433L, 0L, 0L, 0L, 6981L, 3802L, 
0L, 0L, 0L, 5670L, 1641L, 0L, 0L, 0L)), class = c("tbl_df", "tbl", 
"data.frame"), row.names = c(NA, -25L), .Names = c("N1", "N2", 
"Value"), spec = structure(list(cols = structure(list(N1 = structure(list(), class = c("collector_character", 
"collector")), N2 = structure(list(), class = c("collector_character", 
"collector")), Value = structure(list(), class = c("collector_integer", 
"collector"))), .Names = c("N1", "N2", "Value")), default = structure(list(), class = c("collector_guess", 
"collector"))), .Names = c("cols", "default"), class = "col_spec"))

The culprit is a line in riverplot::curveseg. We can hack this function to fix it, or there is a very simple workaround that does not require hacking the function. In fact, the simple solution is probably preferable in many cases, but first I explain how to hack the function, so we understand why the workaround also works. Scroll to the end of this answer if you only want the simple solution.

UPDATE: The change suggested below has now been implemented in riverplot version 0.6.

To edit the function, you can use

trace(curveseg, edit = TRUE)

Then find the line near the end of the function that reads

polygon(c(xx[i], xx[i + 1], xx[i + 1], xx[i]), c(yy[i], 
      yy[i + 1], yy[i + 1] + w, yy[i] + w), col = grad[i], 
      border = grad[i])

We can see here that the package author chose not to pass the lty parameter to polygon (UPDATE: see this answer for an explanation of why the package author did it this way). Change this line by adding lty = 0 (or, if you prefer, by changing it to border = NA) and it works as intended for the OP's case. (But note that this may not work well if you wish to render a PDF - see here.)

polygon(c(xx[i], xx[i + 1], xx[i + 1], xx[i]), c(yy[i], 
      yy[i + 1], yy[i + 1] + w, yy[i] + w), col = grad[i], 
      border = grad[i], lty=0)


As a side note, this also explains the somewhat odd behaviour reported in the comments that "if you run it twice, the second time the plot looks OK, although export it and the lines come back". When lty is not specified in a call to polygon, the default value it uses is lty = par("lty"). Initially, the default par("lty") is a solid line, but after running the riverplot function once, par("lty") gets set to 0 during a call to riverplot:::draw.nodes, thus suppressing the lines when riverplot is run a second time. But if you then try to export the image, opening a new device resets par("lty") to its default value.
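The per-device behaviour of par("lty") described above can be checked with base R alone. This is only a sketch: it uses null PDF devices (pdf(file = NULL)) as stand-ins for the screen device and the export device, and it sets lty by hand rather than calling riverplot.

```r
# Stand-in for the first plotting device.
pdf(file = NULL)                # null PDF device: nothing is written to disk
lty_fresh <- par("lty")         # default on a newly opened device
par(lty = 0)                    # the setting the answer attributes to draw.nodes
lty_changed <- par("lty")       # no longer the default
invisible(dev.off())

# Stand-in for the device opened when exporting.
pdf(file = NULL)
lty_new_device <- par("lty")    # graphical parameters start from defaults again
invisible(dev.off())

print(c(fresh = lty_fresh, changed = lty_changed, new_device = lty_new_device))
```

The third value matches the first, which is consistent with the borders reappearing on export.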

An alternative way to apply this edit is to use assignInNamespace to overwrite the package function with your own version, like this:

curveseg.new = function (x0, x1, y0, y1, width = 1, nsteps = 50, col = "#ffcc0066", 
          grad = NULL, lty = 1, form = c("sin", "line")) 
{
  w <- width
  if (!is.null(grad)) {
    grad <- colorRampPaletteAlpha(grad)(nsteps)
  }
  else {
    grad <- rep(col, nsteps)
  }
  form <- match.arg(form, c("sin", "line"))
  if (form == "sin") {
    xx <- seq(-pi/2, pi/2, length.out = nsteps)
    yy <- y0 + (y1 - y0) * (sin(xx) + 1)/2
    xx <- seq(x0, x1, length.out = nsteps)
  }
  if (form == "line") {
    xx <- seq(x0, x1, length.out = nsteps)
    yy <- seq(y0, y1, length.out = nsteps)
  }
  for (i in 1:(nsteps - 1)) {
    polygon(c(xx[i], xx[i + 1], xx[i + 1], xx[i]), 
            c(yy[i], yy[i + 1], yy[i + 1] + w, yy[i] + w), 
            col = grad[i], border = grad[i], lty=0)
    lines(c(xx[i], xx[i + 1]), c(yy[i], yy[i + 1]), lty = lty)
    lines(c(xx[i], xx[i + 1]), c(yy[i] + w, yy[i + 1] + w), lty = lty)
  }
}

# replace the package's curveseg with the patched version for this session
assignInNamespace("curveseg", curveseg.new, ns = "riverplot")

Now for the simple solution, which does not require changes to the function:

Just add the line par(lty=0) before you plot!
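If you would rather not leave par(lty = 0) set for later plots, one option is to wrap the call and restore the old value afterwards. The helper name with_blank_lty below is hypothetical (it is not part of riverplot), and the two polygons and the null PDF device are only stand-ins so that the sketch runs without the package or a screen.

```r
# Hypothetical helper: run a plotting function with par(lty = 0),
# then restore whatever line type was set before.
with_blank_lty <- function(plot_fun) {
  old <- par(lty = 0)            # polygon() falls back to par("lty") for borders
  on.exit(par(old), add = TRUE)  # restore even if plot_fun() fails
  plot_fun()
  invisible(NULL)
}

pdf(file = NULL)                 # null device so the sketch runs headless
with_blank_lty(function() {
  cc <- "#E41A1C90"              # semi-transparent fill, as in the question
  plot.new()
  # Two adjacent polygons with a border colour but no lty argument,
  # mimicking the polygon() call inside curveseg.
  polygon(c(0.2, 0.4, 0.4, 0.2), c(0.2, 0.2, 0.4, 0.4), col = cc, border = cc)
  polygon(c(0.4, 0.6, 0.6, 0.4), c(0.2, 0.2, 0.4, 0.4), col = cc, border = cc)
})
lty_after <- par("lty")          # back to the device default
invisible(dev.off())
```

With the objects from the question, the call would be with_blank_lty(function() plot(r, plot_area = 0.9)) on your own screen or PNG device.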

Here is the author of the package. I am now struggling to find a satisfactory solution to include in the next version of the package.

The problem is with how R renders PDFs as compared to bitmaps. In the original version of the package, I did indeed pass lty=0 to polygon() (you can still see it in the commented-out source code). However, polygons without borders look good only in PNG graphics. In the PDF output, thin white lines appear between the polygons. Take a look:

cc <- "#E41A1C90"
plot.new()
rect(0.2, 0.2, 0.4, 0.4, col=cc, border=NA)
rect(0.4, 0.2, 0.6, 0.4, col=cc, border=NA)
dev.copy2pdf(file="riverplot.pdf")

In X or on PNG, the output is correct. However, if rendered as PDF, you will see a thin white line between the rectangles.


When you render a riverplot graphic as PDF with the same settings as above, this looks really bad.


I therefore forced the borders to be drawn, but forgot to check what happens with transparency. When no transparency is used, this looks OK: the borders overlap with the polygons as well as with each other, but you cannot see it, and the PDF is acceptable. However, it messes up the figure if you use transparency.
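To see why overlapping borders become visible once the fill is semi-transparent, here is a small worked calculation. It assumes the device stacks layers of the same colour with ordinary "over" alpha compositing, which the thread does not state explicitly; the "90" suffix is the alpha value from the question's palette.

```r
# Alpha channel used in the question: hex "90" appended to each colour.
alpha_hex <- "90"
a <- strtoi(alpha_hex, base = 16L) / 255   # opacity of a single layer

# Where a border is painted over a fill of the same colour, or over a
# neighbouring border, the layers stack: combined opacity is 1 - (1 - a)^n.
overlap2 <- 1 - (1 - a)^2                  # fill plus one border
overlap3 <- 1 - (1 - a)^3                  # fill plus two overlapping borders

print(round(c(single = a, two_layers = overlap2, three_layers = overlap3), 3))
```

The stacked regions come out noticeably more opaque than the fill, which is the dark outline effect in the question. With a fully opaque colour (a = 1) every term equals 1, so the overlaps are invisible.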

EDIT:

I have now uploaded version 0.6 of riverplot to CRAN. Besides some new features (you can now add a riverplot to any part of an existing drawing), it uses lty=0 again by default. However, there is now an option called "fix.pdf" which you can set to TRUE in order to draw the borders around the segments again.

Bottom line, and solutions for now:

  1. Use riverplot 0.6.
  2. If you want to render a PDF, don't use transparency, and use fix.pdf=TRUE.
  3. If you want to use both transparency and PDF, help me solve the issue.
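Point 2 might look like the following in practice. This is a hedged sketch: it only runs the plotting part if riverplot 0.6 or later is installed, it uses the package's bundled riverplot.example() object rather than the question's data, and it assumes fix.pdf is passed through plot() as the option described above.

```r
drawn <- FALSE
if (requireNamespace("riverplot", quietly = TRUE) &&
    utils::packageVersion("riverplot") >= "0.6") {
  x <- riverplot::riverplot.example()   # example object shipped with the package
  out <- tempfile(fileext = ".pdf")
  pdf(out)
  plot(x, fix.pdf = TRUE)               # redraw segment borders for PDF output
  invisible(dev.off())
  drawn <- TRUE
}
```

For PNG or on-screen output, the default (fix.pdf left unset) is the setting that avoids outlines on transparent fills.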
