Replicating a Data Visualization with R/ggplot
Reproducing a visualization seen in print media using ggplot2
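I have been trying to make my data visualizations more engaging, especially for non-data people, who make up most of the people I work with (stakeholders such as marketers and managers). I have noticed that when a visualization looks academic or publication-quality (the standard ggplot2 aesthetic), they tend to assume they cannot understand it and do not engage with it, which defeats the whole purpose of the visualization. When it looks more graphic, like something you might see on a website or in marketing material, they usually focus and try to understand it. We often end up having the most interesting discussions around these kinds of visualizations, so that is my ultimate goal.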
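Here is one I saw in a marketing brochure showing web traffic by device type across geographic regions. It is actually a bit busy and unclear, but it resonated better than a similar stacked bar chart I created with the standard ggplot2 look. I have not the faintest idea how to replicate something like this in ggplot2, so any attempt would be much appreciated! Here is some sample tidy data to work with, as a data.table: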
structure(list(country = c("Argentina", "Argentina", "Argentina",
"Brazil", "Brazil", "Brazil", "Canada",
"Canada", "Canada", "China", "China",
"China", "Japan", "Japan", "Japan", "Spain",
"Spain", "Spain", "UK", "UK", "UK", "USA",
"USA", "USA"),
device_type = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L),
class = "factor",
.Label = c("desktop",
"mobile",
"multi")),
proportion = c(0.37, 0.22, 0.41, 0.3, 0.31, 0.39,
0.35, 0.06, 0.59, 0.19, 0.2, 0.61,
0.4, 0.18, 0.42, 0.16, 0.28, 0.56,
0.27, 0.06, 0.67, 0.37, 0.08, 0.55)),
.Names = c("country", "device_type", "proportion"),
row.names = c(NA, -24L),
class = c("data.table", "data.frame"))
You could try the "ggalluvial" package and its corresponding geoms.
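A minimal sketch of what the ggalluvial suggestion might look like with the sample data from the question. The answer names only the package, so the choice of geoms (`geom_alluvium`, `geom_stratum`), the axis labels, and the styling here are assumptions, not the answerer's code. The data frame is rebuilt compactly with the same values as the `dput` output above.

```r
library(ggplot2)
library(ggalluvial)

# Same values as the sample data in the question, rebuilt as a plain data.frame
dat <- data.frame(
  country = rep(c("Argentina", "Brazil", "Canada", "China",
                  "Japan", "Spain", "UK", "USA"), each = 3),
  device_type = factor(rep(c("desktop", "mobile", "multi"), times = 8)),
  proportion = c(0.37, 0.22, 0.41, 0.30, 0.31, 0.39,
                 0.35, 0.06, 0.59, 0.19, 0.20, 0.61,
                 0.40, 0.18, 0.42, 0.16, 0.28, 0.56,
                 0.27, 0.06, 0.67, 0.37, 0.08, 0.55)
)

# One axis per categorical variable; flow width is the proportion
p <- ggplot(dat, aes(axis1 = country, axis2 = device_type, y = proportion)) +
  geom_alluvium(aes(fill = country), alpha = 0.7, width = 1/6) +   # the flows
  geom_stratum(width = 1/6, fill = "grey85", colour = "white") +   # the boxes
  geom_text(stat = "stratum", aes(label = after_stat(stratum)), size = 3) +
  scale_x_discrete(limits = c("country", "device type"),
                   expand = c(0.05, 0.05)) +
  labs(y = "proportion of traffic", fill = "country") +
  theme_minimal()

print(p)
```

Because each country's proportions sum to 1, every country box on the left has the same height, and the device boxes on the right show the combined share across countries.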
You could also consider using googleVis:
library(googleVis)
dat <- structure(list(country = c("Argentina", "Argentina", "Argentina",
"Brazil", "Brazil", "Brazil", "Canada",
"Canada", "Canada", "China", "China",
"China", "Japan", "Japan", "Japan", "Spain",
"Spain", "Spain", "UK", "UK", "UK", "USA",
"USA", "USA"),
device_type = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L),
class = "factor",
.Label = c("desktop",
"mobile",
"multi")),
proportion = c(0.37, 0.22, 0.41, 0.3, 0.31, 0.39,
0.35, 0.06, 0.59, 0.19, 0.2, 0.61,
0.4, 0.18, 0.42, 0.16, 0.28, 0.56,
0.27, 0.06, 0.67, 0.37, 0.08, 0.55)),
.Names = c("country", "device_type", "proportion"),
row.names = c(NA, -24L),
class = c("data.table", "data.frame"))
# Order in which link sources (countries) and nodes first appear in the data
link_order <- unique(dat$country)
node_order <- unique(as.vector(rbind(dat$country, as.character(dat$device_type))))
# Colour lookup tables for the links (by source country) and the nodes
link_cols <- data.frame(color = c('#ffd1ab', '#ff8d14', '#ff717e', '#dd2c40', '#d6b0ea',
                                  '#8c4fab', '#00addb', '#297cbe'),
                        country = c("UK", "Canada", "USA", "China", "Spain", "Japan", "Argentina", "Brazil"),
                        stringsAsFactors = FALSE)
node_cols <- data.frame(color = c("#ffc796", "#ff7100", "#ff485b", "#d20000",
                                  "#cc98e6", "#6f2296", "#009bd2", "#005daf",
                                  "grey", "grey", "grey"),
                        type = c("UK", "Canada", "USA", "China", "Spain", "Japan",
                                 "Argentina", "Brazil", "multi", "desktop", "mobile"),
                        stringsAsFactors = FALSE)
# Reorder the colours to match the order of appearance computed above
link_cols2 <- sapply(link_order, function(x) link_cols[x == link_cols$country, "color"])
node_cols2 <- sapply(node_order, function(x) node_cols[x == node_cols$type, "color"])
# Format the colours as JavaScript array literals, e.g. ['#00addb','#297cbe']
actual_link_cols <- paste0("[", paste0("'", link_cols2, "'", collapse = ','), "]")
actual_node_cols <- paste0("[", paste0("'", node_cols2, "'", collapse = ','), "]")
# Assemble the sankey options string; links take the colour of their source node
opts <- paste0("{
link: { colorMode: 'source',
colors: ", actual_link_cols, " },
node: {colors: ", actual_node_cols, "}}")
# Build the chart and display it
Sankey <- gvisSankey(dat,
                     from = "country",
                     to = "device_type",
                     weight = "proportion",
                     options = list(height = 500, width = 1000, sankey = opts))
plot(Sankey)
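The `node_order` line in the code above is the least obvious step, so here is a small base R sketch of what it does on a reduced, illustrative subset of the data. `rbind()` stacks the two columns into a two-row matrix, and `as.vector()` reads that matrix column by column, which interleaves source and target names row by row. `unique()` then keeps each node at its first appearance. The answer presumably does this because the chart assigns node colours in that order; that reading is an inference from the code, not something the answer states.

```r
# Illustrative subset: two countries, three device types each
country <- c("Argentina", "Argentina", "Argentina", "Brazil", "Brazil", "Brazil")
device  <- c("desktop", "mobile", "multi", "desktop", "mobile", "multi")

# rbind() gives a 2 x 6 matrix; as.vector() flattens it column by column,
# producing country1, device1, country2, device2, ...
interleaved <- as.vector(rbind(country, device))

# Keep each node name at its first appearance
node_order <- unique(interleaved)
print(node_order)
# [1] "Argentina" "desktop"   "mobile"    "multi"     "Brazil"
```

The colour vector passed to the node options then has to follow this same order, which is what the `sapply()` lookup over `node_order` takes care of.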