Achieving a smooth color ramp
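I have a heatmap in Excel that I am trying to recreate in R. It is basically RFM segmentation data. In Excel the color range is wide and smooth, but I am struggling to get an equally smooth color gradient in R. I have tried several approaches and none of them gives the same result.
My Excel heatmap looks like this:
My heatmap in R looks like this:
My R code is: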
library(ggplot2)
library(RColorBrewer)
library(dplyr)  # for desc()
cols <- brewer.pal(9, 'RdYlGn')
ggplot(xxx) +
  geom_tile(aes(x = mon, y = reorder(freq, desc(freq)), fill = n)) +
  facet_grid(rec ~ .) +
  # geom_text(aes(label = n)) +
  # scale_fill_gradient2(midpoint = (max(xxx$n) / 2), low = "red", mid =
  #                      "yellow", high = "darkgreen") +
  # scale_fill_gradient(low = "red", high = "blue") +
  scale_fill_gradientn(colours = cols) +
  # scale_fill_brewer() +
  labs(x = "monetary", y = "frequency") +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0)) +
  coord_fixed(ratio = 0.5) +
  theme(legend.position = "none")
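How can I apply colorRampPalette to get the same smooth color gradient as in Excel, or is there any other method that gives a smoother gradient? The gradient I get in R is not very good.
I cannot post my dataset here because it has 30,000 records. I dumped the head of my dataset using dput(head(df)):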
structure(list(rfm_score = c(111, 112, 113, 114, 115, 121), n = c(2624L,
160L, 270L, 23L, 5L, 650L), rec = structure(c(1L, 1L, 1L, 1L,
1L, 1L), .Label = c("1", "2", "3", "4", "5"), class = "factor"),
freq = structure(c(1L, 1L, 1L, 1L, 1L, 2L), .Label = c("1",
"2", "3", "4", "5"), class = "factor"), mon = structure(c(1L,
2L, 3L, 4L, 5L, 1L), .Label = c("1", "2", "3", "4", "5"), class =
"factor")), row.names = c(NA,
6L), class = "data.frame")
You can use the tableHTML package.
This is the data I used:
df <- structure(list(rfm_score = c(111, 112, 113, 114, 115, 121), n = c(2624L,
160L, 270L, 23L, 5L, 650L), rec = structure(c(1L, 1L, 1L, 1L,
1L, 1L), .Label = c("1", "2", "3", "4", "5"), class = "factor"),
freq = structure(c(1L, 1L, 1L, 1L, 1L, 2L), .Label = c("1",
"2", "3", "4", "5"), class = "factor"), mon = structure(c(1L,
2L, 3L, 4L, 5L, 1L), .Label = c("1", "2", "3", "4", "5"), class =
"factor")), row.names = c(NA,
6L), class = "data.frame")
Load the package:
library(tableHTML)
Reshape the data.frame to reflect the structure you have:
df <- data.table::dcast(df,
rec + freq ~ mon,
value.var = "rfm_score",
fill = "")
  rec freq   1   2   3   4   5
1   1    1 111 112 113 114 115
2   1    2 121
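You can then create a tableHTML object and apply CSS to it to adjust the styling. The steps are:
- create the tableHTML object with a second header "Mon."
- add a background color and border to the caption
- add a background color to the second header
- apply the "Blues" colour rank to the rec and freq columns
- set the missing values ("") to white
- apply the RAG (red, amber, green) colour rank to the Mon. columns
- add a background color to the headers below Mon.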
df %>%
tableHTML(rownames = FALSE,
second_headers = list(c(2, 5),
c("", "Mon.")),
caption = "<br>RFM Segmentation <br> Count of Cust in each Segment",
widths = c(rep(80, 2), rep(100, 5))) %>%
add_css_caption(css = list(c("background-color", "border"),
c("#F9E9DC", "1px solid black"))) %>%
add_css_second_header(css = list("background-color",
"lightgray"),
second_headers = 2) %>%
add_css_conditional_column(conditional = "colour_rank",
colour_rank_css = make_css_colour_rank_theme(list(rec = df$rec),
RColorBrewer::brewer.pal(5, "Blues")),
columns = 1) %>%
add_css_conditional_column(conditional = "colour_rank",
colour_rank_css = make_css_colour_rank_theme(list(freq = df$freq),
RColorBrewer::brewer.pal(5, "Blues")),
columns = 2) %>%
add_css_conditional_column(conditional = "==",
value = "",
css = list(c("background-color", "color"),
c("white", "white")),
columns = 3:7) %>%
add_css_conditional_column(conditional = "colour_rank",
colour_rank_theme = "RAG",
columns = 3:7,
decreasing = TRUE) %>%
add_css_header(css = list("background-color",
"#EFF3FF"),
header = 3) %>%
add_css_header(css = list("background-color",
"#BDD7E7"),
header = 4) %>%
add_css_header(css = list("background-color",
"#6BAED6"),
header = 5) %>%
add_css_header(css = list("background-color",
"#3182BD"),
header = 6) %>%
add_css_header(css = list("background-color",
"#08519C"),
header = 7)
The result looks like this:
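The main issue is that scale_fill_gradientn() produces a linear color scale. Looking at the example done in Excel, a value of 1 is shown as red, 200 as yellow, and 2000 as green. I don't know how Excel scales this (I would guess percentiles?), but it is definitely not linear.
If the linear values matter and transforming the data is not appropriate, then the color scale in Excel is misleading. It makes the values look widely spread, when in fact most of them are similar and very low, as the ggplot2 color scale shows.
If log-transforming the values is reasonable or appropriate, then do that. It will give you a scale similar to the one Excel produces, but one that is clearer to the viewer.
Here is an example: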
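As a rough illustration of the percentile guess above, here is a sketch that colours each tile by the rank of its value instead of by the value itself. The data frame demo and the column pct are made up for illustration, and this is not a claim about how Excel actually scales its colours; it is only one scale that would behave in a similar way.

```r
library(ggplot2)
library(RColorBrewer)

# Hypothetical skewed counts on a 5 x 5 grid (not the asker's data)
set.seed(123)
demo <- data.frame(monetary  = rep(1:5, times = 5),
                   frequency = rep(1:5, each = 5),
                   val       = 10^rnorm(25, mean = 2, sd = 1))

# Percentile-style position of each value: 0 for the smallest, 1 for the largest
demo$pct <- (rank(demo$val) - 1) / (nrow(demo) - 1)

pal <- brewer.pal(9, "RdYlGn")

p_pct <- ggplot(demo, aes(monetary, frequency)) +
  geom_tile(aes(fill = pct)) +
  geom_text(aes(label = round(val))) +  # print the raw value on each tile
  scale_fill_gradientn(colours = pal, limits = c(0, 1))
p_pct
```

With a rank-based fill, the tiles are spread evenly across the palette no matter how far apart the underlying values are, which is why such a scale looks smooth while hiding the real spread.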
library(ggplot2)
library(RColorBrewer)
set.seed(123) ; rn = rnorm(25, mean = 5, sd = 2)
df = data.frame(monetary = rep(seq(5),5),
frequency = sort(rep(seq(5),5)),
val = 10^rn)
pal = brewer.pal(9, "RdYlGn")
# mostly red, a few green (very high) values
ggplot(df, aes(monetary, frequency)) +
geom_tile(aes(fill = val)) +
scale_fill_gradientn(colors = pal)
# log transforming evens out scale
ggplot(df, aes(monetary, frequency)) +
geom_tile(aes(fill = log10(val))) +
scale_fill_gradientn(colors = pal)
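A possible variation on the example above, sketched under a few assumptions: instead of wrapping the fill in log10(), the log transformation is applied inside the scale, so the legend keeps the original units. It also shows how colorRampPalette could be applied, since the question asked about it. The names smooth_pal and p_log are only illustrative. The trans argument is the long-standing spelling; recent ggplot2 releases call it transform and may print a deprecation message for trans.

```r
library(ggplot2)
library(RColorBrewer)

# Same simulated data as in the example above
set.seed(123)
rn <- rnorm(25, mean = 5, sd = 2)
df <- data.frame(monetary  = rep(seq(5), 5),
                 frequency = sort(rep(seq(5), 5)),
                 val       = 10^rn)

# Interpolate the 9 ColorBrewer colours to 100 steps
smooth_pal <- colorRampPalette(brewer.pal(9, "RdYlGn"))(100)

# Log-transform inside the scale; legend labels stay in the original units
p_log <- ggplot(df, aes(monetary, frequency)) +
  geom_tile(aes(fill = val)) +
  scale_fill_gradientn(colours = smooth_pal, trans = "log10")
p_log
```

Note that scale_fill_gradientn() already interpolates between the colours it is given, so the 100-step palette by itself changes very little. It is the log transformation that evens out the gradient.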