
Make a histogram clearer by assigning names to different segments of each bar in R

Assume that I have a data frame with two columns and 19 rows (see below). The left column holds the names of the cell lines, and the right column holds the expression of the gene ZEB1 in the corresponding cell line.

    CellLines   ZEB1
    600MPE  2.8186
    AU565   2.783
    BT20    2.7817
    BT474   2.6433
    BT483   2.4994
    BT549   3.035
    CAMA1   2.718
    DU4475  2.8005
    HBL100  2.6745
    HCC38   3.2884
    HCC70   2.597
    HCC202  2.8557
    HCC1007 2.7794
    HCC1008 2.4513
    HCC1143 2.8159
    HCC1187 2.6372
    HCC1428 2.7327
    HCC1500 2.7564
    HCC1569 2.8093
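To make the snippets in this thread reproducible, the table above can be read into a data frame. This is only a sketch: the name `Heiser` is taken from the `hist()` call in the question, and how the original data was actually loaded is not stated.

```r
# Rebuild the question's table as a data frame.
# The name Heiser is borrowed from the question's code; the loading
# method (read.table on inline text) is an assumption for illustration.
Heiser <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
CellLines ZEB1
600MPE  2.8186
AU565   2.783
BT20    2.7817
BT474   2.6433
BT483   2.4994
BT549   3.035
CAMA1   2.718
DU4475  2.8005
HBL100  2.6745
HCC38   3.2884
HCC70   2.597
HCC202  2.8557
HCC1007 2.7794
HCC1008 2.4513
HCC1143 2.8159
HCC1187 2.6372
HCC1428 2.7327
HCC1500 2.7564
HCC1569 2.8093
")

# Inspect the structure: one character column and one numeric column
str(Heiser)
```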

I've drawn a histogram for this data using the simple code below:

hist(Heiser$ZEB1[1:19], breaks=50, col="grey")

It gives me a histogram whose x axis is the level of gene expression and whose y axis is the frequency of that expression level among the cell lines. However, I would like to add the names of the cell lines at their specific positions on the histogram. How can I do that?

Thanks in advance for your time on answering this :-) Best.

One alternative is to use text() to insert labels into the plot:

hist(Heiser$ZEB1[1:19], breaks=50, col="grey")
text(Heiser$ZEB1, 2, labels= Heiser$CellLines, srt=90)


Edit:

Positioning labels that fall in the same bin one above another:

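library(dplyr)
# Keep the hist() result so its break points can be reused
Heiser_hist <- hist(Heiser$ZEB1[1:19], breaks=50, col="grey")
# Assign each cell line to its histogram bin (include.lowest matches hist's default)
Heiser$cut <- cut(Heiser$ZEB1, breaks=Heiser_hist$breaks, include.lowest=TRUE)
# Within each bin, spread the label heights evenly between y = 1 and y = 2
Heiser <- Heiser %>% group_by(cut) %>% mutate(pos = seq(from=1, to=2, length.out=n()))
with(Heiser, text(ZEB1, pos, labels=CellLines, srt=45, cex=0.9))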


You could try the text without the inclination by changing srt, but the overplotting is worse in that case. You could also adjust the x axis to reduce overplotting.
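Another way to limit overplotting, sketched here in base R only, is to use coarser bins and write each name inside its own unit of bar height, so that every segment of a bar carries one cell line. The choice of `breaks = 10`, the label size, and the variable names `bin` and `rank_in_bin` are assumptions made for illustration, not part of the original answer.

```r
# Data from the question, rebuilt so the example runs on its own
Heiser <- data.frame(
  CellLines = c("600MPE", "AU565", "BT20", "BT474", "BT483", "BT549", "CAMA1",
                "DU4475", "HBL100", "HCC38", "HCC70", "HCC202", "HCC1007",
                "HCC1008", "HCC1143", "HCC1187", "HCC1428", "HCC1500", "HCC1569"),
  ZEB1 = c(2.8186, 2.783, 2.7817, 2.6433, 2.4994, 3.035, 2.718, 2.8005, 2.6745,
           3.2884, 2.597, 2.8557, 2.7794, 2.4513, 2.8159, 2.6372, 2.7327,
           2.7564, 2.8093),
  stringsAsFactors = FALSE
)

# Coarser bins than breaks = 50, so each bar is wide enough for a label
h <- hist(Heiser$ZEB1, breaks = 10, col = "grey",
          main = "ZEB1 expression", xlab = "ZEB1")

# Bin index of every observation, using hist()'s own break points and
# the same convention (right-closed intervals, lowest value included)
bin <- cut(Heiser$ZEB1, breaks = h$breaks, include.lowest = TRUE, labels = FALSE)

# Position of each observation within its bin: 1, 2, 3, ...
rank_in_bin <- ave(seq_along(bin), bin, FUN = seq_along)

# One label per unit of bar height, centred horizontally on the bar
text(h$mids[bin], rank_in_bin - 0.5, labels = Heiser$CellLines, cex = 0.6)
```

Long names may still spill past narrow bars, so `cex` or the number of breaks may need tuning for a given plot size.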

You are going to have a problem with overlapping labels (not sure what you want to do there), but

hist(Heiser$ZEB1[1:19], breaks=50, col="grey", xaxt="n")
axis(1,Heiser$ZEB1, Heiser$CellLines )

I think this gives you what you're after, based on the description.

Are you sure you don't want a bar plot instead? With a histogram, one bar does not represent one observation. The histogram is an attempt to estimate the underlying probability density function of a continuous variable.
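If a bar plot is what is wanted, a minimal sketch could look like the following, with one bar per cell line. Sorting by expression and the graphical settings (`las`, `cex.names`) are assumptions added for readability, not something the answer specifies.

```r
# Data from the question, rebuilt so the example runs on its own
Heiser <- data.frame(
  CellLines = c("600MPE", "AU565", "BT20", "BT474", "BT483", "BT549", "CAMA1",
                "DU4475", "HBL100", "HCC38", "HCC70", "HCC202", "HCC1007",
                "HCC1008", "HCC1143", "HCC1187", "HCC1428", "HCC1500", "HCC1569"),
  ZEB1 = c(2.8186, 2.783, 2.7817, 2.6433, 2.4994, 3.035, 2.718, 2.8005, 2.6745,
           3.2884, 2.597, 2.8557, 2.7794, 2.4513, 2.8159, 2.6372, 2.7327,
           2.7564, 2.8093),
  stringsAsFactors = FALSE
)

# Order the cell lines by increasing ZEB1 expression (optional)
ord <- order(Heiser$ZEB1)

# One bar per cell line; las = 2 turns the names perpendicular to the axis
bp <- barplot(Heiser$ZEB1[ord],
              names.arg = Heiser$CellLines[ord],
              las = 2, cex.names = 0.7, col = "grey",
              ylab = "ZEB1 expression")
```

Here each bar is exactly one observation, so every name sits unambiguously under its own value.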
