geom_bar ggplot2 stacked, grouped bar plot with positive and negative values - pyramid plot
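I don't even know how to describe the plot I'm trying to generate, which is not a good start. I'll first show you my data, and then try to explain it and show images that contain elements of it.
My data: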
strain condition count.up count.down
1 phbA balanced 120 -102
2 phbA limited 114 -319
3 phbB balanced 122 -148
4 phbB limited 97 -201
5 phbAB balanced 268 -243
6 phbAB limited 140 -189
7 phbC balanced 55 -65
8 phbC limited 104 -187
9 phaZ balanced 99 -28
10 phaZ limited 147 -205
11 bdhA balanced 246 -159
12 bdhA limited 143 -383
13 acsA2 balanced 491 -389
14 acsA2 limited 131 -295
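For readers who want to follow along without countdata.csv, the table above can be typed in directly. This is only a sketch: it assumes the CSV holds exactly these four columns, and the name df is chosen to match the code later in the post.

```r
# Recreate the data shown above (assumed to match countdata.csv)
df <- data.frame(
  strain     = rep(c("phbA", "phbB", "phbAB", "phbC", "phaZ", "bdhA", "acsA2"), each = 2),
  condition  = rep(c("balanced", "limited"), times = 7),
  count.up   = c(120, 114, 122, 97, 268, 140, 55, 104, 99, 147, 246, 143, 491, 131),
  count.down = c(-102, -319, -148, -201, -243, -189, -65, -187, -28, -205, -159, -383, -389, -295)
)

# Inspect the structure: 14 rows, 2 per strain
str(df)
```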
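I have seven samples, each with two conditions. For each of these samples I have the number of genes that are down-regulated and the number that are up-regulated (count.down and count.up).
I want to plot this so that each sample is grouped, so that phbA balanced is dodged beside phbA limited. Each bar would have a portion on the positive side of the plot (representing count.up) and a portion on the negative side of the plot (representing count.down).
I would like the bars for the "balanced" condition to be one color and the bars for the "limited" condition to be another. Ideally, there would be two shades of each color (one for count.up and one for count.down), just to make a visual difference between the two parts of the bar.
Some images that contain elements I'm trying to bring together:
I also tried to apply some pieces of this stackoverflow example, but I can't figure out how to make it work for my data set. I like the pos vs. neg bars here: a single bar that covers both, and the color distinction between them. What it lacks is the grouping of conditions within one sample, and the extra layer of color coding to distinguish the conditions.
I've tried a lot of things, but I just can't get it right. I think I'm really struggling because many geom_bar examples use data where the plot computes the counts itself, whereas I'm giving it the counts directly. I can't seem to make that distinction successfully in my code: when I switch to stat = "identity", everything gets messed up. Any thoughts or suggestions would be very much appreciated!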
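The difference between letting geom_bar count rows and giving it pre-computed counts can be seen on a small slice of the data. A minimal sketch, assuming ggplot2 is installed; small is a made-up name holding only the first two strains.

```r
library(ggplot2)

small <- data.frame(
  strain    = c("phbA", "phbA", "phbB", "phbB"),
  condition = c("balanced", "limited", "balanced", "limited"),
  count.up  = c(120, 114, 122, 97)
)

# Default geom_bar() uses stat = "count": the bar height is the number of
# rows per strain (2 here) and the count.up column is ignored.
p_count <- ggplot(small, aes(strain)) +
  geom_bar()

# stat = "identity" plots the y values as given, so the bar heights are
# the supplied counts (120, 114, 122, 97).
p_identity <- ggplot(small, aes(strain, count.up, fill = condition)) +
  geom_bar(stat = "identity", position = "dodge")
```

With data that is already summarised, as here, stat = "identity" is the variant that matches the numbers in the table.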
Using the suggested link: I've been playing around with it as a template, but I've gotten stuck.
df <- read.csv("countdata.csv", header=T)
df.m <- melt(df, id.vars = c("strain", "condition"))
ggplot(df.m, aes(condition)) +
  geom_bar(subset = .(variable == "count.up"), aes(y = value, fill = strain), stat = "identity") +
  geom_bar(subset = .(variable == "count.down"), aes(y = -value, fill = strain), stat = "identity") +
  xlab("") +
  scale_y_continuous("Export - Import", formatter = "comma")
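When I tried to run the ggplot line, it returned an error: could not find function ".". I realized I didn't have dplyr installed/loaded, so I did that. Then I played around a lot and finally arrived at this: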
library(ggplot2)
library(reshape2)
library(dplyr)
library(plyr)
df <- read.csv("countdata.csv", header=T)
df.m <- melt(df, id.vars = c("strain", "condition"))
#this is what df.m looks like now (compared with my initial input df, I just changed the numbers in Excel to all be positive). Included so you can see what the melt does
df.m =read.table(text = "
strain condition variable value
1 phbA balanced count.up 120
2 phbA limited count.up 114
3 phbB balanced count.up 122
4 phbB limited count.up 97
5 phbAB balanced count.up 268
6 phbAB limited count.up 140
7 phbC balanced count.up 55
8 phbC limited count.up 104
9 phaZ balanced count.up 99
10 phaZ limited count.up 147
11 bdhA balanced count.up 246
12 bdhA limited count.up 143
13 acsA2 balanced count.up 491
14 acsA2 limited count.up 131
15 phbA balanced count.down 102
16 phbA limited count.down 319
17 phbB balanced count.down 148
18 phbB limited count.down 201
19 phbAB balanced count.down 243
20 phbAB limited count.down 189
21 phbC balanced count.down 65
22 phbC limited count.down 187
23 phaZ balanced count.down 28
24 phaZ limited count.down 205
25 bdhA balanced count.down 159
26 bdhA limited count.down 383
27 acsA2 balanced count.down 389
28 acsA2 limited count.down 295", header = TRUE)
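Rather than editing the numbers in Excel, the sign of count.down could also be flipped in R before melting. This is a sketch under the assumption that df has the four columns shown at the top of the post; only two strains are included to keep it short.

```r
library(reshape2)

df <- data.frame(
  strain     = c("phbA", "phbA", "phbB", "phbB"),
  condition  = c("balanced", "limited", "balanced", "limited"),
  count.up   = c(120, 114, 122, 97),
  count.down = c(-102, -319, -148, -201)
)

# Make the down-regulated counts positive, as was done by hand in Excel
df$count.down <- abs(df$count.down)

# melt() stacks count.up and count.down into a single 'value' column and
# records which column each row came from in 'variable'
df.m <- melt(df, id.vars = c("strain", "condition"))
df.m
```

The result has one row per strain, condition and direction, which is the long layout printed above.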
Strain under the two conditions, with the count.up and count.down values:
ggplot(df.m, aes(strain)) +
  geom_bar(subset = .(variable == "count.up"), aes(y = value, fill = condition), stat = "identity") +
  geom_bar(subset = .(variable == "count.down"), aes(y = -value, fill = condition), stat = "identity") +
  xlab("")
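#this appears to be left over from the template: it rewrites date-style labels such as "2010M05" onto two lines; the strain names do not match the pattern, so they are unchanged
labels <- gsub("20([0-9]{2})M([0-9]{2})", "\\2\n\\1", df.m$strain)
#this adds a horizontal line at zero to improve readability
last_plot() + geom_hline(yintercept = 0, colour = "grey90")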
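A note for readers on newer ggplot2 releases: layers there no longer accept a subset argument, so the call above may not behave as intended. One way to get the same stacked plot is to hand each layer its own subsetted data frame. This is a sketch rather than the original poster's code; it uses a two-strain slice of df.m, and up and down are made-up names.

```r
library(ggplot2)

df.m <- data.frame(
  strain    = rep(c("phbA", "phbB"), each = 2, times = 2),
  condition = rep(c("balanced", "limited"), times = 4),
  variable  = rep(c("count.up", "count.down"), each = 4),
  value     = c(120, 114, 122, 97, 102, 319, 148, 201)
)

# Split once, then give each layer the rows it should draw
up   <- subset(df.m, variable == "count.up")
down <- subset(df.m, variable == "count.down")

p <- ggplot(df.m, aes(strain, fill = condition)) +
  geom_bar(data = up,   aes(y = value),  stat = "identity") +   # above zero
  geom_bar(data = down, aes(y = -value), stat = "identity") +   # below zero
  geom_hline(yintercept = 0, colour = "grey90") +
  xlab("")
p
```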
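The one thing I could not get working (unfortunately) is how to display the number representing "value" inside each bar. I've gotten the numbers to display, but I can't get them in the right place. I'm going a bit crazy!
My data is the same as above; this is where my code stands.
I've looked at a lot of examples that show labels on dodged plots using geom_text, but I haven't been able to implement any of them successfully. The closest I've gotten is below. Any suggestions would be greatly appreciated!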
library(ggplot2)
library(reshape2)
library(plyr)
library(dplyr)
df <- read.csv("countdata.csv", header=T)
df.m <- melt(df, id.vars = c("strain", "condition"))
ggplot(df.m, aes(strain), ylim(-500:500)) +
  geom_bar(subset = .(variable == "count.up"),
           aes(y = value, fill = condition), stat = "identity", position = "dodge") +
  geom_bar(subset = .(variable == "count.down"),
           aes(y = -value, fill = condition), stat = "identity", position = "dodge") +
  geom_hline(yintercept = 0, colour = "grey90")
last_plot() +
  geom_text(aes(strain, value, group = condition, label = label, ymax = 500, ymin = -500),
            position = position_dodge(width = 0.9), size = 4)
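This gives the following:
Why won't you align!
I suspect my problem has to do with how I'm actually plotting, or with not telling the geom_text command properly how to position itself. Any ideas?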
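Try this. Just as you positioned the bars with two statements (one for the positive values, one for the negative), position the text the same way. Then use vjust to fine-tune the position of the labels (inside the bar or outside it). Also, there is no 'label' variable in the data frame; I assume the label is value.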
library(ggplot2)
## Using your df.m data frame
ggplot(df.m, aes(strain)) +
  geom_bar(data = subset(df.m, variable == "count.up"),
           aes(y = value, fill = condition), stat = "identity", position = "dodge") +
  geom_bar(data = subset(df.m, variable == "count.down"),
           aes(y = -value, fill = condition), stat = "identity", position = "dodge") +
  geom_hline(yintercept = 0, colour = "grey90")

## One geom_text per direction, nudged with vjust;
## the y range is set by coord_cartesian() at the end
last_plot() +
  geom_text(data = subset(df.m, variable == "count.up"),
            aes(strain, value, group = condition, label = value),
            position = position_dodge(width = 0.9), vjust = 1.5, size = 4) +
  geom_text(data = subset(df.m, variable == "count.down"),
            aes(strain, -value, group = condition, label = value),
            position = position_dodge(width = 0.9), vjust = -.5, size = 4) +
  coord_cartesian(ylim = c(-500, 500))
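
The question also asked for two shades per condition (one for count.up, one for count.down), which the code above does not cover. One possible approach is to fill by the combination of condition and variable and supply four colours by hand. This is only a sketch on a two-strain slice of df.m: the column name shade, the vector shades and the hex colours are arbitrary choices, not taken from the original post.

```r
library(ggplot2)

df.m <- data.frame(
  strain    = rep(c("phbA", "phbB"), each = 2, times = 2),
  condition = rep(c("balanced", "limited"), times = 4),
  variable  = rep(c("count.up", "count.down"), each = 4),
  value     = c(120, 114, 122, 97, 102, 319, 148, 201)
)

# One fill level per condition/direction pair, e.g. "balanced.count.up"
df.m$shade <- interaction(df.m$condition, df.m$variable)

# Dark shade for up, light shade for down (colours picked arbitrarily)
shades <- c(
  "balanced.count.up"   = "#1f78b4",
  "balanced.count.down" = "#a6cee3",
  "limited.count.up"    = "#e31a1c",
  "limited.count.down"  = "#fb9a99"
)

p <- ggplot(df.m, aes(strain, fill = shade, group = condition)) +
  geom_bar(data = subset(df.m, variable == "count.up"),
           aes(y = value), stat = "identity", position = "dodge") +
  geom_bar(data = subset(df.m, variable == "count.down"),
           aes(y = -value), stat = "identity", position = "dodge") +
  geom_hline(yintercept = 0, colour = "grey90") +
  scale_fill_manual(values = shades, name = "condition / direction")
p
```

Grouping by condition keeps the dodge positions the same in both layers, so the light and dark parts of each bar line up.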