I have a data frame df, read with df <- read.table("WT1.txt", header = TRUE). I want to plot a histogram showing the A, C, G and T frequency for each length value. Is there a better way to plot this?
df
length A C G T
17 95668 73186 162726 730847
18 187013 88641 120631 334695
19 146061 373719 152215 303973
20 249897 73862 115441 343179
21 219899 82356 109536 636704
22 226368 101499 111974 1591106
23 188187 112155 98002 1437280
You could melt the data frame into long format, keeping length as the id variable, and draw a stacked bar plot with ggplot2:
df <- read.table(text=
"length A C G T
17 95668 73186 162726 730847
18 187013 88641 120631 334695
19 146061 373719 152215 303973
20 249897 73862 115441 343179
21 219899 82356 109536 636704
22 226368 101499 111974 1591106
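23 188187 112155 98002 1437280", header = TRUE)

library(reshape2)
library(ggplot2)

# Long format: one row per (length, base) pair
df_long <- melt(df, id.vars = "length")

# Stacked bars: one bar per length, coloured by base
ggplot(df_long) +
  geom_bar(aes(x = length, y = value, fill = variable), stat = "identity")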
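To make the "long format" step more concrete, here is a minimal base R sketch of roughly what melt(df, id.vars = "length") returns for this data. The names bases and long are illustrative only and do not come from the original answer; the column names length, variable and value mirror melt's defaults.

```r
# Same example data as in the question
df <- read.table(text =
"length A C G T
17 95668 73186 162726 730847
18 187013 88641 120631 334695
19 146061 373719 152215 303973
20 249897 73862 115441 343179
21 219899 82356 109536 636704
22 226368 101499 111974 1591106
23 188187 112155 98002 1437280", header = TRUE)

bases <- c("A", "C", "G", "T")  # illustrative helper, not part of the answer

# Stack the four count columns on top of each other:
# one row per (length, base) pair, 7 lengths x 4 bases = 28 rows
long <- data.frame(
  length   = rep(df$length, times = length(bases)),
  variable = factor(rep(bases, each = nrow(df)), levels = bases),
  value    = unlist(df[bases], use.names = FALSE)
)

head(long)
```

Each bar segment in the stacked plot corresponds to one row of this table, which is why ggplot2 can map variable to fill directly.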
Use dplyr to calculate the frequency of each base within each length, and ggplot2 to draw the bar plot. I prefer stat = "identity", position = "dodge" over stat = "identity" alone, as side-by-side bars give a better sense of what the data look like.
library(tidyverse)

# df is the original wide data frame (length, A, C, G, T)
gather(df, Base, value, -length) %>%
  group_by(length) %>%
  # share of each base within its length group
  mutate(frequency = value / sum(value)) %>%
  ggplot(aes(factor(length), y = frequency, fill = Base)) +
  geom_bar(stat = "identity", position = "dodge",
           color = "black", width = 0.6) +
  labs(x = "Base pairs",
       y = "Frequency",
       fill = "Base") +
  scale_y_continuous(limits = c(0, 1)) +
  scale_fill_brewer(palette = "Set1") +
  theme_classic()
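As a quick sanity check on the frequency step, the same per-length proportions can be computed in base R with prop.table. This is only a sketch for verifying the numbers, not part of the answer above; the names counts and freq are illustrative.

```r
# Same example data as in the question
df <- read.table(text =
"length A C G T
17 95668 73186 162726 730847
18 187013 88641 120631 334695
19 146061 373719 152215 303973
20 249897 73862 115441 343179
21 219899 82356 109536 636704
22 226368 101499 111974 1591106
23 188187 112155 98002 1437280", header = TRUE)

# Count matrix: rows are lengths, columns are bases
counts <- as.matrix(df[, c("A", "C", "G", "T")])
rownames(counts) <- df$length

# margin = 1 divides each row by its own total,
# matching value / sum(value) within group_by(length)
freq <- prop.table(counts, margin = 1)

round(freq, 3)
rowSums(freq)  # each row should sum to 1
```

Because every row sums to 1, fixing the y axis at c(0, 1) in the plot is safe: no bar can exceed the limit and be dropped.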