How to get all dots in boxplot using ggplot?
I want to show a boxplot with a dot for each of my data values.
Here is a subsample of my data:
value value1 value2 value3 value4 value5 value6 value7 value8 value9 value10 value11 value12 value13 value14 value15 value16 value17 value18 value19 value20 value21 value22 value23 value24 value25 value26 value27 value28 value29 value30 value31 value32 value33 value34 value35 value36 value37 value38 value39 value40 value41 value42 value43 value44 value45 value46 value47 value48 value49 value50 value51 value52 value53 value54 value55 value56 value57 value58 value59 value60 value61 value62 value63 value64 value65 value66 value67 value68 value69 value70 value71 value72 value73 value74 value75 value76 value77 value78 value79 value80 value81 value82 value83 value84 value85 value86 value87 value88 value89 value90 value91 value92 value93
1 DLBCL 1994.95631 2621.3410 753.2132 0.000000 11197.10111 0.000000 176.337991 2000.983371 862.402989 8491.35251 0.000000 0.000000 0.000000 0.000000 0.000000 1293.604484 431.201495 11022.058175 6899.22391 1557.191604 0.00000 0.0000000 491.33939 0.00000 935.4880 473.089640 117093.3704 267.06673 0.000000 1201.315893 546.473181 817.685797 5550.213652 5864.340327 0.000000 756.0793 1186.963254 0.000000 0.000000 182.35834 0.000000 0.000000 2.221214e+04 546.4731813 0.000000 22467.36115 25197.16560 4527.61569 47851.49797 0.0000000 809.029514 1780.444881 466.4264055 2854.851275 2178.702289 0.000000 1155.2188880 0.000000 0.000000 0.000000 0.0000000 325.947587 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0000000 0.000000 5219.72808 0.000000 1092.946363 1914.235537 0.00000 41395.343 5012.19294 0.0000 0.00000 0.000000 0.00000 211214.036 771.94114 5792.9344 155407.942 586.647915 904.81625 5221.03431 26527.2485 118750.28 103149.05
2 HL 2685.55082 3282.5779 4598.1600 4183.367213 1465.89302 0.000000 66.245848 0.000000 161.991801 61.34601 161.991801 0.000000 485.975403 404.979503 80.995901 80.995901 161.991801 6164.020846 4211.78683 17549.958130 2601.72383 1143.4715367 1292.08891 2101.51526 8785.9960 157.980575 25628.0113 2257.43413 426.060627 3572.830049 410.593080 11519.416962 23630.893343 47042.419019 2594.830952 5964.8488 3901.738003 0.000000 0.000000 376.79150 0.000000 833.100691 1.251683e+05 3797.9859885 4500.351000 231.24480 901.51959 8990.54496 21686.09505 0.0000000 50.655417 0.000000 5081.5230881 766.069601 8594.091339 4754.510950 578.6497823 0.000000 0.000000 540.128957 5906.6921396 1897.982677 0.000000 0.000000 0.00000 517.142472 0.000000 90.021493 0.000000 0.000000 395.929041 51.1553056 0.000000 5501.47987 569.641498 1180.455105 1258.479657 0.00000 31700.549 8406.06103 650.9810 198.52612 1888.006678 183.67574 130532.228 108.74974 3400.4110 58514.733 4600.624542 1019.75167 0.00000 20734.9505 163994.61 181005.92
3 HL 3937.68099 5174.0505 14309.5447 17201.448539 6027.55676 0.000000 1566.266081 246.848582 9575.025066 966.94533 5745.015039 5106.680035 5745.015039 8298.355057 5745.015039 8936.690061 3830.010026 2595.831304 0.00000 3842.016327 932.01765 0.0000000 0.00000 0.00000 12463.7614 2256.666225 105760.7753 165061.07726 2014.690206 296.397390 808.979015 0.000000 684.694530 0.000000 1120.551505 47009.4381 0.000000 0.000000 0.000000 809.86996 0.000000 6565.731474 1.992851e+03 2831.4265541 0.000000 911.22915 0.00000 0.00000 0.00000 0.0000000 0.000000 0.000000 345.2403404 1811.236269 0.000000 1561.277973 0.0000000 0.000000 736.098023 3192.598806 0.0000000 0.000000 0.000000 0.000000 0.00000 9897.983156 0.000000 3015.232206 0.000000 1210.472305 3120.347631 2015.7947507 0.000000 89720.16482 0.000000 0.000000 0.000000 984.42025 23569.292 794.98586 570.0480 0.00000 0.000000 482.52095 42461.843 571.37679 3573.1872 25446.846 1519.791401 0.00000 0.00000 57004.8004 153509.90 112514.3
Here is my code:
data2=read.table("/../data.txt",sep="\t",header=TRUE )
data2 %>%
ggplot( aes(x=name, y=value, value1, value2, value3, value4, value5, value6, value7, value8, value9, value10, value11, value12, value13, value14, value15, value16, value17, value18, value19, value20, value21, value22, value23, value24, value25, value26, value27, value28, value29, value30, value31, value32, value33, value34, value35, value36, value37, value38, value39, value40, value41, value42, value43, value44, value45, value46, value47, value48, value49, value50, value51, value52, value53, value54, value55, value56, value57, value58, value59, value60, value61, value62, value63, value64, value65, value66, value67, value68, value69, value70, value71, value72, value73, value74, value75, value76, value77, value78, value79, value80, value81, value82, value83, value84, value85, value86, value87, value88, value89, value90, value91, value92, value93, fill=name)) +
geom_boxplot() +
scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
theme_ipsum() +
theme(
legend.position="none",
plot.title = element_text(size=11)
) +
ggtitle("Distribution of ... ") +
xlab("")
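I get a plot, but not all of my data appears in it. I suspect only the first column (value) is taken into account.
What am I missing? Does anyone know a trick to get all my points?
Thanks a lot!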
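You can try reshaping the data to long format, so that all measurements end up in a single column that can be mapped to y: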
library(ggplot2)
library(dplyr)
library(tidyr)
#Code
data2 %>%
rename(key=value) %>%
pivot_longer(-key) %>%
ggplot(aes(x=key,y=value,fill=name))+
geom_boxplot() +
#scale_fill_viridis(discrete = TRUE, alpha=0.6) +
geom_jitter(color="black", size=0.4, alpha=0.9) +
#theme_ipsum() +
theme(
legend.position="none",
plot.title = element_text(size=11)
) +
ggtitle("Distribution of total EBV gene expression for each PTCL subtype ") +
xlab("")
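
To see what the reshape does, here is a minimal sketch on a small made-up data frame. The `toy` data, its three numeric columns, and the choice to fill by `key` instead of `name` are assumptions for illustration, not part of the original data or answer. It assumes the label column is called `value`, as in the question's header. Filling by `key` gives one box per subtype, which may or may not be the plot you want.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Hypothetical toy data with the same layout as the question:
# one label column called `value`, then numeric columns value1, value2, ...
toy <- data.frame(
  value  = c("DLBCL", "HL", "HL"),
  value1 = c(1994.9, 2685.6, 3937.7),
  value2 = c(2621.3, 3282.6, 5174.1),
  value3 = c(753.2, 4598.2, 14309.5)
)

# Wide to long: every numeric cell becomes its own row.
toy_long <- toy %>%
  rename(key = value) %>%   # free up the name `value` for the measurements
  pivot_longer(-key)        # default output columns are `name` and `value`

# 3 rows x 3 numeric columns: one row per point to draw
nrow(toy_long)

# One box per subtype, with every measurement jittered on top
p <- ggplot(toy_long, aes(x = key, y = value, fill = key)) +
  geom_boxplot(outlier.shape = NA) +   # hide outlier dots so they are not drawn twice
  geom_jitter(color = "black", size = 0.4, alpha = 0.9, width = 0.2) +
  theme(legend.position = "none") +
  xlab("")

print(p)
```

After the reshape, `name` holds the original column names (value1, value2, ...) and `value` holds the numbers, which is why a single `y = value` mapping now reaches every point.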