简体   繁体   English

ggplot:使用一个公共 y 轴由多个变量(而不是变量中的多个类别)分隔的多面板/分面散点图

[英]ggplot: Multi-panel/facet scatter plots separated by multiple variables (AND NOT by multiple categories within a variable) using one common y-axis

My dataframe loopsubset_created contains 30 observations of 45 variables.我的数据框 loopsubset_created包含 45 个变量的 30 个观察值。 (Below you will find str(loopsubset_created) and a dput(loopsubset_created) sample). (您将在下面找到str(loopsubset_created)dput(loopsubset_created)示例)。

Now I would like to create scatter plots of the PdKeyT -Variable (y) versus five of the band-value-variables ( BLUE , GREEN , RED , SWIR1 , SWIR2 ) (x) with现在我想创建PdKeyT -Variable (y) 与五个波段值变量( BLUEGREENREDSWIR1SWIR2 )(x)的SWIR2

  • each variable in ONE panel一个面板中的每个变量
  • all panels aligned in ONE row所有面板排成一排
  • using the PdKeyT variable as common y-axis.使用PdKeyT变量作为公共 y 轴。

In the end it should basically look like this:最后它应该基本上是这样的:
(I did this with ggscatter, but for flexibility reasons I would prefer basically using ggplot) (我用 ggscatter 做到了这一点,但出于灵活性的原因,我更喜欢基本上使用 ggplot)

在此处输入图片说明

Here now my issue :现在我的问题是
When trying with ggplot, I do not find the right way for the above showed arrangement, as I cannot figure out the right code for separating/grouping by variables.尝试使用 ggplot 时,我没有找到上面显示的排列的正确方法,因为我无法找出按变量分隔/分组的正确代码。 I found hundreds of tutorials for facetting by multiple categorial values within a variable, but not by multiple variables.我找到了数百个通过变量中的多个类别值进行分面的教程,但没有通过多个变量进行分面。

With the following code使用以下代码

ggplot(loopsubset_created, aes(y = PdKeyT)) + 
      geom_point(aes(x = BLUE, col = "BLUE")) + 
      geom_point(aes(x = GREEN, col = "GREEN")) +   
      geom_point(aes(x = RED, col = "RED")) +   
      geom_point(aes(x = SWIR1, col = "SWIR1")) +   
      geom_point(aes(x = SWIR2, col = "SWIR2"))

I came to this basic result我得出了这个基本结果

在此处输入图片说明

Here the basic question:这里的基本问题:
Now, I would like to arrange the 5 layers seperately in one row according to the above depicted way Anybody an idea for me?现在,我想按照上面描述的方式将 5 层单独排成一行 有人对我有想法吗?

Plus some information around the question:加上一些关于这个问题的信息:
Though the following aspects are not directly part of my question, I'd like to describe my final idea of the plot (in order to avoid that your advices may clash with further requirements):虽然以下方面不是我的问题的直接部分,但我想描述一下我对情节的最终想法(以避免您的建议可能与进一步的要求发生冲突):

Each panel should include每个面板应包括

  • Spearman corr value and according p-value (as shown above) and Spearman corr 值和根据 p 值(如上所示)和
  • additionaly Pearson corr value and according p-value此外,Pearson corr 值和相应的 p 值
  • Linear regression with conf.使用 conf 进行线性回归。 interval (as shown above) or other type of regression line (not shown)间隔(如上所示)或其他类型的回归线(未显示)
  • Points should be couloured by variable (BLUE=bLue, RED= red; GREEN=green, SWIR1+2 by some other coulours, eg magenta and violet)点应由变量着色(蓝色 = 蓝色,红色 = 红色;绿色 = 绿色,SWIR1+2 由一些其他颜色,例如品红色和紫色)
  • later on points and regressionlines should be subdived by ranges of PdKeyT (eg below -10, -10-to 30, and above 30) with using differnt brightness values of variable basic colours (blue, green, ...), analogouos to this:稍后点和regressionlines应的范围内进行subdived PdKeyT (例如低于-10,-10到30,和上述30)使用的可变基色(蓝色,绿色,...),analogouos此不同的充亮度值:
    在此处输入图片说明
  • All panels should use ONE common y-axis at the left as explained所有面板都应使用左侧的一个公共 y 轴,如解释
  • And I would like to adpat the x-axes by the range of the respecitve variable (eg range for BLUE, GREEN and RED from 500 to 3000 and the SWIRs from 0 to 1500我想通过相应变量的范围来调整 x 轴(例如,蓝色、绿色和红色的范围从 500 到 3000 以及 SWIR 从 0 到 1500

Finally some information and sample of my data最后一些信息和我的数据样本

    > str(loopsubset_created)
'data.frame':   30 obs. of  45 variables:
 $ Site_ID                      : chr  "A" "A" "A" "A" ...
 $ Spot_Nr                      : chr  "1" "1" "1" "1" ...
 $ Transkt_Nr                   : chr  "2" "2" "2" "2" ...
 $ Point_Nr                     : chr  "4" "4" "4" "4" ...
 $ n                            : int  30 30 30 30 30 30 30 30 30 30 ...
 $ rank                         : int  3 3 3 3 3 3 3 3 3 3 ...
 $ Tile                         : chr  "1008" "1008" "1008" "1008" ...
 $ Date                         : int  20190208 20190213 20190215 20190218 20190223 20190228 20190302 20190305 20190315 20190320 ...
 $ id                           : chr  "22" "22" "22" "22" ...
 $ Point_ID                     : chr  "1022" "1022" "1022" "1022" ...
 $ Site_Nr                      : chr  "1" "1" "1" "1" ...
 $ Point_x                      : num  356251 356251 356251 356251 356251 ...
 $ Point_y                      : num  5132881 5132881 5132881 5132881 5132881 ...
 $ Classification               : num  7 7 7 7 7 7 7 7 7 7 ...
 $ Class_Derived                : chr  "WW" "WW" "WW" "WW" ...
 $ BLUE                         : num  1112 1095 944 1144 1141 ...
 $ GREEN                        : num  1158 1178 1009 1288 1265 ...
 $ RED                          : num  599 708 613 788 835 ...
 $ REDEDGE1                     : num  359 520 433 576 665 761 618 598 881 619 ...
 $ REDEDGE2                     : num  83 82 65 169 247 404 116 118 532 162 ...
 $ REDEDGE3                     : num  73 116 81 142 233 391 56 171 538 131 ...
 $ BROADNIR                     : num  44 93 60 123 262 349 74 113 560 125 ...
 $ NIR                          : num  37 70 66 135 215 313 110 135 504 78 ...
 $ SWIR1                        : num  187 282 184 225 356 251 240 216 507 197 ...
 $ SWIR2                        : num  142 187 155 197 281 209 192 146 341 143 ...
 $ Quality.assurance.information: num  26664 10272 10272 10272 8224 ...
 $ Q00_VAL                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Q01_CS1                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Q02_CSS                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Q03_CSH                      : num  1 0 0 0 0 0 0 0 1 0 ...
 $ Q04_SNO                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Q05_WAT                      : num  1 1 1 1 1 1 1 1 1 1 ...
 $ Q06_AR1                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Q07_AR2                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Q08_SBZ                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Q09_SAT                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Q10_ZEN                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Q11_IL1                      : num  1 1 1 1 0 0 0 0 0 0 ...
 $ Q12_IL2                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ Q13_SLO                      : num  1 1 1 1 1 1 1 1 1 1 ...
 $ Q14_VAP                      : num  1 0 0 0 0 0 0 0 1 0 ...
 $ Q15_WDC                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ PdMax                        : int  -7 -19 -20 -22 -24 -25 -26 -25 -21 -15 ...
 $ PdMin                        : int  -13 -23 -24 -26 -28 -29 -29 -28 -24 -20 ...
 $ PdKeyT                       : int  -10 -20 -22 -22 -27 -26 -26 -27 -22 -17 ...

loopsubset_created <- structure(list(Site_ID = c("A", "A", "A", "A", "A", "A", "A", 
    "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", 
    "A", "A", "A", "A", "A", "A", "A", "A", "A", "A"), Spot_Nr = c("1", 
    "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
    "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
    "1", "1", "1"), Transkt_Nr = c("2", "2", "2", "2", "2", "2", 
    "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", 
    "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2"), Point_Nr = c("4", 
    "4", "4", "4", "4", "4", "4", "4", "4", "4", "4", "4", "4", "4", 
    "4", "4", "4", "4", "4", "4", "4", "4", "4", "4", "4", "4", "4", 
    "4", "4", "4"), n = c(30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 
    30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 
    30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L), rank = c(3L, 3L, 
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
    3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), Tile = c("1008", 
    "1008", "1008", "1008", "1008", "1008", "1008", "1008", "1008", 
    "1008", "1008", "1008", "1008", "1008", "1008", "1008", "1008", 
    "1008", "1008", "1008", "1008", "1008", "1008", "1008", "1008", 
    "1008", "1008", "1008", "1008", "1008"), Date = c(20190208L, 
    20190213L, 20190215L, 20190218L, 20190223L, 20190228L, 20190302L, 
    20190305L, 20190315L, 20190320L, 20190322L, 20190325L, 20190330L, 
    20190401L, 20190416L, 20190419L, 20190421L, 20190501L, 20190506L, 
    20190524L, 20190531L, 20190603L, 20190620L, 20190625L, 20190630L, 
    20190705L, 20190710L, 20190809L, 20190814L, 20190903L), id = c("22", 
    "22", "22", "22", "22", "22", "22", "22", "22", "22", "22", "22", 
    "22", "22", "22", "22", "22", "22", "22", "22", "22", "22", "22", 
    "22", "22", "22", "22", "22", "22", "22"), Point_ID = c("1022", 
    "1022", "1022", "1022", "1022", "1022", "1022", "1022", "1022", 
    "1022", "1022", "1022", "1022", "1022", "1022", "1022", "1022", 
    "1022", "1022", "1022", "1022", "1022", "1022", "1022", "1022", 
    "1022", "1022", "1022", "1022", "1022"), Site_Nr = c("1", "1", 
    "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
    "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", 
    "1", "1"), Point_x = c(356250.781, 356250.781, 356250.781, 356250.781, 
    356250.781, 356250.781, 356250.781, 356250.781, 356250.781, 356250.781, 
    356250.781, 356250.781, 356250.781, 356250.781, 356250.781, 356250.781, 
    356250.781, 356250.781, 356250.781, 356250.781, 356250.781, 356250.781, 
    356250.781, 356250.781, 356250.781, 356250.781, 356250.781, 356250.781, 
    356250.781, 356250.781), Point_y = c(5132880.701, 5132880.701, 
    5132880.701, 5132880.701, 5132880.701, 5132880.701, 5132880.701, 
    5132880.701, 5132880.701, 5132880.701, 5132880.701, 5132880.701, 
    5132880.701, 5132880.701, 5132880.701, 5132880.701, 5132880.701, 
    5132880.701, 5132880.701, 5132880.701, 5132880.701, 5132880.701, 
    5132880.701, 5132880.701, 5132880.701, 5132880.701, 5132880.701, 
    5132880.701, 5132880.701, 5132880.701), Classification = c(7, 
    7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 
    7, 7, 7, 7, 7, 7, 7, 7), Class_Derived = c("WW", "WW", "WW", 
    "WW", "WW", "WW", "WW", "WW", "WW", "WW", "WW", "WW", "WW", "WW", 
    "WW", "WW", "WW", "WW", "WW", "WW", "WW", "WW", "WW", "WW", "WW", 
    "WW", "WW", "WW", "WW", "WW"), BLUE = c(1112, 1095, 944, 1144, 
    1141, 1010, 968, 1023, 1281, 1124, 1215, 1154, 1188, 1177, 1622, 
    1305, 1215, 2282, 2322, 2337, 2680, 2473, 1143, 1187, 1165, 1040, 
    1290, 1112, 1474, 1131), GREEN = c(1158, 1178, 1009, 1288, 1265, 
    1208, 1122, 1146, 1416, 1298, 1379, 1345, 1379, 1366, 1714, 1446, 
    1354, 2417, 2417, 2500, 2967, 2587, 1469, 1522, 1544, 1253, 1514, 
    1371, 1875, 1416), RED = c(599, 708, 613, 788, 835, 852, 726, 
    729, 1044, 816, 905, 908, 948, 970, 1206, 944, 935, 1648, 1741, 
    2004, 2109, 2032, 1241, 1290, 1419, 1206, 1424, 1339, 1969, 1321
    ), REDEDGE1 = c(359, 520, 433, 576, 665, 761, 618, 598, 881, 
    619, 722, 771, 829, 823, 937, 725, 759, 1327, 1395, 1756, 1718, 
    1753, 1533, 1528, 1683, 1335, 1605, 1499, 2016, 1592), REDEDGE2 = c(83, 
    82, 65, 169, 247, 404, 116, 118, 532, 162, 183, 218, 285, 200, 
    514, 182, 230, 568, 531, 1170, 780, 1101, 1192, 1174, 1250, 949, 
    1121, 1127, 1382, 1159), REDEDGE3 = c(73, 116, 81, 142, 233, 
    391, 56, 171, 538, 131, 205, 137, 321, 253, 503, 193, 214, 564, 
    527, 1192, 698, 1177, 1203, 1259, 1341, 1049, 1146, 1216, 1416, 
    1188), BROADNIR = c(44, 93, 60, 123, 262, 349, 74, 113, 560, 
    125, 121, 211, 325, 221, 480, 184, 178, 461, 435, 1067, 570, 
    1023, 961, 966, 964, 844, 764, 993, 1197, 834), NIR = c(37, 70, 
    66, 135, 215, 313, 110, 135, 504, 78, 115, 216, 197, 163, 462, 
    113, 165, 392, 349, 1006, 574, 1092, 1153, 1143, 1128, 961, 1033, 
    1027, 1164, 1086), SWIR1 = c(187, 282, 184, 225, 356, 251, 240, 
    216, 507, 197, 306, 260, 298, 290, 400, 190, 300, 275, 204, 678, 
    528, 1087, 1091, 1049, 1310, 935, 1199, 1169, 984, 1139), SWIR2 = c(142, 
    187, 155, 197, 281, 209, 192, 146, 341, 143, 271, 220, 246, 232, 
    387, 168, 217, 193, 173, 540, 374, 764, 766, 799, 869, 724, 827, 
    794, 745, 848), Quality.assurance.information = c(26664, 10272, 
    10272, 10272, 8224, 8224, 8224, 8224, 24616, 8224, 8224, 8224, 
    32, 8224, 8288, 24616, 8224, 8240, 48, 8208, 8240, 8192, 8192, 
    24648, 8192, 8192, 8192, 8192, 0, 8224), Q00_VAL = c(0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0), Q01_CS1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), 
        Q02_CSS = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Q03_CSH = c(1, 
        0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 
        0, 0, 0, 1, 0, 0, 0, 0, 0, 0), Q04_SNO = c(0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 
        0, 0, 0, 0, 0, 0), Q05_WAT = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 
        0, 1), Q06_AR1 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), Q07_AR2 = c(0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Q08_SBZ = c(0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0), Q09_SAT = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0), Q10_ZEN = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Q11_IL1 = c(1, 
        1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0), Q12_IL2 = c(0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0), Q13_SLO = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 
        1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
        0, 1), Q14_VAP = c(1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 
        0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), Q15_WDC = c(0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0), PdMax = c(-7L, -19L, -20L, 
        -22L, -24L, -25L, -26L, -25L, -21L, -15L, -19L, -17L, -23L, 
        -22L, -4L, -7L, -8L, 55L, 57L, 47L, 67L, 44L, 21L, 18L, 13L, 
        16L, 16L, 9L, 12L, 11L), PdMin = c(-13L, -23L, -24L, -26L, 
        -28L, -29L, -29L, -28L, -24L, -20L, -22L, -22L, -26L, -26L, 
        -7L, -11L, -11L, 46L, 47L, 36L, 52L, 37L, 17L, 14L, 9L, 11L, 
        9L, 5L, 5L, 2L), PdKeyT = c(-10L, -20L, -22L, -22L, -27L, 
        -26L, -26L, -27L, -22L, -17L, -19L, -19L, -23L, -23L, -5L, 
        -9L, -9L, 54L, 53L, 40L, 60L, 43L, 20L, 15L, 13L, 15L, 13L, 
        7L, 9L, 6L)), row.names = 198:227, class = "data.frame")

I think this fulfills most of your requirements, other than the correlation annotations.我认为这满足了您的大部分要求,除了相关注释。 If, as you mention in your question, you wish to have 3 regressions per panel (one for each of the three ranges of PdkeyT ) you would also need 3 correlation coefficients and p values per panel, which will be messy.如果,正如您在问题中提到的,您希望每个面板有 3 个回归( PdkeyT的三个范围中的PdkeyT ),您还需要每个面板有 3 个相关系数和 p 值,这将是混乱的。

The reason why you have not seen tutorials for having different facets per variable is that this is not what facets are .您没有看到每个变量具有不同方面的教程的原因是这不是方面什么。 Facets are a way of displaying data that have the same x and y axis but differ by some other categorical variable.分面是一种显示具有相同 x 轴和 y 轴但因某些其他分类变量而不同的数据的方式。 They are not intended as a way of plotting different x variables against the same y variable.它们的目的不是根据相同的 y 变量绘制不同的 x 变量。 What you are describing is 5 distinct plots side-by-side, not facets.您所描述的是并排的 5 个不同的图,而不是方面。

Having said that, it is still possible to create the plot you are looking for with creative use of facets.话虽如此,仍然可以通过创造性地使用方面来创建您正在寻找的情节。 You first need to shape the data into long format so that the values of the different x axis columns get stacked into a single column called value , and a new column called name is created to label each value according to which column it originally came from.您首先需要将数据整形为长格式,以便不同 x 轴列的值堆叠到一个名为value列中,并创建一个名为name的新列,以根据每个值最初来自哪个列来标记每个值。

We can then use the new value column as our x axis variable, and facet according to the name column.然后我们可以使用新value列作为我们的 x 轴变量,并根据name列进行分面。

To make this look more authentic, we make some theme adjustments to ensure the facet strips resemble axis labels:为了使这看起来更真实,我们进行了一些theme调整以确保刻面条类似于轴标签:

library(dplyr)
library(tidyr)
library(ggplot2)

loopsubset_created %>% 
  select(PdKeyT, BLUE, GREEN, RED, SWIR1, SWIR2) %>%
  pivot_longer(-1) %>%
  mutate(range = cut(PdKeyT, c(-Inf, -10, 30, Inf), c("Low", "Mid", "High"))) %>%
  ggplot(aes(value, PdKeyT, color = name)) +
  geom_point(aes(alpha = range)) +
  geom_smooth(aes(group = range), size = 0.1,
              method = "lm", formula = y ~ x, color = "black") +
  labs(x = "") +
  facet_grid(.~name, switch = "x", scales = "free_x") +
  scale_color_manual(values = c("blue", "green", "red", "magenta", "violet")) +
  theme_bw() +
  theme(strip.placement = "outside",
        strip.background = element_blank(),
        plot.margin = margin(120, 10, 120, 10),
        legend.position = "none")

在此处输入图片说明

To panel plots use facet_wrap or facet_grid .面板图使用facet_wrapfacet_grid Also, generally ggplot2 works better when your data are in a long format.此外,当您的数据为格式时,通常 ggplot2 效果更好。 This allows you to assign a variable to an aesthetic rather than do it manually as you have.这允许您将变量分配给美学,而不是像您那样手动进行。

library(ggplot2)
library(tidyr)
library(purrr)
library(dplyr)
library(tibble)

# lengthen your data so variable names are in a column
df <- loopsubset_created %>% 
  pivot_longer(cols = c(BLUE:RED, starts_with("SWIR")))

# get correlation coef and pvalue
r <- map(split(df, ~ name), ~ with(.x, c(cor(PdKeyT, value, method = "spearman"), 
                                         cor.test(PdKeyT, value, method = "spearman")$p.value))) %>%
  bind_rows() %>% 
  rownames_to_column("i") %>% # first row is coef, second row is p value
  pivot_longer(-i) %>% 
  mutate(lab = ifelse(i == 1, 
                      # formatted so will be parsed by geom_text
                      sprintf("italic(R) == %0.5f", value),
                      sprintf("italic(p) == %0.5f", value)),
         x = -Inf, # left of panel
         y = Inf, # top of panel,
         vjust = ifelse(i == 1, 0.75, 2)) # put p-value below

df %>% 
  ggplot(aes(x = value, y = PdKeyT, color = name)) + 
  geom_point() + 
  geom_text(data = r, 
            aes(x = x, y = y, 
                label = lab,
                vjust = vjust),
            size = 3,
            parse = T,
            inherit.aes = F) + 
  geom_smooth(method = "lm", 
              se = T, 
              formula = y ~ x,
              show.legend = F) + 
  facet_grid(~ name,
             scales = "free_x") +
  labs(color = element_blank(),
       x = "XLAB")

在此处输入图片说明

Update: To fulfill your last task me could make use of the code that is from Allan Cameron: adding another column to set the cuts mutate(range = cut(PdKeyT, c(-Inf, -10, 30, Inf), c("Low", "Mid", "High"))) %>% (this code was provided by Allan Cameron)更新:为了完成你的最后一个任务,我可以使用来自艾伦卡梅隆的代码:添加另一列来设置切割mutate(range = cut(PdKeyT, c(-Inf, -10, 30, Inf), c("Low", "Mid", "High"))) %>% (此代码由艾伦卡梅伦提供)

library(tidyverse)
library(ggpubr)

df_long_list <- loopsubset_created %>%
  select(PdKeyT, BLUE, GREEN, RED, SWIR1, SWIR2) %>% 
  pivot_longer(
    cols = -PdKeyT
  ) %>% 
  mutate(color = case_when(name=="BLUE" ~ "blue",
                           name=="GREEN" ~ "green",
                           name=="RED" ~ "red",
                           name=="SWIR1" ~ "magenta",
                           name=="SWIR2" ~ "violet"))%>% 
  mutate(range = cut(PdKeyT, c(-Inf, -10, 30, Inf), c("Low", "Mid", "High"))) %>%
  group_split(name)
  
  p <- ggplot()
  for (i in 1:5) p <- p + geom_point(data=df_long_list[[i]], aes(value, PdKeyT, color=color, alpha=range))+
    geom_smooth(data=df_long_list[[i]], aes(value, PdKeyT, group=range), method = lm, se=TRUE)+
    theme(legend.position="none") +
    stat_cor(data=df_long_list[[i]], aes(value, PdKeyT, 
                                         label=paste("Spearman",..r.label.., ..p.label.., sep = "~`,`~")), method="spearman",
             # label.x.npc="left", label.y.npc="top", hjust=0) +
             label.x = 3, label.y = 70)+
    stat_cor(data=df_long_list[[i]], aes(value, PdKeyT,
                                         label=paste("Pearson",..r.label.., ..p.label.., sep = "~`,`~")), method="pearson",
             # label.x.npc="left", label.y.npc="top", hjust=0) +
             label.x = 3, label.y = 65)+
    facet_grid(.~name, scales = "free") +
    theme_bw()+
    theme(panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          plot.margin = margin(120, 10, 120, 10),
          panel.border = element_rect(fill = NA, color = "black"))
  p

在此处输入图片说明

Here is how you could do it:以下是您可以这样做的方法:

  1. select all relevant columns选择所有相关列
  2. bring in long format引入长格式
  3. add color column to dataframe将颜色列添加到数据框
  4. make a list of dataframes with group_split使用group_split制作数据帧列表
  5. use a for loop to iterate over each of the 5 dataframes in the list使用 for 循环遍历列表中的 5 个数据帧中的每一个
  6. within the loop add stat_cor for pearson and spearman from ggpubr package在循环中为ggpubr包中的 pearson 和 spearman 添加stat_cor
  7. facet and do some formatting facet 并进行一些格式化
library(tidyverse)
library(ggpubr)

df_long_list <- loopsubset_created %>%
  select(PdKeyT, BLUE, GREEN, RED, SWIR1, SWIR2) %>% 
  pivot_longer(
    cols = -PdKeyT
  ) %>% 
  mutate(color = case_when(name=="BLUE" ~ "blue",
                           name=="GREEN" ~ "green",
                           name=="RED" ~ "red",
                           name=="SWIR1" ~ "magenta",
                           name=="SWIR2" ~ "violet"))%>% 
  group_split(name)
  
  p <- ggplot()
  for (i in 1:5) p <- p + geom_point(data=df_long_list[[i]], aes(value, PdKeyT, color=color))+
    geom_smooth(data=df_long_list[[i]], aes(value, PdKeyT), method = lm, se=TRUE)+
    theme(legend.position="none") +
    stat_cor(data=df_long_list[[i]], aes(value, PdKeyT, 
                                         label=paste("Spearman",..r.label.., ..p.label.., sep = "~`,`~")), method="spearman",
             # label.x.npc="left", label.y.npc="top", hjust=0) +
             label.x = 3, label.y = 70)+
    stat_cor(data=df_long_list[[i]], aes(value, PdKeyT,
                                         label=paste("Pearson",..r.label.., ..p.label.., sep = "~`,`~")), method="pearson",
             # label.x.npc="left", label.y.npc="top", hjust=0) +
             label.x = 3, label.y = 65)+
    facet_grid(.~name, scales = "free_y") +
    theme_bw()+
    theme(panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          plot.margin = margin(120, 10, 120, 10),
          panel.border = element_rect(fill = NA, color = "black"))
  p

在此处输入图片说明

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM