Plot dataframe with ggplot2 - R

I am trying to create a graph using the following dataframe. It has 10 observations and 25 variables (one of them, ID, is just an ID column for the different observations):

'data.frame':    10 obs. of  25 variables:
 $ NDVI_mean   : num  0.0607 0.0552 0.5811 0.7676 0.0328 ...
 $ NDVI_sd     : num  0.0881 0.0298 0.1644 0.0937 0.0292 ...
 $ NDVI_mean.1 : num  0.0211 0.0549 0.1375 0.1207 0.024 ...
 $ NDVI_sd.1   : num  0.0111 0.0195 0.0227 0.0701 0.0197 ...
 $ NDVI_mean.2 : num  0.0703 0.0715 0.6832 0.769 0.0418 ...
 $ NDVI_sd.2   : num  0.0938 0.0298 0.1601 0.0674 0.0402 ...
 $ NDVI_mean.3 : num  0.0636 0.0552 0.6829 0.732 0.0292 ...
 $ NDVI_sd.3   : num  0.0912 0.0222 0.1613 0.1102 0.0355 ...
 $ NDVI_mean.4 : num  0.092 0.0781 0.6947 0.5256 0.056 ...
 $ NDVI_sd.4   : num  0.0879 0.0211 0.158 0.0686 0.0328 ...
 $ NDVI_mean.5 : num  0.1047 0.091 0.4251 0.3573 0.0722 ...
 $ NDVI_sd.5   : num  0.0441 0.013 0.0585 0.0368 0.0156 ...
 $ NDVI_mean.6 : num  0.0547 0.0654 0.5912 0.6098 0.0404 ...
 $ NDVI_sd.6   : num  0.0874 0.0195 0.2143 0.0975 0.0287 ...
 $ NDVI_mean.7 : num  0.1047 0.0882 0.6914 0.6532 0.0689 ...
 $ NDVI_sd.7   : num  0.0843 0.0177 0.1553 0.0653 0.0299 ...
 $ NDVI_mean.8 : num  0.0859 0.071 0.6905 0.6866 0.0556 ...
 $ NDVI_sd.8   : num  0.0809 0.018 0.1624 0.0866 0.0311 ...
 $ NDVI_mean.9 : num  0.0949 0.1204 0.1434 0.2849 0.1231 ...
 $ NDVI_sd.9   : num  0.00951 0.00719 0.01228 0.03483 0.01023 ...
 $ NDVI_mean.10: num  0.0854 0.0752 0.6712 0.7326 0.0628 ...
 $ NDVI_sd.10  : num  0.0789 0.0212 0.1471 0.0951 0.0326 ...
 $ NDVI_mean.11: num  0.0942 0.0986 0.6434 0.7741 0.0899 ...
 $ NDVI_sd.11  : num  0.0735 0.0188 0.1299 0.0765 0.0277 ...
 $ ID          : int  1 2 3 4 5 6 7 8 9 10


I would like to create a graph with the following characteristics:

The X axis should show the different NDVI_mean variables (NDVI_mean, NDVI_mean.1, NDVI_mean.2, etc.)

The Y axis should show the values of those variables

The graph should contain 10 lines, corresponding to the 10 observations

I am completely new to ggplot2. I have managed to create some graphs using this code, but it is not what I want:

ggplot(NDVIdf) + geom_line(aes(x = NDVI_mean, y = ID))

Edit

Output of dput(NDVIdf):

> dput(NDVIdf)
structure(list(NDVI_mean = c(0.0607135413903215, 0.0551773354119158, 
0.58106114381679, 0.767559372904067, 0.0327779986531678, 0.0178615320775541, 
0.242088217197272, 0.20999285277774, 0.0393074640533382, 0.362654323805063
), NDVI_sd = c(0.0881409040764672, 0.0297587817960566, 0.164350402694459, 
0.0937447350009958, 0.0291673504162979, 0.0328954778667684, 0.154674728930805, 
0.143054126101431, 0.0394783704067313, 0.0311605700206721), NDVI_mean.1 = c(0.0210982854397687, 
0.0549092171312182, 0.137518549485032, 0.120670383289592, 0.0240424367577284, 
0.0159096452554129, 0.0672761385275565, 0.0341803495552938, -0.00134448575083377, 
0.016580205828644), NDVI_sd.1 = c(0.0110723260269658, 0.01951851232517, 
0.0227065359549684, 0.0700670943356275, 0.0196608837546022, 0.0121199724795787, 
0.0585473350749026, 0.0227914450038557, 0.0135058392129466, 0.0150912377421709
), NDVI_mean.2 = c(0.0703180934388137, 0.0714783174472453, 0.683213190781725, 
0.769036956136717, 0.0418112830812162, 0.0284743048998433, 0.23348292946483, 
0.235929665998861, 0.0450296798850473, 0.193100322654342), NDVI_sd.2 = c(0.0937618859311873, 
0.0298498793436821, 0.160085159223464, 0.0673810570033997, 0.0402149180186587, 
0.0397066821267195, 0.150683537602667, 0.145471088294412, 0.0437922991992655, 
0.0141486994532874), NDVI_mean.3 = c(0.063601350404069, 0.0551904437586304, 
0.682930851671194, 0.731967082842268, 0.0291583347580934, 0.0193111448443503, 
0.319631300233593, 0.166050320085929, 0.0366014763086276, 0.221872499210234
), NDVI_sd.3 = c(0.0912293427166813, 0.0222453701937956, 0.161322107465979, 
0.110207844254988, 0.0355049011856384, 0.0349930810428516, 0.226276210965238, 
0.140438890801978, 0.0376830428032925, 0.0220584182188743), NDVI_mean.4 = c(0.0919804842383724, 
0.0781265501499422, 0.694671427954745, 0.525632176936541, 0.055980386796576, 
0.0444207693277835, 0.426953378337129, 0.160783372015251, 0.0609390942283722, 
0.280378773041507), NDVI_sd.4 = c(0.0879414801936151, 0.0210552190044792, 
0.158033594194682, 0.0685669288657517, 0.0327713833848639, 0.0354769286383367, 
0.252469606866754, 0.12982572565032, 0.036836867665617, 0.0411465416161875
), NDVI_mean.5 = c(0.104738543138272, 0.0909713368375085, 0.42508525657118, 
0.357320549164012, 0.0721572385527876, 0.0663794698314188, 0.23911562990616, 
0.142111328436142, 0.0838823297267412, 0.251654432686439), NDVI_sd.5 = c(0.0441048632295888, 
0.0130326877444498, 0.0585279766430101, 0.0368348303398042, 0.0155510862094617, 
0.0192137216652464, 0.0585475549304422, 0.0736886597614494, 0.0212806524351524, 
0.0259938407933158), NDVI_mean.6 = c(0.054670992223019, 0.065381296775636, 
0.591168574495215, 0.609806323819807, 0.0403625648315437, 0.00811946347995523, 
0.310916693477462, 0.158498269033413, 0.0443372830506701, 0.371097291211943
), NDVI_sd.6 = c(0.0874297350407085, 0.0195485598323798, 0.214314782421285, 
0.0974716364190341, 0.0286726835375469, 0.0464844141552075, 0.124111018323546, 
0.128024962302557, 0.0389891785245309, 0.0776972260415738), NDVI_mean.7 = c(0.104681150177595, 
0.0882044567680745, 0.691354972003269, 0.65319672247453, 0.0689026139683328, 
0.0564115948034715, 0.345466555679804, 0.189094867024672, 0.0757218905916805, 
0.473698360613464), NDVI_sd.7 = c(0.0842570243914433, 0.0176701260206802, 
0.15529028462675, 0.0653161775993753, 0.0298643217871498, 0.0350802264835342, 
0.179839784256719, 0.123083052319927, 0.0381348337459403, 0.0504351117371588
), NDVI_mean.8 = c(0.08591308296691, 0.0710169977816541, 0.69050244219096, 
0.686550053201553, 0.05559868259279, 0.0386856009179078, 0.385427380396624, 
0.191494000375466, 0.0582605982908748, 0.634092671211804), NDVI_sd.8 = c(0.0809198942299164, 
0.0179761950231585, 0.162375086927031, 0.0865933396475938, 0.0311069109690036, 
0.0341123644056808, 0.221678551331174, 0.123510636808576, 0.0352156193862569, 
0.037441682380018), NDVI_mean.9 = c(0.0948832729278301, 0.120400127877444, 
0.143375746425582, 0.284877572639076, 0.123096886134381, 0.111171634746743, 
0.223262233590715, 0.120538120679937, 0.0971369124338333, 0.233815280325772
), NDVI_sd.9 = c(0.00951402820331221, 0.00718755312976778, 0.0122787887859583, 
0.0348298462083254, 0.0102326571562652, 0.00707392292527096, 
0.047660749828529, 0.0127667426992843, 0.00615732059786784, 0.0341929453840942
), NDVI_mean.10 = c(0.0853653180560701, 0.0751638478379656, 0.671240397847597, 
0.732629951796317, 0.0628200256114987, 0.0389376153602489, 0.261580310922298, 
0.237551383820751, 0.0543488352606212, 0.729810364384283), NDVI_sd.10 = c(0.0788860954632659, 
0.0212295674726634, 0.147125862744127, 0.0951195938807548, 0.0325819971840338, 
0.0313179762667036, 0.149062631594425, 0.123975319460501, 0.0295479331978356, 
0.0347570926406439), NDVI_mean.11 = c(0.0941656718689444, 0.0985743153705462, 
0.643407964040386, 0.774084527469533, 0.0899420980061257, 0.0535166413991826, 
0.303595683796766, 0.245633779631581, 0.0643377575135575, 0.697236179516483
), NDVI_sd.11 = c(0.0735030069394199, 0.0187835191570716, 0.129850840148202, 
0.0764938134743573, 0.0276954256603995, 0.0260888900038652, 0.122217568202193, 
0.0934074608484564, 0.0272831553843282, 0.026010072370012), ID = 1:10), .Names = c("NDVI_mean", 
"NDVI_sd", "NDVI_mean.1", "NDVI_sd.1", "NDVI_mean.2", "NDVI_sd.2", 
"NDVI_mean.3", "NDVI_sd.3", "NDVI_mean.4", "NDVI_sd.4", "NDVI_mean.5", 
"NDVI_sd.5", "NDVI_mean.6", "NDVI_sd.6", "NDVI_mean.7", "NDVI_sd.7", 
"NDVI_mean.8", "NDVI_sd.8", "NDVI_mean.9", "NDVI_sd.9", "NDVI_mean.10", 
"NDVI_sd.10", "NDVI_mean.11", "NDVI_sd.11", "ID"), row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10"), class = "data.frame")

Your data is in wide format, but ggplot2 works most naturally with long format, so you should convert it first. That is, you should have one row per observation per measurement. In your case, you should end up with a dataframe of 12*2*10 = 240 rows and three columns: observation (1-10), statistic (NDVI_mean.1, NDVI_sd.2, ...) and value.

You can use tidyr's gather function to easily reformat the data into long format:

library(tidyr)
library(ggplot2)

# Reshape to long format: one row per ID per statistic
NDVIdf_forplot <- gather(NDVIdf, key = statistic, value = value, -ID)
ggplot(NDVIdf_forplot, aes(x = statistic, y = value)) + geom_line()

The gather function takes two arguments, key and value, which set the column names for the relevant fields in the new long-format dataframe. In this case, key is the name given to the new column that holds the previous column headings (NDVI_mean, NDVI_sd, ...), and value is the name given to the new column that holds the previous values (0.0607, ...).
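As a rough check of the reshaping described above, here is a minimal sketch on a synthetic stand-in for NDVIdf. The toy data frame has the same column names but random values, so it is an assumption and not the real data. It also shows pivot_longer, the newer tidyr interface, which should give the same three columns.

```r
library(tidyr)

# Synthetic stand-in for NDVIdf: same column names, random values (NOT the real data)
set.seed(1)
suffixes  <- c("", paste0(".", 1:11))
col_names <- as.vector(rbind(paste0("NDVI_mean", suffixes),
                             paste0("NDVI_sd", suffixes)))
toy <- as.data.frame(matrix(runif(10 * 24), nrow = 10,
                            dimnames = list(NULL, col_names)))
toy$ID <- 1:10

# gather(): key and value name the two new columns; -ID keeps ID out of the reshape
long_gather <- gather(toy, key = statistic, value = value, -ID)

# pivot_longer(): the newer tidyr function for the same wide-to-long reshape
long_pivot <- pivot_longer(toy, cols = -ID,
                           names_to = "statistic", values_to = "value")

dim(long_gather)    # 12 * 2 * 10 = 240 rows, 3 columns
names(long_gather)  # ID, statistic, value
```

The two results differ only in row order and in class (pivot_longer returns a tibble), which does not matter for plotting.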

The expression -ID tells gather to leave the ID column out of the reshaping, so it is kept as its own column and each record in the new dataframe is still associated with the correct ID. Note that the name is case-sensitive: it has to match the column name ID exactly. That way you can use it to separate the different lines in the ggplot call like this:

ggplot(NDVIdf_forplot, aes(x = statistic, y= value, group = ID)) + geom_line()
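The question asks for only the NDVI_mean variables on the X axis, whereas the call above plots every statistic. In addition, a character column is ordered alphabetically on the axis, so NDVI_mean.10 would come before NDVI_mean.2. The following is one possible sketch, again on synthetic stand-in data rather than the real NDVIdf: it keeps only the mean columns and fixes the axis order with a factor. The names toy, keep and long_means are illustrative.

```r
library(tidyr)
library(ggplot2)

# Synthetic stand-in data (NOT the real NDVIdf): 12 mean columns, one sd column, ID
set.seed(1)
mean_cols <- paste0("NDVI_mean", c("", paste0(".", 1:11)))
toy <- as.data.frame(matrix(runif(10 * 12), nrow = 10,
                            dimnames = list(NULL, mean_cols)))
toy$NDVI_sd <- runif(10)  # an sd column that should be left out of the plot
toy$ID <- 1:10

# Keep only the columns whose names start with "NDVI_mean", in their original order
keep <- grep("^NDVI_mean", names(toy), value = TRUE)
long_means <- gather(toy[, c("ID", keep)], key = statistic, value = value, -ID)

# Fix the axis order: factor levels follow the column order, not the alphabet
long_means$statistic <- factor(long_means$statistic, levels = keep)

p <- ggplot(long_means, aes(x = statistic, y = value, group = ID)) +
  geom_line() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))  # readable tick labels
p
```

With the real data, replacing toy with NDVIdf should be enough, assuming the column names match the str output shown in the question.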

And, if you want to colour the different observations differently and include a legend, you can use the colour aesthetic:

ggplot(NDVIdf_forplot, aes(x = statistic, y = value, group = ID, colour = ID)) + 
    geom_line()
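Because ID is an integer column, ggplot2 maps it to a continuous colour gradient. If you would rather have one distinct colour and one legend entry per observation, a common approach is to wrap it in factor(). A minimal sketch on a small hypothetical long-format dataframe (long_toy and its values are made up for illustration):

```r
library(ggplot2)

# Small hypothetical long-format data: 3 observations, 4 statistics
long_toy <- data.frame(
  ID        = rep(1:3, times = 4),
  statistic = rep(c("NDVI_mean", "NDVI_mean.1", "NDVI_mean.2", "NDVI_mean.3"),
                  each = 3),
  value     = c(0.06, 0.58, 0.77, 0.02, 0.14, 0.12,
                0.07, 0.68, 0.77, 0.06, 0.68, 0.73)
)

# factor(ID) makes the colour scale discrete: one colour per observation
p <- ggplot(long_toy, aes(x = statistic, y = value,
                          group = ID, colour = factor(ID))) +
  geom_line() +
  labs(colour = "ID")  # legend title would otherwise read "factor(ID)"
p
```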
