Line plot in R with ggplot and melt function: multiple lines in 5 categories

I am really new to R and RStudio (and to this forum). I did a lot of research on my problem (also in this forum), but I still have trouble getting the correct code. I am quite close, but it is starting to get very frustrating.

Situation: I have to make an REE pattern plot. The x-axis shows the elements and the y-values are the concentrations of the elements. Each sample is shown as a line from the first to the last element along the x-axis, so it is a multiple-line plot. My data frame looks like this:

Element PA      PJ          PA          VA          VJ          PA          PA          PA          R
Cs  8.393644832 9.274061495 8.466114498 124.8302919 14.17884799 24.29026324 16.62652167 136.5543529 15.7077603
Rb  66.08861281 74.96446056 66.4222049  80.31878486 113.7845646 104.5795331 91.41634436 202.6518905 93.96286011
Ba  162.7360691 196.7689123 132.1882321 87.87655638 108.7807453 64.40911125 56.2519533  34.28604744 77.26184806
Th  69.50420273 10.69239264 60.48609257 10.7117353  61.83547442 79.0044607  97.33558025 92.98479452 58.67343532
U   22.17827063 16.22661665 21.03802793 7.427212489 60.72442183 63.23055432 70.64986596 51.39206236 42.45965964
Nb  7.575924774 5.89169239  6.667024084 5.004676505 16.69613523 16.67449315 13.346969   43.34980892 13.17651141
Ta  10.71199686 10.60149917 7.779458029 6.835789229 15.94188008 20.1485504  15.27092298 31.27845584 17.07176294
K   233.8150547 271.8452141 241.561939  266.9153787 251.42161   239.4491524 213.3914505 423.9658521 251.42161
La  85.781713   16.03251185 67.342503   20.17716423 28.44896832 56.02416655 86.39273611 27.73347387 43.5324784
Ce  65.93594156 11.79019617 47.55025458 17.54266241 25.52628696 49.26589625 57.6574228  12.21647606 34.22346809
Pb  5.673083989 10.26288212 4.169977919 59.04878053 53.42872487 62.81513974 48.16121863 93.96287593 101.9287591
Pr  53.02764512 9.938334989 42.03809952 14.99962348 17.24082014 37.33542354 53.81996734 25.90256871 28.40450355
Nd  42.33110774 8.364811267 33.97954887 13.48174221 13.93479643 29.74581887 43.06564505 26.65600445 23.33544314
Sr  0.835397313 0.815930916 0.586568694 9.996068224 0.960554876 0.536331654 0.258305773 5.683560546 0.942533523
Sm  21.35644343 5.451089335 16.96532562 9.760893837 9.675593776 20.01885453 24.97813208 27.39269895 16.0149219
Hf  44.23389487 52.43907046 42.33828695 4.98724425  30.28451128 49.09584912 60.28147686 9.971733073 24.74464941
Zr  53.96191223 65.62184274 53.86924455 5.318772828 26.1413139  53.09855665 65.71920565 9.34974258  26.87927243
Ti  1.436464088 1.215469613 1.270718232 10.66298343 0.497237569 0.662983425 0.662983425 5.524861878 0.607734807
Eu  10.35812973 4.071632021 8.46110334  8.611540363 2.338303868 8.328014705 9.786671125 16.16876122 7.070426445
Gd  14.83675531 4.409737144 11.44401365 7.645177015 8.221991883 15.42873831 18.54842542 27.21876767 12.99376358
Tb  9.092304297 3.898598538 6.982306648 6.673348685 8.230478353 14.87434634 15.41911057 32.14506684 12.56736368
Dy  6.38743838  3.392714189 4.532800141 5.210803147 7.695892687 12.39499316 12.92923541 31.99972441 11.63756207
Ho  4.466797664 3.082328768 3.346515335 4.589583127 7.111016931 11.57170602 10.63770512 32.956692   11.03232412
Y   3.35940512  2.382622411 2.505043001 5.01436475  6.489644503 8.926279165 8.788977547 37.04567217 10.68214568
Er  4.715669314 3.488584654 3.470548704 4.488104792 7.032818937 11.60405599 10.95403677 34.6355416  11.22917717
Tm  4.269381986 3.989071741 3.178992509 3.900228104 6.798645341 11.83388929 10.19664082 33.36983427 10.85995832
Yb  5.223135226 4.959299109 3.870356399 3.60128161  6.859780617 11.56204692 10.80225244 32.97149663 10.56174395
Lu  7.20048667  6.451947335 4.9601101   3.949574922 6.395672788 11.91831865 11.2065581  31.70363964 9.943874048

I want the x-axis to show the elements in the specific order given in the first column of the data frame (as done with the levels=unique option). For the y-values I want 5 categories (PA, PJ, VA, VJ, R), each with a specific colour. Important: every sample (column) should be plotted as its own line. The legend should be simple and only show colour = category. That is not essential, though: I can also make the legend manually at the end with a graphics editing program, so that's not the main problem.

My result so far:

require(ggplot2)
require(reshape2) 
df <- read.csv2("ultra_REE_ref.csv", header = T, sep = ";", dec = ".")   
df <- melt(df ,  id.vars = 'Element', variable.name = "series")
df$Element <- factor(df$Element, levels=unique(df$Element)) 
ggplot(df,aes(Element,value, col=series)) + geom_point() +
       theme(legend.position="none") + scale_y_log10()

which produces this picture: [image]

Does anyone have an idea how to

  1. make lines instead of points? I had lines once, but I can't reproduce them since I changed the x-axis away from alphabetical order. When I change the code to geom_line() it gives no output at all.

  2. remove the points at the bottom at y = 0? I already removed all zeros from the input file (at least I think I did it properly).

  3. give each category a distinct colour? I would also be happy to make 5 different input files and define the style for each file separately, like plotting the lines into an existing plot. That would also be quite cool.

  4. make an empty background without any lines/shades?

I would be so happy if someone could help me with this. Thank you so much for reading this far :) Greetings!

You've done most of the work in reshaping the data and creating the factor levels. For lines, the issue is that you need to group by sample: with a discrete x-axis, ggplot2 otherwise treats every element/sample combination as its own group, so each group holds a single point and no line can be drawn. For the points at y = 0, there must be zero values in your data frame, or else they would not appear in the plot. To remove the grey background, you can apply a theme such as theme_minimal. To completely remove all background lines, you need to modify the panel.grid element.
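To track down the stray points at the bottom of the plot, it may help to list the non-positive values directly. The following is only a minimal sketch: the data frame `long` and its values are made up and stand in for the melted data, and the column names `sample` and `value` are assumptions. Since log10(0) is -Inf in R, any zero cannot be placed properly on a log axis.

```r
# Toy long-format data standing in for the melted data frame (values are made up).
long <- data.frame(
  Element = c("Cs", "Rb", "Cs", "Rb"),
  sample  = c("PA", "PA", "PJ", "PJ"),
  value   = c(8.4, 66.1, 0, 75.0)
)

# Rows whose value is zero or negative; these cannot be shown sensibly on a log10 axis.
bad <- long[!is.na(long$value) & long$value <= 0, ]
print(bad)

# Keep only strictly positive values before plotting with scale_y_log10().
long_pos <- long[!is.na(long$value) & long$value > 0, ]
```

If `bad` comes back empty for the real data, the zeros are probably not the cause and the input file is worth a second look.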

Let's put all that together. I prefer dplyr for data manipulation. When I created a data frame from your data, the duplicated PA columns were renamed (PA, PA.1, PA.2, and so on), because R makes non-unique column names unique on import by default.

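library(dplyr)
library(tidyr)    # gather() comes from tidyr, not dplyr
library(ggplot2)

# df is the wide data frame as read from the file (one column per sample)
df %>% 
  # keep the elements in file order instead of alphabetical order
  mutate(Element = factor(Element, levels = unique(Element))) %>%
  # reshape to long format: one row per element/sample pair
  gather(sample, value, -Element) %>% 
  ggplot(aes(Element, value)) + 
    # group = sample draws one line per sample
    geom_line(aes(color = sample, group = sample)) + 
    scale_y_log10() + 
    theme_minimal() + 
    # remove all grid lines
    theme(panel.grid = element_blank())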

Result: [image]

You can tweak the colours using e.g. scale_color_manual or scale_color_brewer.
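To get one colour per category (PA, PJ, VA, VJ, R) while still drawing one line per sample, one option is to derive a category column from the sample names and map colour to it. This is a rough sketch under stated assumptions, not a definitive solution: the data are made-up toy values, the names `long`, `category` and `category_colours` are hypothetical, and the colours are arbitrary choices. It assumes the duplicated column names were made unique with suffixes such as PA.1, which is what read.csv() and read.csv2() do by default (check.names = TRUE).

```r
library(ggplot2)

# Sample header from the question; duplicated names are made unique the same
# way read.csv()/read.csv2() do by default (check.names = TRUE).
samples <- make.names(c("PA", "PJ", "PA", "VA", "VJ", "PA", "PA", "PA", "R"),
                      unique = TRUE)
print(samples)

# Toy long-format data: three elements per sample, random made-up values.
elements <- c("Cs", "Rb", "Ba")
set.seed(1)
long <- expand.grid(Element = elements, sample = samples,
                    stringsAsFactors = FALSE)
long$value   <- runif(nrow(long), min = 1, max = 200)
long$Element <- factor(long$Element, levels = elements)

# Strip the ".1", ".2", ... suffix to recover the category of each sample.
long$category <- sub("\\.[0-9]+$", "", long$sample)

# Hypothetical colour choices, one per category.
category_colours <- c(PA = "firebrick", PJ = "orange", VA = "navy",
                      VJ = "steelblue", R = "black")

# group = sample keeps one line per sample; color = category gives a
# legend with only the five categories.
p <- ggplot(long, aes(Element, value, group = sample, color = category)) +
  geom_line() +
  scale_y_log10() +
  scale_color_manual(values = category_colours) +
  theme_minimal() +
  theme(panel.grid = element_blank())
print(p)
```

Because colour is mapped to category rather than to sample, all PA lines share one colour and the legend shows five entries instead of nine.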
