
Assigning tick mark frequency to discrete data axes in facet_grid

I'm having some trouble setting readable tick marks on my axes. The problem is that my data are at different magnitudes, so I'm not really sure how to go about it.

My data include ~400 different products, with 3 or 4 variables each, from two machines. I've pre-processed it into a data.table and used gather to convert it to long form; that part is fine.

Overview: The data are discrete. Each X_________ on the x-axis represents a separate reading, with its relative values from machine 1/2 - the idea is to compare the two. The graphical format is perfect for my needs; I would just like to set the ticks at, say, every 10 products on the x-axis, and at reasonable values on the y-axis.

  • Y_1: from 150 to 250
  • Y_2: from, say, 1.5* to 2.5
  • Y_3: from, say, 0.8* to 2.3
  • Y_4: from, say, 0.4* to 1.5

*Bottom value, rounded down

Here's the code I'm using so far:

var.Parameter <- c("Var1", "Var2", "Var3", "Var4")

MProduct$Parameter <- factor(MProduct$Parameter,
                          labels = var.Parameter)
labels_x <- MProduct$Lot[seq(0, 1626, by= 20)]
labels_y <- MProduct$Value[seq(0, 1626, by= 15)]


plot.MProduct <- ggplot(MProduct, aes(x = Lot,
                                y = Value,
                                colour = V4)) +
  facet_grid(Parameter ~.,
            scales = "free_y") + 
  scale_x_discrete(breaks=labels_x) +
  scale_y_discrete(breaks=labels_y) +
  geom_point() +
  labs(title = "Product: Select  Trends | 2018",
       x = "Time (s)",
       y = "Value") +
  theme(axis.text.x = element_text (angle = 90,
                                    hjust = 1,
                                    vjust = 0.5)) 
 # ggsave("MProduct.png")
plot.MProduct

[Image: Mproduct.png]

Does anyone know how to make this graph more readable? Setting labels/breaks manually greatly limits flexibility and readability - there should be an option to set it to every X ticks, right? Same with y.

I need to apply this as a function to multiple datasets, so I'm not very happy about having to specify the column length of the "gathered" dataset every time either, which in this case is 1626.

Since I'm here, I would also like to take the opportunity to ask about this code:

var.Parameter <- c("Var1", "Var2", "Var3", "Var4")

More often than not, I need to label my data in a specific order, which is not necessarily alphabetical. R, however, defaults to some kind of odd behaviour, so I have to plot and verify that the labels are indeed where they should be. Any clue how I could force them to be presented in order? As it is, my solution is to keep shifting their position in that line of code until it produces the graph correctly.

Many thanks.

Okay. I'm going to ignore the y-axis labels because the defaults seem to work just fine as long as you don't try to overwrite them with your custom labels_y thing. Just let the defaults do their work. For the x-axis, we'll give a couple of options:

(A) Label every N products on the x-axis. Looking at ?scale_x_discrete, we can set the labels to a function that takes all the levels of the factor and returns the labels we want. So we'll write a functional that returns a function that keeps every Nth label and blanks out the rest:

every_n_labeler = function(n = 3) {
  function (x) {
    ind = ((1:length(x)) - 1) %% n == 0
    x[!ind] = ""
    return(x)
  }
}

Now let's use that as the labeler:

ggplot(df, aes(x = Lot,
               y = Value,
               colour = Machine)) +
  facet_grid(Parameter ~ .,
             scales = "free_y") +
  geom_point() +
  scale_x_discrete(labels = every_n_labeler(3)) +
  labs(title = "Product: Select  Trends | 2018",
       x = "Time (s)",
       y = "Value") +
  theme(axis.text.x = element_text (
    angle = 90,
    hjust = 1,
    vjust = 0.5
  )) 


You can change every_n_labeler(3) to every_n_labeler(10) to label every 10th product.
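An alternative sketch, not from the original answer: instead of blanking labels, `scale_x_discrete()` also accepts a function for `breaks` that receives the scale limits (the factor levels) and returns the breaks to show, so the unlabelled tick marks disappear as well. The helper name `every_n_breaks` and the lot names are made up for illustration.

```r
# Hypothetical helper: keep every nth level as a break, starting with the first.
every_n_breaks <- function(n = 10) {
  function(limits) {
    limits[(seq_along(limits) - 1) %% n == 0]
  }
}

# Base-R check on made-up lot names
lots <- paste0("X", 1:25)
kept <- every_n_breaks(10)(lots)
kept
```

Passing `scale_x_discrete(breaks = every_n_breaks(10))` in place of the `labels` argument should then draw ticks only at those lots, and no dataset length such as 1626 has to be hard-coded.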

(B) Maybe more appropriate: it seems like your x-axis is actually numeric and just happens to have "X" in front of it, so let's convert it to numeric and let the defaults do the labeling work:

df$time = as.numeric(gsub(pattern = "X", replacement = "", x = df$Lot))

ggplot(df, aes(x = time,
               y = Value,
               colour = Machine)) +
  facet_grid(Parameter ~ .,
             scales = "free_y") +
  geom_point() +
  labs(title = "Product: Select  Trends | 2018",
       x = "Time (s)",
       y = "Value") +
  theme(axis.text.x = element_text (
    angle = 90,
    hjust = 1,
    vjust = 0.5
  )) 


With your full x range, I imagine that would look nice.
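As a quick check of the conversion step, here is a small sketch on made-up lot codes. `sub()` and `gsub()` coerce a factor to character before substituting, so the result is the text of each level rather than its integer code. Anchoring the pattern with `^` is my addition; it restricts the replacement to a leading X.

```r
# Made-up lot codes in the same format as the sample data
lot <- factor(c("X180106482", "X180126485", "X180106482"))

# sub() works on the character labels of the factor, not its integer codes
time <- as.numeric(sub(pattern = "^X", replacement = "", x = lot))
time

# Calling as.numeric() on the factor directly would return the codes instead
codes <- as.numeric(lot)
codes
```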

(C) But who wants to read those 9-digit numbers? You're labeling the x-axis "Time (s)", which makes me think it's actually a time, measured in seconds from some start time. I'll make up that your start time is 2010-01-01 and convert these seconds to actual times, and then we get a nice date-time scale:

ggplot(df, aes(x = as.POSIXct(time, origin = "2010-01-01"),
               y = Value,
               colour = Machine)) +
  facet_grid(Parameter ~ .,
             scales = "free_y") +
  geom_point() +
  labs(title = "Product: Select  Trends | 2018",
       x = "Time (s)",
       y = "Value") +
  theme(axis.text.x = element_text (
    angle = 90,
    hjust = 1,
    vjust = 0.5
  )) 


If this is the real meaning behind your data, then using a date-time axis is a big step up for readability. (Again, notice that we are not specifying the breaks; the defaults work quite well.)
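To see what that conversion does on a single value, here is a minimal sketch. The origin is the same made-up start time as above, and `tz = "UTC"` is an extra assumption so the result does not depend on the local time zone.

```r
# Assumed origin (made up, as in the answer) with an explicit time zone
origin <- as.POSIXct("2010-01-01", tz = "UTC")

# One of the sample values, interpreted as seconds since the origin
stamp <- as.POSIXct(180106482, origin = origin, tz = "UTC")
format(stamp, "%Y-%m-%d %H:%M:%S")

# Round trip: the difference from the origin is the original number of seconds
elapsed <- as.numeric(difftime(stamp, origin, units = "secs"))
elapsed
```

If the default breaks turn out too sparse or too dense, `scale_x_datetime()` takes a `date_breaks` argument such as `"1 day"`, though the defaults are usually a reasonable starting point.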


Using this data (I subset your sample data down to 2 facets and used dput to make it copy/pasteable):

df = structure(list(Lot = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 
4L, 1L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 1L, 2L, 3L, 4L, 1L, 
2L, 3L, 4L, 1L), .Label = c("X180106482", "X180126485", "X180306523", 
"X180526326"), class = "factor"), Value = c(201, 156, 253, 211, 
178, 202.5, 203.4, 204.3, 205.2, 2.02, 2.17, 1.23, 1.28, 1.54, 
1.28, 1.45, 1.61, 2.35, 1.34, 1.36, 1.67, 2.01, 2.06, 2.07, 2.19, 
1.44, 2.19), Parameter = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L), .Label = c("Var 1", "Var 2", "Var 3", "Var 4"
), class = "factor"), Machine = structure(c(2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Machine 1", "Machine 2"), class = "factor"), 
    time = c(180106482, 180126485, 180306523, 180526326, 180106482, 
    180126485, 180306523, 180526326, 180106482, 180106482, 180126485, 
    180306523, 180526326, 180106482, 180126485, 180306523, 180526326, 
    180106482, 180106482, 180126485, 180306523, 180526326, 180106482, 
    180126485, 180306523, 180526326, 180106482)), row.names = c(NA, 
-27L), class = "data.frame")
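
On the side question about label order, a sketch that was not part of the original answer: in `factor()`, `levels` fixes the order of the categories, while `labels` on its own only renames the levels in whatever order they already have (sorted alphabetically by default). Passing the desired order to `levels` should make the facets follow it. The parameter names below are made up.

```r
# Made-up parameter names, deliberately not in alphabetical order
wanted_order <- c("Temp", "Pressure", "Flow")
param <- c("Flow", "Temp", "Pressure", "Temp")

# levels = sets the order; the values themselves are unchanged
ordered_param <- factor(param, levels = wanted_order)
levels(ordered_param)

# labels = alone renames the alphabetically sorted levels, which mislabels data
mislabelled <- factor(param, labels = wanted_order)
as.character(mislabelled)
```

In the second call the sorted levels (Flow, Pressure, Temp) are renamed position by position, which would explain having to shuffle `var.Parameter` until the plot looks right.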
