Customize Automated R Markdown Reports

I am trying to create automated fitness testing reports by training hub using R Markdown. I am building on a previous question I asked on this topic, using the same data frame structure and similar code.

I would like to slightly customize each R Markdown report based on hub while keeping the process automated (i.e. 1 script + 1 .rmd). For example, all hubs will have the Broad Jump plot, but for hub A I want a Broad Jump plot and a 10m sprint plot.

I have tried using params in the YAML and the render function, but have not had any luck so far.

Sample Data

         Date  Athlete       Test    Average Hub
1  2019-06-03 Athlete1 Broad_Jump 175.000000   A
2  2019-06-10 Athlete1 Broad_Jump 187.000000   A
3  2019-06-10 Athlete2 Broad_Jump 200.666667   B
4  2019-06-10 Athlete3 10m_Sprint   1.831333   B
5  2019-06-10 Athlete2 10m_Sprint   2.026667   B
6  2019-06-17 Athlete1 Broad_Jump 191.500000   A
7  2019-06-17 Athlete2 Broad_Jump 200.666667   B
8  2019-06-17 Athlete3 10m_Sprint   1.803667   B
9  2019-06-17 Athlete2 10m_Sprint   2.090000   B
10 2019-06-24 Athlete1 Broad_Jump 192.000000   A

R Script

library(rmarkdown)
library(knitr)
library(dplyr)
library(ggplot2)

WT <- read.csv("WT.csv")

WT_10m <- WT %>%
  filter(Test == "10m_Sprint") %>%
  select(Date, Athlete, Hub, Average)

plot2 <- ggplot(WT_10m, aes(x=Date, y=Average))+
  geom_point()

for (hub in unique(WT$Hub)){
  subgroup <- subset(WT, Hub == hub)
  render(
    input = "Hub_Test.rmd",
    params = list("plot2"=plot2),
    output_file = paste0('report.', hub, '.pdf'))
}

.Rmd File

---
title: "WT Monitoring: Hub"
output: pdf_document
params:
  plot2: plot2
  hub:
    label: "hub"
    value: A
    choices: [A, B]
---


```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)

library(rmarkdown)
library(knitr)
library(dplyr)
library(ggplot2)

WT <- read.csv("WT.csv")

subgroup <- subset(WT, Hub == hub)

subgroup_Broad <- subgroup %>%
  filter(Test == "Broad_Jump") %>%
  select(Date, Athlete, Hub, Average)
```

```{r}
ggplot(subgroup_Broad, aes(x=Date, y= Average)) +
  geom_point()
```

```{r}
params$plot2
```

I'm not sure whether to use params, the render script, or some other method to accomplish this task. There are several hubs and tests, so I'm trying to avoid having a separate R Markdown template for each hub.

I have no idea whether this is the most efficient solution (probably not). It was also not completely clear to me exactly which code you want to run, so below you will find something general that you will have to adjust a bit to your specific situation.

If I understand correctly, you want to run the markdown at least twice with slightly different settings. What I would do is indeed use params and create a function that wraps the render call. I think you had it almost right, but I am unsure about your label, value and choices part, so that is the thing I would change.

So in the markdown you need to specify the params with a default setting (I also include title, as I think each report should have its own title). I assumed here that you want an HTML document, but you can of course adjust that.

---
author: "Author"
date: "Dec 24, 2019"
params:
    hub: "B"
    title: "Some basic title"
output: html_document
---

The first thing you should put after specifying the author, date, params and output is:

---
title: `r params$title`
---

The dashes need to be included. This will give your document the specific title you want. After this you can do whatever you want with the code until you reach the point where you want to include plot2 (the sprint plot). There I would do something like:

if (params$hub == "A") {
  # YOUR PLOT CODE FOR PLOT 2
}

So the plot code will only run if the hub param is A (or, if you actually want it for multiple hubs, use %in% c("A","") instead). You can put more code after this if you want.
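Since there are several hubs and tests, the condition can also be driven by a small lookup table instead of hard-coded hub names. This is only a rough sketch: `hub_tests` and `wants_plot()` are hypothetical names, the hub-to-test mapping is an assumption based on the example in the question, and `params` is mocked here as a plain list so the snippet runs outside of knitting.

```r
# Hypothetical lookup: which tests each hub's report should plot.
# Hub and test names mirror the sample data; the mapping itself is an assumption.
hub_tests <- list(
  A = c("Broad_Jump", "10m_Sprint"),
  B = c("Broad_Jump")
)

# Return TRUE when the report for `hub` should include a plot for `test`.
# Hubs missing from the lookup fall back to the Broad Jump plot only.
wants_plot <- function(hub, test, lookup = hub_tests) {
  tests <- lookup[[hub]]
  if (is.null(tests)) tests <- "Broad_Jump"
  test %in% tests
}

# Stand-in for the `params` list that R Markdown provides while knitting.
params <- list(hub = "A")

if (wants_plot(params$hub, "10m_Sprint")) {
  # The sprint plot code would go here.
  message("Hub ", params$hub, ": include the 10m sprint plot")
}
```

Inside the .Rmd the same check could probably also go into the chunk header, e.g. `eval = wants_plot(params$hub, "10m_Sprint")`, since knitr evaluates chunk options as R expressions.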

Then in the script you can use the render function to change the params. I put it in my own function; again, you can change the name of the function if you want, and be sure to check that the .html extension matches your markdown document's output format. I use this user-defined function so I can run multiple inputs through render and get a different output for each.

render_html_fun <- function(hub_in){
  rmarkdown::render('FILE LOCATION WHERE YOU SAVE THE RMD FILE.Rmd',
                    output_file = paste0('SOME TITLE', hub_in, "_", Sys.Date(), '.html'),
                    params = list(hub = hub_in,
                                  title = paste0("SOME BASIC TITLE", hub_in)))
}

So basically your knitted document will be saved under the basic title + the hub, so that you know beforehand which file you are opening. The next thing to do in your script is to put your hubs into the function one by one. As a result you will get a separate document for each hub.

hubs_input <- c("B", "A")
library(purrr)
walk(hubs_input, render_html_fun)

Personally, I use the c() function and specify the values manually, so that:

1. you do not have to load the data twice (once in the script and once in the markdown), which is best avoided, especially when your data is big;

2. you can choose not to run the script for a specific level (hub) if you do not want to.

But of course, if you prefer, you can also replace the c() part with your unique(WT$Hub) part.
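If you do go the unique(WT$Hub) route, it might look roughly like the sketch below. The small `WT` data frame stands in for WT.csv, and `render_args()` plus the file name and title patterns are made up for illustration. The actual render call is left commented out because it needs rmarkdown, pandoc and the .Rmd file on disk.

```r
# Small stand-in for WT.csv (only the columns needed here).
WT <- data.frame(
  Athlete = c("Athlete1", "Athlete2", "Athlete3", "Athlete1"),
  Hub     = c("A", "B", "B", "A"),
  stringsAsFactors = FALSE
)

# One entry per hub that actually occurs in the data.
hubs_input <- sort(unique(WT$Hub))

# Build the arguments for one render() call.
# The file name and title patterns are assumptions; adjust as needed.
render_args <- function(hub_in, date = Sys.Date()) {
  list(
    output_file = paste0("report_", hub_in, "_", date, ".html"),
    params = list(
      hub   = hub_in,
      title = paste("WT Monitoring: Hub", hub_in)
    )
  )
}

all_args <- lapply(hubs_input, render_args)
names(all_args) <- hubs_input

# To render for real (requires rmarkdown, pandoc and the .Rmd file):
# for (args in all_args) {
#   do.call(rmarkdown::render, c(list(input = "Hub_Test.rmd"), args))
# }
```

Building the argument lists first makes it easy to inspect what will be rendered, or to drop a hub from `all_args`, before anything is knitted.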

Edit: I also found the website very useful.
