
R knitr: use spin() with R and Python code

With the advent of reticulate, combining R and Python in a single .Rmd document has become increasingly popular in the R community (myself included). My personal workflow usually starts with an R script and, at some point, I create a shareable report using knitr::spin() with the plain .R document as input, in order to avoid code duplication (see also "Knitr's best hidden gem: spin" for more on the topic).

However, as soon as Python code is involved in my analysis, I am currently forced to break this workflow and manually convert (i.e. copy and paste) my initial .R script into .Rmd before compiling the report. Does anybody know whether it is, or ever will be, possible to make knitr::spin() work with both R and Python code chunks in a single .R file without taking this detour? I mean, just as it works when mixing the two languages, and exchanging objects between them, in a .Rmd file. To the best of my knowledge, there is currently no way to add something like engine = 'python' to spin documents.

Using reticulate::source_python() could be one solution.

For example, here is a simple .R script which will be spun to .Rmd and then rendered to .html:

spin-me.R

#'---
#'title: R and Python in a spin file.
#'---
#'
#' This is an example of one way to write a single R script that contains both
#' R and Python and can be spun to .Rmd via knitr::spin.
#'
#+ label = "setup"
library(nycflights13)
library(ggplot2)
library(reticulate)
use_condaenv()

#'
#' Create the file flights.csv for the Python script to read.
#'
#+ label = "create_flights_csv"
write.csv(flights, file = "flights.csv")

#'
#' The file flights.py will read in the data from the flights.csv file.  It can
#' be evaluated in this script via source_python().  This should add a data.frame
#' called `py_flights` to the workspace.
source_python(file = "flights.py")

#'
#' And now, plot the results.
#'
#+ label = "plot"
ggplot(py_flights) + aes(carrier, arr_delay) + geom_point() + geom_jitter()


# /* spin and knit this file to html
knitr::spin(hair = "spin-me.R", knit = FALSE)
rmarkdown::render("spin-me.Rmd")
# */

The Python file is:

flights.py

import pandas
py_flights = pandas.read_csv("flights.csv")
py_flights = py_flights[py_flights['dest'] == "ORD"]
py_flights = py_flights[['carrier', 'dep_delay', 'arr_delay']]
py_flights = py_flights.dropna()

And a screen capture of the resulting .html is:

[Image: screen capture of the rendered .html report]

EDIT: If keeping everything in one file is a must, then before the source_python() call you could create the Python file from within the R script, e.g.

pycode <-
'import pandas
py_flights = pandas.read_csv("flights.csv")
py_flights = py_flights[py_flights["dest"] == "ORD"]
py_flights = py_flights[["carrier", "dep_delay", "arr_delay"]]
py_flights = py_flights.dropna()
'
cat(pycode, file = "temp.py")
source_python(file = "temp.py")

My opinion: keeping the Python code in its own file is preferable to creating it from within the R script, for two reasons:

  1. Easier reuse of the Python code.
  2. Syntax highlighting in my IDE is lost for the Python code when it is written as a string and not in its own file.
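
To illustrate the reuse point, here is a minimal sketch of how flights.py could be organised around a function instead of top-level statements. The function name `load_dest_delays` and its `dest` parameter are hypothetical and are not part of the answer above; the filtering steps are the same ones used in flights.py, and pandas is assumed to be installed in the Python environment that reticulate uses.

```python
import pandas


def load_dest_delays(csv_path, dest="ORD"):
    """Return carrier and delay columns for flights to `dest`, dropping incomplete rows.

    Hypothetical helper: the name and the `dest` parameter are illustrative only.
    """
    flights = pandas.read_csv(csv_path)
    # Keep only the flights that go to the requested destination.
    flights = flights[flights["dest"] == dest]
    # Keep the columns used for plotting and drop rows with missing values.
    flights = flights[["carrier", "dep_delay", "arr_delay"]]
    return flights.dropna()
```

Since source_python() makes functions defined in the sourced file callable from the R session, the R script could then obtain the data with something like `py_flights <- load_dest_delays("flights.csv")`, and the same function could be imported from other Python code without change.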
