Pandoc version 2.7.3 fails to convert knitr .tex file to .docx

I've been using knitr with .Rnw files in RStudio to generate both pdf and docx files without any issue until today. The pdf conversion runs natively in RStudio, and for the docx conversion I simply call pandoc under the hood, passing it the .tex file that results from 'knitting' the .Rnw file. So far I've been using pandoc version 1.19.2.1 and it works just fine. However, after sharing some of my code with a colleague, I've realized that this strategy fails with a newer version of pandoc (2.7.3).

I've tried updating knitr and making sense of the error, without much success. The issue appears only when the generated .tex file contains shaded code areas, which typically happens when a chunk sets echo=TRUE.

This is my Rnw file (min_reproducible_example.Rnw)

\documentclass{article}
\usepackage{multirow}
\setlength\parindent{0pt}
\usepackage{geometry}
\usepackage{longtable}
\usepackage{float}
\usepackage{verbatim}
\usepackage{hyperref}
\geometry{left=1.5cm,right=1.5cm,top=1.5cm,bottom=1.5cm}

\title{Docx from tex file example}

\begin{document}

\maketitle

<<chunk1,echo=TRUE,message=FALSE>>=
library(survival) 

str(lung)
@

\end{document}

After hitting 'Compile PDF' in RStudio, this generates the files min_reproducible_example.pdf and min_reproducible_example.tex.

Just in case, the .tex output of the .Rnw file (min_reproducible_example.tex) is:

\documentclass{article}\usepackage[]{graphicx}\usepackage[]{color}
% maxwidth is the original width if it is less than linewidth
% otherwise use linewidth (to make sure the graphics do not exceed the margin)
\makeatletter
\def\maxwidth{ %
  \ifdim\Gin@nat@width>\linewidth
    \linewidth
  \else
    \Gin@nat@width
  \fi
}
\makeatother

\definecolor{fgcolor}{rgb}{0.345, 0.345, 0.345}
\newcommand{\hlnum}[1]{\textcolor[rgb]{0.686,0.059,0.569}{#1}}%
\newcommand{\hlstr}[1]{\textcolor[rgb]{0.192,0.494,0.8}{#1}}%
\newcommand{\hlcom}[1]{\textcolor[rgb]{0.678,0.584,0.686}{\textit{#1}}}%
\newcommand{\hlopt}[1]{\textcolor[rgb]{0,0,0}{#1}}%
\newcommand{\hlstd}[1]{\textcolor[rgb]{0.345,0.345,0.345}{#1}}%
\newcommand{\hlkwa}[1]{\textcolor[rgb]{0.161,0.373,0.58}{\textbf{#1}}}%
\newcommand{\hlkwb}[1]{\textcolor[rgb]{0.69,0.353,0.396}{#1}}%
\newcommand{\hlkwc}[1]{\textcolor[rgb]{0.333,0.667,0.333}{#1}}%
\newcommand{\hlkwd}[1]{\textcolor[rgb]{0.737,0.353,0.396}{\textbf{#1}}}%
\let\hlipl\hlkwb

\usepackage{framed}
\makeatletter
\newenvironment{kframe}{%
 \def\at@end@of@kframe{}%
 \ifinner\ifhmode%
  \def\at@end@of@kframe{\end{minipage}}%
  \begin{minipage}{\columnwidth}%
 \fi\fi%
 \def\FrameCommand##1{\hskip\@totalleftmargin \hskip-\fboxsep
 \colorbox{shadecolor}{##1}\hskip-\fboxsep
     % There is no \\@totalrightmargin, so:
     \hskip-\linewidth \hskip-\@totalleftmargin \hskip\columnwidth}%
 \MakeFramed {\advance\hsize-\width
   \@totalleftmargin\z@ \linewidth\hsize
   \@setminipage}}%
 {\par\unskip\endMakeFramed%
 \at@end@of@kframe}
\makeatother

\definecolor{shadecolor}{rgb}{.97, .97, .97}
\definecolor{messagecolor}{rgb}{0, 0, 0}
\definecolor{warningcolor}{rgb}{1, 0, 1}
\definecolor{errorcolor}{rgb}{1, 0, 0}
\newenvironment{knitrout}{}{} % an empty environment to be redefined in TeX

\usepackage{alltt}
\usepackage{multirow}
\setlength\parindent{0pt}
\usepackage{geometry}
\usepackage{longtable}
\usepackage{float}
\usepackage{verbatim}
\usepackage{hyperref}
\geometry{left=1.5cm,right=1.5cm,top=1.5cm,bottom=1.5cm}

\title{Docx from tex file example}
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
\begin{document}

\maketitle

\begin{knitrout}
\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}\begin{kframe}
\begin{alltt}
\hlkwd{library}\hlstd{(survival)}
\end{alltt}
\end{kframe}
\end{knitrout}

\end{document}

Next, I call a wrapper function that runs the following command on the command line to generate the docx file:

path/to/pandoc/pandoc -o min_reproducible_example.docx min_reproducible_example.tex

I am working on Windows, so I have not checked whether the issue also occurs on other operating systems.

A couple of pieces of the output look informative.

I believe this line is the culprit:

Error at "source" (line 68, column 67):
unexpected end of input
\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}\begin{kframe}

From what I've been able to dig up, 'kframe' is a LaTeX environment that knitr defines when doing the 'knitting'. The failed pandoc call then shows up in R as the following warning:

Warning message:
In shell(command) :
  '"C:/pandoc/pandoc" -o min_reproducible_example.docx min_reproducible_example.tex --default-image-extension=png' execution failed with error code 65

I have no idea what error code 65 means. I've seen threads about earlier pandoc issues that suggest looking directly at the source code to understand an error. I can do that if need be; it is just odd to me that previous pandoc versions work and the newer one fails. I've decided to post this here in case anybody has run into the same issue.

I'm going to post one possible solution for future reference based on another conversation at pandoc-discuss.

After some back and forth with John MacFarlane, he kindly suggested that I redefine the environment that was causing pandoc trouble: kframe. He suggested redefining it simply as:

\renewenvironment{kframe}{}{}

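So I modified a custom R function of mine that calls pandoc internally. Only the relevant lines are included below; in them, fname is the file name without its extension and pdwd is the directory that contains the pandoc executable.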

  ## read the original .tex file (the knitted .Rnw output)
  tx  <- readLines(paste0(fname, '.tex'), warn = FALSE)
  ## redefine kframe as an empty environment right before \begin{document},
  ## as suggested by John MacFarlane in the pandoc-discuss thread
  tx2 <- gsub(pattern = "\\begin{document}",
              replacement = "\\renewenvironment{kframe}{}{}\\begin{document}",
              x = tx, fixed = TRUE)
  ## write a patched copy of the .tex file and use it in the pandoc call below
  zz <- file(paste0(fname, '_cp.tex'), "wb")
  writeLines(tx2, con = zz)
  close(zz)

  ## shell() is only available in R on Windows
  command <- paste0('"', pdwd, '/pandoc" -o ', fname, '.docx ', fname, '_cp.tex ',
                    "--default-image-extension=png ")
  shell(command)
  ## remove the temporary patched file
  file.remove(paste0(fname, '_cp.tex'))

With that change, pandoc runs without complaints. I also noticed that the earlier version of pandoc (1.19.2.1), although it runs without error, produces a docx file that lacks the contents of the kframe environment, whereas this fix gives a more accurate representation of the pdf.
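
For anyone not on Windows, here is a rough sketch of the same workaround that avoids shell(), which only exists in R on Windows. The function names patch_kframe and tex_to_docx are made up for this illustration and are not part of knitr or pandoc, and the pandoc argument is assumed to be either a full path or an executable found on the PATH. I have only reasoned through it, not tested it across platforms.

```r
# Insert the kframe override on its own line just before \begin{document}.
# Works on a character vector of .tex lines and returns the patched vector.
patch_kframe <- function(tex_lines) {
  override <- "\\renewenvironment{kframe}{}{}"
  idx <- grep("\\begin{document}", tex_lines, fixed = TRUE)
  if (length(idx) == 0L) return(tex_lines)  # no \begin{document}: leave as is
  append(tex_lines, override, after = idx[1] - 1L)
}

# Hypothetical wrapper: patch <fname>.tex into a temporary copy, run pandoc
# on the copy, and return pandoc's exit status (0 means success).
tex_to_docx <- function(fname, pandoc = "pandoc") {
  patched <- paste0(fname, "_cp.tex")
  tex <- readLines(paste0(fname, ".tex"), warn = FALSE)
  writeLines(patch_kframe(tex), patched)
  on.exit(file.remove(patched))  # clean up the temporary copy
  status <- system2(pandoc,
                    c("-o", shQuote(paste0(fname, ".docx")),
                      shQuote(patched),
                      "--default-image-extension=png"))
  invisible(status)
}
```

It would be called as tex_to_docx("min_reproducible_example"). Keeping the patching step in its own function makes it easy to check on a few lines of text without having pandoc installed.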

I haven't tested this fix extensively, so please leave a comment if you find any issues.
