I'm downloading a Google Doc as .docx and then converting to markdown for manipulation and export to multiple formats.
Problem: when I convert with pandoc, it strips the title (and subtitle) and does not add any YAML header information. I could add the title to the header manually, but the process needs to be scripted. Ideally the conversion would keep the title; failing that, I need to extract the title from the docx and write it to a YAML header, which would then be concatenated with the converted markdown file.
Example code, where the title is lost on conversion from docx to markdown:
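library(rmarkdown)
# Download a sample .docx into the session's temporary directory (binary mode keeps the file intact on Windows)
examplefile <- file.path(tempdir(), "example.docx")
download.file("https://file-examples.com/wp-content/uploads/2017/02/file-sample_100kB.docx", destfile = examplefile, mode = "wb")
# pandoc_convert() runs in the input file's directory by default, so example.rmd is written to tempdir()
pandoc_convert(examplefile, to = "markdown", output = "example.rmd", options = c("--extract-media=."))
render(file.path(tempdir(), "example.rmd"), "html_document")
browseURL(file.path(tempdir(), "example.html"))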
When converting from docx to markdown (or another markup format like rst), you need to include the -s or --standalone option.
From the pandoc documentation:
-s, --standalone
Produce output with an appropriate header and footer (e.g. a standalone HTML, LaTeX, TEI, or RTF file, not a fragment). This option is set automatically for pdf, epub, epub3, fb2, docx, and odt output. For native output, this option causes metadata to be included; otherwise, metadata is suppressed.
Without the -s option, this metadata is suppressed.
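Applied to the R code in the question, this could look like the minimal sketch below. The helper name docx_to_markdown and its arguments are made up for illustration; only pandoc_convert() and the pandoc flags come from rmarkdown and pandoc themselves. It assumes the title in the .docx uses Word's "Title" style, which is how pandoc recognises it as metadata, and the exact form of the header written to the markdown file may vary with the pandoc version.

```r
library(rmarkdown)

# Hypothetical helper (not part of rmarkdown): convert a .docx to markdown
# with --standalone so that title metadata is written into the output.
docx_to_markdown <- function(docx_path, output_name = "example.Rmd") {
  # Resolve to an absolute path so the input is still found after
  # pandoc_convert() switches to the working directory given in `wd`.
  docx_path <- normalizePath(docx_path, mustWork = TRUE)
  out_dir <- dirname(docx_path)
  pandoc_convert(
    docx_path,
    to = "markdown",
    output = output_name,
    options = c("--standalone", "--extract-media=."),
    wd = out_dir
  )
  # Return the full path of the converted file
  file.path(out_dir, output_name)
}
```

With the file from the question, docx_to_markdown(examplefile) would return the path of the converted file, which can then be passed to render() as before.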