
Extract package names from R scripts

I am trying to write a function to extract package names from a list of R script files. My regular expressions do not seem to be working and I am not sure why. For starters, I am not able to match lines that include library(. For example:

str <- c("           library(abc)", "library(def)", "some other text")
grep("library\\(", str, value = TRUE)
grep("library\\(+[A-z]\\)", str, value = TRUE)

Why does my second grep not return elements 1 and 2 of the str vector? I have tried many variations, but all my results come back empty.

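Your second grep does not return 1 and 2 for two reasons:

  1. You used value=TRUE, which makes grep return the matching strings instead of their positions; and
  2. You misplaced the +. In library\\(+[A-z]\\) it applies to \\(, so the pattern asks for one or more opening parentheses followed by exactly one letter and a closing parenthesis. You want grep("library\\(\\w+\\)", str)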
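To see the difference concretely, here is a small sketch using the str vector from the question. The regmatches()/regexpr() step and the names idx, hits, m and pkgs are additions for illustration, not part of the original answer. It also assumes package names contain only word characters, so a name with a dot such as data.table would need a wider character class.

```r
str <- c("           library(abc)", "library(def)", "some other text")

# Question's pattern: + applies to \\(, and [A-z] matches a single character,
# so "(abc)" with three letters between the parentheses cannot match.
grep("library\\(+[A-z]\\)", str)

# Corrected pattern: + now applies to \\w (one or more word characters).
idx  <- grep("library\\(\\w+\\)", str)                # positions: 1 2
hits <- grep("library\\(\\w+\\)", str, value = TRUE)  # the matching strings

# Extract only the package names from the matching elements.
m    <- regmatches(str, regexpr("library\\(\\w+\\)", str))
pkgs <- sub("library\\((\\w+)\\)", "\\1", m)          # "abc" "def"
```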

If you'd like something a bit more robust that handles some edge cases (library() takes a number of parameters, and the package one can be a name/symbol or a string and doesn't have to be specified first):

library(purrr)

script <-  '
library(js) ; library(foo)
#
library("V8")
ls()
library(package=rvest)
TRUE
library(package="hrbrthemes")
1 + 1
library(quietly=TRUE, "ggplot2")
library(quietly=TRUE, package=dplyr, verbose=TRUE)
'
x <- parse(textConnection(script)) # parse w/o eval

keep(x, is.language) %>%                       # `library()` is a language object
  keep(~languageEl(.x, 1) == "library") %>%    # other things are too, so only keep `library()` ones
  map(as.call) %>%                             # turn it into a `call` object 
  map(match.call, definition = library) %>%    # so we can match up parameters and get them in the right order
  map(languageEl, 2) %>%                       # language element 1 is `library`
  map_chr(as.character) %>%                    # turn names/symbols into characters
  sort()                                       # why not
## [1] "dplyr"      "foo"        "ggplot2"    "hrbrthemes" "js"         "rvest"      "V8"
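The match.call() step is what makes the argument order irrelevant. A minimal illustration on a single call, using only base R (the names cl and matched are made up for this example):

```r
# An unevaluated call where the package is not the first argument
cl <- quote(library(quietly = TRUE, "ggplot2"))

# Match the supplied arguments against library()'s formal arguments.
# Nothing is evaluated, so no package is loaded here.
matched <- match.call(definition = library, call = cl)

matched$package           # "ggplot2"
names(matched)[2]         # "package": it is now the first argument
```

Because the matched call lists arguments in the order of library()'s formals, the package always ends up as language element 2, which is what the pipeline above relies on.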

This won't catch library() calls inside functions (it could be expanded to do that), but if top-level edge cases are infrequent, ones inside functions are even less likely (those would likely use require() as well).
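One possible way to expand it is to walk each parsed expression recursively. The sketch below is base R only. find_pkgs is a hypothetical helper name, and the sample script is invented for the example. It assumes library and require are called by their bare names (so base::library(foo) is not detected), and for calls that use character.only = TRUE it returns the variable name rather than the package.

```r
# Recursively collect package names from library()/require() calls,
# including calls nested inside function bodies. Nothing is evaluated.
find_pkgs <- function(expr) {
  if (!is.call(expr)) return(character(0))
  fn <- expr[[1]]
  if (is.name(fn) && as.character(fn) %in% c("library", "require")) {
    def <- get(as.character(fn), envir = baseenv())
    matched <- match.call(definition = def, call = expr)
    return(as.character(matched$package))   # character(0) if no package given
  }
  out <- character(0)
  for (i in seq_along(expr)) {
    # Skip empty arguments such as the second index in x[1, ]
    if (identical(expr[[i]], quote(expr = ))) next
    out <- c(out, find_pkgs(expr[[i]]))
  }
  out
}

script <- '
library(js)
f <- function(x) {
  require("V8")
  if (x) library(quietly = TRUE, package = dplyr)
  x[1, ]
}
'
exprs <- parse(text = script)               # parse without evaluating
pkgs <- unique(unlist(lapply(exprs, find_pkgs)))
pkgs                                        # "js" "V8" "dplyr"
```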
