
How to split LaTeX acronym definitions into an R data frame

I have a LaTeX file with my acronym definitions, like this:

\newacronym{AEP}{AEP}{Alimentation en Eau Potable}
\newacronym{AERMC}{AERMC}{Agence de l'Eau Rhône Méditerranée et Corse}
\newacronym[longplural=Cotes d'Abondance Numériques]{CAN}{CAN}{Cote d'Abondance Numérique}

My aim is a data frame with two columns, like this:

AEP     Alimentation en Eau Potable
AERMC   Agence de l'Eau Rhône Méditerranée et Corse
CAN     Cote d'Abondance Numérique

I think it should be possible with a regex or with strsplit, but I can't get it to work; the { characters in particular cause me a lot of problems. Here is my attempt:

library(readr)
library(dplyr)

acronymes <- read_lines("acronymes.tex")
acronymes <- tibble(Complet = acronymes) %>%
    filter(!grepl("^%", Complet)) # drop the commented-out lines I don't use
# My attempt: this keeps everything before the first "}", not just the acronym
acronymes$ABR <- sub("}.*", "", acronymes$Complet)

Do you have any ideas, or can you point me to a clear guide to regex patterns? Thank you.

This may not be the most elegant solution, but it works. The braces have to be escaped in the regex, and because the pattern is written as an R string, each escape needs a double backslash:

a <- readLines("acronymes.tex")
acronyms <- gsub(".*\\}\\{(.*)\\}\\{.*", "\\1", a)
descriptions <- gsub(".*\\}\\{(.*)\\}$", "\\1", a)
data.frame(acronyms, descriptions)
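
As an alternative, here is a minimal sketch that pulls both fields out in a single pass with base R's regexec and regmatches, and skips lines that are not \newacronym definitions (for example % comments). The names pattern, m and acronym_table are only illustrative, the sample vector stands in for readLines("acronymes.tex"), and the pattern assumes the definitions contain no nested braces:

```r
# Sample lines standing in for readLines("acronymes.tex")
a <- c(
  "% a commented-out line",
  "\\newacronym{AEP}{AEP}{Alimentation en Eau Potable}",
  "\\newacronym{AERMC}{AERMC}{Agence de l'Eau Rhône Méditerranée et Corse}",
  "\\newacronym[longplural=Cotes d'Abondance Numériques]{CAN}{CAN}{Cote d'Abondance Numérique}"
)

# Optional [...] block, then three {...} groups; the 2nd and 3rd are captured.
# Assumption: no nested braces inside any of the groups.
pattern <- "^\\\\newacronym(\\[[^]]*\\])?\\{[^}]*\\}\\{([^}]*)\\}\\{([^}]*)\\}"

m <- regmatches(a, regexec(pattern, a))
m <- m[lengths(m) > 0]  # drop lines that did not match (comments, blanks)

# Each element of m is: full match, optional [...] block, short form, long form
acronym_table <- data.frame(
  acronym = vapply(m, `[`, character(1), 3),
  description = vapply(m, `[`, character(1), 4),
  stringsAsFactors = FALSE
)
```

Using [^}]* instead of .* keeps each capture inside its own pair of braces, so the result does not depend on greedy matching. A description that itself contains braces, such as one using \textit{...}, would need a different approach.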
