
How to apply gsub to HTML text in R

I have this vector in R called t:

"<!-- html table generated in R 3.0.0 by xtable 1.7-1 package -->\n<!-- Wed May 07 13:40:25 2014 -->\n<TABLE border=1>\n<TR> <TH>  </TH> <TH> Description </TH> <TH> Value </TH>  </TR>\n

I need to add a bgcolor attribute to the TH tags, as follows:

t <- gsub("\\<TH\\> Description \\<\\/TH\\> \\<TH\\> Value \\<\\/TH\\>","\\<TH bgcolor\\="#CAC740"\\> Description \\<\\/TH\\> \\<TH bgcolor\\="#CAC740"\\> Value \\<\\/TH\\>",t)

I've made sure that I covered all the double quotes. It looks like the gsub is not working. Any ideas what might be wrong here?

You could use the XML package.

library(XML)

html <- "<!-- html table generated in R 3.0.0 by xtable 1.7-1 package -->\n<!-- Wed May 07 13:40:25 2014 -->\n<TABLE border=1>\n<TR> <TH>  </TH> <TH> Description </TH> <TH> Value </TH>  </TR>\n"

doc <- htmlParse(html)

for (x in c("Description", "Value")) {
    xpath <- sprintf("//th[contains(string(.), '%s')]", x)
    node <- getNodeSet(doc, xpath)[[1]]
    addAttributes(node, bgcolor = "#CAC740")
}

f <- file()
saveXML(doc, f)

paste(tail(readLines(f), -1), collapse = "")
## [1] "<!-- html table generated in R 3.0.0 by xtable 1.7-1 package --><!-- Wed May 07 13:40:25 2014 --><html><body><table border=\"1\"><tr><th>  </th> <th bgcolor=\"#CAC740\"> Description </th> <th bgcolor=\"#CAC740\"> Value </th>  </tr></table></body></html>"
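
If a plain string replacement is preferred over parsing, here is a minimal sketch of a gsub variant. Two things probably break the original call: in R's default regex syntax `\\<` and `\\>` are word-boundary markers rather than literal angle brackets, and the double quotes around `#CAC740` end the replacement string early. The sketch assumes the headers appear exactly as `<TH> Description </TH>` and `<TH> Value </TH>`, with the same spacing as in the input; it uses `fixed = TRUE` so nothing needs escaping, and single-quoted strings so the inner double quotes can stay as they are.

```r
# Same input string as in the answer above.
html <- "<!-- html table generated in R 3.0.0 by xtable 1.7-1 package -->\n<!-- Wed May 07 13:40:25 2014 -->\n<TABLE border=1>\n<TR> <TH>  </TH> <TH> Description </TH> <TH> Value </TH>  </TR>\n"

for (x in c("Description", "Value")) {
    # Literal text to look for; fixed = TRUE means "<" and ">" are not special.
    old <- sprintf("<TH> %s </TH>", x)
    # Single quotes let the replacement contain double quotes unescaped.
    new <- sprintf('<TH bgcolor="#CAC740"> %s </TH>', x)
    html <- gsub(old, new, html, fixed = TRUE)
}

# Check that both headers now carry the attribute.
grepl('<TH bgcolor="#CAC740"> Description </TH> <TH bgcolor="#CAC740"> Value </TH>', html, fixed = TRUE)
## [1] TRUE
```

Unlike the XML approach, this keeps the original markup (upper-case tags, newlines) untouched, but it only matches headers written with exactly this spacing.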

