
How should I avoid 404 errors when scraping with R?

I'm accessing web pages by looping over a couple of variables that I insert into the URL.

There will be occasional 404 errors.

How do I insert some sort of catch for these pages to avoid breaking the code? I currently use the XML package, but could of course load others if appropriate.

TIA
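One way to do what the question asks, sketched here with base R's `tryCatch` and the XML package; the `safe_parse` helper, the injectable `reader` argument, and the URL template in the commented loop are all hypothetical names for illustration:

```r
# safe_parse: fetch and parse one page, returning NA instead of stopping
# when the request fails (e.g. a 404). 'reader' defaults to base R's
# readLines and is injectable so the error path can be exercised offline.
safe_parse <- function(url, reader = function(u) readLines(u, warn = FALSE)) {
  tryCatch(
    XML::htmlParse(paste(reader(url), collapse = "\n"), asText = TRUE),
    error = function(e) NA  # a 404 raises an error on the connection
  )
}

# Hypothetical loop over two variables inserted into the URL:
# for (a in vars1) {
#   for (b in vars2) {
#     page <- safe_parse(sprintf("http://example.com/%s/%d", a, b))
#     if (!identical(page, NA)) {
#       # ... extract data from 'page' with XML ...
#     }
#   }
# }
```

Pages that 404 simply come back as `NA`, so the loop keeps running instead of aborting.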

Most of the time I use RCurl::url.exists(). If you have a list or a data frame containing all the URLs, you can try this (map() is from purrr):

map(p, ~ifelse(RCurl::url.exists(.), ., NA))

HTH!
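Expanding the one-liner above into a reusable form, a sketch that assumes purrr and RCurl are installed; the `check_urls` helper and its injectable `exists_fn` argument (which lets the logic run offline) are hypothetical:

```r
library(purrr)

# check_urls: map each URL to itself if the server responds, NA otherwise.
# 'exists_fn' defaults to RCurl::url.exists but can be swapped out for testing.
check_urls <- function(urls, exists_fn = RCurl::url.exists) {
  map_chr(urls, ~ ifelse(exists_fn(.x), .x, NA_character_))
}

# Hypothetical usage: generate the URLs, drop the dead ones, then parse.
# p    <- sprintf("http://example.com/page/%d", 1:5)
# live <- check_urls(p)
# live <- live[!is.na(live)]   # only these go on to XML parsing
```

Note that url.exists() issues an extra request per page just to probe it, so on large URL lists it can be slower than fetching once and catching the error.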
