
Scrape entire site with relative links

I am currently working on a PHP script based on Symfony DomCrawler and Goutte. They make it fairly easy to scrape tags and selectors, but is there an easy way to scrape an entire site and prepend the full URL to all the relative links in the source code?

When I create an instance of my crawl class I specify the page URL, and I just want to prepend that URL to all the local links on the page. Any ideas?

Are you tied to PHP? If not, you could use Zillabyte's domain_crawler component from the shell:

$ zillabyte execute domain_crawl "example.com" --output_file some_file
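
If you would rather stay in PHP, here is a minimal sketch of the link-rewriting part using only the built-in DOM extension. The function names `absolutize` and `absolutizeLinks` are made up for this example, the resolver is a simplified approximation rather than a full RFC 3986 implementation, and it assumes the page URL you pass in is an absolute http(s) URL. Note that DomCrawler can also do the resolving for you when you only need the link targets: `$crawler->filter('a')->links()` returns `Link` objects whose `getUri()` is already absolute.

```php
<?php
// Resolve a possibly relative URL against the URL of the page it was found on.
// Simplified sketch: assumes $pageUrl is absolute (scheme and host present).
function absolutize(string $href, string $pageUrl): string
{
    // Anything with its own scheme (http:, https:, mailto:, ...) is left as-is.
    if (preg_match('~^[a-z][a-z0-9+.-]*:~i', $href)) {
        return $href;
    }
    $base     = parse_url($pageUrl);
    $origin   = $base['scheme'] . '://' . $base['host']
              . (isset($base['port']) ? ':' . $base['port'] : '');
    $basePath = $base['path'] ?? '/';

    // Protocol-relative link: reuse the page's scheme.
    if (strpos($href, '//') === 0) {
        return $base['scheme'] . ':' . $href;
    }
    // Split the link into its path and its "?query#fragment" suffix.
    $cut    = strcspn($href, '?#');
    $path   = (string) substr($href, 0, $cut);
    $suffix = (string) substr($href, $cut);

    if ($path === '') {
        // Empty or fragment-only links keep the page's own query string.
        $keepQuery = ($suffix === '' || $suffix[0] === '#') && isset($base['query']);
        return $origin . $basePath . ($keepQuery ? '?' . $base['query'] : '') . $suffix;
    }
    if ($path[0] !== '/') {
        // Relative path: start from the directory of the current page.
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $path;
    }
    // Collapse "." and ".." segments without climbing above the root.
    $segments = explode('/', $path);
    $out = [];
    foreach ($segments as $seg) {
        if ($seg === '..') {
            if (count($out) > 1) {
                array_pop($out);
            }
        } elseif ($seg !== '.') {
            $out[] = $seg;
        }
    }
    if (in_array(end($segments), ['.', '..'], true)) {
        $out[] = ''; // keep the trailing slash for paths ending in a dot segment
    }
    return $origin . implode('/', $out) . $suffix;
}

// Rewrite href/src attributes in an HTML string so they are all absolute.
function absolutizeLinks(string $html, string $pageUrl): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();
    $targets = ['a' => 'href', 'link' => 'href', 'img' => 'src', 'script' => 'src'];
    foreach ($targets as $tag => $attr) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            if ($node->hasAttribute($attr)) {
                $node->setAttribute($attr, absolutize($node->getAttribute($attr), $pageUrl));
            }
        }
    }
    return $doc->saveHTML();
}

echo absolutize('../img/logo.png', 'http://example.com/blog/post.html'), PHP_EOL;
// http://example.com/img/logo.png
```

To cover a whole site you would call `absolutizeLinks` on the body of each page you fetch and queue every rewritten `a` link that stays on the same host. `DOMDocument::loadHTML` may guess the wrong character encoding for pages that do not declare one, so treat that part as a starting point.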
