
Is it possible to automatically scrape articles from websites - Python & Beautiful Soup

I'm trying to write a script that scrapes one or two articles (article URLs only) from different websites. I was able to write a Python script that uses BeautifulSoup to get a website's HTML, find the navbar menu by its class name, and loop through each section of the site. The problem is that each website has a different class name or XPath for the navbar menu and its sections.

Is there a way to make the script work for multiple websites with as little human intervention as possible?

Any suggestions are more than welcome,

Thanks

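Got it working. I only needed Python and Selenium: one XPath per website for the navbar elements, and another XPath for all the types of articles on that website's pages. Everything is saved in a database, and the rest is just customization for our specific needs. In the end it wasn't that complicated. Thanks for your help <3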
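
For anyone landing here later, this is a rough sketch of what that setup could look like, not the exact script. The `SITE_CONFIG` table, the XPath expressions, the `example.com` entry, the `articles` table layout, and the helper names are all made up for illustration. It assumes Selenium 4, where `driver.find_elements("xpath", ...)` is equivalent to passing `By.XPATH`, and it uses SQLite only because it ships with Python.

```python
import sqlite3
from urllib.parse import urlparse

# Hypothetical per-site settings: one XPath for the navbar links and one
# for article links. Each new site only needs a new entry here.
SITE_CONFIG = {
    "example.com": {
        "nav_xpath": "//nav//a[@href]",
        "article_xpath": "//article//a[@href]",
    },
}

XPATH = "xpath"  # same string value as selenium's By.XPATH


def collect_hrefs(driver, page_url, xpath):
    """Load page_url and return the unique href values matched by xpath."""
    driver.get(page_url)
    hrefs = []
    for element in driver.find_elements(XPATH, xpath):
        href = element.get_attribute("href")
        if href and href not in hrefs:
            hrefs.append(href)
    return hrefs


def scrape_site(driver, start_url, conn, per_section=2):
    """Store up to per_section article URLs for every navbar section."""
    site = urlparse(start_url).netloc
    config = SITE_CONFIG[site]  # KeyError means the site has no config yet
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles "
        "(site TEXT, section TEXT, url TEXT UNIQUE)"
    )
    for section_url in collect_hrefs(driver, start_url, config["nav_xpath"]):
        articles = collect_hrefs(driver, section_url, config["article_xpath"])
        for article_url in articles[:per_section]:
            conn.execute(
                "INSERT OR IGNORE INTO articles VALUES (?, ?, ?)",
                (site, section_url, article_url),
            )
    conn.commit()


def run(start_url, db_path="articles.db"):
    """Example entry point; needs the selenium package and a local Chrome."""
    from selenium import webdriver  # imported lazily so the helpers stay testable

    driver = webdriver.Chrome()
    conn = sqlite3.connect(db_path)
    try:
        scrape_site(driver, start_url, conn)
    finally:
        conn.close()
        driver.quit()
```

Calling `run("https://example.com/")` would walk the navbar of the configured site and store at most two article URLs per section. The per-site XPaths still have to be written by hand once, so this reduces the manual work to one config entry per site rather than removing it.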
