
Website error after scraping

I made a simple scraper that opens an album page on azlyrics.com and scrapes the lyrics for each song.

After the scraper had been running for about an hour, the website stopped loading, with these errors:

Chrome:

www.azlyrics.com didn't send any data. ERR_EMPTY_RESPONSE

Tor, Firefox, Waterfox:

The connection was reset. The connection to the server was reset while the page was loading.

It's the same for all devices on my home network. If I access the site from my phone over mobile data, it works fine.

I tried fixing it with ipconfig /release /renew, but it didn't work. I'm at a loss as to what else I could do or why it even happened. Any help is greatly appreciated.

Apparently your IP was banned by the website for suspicious activity. There are a couple of ways around that:

  • Talk to the website owners. This is the most straightforward and the nicest way.
  • Change your IP, e.g. by connecting through a pool of public proxies or Tor. This is a bit of a hack and not very robust: you can still be banned by User-Agent or some other property of your scraper.
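
The second option could look roughly like the sketch below, which sends each request through a randomly chosen HTTP proxy and sets an explicit User-Agent. It uses only the Python standard library. The proxy addresses, the User-Agent string, and the helper names build_opener_for and fetch are placeholders made up for illustration, not anything from the original scraper. Note that urllib only speaks to HTTP(S) proxies; Tor exposes a SOCKS proxy, which would need a different client.

```python
import random
import urllib.request

# Hypothetical proxy addresses (documentation-only IP range).
# Replace them with proxies you are actually allowed to use.
PROXIES = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:3128",
]

# Hypothetical User-Agent string; an assumption, not a value the site requires.
USER_AGENT = "lyrics-scraper/0.1 (contact: you@example.com)"


def build_opener_for(proxy_url, user_agent=USER_AGENT):
    """Return an opener that routes HTTP and HTTPS traffic through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    # Replace the default "Python-urllib/x.y" User-Agent header.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


def fetch(url, proxies=PROXIES, timeout=10):
    """Fetch url through one randomly chosen proxy and return the raw bytes."""
    opener = build_opener_for(random.choice(proxies))
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

Calling fetch("https://www.azlyrics.com/") would then go out through one of the listed proxies. Public proxies are often slow or dead, so a real scraper would probably also want retries and a pause between requests.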
