Unable to open the link in proxy

I am using a proxy to scrape data from some sites. The problem is that sometimes a proxy URL returns nothing, and the program stops after a few tries. I need some logic to handle this, so that when an IP does not respond the program renews the IP and tries to open the page again. I am using Tor as the proxy, from Python.

Here is my code for opening the page:

import requests
from lxml import html

mainPage = requests.get("http://proxy_IP/?link=http://example.com/")
mainTree = html.fromstring(mainPage.text)

You can simply put your code in a while loop and give it a condition that is true only when the page has opened properly, for example an XPath check for something you expect to find on the page.

mainPage = requests.get("http://proxy_IP/?link=http://example.com/")
mainTree = html.fromstring(mainPage.text)

# Keep re-requesting until the XPath condition holds, i.e. the expected content is present.
while not mainTree.xpath('boolean(some_xpath_to_be_true)'):
    mainPage = requests.get("http://proxy_IP/?link=http://example.com/")
    mainTree = html.fromstring(mainPage.text)

Once the loop exits, mainTree contains the correctly loaded page source.
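
One caveat with the loop above: it never ends if the proxy keeps failing, and it never renews the IP, which is what the question asked for. An empty response body is also a risk, because lxml generally raises a ParserError on an empty document. Below is a minimal sketch of a bounded retry that addresses both. The names fetch_with_retry, fetch, is_valid and renew_ip are hypothetical and do not come from the original post. The renewal step is passed in as a callable, because how you renew the IP depends on your Tor setup.

```python
import time


def fetch_with_retry(fetch, is_valid, renew_ip=None, max_attempts=5, delay=0.0):
    """Call fetch() until is_valid(text) is true or the attempts run out.

    fetch    -- callable returning the page text (may raise on network errors)
    is_valid -- callable that decides whether the returned text is usable
    renew_ip -- optional callable invoked after each failed attempt
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            text = fetch()
            # Treat an empty body as a failure before running the validity check.
            if text and is_valid(text):
                return text
        except Exception as exc:  # broad on purpose in this sketch; narrow it in real code
            last_error = exc
        if attempt < max_attempts:
            if renew_ip is not None:
                renew_ip()  # e.g. ask Tor for a new circuit (assumption, not shown here)
            time.sleep(delay)
    raise RuntimeError(f"no valid response after {max_attempts} attempts") from last_error


# Hypothetical wiring with the names used in the post:
# text = fetch_with_retry(
#     fetch=lambda: requests.get("http://proxy_IP/?link=http://example.com/", timeout=30).text,
#     is_valid=lambda t: html.fromstring(t).xpath('boolean(some_xpath_to_be_true)'),
#     renew_ip=my_renew_function,  # placeholder for your own IP-renewal code
#     delay=5,
# )
# mainTree = html.fromstring(text)
```

Capping the attempts means a dead proxy produces a clear error instead of a hang, and the delay gives a freshly requested circuit some time to come up before the next try.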
