
Program stuck on a URL, can't move forward and won't time out

My program is a web crawler, and it's been stuck on a URL that apparently corresponds to a random Chinese site. For some reason it's not throwing an exception and the connection is not timing out. I would have thought that these lines would prevent that.

static URLConnection in;
in = curURL.openConnection();
in.setConnectTimeout(2000);
pageSource = new StreamedSource(in);

I'm nearly positive this is the issue; checking the heap dump for memory leaks turned up nothing.

setConnectTimeout() only controls the timeout for establishing the connection. Once the connection has been established, it can last for a long time (basically until the server closes it). For instance, you might be downloading a very large file over a slow link.
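As a complement, URLConnection also has setReadTimeout(), which bounds how long a single blocking read may wait for data. The sketch below sets both timeouts; the class name TimeoutConfig, the helper openWithTimeouts, and the millisecond values are made up for illustration and are not from the question. Note that a read timeout limits the wait between chunks of data, not the total transfer time, so a server that trickles bytes slowly can still hold the connection open for a long while.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

class TimeoutConfig {
    // Hypothetical helper: opens a connection with both timeouts configured.
    // openConnection() does no network I/O yet; that happens on connect()
    // or when the input stream is requested.
    static URLConnection openWithTimeouts(URL url, int connectMs, int readMs)
            throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(connectMs); // limit on establishing the connection
        conn.setReadTimeout(readMs);       // limit on each blocking read
        return conn;
    }
}
```

With this in place, a read that stalls for longer than readMs should fail with a java.net.SocketTimeoutException instead of blocking indefinitely.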

One solution would be to add a watchdog thread that monitors the connections and closes those that exceed some time limit.
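A minimal sketch of such a watchdog, built on a ScheduledExecutorService rather than a hand-written polling thread. The class name ConnectionWatchdog and its methods are hypothetical. It also assumes that closing the resource from another thread unblocks the reader, which holds for plain socket streams; for an HttpURLConnection, calling disconnect() from the watchdog task may be the more appropriate action.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class ConnectionWatchdog {
    // A single daemon thread is enough to fire all deadlines.
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "connection-watchdog");
                t.setDaemon(true);
                return t;
            });

    // Schedules resource.close() to run after limitMs milliseconds.
    // The caller cancels the returned future if the work finishes in time.
    public ScheduledFuture<?> watch(Closeable resource, long limitMs) {
        return timer.schedule(() -> {
            try {
                resource.close();
            } catch (IOException ignored) {
                // Nothing useful to do if closing fails.
            }
        }, limitMs, TimeUnit.MILLISECONDS);
    }

    // Stops the timer thread and drops any pending deadlines.
    public void shutdown() {
        timer.shutdownNow();
    }
}
```

In the crawler, one would obtain the stream with in.getInputStream(), pass it to watch() with an overall limit, read the page, and call cancel(false) on the returned future in a finally block so that a page that finished in time is not closed later.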
