Python GUI Scraper hanging issues

I wrote a scraper in Python a while back, and it worked fine on the command line. I have now made a GUI for the application, but I am having trouble with one issue. When I try to update text inside the GUI (e.g. 'fetching URL 12/50'), the update never shows, because the scraper function is busy grabbing 100+ links. Also, when going from one scraping function, to a function that should update the GUI, to another scraping function, the GUI update seems to be skipped while the next scrape function runs. An example would be:

scrapeLinksA() #takes 20 seconds
updateInfo("LinksA done")
scrapeLinksB() #takes another 20 seconds

In the above example, updateInfo is never executed unless I end the program with a KeyboardInterrupt.

I'm thinking my solution is threading, but I'm not sure. What can I do to fix this?

I am using:

  • PyQt4
  • urllib2
  • BeautifulSoup

I'd suggest using QNetworkAccessManager for a non-blocking way of downloading the websites. It's a different approach, so you will probably have to rewrite the handling part of your application. Instead of waiting until a page is downloaded so that you can parse it, you have multiple smaller functions connected via signals, and they are executed when certain events happen (e.g. "the page is downloaded").
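To give a feel for what "smaller functions connected via signals" looks like, here is a rough sketch in plain Python. It does not use Qt or QNetworkAccessManager at all: `Signal`, `FakeDownloader` and `Scraper` are hypothetical stand-ins, and the pages are canned strings rather than real downloads. Only the shape of the code is the point.

```python
class Signal(object):
    """Tiny stand-in for a Qt signal: remembers callbacks, calls them on emit."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        self._slots.append(slot)

    def emit(self, *args):
        for slot in list(self._slots):
            slot(*args)


class FakeDownloader(object):
    """Hypothetical downloader working on canned data (no network access).

    A real non-blocking downloader would return from get() immediately and
    emit `finished` later from the event loop; here it emits right away.
    """

    def __init__(self, pages):
        self.pages = pages          # maps url -> html text
        self.finished = Signal()

    def get(self, url):
        self.finished.emit(url, self.pages[url])


class Scraper(object):
    """Does the work in small steps, each triggered by a signal."""

    def __init__(self, downloader, urls):
        self.downloader = downloader
        self.pending = list(urls)
        self.total = len(self.pending)
        self.results = {}
        self.status = []            # stands in for the GUI status label
        downloader.finished.connect(self.on_page_downloaded)

    def start(self):
        self.fetch_next()

    def fetch_next(self):
        if self.pending:
            self.downloader.get(self.pending.pop(0))
        else:
            self.status.append("done")

    def on_page_downloaded(self, url, html):
        # Parsing would happen here; the sketch only records the page length.
        self.results[url] = len(html)
        self.status.append("fetched %d/%d" % (len(self.results), self.total))
        self.fetch_next()
```

There is no long loop that blocks anything: each handler does a small piece of work, records the progress, and asks for the next page.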

Lukáš Lalinský's answer is very good.

Another possibility would be to use PyQt threads.
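As a rough illustration of the threading idea, here is a sketch that uses the standard library `threading` and a queue instead of QThread, so it runs without Qt installed. The names `scrape_links`, `worker` and `drain` are made up for this example, and `scrape_links` is only a placeholder for the real scrapeLinksA / scrapeLinksB. The worker never touches the GUI; it only posts messages, and the GUI thread picks them up.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2, which PyQt4 + urllib2 suggests


def scrape_links(name, count):
    """Hypothetical placeholder for scrapeLinksA / scrapeLinksB."""
    return ["%s-%d" % (name, i) for i in range(count)]


def worker(messages):
    """Runs in the background thread and reports progress via the queue."""
    for name in ("LinksA", "LinksB"):
        scrape_links(name, 3)
        messages.put("%s done" % name)
    messages.put(None)      # sentinel: no more work


def drain(messages, update_info):
    """Call this from the GUI thread.

    Passes every queued message to update_info. Returns False once the
    sentinel has been seen, True if the worker may still send more.
    """
    while True:
        try:
            msg = messages.get_nowait()
        except queue.Empty:
            return True
        if msg is None:
            return False
        update_info(msg)
```

In a Qt application you would probably start the worker with `threading.Thread(target=worker, args=(messages,)).start()` and call `drain(messages, updateInfo)` from a QTimer every 100 ms or so. With QThread the usual equivalent is to emit a signal from the worker and connect it to updateInfo.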

If the problem is merely the 'updating' part (and not the need for asynchronous processing), try putting this call:

QCoreApplication.processEvents()

between scrapeLinksA and scrapeLinksB to see if that helps (it lets Qt process any pending events, such as paint requests, before your code continues).

If that doesn't help, please provide us with the source of updateInfo.
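To see why pumping events between the two scrape calls can make a difference, here is a toy model of an event queue. It is not Qt: `ToyEventLoop` and `run_job` are invented for illustration, and the model assumes that updateInfo only queues a repaint instead of drawing immediately.

```python
from collections import deque


class ToyEventLoop(object):
    """Toy model of a GUI event queue (not Qt, just the idea)."""

    def __init__(self):
        self.pending = deque()   # queued paint requests
        self.screen = []         # what the user has actually seen

    def post(self, text):
        self.pending.append(text)

    def process_events(self):
        # Handle everything that is currently waiting in the queue.
        while self.pending:
            self.screen.append(self.pending.popleft())


def run_job(loop, pump):
    """Models scrapeLinksA(); updateInfo(...); scrapeLinksB() in one handler.

    Returns what is visible on screen at the moment scrapeLinksB starts.
    """
    # scrapeLinksA() would block here for 20 seconds.
    loop.post("LinksA done")     # updateInfo: only queues a repaint
    if pump:
        loop.process_events()    # the role processEvents() plays in Qt
    seen_before_b = list(loop.screen)
    # scrapeLinksB() would block here for another 20 seconds.
    return seen_before_b
```

With `pump=False` nothing has reached the screen when the second scrape begins; with `pump=True` the message is already visible. The window still freezes during each scrape either way, which is why the non-blocking and threaded approaches are the more thorough fix.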
