QtWebkit crashes Python on Windows

I am trying to scrape a website that uses JavaScript. I am using the following code:

import os
import sys
import re
import requests
import mechanize
import cookielib
from bs4 import BeautifulSoup
from PyQt4.QtGui import *  
from PyQt4.QtCore import *  
from PyQt4.QtWebKit import *  
from lxml import html
import pandas as pd
import time

class Render(QWebPage):
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.mainFrame().load(QUrl(url))
        self.app.exec_()


    def _loadFinished(self, result):
        self.frame = self.mainFrame()
        self.app.quit()

def read_page(url):
    r = Render(url)
    result = r.frame.toHtml()
    text = str(result.toAscii())
    html_source = html.fromstring(text)
    return text, html_source

for url in urls:
    text, html_source = read_page(url)

The first URL is read successfully, but on the second URL the following messages are displayed and python.exe crashes.

content-type missing in HTTP POST, defaulting to application/x-www-form-urlencoded. Use QNetworkRequest::setHeader() to fix this problem.
QObject::connect: Cannot connect (null)::configurationAdded(QNetworkConfiguration) to QNetworkConfigurationManager::configurationAdded(QNetworkConfiguration)
QObject::connect: Cannot connect (null)::configurationRemoved(QNetworkConfiguration) to QNetworkConfigurationManager::configurationRemoved(QNetworkConfiguration)
QObject::connect: Cannot connect (null)::configurationChanged(QNetworkConfiguration) to QNetworkConfigurationManager::configurationChanged(QNetworkConfiguration)
QObject::connect: Cannot connect (null)::onlineStateChanged(bool) to QNetworkConfigurationManager::onlineStateChanged(bool)
QObject::connect: Cannot connect (null)::configurationUpdateComplete() to QNetworkConfigurationManager::updateCompleted()

This is a little late, but I've been playing around with web scraping recently and I ran into your problem. The problem is that every Render instance creates its own QApplication, so you end up trying to run several QApplications in one process, which doesn't work (I don't completely understand why, though :/). You should try doing something like what's here.

So basically, instead of trying to start several QApplications, create a single QApplication and have it load everything.
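
As a rough illustration of that idea, here is a minimal sketch, not necessarily the code from the link above. It assumes PyQt4 with QtWebKit as in the question. The function name render_all and its helper names are made up for this example. One QApplication and one QWebPage are created, the URLs are loaded one after another, and the event loop is only stopped once the queue is empty.

```python
import sys


def render_all(urls):
    """Load every URL in turn with ONE QApplication.

    Returns a list of (url, html) pairs. Hypothetical helper, not a PyQt4 API.
    """
    pending = list(urls)
    if not pending:
        # Nothing to load: return before starting an event loop
        # that nothing would ever stop.
        return []

    # Imported here so the module can be imported without Qt installed.
    from PyQt4.QtCore import QUrl
    from PyQt4.QtGui import QApplication
    from PyQt4.QtWebKit import QWebPage

    # Reuse an existing application object if there is one; never create a second.
    app = QApplication.instance() or QApplication(sys.argv)
    page = QWebPage()
    results = []
    current = [None]  # URL being loaded (a list so the closures can update it)

    def load_next():
        if not pending:
            app.quit()  # all pages done: leave the event loop
            return
        current[0] = pending.pop(0)
        page.mainFrame().load(QUrl(current[0]))

    def on_load_finished(ok):
        # toHtml() gives a QString on Python 2 (default API) and a str on Python 3.
        # A failed load is recorded as None instead of aborting the whole run.
        results.append((current[0], page.mainFrame().toHtml() if ok else None))
        load_next()

    page.loadFinished.connect(on_load_finished)
    load_next()
    app.exec_()  # a single event loop for all URLs
    return results
```

With this, the loop from the question becomes a single call such as `for url, page_html in render_all(urls): ...`, and each `page_html` can be passed to `html.fromstring` as before.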
