
Set page size in QWebView in PyQt4

I wrote a Python script using PyQt4 to crawl web pages, including content loaded via AJAX. It works, but it only captures one screen's worth of the page, which I think is related to the screen resolution. I run the script on a CentOS server that has no X environment, so I use Xvfb with the following settings:

$ Xvfb :100 -screen 0 9000x9000x24 &
$ export DISPLAY=:100

but that didn't help me get more of the page.

I'm new to PyQt4. Is there a way to set the QWebView size so that the display window is larger?

Any pointers to QtWebKit documentation would also be appreciated.

The following is my code:

#!/usr/bin/env python
#coding: utf-8


import sys

from PyQt4.QtCore import QUrl, SIGNAL, QSize
from PyQt4.QtGui import QApplication
from PyQt4.QtWebKit import QWebPage, QWebView, QWebSettings

class WebPage(QWebPage):

    def javaScriptConsoleMessage(self, message, lineNumber, sourceID):
        sys.stderr.write('JavaScript error at line number %d\n' % (lineNumber))
        sys.stderr.write('%s\n' % (message, ))
        sys.stderr.write('Source ID: %s\n' % (sourceID, ))


class Crawler(QApplication):

    def __init__(self, url):
        super(Crawler, self).__init__(sys.argv)
        self.url = url
        self.web_view = QWebView()
        self.web_page = WebPage()
        self.web_view.setPage(self.web_page)
        self.web_frame = self.web_page.currentFrame()

        print 'Before connecting'
        self.connect(self.web_view, SIGNAL('loadFinished(bool)'), self.loadFinished)
        print 'After connecting'

        print 'Before loading'
        self.web_frame.load(QUrl(self.url))
        print 'After loading'

    def loadFinished(self, ok):
        self.web_page.setViewportSize(self.web_page.mainFrame().contentsSize())
        print 'In callback, before writing'
        with open('jd.txt', 'ab+') as fp:
            fp.write(self.web_page.currentFrame().toHtml().toUtf8())
        print 'In callback, after writing'


if __name__ == '__main__':
    url = 'http://www.360buy.com/product/707047.html'
    crawler = Crawler(url)
    sys.exit(crawler.exec_())

You can resize web_page to its actual content size with the setViewportSize method:

self.web_page.setViewportSize(self.web_page.mainFrame().contentsSize())

Trigger a scroll event after loadFinished is emitted:

def loadFinished(self, ok):
    js_scroll = "window.scrollBy(0, 200);"
    self.web_page.mainFrame().documentElement().evaluateJavaScript(js_scroll)

I'm not sure exactly how the page you're loading works, but you may need to wait for the AJAX request triggered by the scroll event to complete before the data appears on the page.
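
One way to decide when that wait is over is to poll the page and treat it as loaded once its content height stops changing. The following is only a minimal sketch of that idea in plain Python. StabilityChecker and its required parameter are hypothetical names, not part of PyQt4, and the threshold of three unchanged readings is an arbitrary assumption.

```python
class StabilityChecker(object):
    """Report when an observed value has stopped changing.

    Hypothetical helper, not part of PyQt4: feed it one observation per
    timer tick (for example the page's content height) and it returns True
    once the value has stayed the same for `required` consecutive ticks.
    """

    def __init__(self, required=3):
        if required < 1:
            raise ValueError('required must be at least 1')
        self.required = required
        self._last = None
        self._repeats = 0

    def observe(self, value):
        """Record one observation; return True once the value is stable."""
        if self._repeats and value == self._last:
            self._repeats += 1
        else:
            # First observation, or the value changed: restart the count.
            self._last = value
            self._repeats = 1
        return self._repeats >= self.required
```

In the crawler, a QTimer started after the scroll could pass self.web_page.mainFrame().contentsSize().height() to observe() every few hundred milliseconds, and write the HTML only once it returns True. A stable height does not guarantee that every AJAX request has finished, so treat it as a heuristic.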
