
Using show() with twill spams the console with HTML

I've been using the function twill.commands.show() to get the raw HTML from a page. I run this about every 5 seconds. Every time the function runs, it spams the console with that page's raw HTML. I need the console for debugging, and since it is constantly filled with HTML, that is impossible. Since show() is programmed to both print the HTML and return it as a string, I would have to edit twill, which is way beyond my skill set and would make the program incompatible with other devices. Although saving and reading a file over and over might work, it seems impractical to do every 5 seconds.

Code:

from twill.commands import go, show

go('http://google.com/')
html=show()

Again, twill has a save_html, which could be used to save to a file, but I'm doing this every 5 seconds and it could slow the program/computer, especially if it's being run on an older OS.

Thanks!

Twill writes to stdout by default.

You can use twill.set_output(fp) to redirect its standard output. There are several possible ways to do this:

Write to a StringIO:

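import twill
from twill.commands import show
from StringIO import StringIO  # Python 2; on Python 3 use: from io import StringIO

sio = StringIO()
twill.set_output(sio)  # twill now prints into sio instead of the console
html = show() # html+'\n' == sio.getvalue()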

or to /dev/null:

import os
null = open(os.devnull, 'w')
twill.set_output(null)
html = show() # writing to /dev/null or nul
null.close()

or to nothing at all:

class DevNull(object):
    """Minimal file-like sink: accepts writes and discards them."""
    def write(self, data):
        pass
twill.set_output(DevNull())
html = show()

or to any other writable file-like Python object of your liking.
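Since you still want the console for debugging, it may help to redirect only while show() runs and then hand the console back. Here is a minimal sketch, assuming Python 3 and that twill.set_output accepts any writable file-like object as described above. The helper name quiet is made up for this example and is not part of twill.

```python
import io
import sys
from contextlib import contextmanager


@contextmanager
def quiet(set_output):
    """Send output to a fresh in-memory buffer for the duration of the block.

    `set_output` is any callable that takes a writable file-like object,
    for example twill.set_output (assumed to behave as described above).
    """
    buffer = io.StringIO()
    set_output(buffer)
    try:
        yield buffer
    finally:
        # Hand the console back explicitly so debug prints show up again.
        set_output(sys.stdout)


# Intended use with twill (not run here, since it needs twill and a network):
#
#     import twill
#     from twill.commands import go, show
#
#     go('http://google.com/')
#     with quiet(twill.set_output):
#         html = show()
#     print('fetched %d characters' % len(html))
```

Because each call creates a new buffer, nothing accumulates in memory when this runs every 5 seconds, which a single long-lived StringIO would do unless you cleared it.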

Capture the output in a string, then use a regex to replace every tag with an empty string so that only the text remains.

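import re
import twill
from twill.commands import show
from StringIO import StringIO  # Python 2; on Python 3 use: from io import StringIO

sio = StringIO()
twill.set_output(sio)
show()
# Strip anything that looks like a tag; DOTALL lets a tag span several lines
print(re.sub(r'<.*?>','',sio.getvalue(),flags=re.DOTALL))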
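One caveat with the regex above: it removes the tags but keeps whatever sits inside script and style elements, and it can be confused by a stray < or > in the page. As an alternative, here is a rough sketch using the standard library's html.parser, assuming Python 3. The names TextExtractor and html_to_text are made up for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join the pieces and collapse runs of whitespace into single spaces.
        # Note: a word split by an inline tag would gain a space here.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = "<html><body><p>Hello <b>world</b></p><script>var x = 1;</script></body></html>"
print(html_to_text(sample))
```

You would pass it the string returned by show() (or sio.getvalue()) instead of the sample.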
