
Scrape input value from HTML

I need to scrape the value of a hidden input from HTML using BeautifulSoup. The HTML looks like this:

<form method="post" enctype="multipart/form-data" action="http://localhost/wp-admin/update.php?action=upload-plugin">
        <input type="hidden" id="_wpnonce" name="_wpnonce" value="e2e315cd8f">
        <input type="hidden" name="_wp_http_referer" value="/wp-admin/plugin-install.php?tab=upload">       
        <label class="screen-reader-text" for="pluginzip">Plugin zip file</label>
        <input type="file" id="pluginzip" name="pluginzip">
        <input type="submit" class="button" value="Install Now">
</form>

I wrote this code to fetch the page:

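import io
import pycurl

# wp_url is assumed to already hold the URL of the page that contains the form.
c = pycurl.Curl()  # curl handle (skip this line if one was created earlier in the script)
buf_pagina1 = io.BytesIO()  # pycurl delivers the response body as raw bytes
c.setopt(c.URL, wp_url)
c.setopt(c.WRITEFUNCTION, buf_pagina1.write)
c.setopt(c.COOKIEFILE, '')  # empty string turns on curl's in-memory cookie engine
c.setopt(c.CONNECTTIMEOUT, 5)
c.setopt(c.AUTOREFERER, 1)
c.setopt(c.FOLLOWLOCATION, 1)
c.setopt(c.TIMEOUT, 15)
c.setopt(c.USERAGENT, 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1667.0 Safari/537.36')
c.perform()
html1 = buf_pagina1.getvalue().decode('utf-8')  # assumes the page is UTF-8 encoded
buf_pagina1.close()
print(html1)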

I need to get the value from this input:

<input type="hidden" id="_wpnonce" name="_wpnonce" value="e2e315cd8f">

You can find the input by its id and read the value of its "value" attribute. Here is an example:

from bs4 import BeautifulSoup


data = """<form method="post" enctype="multipart/form-data" action="http://localhost/wp-admin/update.php?action=upload-plugin">
        <input type="hidden" id="_wpnonce" name="_wpnonce" value="e2e315cd8f"><input type="hidden" name="_wp_http_referer" value="/wp-admin/plugin-install.php?tab=upload">     <label class="screen-reader-text" for="pluginzip">Plugin zip file</label>
        <input type="file" id="pluginzip" name="pluginzip">
        <input type="submit" class="button" value="Install Now">
    </form>"""

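# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(data, "html.parser")
# Locate the <input> whose id is "_wpnonce" and read its value attribute
print(soup.find("input", {'id': "_wpnonce"}).attrs['value'])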

Prints:

e2e315cd8f
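
The second hidden input in the form (`_wp_http_referer`) has no id, so an id lookup will not reach it. As a rough sketch, assuming you want every hidden field of the form at once, you could match on `type="hidden"` and key the results by the `name` attribute. The helper name `hidden_fields` is made up for this illustration and is not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Same form markup as in the question.
html = """<form method="post" enctype="multipart/form-data" action="http://localhost/wp-admin/update.php?action=upload-plugin">
    <input type="hidden" id="_wpnonce" name="_wpnonce" value="e2e315cd8f">
    <input type="hidden" name="_wp_http_referer" value="/wp-admin/plugin-install.php?tab=upload">
    <label class="screen-reader-text" for="pluginzip">Plugin zip file</label>
    <input type="file" id="pluginzip" name="pluginzip">
    <input type="submit" class="button" value="Install Now">
</form>"""


def hidden_fields(markup):
    """Return {name: value} for every <input type="hidden"> that has a name.

    Hypothetical helper written for this example only.
    """
    soup = BeautifulSoup(markup, "html.parser")
    fields = {}
    for tag in soup.find_all("input", attrs={"type": "hidden"}):
        name = tag.get("name")  # .get() returns None instead of raising KeyError
        if name:
            fields[name] = tag.get("value", "")
    return fields


fields = hidden_fields(html)
print(fields.get("_wpnonce"))
print(fields.get("_wp_http_referer"))
```

Using `.get()` on both the tag and the resulting dict means a missing input yields `None` rather than an exception, which may be easier to handle if the page layout changes.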

Hope that helps.

