
How to get the source of html in lxml?

Original post 2012-12-31 06:31:00 4 1 python/ lxml

import urllib
import lxml.html
down='http://blog.sina.com.cn/s/blog_71f3890901017hof.html'
file=urllib.urlopen(down).read()
root=lxml.html.document_fromstring(file)
body=root.xpath('//div[@class="articalContent  "]')[0]
print body.text_content()

When I run this code, I get the text content. How can I get the HTML source of the element instead of its text content?

1 answer

Use

html = lxml.html.tostring(node)

and please: read the basic documentation of the tools you are using first.
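To make the difference concrete, here is a minimal sketch of the suggestion applied to the element from the question. It is written for Python 3, whereas the question uses Python 2, and it parses a small inline string (SAMPLE, a made-up stand-in for the downloaded page) so it runs without network access. The helper name element_html is hypothetical, not part of lxml.

```python
import lxml.html

# Hypothetical stand-in for the page downloaded in the question.
SAMPLE = (
    '<html><body>'
    '<div class="articalContent  "><p>Hello <b>world</b></p></div>'
    '</body></html>'
)


def element_html(element):
    """Serialize an element, including its own tag, as an HTML string."""
    # encoding="unicode" makes tostring return str; without it, bytes come back.
    return lxml.html.tostring(element, encoding="unicode")


root = lxml.html.document_fromstring(SAMPLE)
body = root.xpath('//div[@class="articalContent  "]')[0]

outer = element_html(body)   # markup of the element and its children
text = body.text_content()   # text only, as in the question

print(outer)
print(text)
```

Note that tostring also serializes any text that follows the element's closing tag (its tail) by default; passing with_tail=False leaves that out.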

How to get text of broken html with lxml

How to get an attribute value with lxml on html

How to get a html elements with python lxml

How to get text from HTML element by using lxml.html

python, lxml and how to get html code from subset

how to get the objectname from <class 'lxml.html.HtmlElement'>

Get the inner HTML of a element in lxml

How to get current url of a parsed HTML page in Python with lxml?

How can I get the text from this HTML snippet using lxml?

how to get unresolved entities from html attributes using python and lxml


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.


