Extracting a section from a web page using Python

Original post 2012-02-27 18:47:02 1 1 python/ html/ web-scraping/ lxml/ webpage

I want to extract the text of the Symptoms section from the web page below using Python and lxml. Can anyone please help?

http://www.ncbi.nlm.nih.gov/pubmedhealth/PMH0001851/

Thank you,

1 answer

You want to scrape a web page with lxml? Try this:

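 from lxml.html import parse

 # Fetch and parse the page, then print the text of every <h2> heading.
 # Note: cssselect() needs the separate cssselect package installed.
 doc = parse("http://www.ncbi.nlm.nih.gov/pubmedhealth/PMH0001851/").getroot()
 for h2 in doc.cssselect('h2'):
     print(h2.text_content())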

This will open the page and print the text of every h2 heading on it.
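
To get from the headings to the text of a single section such as Symptoms, one possible approach is to find the matching h2 and collect its following siblings until the next h2. The sketch below is only an illustration: the question does not show the page's markup, so `SAMPLE` is made-up HTML that assumes each section is an h2 followed by sibling elements, and `section_text` is a hypothetical helper, not part of lxml.

```python
from lxml import html

# Hypothetical sample markup: the real page's structure is not shown in the
# question, so this assumes each section is an <h2> followed by sibling elements.
SAMPLE = """
<html><body>
  <h2>Causes</h2>
  <p>Cause text.</p>
  <h2>Symptoms</h2>
  <p>Fever and cough.</p>
  <ul><li>Fatigue</li></ul>
  <h2>Treatment</h2>
  <p>Rest.</p>
</body></html>
"""

def section_text(root, heading):
    """Return the text between the <h2> named `heading` and the next <h2>."""
    for h2 in root.iter("h2"):
        if h2.text_content().strip().lower() == heading.lower():
            parts = []
            # Walk the following siblings until the next heading starts.
            for sibling in h2.itersiblings():
                if sibling.tag == "h2":
                    break
                text = sibling.text_content().strip()
                if text:
                    parts.append(text)
            return "\n".join(parts)
    return None  # no heading with that name

root = html.fromstring(SAMPLE)
symptoms = section_text(root, "Symptoms")
print(symptoms)
```

For the real page you would build `root` with `parse(url).getroot()` as above instead of `html.fromstring(SAMPLE)`. If the site nests each section inside its own container element rather than using flat siblings, the sibling walk would need to be adapted to that layout.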

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.
