Web crawling using Python BeautifulSoup

Original post 2016-03-04 08:45:19 6 1 python/ html/ beautifulsoup

How do I extract the data in the <p> paragraph tags and <li> tags under a <div> with a named class?

1 answer

Use the functions find() and find_all():

import requests
from bs4 import BeautifulSoup

url = '...'

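r = requests.get(url)
r.raise_for_status()  # stop early on an HTTP error response
data = r.text
soup = BeautifulSoup(data, 'html.parser')

# find() returns the first matching <div>, or None if nothing matches
div = soup.find('div', {'class':'class-name'})
if div is None:
    raise SystemExit('no <div> with class "class-name" found')

# find_all() returns every matching descendant of that <div>
ps = div.find_all('p')
lis = div.find_all('li')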

# print the content of all <p> tags
for p in ps:
    print(p.text)

# print the content of all <li> tags
for li in lis:
    print(li.text)
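
To try the same find() / find_all() pattern without a network request, here is a minimal sketch that parses an inline HTML string. The markup in SAMPLE_HTML and the helper name extract_texts are made up for illustration and are not part of the original answer; 'class-name' is the same placeholder class used above.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<div class="class-name">
  <p>First paragraph</p>
  <p>Second paragraph</p>
  <ul><li>Item one</li><li>Item two</li></ul>
</div>
<div class="other"><p>Ignored</p></div>
"""


def extract_texts(html, class_name):
    """Return (paragraph_texts, list_item_texts) found under the first
    <div> whose class matches class_name; two empty lists if none exists."""
    soup = BeautifulSoup(html, "html.parser")
    # class_ is the keyword form of the {'class': ...} filter
    div = soup.find("div", class_=class_name)
    if div is None:
        return [], []
    paragraphs = [p.get_text(strip=True) for p in div.find_all("p")]
    items = [li.get_text(strip=True) for li in div.find_all("li")]
    return paragraphs, items


paragraphs, items = extract_texts(SAMPLE_HTML, "class-name")
print(paragraphs)
print(items)
```

Only tags inside the matching <div> are returned, so the <p> in the second <div> is skipped. Returning empty lists for a missing <div> is one possible choice; raising an error may suit a stricter script better.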


Notice: The technical posts on this site follow the CC BY-SA 4.0 license. If you repost, please cite this site's URL or the original address.
