Python web scraping: extracting the inner text of a tag

Question

I want to scrape products and prices from an online shopping site, and I need help extracting the string between the tags.

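from urllib.request import urlopen
from bs4 import BeautifulSoup as soup

my_url = 'https://www.flipkart.com/cameras/mirrorless~type/pr?sid=jek%2Cp31'

# Download the listing page and parse it (the html5lib parser must be installed).
cl = urlopen(my_url)
page_html = cl.read()
cl.close()
ps = soup(page_html, 'html5lib')
ps1 = ps.prettify()  # pretty-printed copy, handy for inspecting the markup

# Each product card is a div with class "_1-2Iqu row".
cn = ps.findAll('div', {'class': '_1-2Iqu row'})
len(cn)
cn[0].div.div  # the nested div that holds the product name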

# Output: <div class="_3wU53n">Canon M50 Mirrorless Camera Body with Single Lens EF-M 15-45 mm ISSTM</div>
# I need only the text: Canon M50 Mirrorless Camera Body with Single Lens EF-M 15-45 mm ISSTM

Answer 1

Replace cn=ps.findAll('div',{'class':'_1-2Iqu row'}) with cn=ps.findAll('div',{'class':'_1-2Iqu row'},text=True)
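
For reference, here is a minimal sketch of another way to get at the string itself: read the text of the tag that cn[0].div.div already returns. The HTML below is a hypothetical stand-in for one product card, built from the output shown in the question, so the live page's markup may differ. It uses the built-in "html.parser" instead of html5lib only to keep the example dependency-light.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating one product card; the live page may differ.
html = """
<div class="_1-2Iqu row">
  <div>
    <div class="_3wU53n">Canon M50 Mirrorless Camera Body with Single Lens EF-M 15-45 mm ISSTM</div>
  </div>
</div>
"""

ps = BeautifulSoup(html, "html.parser")  # built-in parser, so html5lib is not needed here
cn = ps.find_all("div", {"class": "_1-2Iqu row"})

# .div.div walks down to the first nested <div>; get_text() returns only its text.
name = cn[0].div.div.get_text(strip=True)
print(name)

# The same idea applied to every card that was found.
names = [card.div.div.get_text(strip=True) for card in cn]
```

One thing worth checking if the text=True variant returns an empty list: when a tag name is also given, that filter keeps only tags whose .string is not None, and a container div with several children usually has .string set to None.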

Answer 1 posted 2020-01-03 18:50:32