Beautiful Soup - Find children tag attribute content
Source code:
<div class="wrapper">
<div id="mask" style="display: none;"></div>
<div id="video">
<span id="pid" hidden="">2</span>
<div poster="https://thumbs.vodgc.net/57377706F7D28069F41A23A14DC5CC64.jpg?673333" autoplay="true" data-setup="{ "techOrder": ["html5"]}"
preload="none" class="video-js vjs-default-skin vjs-controls-enabled vjs-workinghover vjs-has-started media_player-dimensions vjs-paused vjs-user-inactive"
id="media_player" role="region" aria-label="video player">
<video id="media_player_html5_api" class="vjs-tech" preload="none" data-setup="{ "techOrder": ["html5"]}"
autoplay="" src="blob:https://api.vodgc.net/5bb5a7a7-6c9b-49f1-883b-784871f95d8b">
<source src="https://vod.vodgc.net/manifest/57377706F7D28069F41A23A14DC5CC64.m3u8" type="application/x-mpegURL">
</video>
<div>
I am trying to read the content of the "src" attribute of the "source" tag, but the result is always None or an empty list.
Here is my code:
from urllib import request
from bs4 import BeautifulSoup

hdr = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                     'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36',
       'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
       'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
       'Accept-Encoding': 'none',
       'Accept-Language': 'en-US,en;q=0.8',
       'Connection': 'keep-alive'}
url = 'https://www.eltrecetv.com.ar/programas/simona/capitulos-completos/capitulo-4_099474'
req = request.Request(url, headers=hdr)
page = request.urlopen(req)
soup = BeautifulSoup(page, 'lxml')
sources = soup.find('div', class_='wrapper')
for tag in sources:
    video = tag.find_next_siblings('video')
    print(video)
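A note on why the loop above likely prints only empty lists: iterating over a tag yields its direct children, and find_next_siblings looks sideways at the same nesting level, never downward. The video element is nested several levels below the wrapper, so no child has it as a sibling. The sketch below illustrates this on simplified, hypothetical markup (the example.com URL and the "player" id are placeholders, not taken from the page), and it uses the built-in html.parser so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Simplified, hypothetical markup with the same kind of nesting as the page above.
html = """
<div class="wrapper">
  <div id="video">
    <video id="player">
      <source src="https://example.com/stream.m3u8" type="application/x-mpegURL">
    </video>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
wrapper = soup.find("div", class_="wrapper")

# Iterating a Tag yields only its direct children (text nodes and tags).
# find_next_siblings() searches later siblings at the same level, so the
# nested <video> is never reached from any of these children.
sibling_hits = [child.find_next_siblings("video") for child in wrapper]

# find() searches all descendants, at any depth.
source = wrapper.find("source")
src = source["src"] if source is not None else None

print(sibling_hits)  # every entry is an empty result
print(src)           # the src of the nested <source>, if one was found
```

The same distinction applies to find_all, which also searches descendants rather than siblings.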
Pass the tag name source to the find_all method, then read the src attribute from each result:
from bs4 import BeautifulSoup as soup
s = """
<div class="wrapper">
<div id="mask" style="display: none;"></div>
<div id="video">
<span id="pid" hidden="">2</span>
<div poster="https://thumbs.vodgc.net/57377706F7D28069F41A23A14DC5CC64.jpg?673333" autoplay="true" data-setup="{ "techOrder": ["html5"]}"
preload="none" class="video-js vjs-default-skin vjs-controls-enabled vjs-workinghover vjs-has-started media_player-dimensions vjs-paused vjs-user-inactive"
id="media_player" role="region" aria-label="video player">
<video id="media_player_html5_api" class="vjs-tech" preload="none" data-setup="{ "techOrder": ["html5"]}"
autoplay="" src="blob:https://api.vodgc.net/5bb5a7a7-6c9b-49f1-883b-784871f95d8b">
<source src="https://vod.vodgc.net/manifest/57377706F7D28069F41A23A14DC5CC64.m3u8" type="application/x-mpegURL">
</video>
<div>
"""
d = soup(s, 'lxml')
print([i['src'] for i in d.find_all('source')])
Output:
['https://vod.vodgc.net/manifest/57377706F7D28069F41A23A14DC5CC64.m3u8']
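If the page may contain other source tags, or some that lack a src attribute, a slightly more defensive variant can scope the search with a CSS selector and use Tag.get instead of indexing. This is only a sketch under assumptions: the helper name source_urls and the markup are hypothetical, and html.parser is used in place of lxml to avoid the extra dependency.

```python
from bs4 import BeautifulSoup

# Simplified, hypothetical markup; the second <source> deliberately lacks src.
html = """
<div class="wrapper">
  <div id="video">
    <video id="media_player_html5_api">
      <source src="https://example.com/stream.m3u8" type="application/x-mpegURL">
      <source type="video/mp4">
    </video>
  </div>
</div>
"""


def source_urls(markup):
    """Return the src of every <source> inside a <video> under div.wrapper."""
    soup = BeautifulSoup(markup, "html.parser")
    # The space in the selector is the descendant combinator: it matches
    # at any depth below div.wrapper, not only direct children.
    tags = soup.select("div.wrapper video source")
    # Tag.get() returns None instead of raising KeyError when src is absent.
    return [tag.get("src") for tag in tags if tag.get("src")]


urls = source_urls(html)
empty = source_urls("<div class='wrapper'></div>")
print(urls)
print(empty)
```

If this still returns an empty list for the live page, one possible cause (an assumption, not something confirmed here) is that the player markup is inserted by JavaScript after the page loads, in which case the HTML fetched with urllib would not contain the video element at all.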