Python: replace html img src after extraction with xpath
I extracted some HTML from this site, and the scraped markup displays correctly except for the images, because their src attributes are incorrect.
#!C:/Python27/python
from lxml import etree
import requests
q = "http://www.dlib.org/dlib/november14/giannakopoulos/11giannakopoulos.html"
page = requests.get(q)
tree = etree.HTML(page.text)
element = tree.xpath('./body/form/table[3]/tr/td/table[5]')
content = etree.tostring(element[0])
print "Content-type: text\n\n"
print content.strip()
Now I read the correct img src values (the ones I want) and put them into an array:
pic=[]
link = q.rsplit("/",1)
images = tree.xpath("//img/@src")
for i in images:
    if i.find('.gif') == -1:
        pic.append(link[0]+"/"+i)
How can I replace the scraped src values with the ones in the array?
I'm pretty sure this is what you're looking for.
link = q.rsplit("/",1)
images = tree.xpath("//img")
for idx, image in enumerate(images):
    if '.gif' not in image.attrib['src']:
        images[idx].attrib['src'] = link[0]+'/'+image.attrib['src']
for image in images:
    print image.attrib['src']
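It iterates over every selected image and, if '.gif' is not in the image's src attribute, updates that attribute to the full PNG/JPG path you specified. Because the loop edits the elements of tree in place, calling etree.tostring on the extracted element afterwards yields markup with the updated src values.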
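To see the effect on the serialized markup rather than only on the printed src values, here is a minimal sketch in Python 3 syntax. The page URL, the HTML snippet, and the helper name absolutize_non_gif are made up for illustration and are not part of the original question. It applies the same ".gif" filter and the same rsplit-based prefix as the loop above.

```python
from lxml import etree

# Hypothetical page URL and markup, used only for illustration.
PAGE_URL = "http://example.org/articles/2014/paper.html"
HTML = """
<html><body>
  <div id="content">
    <img src="fig1.png"/>
    <img src="../../img/space.gif"/>
  </div>
</body></html>
"""


def absolutize_non_gif(tree, page_url):
    """Prefix every non-.gif img src with the directory of page_url, in place."""
    base = page_url.rsplit("/", 1)[0]
    for image in tree.xpath("//img"):
        src = image.get("src")
        # Skip images without a src and the layout .gif files.
        if src and ".gif" not in src:
            image.set("src", base + "/" + src)
    return tree


tree = absolutize_non_gif(etree.HTML(HTML), PAGE_URL)

# Serialize only the fragment of interest, after the rewrite.
element = tree.xpath("//div[@id='content']")
content = etree.tostring(element[0], encoding="unicode")
print(content.strip())
```

The order matters here: the src attributes are rewritten first and the fragment is serialized afterwards, so the string that gets printed already contains the absolute paths.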
Output
../../../img2/space.gif
../../../img2/search2.gif
../../../img2/space.gif
../../../img2/D-Lib-blocks.gif
../../../img2/transparent.gif
../../../img2/magazine.gif
../../../img2/transparent.gif
../../../img2/transparent.gif
../../../img2/space.gif
../../../img2/space.gif
http://www.dlib.org/dlib/november14/giannakopoulos/giann-formula1.png
http://www.dlib.org/dlib/november14/giannakopoulos/giann-fig1-sm.png
http://www.dlib.org/dlib/november14/giannakopoulos/giann-fig2.png
http://www.dlib.org/dlib/november14/giannakopoulos/giann-fig3.png
http://www.dlib.org/dlib/november14/giannakopoulos/giann-fig4.png
http://www.dlib.org/dlib/november14/giannakopoulos/giannakopoulos.jpg
http://www.dlib.org/dlib/november14/giannakopoulos/foufoulas.jpg
http://www.dlib.org/dlib/november14/giannakopoulos/stamatogiannakis.png
http://www.dlib.org/dlib/november14/giannakopoulos/dimitropoulos.jpg
http://www.dlib.org/dlib/november14/giannakopoulos/manola.jpg
http://www.dlib.org/dlib/november14/giannakopoulos/ioannidis.png
../../../img2/transparent.gif
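The .gif entries in the output above are still relative, since the loop skips them on purpose. If they should be resolved as well, one alternative to string concatenation (not what the answer uses) is urljoin from the standard library, which also handles the "../" segments. The sketch below is Python 3; in Python 2 the same function lives in the urlparse module. The helper name resolve_src is hypothetical.

```python
from urllib.parse import urljoin

q = "http://www.dlib.org/dlib/november14/giannakopoulos/11giannakopoulos.html"


def resolve_src(page_url, src):
    """Resolve a possibly relative src against the URL of the page it came from."""
    return urljoin(page_url, src)


resolved_png = resolve_src(q, "giann-fig1-sm.png")
resolved_gif = resolve_src(q, "../../../img2/space.gif")

print(resolved_png)
# http://www.dlib.org/dlib/november14/giannakopoulos/giann-fig1-sm.png
print(resolved_gif)
# http://www.dlib.org/img2/space.gif
```

With this approach the ".gif" check becomes optional, because every src value resolves to an absolute URL regardless of how many directory levels it climbs.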