How to extract the following src (iframe) from the code using Python (BeautifulSoup)
I am trying to extract the "src" from it, but I have not succeeded. The page is dynamic and only appears after I run a search.
Site: http://191.253.16.180:8080/ConsultaLei/Default.aspx?numero=3001
View source: http://191.253.16.180:8080/ConsultaLei/Default.aspx?numero=3001
import requests
from bs4 import BeautifulSoup

r = requests.get("http://191.253.16.180:8080/ConsultaLei/Default.aspx?numero=3001")
arquivo = BeautifulSoup(r.content, "html.parser")
for link in arquivo.find_all("iframe"):
    print(link)
To simulate the POST request on this site, you can use the following example:
import requests
from bs4 import BeautifulSoup
url = "http://191.253.16.180:8080/ConsultaLei/Default.aspx"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
data = {}
for inp in soup.select("input[value]"):
    data[inp["name"]] = inp["value"]
data["ctl00$MainContent$txtNumero"] = "3001" # <-- this is your number
data["ctl00$MainContent$ddlEspecie"] = ""
data["ctl00$MainContent$ddlAno"] = ""
data["ctl00$MainContent$txtConteudo"] = ""
data["ctl00$MainContent$txtEmenta"] = ""
data["ctl00$MainContent$imgBuscar.x"] = "1"
data["ctl00$MainContent$imgBuscar.y"] = "9"
soup = BeautifulSoup(requests.post(url, data=data).content, "html.parser")
print(soup.iframe["src"])
Prints:
../procuradoriacg/Leis\1994/8277_LEI30011994pag0001_strDocumentoOficial.pdf
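The loop over `input[value]` in the example above copies every pre-filled form field into the POST payload. On an ASP.NET WebForms page this would typically include hidden fields such as `__VIEWSTATE`, although the exact fields on this site are an assumption here. A minimal offline sketch of that step, using made-up sample markup and a hypothetical helper named `collect_form_fields`:

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down form markup; the real page's fields and values will differ.
SAMPLE_HTML = """
<form method="post" action="./Default.aspx">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="ctl00$MainContent$txtNumero" />
  <input type="submit" value="no-name-here" />
</form>
"""


def collect_form_fields(html):
    """Return {name: value} for every <input> that carries both attributes."""
    soup = BeautifulSoup(html, "html.parser")
    fields = {}
    # Requiring [name] as well as [value] skips inputs that have no name,
    # which would otherwise raise KeyError on inp["name"].
    for inp in soup.select("input[name][value]"):
        fields[inp["name"]] = inp["value"]
    return fields


fields = collect_form_fields(SAMPLE_HTML)
fields["ctl00$MainContent$txtNumero"] = "3001"  # the search field is filled in by hand
print(fields)
```

The selector here is slightly stricter than the one in the answer (`input[name][value]` instead of `input[value]`), which is a defensive choice rather than something the site is known to need.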
EDIT: To get multiple pages:
import requests
from bs4 import BeautifulSoup
url = "http://191.253.16.180:8080/ConsultaLei/Default.aspx"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
data = {}
for inp in soup.select("input[value]"):
    data[inp["name"]] = inp["value"]
data["ctl00$MainContent$ddlEspecie"] = ""
data["ctl00$MainContent$ddlAno"] = ""
data["ctl00$MainContent$txtConteudo"] = ""
data["ctl00$MainContent$txtEmenta"] = ""
data["ctl00$MainContent$imgBuscar.x"] = "1"
data["ctl00$MainContent$imgBuscar.y"] = "9"
for i in range(3000, 3010):
    data["ctl00$MainContent$txtNumero"] = i
    s = BeautifulSoup(requests.post(url, data=data).content, "html.parser")
    if s.find("iframe"):
        print(i, s.iframe["src"])
    else:
        print(i, "Not Found")
Prints:
3000 Not Found
3001 ../procuradoriacg/Leis\1994/8277_LEI30011994pag0001_strDocumentoOficial.pdf
3002 Not Found
3003 ../procuradoriacg/Leis\1994/8279_LEI30031994pag0001_strDocumentoOficial.pdf
3004 Not Found
3005 Not Found
3006 ../procuradoriacg/Leis\1994/8282_LEI30061994pag0001_strDocumentoOficial.pdf
3007 Not Found
3008 Not Found
3009 Not Found
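The `src` values printed above are relative paths and contain a backslash, so they cannot be requested as-is. A small sketch of turning one into an absolute URL with `urllib.parse.urljoin` follows. It assumes the backslash is a Windows-style path separator that should become `/`, and that the PDF is served relative to the search page; neither is confirmed by the site. `absolute_pdf_url` is a hypothetical helper name.

```python
from urllib.parse import urljoin

BASE_URL = "http://191.253.16.180:8080/ConsultaLei/Default.aspx"


def absolute_pdf_url(src, base=BASE_URL):
    """Resolve a relative iframe src against the page URL."""
    # Assumption: the backslash is a path separator and should be a forward slash.
    normalized = src.replace("\\", "/")
    return urljoin(base, normalized)


src = "../procuradoriacg/Leis\\1994/8277_LEI30011994pag0001_strDocumentoOficial.pdf"
print(absolute_pdf_url(src))
# http://191.253.16.180:8080/procuradoriacg/Leis/1994/8277_LEI30011994pag0001_strDocumentoOficial.pdf
```

The resulting URL could then be passed to `requests.get` to download the document, provided the server really exposes it at that location.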