
Python BeautifulSoup scraping dropdowns

I'm trying to scrape the search results on this page (https://www.inecnigeria.org/elections/polling-units/), which requires that I select a value from one dropdown, after which another dropdown appears that I have to select from before searching. I can get the values from the first dropdown, but not from the others. This is what I have so far:

from bs4 import BeautifulSoup
import requests

base = 'https://www.inecnigeria.org/elections/polling-units/'

base_req = requests.get(base, verify=False)

soup = BeautifulSoup( base_req.text, "html.parser" )

# states
states = soup.find('select', id = "statePoll")

stateItems = states.select('option[value]')

stateValues = [ stateItem.text for stateItem in stateItems ]


# print(stateValues)

lgas = soup.find('select', id = "lgaPoll")

lgaItems = lgas.select('option[value]')

lgaValues = [ lgaItem.text for lgaItem in lgaItems ]


print(lgas)

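You can't actually get those values by scraping the HTML of that page. The page uses JavaScript to request the options from another page and inserts them into the document dynamically. You will have to make those requests yourself, using the information you can scrape. Here is an example of how to do the next step, which should show you the general idea: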

from bs4 import BeautifulSoup
import requests

base = 'https://www.inecnigeria.org/elections/polling-units/'
# Endpoint the page's JavaScript calls to load the LGA options for a state
lga_view = 'https://www.inecnigeria.org/wp-content/themes/independent-national-electoral-commission/custom/views/lgaView.php'
base_req = requests.get(base, verify=False)
soup = BeautifulSoup(base_req.text, "html.parser" )

# Map each state name to its numeric ID, taken from the first dropdown
states = soup.find('select', id = "statePoll")
state_options = states.find_all('option')
states = {opt.text: int(opt['value']) for opt in state_options if 'value' in opt.attrs}

# Request each state's LGAs from the endpoint, as the page does on selection
lga = {k: requests.post(lga_view, {'state_id': v}, verify=False).json() for k,v in states.items()}

print(lga)
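
The same pattern should carry over to the later dropdowns: find the request the page makes when a value is selected (for example in the browser's network tab), then repeat it for each parent ID. Below is a minimal sketch of that step as a reusable helper. The name `fetch_children` and its parameters are hypothetical and not part of the original answer, and the endpoints and field names for the levels below LGA are not given here, so they would have to be looked up on the site.

```python
from typing import Any, Callable, Dict


def fetch_children(post: Callable[..., Any], url: str, field: str,
                   parents: Dict[str, int], **kwargs: Any) -> Dict[str, Any]:
    """POST each parent ID to `url` and return {parent name: decoded JSON}.

    `post` is any callable with the same shape as requests.post; it is
    passed in (an assumption of this sketch) so the helper can be tried
    against a stub without touching the network.
    """
    children = {}
    for name, parent_id in parents.items():
        # Send the parent ID as form data under the given field name
        response = post(url, data={field: parent_id}, **kwargs)
        # Raise on 4xx/5xx instead of trying to decode an error page
        response.raise_for_status()
        children[name] = response.json()
    return children
```

With the names from the example above, `fetch_children(requests.post, lga_view, 'state_id', states, verify=False)` should build the same dictionary as the `lga` comprehension, except that it stops with an exception if the server returns an error status.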

