Using Python 3 and Selenium to scrape a dynamically generated table

I'm new to Python and trying to scrape a dynamically generated table. I've gotten far enough to open the page, enter a search, and have the results table show up. I'm having trouble scraping the results, and I noticed that the text of the results isn't part of the HTML. Here's my code so far; thanks for any and all help.

## module importation
import os, requests, bs4, openpyxl, webbrowser, lxml, html5lib, re
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

print('Type in the FIRST NAME of the individual.')
#I've been using [Mike] here.
firstName = input()
print('Thanks. Now type in the individual\'s LAST NAME.')
#I've been using [Jones] here.
lastName = input()

browser = webdriver.Firefox(executable_path='/usr/local/bin/geckodriver')
#BoP inmate locator

#Goes to BoP website
browser.get('https://www.bop.gov/inmateloc/')
res = requests.get('https://www.bop.gov/inmateloc/')

#Clicks Search by name option (just in case)
searchByNameButton = browser.find_element_by_css_selector("#ui-id-1")
searchByNameButton.click() # clicks the Search by Name Button

#enters first name
bopSearchFirstNameElem = browser.find_element_by_css_selector('#inmNameFirst')
bopSearchFirstNameElem.send_keys(firstName)

#enters last name
bopSearchLastNameElem = browser.find_element_by_css_selector('#inmNameLast')
bopSearchLastNameElem.send_keys(lastName)

# Clicks search
searchSubmitButton = browser.find_element_by_css_selector('#searchNameButton')
searchSubmitButton.click() # clicks the Search Button on the BoP page

# Scrape table results
bopResultsPage = bs4.BeautifulSoup(res.text, 'html.parser')

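This should work. The results table is filled in by JavaScript after the search runs, so the HTML that requests.get() downloads never contains it. Read the rows from the Selenium-driven browser instead, after waiting for the results to appear: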

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.ui import WebDriverWait

firstName = input('Insert your first name: ')
lastName = input('Insert your last name: ')

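# Note: newer Selenium 4 releases no longer accept executable_path; pass a Service object there instead.
browser = webdriver.Firefox(executable_path='/usr/local/bin/geckodriver')
browser.get('https://www.bop.gov/inmateloc/')
browser.implicitly_wait(2)
# find_element(By.CSS_SELECTOR, ...) replaces the find_element_by_* helpers, which Selenium 4 removed
browser.find_element(By.CSS_SELECTOR, '#ui-id-1').click()
browser.find_element(By.CSS_SELECTOR, '#inmNameFirst').send_keys(firstName)
browser.find_element(By.CSS_SELECTOR, '#inmNameLast').send_keys(lastName)
browser.find_element(By.CSS_SELECTOR, '#searchNameButton').click()
# Wait (up to 5 seconds) until the results header is rendered before reading the table
WebDriverWait(browser, 5).until(expected_conditions.text_to_be_present_in_element((By.XPATH, '//*[@id="nameBriefTd"]'), 'Results for search'))

# Print every cell of every result row, with a blank line between rows
for row in browser.find_elements(By.XPATH, '//*[@id="inmateTable"]/tbody/tr'):
    for cell in row.find_elements(By.XPATH, 'td'):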
        print(cell.text)
    print()

browser.quit()  # ends the WebDriver session and shuts down geckodriver, not just the window
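
If you would rather parse the HTML yourself, as the question tried to do with BeautifulSoup, the thing to parse is browser.page_source taken after the wait, not res.text. Here is a minimal sketch of that idea using only the standard library's html.parser. The sample HTML is made up (the column names and values are placeholders, not real output from the site), the inmateTable id is taken from the XPath above, and TableRowParser and table_rows are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr>, inside one table."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.in_table = False
        self.in_cell = False
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == 'table' and dict(attrs).get('id') == self.table_id:
            self.in_table = True
        elif self.in_table and tag == 'tr':
            self.rows.append([])            # start a new row
        elif self.in_table and tag == 'td' and self.rows:
            self.in_cell = True
            self.rows[-1].append('')        # start a new cell in the current row

    def handle_endtag(self, tag):
        if tag == 'table':
            self.in_table = False           # assumes no nested tables
        elif tag == 'td':
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.rows[-1][-1] += data       # accumulate text of the current cell


def table_rows(html, table_id='inmateTable'):
    """Return a list of rows, each a list of stripped cell strings."""
    parser = TableRowParser(table_id)
    parser.feed(html)
    parser.close()
    # Rows without any <td> (for example header rows made of <th>) are dropped.
    return [[cell.strip() for cell in row] for row in parser.rows if row]


# Made-up stand-in for browser.page_source after the results have rendered.
sample_html = """
<table id="inmateTable">
  <thead><tr><th>Name</th><th>Register #</th></tr></thead>
  <tbody>
    <tr><td>EXAMPLE PERSON</td><td>00000-000</td></tr>
    <tr><td>ANOTHER PERSON</td><td>11111-111</td></tr>
  </tbody>
</table>
"""

for row in table_rows(sample_html):
    print(row)
```

In the real script you would call table_rows(browser.page_source) right after the WebDriverWait line returns and before the browser is closed. The same page_source string can be handed to bs4.BeautifulSoup if you prefer that.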
