Scraping a dynamic page with Python

Question

I'm trying to scrape this website in Python. When we enter a company code (for example, 6177), the URL doesn't change, but the page and the values on it do.

Only one cell needs to be scraped. A screenshot of the exact cell is attached. The cell's address is:

xpath - //*[@id="company"]/table[3]/tbody/tr[4]/td[1]
cssselector - #company > table:nth-child(17) > tbody > tr:nth-child(4) > td:nth-child(1)

How should I go about this?

Thank you!

Answer 1

To get the text 190,843 from the table, use WebDriverWait() with visibility_of_element_located() and the following XPath.

from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get("https://mops.twse.com.tw/mops/web/index")
# Wait for the search box to be clickable, type the company code, and submit with Enter.
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, "keyword"))).send_keys("6177", Keys.ENTER)
# Wait until the target cell is visible, then print its text.
print(WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//div[text()='營收資訊']/following::table[1]//tr[4]/td[1]"))).text)

Output:

190,843
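
Since the question mentions entering different company codes, the steps above could be wrapped in a function. The following is only a sketch under assumptions: the helper names cell_xpath, parse_count, and fetch_cell_value are hypothetical (not part of Selenium), and it assumes the page keeps the structure the answer relies on (a search box with id "keyword" and a table following the 營收資訊 heading). The Selenium imports sit inside the function so the two pure helpers can be used without a browser.

```python
URL = "https://mops.twse.com.tw/mops/web/index"


def cell_xpath(heading, row, col):
    """Build the answer's XPath: a cell in the first table after a div with this text."""
    return f"//div[text()='{heading}']/following::table[1]//tr[{row}]/td[{col}]"


def parse_count(text):
    """Convert text such as '190,843' to an int by dropping thousands separators."""
    return int(text.strip().replace(",", ""))


def fetch_cell_value(driver, company_code, timeout=10):
    """Search for a company code and return the target cell as an int.

    `driver` is an already-created Selenium WebDriver (e.g. webdriver.Chrome()).
    """
    # Imported here so the helpers above do not require Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    driver.get(URL)
    # Type the code into the search box and submit with Enter.
    search_box = wait.until(EC.element_to_be_clickable((By.ID, "keyword")))
    search_box.send_keys(company_code, Keys.ENTER)
    # The page updates without a URL change, so wait for the cell to become visible.
    cell = wait.until(
        EC.visibility_of_element_located((By.XPATH, cell_xpath("營收資訊", 4, 1)))
    )
    return parse_count(cell.text)
```

With a driver created as in the answer, fetch_cell_value(driver, "6177") would return the cell as a number rather than a string, assuming the cell still contains a comma-separated integer.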

Answer 1 (accepted, 2020-04-02 17:28:53)