
Scraping a Dynamic Page in Python

I'm trying to scrape this website in Python. When we enter a company code (for example, 6177), the URL doesn't change, but the page and the values on it do.

There's just one cell that needs scraping. A screenshot of the exact cell is attached. The cell's address is:

xpath - //*[@id="company"]/table[3]/tbody/tr[4]/td[1]
cssselector - #company > table:nth-child(17) > tbody > tr:nth-child(4) > td:nth-child(1)

How should I go about this?

Thank you!


To get the text 190,843 from the table, use WebDriverWait() together with visibility_of_element_located() and the following XPath.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://mops.twse.com.tw/mops/web/index")
wait = WebDriverWait(driver, 10)  # wait up to 10 seconds for each condition

# Wait for the search box, type the company code, and submit with Enter.
wait.until(EC.element_to_be_clickable((By.ID, "keyword"))).send_keys("6177", Keys.ENTER)

# Wait until the cell in the first table after the "營收資訊" heading is visible.
cell_xpath = "//div[text()='營收資訊']/following::table[1]//tr[4]/td[1]"
print(wait.until(EC.visibility_of_element_located((By.XPATH, cell_xpath))).text)

driver.quit()  # close the browser when done

Output:

190,843
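
The explicit wait matters here because, on a page that updates without changing its URL, the table is presumably filled in some time after the search is submitted. The sketch below illustrates the general polling idea behind such a wait in plain Python. It is not Selenium's implementation: wait_until, cell_text, and the simulated "page" are hypothetical names made up for this illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Hypothetical helper for illustration only; it is not part of Selenium.
    Raises TimeoutError if `timeout` seconds pass without a truthy result.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Simulated dynamic page: the cell text only "appears" on the third check,
# standing in for a table that is rendered after the search request returns.
attempts = {"count": 0}


def cell_text():
    attempts["count"] += 1
    return "190,843" if attempts["count"] >= 3 else None


text = wait_until(cell_text, timeout=2.0, poll_interval=0.01)
print(text)
```

Reading the cell immediately, without a wait of this kind, would correspond to the first call of cell_text above, which returns nothing because the content has not arrived yet.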
