
Printing table values from specific web page

I want to extract and print all the entries for a specific month from the table on this page.

import os
from webdriver_manager.chrome import ChromeDriverManager
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument('--ignore-certificate-errors')
options.add_argument('--start-maximized')
options.page_load_strategy = 'eager'

driver = webdriver.Chrome(options=options)

wait = WebDriverWait(driver, 20)   
driver.get("https://www.sebi.gov.in/sebiweb/home/HomeAction.do?doListing=yes&sid=3&ssid=22&smid=18")

month = "Apr"
year = "2021"

How can I print all the rows of the table that match a specific month and year?

You can try something like this:

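from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.sebi.gov.in/sebiweb/home/HomeAction.do?doListing=yes&sid=3&ssid=22&smid=18')

month = "Apr"
year = "2021"

# The first cell of each row holds the date, e.g. "Apr 29, 2021".
# find_elements(By.XPATH, ...) replaces find_elements_by_xpath, which was removed in Selenium 4.3.
for date_cell in driver.find_elements(By.XPATH, "//table/tbody/tr/td[1]"):
    if month in date_cell.text and year in date_cell.text:
        # The next cell in the same row holds the title.
        title_cell = date_cell.find_element(By.XPATH, "./following-sibling::td")
        print(date_cell.text, " ", title_cell.text)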

Prints:

Apr 29, 2021   Rane Brake Lining Ltd. - Post Buyback Public Announcement
Apr 06, 2021   Insecticides (India) Limited - Public Announcement
Apr 06, 2021   Jagran Prakashan Limited - Filing of Public Announcement
Apr 05, 2021   Sreeleathers Limited - Post Buyback Public Announcement

Of course, that only gets the results on the first page; you would need to handle pagination if you want more than that.
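The check above is a plain substring test on the date cell. If you want something stricter, here is a minimal sketch that parses the cell text as a date instead. It assumes the cells always look like "Apr 29, 2021" and that Python is running with an English locale; matches_month is just an illustrative helper name, not part of Selenium.

```python
from datetime import datetime


def matches_month(date_text, month, year):
    """Return True if date_text (e.g. 'Apr 29, 2021') falls in the given month and year."""
    try:
        # Assumed cell format: abbreviated month, day, four-digit year.
        parsed = datetime.strptime(date_text.strip(), "%b %d, %Y")
        wanted = datetime.strptime(f"{month} {year}", "%b %Y")
    except ValueError:
        # Text that is not a date in the expected format never matches.
        return False
    return (parsed.year, parsed.month) == (wanted.year, wanted.month)


for text in ["Apr 29, 2021", "Mar 30, 2021", "Apr 29, 2020"]:
    print(text, matches_month(text, "Apr", "2021"))
```

In the loop from the answer you could then call matches_month on the date cell's text in place of the two "in" checks.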

First, set the date range in the page's filters. Then get the page source with data = driver.page_source.

Next, parse it with bs4: soup = BeautifulSoup(data, 'html.parser'). Then loop over the rows with for row in soup.select('div.table-scrollable tbody tr'), taking date = row.select('td')[0] and title = row.select('td')[1] from each row.
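Put together, those steps might look roughly like the sketch below. To keep it runnable without a browser, it parses a small hard-coded HTML string standing in for driver.page_source. That sample markup (including the "Example Company Ltd." row) is made up for illustration, and the real page may be structured differently. rows_for_month is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for driver.page_source; the real markup may differ.
SAMPLE_HTML = """
<div class="table-scrollable">
  <table>
    <tbody>
      <tr><td>Apr 29, 2021</td><td>Rane Brake Lining Ltd. - Post Buyback Public Announcement</td></tr>
      <tr><td>Mar 30, 2021</td><td>Example Company Ltd. - Public Announcement</td></tr>
      <tr><td>Apr 06, 2021</td><td>Insecticides (India) Limited - Public Announcement</td></tr>
    </tbody>
  </table>
</div>
"""


def rows_for_month(html, month, year):
    """Return (date, title) pairs whose date cell mentions month and year."""
    soup = BeautifulSoup(html, "html.parser")
    matches = []
    for row in soup.select("div.table-scrollable tbody tr"):
        cells = row.select("td")
        if len(cells) < 2:
            continue  # skip header or malformed rows
        date = cells[0].get_text(strip=True)
        title = cells[1].get_text(strip=True)
        if month in date and year in date:
            matches.append((date, title))
    return matches


for date, title in rows_for_month(SAMPLE_HTML, "Apr", "2021"):
    print(date, " ", title)
```

With a live session you would pass driver.page_source instead of SAMPLE_HTML.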

Happy coding.

