How to use Selenium to get content from a table and filter it

Apologies for my earlier imprecise question. I have written a Python script that opens a music database website and looks up a certain artist (in my case "cro"). A table then comes up, and I want to extract data from it. My code looks like this:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

# Raw string so the backslashes in the Windows path are never read as escapes
PATH = r"C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)

driver.get("https://repsearch.ppluk.com/")
print(driver.title)

# find_element(By.NAME, ...) also works on Selenium 4, where find_element_by_name was removed
search = driver.find_element(By.NAME, "pt1:rec_band_artist")
search.send_keys("cro")
search.send_keys(Keys.RETURN)
    
try:
    # Wait up to 10 seconds for the results table to appear
    table = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "pt1:searchResultsTable::db"))
    )
    print(table.text)
finally:
    # Close the browser whether or not the table was found
    driver.quit()

input()

There are multiple columns in the table like "Artist Name", "Recording Title", "Release Date", and so on. I'd like the program to ONLY print a line if the "Release Date" has the value "2021". The recording title "ALLES DOPE" has this value, so the program is supposed to only print the information in the row where "ALLES DOPE" is located.

Here is what the table looks like: [screenshot of the results table]

As you've probably already guessed, I'm a noob at Python. I started just a few weeks ago, and when I looked this problem up I couldn't find anything useful. So thanks in advance <3

To perform the search, try this:

search = driver.find_element(By.XPATH, "//input[contains(@id,'rec_band_artist') and contains(@class,'af_inputText_content')]")
search.send_keys("cro")
search.send_keys(Keys.RETURN)

The locator you used won't always work. Once the table appears, loop through its rows and collect the information from each row into an object with these attributes:

  • name
  • title
  • isrc
  • rightsholder
  • recording_date
  • duration.

For this, you will need a locator that matches every row, and then you read each row's cells into your object.

Once you have the objects, you can extract any data from them. That is the approach I would use. As you can see, your question is not an easy one: you are asking for the whole program to be written.
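The object described above could look something like the following minimal sketch. The names `Recording`, `from_cells`, and `released_in` are hypothetical, the column order is assumed to match the attribute list, and the sample rows are placeholders for illustration rather than real data from the site.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Recording:
    """One table row; field order is assumed to match the column order."""
    name: str
    title: str
    isrc: str
    rightsholder: str
    recording_date: str
    duration: str

    @classmethod
    def from_cells(cls, cells: List[str]) -> "Recording":
        # Expects exactly six cell texts, one per column.
        if len(cells) != 6:
            raise ValueError(f"expected 6 cells, got {len(cells)}")
        return cls(*(cell.strip() for cell in cells))


def released_in(recordings: List[Recording], year: str) -> List[Recording]:
    # Keep only the recordings whose date column equals the given year.
    return [r for r in recordings if r.recording_date == year]


# Placeholder rows for illustration only (not real data from the site).
rows = [
    ["CRO", "ALLES DOPE", "ISRC-A", "LABEL-A", "2021", "03:00"],
    ["CRO", "OTHER TITLE", "ISRC-B", "LABEL-B", "2014", "03:30"],
]
recordings = [Recording.from_cells(cells) for cells in rows]
for rec in released_in(recordings, "2021"):
    print(rec.name, rec.title, rec.recording_date)
```

Keeping the filtering separate from the scraping means the year check can be tried out on plain lists before a browser is involved.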

I suggest dividing it into parts and searching for approaches to getting data from tables.

Edit: Finding the table rows is relatively easy. To find all rows, use:

rows = driver.find_elements(By.CSS_SELECTOR, ".af_table_data-row")

Next, to find the name and title in each row, use:

for row in rows:
    # Search inside the row; searching from driver would always return the first row's cell
    name = row.find_element(By.CSS_SELECTOR, "td:nth-child(1)").text
    title = row.find_element(By.CSS_SELECTOR, "td:nth-child(2)").text

and so on.
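Putting the row and cell lookups together, a rough sketch could collect the text of every cell, grouped by row. `table_as_text` is a hypothetical helper, not part of Selenium. It passes the locator strategy as the plain string "css selector", which is the value behind `By.CSS_SELECTOR`, so the sketch itself does not need to import selenium. It also assumes the rows contain no nested tables, since the "td" selector would pick up nested cells as well.

```python
from typing import List

# Same string value as selenium's By.CSS_SELECTOR; spelled out here so the
# sketch has no import-time dependency on selenium.
CSS = "css selector"


def table_as_text(driver, row_selector: str = ".af_table_data-row") -> List[List[str]]:
    """Return the stripped text of every cell, grouped by row."""
    result = []
    for row in driver.find_elements(CSS, row_selector):
        # Search inside the row, not the whole page, so each row yields its own cells.
        cells = row.find_elements(CSS, "td")
        result.append([cell.text.strip() for cell in cells])
    return result
```

With a live session this would be called as `table_as_text(driver)` once the results table has loaded.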

Try this:

YOUR_YEAR = '2021'
# Collect every row instead of assuming there are exactly 15 results
rows = driver.find_elements(By.XPATH, '//*[@id="pt1:searchResultsTable::db"]/table/tbody/tr')
for row in rows:
    # td[5] is taken to be the release-date column
    year = row.find_element(By.XPATH, './td[5]').text
    if year == YOUR_YEAR:
        print(row.text)
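The comparison `year == YOUR_YEAR` only matches when the cell contains exactly the year. If the column ever holds a full date instead (an assumption; the question only mentions the value "2021"), pulling out the four-digit year first is more forgiving. `release_year` below is a hypothetical helper, sketched under that assumption.

```python
import re
from typing import Optional

# A standalone four-digit year between 1900 and 2099, not part of a longer number.
_YEAR = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def release_year(cell_text: str) -> Optional[str]:
    """Return the first standalone four-digit year in the text, or None."""
    match = _YEAR.search(cell_text)
    return match.group(0) if match else None


print(release_year("2021"))        # 2021
print(release_year("01/03/2021"))  # 2021
print(release_year(""))            # None
```

The loop would then compare `release_year(year) == YOUR_YEAR` instead of the raw cell text.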

 