
Selenium Python: return all outputs

The purpose of this code is to open a list of URLs, extract the title from each page, and then use those titles in another function.

Here is the code:

from selenium import webdriver


class DataEngine:
    def __init__(self):
        self.urls = open(r"C:\Users\Sayed\Desktop\script\links.txt").readlines()
        self.driver = webdriver.Chrome(r"D:\Projects\Tutorial\Driver\chromedriver.exe")

    def title(self):
        for url in self.urls:
            self.driver.get(url)
            title = self.driver.find_element_by_xpath('//*[@id="leftColumn"]/h1').text
            return title

    def rename(self):
        names = self.title()
        for name in names:
            print(name)


x = DataEngine()
x.rename()

Here is what I expected:

Title (1)

Title (2)

Title (3)

Title (4)

Here is the output:

T

i

t

l

e

(

1

)

Build a list of the results for each URL. Currently title() returns inside the loop, so it hands back only the first page's title as a single string, and iterating over that string in rename() prints it one character at a time:

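from selenium import webdriver
from selenium.webdriver.common.by import By

class DataEngine:
    def __init__(self):
        # Read one URL per line, dropping trailing newlines and blank lines.
        with open(r"C:\Users\Sayed\Desktop\script\links.txt") as f:
            self.urls = [line.strip() for line in f if line.strip()]
        # Selenium 3 style constructor; newer Selenium 4 releases expect a Service object instead.
        self.driver = webdriver.Chrome(r"D:\Projects\Tutorial\Driver\chromedriver.exe")

    def title(self):
        titles = []  # collect one title per URL
        for url in self.urls:
            self.driver.get(url)
            # find_element(By.XPATH, ...) replaces the deprecated find_element_by_xpath.
            title = self.driver.find_element(By.XPATH, '//*[@id="leftColumn"]/h1').text
            titles.append(title)
        # Return after the loop so every title is included.
        return titles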

    def rename(self):
        names = self.title()
        for name in names:
            print(name)


x = DataEngine()
x.rename()
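
To see why the original version printed single characters, here is a minimal sketch that reproduces the behaviour without Selenium. The function names (first_title, all_titles, iter_titles) and the pages list are stand-ins made up for illustration; they are not part of the code above. The last function shows a generator variant, which is one possible alternative to building a list.

```python
def first_title(pages):
    # Mirrors the original title(): `return` inside the loop exits on the first item.
    for page in pages:
        return page


def all_titles(pages):
    # Mirrors the corrected title(): collect every item, return after the loop.
    titles = []
    for page in pages:
        titles.append(page)
    return titles


def iter_titles(pages):
    # Generator variant: yields titles one at a time instead of building a list.
    for page in pages:
        yield page


pages = ["Title (1)", "Title (2)", "Title (3)"]  # stand-in data, not scraped

# Iterating over a single string walks its characters.
print(list(first_title(pages)))  # ['T', 'i', 't', 'l', 'e', ' ', '(', '1', ')']
print(all_titles(pages))         # ['Title (1)', 'Title (2)', 'Title (3)']
print(list(iter_titles(pages)))  # ['Title (1)', 'Title (2)', 'Title (3)']
```

With the generator variant, rename() could loop over the result directly, and each title would be available as soon as its page has been processed.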
