The purpose of this code is to scrape a list of URLs, extract the title from each web page, and then use the results in another function.
Here is the code:
from selenium import webdriver

class DataEngine:
    def __init__(self):
        self.urls = open(r"C:\Users\Sayed\Desktop\script\links.txt").readlines()
        self.driver = webdriver.Chrome(r"D:\Projects\Tutorial\Driver\chromedriver.exe")

    def title(self):
        for url in self.urls:
            self.driver.get(url)
            title = self.driver.find_element_by_xpath('//*[@id="leftColumn"]/h1').text
            return title

    def rename(self):
        names = self.title()
        for name in names:
            print(name)

x = DataEngine()
x.rename()
Here is what I expected:
Title (1)
Title (2)
Title (3)
Title (4)
Here is the output:
T
i
t
l
e
(
1
)
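Build a list of the results for each URL. Currently you are returning only one result (the first), and because that result is a string, the loop in rename() iterates over its characters, which is why it prints one character per line: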
from selenium import webdriver

class DataEngine:
    def __init__(self):
        self.urls = open(r"C:\Users\Sayed\Desktop\script\links.txt").readlines()
        self.driver = webdriver.Chrome(r"D:\Projects\Tutorial\Driver\chromedriver.exe")

    def title(self):
        titles = []
        for url in self.urls:
            self.driver.get(url)
            title = self.driver.find_element_by_xpath('//*[@id="leftColumn"]/h1').text
            titles.append(title)
        return titles

    def rename(self):
        names = self.title()
        for name in names:
            print(name)

x = DataEngine()
x.rename()
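To see the difference without launching a browser, here is a minimal sketch of the same two patterns. The function fetch_title is a hypothetical stand-in for the Selenium lookup (it only fabricates a title from its argument), and the names first_title_only and all_titles are made up for illustration; they are not part of the original code.

```python
def fetch_title(url):
    # Hypothetical stand-in for driver.get(url) plus the XPath lookup.
    return "Title ({})".format(url)


def first_title_only(urls):
    for url in urls:
        title = fetch_title(url)
        return title  # leaves the function on the first iteration, returns a str


def all_titles(urls):
    titles = []
    for url in urls:
        titles.append(fetch_title(url))
    return titles  # runs after the loop has finished, returns a list of str


urls = ["1", "2", "3", "4"]  # placeholder values, not real URLs

# Iterating over a str yields its characters one at a time.
print(list(first_title_only(urls)))

# Iterating over a list yields one whole title per item.
print(all_titles(urls))
```

Looping over the first result walks through the characters of "Title (1)", while looping over the second walks through four complete titles, which matches the expected output in the question.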