Can't get elements with Python and Selenium by XPath
I am trying to get the "job-title" and the "href" of each listing on this webpage with Python and Selenium.
It only returns blanks and no data.
job_card = driver.find_elements_by_xpath('//div[contains(@class,"job-info-wrapper ")]')
for job in job_card:
    try:
        title = job.find_elements_by_xpath('.//a[contains(@class, "job-title ")]')
    except:
        title = job.find_elements_by_xpath('.//a[contains(@class, "job-title ")]').get_attribute(name="job-title ")
    titles.append(title)
    print(title)
    links.append(job.get_attribute(name="a href"))
This is the webpage:
What am I doing wrong here?
As per the DOM, the title of the job is the text contained in the a tag.
Use .get_attribute("innerText") or .text to get the title from the job option.
To retrieve the href attribute from the element, use .get_attribute("href").
To find a single element, use find_element instead of find_elements. find_elements returns a list of WebElements, and a list has no get_attribute method.
Try it like below.
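from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()  # assumption: the original snippet does not show how the driver is created
driver.get("https://www.vietnamworks.com/job-search/all-jobs?filtered=true")
wait = WebDriverWait(driver,30)
# Close the filter pop-up if it shows up within the wait period.
try:
    wait.until(EC.element_to_be_clickable((By.XPATH,"//div[@class='sc-fznWqX dAkvW']//*[name()='svg' and @class='filter-close']"))).click()
except Exception:
    print("No pop-up")
titles = []
links = []
job_card = driver.find_elements_by_xpath('//div[contains(@class,"job-info-wrapper ")]')
for job in job_card:
    # find_element (singular) returns one WebElement, so get_attribute works on it.
    element = job.find_element_by_xpath(".//a[contains(@class,'job-title')]")
    title = element.get_attribute("innerText")
    link = element.get_attribute("href")
    print(f"{title} : {link}")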
No pop-up
Chuyên Viên Triển Khai Phần Mềm ERP / ERP Specialist(NEW) : https://www.vietnamworks.com/chuyen-vien-trien-khai-phan-mem-erp-erp-specialist-1438267-jd/?source=searchResults&searchType=2&placement=1438268&sortBy=date
[HN] Data Engineer(NEW) : https://www.vietnamworks.com/hn-data-engineer-2-1431499-jd/?source=searchResults&searchType=2&placement=1431500&sortBy=date
Chuyên Viên Pháp Chế(NEW) : https://www.vietnamworks.com/chuyen-vien-phap-che-510-1-1429155-jd/?source=searchResults&searchType=2&placement=1429156&sortBy=date
...
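The find_element_by_* helpers used above were deprecated in Selenium 4 and later removed, so on a recent release the same extraction can be written with the generic find_element(by, value) form. The following is only a sketch under that assumption: extract_jobs is a hypothetical helper name, and the string "xpath" is the value behind Selenium's By.XPATH constant, used here so the function itself needs no import.

```python
# Sketch only: extract_jobs is a hypothetical helper, not part of Selenium.
# "xpath" is the string value of selenium.webdriver.common.by.By.XPATH.
XPATH = "xpath"


def extract_jobs(driver):
    """Return a list of (title, link) tuples, one per job card on the page."""
    jobs = []
    # find_elements (plural) returns a list of elements, possibly empty.
    cards = driver.find_elements(XPATH, '//div[contains(@class,"job-info-wrapper ")]')
    for card in cards:
        # find_element (singular) returns one element, so get_attribute is available.
        anchor = card.find_element(XPATH, ".//a[contains(@class,'job-title')]")
        title = anchor.get_attribute("innerText")
        link = anchor.get_attribute("href")
        jobs.append((title, link))
    return jobs
```

With a live browser session this would be called as extract_jobs(driver) after the page has loaded and any pop-up has been dismissed.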
Once you have the job card, just append its href and inner text. The next-page click should also be unindented, so that it runs once per page instead of once per job. It also helps to use waits at the start to handle any pop-ups.
import pandas as pd
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()  # assumption: the original snippet does not show how the driver is created
wait = WebDriverWait(driver, 10)
driver.get('https://www.vietnamworks.com/job-search/all-jobs?filtered=true')
titles = []
links = []
###########################################################################################
# Click search button
try:
    wait.until(EC.element_to_be_clickable((By.XPATH, '//a[contains(@class, "button searchBar__button")]'))).click()
except Exception:
    pass
try:
    wait.until(EC.element_to_be_clickable((By.XPATH, '//a[contains(@class, "button searchBar__button")]'))).click()
except Exception:
    pass
###########################################################################################
# Loop over the result pages
for i in range(0, 20):
    job_card = wait.until(EC.presence_of_all_elements_located((By.XPATH, "//div[contains(@class,'job-info-wrapper ')]//a[@class='job-title priorityJob']")))
    print(len(job_card))
    for job in job_card:
        links.append(job.get_attribute("href"))
        titles.append(job.text)
        print(job.get_attribute("href"), job.text)
    # Go to the next page once per page, outside the inner loop.
    try:
        wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@class='page-link' and .='>']"))).click()
    except TimeoutException:
        # wait.until raises TimeoutException (not NoSuchElementException) when no next-page link appears.
        break
    print("Page: {}".format(str(i+2)))
df_da = pd.DataFrame()
df_da['Title'] = titles
df_da['Link'] = links
print(df_da)
Output:
Title Link
0 QC Engineers (Tester, QA QC, Manual)(NEW) https://www.vietnamworks.com/qc-engineers-test...
1 Business Analyst (IT Industry)(NEW) https://www.vietnamworks.com/business-analyst-...
2 Unity Game Developer (Up to 40,000,000 VNĐ)(NEW) https://www.vietnamworks.com/unity-game-develo...
3 3D Modeler ( Background Modeler ) - Up to 30,0... https://www.vietnamworks.com/3d-modeler-backgr...
4 Chuyên Viên Quản Trị Hệ Thống Công Nghệ Thông ... https://www.vietnamworks.com/chuyen-vien-quan-...
5 Financial Analyst(NEW) https://www.vietnamworks.com/financial-analyst...
6 Chuyên Viên Cao Cấp Tuyển Dụng (Nghỉ Thứ 7 Và ... https://www.vietnamworks.com/chuyen-vien-cao-c...
7 Chuyên Viên Cao Cấp Tài Chính(NEW) https://www.vietnamworks.com/chuyen-vien-cao-c...
8 Trưởng Ban Kiểm Toán Nội Bộ(NEW) https://www.vietnamworks.com/truong-ban-kiem-t...
9 Dealer Operation Staff(NEW) https://www.vietnamworks.com/dealer-operation-...
10 Supervisor - Tiếng Nhật - Phòng Sale(NEW) https://www.vietnamworks.com/supervisor-tieng-...
11 IT Manager – Back Office Division(NEW) https://www.vietnamworks.com/it-manager-back-o...
12 Hot Job - Nhân Viên Xuất Nhập Khẩu (Lương Thưở... https://www.vietnamworks.com/hot-job-nhan-vien...
13 General Accountant for Luxury Brand - Attracti... https://www.vietnamworks.com/general-accountan...
14 Chuyên Viên Kinh Doanh (Thương Mại Điện Tử)(NEW) https://www.vietnamworks.com/chuyen-vien-kinh-...
15 Java Developer (Thu Nhập Tương Đương Từ 14 - 2... https://www.vietnamworks.com/java-developer-th...
16 Logistics Executive (Salary up to 500$ Per mon... https://www.vietnamworks.com/logistics-executi...
17 Customs Liquidation & Customs Declaration Staf... https://www.vietnamworks.com/customs-liquidati...
18 Chuyên Viên Kinh Doanh Thiết Bị Y Tế - [Mức Lư... https://www.vietnamworks.com/chuyen-vien-kinh-...
19 Trade Operation Officer(NEW) https://www.vietnamworks.com/trade-operation-o...
20 Nhân Viên PR - Quản Lý Đô Thị(NEW) https://www.vietnamworks.com/nhan-vien-pr-quan...
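The explicit waits in the script above (WebDriverWait combined with expected_conditions) boil down to polling a condition until it returns something truthy or a timeout expires. The sketch below illustrates that idea in plain Python so it can be run without a browser. It is not Selenium's implementation, and wait_until, WaitTimeout, and their parameters are hypothetical names chosen for the illustration.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy (hypothetical name)."""


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll)
```

In the real script, wait.until(...) plays the role of wait_until and Selenium's TimeoutException plays the role of WaitTimeout. That is why a missing next-page link surfaces as a TimeoutException from wait.until rather than as a NoSuchElementException.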
You were almost there. You just need two minor modifications:
First, get_attribute() is a method of a WebElement, so instead of find_elements* you need to use find_element*.
Second, get_attribute() only needs the attribute name as its argument, for example get_attribute("class").
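The difference between the two lookups can be seen without a browser: find_elements* hands back a plain Python list, and a list has no get_attribute method. The sketch below uses a stand-in class to show this. FakeElement is hypothetical and only mimics the one behaviour needed here, namely that get_attribute returns None for an attribute that does not exist.

```python
class FakeElement:
    """Stand-in for a WebElement (hypothetical); only mimics get_attribute."""

    def __init__(self, attrs):
        self._attrs = attrs

    def get_attribute(self, name):
        # Like Selenium, return None when the attribute is absent.
        return self._attrs.get(name)


anchor = FakeElement({"class": "job-title", "href": "https://example.com/job"})
found = [anchor]  # the shape of what find_elements* returns: a list

try:
    found.get_attribute("href")  # calling it on the list itself fails
except AttributeError as exc:
    error = str(exc)

href = found[0].get_attribute("href")     # works on a single element
missing = anchor.get_attribute("a href")  # "a href" is not an attribute name, so this is None
```

This mirrors the two problems in the question: get_attribute was called on the list returned by find_elements_by_xpath, and the names passed to it ("job-title " and "a href") are not attributes of the element.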