I'm trying to extract each link that contains "dungeon", add each one to a list, and remove the duplicates. Once they are separated, removing the duplicates should be easy. I have a feeling I'm missing something simple. I would share the site, but it requires an account.
This is what I have so far. It grabs the links, but the result seems to be one big string. How do I separate them into a list so I can pick each one out? I will be clicking them later. Or is there a better way to filter for links with "dungeon" in them using XPath? Link text doesn't work.
for elem in elems:
    if "dungeon" in elem.get_attribute("href"):
        list = elem.get_attribute("href")
        print(list)
        print(list[0])
And this is the output:
javascript:dungeon(0,84579684);
j
javascript:dungeon(0,84579684);
j
javascript:dungeon(0,84579674);
j
javascript:dungeon(0,84579674);
j
javascript:dungeon(0,84579672);
j
javascript:dungeon(0,84579672);
j
javascript:dungeon(0,84579662);
j
javascript:dungeon(0,84579662);
j
I think the output is one big string:
print(list)
javascript:dungeon(0,84579684);
javascript:dungeon(0,84579684);
javascript:dungeon(0,84579674);
javascript:dungeon(0,84579674);
javascript:dungeon(0,84579672);
javascript:dungeon(0,84579672);
javascript:dungeon(0,84579662);
javascript:dungeon(0,84579662);
I want to be able to call print(list[3]) and get javascript:dungeon(0,84579674); instead of "a".
I would do something like this: use the .append method to add each href to a list.
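from selenium.webdriver.common.by import By

# Assumes `driver` is an already-created Selenium WebDriver.
url = "https://www.sofascore.com/de/tennis/2019-01-01"
driver.get(url)
href_bucket = []
# find_elements_by_xpath was removed in Selenium 4; this form works in 3 and 4.
elems = driver.find_elements(By.XPATH, "//a")
print(len(elems))
counter = 0
fail_counter = 0
for ele in elems:
    href = ele.get_attribute("href")  # None when the <a> has no href
    if href and "de" in href:
        counter = counter + 1
        href_bucket.append(href)  # append keeps every match in the list
    else:
        fail_counter = fail_counter + 1
print(href_bucket[3])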
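To see why the loop in the question printed "j", here is a minimal sketch that needs no browser. Plain strings stand in for the values returned by get_attribute("href"). The dungeon hrefs are copied from the question's output, while the "javascript:other(1);" entry and the variable names are made up for illustration.

```python
# Plain strings stand in for elem.get_attribute("href"); no browser needed.
hrefs = [
    "javascript:dungeon(0,84579684);",
    "javascript:dungeon(0,84579684);",
    "javascript:other(1);",  # hypothetical non-dungeon link
    "javascript:dungeon(0,84579674);",
    "javascript:dungeon(0,84579674);",
]

# Rebinding: the variable only ever holds one string, the latest match.
last_link = None
for href in hrefs:
    if "dungeon" in href:
        last_link = href
first_char = last_link[0]  # indexing a string yields a single character

# Appending: dungeon_links grows into a real list of strings.
dungeon_links = []
for href in hrefs:
    if "dungeon" in href:
        dungeon_links.append(href)

print(first_char)
print(dungeon_links[3])
```

Indexing the list returns a whole href, whereas indexing the rebound variable returns one character of the most recent href.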
If you want to remove duplicates while keeping the original order:
seen = set()
unique_hrefs = []
for item in href_bucket:
    if item not in seen:
        seen.add(item)
        unique_hrefs.append(item)
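Filtering and duplicate removal can also be combined in one small helper. This is only a sketch: the function name unique_dungeon_links is hypothetical, and it assumes you have already collected the href values (some of which may be None) from the elements.

```python
def unique_dungeon_links(hrefs):
    """Return hrefs containing "dungeon", de-duplicated, in first-seen order."""
    # Skip None/empty values so the "in" check cannot raise a TypeError.
    matches = [h for h in hrefs if h and "dungeon" in h]
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(matches))


sample = [
    "javascript:dungeon(0,84579684);",
    "javascript:dungeon(0,84579684);",
    None,  # an <a> without an href
    "javascript:dungeon(0,84579674);",
    "javascript:dungeon(0,84579674);",
    "javascript:dungeon(0,84579672);",
]
links = unique_dungeon_links(sample)
print(links[1])
```

With Selenium the input could be built as [e.get_attribute("href") for e in elems]. Using set(hrefs) alone would also drop duplicates, but it does not preserve the order of the links on the page.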