Selenium (Python): get a specific part of an href

I have this href element and I'm trying to print only the number inside the href. However, the path after that number also contains digits, so I'm not sure how to grab only the number I want without also picking up the digits in Mad1000.

href: https://www.game.com/items/20573078/Mad1000

userLink = driver.find_element_by_xpath(f"//*[@id='bc_owners_table']/tbody/tr[{i+1}]/td[7]/a").get_attribute("href")


userID = re.sub('[^0-9]', '', userLink)
print(userID)

The outcome ends up being 205730781000, but I only want to print 20573078. How would I achieve this?

There are 4 good ways to do this:

userID = [int(s) for s in href.split("/") if s.isdigit()]
print(userID[0])

userID = re.findall(r'\d+', href)
print(userID[0])

userID = href.split("/")[4]
print(userID)

userID = re.sub('[^0-9]', '', href)[:-4]
print(userID)

Let me explain. PS: I used the href variable, but you can change it to userLink and it should work.

The first method splits the string into a list at every / . It then checks each item with isdigit() and keeps only the ones made up entirely of digits, converting them to integers. The result is a list, so we use userID[0] to get the first (and usually only) element. Mad1000 does not end up in the list because it mixes letters and digits, so isdigit() returns False for it. The list will only contain integers.
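To see why the first method skips Mad1000, here is a small sketch that exposes the intermediate values. The href is the example URL from the question; the names parts, digit_parts and user_id are only for illustration.

```python
href = "https://www.game.com/items/20573078/Mad1000"

# Split on "/" to get the individual pieces of the URL.
parts = href.split("/")

# str.isdigit() is True only when the string is non-empty and every
# character is a digit, so "Mad1000" and "" are rejected.
digit_parts = [s for s in parts if s.isdigit()]

user_id = int(digit_parts[0])
print(parts)
print(user_id)
```

If no piece is purely numeric, digit_parts is empty and digit_parts[0] raises an IndexError, so it may be worth checking the list before indexing.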

The second method returns a list of every run of digits in the string. This time 1000 is included because it is a number, so we use userID[0] to get the first element of the list, which is 20573078 because there are no digits before it (there might be if the href changes). Note that the values here are strings, not integers.
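A short sketch of what re.findall returns here, and of how the first match can shift if the URL changes. The second URL (with a www2 host) is made up purely to illustrate that risk.

```python
import re

href = "https://www.game.com/items/20573078/Mad1000"

# re.findall returns every non-overlapping run of digits, left to right.
numbers = re.findall(r"\d+", href)
print(numbers)

# Hypothetical URL with a digit in the host name: the ID is no longer
# the first match, so numbers[0] would be wrong.
other = "https://www2.game.com/items/20573078/Mad1000"
other_numbers = re.findall(r"\d+", other)
print(other_numbers)
```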

The third method splits the string on / again. The difference is that this time we take the element at index 4 directly (the fifth item, since indexing starts at 0). You might need to experiment because, depending on the hyperlink, you might need index 3 or 5 instead. This is an alternative to option 1, which is similar but also checks that the value is a number.
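If it is unclear which index to use, printing each piece next to its position can help. This is only a debugging sketch using the example URL from the question.

```python
href = "https://www.game.com/items/20573078/Mad1000"

parts = href.split("/")

# Show every piece with its index; repr() makes the empty string
# between the two slashes of "https://" visible.
for index, part in enumerate(parts):
    print(index, repr(part))

user_id = parts[4]
print(user_id)
```

The empty string at index 1 comes from the double slash after https: and is the reason the ID sits at index 4 rather than 3.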

The final one gets the number using your method, but removes the last 4 characters using [:-4] . This only works while the part after the ID contains exactly four digits, as Mad1000 does.
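The sketch below wraps the fourth method in a helper (id_by_slicing is a name invented here) and runs it on the original URL and on a made-up one whose name contains only two digits, to show where the fixed slice breaks down.

```python
import re

def id_by_slicing(href):
    # Strip every non-digit, then drop the last four characters.
    return re.sub('[^0-9]', '', href)[:-4]

good = id_by_slicing("https://www.game.com/items/20573078/Mad1000")
# Hypothetical URL: the name has two digits instead of four,
# so the slice cuts into the ID itself.
bad = id_by_slicing("https://www.game.com/items/20573078/Mad99")

print(good)
print(bad)
```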

None of these methods are perfect, but they should work for what you want.
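One possible variation, not part of the four methods above, is to anchor the regular expression to the path segment that precedes the ID. This sketch assumes the ID always directly follows /items/ as in the example URL; extract_item_id is a hypothetical helper name.

```python
import re

def extract_item_id(href):
    # Assumption: the ID is the run of digits right after "/items/".
    match = re.search(r"/items/(\d+)", href)
    # Return None instead of raising when the pattern is absent.
    return match.group(1) if match else None

print(extract_item_id("https://www.game.com/items/20573078/Mad1000"))
```

Because the pattern names the segment it expects, digits elsewhere in the URL do not affect the result, but it will stop matching if the site changes its path layout.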
