How to extract just the number from html?

Question

I am trying to extract only the number from this HTML element:

<td bgcolor="green">
    <font color="white">
        "49.8 "
        <small>dBmV</small>
    </font>
</td>

How do I extract only the 49.8 without also getting the dBmV?

I can use an XPath to return the whole "49.8 dBmV", but when I use an XPath that targets just "49.8" I receive an error.

Error:

invalid selector: The result of the xpath expression "/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font/text()" is: [object Text]. It should be an element.

I have tried:

browser.find_element_by_xpath("/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font").text

which returns 49.8 dBmV

And then:

browser.find_element_by_xpath("/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font/text()").text

returns the exception above.

I just want the number 49.8 (which obviously changes). I know I could extract the number afterwards, but I'm hoping there is something a bit tidier that gets the value directly from the HTML.

Answer 1

To extract the text 49.8, you can use either of the following approaches:

Using xpath with execute_script() and textContent:

 print(driver.execute_script('return arguments[0].firstChild.textContent;', driver.find_element_by_xpath("//td[@bgcolor='green']/font[@color='white']")).strip())

Using xpath with get_attribute() and splitlines() (strip() removes the whitespace around the number):

 print(driver.find_element_by_xpath("//td[@bgcolor='green']/font[@color='white']").get_attribute("innerHTML").splitlines()[1].strip())
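To see why index 1 is the right line, here is a minimal offline sketch of what splitlines() does to the innerHTML. The string below is an assumption: it mirrors the layout shown in the question and treats the quotes around 49.8 as a display artifact of the browser's inspector, not as characters in the page.

```python
# Assumed innerHTML of the <font> element, laid out as in the question.
inner_html = '\n        49.8 \n        <small>dBmV</small>\n    '

lines = inner_html.splitlines()
# lines[0] is the empty text before the first newline;
# lines[1] holds the number with its surrounding whitespace.
value = lines[1].strip()
print(value)
```

If the page serves the element on a single line, there is no lines[1], so this approach depends on the exact whitespace of the source.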

Answer 2

You can use the first line and just get the number like this:

text_num = browser.find_element_by_xpath("/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font").text
print(float(text_num.split()[0]))

Hope this helped!
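If the text might not always be separated by whitespace, a regular expression is a slightly more tolerant variant of the same idea. This is only a sketch: extract_number is a hypothetical helper name, and the pattern assumes a plain decimal number with an optional sign.

```python
import re

# Optional sign, digits, optional fractional part.
NUMBER = re.compile(r"[-+]?\d+(?:\.\d+)?")

def extract_number(text):
    """Return the first number found in text as a float."""
    match = NUMBER.search(text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return float(match.group())

print(extract_number("49.8 dBmV"))
```

You would pass it the .text value returned by the first find_element_by_xpath call.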

Answer 3

You can replace the extra text like this:

first_text = browser.find_element_by_xpath("/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font").text
second_text = browser.find_element_by_xpath("/html/body/p[1]/table/tbody/tr/td/table[2]/tbody/tr[2]/td[4]/font/small").text
only_first_text = first_text.replace(second_text, '').strip()

Answer 4

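The find_element_by_xpath API in Selenium only supports returning elements. So although XPath lets you write an expression that returns just the text you are looking for, you cannot get it this way with XPath alone.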
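One way around this limitation is to evaluate the text() XPath inside the browser instead of through the element-finding API. The sketch below assumes a Selenium driver object is passed in; xpath_string is a hypothetical helper name, while document.evaluate and XPathResult.STRING_TYPE are standard DOM APIs.

```python
def xpath_string(driver, xpath):
    """Evaluate an XPath in the page and return its string value, stripped."""
    # STRING_TYPE makes the browser return the string value of the first
    # matching node, so a text() expression is allowed here.
    script = (
        "return document.evaluate(arguments[0], document, null, "
        "XPathResult.STRING_TYPE, null).stringValue;"
    )
    return driver.execute_script(script, xpath).strip()
```

Calling xpath_string(browser, ".../font/text()") with the full path from the question should then give the text of the first text node without the unit.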

solution1
2 ACCEPTED 2019-06-20 08:18:51

solution2
1 2019-06-20 07:52:24

solution3
1 2019-06-20 08:15:06

solution4
0 2019-06-20 07:46:28
