
How to scrape HTML text using Selenium in Python

I'm trying to get the text "General (8)" shown in the HTML below using Selenium WebDriver, but I keep running into issues. Any input is highly appreciated. Thanks.

My code:

test1 = driver.find_element_by_xpath("//input[@id = 'General'][@role = 'presentation']").text
print(test1)

It returns an empty string.

HTML:

<li class="" role="checkbox" aria-checked="false">
     <div class="extend_clickable" tabindex="0">
          <input id="General" role="presentation" name="General" checked="checked" type="checkbox">
          General (8)
          <label for="General" role="presentation"></label>
     </div>
</li>

An input node is always empty: it cannot contain any child nodes (including text nodes). What you want is the text node that is a sibling of the input, which you can get from the text content of the parent div:

from selenium.webdriver.common.by import By  # Selenium 4.3+ no longer has find_element_by_xpath
test1 = driver.find_element(By.XPATH, '//div[@class="extend_clickable"]').text.strip()
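
To see why the text belongs to the div rather than to the input, the structure can be checked offline with Python's standard html.parser module. This is only a sketch and does not involve Selenium: the TextOwnerParser class, the HTML constant and the VOID_TAGS set are names made up for illustration, and the markup is copied from the question.

```python
from html.parser import HTMLParser

# Markup copied from the question.
HTML = """<li class="" role="checkbox" aria-checked="false">
     <div class="extend_clickable" tabindex="0">
          <input id="General" role="presentation" name="General" checked="checked" type="checkbox">
          General (8)
          <label for="General" role="presentation"></label>
     </div>
</li>"""

# Elements that can never have children (only the one relevant here).
VOID_TAGS = {"input"}


class TextOwnerParser(HTMLParser):
    """Record which open element each non-blank text node belongs to."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open (non-void) tags
        self.owners = []  # (parent tag, stripped text) pairs

    def handle_starttag(self, tag, attrs):
        # A void element such as <input> is never pushed: it cannot own text.
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            parent = self.stack[-1] if self.stack else None
            self.owners.append((parent, text))


parser = TextOwnerParser()
parser.feed(HTML)
parser.close()
print(parser.owners)
```

The only text node found is attached to the div, which is why reading .text from the input yields nothing while reading it from the div works.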

As per the HTML you have provided, to print the text General (8) you have to extract it from the <div class="extend_clickable"> tag, since the text is not inside the <input> tag. You can read the div's innerHTML and split it with Python's splitlines() method as follows:

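from selenium.webdriver.common.by import By  # Selenium 4.3+ no longer has find_element_by_xpath

all_text = driver.find_element(By.XPATH, "//li[@role='checkbox']/div[@class='extend_clickable']").get_attribute("innerHTML")
myText = all_text.splitlines()  # one list entry per line of the div's inner HTML
print(myText[1])  # the right index depends on how the page's markup is broken into lines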

Console Output :

  General (8)
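
Picking a line by a fixed index depends on how whitespace appears in the page's markup, so a slightly more defensive variant is to keep only the non-empty lines that are not tags. The sketch below is plain Python: the inner_html string is a stand-in, copied from the question's markup, for what get_attribute("innerHTML") would return on the div, and visible_lines is a hypothetical helper name.

```python
# Stand-in for the value returned by get_attribute("innerHTML") on the div.
inner_html = """
          <input id="General" role="presentation" name="General" checked="checked" type="checkbox">
          General (8)
          <label for="General" role="presentation"></label>
     """


def visible_lines(html_fragment):
    """Return stripped lines that are non-empty and do not start with a tag."""
    lines = (line.strip() for line in html_fragment.splitlines())
    return [line for line in lines if line and not line.startswith("<")]


my_text = visible_lines(inner_html)
print(my_text[0])
```

This only helps when the text sits on a line of its own. For anything more complex, reading .text from the parent div is likely the simpler route.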

Update

As per @Andersson's counter question/comment the following screenshot should address and answer all the queries.
