How to get specific text that belongs to div class

 <div class="col_5"> <br> <i class="phone">:: Before </i> 0212 / 897645 <br> <i class="print">:: Before </i> 0212 / 111111 <br> <br> </div>

I am scraping data from a website and writing it to Excel with pandas.

I have the HTML shown above. I want to take the phone number that comes after <i class='phone'> and skip the other one. However, the phone number does not belong to the <i> element itself, so I can only get the numbers through the XPath of <div class='col_5'>. That is not safe, because some divs have no phone number and only a print number, which would corrupt my data. For example, I locate <div class='col_5'> like this:

num = browser.find_element_by_xpath('div[1]/div/div[103]/div[2]')
num.text.split('\n')

and the output is

['02243 / 80343', '02243 / 83261']

 <div class="col_5"> <br> <i class="phone">::Before </i> <br> <i class="print">::Before </i> 0201 / 623424 <br> <br> <a href="mailto:info@someone.com"> <i class="envelope"> </i> E-Mail</a> </div>

Above is the HTML of an entry that has no phone number, only a print number. When I read <div class='col_5'> for this entry, I get only the print number, so it ends up stored as the phone number and the data becomes incorrect. Doing exactly the same as above, the output is

['0201 / 623424', '', 'E-Mail']

So when I take the first item, I get the print number. If there is a phone number, I want to take it; if not, I want to skip it and move on. Is this possible?

To print the text 0212 / 897645, induce WebDriverWait for visibility_of_element_located() and use either of the following locator strategies:

  • Using CSS_SELECTOR, childNodes and strip():

     print(driver.execute_script('return arguments[0].childNodes[5].textContent;', WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.col_5")))).strip())

  • Using XPATH, get_attribute() and splitlines():

     print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[1]/div/div[103]/div[2]"))).get_attribute("innerHTML").splitlines()[4])

  • Note: You have to add the following imports:

     from selenium.webdriver.support.ui import WebDriverWait
     from selenium.webdriver.common.by import By
     from selenium.webdriver.support import expected_conditions as EC
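Both one-liners pick the number by position (childNodes[5], splitlines()[4]), so they do not by themselves tell a missing phone number apart from a print number. One possible way to handle that case, offered as a sketch rather than as part of the answer above, is to take the div's innerHTML and keep only the text that directly follows the <i class="phone"> element. The names phone_from_inner_html and _PHONE_RE are made up for illustration, and the sketch assumes the class attribute is exactly "phone" and that the markup looks like the two samples in the question.

```python
import re
from typing import Optional

# Hypothetical helper pattern: captures the text that directly follows
# <i class="phone">...</i>, up to the next tag.
_PHONE_RE = re.compile(
    r'<i\s+class=["\']phone["\']\s*>.*?</i>([^<]*)',
    re.IGNORECASE | re.DOTALL,
)


def phone_from_inner_html(inner_html: str) -> Optional[str]:
    """Return the number after the phone icon, or None if there is none."""
    match = _PHONE_RE.search(inner_html)
    if match is None:
        return None  # this div has no <i class="phone"> at all
    text = match.group(1).strip()
    return text or None  # icon present, but nothing follows it


# The two samples from the question (contents of <div class="col_5">).
with_phone = (
    '<br> <i class="phone">:: Before </i> 0212 / 897645 <br> '
    '<i class="print">:: Before </i> 0212 / 111111 <br> <br>'
)
without_phone = (
    '<br> <i class="phone">::Before </i> <br> '
    '<i class="print">::Before </i> 0201 / 623424 <br> <br> '
    '<a href="mailto:info@someone.com"> <i class="envelope"> </i> E-Mail</a>'
)

print(phone_from_inner_html(with_phone))     # 0212 / 897645
print(phone_from_inner_html(without_phone))  # None
```

With Selenium, the input string would come from get_attribute("innerHTML") on the div.col_5 element, as in the second strategy. A None result can then be written to the pandas column as a missing value instead of the print number.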

