How to scrape a div only when it contains certain text - Selenium Python

I'm trying to automate LinkedIn and want to scrape a profile URL only if the connection request to that profile is not already in a pending state.

<li class="reusable-search__result-container ">
          
                    
    <div class="entity-result" data-chameleon-result-urn="#">
  
  <div class="entity-result__item">
<div class="entity-result__universal-image">
<div class="display-flex align-items-center">
<!---->        
      <a class="app-aware-link scale-down " aria-hidden="true" href="#">
<div class="ivm-image-view-model   ">
<div class="ivm-view-attr__img-wrapper ivm-view-attr__img-wrapper--use-img-tag display-flex

">
<!---->        
<div class="presence-entity presence-entity--size-3">
<img src="#" loading="lazy" alt="Grigorij Aronov" id="ember35" class="presence-entity__image  ivm-view-attr__img--centered EntityPhoto-circle-3  EntityPhoto-circle-3 lazy-image ember-view">


<div class="presence-entity__indicator

presence-entity__indicator--size-3 presence-indicator
hidden
presence-indicator--size-3">
<span class="visually-hidden">
Status is offline
</span>
</div>
</div>

</div>
</div>
</a>
    
</div>
</div>
<div class="entity-result__content entity-result__divider pt3 pb3 t-12 t-black--light">
<div class="mb1">

<div class="t-roman t-sans">

      
      <div class="display-flex">
<span class="entity-result__title-line entity-result__title-line--2-lines">
<span class="entity-result__title-text t-16">
<a class="app-aware-link" href="#">
  <span dir="ltr"><span aria-hidden="true"><!----><!----></span><span class="visually-hidden"><!----><!----></span></span>
</a>
  <span class="entity-result__badge t-14 t-normal t-black--light">
    <div class="display-flex
flex-row-reverse
align-items-baseline">
<div class="ivm-image-view-model    flex-shrink-zero align-self-center mr2 entity-result__badge-icon ml1">
<div class="ivm-view-attr__img-wrapper ivm-view-attr__img-wrapper--use-img-tag display-flex

">
<li-icon type="linkedin-bug" size="14dp" color="premium" role="img" aria-label="Premium member"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 14 14" data-supported-dps="14x14" fill="currentColor" class="mercado-match" width="14" height="14" focusable="false">
<g>
<path class="background-mercado" d="M14 1v12a1 1 0 01-1 1H1a1 1 0 01-1-1V1a1 1 0 011-1h12a1 1 0 011 1zM4 5H2v7h2zm.25-2A1.27 1.27 0 003 1.8 1.27 1.27 0 001.75 3 1.27 1.27 0 003 4.2 1.27 1.27 0 004.25 3zM12 8.29c0-2.2-.73-3.49-2.86-3.49A2.71 2.71 0 006.89 6V5H5v7h2V8.73A1.74 1.74 0 018.66 6.8C9.82 6.8 10 7.94 10 8.73V12h2z"></path>
</g>
</svg></li-icon>
</div>
</div>
<span class="image-text-lockup__text entity-result__badge-text">
<span aria-hidden="true"><!---->• 2nd<!----></span><span class="visually-hidden"><!---->2nd degree connection<!----></span>
</span>
</div>
  </span>
</span>
</span>
<span aria-hidden="true" class="entity-result__badge-overflow align-self-flex-end t-14 t-normal t-black--light flex-shrink-zero
  ">
<div class="display-flex
flex-row-reverse
align-items-baseline">
<div class="ivm-image-view-model    flex-shrink-zero align-self-center mr2 entity-result__badge-icon ml1">
<div class="ivm-view-attr__img-wrapper ivm-view-attr__img-wrapper--use-img-tag display-flex

">
<li-icon type="linkedin-bug" size="14dp" color="premium" role="img" aria-label="Premium member"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 14 14" data-supported-dps="14x14" fill="currentColor" class="mercado-match" width="14" height="14" focusable="false">
<g>
<path class="background-mercado" d="M14 1v12a1 1 0 01-1 1H1a1 1 0 01-1-1V1a1 1 0 011-1h12a1 1 0 011 1zM4 5H2v7h2zm.25-2A1.27 1.27 0 003 1.8 1.27 1.27 0 001.75 3 1.27 1.27 0 003 4.2 1.27 1.27 0 004.25 3zM12 8.29c0-2.2-.73-3.49-2.86-3.49A2.71 2.71 0 006.89 6V5H5v7h2V8.73A1.74 1.74 0 018.66 6.8C9.82 6.8 10 7.94 10 8.73V12h2z"></path>
</g>
</svg></li-icon>
</div>
</div>
<span class="image-text-lockup__text entity-result__badge-text">
<span aria-hidden="true"><!---->• 2nd<!----></span><span class="visually-hidden"><!---->2nd degree connection<!----></span>
</span>
</div>
</span>
</div>
    
  
</div>

<div class="linked-area flex-1
cursor-pointer">


<div>
<div class="entity-result__primary-subtitle t-14 t-black t-normal">
  <!----><!---->
</div>
  <div class="entity-result__secondary-subtitle t-14 t-normal">
    <!----><!---->
  </div>
</div>




</div>

</div>

<div class="linked-area flex-1
cursor-pointer">



</div>

<div class="entity-result__insights t-12">
  
        
            <div class="entity-result__simple-insight ">

  <div class="ivm-image-view-model    entity-result__simple-insight-image flex-shrink-zero mr2">
<div class="ivm-view-attr__img-wrapper ivm-view-attr__img-wrapper--use-img-tag display-flex

">
<li-icon aria-hidden="true" type="people" size="small"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" data-supported-dps="16x16" fill="currentColor" class="mercado-match" width="16" height="16" focusable="false">
<path d="M14 11.75V15H9v-3.25A1.75 1.75 0 0110.75 10h1.5A1.75 1.75 0 0114 11.75zM11.5 9A2.5 2.5 0 109 6.5 2.5 2.5 0 0011.5 9zM5 1a3 3 0 103 3 3 3 0 00-3-3zm.75 7h-1.5A2.25 2.25 0 002 10.25V15h6v-4.75A2.25 2.25 0 005.75 8z"></path>
</svg></li-icon>
</div>
</div>
<div class="entity-result__simple-insight-text-container">
  <span class="entity-result__simple-insight-text
      entity-result__simple-insight-text--small">

  </span>
<!---->      </div>

</div>

<!---->
<!---->
      
    
</div>
</div>
<div class="entity-result__actions entity-result__divider">
<!---->      
        
<div>
        
<button aria-label="Withdraw invitation sent to Grigorij Aronov" id="ember57" class="artdeco-button artdeco-button--muted artdeco-button--2 artdeco-button--full artdeco-button--secondary ember-view" type="button"><!---->
<span class="artdeco-button__text">
Pending
</span></button>


<!---->

<!---->
<!---->

</div>

    
</div>
</div>


</div>

<!---->                  
  </li>


<li class="reusable-search__result-container ">
          
                    
    <div class="entity-result" data-chameleon-result-urn="#">
  
  <div class="entity-result__item">
<div class="entity-result__universal-image">
<div class="display-flex align-items-center">
<!---->        
      <a class="app-aware-link scale-down " aria-hidden="true" href="#">
<div class="ivm-image-view-model   ">
<div class="ivm-view-attr__img-wrapper ivm-view-attr__img-wrapper--use-img-tag display-flex">
<!---->        
<div class="presence-entity presence-entity--size-3">
<img src="#">


<div class="presence-entity__indicator presence-entity__indicator--size-3 presence-indicator hidden presence-indicator--size-3">
<span class="visually-hidden">
Status is offline
</span>
</div>
</div>

</div>
</div>
</a>
    
</div>
</div>
<div class="entity-result__content entity-result__divider pt3 pb3 t-12 t-black--light">
<div class="mb1">

<div class="t-roman t-sans">

      
      <div class="display-flex">
<span class="entity-result__title-line entity-result__title-line--2-lines">
<span class="entity-result__title-text t-16">

  <span class="entity-result__badge t-14 t-normal t-black--light">
    <div class="display-flex
flex-row-reverse
align-items-baseline">
<!---->    <span class="image-text-lockup__text entity-result__badge-text">
<span aria-hidden="true"><!---->• 2nd<!----></span><span class="visually-hidden"><!---->2nd degree connection<!----></span>
</span>
</div>
  </span>
</span>
</span>
<span aria-hidden="true" class="entity-result__badge-overflow align-self-flex-end t-14 t-normal t-black--light flex-shrink-zero
  ">
<div class="display-flex
flex-row-reverse
align-items-baseline">
<!---->    <span class="image-text-lockup__text entity-result__badge-text">
<span aria-hidden="true"><!---->• 2nd<!----></span><span class="visually-hidden"><!---->2nd degree connection<!----></span>
</span>
</div>
</span>
</div>
    
  
</div>

<div class="linked-area flex-1
cursor-pointer">


<div>
<div class="entity-result__primary-subtitle t-14 t-black t-normal">
  <!---->#<!---->
</div>
  <div class="entity-result__secondary-subtitle t-14 t-normal">
    <!---->#<!---->
  </div>
</div>




</div>

</div>

<div class="linked-area flex-1
cursor-pointer">

  

</div>

<div class="entity-result__insights t-12">
  
        
            <div class="entity-result__simple-insight ">

  <div class="ivm-image-view-model    entity-result__simple-insight-image flex-shrink-zero mr2">
<div class="ivm-view-attr__img-wrapper ivm-view-attr__img-wrapper--use-img-tag display-flex

">
<li-icon aria-hidden="true" type="people" size="small"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 16 16" data-supported-dps="16x16" fill="currentColor" class="mercado-match" width="16" height="16" focusable="false">
<path d="M14 11.75V15H9v-3.25A1.75 1.75 0 0110.75 10h1.5A1.75 1.75 0 0114 11.75zM11.5 9A2.5 2.5 0 109 6.5 2.5 2.5 0 0011.5 9zM5 1a3 3 0 103 3 3 3 0 00-3-3zm.75 7h-1.5A2.25 2.25 0 002 10.25V15h6v-4.75A2.25 2.25 0 005.75 8z"></path>
</svg></li-icon>
</div>
</div>


</div>

<!---->
<!---->
      
    
</div>
</div>
<div class="entity-result__actions entity-result__divider">
<!---->      
        
<div>
        <button aria-label="Invite Lucas Teuchner to connect" id="ember55" class="artdeco-button artdeco-button--2 artdeco-button--secondary ember-view"><!---->
<span class="artdeco-button__text">Connect</span></button>

<!---->
<!---->

</div>

    
</div>
</div>


</div>

<!---->                  
  </li>

There are two blocks of HTML. One contains a button labelled "Pending" and the other a button labelled "Connect". I want to scrape the profile URL from the li that contains the "Connect" button.

Sorry if this question seems basic. I'm not very experienced with Selenium.

Thanks in advance.

You should be able to locate that element with the following (provided the element is not inside an iframe, a shadow root, or some other structure not shown in the question):

profile_link = WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.XPATH, "//span[text() = 'Connect']/ancestor::li//a[@class = 'app-aware-link']")))
profile_url = profile_link.get_attribute("href")  # the URL itself is in the anchor's href

You will also need these imports:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
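
If the search page lists several people, the same idea can be extended to collect every profile URL whose result item shows a "Connect" button. The following is only a sketch under some assumptions: the helper names `links_xpath` and `profile_urls` are made up for illustration, the class names are taken from the HTML in the question and may change on the live site, and the button label is assumed not to contain an apostrophe. It uses `normalize-space(.)` so that labels surrounded by whitespace (as with "Pending" in the question) still match, and `contains(@class, ...)` so that anchors carrying extra classes are not missed.

```python
# By.XPATH is the plain string "xpath"; it is spelled out here so the
# sketch has no imports. In real code, pass By.XPATH instead.
XPATH = "xpath"


def links_xpath(button_label):
    """Build an XPath for profile anchors inside result items whose
    button text equals button_label (hypothetical helper)."""
    return (
        "//li[contains(@class, 'reusable-search__result-container')]"
        f"[.//button[normalize-space(.) = '{button_label}']]"
        "//a[contains(@class, 'app-aware-link')]"
    )


def profile_urls(driver, button_label="Connect"):
    """Return de-duplicated href values, in page order, for every
    matching result item. `driver` is a Selenium WebDriver."""
    urls = []
    for anchor in driver.find_elements(XPATH, links_xpath(button_label)):
        href = anchor.get_attribute("href")
        # Each result item can hold more than one anchor to the same
        # profile (image and title), so skip empty and repeated values.
        if href and href not in urls:
            urls.append(href)
    return urls
```

With the `browser` object from the line above, `profile_urls(browser)` would return the list of URLs for "Connect" items only; items showing "Pending" are filtered out by the XPath predicate. Waiting for the results to load (for example with `WebDriverWait`) is still needed before calling it.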
