
How can I iterate through a list of items and extract a specific part using Selenium and Python?

On the web page https://meshb.nlm.nih.gov/treeView, I want to iterate through each node of the tree and, whenever a node's items contain the word "Cardiovascular...", build a dictionary that lists the top-level node along with all of its cardiovascular-related items. For example, if you expand "Anatomy [A]" on that page, you will see a cardiovascular entry. I want that entry together with everything nested under it when it is expanded. Part of the HTML whose elements I want to iterate through is as follows:

<a class="ng-scope">
   <span class="ng-binding ng-scope">Anatomy [A]</span>
</a>
    <ul class="treeItem ng-scope">
        <li class="ng-scope">
              <a class="ng-scope" href="/record/ui?ui=D001829">
              <span class="ng-binding ng-scope">Body Regions [A01]</span>
              </a>
        </li>
        <li class="ng-scope">
              <a class="ng-scope" href="/record/ui?ui=D001829">
                <span class="ng-binding ng-scope">Cardio Vascular</span>
              </a>
                    <ul class="treeItem ng-scope">
                        <li class="ng-scope">
                           <a class="ng-scope" href="/record/ui?ui=D015824">
                           <span class="ng-binding ng-scope">Blood-Air Barrier [A07.025]</span>
                           </a>
                                 <ul class="treeItem ng-scope">                    
                                   <li class="ng-scope">
                                       <a class="ng-scope" href="/record/ui?ui=D018916">
                                       <span class="ng-binding ng-scope">Blood-Aqueous Barrier [A07.030]</span>                        
                                       </a>
                                    </li>
                                 </ul>
                        </li>
                    </ul>
        </li>
    </ul>

Here is what I have so far in Python. As a first step, I wanted to iterate through the top-level nodes and find the word "cardiovascular", but I keep getting the error "no such element: Unable to locate element". Can someone tell me what I am missing here?

from selenium import webdriver
chrome_path=r"G:\My Drive\A\chrome_driver\chromedriver_win32\chromedriver.exe"
driver=webdriver.Chrome(chrome_path)
driver.get('https://meshb.nlm.nih.gov/treeView')
for links in driver.find_elements_by_css_selector('a.ng-scope'):
    cardio = links.find_element_by_css_selector('li>a>span.ng-binding.ng-scope')        
    print(cardio.text)

There are a few issues in your code. You cannot iterate through the child items until you click the "+" icon on the parent node to expand it.

Your code builds a list of the parent nodes (Anatomy, Organisms, and so on), but it never expands them.

The steps to follow are:

  1. Store the parent nodes in a list => already covered in your code.
  2. Iterate through the parent nodes, clicking the expand icon ("+") on each one => needs to be covered.
  3. Store the child nodes in a list and iterate through them as well => needs to be covered.
  4. Keep iterating until you find the child node "cardiovascular" => needs to be covered.
  5. Click the "+" icon in front of the "cardiovascular" child node and store the elements under it in the dictionary => needs to be covered.

The code below covers steps 1, 2 and 3. Steps 4 and 5 can be handled in the same way.

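from selenium import webdriver

# Path to the local chromedriver binary (same location as in the question).
chrome_path = r"G:\My Drive\A\chrome_driver\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get('https://meshb.nlm.nih.gov/treeView')

# Steps 1-3, written against the Selenium 3 API used in the question. Newer Selenium 4
# releases removed find_element(s)_by_* in favor of find_element(s)(By.XPATH, ...).
for link in driver.find_elements_by_css_selector('a.ng-scope'):
    # Click the "+" icon next to the parent link to expand the node.
    link.find_element_by_xpath("./following-sibling::span/i[1]").click()
    # Print the text of every link inside the expanded child list.
    for sublink in link.find_elements_by_xpath('./following-sibling::ul/li//a'):
        print(sublink.text)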

I come from a Java background, so please forgive any Python syntax mistakes.
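
For steps 4 and 5, one possible approach, once the relevant branch has been expanded, is to parse the rendered HTML offline rather than walking it element by element through Selenium. The sketch below is only an illustration and not part of the code above: it swaps in Python's standard-library `html.parser`, assumes the markup has the `li > a > span` plus nested `ul` shape shown in the question, and the names `TreeParser`, `flatten`, `cardio_items` and `SAMPLE` are hypothetical. In a real run the HTML string would presumably come from `driver.page_source` after the nodes have been expanded.

```python
from html.parser import HTMLParser

# Hypothetical sample, condensed from the HTML fragment in the question.
SAMPLE = """
<a class="ng-scope"><span class="ng-binding ng-scope">Anatomy [A]</span></a>
<ul class="treeItem ng-scope">
  <li class="ng-scope">
    <a class="ng-scope"><span class="ng-binding ng-scope">Body Regions [A01]</span></a>
  </li>
  <li class="ng-scope">
    <a class="ng-scope"><span class="ng-binding ng-scope">Cardio Vascular</span></a>
    <ul class="treeItem ng-scope">
      <li class="ng-scope">
        <a class="ng-scope"><span class="ng-binding ng-scope">Blood-Air Barrier [A07.025]</span></a>
        <ul class="treeItem ng-scope">
          <li class="ng-scope">
            <a class="ng-scope"><span class="ng-binding ng-scope">Blood-Aqueous Barrier [A07.030]</span></a>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>
"""


class TreeParser(HTMLParser):
    """Build nested {"label", "children"} dicts from the <li>/<span> markup."""

    def __init__(self):
        super().__init__()
        self.root = {"label": "", "children": []}  # holds the top-level label
        self.stack = [self.root]                   # innermost open node is last
        self.in_span = False

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            # Every <li> becomes a child of the node that is currently open.
            node = {"label": "", "children": []}
            self.stack[-1]["children"].append(node)
            self.stack.append(node)
        elif tag == "span":
            self.in_span = True

    def handle_endtag(self, tag):
        if tag == "li" and len(self.stack) > 1:
            self.stack.pop()
        elif tag == "span":
            self.in_span = False
            self.stack[-1]["label"] = self.stack[-1]["label"].strip()

    def handle_data(self, data):
        # Only text inside a <span> is treated as a node label.
        if self.in_span:
            self.stack[-1]["label"] += data


def flatten(node):
    """Return the labels of all descendants of node, depth first."""
    labels = []
    for child in node["children"]:
        labels.append(child["label"])
        labels.extend(flatten(child))
    return labels


def cardio_items(html, keyword="cardio"):
    """Map the top-level label to each matching child and its descendants."""
    parser = TreeParser()
    parser.feed(html)
    parser.close()
    top = parser.root
    matches = {
        child["label"]: flatten(child)
        for child in top["children"]
        if keyword.lower() in child["label"].lower()
    }
    return {top["label"]: matches} if matches else {}


if __name__ == "__main__":
    print(cardio_items(SAMPLE))
```

The match is a case-insensitive substring test, so the default keyword "cardio" picks up both "Cardio Vascular" as written in the question and a label such as "Cardiovascular System". Running the parser once per expanded top-level node and merging the returned dictionaries would give one entry per parent.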
