
Scrape a specific div with Selenium in Python

I need to scrape the values 1.30, 2.30, 1.31 and 2.31 from the HTML below, using Python and Selenium. They sit in <div class="odds ng-star-inserted"> 1.30 </div>, <div class="odds ng-star-inserted"> 2.30 </div>, <div class="odds ng-star-inserted"> 1.31 </div> and <div class="odds ng-star-inserted"> 2.31 </div>. My code returns only 1.30 and 2.30 for every row.

The result should be:

Netherlands\\nSouth Korea 1.30\\n2.30 Germany\\nJapan 1.31\\n2.31

But I get:

Netherlands\\nSouth Korea 1.30\\n2.30 Germany\\nJapan 1.30\\n2.30

Here is the Python code:

teams = []
btts = []
odds_events = []

box = driver.find_element(By.XPATH, '//*[@id="page"]/div[2]')
#Looking for 'sports titles'
sport_title = box.find_element(By.CLASS_NAME, 'sport-name')

parent = sport_title.find_element(By.XPATH, './..')
grandparent = parent.find_element(By.XPATH, './..').find_element(By.XPATH, './..').find_element(By.XPATH, './..')

single_row_events = grandparent.find_elements(By.CLASS_NAME, 'event')

for match in single_row_events:
    odds_event = match.find_elements(By.CLASS_NAME, 'games')
    odds_events.append(odds_event)
    # Scrape teams
    for team in match.find_elements(By.CLASS_NAME, 'rivals'):
        teams.append(team.text)
        
for odds_event in odds_events:
    for n, box in enumerate(odds_event):
        rows = box.find_elements(By.XPATH, '//div[@class="game g2 ng-star-inserted"]')
        if n == 0:
            btts.append(rows[0].text)

If I set rows = box.find_elements(By.XPATH, './/*') and use if n == 2:, I get this error:

ValueError: All arrays must be of the same length

If I use if n == 0: instead, I get a correct result, but for <div class="game g3 ng-star-inserted">, which I don't need. In that case the result is:

Netherlands\\nSouth Korea 1.10\\n2.10\\n3.10 Germany\\nJapan 1.11\\n2.11\\n3.11

Here is the HTML code:

  <div id="events">
    <game-filter class="ng-star-inserted">
      <div id="sport-legend" class="single">
        <div class="sport-name"> Football </div>
        <div class="games g3">
          <div class="game ng-star-inserted">
            <div class="game-name"> KI </div>
            <div class="selections s3 ng-star-inserted">
              <div class="selection ng-star-inserted"> Home </div>
              <div class="selection ng-star-inserted"> Away </div>
            </div>
          </div>
          <div class="game ng-star-inserted">
            <div class="game-name"> UG </div>
            <div class="selections s3 ng-star-inserted">
              <div class="selection ng-star-inserted"> Over </div>
              <div class="selection ng-star-inserted"> O/U </div>
              <div class="selection ng-star-inserted"> Under </div>
            </div>
          </div>
          <div class="game ng-star-inserted">
            <div class="game-name"> BTTS </div>
            <div class="selections s2 ng-star-inserted">
              <div class="selection ng-star-inserted"> GG </div>
              <div class="selection ng-star-inserted"> NG </div>
            </div>
          </div>
        </div>
      </div>
    </game-filter>
    <standard-item-info class="event ng-star-inserted">
      <div class="details">
        <div class="info">
          <div class="time">01:01</div>
          <div class="date">01.01.</div>
        </div>
        <div class="rivals">
          <div class="league">
            <!---->
            <span class="time-special ng-star-inserted">VIRT 10'
            </span> EL
          </div>
          <div class="home"> Netherlands </div>
          <div class="away"> South Korea </div>
        </div>
      </div>
      <standard-item-games class="games g3 ng-star-inserted">
        <div class="game g3 ng-star-inserted">
          <div class="ng-star-inserted">
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 1.10 </div>
            </standard-item-game>
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 2.10 </div>
            </standard-item-game>
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 3.10 </div>
            </standard-item-game>
          </div>
        </div>
        <div class="game g2 g3 ng-star-inserted">
          <div class="ng-star-inserted">
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 1.20 </div>
            </standard-item-game>
            <div class="odds limit ng-star-inserted"> 2.20 </div>
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 3.20 </div>
            </standard-item-game>
          </div>
        </div>
        <div class="game g2 ng-star-inserted">
          <div class="ng-star-inserted">
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 1.30 </div>
            </standard-item-game>
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 2.30 </div>
            </standard-item-game>
          </div>
        </div>
      </standard-item-games>
      <div class="show-all-expand ng-star-inserted">
        <div class="event-expand">
          <div class="icon"></div>
        </div>
      </div>
    </standard-item-info>
    <standard-item-info class="event ng-star-inserted">
      <div class="details">
        <div class="info">
          <div class="time">01:01</div>
          <div class="date">01.01.</div>
        </div>
        <div class="rivals">
          <div class="league">
            <!---->
            <span class="time-special ng-star-inserted">VIRT 10'
            </span> EL
          </div>
          <div class="home"> Germany </div>
          <div class="away"> Japan </div>
        </div>
      </div>
      <standard-item-games class="games g3 ng-star-inserted">
        <div class="game g3 ng-star-inserted">
          <div class="ng-star-inserted">
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 1.11 </div>
            </standard-item-game>
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 2.11 </div>
            </standard-item-game>
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 3.11 </div>
            </standard-item-game>
          </div>
        </div>
        <div class="game g2 g3 ng-star-inserted">
          <div class="ng-star-inserted">
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 1.21 </div>
            </standard-item-game>
            <div class="odds limit ng-star-inserted"> 2.21 </div>
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 3.21 </div>
            </standard-item-game>
          </div>
        </div>
        <div class="game g2 ng-star-inserted">
          <div class="ng-star-inserted">
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 1.31 </div>
            </standard-item-game>
            <standard-item-game class="ng-star-inserted">
              <div class="odds ng-star-inserted"> 2.31 </div>
            </standard-item-game>
          </div>
        </div>
      </standard-item-games>
      <div class="show-all-expand ng-star-inserted">
        <div class="event-expand">
          <div class="icon"></div>
        </div>
      </div>
    </standard-item-info>
  </div>
</div>

One solution:

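# find_elements_by_xpath was removed in Selenium 4.3; use find_elements(By.XPATH, ...)
from selenium.webdriver.common.by import By

# All home/away team names, in document order (home, away, home, away, ...)
teamsdiv = driver.find_elements(By.XPATH, "//div[@id='events']//div[@class='home' or @class='away']")
# One <standard-item-games> block per event
notesdiv = driver.find_elements(By.XPATH, "//div[@id='events']//standard-item-games")

# Pair each home team with the away team that follows it
teams = []
for i in range(0, len(teamsdiv), 2):
    teams.append([teamsdiv[i].text, teamsdiv[i+1].text])

# The last two lines of each block's text are the odds of the last game (g2)
notes = []
for i in range(len(notesdiv)):
    notes.append(notesdiv[i].text.split('\n')[-2:])

for i in range(len(notes)):
    print(teams[i], notes[i])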

Result:

['Netherlands', 'South Korea'] ['1.30', '2.30']
['Germany', 'Japan'] ['1.31', '2.31']
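
As for why the code in the question repeats 1.30 and 2.30: a likely cause is the XPath '//div[@class="game g2 ng-star-inserted"]'. An XPath that starts with // is evaluated from the document root, even when find_elements is called on an element, so rows[0] is the first matching block on the page for every event. Starting the expression with .// keeps the search inside the current element. The sketch below shows the difference without a browser. It is an illustration only: it uses the standard library's xml.etree.ElementTree instead of Selenium, and PAGE is a trimmed, simplified copy of the HTML from the question (the standard-item-game wrappers and the other markets are left out). The names PAGE, BTTS and odds_text are made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed, simplified copy of the question's HTML (assumption: only the
# parts needed to show the scoping difference are kept).
PAGE = """
<div id="events">
  <standard-item-info class="event ng-star-inserted">
    <standard-item-games class="games g3 ng-star-inserted">
      <div class="game g3 ng-star-inserted">
        <div class="odds ng-star-inserted"> 1.10 </div>
        <div class="odds ng-star-inserted"> 2.10 </div>
        <div class="odds ng-star-inserted"> 3.10 </div>
      </div>
      <div class="game g2 ng-star-inserted">
        <div class="odds ng-star-inserted"> 1.30 </div>
        <div class="odds ng-star-inserted"> 2.30 </div>
      </div>
    </standard-item-games>
  </standard-item-info>
  <standard-item-info class="event ng-star-inserted">
    <standard-item-games class="games g3 ng-star-inserted">
      <div class="game g3 ng-star-inserted">
        <div class="odds ng-star-inserted"> 1.11 </div>
        <div class="odds ng-star-inserted"> 2.11 </div>
        <div class="odds ng-star-inserted"> 3.11 </div>
      </div>
      <div class="game g2 ng-star-inserted">
        <div class="odds ng-star-inserted"> 1.31 </div>
        <div class="odds ng-star-inserted"> 2.31 </div>
      </div>
    </standard-item-games>
  </standard-item-info>
</div>
"""

# Exact class match, as in the question's XPath.
BTTS = ".//div[@class='game g2 ng-star-inserted']"


def odds_text(game):
    """Return the stripped text of every odds div inside one game block."""
    odds = game.findall(".//div[@class='odds ng-star-inserted']")
    return [d.text.strip() for d in odds]


root = ET.fromstring(PAGE)
events = root.findall(".//standard-item-info")

# Searching from the document root: index 0 is always the first event's
# block, which mirrors what a leading // does in Selenium.
unscoped = [odds_text(root.findall(BTTS)[0]) for _ in events]

# Searching from each event element: every event yields its own block,
# which mirrors what a leading .// does in Selenium.
scoped = [odds_text(event.find(BTTS)) for event in events]

print(unscoped)
print(scoped)
```

In the Selenium code from the question, the corresponding change would be to call match.find_elements(By.XPATH, './/div[@class="game g2 ng-star-inserted"]') on each event. Selecting the block by class also avoids relying on the wanted odds being the last two lines of the element text, which is what the split('\n')[-2:] approach above assumes.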
