
Beautiful Soup variable span class

Wondering if you could help with some web scraping I'm trying to do.

Below is the span that I want to get the data from. The problem is that its class contains a number that is different for each data point.

I know that the "price-val" part is the same in every case, but I can't work out how to search on just that part when getting the data.

    <span class="price-val_196775436 odd-val ib right">
        2.47
    </span>

My code so far:

    import requests
    from bs4 import BeautifulSoup

    url = "http://www.sportsbet.com.au/betting/american-football"
    r = requests.get(url)
    soup = BeautifulSoup(r.content)
    g_data = soup.find_all("div", {"class": "accordion-body"})

    for item in g_data:
        A = item.find('span', {'class': 'team-name ib'}).text
        B = item.find('span', {'class': 'price-val_196775436 odd-val ib right'}).text

The error I get:

Traceback (most recent call last):
  File "C:\Users\James\Desktop\NFLsportsbet.py", line 23, in <module>
    B = item.find('span', {'class': 'price-val'}).text
AttributeError: 'NoneType' object has no attribute 'text'

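Specify a parser library explicitly, e.g. lxml. Beautiful Soup treats class as a multi-valued attribute and tests the filter against each class name separately, so a plain string such as 'price-val' only matches a class that is exactly 'price-val'. To match on the prefix, pass a function (lambda) or a compiled regex as the class filter: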

    import re

    import requests
    from bs4 import BeautifulSoup

    url = "http://www.sportsbet.com.au/betting/american-football"
    r = requests.get(url)
    soup = BeautifulSoup(r.content, 'lxml')
    g_data = soup.find_all("div", {"class": "accordion-body"})

    for item in g_data:
        print(item.find('span', {'class': 'team-name ib'}).text)
        # Match any class name that starts with 'price-val_'
        print(item.find('span', {'class': lambda L: L and L.startswith('price-val_')}).text)
        # Or the same with a regex:
        # print(item.find('span', {'class': re.compile(r'^price-val_')}).text)

It prints:

Detroit Lions

2.47

Tampa Bay Buccaneers

3.85

Arizona Cardinals

1.39

San Diego Chargers

2.65

San Francisco 49ers

3.95

New York Giants

2.40

Cincinnati Bengals

1.97

Tennessee Titans

2.61

Minnesota Vikings

1.90

New York Jets

1.66

Seattle Seahawks

1.46

Green Bay Packers

1.68

Indianapolis Colts

3.22
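
To see why the function and regex filters match while the plain string 'price-val' returned None, here is a small offline sketch. The HTML is a hypothetical snippet modelled on the span in the question (nothing is fetched from the site), and it uses the stdlib html.parser so it runs without lxml. Treat it as an illustration, not as the site's real markup.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup modelled on the span in the question; not fetched from the site.
HTML = """
<div class="accordion-body">
  <span class="team-name ib">Detroit Lions</span>
  <span class="price-val_196775436 odd-val ib right">
    2.47
  </span>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")  # stdlib parser, so lxml is not needed here

# A plain string has to equal one whole class name (or the full attribute string).
# "price-val" is neither, so find() returns None and .text would raise AttributeError.
exact = soup.find("span", {"class": "price-val"})

# A function is called with one class name at a time (or None if the tag has no class).
by_function = soup.find("span", {"class": lambda c: c and c.startswith("price-val_")})

# A compiled regex is also tested against each class name.
by_regex = soup.find("span", {"class": re.compile(r"^price-val_\d+$")})

print(exact)
print(by_function.get_text(strip=True))
print(by_regex.get_text(strip=True))
```

The `c and ...` guard matters because the function can be called with None for tags that have no class attribute at all.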

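The blank lines between the values above probably come from whitespace inside each span (the price span in the question wraps its text in newlines). A possible follow-up, sketched under assumptions: `extract_odds` and `SAMPLE` are hypothetical names, the markup is invented for illustration, and like the loop in the answer it only takes the first team and first price in each block. It trims the text with `get_text(strip=True)` and skips blocks where either span is missing, which avoids the AttributeError from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the second block has no price span.
SAMPLE = """
<div class="accordion-body">
  <span class="team-name ib">Detroit Lions</span>
  <span class="price-val_196775436 odd-val ib right">
    2.47
  </span>
</div>
<div class="accordion-body">
  <span class="team-name ib">Tampa Bay Buccaneers</span>
</div>
"""


def extract_odds(html):
    """Return (team, price) pairs; hypothetical helper, not from the original answer."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for block in soup.find_all("div", class_="accordion-body"):
        team = block.find("span", class_="team-name")
        price = block.find("span", class_=lambda c: c and c.startswith("price-val_"))
        if team is None or price is None:
            continue  # skip incomplete blocks instead of failing on None.text
        pairs.append((team.get_text(strip=True), float(price.get_text(strip=True))))
    return pairs


print(extract_odds(SAMPLE))
```

Against the live page you would pass `r.content` instead of `SAMPLE`, assuming the markup still looks like the snippet in the question.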
