Python 3 Extracting span tag using bs4

Question

I have the span tag for a page

<span itemprop="name">
            DeWalt DCD778D2T-GB  18V 2.0Ah Li-Ion XR Brushless Cordless Combi Drill
        </span>

How would I extract the text inside the span tag? I've tried a few find methods but get a 'NoneType' object error.

Below is the code I've tried. Where am I going wrong?

import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.screwfix.com/p/dewalt-dcd778d2t-gb-18v-2-0ah-li-ion-xr-brushless-cordless-combi-drill/268fx')
c = r.content
soup = BeautifulSoup(c, "html.parser")
ToolName1 = soup.find("span", {"itemprop": "name"}).text

My error is

AttributeError: 'NoneType' object has no attribute 'text'

Answer 1

Your request actually comes back with r.status_code 403 (Forbidden), so the parsed soup is empty and soup.find("span", {"itemprop": "name"}) returns None. Your code then evaluates None.text, which is why you get AttributeError: 'NoneType' object has no attribute 'text'.

You need to send request headers for this URL; a User-Agent header alone may be enough.
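To see the failure mode without hitting the site, here is a minimal sketch that parses literal HTML strings. The helper name extract_name is made up for illustration and is not part of bs4; it just checks the result of find for None before reading .text.

```python
from bs4 import BeautifulSoup


def extract_name(html):
    """Return the stripped text of the first <span itemprop="name">, or None.

    extract_name is a hypothetical helper, not a bs4 function.
    """
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("span", {"itemprop": "name"})
    if tag is None:
        # find() returns None when nothing matches, e.g. for an empty body
        return None
    return tag.text.strip()


page = '<span itemprop="name">\n    DeWalt Combi Drill\n</span>'
print(extract_name(page))  # text of the span, whitespace stripped
print(extract_name(""))    # empty document: no span, so None
```

With a guard like this, a blocked request shows up as a None result you can handle, rather than as an AttributeError.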

import requests
from bs4 import BeautifulSoup

url = ('https://www.screwfix.com/p/dewalt-dcd778d2t-gb-18v-2-0ah-li-ion-xr-'
       'brushless-cordless-combi-drill/268fx')

headers = {'User-Agent': ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWeb'
                          'Kit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.14'
                          '9 Safari/537.36')}

r = requests.get(url, headers=headers)
if r.status_code == 200:
    c = r.content
    soup = BeautifulSoup(c,"html.parser")
    ToolName1 = soup.find("span", {"itemprop" : "name"}).text
    print(ToolName1.strip())

This prints:

DeWalt DCD778D2T-GB  18V 2.0Ah Li-Ion XR Brushless Cordless Combi Drill

A status_code of 200 is the usual success case, but other 2xx codes (for example 201 or 204) also mean success.
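If you want to accept any success code rather than only 200, a range check is one option. The sketch below uses only the standard library; is_success is a made-up helper name. With requests itself you could instead call r.raise_for_status(), which raises on 4xx and 5xx responses.

```python
from http import HTTPStatus


def is_success(status_code):
    """Return True for any 2xx status code (hypothetical helper)."""
    return 200 <= status_code < 300


# HTTPStatus members are integers, so they can be compared directly.
for code in (HTTPStatus.OK, HTTPStatus.NO_CONTENT, HTTPStatus.FORBIDDEN):
    print(int(code), code.phrase, is_success(code))
```

In the answer's code this would replace the test r.status_code == 200 with is_success(r.status_code).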
