Scraping certain attribute - Beautiful Soup Python

Question

I need help scraping the words "CPCAdvertising.com" from within the highlighted span tags (see attached screenshot of the HTML). I am unsure how to iterate through the tags properly. Here is what I have so far:

import requests
from bs4 import BeautifulSoup
page_number = 1
flippa_page = requests.get('https://www.flippa.com/search?filter[property_type]=domain&filter[status]=won&filter[sale_method]=auction&page[number]={}&page[size]=250'.format(page_number))
price_list = []
domain_list = []
for i in range(120):
    src = flippa_page.content
    soup = BeautifulSoup(src, 'lxml')
    for span_tag in soup.find_all('span'):
        domain_list.append(span_tag.attrs['class'])
    page_number += 1

HTML Screenshot

Answer 1

Since your URL doesn't work for me, I'm using a different one from the same website. Regardless, you can specify the class in the find_all() call like so:

import requests
from bs4 import BeautifulSoup

flippa_page = requests.get('https://flippa.com/10339489-e-commerce-sports-and-outdoor')
src = flippa_page.content
soup = BeautifulSoup(src, 'lxml')

for s in soup.find_all('span', {'class': 'ListingList-itemPrice'}):
    # Print out the text within the tag
    print(s.text.strip())
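
Applied to the span from the question, the same idea might look like the sketch below. It assumes the class name Basic___propertyName (taken from the span output quoted elsewhere in this thread) and parses a hard-coded sample string instead of the live page, so the markup is an illustration rather than Flippa's actual HTML.

```python
from bs4 import BeautifulSoup

# Sample markup built from the spans quoted in this thread; a real run
# would pass requests.get(url).content here instead (assumption).
html = """
<div>
  <span class="Basic___propertyName">CPCAdvertising.com</span>
  <span class="Basic___title">CPCAdvertising.com - One Dollar Reserve !!</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ is Beautiful Soup's keyword argument for matching the CSS class.
domain_list = [
    span.get_text(strip=True)
    for span in soup.find_all("span", class_="Basic___propertyName")
]

print(domain_list)
```

Note also that in the question's loop the request is made once, before the loop starts, so every iteration parses the same response. Moving the requests.get(...) call inside the loop and formatting the URL with the current page_number would fetch a new page each time.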

Answer 2

The words should be in span_tag.string.
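
One caveat, shown as a small sketch: .string only returns text when the tag has a single child, and gives None otherwise, whereas get_text() joins all nested text. The second span below is a made-up example to show the difference and does not come from the page in the question.

```python
from bs4 import BeautifulSoup

# First span mirrors the one in the question; the second is hypothetical
# markup with a nested tag, added only to show where .string gives None.
html = (
    "<span>CPCAdvertising.com</span>"
    "<span>CPC<b>Advertising</b>.com</span>"
)

soup = BeautifulSoup(html, "html.parser")
simple, nested = soup.find_all("span")

simple_string = simple.string    # text of the single child
nested_string = nested.string    # None, because the tag has several children
nested_text = nested.get_text()  # concatenates all nested text

print(simple_string, nested_string, nested_text)
```

By contrast, span_tag.attrs['class'] in the question returns the list of class names rather than the text, and raises KeyError for any span that has no class attribute.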

Answer 3

Using html.parser instead of lxml, I could find the span tags:

for item in soup.find_all('span'):
    # Keep only the spans whose contents mention the domain
    if 'CPCAdvertising.com' in str(item.contents):
        print(item)

<span class="Basic___propertyName">CPCAdvertising.com</span>
<span class="Basic___title">CPCAdvertising.com - One Dollar Reserve !!</span>

I wasn't able to parse the page with lxml for some reason. If you can tell me which lxml library you're using, I can check with it.
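
For reference, the substring check above can also be run against each tag's text rather than str(item.contents). This is only a sketch on a hard-coded sample: the first two spans are the ones printed above, and the third is a hypothetical non-matching span added for contrast.

```python
from bs4 import BeautifulSoup

# The first two spans are copied from the output above; the third is a
# hypothetical extra span that should not match.
html = """
<span class="Basic___propertyName">CPCAdvertising.com</span>
<span class="Basic___title">CPCAdvertising.com - One Dollar Reserve !!</span>
<span>Unrelated text</span>
"""

soup = BeautifulSoup(html, "html.parser")
target = "CPCAdvertising.com"

# Keep every span whose visible text contains the target string.
matches = [span for span in soup.find_all("span") if target in span.get_text()]

for span in matches:
    print(span)
```

html.parser ships with Python, so this variant does not depend on lxml being installed.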

3 answers

solution1
0 2019-11-12 21:51:25

solution2
0 2019-11-12 21:53:03

solution3
0 ACCEPTED 2019-11-12 22:02:58
