
Python scraping urls category

Hello, I'm just starting out in programming and I'm having some problems with scraping. I'm trying to get all the links in a category list that contains several names, but I can't isolate the cells because many of them have the same name. Could someone help me? Below are my code and the URL I want to scrape.

from bs4 import BeautifulSoup
import requests

url = 'http://books.toscrape.com/index.html'
reqs = requests.get(url)
if reqs.ok:
    soup = BeautifulSoup(reqs.text, 'html.parser')
    ul = soup.find('ul', {'class': 'nav nav-list'})
    for cells in ul:
        a = cells.find('a')
        link = a['href']
        #print(link)
        [print(str(lis) + '\n\n') for lis in link]


I need to retrieve all the URLs inside the (li) elements.


I think this is what you are after. I have commented the code to explain what I've done and why. You could obviously write this in fewer lines, but this way it is a little easier to follow.

from bs4 import BeautifulSoup
import requests

url = 'http://books.toscrape.com/index.html'
reqs = requests.get(url)
if reqs.ok:
    soup = BeautifulSoup(reqs.text, 'html.parser')

    # use multiple class selector list
    sidebar = soup.find('ul', {"class": ["nav", "nav-list"]})

    # find all the list item tags within the ul
    li = sidebar.find_all('li')

    for item in li:
        # iterate through results to find a link
        link = item.find('a', href=True)
        # if there is a link print it
        if link is not None:
            print(link['href'])

Alternatively, this can be done by going straight to the links. Note that this grabs every link on the page, not just the ones in the category sidebar:

from bs4 import BeautifulSoup
import requests

url = 'http://books.toscrape.com/index.html'
reqs = requests.get(url)
if reqs.ok:
    soup = BeautifulSoup(reqs.text, 'html.parser')
    links = soup.find_all('a')
    for link in links:
        print(link['href'])
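If you only want the sidebar links, a CSS selector is a compact middle ground between the two answers above. This is a sketch that runs against a small inline HTML snippet standing in for the sidebar markup on books.toscrape.com (the snippet and its hrefs are illustrative, not copied from the live page), so it works without a network request; with a real page you would pass `reqs.text` to `BeautifulSoup` instead:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the 'nav nav-list' sidebar on books.toscrape.com
html = """
<ul class="nav nav-list">
  <li><a href="catalogue/category/books_1/index.html">Books</a>
    <ul>
      <li><a href="catalogue/category/books/travel_2/index.html">Travel</a></li>
      <li><a href="catalogue/category/books/mystery_3/index.html">Mystery</a></li>
    </ul>
  </li>
</ul>
"""

soup = BeautifulSoup(html, 'html.parser')

# One CSS selector: every <a> inside an <li> inside the sidebar <ul>,
# so links elsewhere on the page are never matched
links = [a['href'] for a in soup.select('ul.nav-list li a')]
print(links)
```

`select()` returns the matching tags in document order, so this replaces the `find`/`find_all` loop and the `None` check in one expression.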
