Python HTML parsing issues using Beautiful Soup

Question

I am trying to get the names of all organizations from https://www.devex.com/organizations/search using BeautifulSoup. However, I am getting an error. Can someone please help?

import requests
from requests import get
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np

from time import sleep
from random import randint

headers = {"Accept-Language": "en-US,en;q=0.5"}

titles = []
pages = np.arange(1, 2, 1)

for page in pages:
    page = requests.get("https://www.devex.com/organizations/search?page%5Bnumber%5D=" + str(page) + "", headers=headers)
    soup = BeautifulSoup(page.text, 'html.parser')
    movie_div = soup.find_all('div', class_='info-container')
    sleep(randint(2,10))

    for container in movie_div:
        name = container.a.find('h3', class_= 'ng-binding').text
        titles.append(name)

movies = pd.DataFrame({'movie': titles})

# to see your dataframe
print(movies)

# to see the datatypes of your columns
print(movies.dtypes)

# to see where you're missing data and how much data is missing
print(movies.isnull().sum())

# to move all your scraped data to a CSV file
movies.to_csv('movies.csv')

Answer 1

You could try something like the following, where bs is presumably the parsed BeautifulSoup object (soup in the question's code):

name = bs.find("h3", {"class": "ng-binding"})
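
The question does not say which error was raised, so the following is only a sketch of one defensive way to apply the find() call above. It assumes the failure comes from find() (or container.a) returning None, which raises an AttributeError as soon as .text is accessed. The SAMPLE_HTML string and the extract_names helper are hypothetical and stand in for the real page, whose markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for one page of results.
# The structure of the real page may differ.
SAMPLE_HTML = """
<div class="info-container"><a href="/organizations/a"><h3 class="ng-binding">Org A</h3></a></div>
<div class="info-container"><a href="/organizations/b"></a></div>
<div class="info-container"><span>no link here</span></div>
"""


def extract_names(html):
    """Return the text of each h3.ng-binding found inside an info-container div."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    for container in soup.find_all("div", class_="info-container"):
        # find() returns None when nothing matches, so check before reading the text.
        heading = container.find("h3", {"class": "ng-binding"})
        if heading is not None:
            names.append(heading.get_text(strip=True))
    return names


names = extract_names(SAMPLE_HTML)
print(names)  # ['Org A']
```

If this returns an empty list for the live page, the headings may not be present in the downloaded HTML at all. The ng-binding class is typically added by AngularJS in the browser, so content rendered that way may be missing from what requests receives.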

1 answer

solution1
0 2021-01-23 13:11:16