
How to fetch data from all pages on Eventbrite

I am trying to fetch the data from all pages of an Eventbrite listing, but I am only able to fetch one page. When I use findAll, I get an error. This code works fine for one page but not for all pages. Here is my code:

from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
import csv
import json

driver = webdriver.Chrome("chromedriver/chromedriver.exe")

driver.get("https://www.eventbrite.com/d/malaysia--kuala-lumpur--85675181/all-events/?page=1")
content = driver.page_source

soup = BeautifulSoup(content, 'html.parser')
b= json.loads("".join(soup.find("script", {"type":"application/ld+json"}).contents))

with open("data.csv", "w",  encoding='utf-8') as file:
    csv_file = csv.writer(file)
    csv_file.writerow( ["Date", "Name", "Price", "Location"] )

    for item in b:
        csv_file.writerow([item['startDate'], item['name'], item['offers']['highPrice'], item['location']['name']])

Try this. It loops over page numbers 1 to 19 and writes every page's events to the same CSV file:

from selenium import webdriver
from bs4 import BeautifulSoup
import csv
import json

driver = webdriver.Chrome("C:/Users/ARPITA CHOPRA/Downloads/chromedriver/chromedriver.exe")

# newline='' stops the csv module from writing blank rows on Windows
with open("data.csv", "w", newline='', encoding='utf-8') as file:
    csv_file = csv.writer(file)
    csv_file.writerow(["Date", "Name", "Price", "Location"])
    # Load pages 1 to 19 by changing the page= query parameter
    for x in range(1, 20):
        driver.get("https://www.eventbrite.com/d/malaysia--kuala-lumpur--85675181/all-events/?page=" + str(x))
        content = driver.page_source

        soup = BeautifulSoup(content, 'html.parser')
        # The event list is embedded as JSON-LD inside a <script> tag
        b = json.loads("".join(soup.find("script", {"type": "application/ld+json"}).contents))

        for item in b:
            csv_file.writerow([item['startDate'], item['name'], item['offers']['highPrice'], item['location']['name']])

# Close the browser once all pages have been written
driver.quit()
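The row-building expression above indexes item['offers']['highPrice'] and item['location']['name'] directly, so a single event that lacks one of those keys would raise a KeyError and stop the whole run. A minimal sketch of a more forgiving version follows. The helper names event_to_row and _as_dict are hypothetical (they are not part of any library), and falling back to an empty string is an assumption about what you want in the CSV.

```python
def _as_dict(value):
    """Return value if it is a dict, otherwise an empty dict."""
    return value if isinstance(value, dict) else {}


def event_to_row(item):
    """Flatten one JSON-LD event dict into [date, name, price, location].

    Hypothetical helper: missing or malformed fields become empty strings
    instead of raising KeyError or AttributeError.
    """
    offers = _as_dict(item.get("offers"))
    location = _as_dict(item.get("location"))
    return [
        item.get("startDate", ""),
        item.get("name", ""),
        offers.get("highPrice", ""),
        location.get("name", ""),
    ]
```

With this in place, the inner loop could call csv_file.writerow(event_to_row(item)) instead of building the list inline.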

To get data from all pages, increment the page= parameter in the URL until a page returns no events.

For example:

import json
import requests
from bs4 import BeautifulSoup


url = "https://www.eventbrite.com/d/malaysia--kuala-lumpur--85675181/all-events/"

page = 1
while True:
    print('Page {}...'.format(page))
    soup = BeautifulSoup(requests.get(url, params={'page': page}).content, 'html.parser')
    b = json.loads("".join(soup.find("script", {"type":"application/ld+json"}).contents))

    if not b:
        break

    for item in b:
        print([item['startDate'], item['name'], item['offers']['highPrice'], item['location']['name']])

    page += 1

Prints:

Page 1...
['2021-03-27', 'Traders Fair 2021 - Malaysia (Financial Education Event)', '0.00', 'InterContinental Kuala Lumpur']
['2020-07-22', 'Malaysian International Food & Beverage (MIFB) Trade Fair', '0.00', 'Kuala Lumpur Convention Centre']
['2020-09-26', 'Post Graduate Education Fair 2020 - Mid Valley KL', '0.00', 'Mid Valley Exhibition Centre']
['2020-08-13', 'THE FIT Malaysia', '0.00', 'Kuala Lumpur Convention Centre']
['2020-09-26', 'Mega Career Fair & Post Graduate Education Fair 2020 - Mid Valley KL', '0.00', 'Mid Valley Exhibition Centre, Kuala Lumpur']
['2020-07-21', 'Entrepreneurship for Beginners - Startup | Entrepreneur Hackathon Webinar', '0.00', 'Kuala Lumpur']
['2020-11-26', 'Branding Strategies For Startups', '0.00', 'Found8 KL Sentral']
['2020-07-22', 'MyFoodTech', '0.00', 'Kuala Lumpur Convention Centre']
['2021-09-01', 'Wiki Finance EXPO Kuala Lumpur 2021', '0.00', '吉隆坡希尔顿逸林酒店']
['2020-07-23', 'How To Improve Your Focus and Limit Distractions - Kuala Lumpur', '0.00', 'ONLINE']
['2020-08-14', 'Kuala Lumpu Video Speed Dating - Filter Off', '0.00', 'Online Dating - Filter Off']
['2021-01-16', "Joey Yap's Feng Shui & Astrology 2021 (Kuala Lumpur) - Cantonese Session", '0.00', 'Kuala Lumpur']
['2020-07-21', 'How To Improve Your Memory - Kuala Lumpur', '0.00', '(ONLINE EVENT)']
['2020-09-24', 'Maximizing Social Impact for Startups and SMEs', '0.00', 'Found8 KL Sentral']
['2021-01-17', "Joey Yap's Feng Shui & Astrology 2021 (Kuala Lumpur) - English Session", '0.00', 'Kuala Lumpur']
['2020-07-17', 'Building Leadership Influence (Online - Run 4)', '0.00', 'Menara Keck Seng']
['2020-08-08', '2020 Entrepreneur (Malaysia) WhatsApp Meetup - Aug 2020', '0.00', 'Eatropica']
['2020-08-01', 'KUPON DAGING QURBAN MJTAAS 2020', '0.00', 'Masjid Jamek Tengku Abdul Aziz Shah']
['2020-08-12', 'Wire And  Cable  Show Malaysia 2020', '0.00', 'Kuala Lumpur City Centre']
['2020-10-05', 'KL International Flea Market 2020 / Bazaar Antarabangsa Kuala Lumpur', '0.00', 'VIVA Shopping Mall']
Page 2...
['2020-07-19', 'FGTSD Physical Church Service', '0.00', 'Full Gospel Tabernacle Sri Damansara']
['2020-07-17', 'OWN YOUR ONLINE BUSINESS WITH A TURN ON KEY PLATFORM', '0.00', 'Online']
['2020-09-12', 'International Beauty Expo (IBE) 2020', '0.00', 'Malaysia International Trade and Exhibition Centre']
['2020-07-20', 'Learn How To Earn USD3500 In 4 Week Using Your SmartPhone', '0.00', 'KL Online Event']
['2020-08-27', 'Turn Customers into Raving Fans of Your Brand via Equity Crowdfunding', '0.00', 'Found8 KL Sentral']
['2020-08-12', 'Improving Your  Business Workflow with HELIOS', '0.00', 'KL Eco City']
['2020-07-27', 'Winning People Over: Influencing Skills (Online - Run 9)', '0.00', 'Menara Keck Seng']
['2020-08-10', 'CERTIFIED CYBER PENETRATION TESTING ENGINEER (CCPTE)', '0.00', 'Kuala Lumpur']
['2020-10-22', 'Halloween Edition: Creating High Performing Teams Workshop', '0.00', 'Found8 KL Sentral']

... and so on until page 19.
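The stop-on-empty-page loop can also be separated from the fetching code, which makes it easy to add an upper bound in case the site never returns an empty list. This is only a sketch: iter_events, fetch_page and max_pages are hypothetical names, and fetch_page is assumed to be any callable that takes a page number and returns that page's list of event dicts (for example, a small function wrapping the requests and BeautifulSoup calls shown above).

```python
def iter_events(fetch_page, max_pages=50):
    """Yield events page by page until a page comes back empty.

    fetch_page: callable taking a 1-based page number and returning a
                list of event dicts (assumed interface, not a library API).
    max_pages:  safety cap so the loop ends even if no page is ever empty.
    """
    for page in range(1, max_pages + 1):
        events = fetch_page(page)
        if not events:
            # An empty (or missing) list means we ran past the last page.
            break
        yield from events
```

Because the fetcher is passed in, the same generator works with either the requests version or the Selenium version, and it can be exercised offline with a fake fetcher such as lambda page: fake_pages.get(page, []).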
