
How to fetch data from all pages on Eventbrite

I am trying to fetch data from all pages on Eventbrite, but I am only able to fetch one page. When I use findAll, I get an error. This code works fine for one page but not for all pages. Here is my code:

from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
import csv
import json

driver = webdriver.Chrome("chromedriver/chromedriver.exe")

driver.get("https://www.eventbrite.com/d/malaysia--kuala-lumpur--85675181/all-events/?page=1")
content = driver.page_source

soup = BeautifulSoup(content, 'html.parser')
b= json.loads("".join(soup.find("script", {"type":"application/ld+json"}).contents))

with open("data.csv", "w",  encoding='utf-8') as file:
    csv_file = csv.writer(file)
    csv_file.writerow( ["Date", "Name", "Price", "Location"] )

    for item in b:
        csv_file.writerow([item['startDate'], item['name'], item['offers']['highPrice'], item['location']['name']])

Try this:

from selenium import webdriver
from bs4 import BeautifulSoup
import csv
import json

driver = webdriver.Chrome("C:/Users/ARPITA CHOPRA/Downloads/chromedriver/chromedriver.exe")

# newline='' stops the csv module from writing blank rows on Windows
with open("data.csv", "w", encoding='utf-8', newline='') as file:
    csv_file = csv.writer(file)
    csv_file.writerow(["Date", "Name", "Price", "Location"])
    # Visit pages 1 through 19, changing only the page= query parameter
    for x in range(1, 20):
        driver.get("https://www.eventbrite.com/d/malaysia--kuala-lumpur--85675181/all-events/?page="+str(x))
        content = driver.page_source

        soup = BeautifulSoup(content, 'html.parser')
        # The event list is embedded as JSON-LD in a <script> tag
        b = json.loads("".join(soup.find("script", {"type":"application/ld+json"}).contents))

        for item in b:
            csv_file.writerow([item['startDate'], item['name'], item['offers']['highPrice'], item['location']['name']])

# Close the browser once all pages have been written
driver.quit()
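
The row-building line above indexes item['offers']['highPrice'] and item['location']['name'] directly, so a single event without one of those fields would stop the whole run with a KeyError. Here is a minimal sketch of a more tolerant row builder. It assumes each item is a dict shaped like the ones used above. The name event_to_row is hypothetical, and the second sample entry is made up for illustration:

```python
def event_to_row(item):
    """Return [date, name, price, location] for one JSON-LD event dict.

    Missing or oddly shaped fields become empty strings instead of
    raising KeyError.
    """
    offers = item.get("offers")
    if not isinstance(offers, dict):
        offers = {}
    location = item.get("location")
    if not isinstance(location, dict):
        location = {}
    return [
        item.get("startDate", ""),
        item.get("name", ""),
        offers.get("highPrice", ""),
        location.get("name", ""),
    ]


# Sample input: the second entry is invented to show the fallback.
sample = [
    {"startDate": "2020-07-22", "name": "MyFoodTech",
     "offers": {"highPrice": "0.00"},
     "location": {"name": "Kuala Lumpur Convention Centre"}},
    {"startDate": "2020-08-01", "name": "Event without offers"},
]
rows = [event_to_row(item) for item in sample]
for row in rows:
    print(row)
```

In the loop above, csv_file.writerow(event_to_row(item)) could then replace the direct indexing.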

To get data from all pages, increment the page= parameter in the URL.

For example:

import json
import requests
from bs4 import BeautifulSoup


url = "https://www.eventbrite.com/d/malaysia--kuala-lumpur--85675181/all-events/"

page = 1
while True:
    print('Page {}...'.format(page))
    soup = BeautifulSoup(requests.get(url, params={'page': page}).content, 'html.parser')
    b = json.loads("".join(soup.find("script", {"type":"application/ld+json"}).contents))

    if not b:
        break

    for item in b:
        print([item['startDate'], item['name'], item['offers']['highPrice'], item['location']['name']])

    page += 1

Prints:

Page 1...
['2021-03-27', 'Traders Fair 2021 - Malaysia (Financial Education Event)', '0.00', 'InterContinental Kuala Lumpur']
['2020-07-22', 'Malaysian International Food & Beverage (MIFB) Trade Fair', '0.00', 'Kuala Lumpur Convention Centre']
['2020-09-26', 'Post Graduate Education Fair 2020 - Mid Valley KL', '0.00', 'Mid Valley Exhibition Centre']
['2020-08-13', 'THE FIT Malaysia', '0.00', 'Kuala Lumpur Convention Centre']
['2020-09-26', 'Mega Career Fair & Post Graduate Education Fair 2020 - Mid Valley KL', '0.00', 'Mid Valley Exhibition Centre, Kuala Lumpur']
['2020-07-21', 'Entrepreneurship for Beginners - Startup | Entrepreneur Hackathon Webinar', '0.00', 'Kuala Lumpur']
['2020-11-26', 'Branding Strategies For Startups', '0.00', 'Found8 KL Sentral']
['2020-07-22', 'MyFoodTech', '0.00', 'Kuala Lumpur Convention Centre']
['2021-09-01', 'Wiki Finance EXPO Kuala Lumpur 2021', '0.00', '吉隆坡希尔顿逸林酒店']
['2020-07-23', 'How To Improve Your Focus and Limit Distractions - Kuala Lumpur', '0.00', 'ONLINE']
['2020-08-14', 'Kuala Lumpu Video Speed Dating - Filter Off', '0.00', 'Online Dating - Filter Off']
['2021-01-16', "Joey Yap's Feng Shui & Astrology 2021 (Kuala Lumpur) - Cantonese Session", '0.00', 'Kuala Lumpur']
['2020-07-21', 'How To Improve Your Memory - Kuala Lumpur', '0.00', '(ONLINE EVENT)']
['2020-09-24', 'Maximizing Social Impact for Startups and SMEs', '0.00', 'Found8 KL Sentral']
['2021-01-17', "Joey Yap's Feng Shui & Astrology 2021 (Kuala Lumpur) - English Session", '0.00', 'Kuala Lumpur']
['2020-07-17', 'Building Leadership Influence (Online - Run 4)', '0.00', 'Menara Keck Seng']
['2020-08-08', '2020 Entrepreneur (Malaysia) WhatsApp Meetup - Aug 2020', '0.00', 'Eatropica']
['2020-08-01', 'KUPON DAGING QURBAN MJTAAS 2020', '0.00', 'Masjid Jamek Tengku Abdul Aziz Shah']
['2020-08-12', 'Wire And  Cable  Show Malaysia 2020', '0.00', 'Kuala Lumpur City Centre']
['2020-10-05', 'KL International Flea Market 2020 / Bazaar Antarabangsa Kuala Lumpur', '0.00', 'VIVA Shopping Mall']
Page 2...
['2020-07-19', 'FGTSD Physical Church Service', '0.00', 'Full Gospel Tabernacle Sri Damansara']
['2020-07-17', 'OWN YOUR ONLINE BUSINESS WITH A TURN ON KEY PLATFORM', '0.00', 'Online']
['2020-09-12', 'International Beauty Expo (IBE) 2020', '0.00', 'Malaysia International Trade and Exhibition Centre']
['2020-07-20', 'Learn How To Earn USD3500 In 4 Week Using Your SmartPhone', '0.00', 'KL Online Event']
['2020-08-27', 'Turn Customers into Raving Fans of Your Brand via Equity Crowdfunding', '0.00', 'Found8 KL Sentral']
['2020-08-12', 'Improving Your  Business Workflow with HELIOS', '0.00', 'KL Eco City']
['2020-07-27', 'Winning People Over: Influencing Skills (Online - Run 9)', '0.00', 'Menara Keck Seng']
['2020-08-10', 'CERTIFIED CYBER PENETRATION TESTING ENGINEER (CCPTE)', '0.00', 'Kuala Lumpur']
['2020-10-22', 'Halloween Edition: Creating High Performing Teams Workshop', '0.00', 'Found8 KL Sentral']

... and so on until page 19.
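
The "keep going until a page is empty" idea can also be separated from the fetching itself, which makes it easy to try out without hitting the site. The following is only a sketch under stated assumptions: iter_events and fetch_page are hypothetical names, the max_pages cap is an added safety limit that is not part of the code above, and fake_site is made-up stand-in data rather than real Eventbrite output.

```python
import csv
import io


def iter_events(fetch_page, max_pages=50):
    """Yield events page by page until a page comes back empty.

    fetch_page(page) is a hypothetical callable that returns a list of
    event dicts, or None / [] when the page has no data.
    """
    for page in range(1, max_pages + 1):
        events = fetch_page(page)
        if not events:
            break
        yield from events


# Stand-in for the real network call: two pages of made-up data.
fake_site = {
    1: [{"startDate": "2020-07-22", "name": "A"}],
    2: [{"startDate": "2020-08-13", "name": "B"}],
}

# Write to an in-memory buffer here; a real run would open data.csv.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Date", "Name"])
for event in iter_events(fake_site.get):
    writer.writerow([event["startDate"], event["name"]])

print(buffer.getvalue())
```

In a real run, fetch_page would wrap the requests.get and BeautifulSoup lines and return an empty list when soup.find(...) finds no script tag, so that a page past the end stops the loop instead of raising an AttributeError.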
