
Sending data scraped with BS4 to sqlite3 database using Python

I am scraping the names of cafes in different neighbourhoods and want to add them to a table in an SQLite3 database. However, all that gets added to the table is the literal string 'cafeNames', not the actual list of cafe names.

I have searched through the SQLite and BS4 documentation and followed a whole host of tutorials, but I can't figure it out. I'd be grateful for some guidance.

Get cafe names code

import requests
import sqlite3
from bs4 import BeautifulSoup

def cafenames():
    url = 'https://www.broadsheet.com.au/melbourne/guides/best-cafes-thornbury'
    response = requests.get(url, timeout=5)

    soup_cafe_names = BeautifulSoup(response.content, "html.parser")
    type(soup_cafe_names)

    cafeNames = soup_cafe_names.findAll('h2', attrs={"class":"venue-title", })
    cafeNames = [ul.text.encode for ul in cafeNames]

Connect to DB code

try:
    sqliteConnection = sqlite3.connect('anybody_database.db')
    cursor = sqliteConnection.cursor()
    print("Database created and Successfully Connected to anybody_database")

    sqlite_select_Query = "select sqlite_version();"
    cursor.execute(sqlite_select_Query)
    record = cursor.fetchall()
    print("SQLite Database Version is: ", record)
    cursor.close()

except sqlite3.Error as error:
    print("Error while connecting to sqlite", error)

finally:
    if (sqliteConnection):
        sqliteConnection.close()
        print("The SQLite connection is closed")

Create Table Code


try:
    sqliteConnection = sqlite3.connect('anybody_database.db')
    sqlite_create_table_query = ''' CREATE TABLE cafes (
                                    id INTEGER PRIMARY KEY,
                                    name TEXT NOT NULL);'''
    cursor = sqliteConnection.cursor()
    print("Successfully Connected to SQLite")
    cursor.execute(sqlite_create_table_query)
    sqliteConnection.commit()
    print("SQLite table created")

    cursor.close()

except sqlite3.Error as error:
    print("Error while creating a sqlite table", error)
finally:
    if (sqliteConnection):
        sqliteConnection.close()
        print("sqlite connection is closed")

Add the cafe names to the table code:


def insertVariableIntoTable(name):
    try:
        sqliteConnection = sqlite3.connect('anybody_database.db')
        cursor = sqliteConnection.cursor()
        print("Successfully Connected to SQLite")

        sqlite_insert_with_param = """INSERT INTO cafes
                            (name)
                            VALUES
                            (?)"""

        data_tuple = (name)
        cursor.execute(sqlite_insert_with_param, data_tuple)
        sqliteConnection.commit()
        print("Python Variables inserted successfully into cafes table ")

        cursor.close()


    except sqlite3.Error as error:
        print("Failed to insert data into sqlite table", error)
    finally:
        if (sqliteConnection):
            sqliteConnection.close()
            print("The SQLite connection is closed")

insertVariableIntoTable('cafeNames')

The problem is with your BeautifulSoup extraction.

import requests
from bs4 import BeautifulSoup

def cafenames():
    url = 'https://www.broadsheet.com.au/melbourne/guides/best-cafes-thornbury'
    response = requests.get(url, timeout=5)

    soup_cafe_names = BeautifulSoup(response.content, "html.parser")

    # Each venue name sits in an <h2 class="venue-title"> element
    cafeNames = soup_cafe_names.findAll('h2', attrs={"class": "venue-title"})
    # encode() must be called; strip() removes surrounding whitespace
    cafeNames = [ul.text.strip().encode() for ul in cafeNames]
    return cafeNames

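encode is a method, not an attribute, so it has to be called with parentheses. Written as ul.text.encode, the list comprehension collects method objects instead of the names. With the call in place, the list produced by this function is as follows: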

[b'Prior',
 b'Rat the Cafe',
 b'Ampersand Coffee and Food',
 b'Umberto Espresso Bar',
 b'Brother Alec',
 b'Short Round',
 b'Jerry Joy',
 b'The Old Milk Bar',
 b'Little Henri',
 b'Northern Soul']
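
To see why the missing parentheses matter, here is a minimal sketch that uses a hard-coded list of strings in place of the scraped h2 text, so it runs without a network request. The names raw_names, as_methods, as_bytes and as_text are made up for illustration and do not appear in the original code.

```python
# Hypothetical stand-in for the text of the scraped <h2> elements.
raw_names = ["  Prior ", "Rat the Cafe\n"]

# Without parentheses, each item is a bound method object, not a name.
as_methods = [name.encode for name in raw_names]

# Calling the method gives bytes (UTF-8 by default).
as_bytes = [name.strip().encode() for name in raw_names]

# Skipping encode keeps ordinary str values.
as_text = [name.strip() for name in raw_names]

print(callable(as_methods[0]))  # True
print(as_bytes)                 # [b'Prior', b'Rat the Cafe']
print(as_text)                  # ['Prior', 'Rat the Cafe']
```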

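I would not recommend calling encode() unless you have a specific reason to store bytes. ul.text.strip() on its own gives ordinary strings, which sqlite3 stores in a TEXT column as they are.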
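
Separately from the extraction, the question's last line passes the string 'cafeNames' to the insert function rather than the list itself, which would explain why the names never reach the table. Below is a minimal sketch of one way to insert the whole list, assuming the cafes table from the question. It uses an in-memory database and a hard-coded sample list so that it is self-contained, and insert_cafe_names is a hypothetical helper, not part of the original code. Note that each parameter set has to be a sequence such as the one-element tuple (name,); (name) without the trailing comma is just the string.

```python
import sqlite3

def insert_cafe_names(connection, names):
    """Insert each name as its own row in the cafes table."""
    # executemany expects a sequence of parameter tuples; (name,) is a
    # one-element tuple, whereas (name) would just be the string.
    rows = [(name,) for name in names]
    with connection:  # commits on success, rolls back on error
        connection.executemany("INSERT INTO cafes (name) VALUES (?)", rows)

# Assumption: an in-memory database stands in for anybody_database.db.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE cafes (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# Assumption: a hard-coded sample replaces the scraped list of names.
cafe_names = ["Prior", "Rat the Cafe", "Ampersand Coffee and Food"]
insert_cafe_names(connection, cafe_names)

stored = [row[0] for row in connection.execute("SELECT name FROM cafes ORDER BY id")]
print(stored)
connection.close()
```

In the real script the same helper could be called with a connection to anybody_database.db and the scraped list, provided cafenames() returns that list of names.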
