
Python SQLite Updating Column

I am trying to update the album_title column based on the artist_name column, which is already populated.

I can either update the entire album_title column repeatedly with the last album_title in the loop:

for tag in albums:

    for album in tag:
        cur.execute('INSERT OR IGNORE INTO Albums (album_title) VALUES (?)', (album, ))

        for artist in artists:
            artist = artist.string
            cur.execute('INSERT OR IGNORE INTO Artists(artist_name) VALUES (?)', (artist, ))
            cur.execute('UPDATE Artists SET album_title=? WHERE artist_name=?', (album, artist))

Or I can update only the last row with the correct album_title:

for tag in albums:

    for album in tag:
        cur.execute('INSERT OR IGNORE INTO Albums (album_title) VALUES (?)', (album, ))

        for artist in artists:
            artist = artist.string          
            cur.execute('INSERT OR IGNORE INTO Artists(artist_name) VALUES (?)', (artist, ))

        cur.execute('UPDATE Artists SET album_title=? WHERE artist_name=?', (album, artist))

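I understand why these problems occur, but I can't work out how to get what I want: every row updated with its correct album title. The album_title entries will always correspond one-to-one with the artist_name entries.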

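I have seen that updating columns is covered extensively here, but because of these particular tangled for loops I can't work out a solution. If the problem is that my data retrieval is poorly structured, I would be glad to hear how to fix it.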

Full code:

from urllib.request import Request, urlopen
from urllib.parse import urlparse
from urllib.parse import urljoin
from bs4 import BeautifulSoup

import urllib.error
import sqlite3
import json
import time
import ssl


#connect/create database
conn = sqlite3.connect('pitchscraper.sqlite')
#create way to talk to database
cur = conn.cursor()

#create table
cur.execute('''
    CREATE TABLE IF NOT EXISTS Master (id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE, album_title TEXT UNIQUE, artist_name TEXT UNIQUE)''')

cur.execute('''
    CREATE TABLE IF NOT EXISTS Albums (id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE, album_title TEXT UNIQUE)''')

cur.execute('''
    CREATE TABLE IF NOT EXISTS Artists (id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE, artist_name TEXT UNIQUE, album_title TEXT, FOREIGN KEY(album_title) REFERENCES Albums(album_title))''')



#open and read page
req = Request('http://pitchfork.com/reviews/albums/?page=1', headers={'User-Agent': 'Mozilla/5.0'})
pitchpage = urlopen(req).read()


#parse with beautiful soup
soup = BeautifulSoup(pitchpage, "lxml")
albums = soup('h2')
artists = soup.find_all(attrs={"class" : "artist-list"})


for tag in albums:

    for album in tag:
        cur.execute('INSERT OR IGNORE INTO Albums (album_title) VALUES (?)', (album, ))

        for artist in artists:
            artist = artist.string          
            cur.execute('INSERT OR IGNORE INTO Artists(artist_name) VALUES (?)', (artist, ))        
            cur.execute('UPDATE Artists SET album_title=? WHERE artist_name=?', (album, artist))


print()


conn.commit()

Incorrect output:

+------+-------------------------------------------+-------------+
|  id  |                artist_name                | album_title |
+------+-------------------------------------------+-------------+
| "1"  | "Sylvan Esso"                             | "Odd Hours" |
| "2"  | "Mew"                                     | "Odd Hours" |
| "3"  | "Tara Jane O’Neil"                        | "Odd Hours" |
| "4"  | "Real Life Buildings"                     | "Odd Hours" |
| "5"  | "Bruce Springsteen and the E Street Band" | "Odd Hours" |
| "6"  | "Ravyn Lenae"                             | "Odd Hours" |
| "7"  | "Tee Grizzley"                            | "Odd Hours" |
| "8"  | "Shugo Tokumaru"                          | "Odd Hours" |
| "9"  | "Woods"                                   | "Odd Hours" |
| "10" | "Formation"                               | "Odd Hours" |
| "11" | "Valgeir Sigurðsson"                      | "Odd Hours" |
| "12" | "Caddywhompus"                            | "Odd Hours" |
+------+-------------------------------------------+-------------+

Desired output:

+------+-------------------------------------------+-------------------------------+
|  id  |                artist_name                |          album_title          |
+------+-------------------------------------------+-------------------------------+
| "1"  | "Sylvan Esso"                             | "What Now"                    |
| "2"  | "Mew"                                     | "Visuals"                     |
| "3"  | "Tara Jane O’Neil"                        | "Tara Jane O'Neil"            |
| "4"  | "Real Life Buildings"                     | "Significant Weather"         |
| "5"  | "Bruce Springsteen and the E Street Band" | "Hammersmirth Odeon, London"  |
| "6"  | "Ravyn Lenae"                             | "Midnight Moonlight EP"       |
| "7"  | "Tee Grizzley"                            | "My Moment"                   |
| "8"  | "Shugo Tokumaru"                          | "TOSS"                        |
| "9"  | "Woods"                                   | "Love is Love"                |
| "10" | "Formation"                               | "Look at the Powerful People" |
| "11" | "Valgeir Sigurðsson"                      | "Dissonance"                  |
| "12" | "Caddywhompus"                            | "Odd Hours"                   |
+------+-------------------------------------------+-------------------------------+
albums = soup('h2')
artists = soup.find_all(attrs={"class" : "artist-list"})

The problem is that the artists list contains every artist on the page, not just the artists for the current album.
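A minimal reproduction of the effect, without any scraping, might look like the following. It assumes three album and artist names copied from the tables in the question and an in-memory database with a simplified Artists table; the nested loop is the same shape as in the question.

```python
import sqlite3

# Sample values copied from the tables in the question.
albums = ["What Now", "Visuals", "Odd Hours"]
artists = ["Sylvan Esso", "Mew", "Caddywhompus"]

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE Artists (id INTEGER PRIMARY KEY, "
    "artist_name TEXT UNIQUE, album_title TEXT)"
)

for album in albums:
    # The inner loop visits every artist on every pass of the outer loop,
    # so each pass overwrites album_title for all rows.
    for artist in artists:
        cur.execute(
            "INSERT OR IGNORE INTO Artists (artist_name) VALUES (?)", (artist,)
        )
        cur.execute(
            "UPDATE Artists SET album_title = ? WHERE artist_name = ?",
            (album, artist),
        )
conn.commit()

nested = cur.execute(
    "SELECT artist_name, album_title FROM Artists ORDER BY id"
).fetchall()
print(nested)
```

After the final pass of the outer loop, every row holds the last album, which matches the incorrect table in the question.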

Inside the loop, you must extract the artist list from each individual album.
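One way to sketch this is shown below. The markup is hypothetical (a `div` with class `review` wrapping each album's own `artist-list` and `h2`); the real page layout is not shown in the question and may differ. To keep the example runnable without a network connection, it parses an inline snippet with the standard library's `xml.etree.ElementTree` rather than BeautifulSoup; with BeautifulSoup the same idea would be to call `find_all` on each album's container tag instead of on `soup`.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical markup: one <div class="review"> per album, holding that
# album's own artist list and title. The real page may be laid out differently.
PAGE = """
<div>
  <div class="review">
    <ul class="artist-list"><li>Sylvan Esso</li></ul>
    <h2>What Now</h2>
  </div>
  <div class="review">
    <ul class="artist-list"><li>Mew</li></ul>
    <h2>Visuals</h2>
  </div>
  <div class="review">
    <ul class="artist-list"><li>Caddywhompus</li></ul>
    <h2>Odd Hours</h2>
  </div>
</div>
"""

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE Albums (id INTEGER PRIMARY KEY, album_title TEXT UNIQUE)"
)
cur.execute(
    "CREATE TABLE Artists (id INTEGER PRIMARY KEY, "
    "artist_name TEXT UNIQUE, album_title TEXT)"
)

root = ET.fromstring(PAGE)
for review in root.findall(".//div[@class='review']"):
    album = review.find("h2").text
    cur.execute(
        "INSERT OR IGNORE INTO Albums (album_title) VALUES (?)", (album,)
    )
    # Search inside this review only, not across the whole page.
    for li in review.findall("./ul[@class='artist-list']/li"):
        artist = li.text
        cur.execute(
            "INSERT OR IGNORE INTO Artists (artist_name) VALUES (?)", (artist,)
        )
        cur.execute(
            "UPDATE Artists SET album_title = ? WHERE artist_name = ?",
            (album, artist),
        )
conn.commit()

rows = cur.execute(
    "SELECT artist_name, album_title FROM Artists ORDER BY id"
).fetchall()
print(rows)
```

Because the inner loop only sees the artists that belong to the current album, each UPDATE touches only that album's rows.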
