Create/Insert Json in Postgres with requests and psycopg2

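I just started a project with PostgreSQL. I would like to make the leap from Excel to a database, and I am stuck on create and insert. Once this runs, I believe I will have to switch it to an update so that I don't keep writing over the current data. I know my connection is working, but I get the following error.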

My error is: TypeError: not all arguments converted during string formatting

#!/usr/bin/env python
import requests
import psycopg2

conn = psycopg2.connect(database='NHL', user='postgres', password='postgres', host='localhost', port='5432')

req = requests.get('http://www.nhl.com/stats/rest/skaters?isAggregate=false&reportType=basic&isGame=false&reportName=skatersummary&sort=[{%22property%22:%22playerName%22,%22direction%22:%22ASC%22},{%22property%22:%22goals%22,%22direction%22:%22DESC%22},{%22property%22:%22assists%22,%22direction%22:%22DESC%22}]&cayenneExp=gameTypeId=2%20and%20seasonId%3E=20172018%20and%20seasonId%3C=20172018') 
data = req.json()['data']

my_data = []
for item in data:
    season = item['seasonId']
    player = item['playerName']
    first_name = item['playerFirstName']
    last_Name = item['playerLastName']
    playerId = item['playerId']
    height = item['playerHeight']
    pos = item['playerPositionCode']
    handed = item['playerShootsCatches']
    city = item['playerBirthCity']
    country = item['playerBirthCountry']   
    state = item['playerBirthStateProvince']
    dob = item['playerBirthDate']
    draft_year = item['playerDraftYear']
    draft_round = item['playerDraftRoundNo']
    draft_overall = item['playerDraftOverallPickNo']
    my_data.append([playerId, player, first_name, last_Name, height, pos, handed, city, country, state, dob, draft_year, draft_round, draft_overall, season])

cur = conn.cursor()
cur.execute("CREATE TABLE t_skaters (data json);")
cur.executemany("INSERT INTO t_skaters VALUES (%s)", (my_data,))

Sample of data:

[[8468493, 'Ron Hainsey', 'Ron', 'Hainsey', 75, 'D', 'L', 'Bolton', 'USA', 'CT', '1981-03-24', 2000, 1, 13, 20172018], [8471339, 'Ryan Callahan', 'Ryan', 'Callahan', 70, 'R', 'R', 'Rochester', 'USA', 'NY', '1985-03-21', 2004, 4, 127, 20172018]]

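It seems like you want to create a table with one column named "data". The type of this column is JSON. (I would recommend creating one column per field, but that is up to you.)

In this case the variable data (read from the request) is a list of dicts. As I mentioned in my comment, you can loop over data and do the inserts one at a time, since executemany() is not faster than multiple calls to execute().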
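The TypeError in the question can be reproduced without a database. The sketch below assumes that psycopg2 binds one parameter set to the query in a way that behaves like Python's % operator, and uses plain string formatting as a stand-in for it. The helper name reproduce_error and the shortened two-value rows are made up for illustration. Passing (my_data,) to executemany() hands it a single parameter set containing every row, so the one %s placeholder receives more than one argument.

```python
# Two rows shaped like the sample data, shortened to two values each.
my_data = [[8468493, 'Ron Hainsey'], [8471339, 'Ryan Callahan']]
query = "INSERT INTO t_skaters VALUES (%s)"


def reproduce_error(query, params):
    """Bind one parameter set with %-formatting (stand-in for the driver)."""
    try:
        query % tuple(params)
    except TypeError as exc:
        return str(exc)
    return None


# executemany(query, (my_data,)) yields ONE parameter set: the whole list
# of rows. One placeholder, two arguments, so formatting fails.
error_message = reproduce_error(query, my_data)
print(error_message)
```

This is only a way to see where the message comes from. Real queries should still pass parameters to execute() rather than format the SQL string by hand.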

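What I did was the following:

  1. Create a list of the fields that you care about.
  2. Loop over the elements of data.
  3. For each item in data, extract the fields into my_data.
  4. Call execute() and pass in json.dumps(my_data), which converts my_data from a dict into a JSON string.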
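Steps 1 to 4 can be checked without a database connection. This is a minimal sketch: the item dict is built by hand from the first sample row in the question, the field list is shortened, and the extraField key is a hypothetical addition that only shows unwanted keys being dropped.

```python
import json

# Step 1: the fields to keep (shortened from the full list).
fields = ['seasonId', 'playerName', 'playerId', 'playerPositionCode']

# One element of data, rebuilt by hand from the sample row in the question.
item = {
    'seasonId': 20172018,
    'playerName': 'Ron Hainsey',
    'playerId': 8468493,
    'playerPositionCode': 'D',
    'extraField': 'ignored',  # hypothetical key, not part of the sample data
}

# Step 3: keep only the wanted fields.
my_data = {field: item[field] for field in fields}

# Step 4: serialize to a JSON string and wrap it in a one-element tuple,
# which is the parameter shape a single %s placeholder expects.
payload = json.dumps(my_data)
params = (payload,)
print(payload)
```

The trailing comma in (payload,) matters: without it the parentheses do not create a tuple, and the string itself would be treated as the sequence of parameters.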

Try this:

#!/usr/bin/env python
import requests
import psycopg2
import json

conn = psycopg2.connect(database='NHL', user='postgres', password='postgres', host='localhost', port='5432')

req = requests.get('http://www.nhl.com/stats/rest/skaters?isAggregate=false&reportType=basic&isGame=false&reportName=skatersummary&sort=[{%22property%22:%22playerName%22,%22direction%22:%22ASC%22},{%22property%22:%22goals%22,%22direction%22:%22DESC%22},{%22property%22:%22assists%22,%22direction%22:%22DESC%22}]&cayenneExp=gameTypeId=2%20and%20seasonId%3E=20172018%20and%20seasonId%3C=20172018') 

# data here is a list of dicts
data = req.json()['data']

cur = conn.cursor()
# create a table with one column of type JSON
cur.execute("CREATE TABLE t_skaters (data json);")

fields = [
    'seasonId',
    'playerName',
    'playerFirstName',
    'playerLastName',
    'playerId',
    'playerHeight',
    'playerPositionCode',
    'playerShootsCatches',
    'playerBirthCity',
    'playerBirthCountry',
    'playerBirthStateProvince',
    'playerBirthDate',
    'playerDraftYear',
    'playerDraftRoundNo',
    'playerDraftOverallPickNo'
]

for item in data:
    my_data = {field: item[field] for field in fields}
    cur.execute("INSERT INTO t_skaters VALUES (%s)", (json.dumps(my_data),))


# commit changes
conn.commit()
# Close the connection
conn.close()

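I am not 100% sure that all of the Postgres syntax here is correct (I don't have access to a PG database to test it), but I believe this logic should work for what you are trying to do.

Update For Separate Columns

You can modify your create statement to handle multiple columns, but that requires knowing the data type of each column. Here is some pseudocode you can follow: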

# same boilerplate code from above
cur = conn.cursor()
# create a table with one column per field
cur.execute(
"""CREATE TABLE t_skaters (seasonId INTEGER, playerName VARCHAR, ...);"""
)

fields = [
    'seasonId',
    'playerName',
    'playerFirstName',
    'playerLastName',
    'playerId',
    'playerHeight',
    'playerPositionCode',
    'playerShootsCatches',
    'playerBirthCity',
    'playerBirthCountry',
    'playerBirthStateProvince',
    'playerBirthDate',
    'playerDraftYear',
    'playerDraftRoundNo',
    'playerDraftOverallPickNo'
]

for item in data:
    my_data = [item[field] for field in fields]
    # need a placeholder (%s) for each variable 
    # refer to postgres docs on INSERT statement on how to specify order
    cur.execute("INSERT INTO t_skaters VALUES (%s, %s, ...)", tuple(my_data))


# commit changes
conn.commit()
# Close the connection
conn.close()

Replace the ... with the appropriate values.
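Rather than typing each %s by hand, the placeholder list can be generated from fields so that it always matches the number of values. The following is a sketch under assumptions: build_insert is a hypothetical helper, the field list is shortened, and the table and column names are interpolated directly, which is only reasonable because they come from your own code and not from user input.

```python
# Shortened field list; the same idea applies to the full list.
fields = ['seasonId', 'playerName', 'playerId']


def build_insert(table, fields):
    """Build an INSERT statement with one %s placeholder per field."""
    columns = ", ".join(fields)
    placeholders = ", ".join(["%s"] * len(fields))
    return f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"


insert_sql = build_insert("t_skaters", fields)

# One element of data, rebuilt by hand from the sample row in the question.
item = {'seasonId': 20172018, 'playerName': 'Ron Hainsey', 'playerId': 8468493}

# Values in the same order as the column list.
row = tuple(item[field] for field in fields)
print(insert_sql)
```

With a cursor open, the pair would be passed as cur.execute(insert_sql, row). Naming the columns explicitly in the INSERT means the order of values no longer depends on the order in which the table's columns were created.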
