Create/Insert Json in Postgres with requests and psycopg2
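I just started a project with PostgreSQL. I want to make the jump from Excel to a database, and I am stuck on create and insert. Once I get this running, I believe I will have to switch it to an update so that I don't keep writing over the current data. I know my connection works, but I get the following error.
My error is: TypeError: not all arguments converted during string formatting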
#!/usr/bin/env python
import requests
import psycopg2
conn = psycopg2.connect(database='NHL', user='postgres', password='postgres', host='localhost', port='5432')
req = requests.get('http://www.nhl.com/stats/rest/skaters?isAggregate=false&reportType=basic&isGame=false&reportName=skatersummary&sort=[{%22property%22:%22playerName%22,%22direction%22:%22ASC%22},{%22property%22:%22goals%22,%22direction%22:%22DESC%22},{%22property%22:%22assists%22,%22direction%22:%22DESC%22}]&cayenneExp=gameTypeId=2%20and%20seasonId%3E=20172018%20and%20seasonId%3C=20172018')
data = req.json()['data']
my_data = []
for item in data:
    season = item['seasonId']
    player = item['playerName']
    first_name = item['playerFirstName']
    last_Name = item['playerLastName']
    playerId = item['playerId']
    height = item['playerHeight']
    pos = item['playerPositionCode']
    handed = item['playerShootsCatches']
    city = item['playerBirthCity']
    country = item['playerBirthCountry']
    state = item['playerBirthStateProvince']
    dob = item['playerBirthDate']
    draft_year = item['playerDraftYear']
    draft_round = item['playerDraftRoundNo']
    draft_overall = item['playerDraftOverallPickNo']
    my_data.append([playerId, player, first_name, last_Name, height, pos, handed, city, country, state, dob, draft_year, draft_round, draft_overall, season])
cur = conn.cursor()
cur.execute("CREATE TABLE t_skaters (data json);")
cur.executemany("INSERT INTO t_skaters VALUES (%s)", (my_data,))
Sample data:
[[8468493, 'Ron Hainsey', 'Ron', 'Hainsey', 75, 'D', 'L', 'Bolton', 'USA', 'CT', '1981-03-24', 2000, 1, 13, 20172018], [8471339, 'Ryan Callahan', 'Ryan', 'Callahan', 70, 'R', 'R', 'Rochester', 'USA', 'NY', '1985-03-21', 2004, 4, 127, 20172018]]
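It looks like you want to create a table with a single column named "data". The type of this column is JSON. (I would recommend creating one column per field, but that is up to you.)
In this case the variable data (read from the request) is a list of dicts. As I mentioned in my comment, you can loop over data and do the inserts one at a time, since executemany() is no faster than multiple calls to execute().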
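As a rough illustration of where the TypeError in the question comes from: executemany() runs the statement once per element of the sequence it is given, and (my_data,) has a single element, namely the whole list of rows. That one execution therefore receives every row as a separate argument while the statement has only one %s. The sketch below mimics this with plain Python %-formatting instead of psycopg2 (an assumption made so that it runs without a database; psycopg2 does its own formatting, but the mismatch is the same). The helper name format_error and the shortened rows are for illustration only.

```python
QUERY = "INSERT INTO t_skaters VALUES (%s)"

# Two rows shaped like the question's my_data (shortened for readability).
my_data = [
    [8468493, "Ron Hainsey", 20172018],
    [8471339, "Ryan Callahan", 20172018],
]


def format_error(query, params):
    """Return the formatting error message for params, or None if they fit."""
    try:
        query % tuple(params)
    except TypeError as exc:
        return str(exc)
    return None


# executemany(QUERY, (my_data,)) iterates over the outer tuple, so the
# statement is formatted once with my_data itself as the argument list:
# two arguments for one placeholder.
for params in (my_data,):
    print(format_error(QUERY, params))

# One argument per placeholder formats without error.
print(format_error(QUERY, ("a single JSON string",)))
```

The first call reports the same message as in the question, while the second returns None because the number of arguments matches the number of placeholders.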
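What I did was the following: build a list of the fields you care about, loop over the elements of data, for each item in data extract those fields into my_data, and call execute(), passing in json.dumps(my_data) (which converts my_data from a dict into a JSON string). Try this: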
#!/usr/bin/env python
import requests
import psycopg2
import json
conn = psycopg2.connect(database='NHL', user='postgres', password='postgres', host='localhost', port='5432')
req = requests.get('http://www.nhl.com/stats/rest/skaters?isAggregate=false&reportType=basic&isGame=false&reportName=skatersummary&sort=[{%22property%22:%22playerName%22,%22direction%22:%22ASC%22},{%22property%22:%22goals%22,%22direction%22:%22DESC%22},{%22property%22:%22assists%22,%22direction%22:%22DESC%22}]&cayenneExp=gameTypeId=2%20and%20seasonId%3E=20172018%20and%20seasonId%3C=20172018')
# data here is a list of dicts
data = req.json()['data']
cur = conn.cursor()
# create a table with one column of type JSON
cur.execute("CREATE TABLE t_skaters (data json);")
fields = [
    'seasonId',
    'playerName',
    'playerFirstName',
    'playerLastName',
    'playerId',
    'playerHeight',
    'playerPositionCode',
    'playerShootsCatches',
    'playerBirthCity',
    'playerBirthCountry',
    'playerBirthStateProvince',
    'playerBirthDate',
    'playerDraftYear',
    'playerDraftRoundNo',
    'playerDraftOverallPickNo'
]
for item in data:
    my_data = {field: item[field] for field in fields}
    cur.execute("INSERT INTO t_skaters VALUES (%s)", (json.dumps(my_data),))
# commit changes
conn.commit()
# Close the connection
conn.close()
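I am not 100% sure that all of the postgres syntax here is correct (I don't have access to a PG database to test it), but I believe this logic should work for what you are trying to do.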
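To see what the loop actually sends to the json column, the dict-building step can be run on its own without a database. This is a minimal sketch: the sample item reuses values from the sample data in the question, the fields list is shortened for illustration, and the "goals" key with its value is made up to stand in for the response fields that are not kept.

```python
import json

# Shortened field list, for illustration only.
fields = ["seasonId", "playerName", "playerId", "playerPositionCode"]

# One item shaped like an element of req.json()['data'].
item = {
    "seasonId": 20172018,
    "playerName": "Ron Hainsey",
    "playerId": 8468493,
    "playerPositionCode": "D",
    "goals": 0,  # made-up extra key that is not in fields
}

# Keep only the wanted fields, then serialize the dict to a JSON string.
my_data = {field: item[field] for field in fields}
payload = json.dumps(my_data)
print(payload)

# The string parses back to the same dict, without the extra key.
assert json.loads(payload) == my_data
```

The string in payload is what gets passed as the single query parameter, i.e. (payload,), so the one %s placeholder receives exactly one argument per row.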
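Update for separate columns
You can modify the create statement to handle multiple columns, but that requires knowing the data type of each column. Here is some pseudocode you can follow: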
# same boilerplate code from above
cur = conn.cursor()
# create a table with one column per field
cur.execute(
    """CREATE TABLE t_skaters (seasonId INTEGER, playerName VARCHAR, ...);"""
)
fields = [
    'seasonId',
    'playerName',
    'playerFirstName',
    'playerLastName',
    'playerId',
    'playerHeight',
    'playerPositionCode',
    'playerShootsCatches',
    'playerBirthCity',
    'playerBirthCountry',
    'playerBirthStateProvince',
    'playerBirthDate',
    'playerDraftYear',
    'playerDraftRoundNo',
    'playerDraftOverallPickNo'
]
for item in data:
    my_data = [item[field] for field in fields]
    # need a placeholder (%s) for each variable
    # refer to postgres docs on INSERT statement on how to specify order
    cur.execute("INSERT INTO t_skaters VALUES (%s, %s, ...)", tuple(my_data))
# commit changes
conn.commit()
# Close the connection
conn.close()
Replace the ... with the appropriate values.
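One way to fill in the pseudocode is to keep the column names and types in one place and derive both statements from it. The sketch below rests on several assumptions: the SQL types in COLUMNS are guesses based on the sample data (not verified against the API), only a few of the fields are listed, CREATE TABLE IF NOT EXISTS is added so that a rerun does not fail on an existing table, and the helper names build_statements and load_skaters are hypothetical. It has not been run against a live database.

```python
# Field name -> assumed PostgreSQL type (guesses from the sample data).
COLUMNS = {
    "seasonId": "INTEGER",
    "playerId": "INTEGER",
    "playerName": "VARCHAR",
    "playerPositionCode": "VARCHAR",
    "playerBirthDate": "DATE",
}


def build_statements(table, columns):
    """Build CREATE TABLE and INSERT statements from a name -> type mapping."""
    column_defs = ", ".join(f"{name} {sqltype}" for name, sqltype in columns.items())
    column_names = ", ".join(columns)
    # One %s placeholder per column, in the same order as column_names.
    placeholders = ", ".join(["%s"] * len(columns))
    create_sql = f"CREATE TABLE IF NOT EXISTS {table} ({column_defs});"
    insert_sql = f"INSERT INTO {table} ({column_names}) VALUES ({placeholders})"
    return create_sql, insert_sql


def load_skaters(conn, data, table="t_skaters", columns=COLUMNS):
    """Create the table if needed and insert one row per item in data."""
    create_sql, insert_sql = build_statements(table, columns)
    cur = conn.cursor()
    cur.execute(create_sql)
    for item in data:
        # Values are passed as parameters, in column order.
        cur.execute(insert_sql, tuple(item[name] for name in columns))
    conn.commit()
```

With the connection and data from the code above, this would be called as load_skaters(conn, data). Naming the columns explicitly in the INSERT fixes the order of the values, and the table and column names are interpolated into the SQL only because they are trusted constants here. Note that PostgreSQL folds unquoted identifiers such as seasonId to lowercase.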