psycopg2: How to add data from a Python Application to Postgresql

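This is part of a project that captures news headlines, tracks each outlet's political lean, and returns sentiment analysis (positive, negative, and neutral). Everything in the code below works except simply adding the captured data to a PostgreSQL table. (FYI: CNN and USAToday are also in the works, since I respect all news sources.)

Currently, I get the following error:

"Exception has occurred: SyntaxError syntax error at or near "%""

I've tried to find examples that match my situation, but every example I've seen inserts literal values rather than Python variables that hold the data. I've also attached the original table creation code: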

--create table
create table news_data (
timestamp timestamp,
"user" varchar(75), -- quoted because user is a reserved word in PostgreSQL
url text,
site varchar(75),
lean varchar(75),
source varchar(75),
headline varchar(1000),
results text,
positive float,
negative float,
neutral float
)

Python code:

#import the needed libraries
import os
import nltk
#nltk.download() #Only use once
import datetime
import getpass
import requests
import time
import numpy as np
import pandas as pd
import pandasql as psql
import lxml
from bs4 import BeautifulSoup
from nltk.sentiment.vader import SentimentIntensityAnalyzer as SIA
#from nltk.corpus.reader.plaintext import PlaintextCorpusReader as PCR
import psycopg2 as p2

#Application process
ts = datetime.datetime.now().date() #needs to be converted to a string
timestamp = ts.strftime("%d-%b-%Y (%H:%M:%S.%f)")
user = getpass.getuser()
url = 'https://www.foxnews.com/'
site = 'Fox News'
lean = 'Conservative'
source = requests.get(url)
soup = BeautifulSoup(source.content, 'lxml') #'source.content' is critical to success here
headline = soup.find('h2', class_='title title-color-default').text #apply sentiment analysis to headline
vader = SIA()
sentiment_dict = vader.polarity_scores(headline)
results = vader.polarity_scores(headline)
positive = (sentiment_dict['pos']*100)
negative = (sentiment_dict['neg']*100)
neutral = (sentiment_dict['neu']*100)


#lists
#sql_list = (timestamp, user, url, site, lean, headline, results, positive, negative, neutral)

#Postgresql connection
db_connection = p2.connect(user = 'PLACEHOLDER', password='PLACEHOLDER', database='PLACEHOLDER')
cursor = db_connection.cursor()
insertion = cursor.execute('''
INSERT INTO news_data VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)
''')

record_to_insert = (timestamp, user, url, site, lean, headline, source, results, positive, negative, neutral)
cursor.execute(insertion, record_to_insert)


cursor.close()
db_connection.close()

        
#Testing
#print(headline)
#print(negative)

Even after changing the last few lines to the following, I still get this error:

Exception has occurred: SyntaxError syntax error at or near "%" LINE 1: ...ce, results, positive,negative,neutral) VALUES (%s, %s, %s...

Changed code:

cursor.execute("INSERT INTO news_data (timestamp, url, site, lean, headline, source, results, positive, negative, neutral) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s), (timestamp, url, site, lean, headline, results, positive, negative, neutral)")

cursor.commit()

cursor.close()

db_connection.close()
        
#Testing
#print(headline)
#print(negative)

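Without checking every data type in the statement, I can see that some of the values are strings. String values need single quotes only when they are written as literals directly in the SQL text. When the values are passed through %s placeholders, psycopg2 quotes them itself, so the placeholders must stay unquoted:

insertion = '''
INSERT INTO news_data VALUES (%s, %s, ...)
'''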
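
As a small illustration of the difference between hand-written quotes and placeholders, here is a sketch in plain Python that needs no database. The headline text and the single-column INSERT are made up for the example; only the `%s` placeholder style and the idea of passing the values separately come from psycopg2.

```python
# Hypothetical headline that happens to contain a single quote.
headline = "Senate won't vote today"

# Hand-quoted literal built with Python string formatting: the apostrophe in
# the headline ends the SQL string early, leaving broken (and injectable) SQL.
hand_quoted = "INSERT INTO news_data (headline) VALUES ('%s')" % headline

# Parameterized form: the SQL text keeps a bare %s and the value travels
# separately, to be handed over together as cursor.execute(sql, params).
sql = "INSERT INTO news_data (headline) VALUES (%s)"
params = (headline,)

print(hand_quoted)
print(sql, params)
```

The first printed statement has three single quotes, so the server cannot tell where the string ends. In the second form the driver does the escaping, which is why the placeholder itself carries no quotes.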

Ah, this is the relevant part of the code:

#Postgresql connection
db_connection = p2.connect(user = 'PLACEHOLDER', password='PLACEHOLDER', database='PLACEHOLDER')
cursor = db_connection.cursor()

# hmm, this won't work.  You need to provide values for all those placeholders
insertion = cursor.execute('''
INSERT INTO news_data VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)
''')

# oh, i see what you're up to.  You want this:

insertion = 'INSERT INTO news_data....'

record_to_insert = (timestamp, user, url, site, lean, headline, source, results, positive, negative, neutral)

cursor.execute(insertion, record_to_insert)

cursor.close()

# most likely, you'll also need
db_connection.commit()

db_connection.close()
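
Putting the pieces above together, a fuller sketch might look like the following. The helper names (`COLUMNS`, `build_insert`, `prepare_record`, `insert_record`) are hypothetical, the column list is copied from the CREATE TABLE in the question, and storing `results` as JSON text is my own assumption about what the `results text` column should hold. The connection is expected to come from `psycopg2.connect(...)`, as in the question.

```python
import json
from contextlib import closing

# Column names from the question's CREATE TABLE, in table order.
COLUMNS = ("timestamp", "user", "url", "site", "lean", "source",
           "headline", "results", "positive", "negative", "neutral")


def build_insert(table, columns):
    """Return an INSERT statement with one bare %s placeholder per column.

    The table and column names are trusted constants here, not user input.
    Column names are double-quoted because "user" is a reserved word.
    """
    column_list = ", ".join('"{}"'.format(name) for name in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return "INSERT INTO {} ({}) VALUES ({})".format(
        table, column_list, placeholders)


def prepare_record(raw):
    """Convert values the driver cannot adapt by default into ones it can."""
    record = dict(raw)
    # A plain dict (such as VADER's polarity scores) is stored as JSON text.
    if isinstance(record.get("results"), dict):
        record["results"] = json.dumps(record["results"])
    return record


def insert_record(connection, raw):
    """Insert one record (a dict keyed by column name) and commit."""
    record = prepare_record(raw)
    sql = build_insert("news_data", COLUMNS)
    params = tuple(record[name] for name in COLUMNS)
    with closing(connection.cursor()) as cursor:
        # The SQL text and the values are passed as two separate arguments.
        cursor.execute(sql, params)
    connection.commit()
```

With this shape, the `timestamp` value can be a `datetime.datetime` object passed as-is, since psycopg2 adapts it without a `strftime` call. The `source` variable in the question is a `requests` response object, which the driver cannot adapt, so a string would have to be chosen for that column.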

In my case, I could not get psycopg2 to insert the values stored in Python variables directly. The workaround I ended up with was to save the data to a CSV file first, have the code read the data back from the CSV file, upload it to PostgreSQL, and finally delete the CSV file once the process is complete.

Hopefully this will help anyone trying to accomplish something similar in the future.
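
For anyone who wants to try the CSV route, here is a minimal sketch of those four steps. `load_via_csv` and its parameters are hypothetical names, and the COPY statement in the usage comment is an assumption about how the upload would be done; `copy_expert` is the psycopg2 cursor method that runs a COPY statement against a file object.

```python
import csv
import os
import tempfile


def load_via_csv(connection, rows, copy_sql):
    """Write rows to a temporary CSV file, COPY it into the database,
    then delete the file. Returns the path that was used."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    try:
        # Step 1: save the data to a CSV file.
        with os.fdopen(fd, "w", newline="", encoding="utf-8") as handle:
            csv.writer(handle).writerows(rows)

        # Steps 2 and 3: read the file back and upload it with COPY.
        with open(path, "r", newline="", encoding="utf-8") as handle:
            cursor = connection.cursor()
            try:
                cursor.copy_expert(copy_sql, handle)
            finally:
                cursor.close()
        connection.commit()
    finally:
        # Step 4: delete the CSV file, even if the upload failed.
        os.remove(path)
    return path


# Usage sketch (connection details are placeholders, as in the question):
# connection = psycopg2.connect(user='PLACEHOLDER', password='PLACEHOLDER',
#                               database='PLACEHOLDER')
# load_via_csv(connection, [row], "COPY news_data FROM STDIN WITH (FORMAT csv)")
```

The `csv` module takes care of quoting headlines that contain commas or quote characters, so the file stays parseable on the PostgreSQL side.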
