Basic Python: a ' in a text variable stops my script (psycopg2 and tweepy; Python, Postgres and Twitter)

I have a script that mines tweets and inserts them into my Postgres database. It works for most messages.

With the following line I can return the text of a message:

tweet.text.encode('utf-8')

Whenever the tweet has a ' in the text, my script stops. I could write a function that extracts the tweet and wraps it in two ". But I figure I would get the same problem when a tweet contains a ". I could then write a function that checks whether a tweet contains a ' or a " and handles those cases. But that seems like way too much work for such a simple problem.

So I'd like to know how to overcome this problem without too much scripting effort.

I am not an expert in Python, and one of my problems is that I try to fix things in a difficult way when there is often a much simpler one. The current problem looks like such a scenario. Hence my question here.

*** UPDATE

The error does indeed occur when inserting the message into my Postgres table.

I just tried repr() but still got a similar error message.

Traceback (most recent call last):
  File "...python.py", line 28, in <module>
    cur.execute("INSERT INTO Test(userid, created, retweets, message) VALUES('{0}', '{1}', '{2}', '{3}')".format(tweet.user.id, tweet.created_at, tweet.retweet_count, ber))
psycopg2.ProgrammingError: syntax error at or near "E19"
LINE 1: ...LUES('1251822199', '2016-02-27 10:23:40', '0', 'b'E19 (A1) M...

The 4th parameter is the text of the tweet and starts with 'b'E19 as text. It fails here.

The line I use to input the data into postgres is the following:

cur.execute("INSERT INTO Test(message) VALUES('{0}')".format(repr(tweet.text.encode('utf-8'))))

Because you are manually creating the query with string operations, you would need to escape the quotes in the query.
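To see why the formatted query breaks, here is a small sketch that builds the same kind of string without touching the database. The helper name build_query and the sample text are made up for illustration; only the format string and the repr(...encode('utf-8')) call mirror the line in the question.

```python
def build_query(text):
    """Build the INSERT statement the way the question does (hypothetical helper)."""
    # In Python 3, encode() returns bytes, and repr() of bytes adds a b prefix
    # plus its own pair of quotes around the content.
    encoded = repr(text.encode('utf-8'))
    # The value is pasted straight into the SQL text, inside another pair of quotes.
    return "INSERT INTO Test(message) VALUES('{0}')".format(encoded)

# Invented sample text; it contains no apostrophe at all.
print(build_query("E19 (A1) closed"))
# INSERT INTO Test(message) VALUES('b'E19 (A1) closed'')
```

The quote added by repr() closes the SQL string right after the b, so Postgres sees E19 as stray syntax, which matches the 'b'E19 fragment in the traceback. An apostrophe inside the tweet itself ends the string early in the same way.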

But a better way is to use parameterised queries and let psycopg2 escape special characters for you. This also makes your code less vulnerable to SQL injection attacks if some of the parameters come from untrusted sources, e.g. a user.

cur.execute("INSERT INTO Test(message) VALUES(%s)", (tweet.text.encode('utf-8'),))

or

cur.execute("INSERT INTO Test(userid, created, retweets, message) VALUES(%s, %s, %s, %s)", (tweet.user.id, tweet.created_at, tweet.retweet_count, ber))

Now the DB layer will perform escaping for you.
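A minimal runnable sketch of the same idea follows. It assumes the standard-library sqlite3 module as a stand-in, so it runs without a PostgreSQL server. Note that sqlite3 uses ? as its placeholder where psycopg2 uses %s. The function name store_messages, the single-column table and the sample texts are invented for illustration.

```python
import sqlite3

def store_messages(messages):
    """Insert each message with a parameterised query, then read them back."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    cur = conn.cursor()
    cur.execute("CREATE TABLE Test(message TEXT)")
    for msg in messages:
        # The value travels separately from the SQL text, so quotes in it
        # never need manual escaping.
        cur.execute("INSERT INTO Test(message) VALUES(?)", (msg,))
    conn.commit()
    rows = [row[0] for row in cur.execute("SELECT message FROM Test ORDER BY rowid")]
    conn.close()
    return rows

# Invented sample texts containing single quotes, double quotes and SQL-like content.
samples = ["It's working", 'She said "hi"', "'); DROP TABLE Test; --"]
print(store_messages(samples) == samples)
```

The sketch passes str values directly. Under Python 3 that should also work with psycopg2, which would make the encode('utf-8') call unnecessary, though that is worth checking against the column type.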
