
Parameterize an SQL query in Python

I am trying to build a Python script that runs a COPY command over a database connection and accepts parameters.

Database: Amazon Redshift, connecting with the psycopg2 package.
COPY command pulling the data from Amazon S3.

If I hardcode any of the values, the command works fine, but if I add a parameter, the query fails.

Parameters:

access_key = 'my_amazon_acccess_key'
secret_key = 'my_amazon_secret_key'
bucketname = 'my_amazon_s3_bucket_name'
filename = 'my_gzipped_file.gz'

Code I am trying to parameterize:

Version 1

cur.execute("""
    COPY Schema.tablename FROM 's3://%s/%s' credentials 'aws_access_key_id=%s;aws_secret_access_key=%s' NULL 'NULL' gzip delimiter =',';""",
    (bucketname, filename, access_key, secret_key))

Version 1 error:

ProgrammingError: syntax error at or near "my_amazon_s3_bucket_name"
LINE 2:  COPY Schema.tablename FROM 's3://'my_amazon_s3_bucket_name'/'my_gzipped_file.gz'...

Version 2

cur.execute("""
    COPY Schema.tablename FROM 's3://?/?' credentials 'aws_access_key_id=?;aws_secret_access_key=?' NULL 'NULL' gzip delimiter =',';""",
    (bucketname, filename, access_key, secret_key))

Version 3

cur.execute("""
    COPY Schema.tablename FROM 's3://$1/$2' credentials 'aws_access_key_id=$3;aws_secret_access_key=$4' NULL 'NULL' gzip delimiter =',';""",
    (bucketname, filename, access_key, secret_key))

Version 2 & Version 3 error (same message):

InternalError                             Traceback (most recent call last)
<> in <module>()
      1 cur.execute("""
      2         COPY Schema.tablename FROM 's3://?/?' credentials ' aws_access_key_id=?;aws_secret_access_key=?' NULL 'NULL' gzip delimiter ',';""",
----> 3         (bucketname, filename, access_key, secret_key))

InternalError: Invalid credentials. Must be of the format: credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>[;token=<temporary-session-token>]'
DETAIL:
  -----------------------------------------------
  error:  Invalid credentials. Must be of the format: credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>[;token=<temporary-session-token>]'
  code:      8001
  context:
  query:     95221
  location:  aws_credentials_parser.cpp:86
  process:   padbmaster [pid=326]
  -----------------------------------------------

I would prefer not to hard-code these parameters, but I can't figure out how to handle this properly. Can this be done?

You can't embed placeholders inside quoted SQL strings. Instead, concatenate the pieces inside the database engine (the SQL string concatenation operator is || ), rather than assuming, as your code does now, that the values are expanded inside the literal before the query is handed to the database engine to parse.

That is to say, your query string should include:

's3://' || %s || '/' || %s

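...to prefix s3:// , add a string from parameters, a / , another string from parameters, and so on. When you put a %s inside a quoted SQL string, psycopg2 still substitutes it, but with a complete quoted literal, so the quotes nest and break the string. That is the syntax error Version 1 reports.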
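
To make the Version 1 failure concrete, here is a minimal sketch of client-side substitution. The helpers `quote_literal` and `render` are hypothetical and only model the idea; they are not psycopg2's actual implementation.

```python
def quote_literal(value):
    # Simplified stand-in for driver-side string adaptation (an assumption,
    # NOT psycopg2's real code): double embedded quotes, then wrap in quotes.
    return "'" + value.replace("'", "''") + "'"


def render(query, params):
    # Model of client-side interpolation: every %s receives a quoted literal,
    # whether or not it already sits inside quotes in the query text.
    return query % tuple(quote_literal(p) for p in params)


# Placeholders inside a quoted string: the quotes nest, as in Version 1.
broken = render("COPY t FROM 's3://%s/%s'", ("my_bucket", "my_file.gz"))

# One placeholder standing in for a whole literal: the quoting comes out right.
fixed = render("COPY t FROM %s", ("s3://my_bucket/my_file.gz",))

print(broken)
print(fixed)
```

The first result has the same nested-quote shape as the text in the Version 1 error; the second is a single well-formed literal.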


In context, the concatenation approach might look like this (using named placeholders, an arguably clearer form that psycopg2 also supports):

cur.execute("""
    COPY Schema.tablename FROM 's3://' || %(bucketname)s || '/' || %(filename)s
      credentials 'aws_access_key_id=' || %(access_key)s ||
                  ';aws_secret_access_key=' || %(secret_key)s
      NULL 'NULL' gzip delimiter =',';""", 
    {'bucketname': bucketname, 'filename': filename, 'access_key': access_key, 'secret_key': secret_key})
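
If the server turns out to reject expressions such as || in COPY arguments (not verified here), another option is to assemble each complete string in Python and pass it as a single parameter, so the driver quotes the whole literal. A minimal sketch, assuming psycopg2's client-side interpolation; `build_copy` is a hypothetical helper, and the COPY options follow the form shown in the question's traceback:

```python
def build_copy(bucketname, filename, access_key, secret_key):
    """Return (query, params) suitable for cur.execute(query, params)."""
    # Assemble each full literal in Python; the driver then quotes it once.
    s3_url = "s3://{}/{}".format(bucketname, filename)
    credentials = "aws_access_key_id={};aws_secret_access_key={}".format(
        access_key, secret_key
    )
    # Each %s stands for an entire string literal, so it is NOT wrapped
    # in quotes here.
    query = """
        COPY Schema.tablename FROM %s
        credentials %s
        NULL 'NULL' gzip delimiter ',';"""
    return query, (s3_url, credentials)


query, params = build_copy(
    "my_amazon_s3_bucket_name",
    "my_gzipped_file.gz",
    "my_amazon_access_key",
    "my_amazon_secret_key",
)
```

With an open psycopg2 cursor `cur`, this would be run as `cur.execute(query, params)`.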

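To substitute values inside a quoted SQL string, wrap them in AsIs, which makes psycopg2 insert them verbatim, without quoting or escaping: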

from psycopg2.extensions import AsIs

access_key = 'my_amazon_acccess_key'
secret_key = 'my_amazon_secret_key'
bucketname = 'my_amazon_s3_bucket_name'
filename = 'my_gzipped_file.gz'

# mogrify returns the query exactly as it would be sent (bytes on Python 3)
print(cur.mogrify('''
    COPY Schema.tablename
    FROM 's3://%s/%s'
    credentials 'aws_access_key_id=%s;aws_secret_access_key=%s'
    NULL 'NULL' gzip delimiter =','
    ;''',
    (AsIs(bucketname), AsIs(filename), AsIs(access_key), AsIs(secret_key))
).decode())

Output:

COPY Schema.tablename
FROM 's3://my_amazon_s3_bucket_name/my_gzipped_file.gz'
credentials 'aws_access_key_id=my_amazon_acccess_key;aws_secret_access_key=my_amazon_secret_key'
NULL 'NULL' gzip delimiter =','
;
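
Because AsIs inserts values with no escaping, a value containing a single quote would end the surrounding literal early. A minimal guard, sketched under the assumption that legitimate bucket names, file names and keys never contain quotes or backslashes; `check_literal_safe` is a hypothetical helper, not part of psycopg2:

```python
def check_literal_safe(value):
    """Return value unchanged, or raise if it could break out of a '...' literal."""
    # Reject characters that could terminate or alter the enclosing SQL literal.
    if "'" in value or "\\" in value:
        raise ValueError("unsafe character in value: {!r}".format(value))
    return value


# Values like the ones in the question pass through unchanged.
safe_name = check_literal_safe("my_gzipped_file.gz")
```

Each value would be passed through such a check before being wrapped, for example `AsIs(check_literal_safe(filename))`.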

Note that COPY is a server-side command. In PostgreSQL it is run by the user that runs the server (usually postgres), which needs read permission on the file. For a client-side alternative, see psql's \copy, which runs on the client with the client's permissions, or psycopg2's copy_from or copy_expert.
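
As an illustration of the client-side route, here is a minimal sketch using copy_expert. It assumes a stock PostgreSQL server that accepts COPY ... FROM STDIN and a local gzipped CSV file; it does not apply to loading from S3. `load_gzipped_csv` is a hypothetical helper, and `cur` is assumed to be an open psycopg2 cursor.

```python
import gzip


def load_gzipped_csv(cur, path, table):
    """Stream a local gzipped CSV file to the server through the client."""
    # `table` is interpolated as-is, so it must come from trusted code,
    # never from user input.
    sql = "COPY {} FROM STDIN WITH (FORMAT csv, NULL 'NULL')".format(table)
    # Decompress on the client; copy_expert reads from the file-like object.
    with gzip.open(path, "rt") as fh:
        cur.copy_expert(sql, fh)
    return sql
```

A call might look like `load_gzipped_csv(cur, 'my_gzipped_file.gz', 'Schema.tablename')`, followed by a commit on the connection.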

