How to connect Amazon Redshift to Python

Question

This is my Python code. I want to connect to my Amazon Redshift database from Python, but it reports an error for the host.

Can anyone tell me the correct syntax? Am I passing all the parameters correctly?

con=psycopg2.connect("dbname = pg_table_def, host=redshifttest-icp.cooqucvshoum.us-west-2.redshift.amazonaws.com, port= 5439, user=me, password= secret")

This is the error:

OperationalError: could not translate host name "redshift://redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com," to address: Unknown host

Answer 1

It appears that you wish to run Amazon Redshift queries from Python code.

The parameters you would want to use are:

dbname: The name of the database you entered in the Database name field when the cluster was created.
user: The value you entered in the Master user name field when the cluster was created.
password: The value you entered in the Master user password field when the cluster was created.
host: The Endpoint provided in the Redshift management console (without the port at the end): redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com
port: 5439

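For example (the parameters in the connection string are separated by spaces, not commas; the trailing comma in your error message shows it was read as part of the host name):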

con=psycopg2.connect("dbname=sales host=redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com port=5439 user=master password=secret")
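As a rough sketch of the points above, the helpers below clean up a pasted endpoint and assemble a space-separated connection string. The names clean_host, quote_value, build_dsn and connect are hypothetical helpers written for illustration and are not part of psycopg2; the quoting follows the libpq keyword/value rules as I understand them, so treat it as an assumption to verify against your driver version.

```python
def clean_host(raw_host):
    """Return a bare host name: no URL scheme, no port, no stray comma."""
    host = raw_host.strip().rstrip(",")
    if "://" in host:
        # Drop a prefix such as "redshift://" that is not part of the host name.
        host = host.split("://", 1)[1]
    # Drop a trailing ":port" if the whole console endpoint was pasted.
    return host.split(":", 1)[0]


def quote_value(value):
    """Single-quote values that are empty or contain spaces, quotes or backslashes."""
    text = str(value)
    if text == "" or any(ch.isspace() or ch in "'\\" for ch in text):
        escaped = text.replace("\\", "\\\\").replace("'", "\\'")
        return "'" + escaped + "'"
    return text


def build_dsn(dbname, host, port, user, password):
    """Build a 'key=value key=value' string with space-separated parameters."""
    params = {
        "dbname": dbname,
        "host": clean_host(host),
        "port": port,
        "user": user,
        "password": password,
    }
    return " ".join(f"{key}={quote_value(val)}" for key, val in params.items())


def connect(**kwargs):
    """Open a connection; psycopg2 is imported lazily so the helpers work without it."""
    import psycopg2

    return psycopg2.connect(build_dsn(**kwargs))
```

With this sketch, connect(dbname="sales", host="redshift://redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com,", port=5439, user="master", password="secret") would pass the same string as the example above to psycopg2.connect.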

Answer 2

Old question but I just arrived here from Google.

The accepted answer doesn't work with SQLAlchemy, although it's powered by psycopg2:

sqlalchemy.exc.ArgumentError: Could not parse rfc1738 URL from string 'dbname=... host=... port=... user=... password=...'

What worked:

create_engine(f"postgresql://{REDSHIFT_USER}:{REDSHIFT_PASSWORD}@{REDSHIFT_HOST}:{REDSHIFT_PORT}/{REDSHIFT_DATABASE}")

Which works with psycopg2 directly too:

psycopg2.connect(f"postgresql://{REDSHIFT_USER}:{REDSHIFT_PASSWORD}@{REDSHIFT_HOST}:{REDSHIFT_PORT}/{REDSHIFT_DATABASE}")

Using the postgresql dialect works because Amazon Redshift is based on PostgreSQL.

Hope it can help other people
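One caveat with building the URL through an f-string is that a password containing characters such as "@" or "/" can break the URL. A minimal sketch of one way around that, assuming percent-encoding the credentials with the standard library is acceptable for your setup; build_redshift_url and the sample password are made up for illustration:

```python
from urllib.parse import quote


def build_redshift_url(user, password, host, port, database, driver="postgresql"):
    """Assemble a connection URL, percent-encoding the user name and password."""
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"{driver}://{safe_user}:{safe_password}@{host}:{int(port)}/{database}"


url = build_redshift_url(
    user="me",
    password="p@ss/word",  # hypothetical password with URL-special characters
    host="redshifttest-xyz.cooqucvshoum.us-west-2.redshift.amazonaws.com",
    port=5439,
    database="sales",
)
print(url)
```

The resulting string can then be passed to create_engine or psycopg2.connect in place of the f-string shown above.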

Answer 3

To connect to Redshift, you need the postgres dialect with the psycopg2 driver. For Python 3.x, install it with:

pip3 install psycopg2-binary

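Then use:

from urllib.parse import quote_plus as urlquote
from sqlalchemy import create_engine

# URL-encode the password so characters such as "@" or "/" do not break the URL.
engine = create_engine(
    "postgresql+psycopg2://%s:%s@%s:%s/%s"
    % (REDSHIFT_USERNAME, urlquote(REDSHIFT_PASSWORD), REDSHIFT_HOST, REDSHIFT_PORT,
       REDSHIFT_DB)
)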

Answer 4

The easiest way to query Amazon Redshift from Python is through this Jupyter extension - Jupyter Redshift

Not only can you query and save your results but also write them back to the database from within the notebook environment.

Answer 5

For Redshift, the intended way to load data is COPY from S3, which is faster than the other approaches. Here is an example of how to do it:

First, install some dependencies.

For Linux users:

sudo apt-get install libpq-dev

For Mac users:

brew install libpq

Then install these packages with pip:

pip3 install psycopg2-binary
pip3 install sqlalchemy
pip3 install sqlalchemy-redshift

import sqlalchemy as sa
from sqlalchemy.orm import sessionmaker


#>>>>>>>> MAKE CHANGES HERE <<<<<<<<<<<<<
DATABASE = "dwtest"
USER = "youruser"
PASSWORD = "yourpassword"
HOST = "dwtest.awsexample.com"
PORT = "5439"
SCHEMA = "public"

S3_FULL_PATH = 's3://yourbucket/category_pipe.txt'
ARN_CREDENTIALS = 'arn:aws:iam::YOURACCOUNTID:role/YOURROLE'
REGION = 'us-east-1'

############ CONNECTING AND CREATING SESSIONS ############
connection_string = "redshift+psycopg2://%s:%s@%s:%s/%s" % (USER,PASSWORD,HOST,str(PORT),DATABASE)
engine = sa.create_engine(connection_string)
session = sessionmaker()
session.configure(bind=engine)
s = session()
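# Wrap raw SQL in sa.text(); SQLAlchemy 2.0 rejects plain strings, and text() also works on 1.x.
SetPath = sa.text("SET search_path TO %s" % SCHEMA)
s.execute(SetPath)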
###########################################################



############ RUNNING COPY ############
copy_command = '''
copy category from '%s'
credentials 'aws_iam_role=%s'
delimiter '|' region '%s';
''' % (S3_FULL_PATH, ARN_CREDENTIALS, REGION)
s.execute(sa.text(copy_command))
s.commit()
######################################



############ GETTING DATA ############
query = sa.text("SELECT * FROM category;")
rr = s.execute(query)
all_results =  rr.fetchall()

def pretty(all_results):
    for row in all_results :
        print("row start >>>>>>>>>>>>>>>>>>>>")
        for r in row :
            print(" ---- %s" % r)
        print("row end >>>>>>>>>>>>>>>>>>>>>>")

pretty(all_results)
s.close()
######################################
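The COPY statement above is assembled with plain string formatting, which is fragile if any value contains a quote. A small sketch of a slightly more defensive builder follows, assuming the IAM_ROLE form of COPY authorization rather than the credentials string used above; build_copy_sql and sql_literal are hypothetical helpers, and the account ID and role name are placeholders.

```python
import re

# Accept only simple unquoted identifiers for the table name.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def sql_literal(value):
    """Wrap a value in single quotes, doubling any embedded single quote."""
    return "'" + str(value).replace("'", "''") + "'"


def build_copy_sql(table, s3_path, iam_role_arn, region, delimiter="|"):
    """Return a COPY statement that loads a delimited S3 file into a table."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not s3_path.startswith("s3://"):
        raise ValueError("s3_path must start with s3://")
    return (
        f"COPY {table} FROM {sql_literal(s3_path)} "
        f"IAM_ROLE {sql_literal(iam_role_arn)} "
        f"DELIMITER {sql_literal(delimiter)} "
        f"REGION {sql_literal(region)};"
    )


copy_sql = build_copy_sql(
    table="category",
    s3_path="s3://yourbucket/category_pipe.txt",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRole",  # placeholder ARN
    region="us-east-1",
)
print(copy_sql)
```

The returned string could be executed through the same session as in the example above; the identifier check is deliberately strict and would need loosening for schema-qualified or quoted table names.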

5 answers

Answer 1: 24, ACCEPTED, 2017-07-20 11:41:22
Answer 2: 2, 2021-09-14 16:41:39
Answer 3: 0, 2022-07-25 06:04:00
Answer 4: -2, 2018-12-17 10:43:08
Answer 5: -2, 2019-01-23 17:01:03
