简体   繁体   中英

How to use cx_Oracle within AWS SageMaker (Jupyter Notebook)

I've followed the official installation guide but haven't had any luck so far. I wonder if cx_Oracle can work on AWS SageMaker's virtual environment. The steps I've used so far are:

  1. Create a /opt/oracle directory and unzip the basic instantclient in it.
  2. sudo yum install libaio
  3. sudo sh -c "echo /opt/oracle/instantclient_18_3 > /etc/ld.so.conf.d/oracle-instantclient.conf" and sudo ldconfig
  4. And finally exported the LD_LIBRARY_PATH with: export LD_LIBRARY_PATH=/opt/oracle/instantclient_18_3:$LD_LIBRARY_PATH

When trying to run a connection inside the notebook with connection = cx_Oracle.connect(usr + '/' + pwd + '@' + url) , I receive the DPI-1047 error code that says that libclntsh.so cannot be open, however that library is in the /opt/oracle folder. As another option, when running the same connection through the terminal Python console, I get the ORA-01804 error code, which says that the timezone files were not properly read, which is something I'm trying to fix also but suspect is related to cx_Oracle not finding its library folder. (Now, explain to me: why does it have to be so difficult for a billionaire company to create a decent library import and installation?)

Is there a step I'm missing? Is there a detail about AWS SageMaker that I should account for? Also, is there another option for extracting data from an oracle server through Python and AWS?

Hi and thank you for using SageMaker!

After some effort, I was finally able to figure out a sequence of steps that allowed me to query an Oracle 12 database from within a SageMaker notebook instance. Here are the steps that I took:

  1. I created an Oracle 12 database using Amazon RDS for testing purposes. (You can of course skip this step if you already have an Oracle database available.)
  2. I downloaded the Oracle 12 Instant Client RPM as described here . Note that you will need an Oracle Account in order to download this file.
  3. I uploaded the RPM to my SageMaker Notebook Instance from within JupyterLab. Note that this can take 2-3 minutes to completely upload before proceeding to the next step. (I initially had problems running the installation because the upload was still in progress.)
  4. I ran all of the following commands from the Jupyter terminal as prescribed in the Oracle instructions:
cd SageMaker

sudo yum install oracle-instantclient12.2-basic-12.2.0.1.0-1.x86_64.rpm
sudo sh -c "echo /usr/lib/oracle/12.2/client64/lib > /etc/ld.so.conf.d/oracle-instantclient.conf"
sudo ldconfig
sudo mkdir -p /usr/lib/oracle/12.2/client64/lib/network/admin

# Restart Jupyter...
sudo restart jupyter-server
  1. I then installed the cx_Oracle library:
source activate python3
pip install cx_Oracle --upgrade
  1. Then lastly, I created a new notebook using the conda_python3 kernel:
import cx_Oracle
db = cx_Oracle.connect("my_username/my_password@my-rds-instance.ccccccccccc.us-east-1.rds.amazonaws.com/ORCL")

# Example query
cursor = db.cursor()
for row in cursor.execute('select * from DBA_TABLES'):
  print(row)

...and then it worked!

Note that it took me a few tries to figure out the exact connection string, which can differ depending on how your database is configured. Unfortunately, the error messages were hard to understand - in my case, I had the error ORA-12504: TNS:listener was not given the SERVICE_NAME in CONNECT_DATA until I had specified the /ORCL at the end of the connection string.

If you need to do these steps frequently, you can add the installation and configuration of the Oracle client in a SageMaker Lifecycle Configuration script. I haven't tested that scenario, but it might be worth a try!

One last thing, I noticed in your question that you are using the Oracle 18 client. I didn't test that exact scenario, since I only have access to an Oracle 12 database. However, the Oracle 12 client should be able to connect to an Oracle 18 database, too.

Best, Kevin

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM