
How to use the DB2 LOAD utility using the python ibm_db driver

LOAD is a DB2 utility that I would like to use to insert data into a table from a CSV file. How can I do this in Python using the ibm_db driver? I don't see anything in the docs here

CMD: LOAD FROM xyz OF del INSERT INTO FOOBAR

Running this as standard SQL fails as expected: Transaction couldn't be completed: [IBM][CLI Driver][DB2/LINUXX8664] SQL0104N An unexpected token "LOAD FROM xyz OF del" was found following "BEGIN-OF-STATEMENT". Expected tokens may include: "<space>". SQLSTATE=42601 SQLCODE=-104

Using the db2 CLP directly (i.e. os.system('db2 -f /path/to/script.file')) is not an option because DB2 sits on a different machine that I don't have SSH access to.

EDIT:
Using the ADMIN_CMD utility also doesn't work, because the file being loaded cannot be placed on the database server due to a firewall. For now, I've switched to using INSERT.

LOAD is an IBM command line processor command, not an SQL statement. As such, it isn't available through the ibm_db module.

The most typical way to do this would be to load the CSV data into Python (either all the rows or in batches if it is too large for memory) then use a bulk insert to insert many rows at once into the database.

To perform a bulk insert you can use the ibm_db.execute_many method, as in the sketch below.
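For illustration, here is a minimal sketch of that approach using ibm_db.execute_many. The connection string, CSV path, table name and column list are placeholders, not values from the question.

import csv
import ibm_db

# Hypothetical connection details -- replace with your own.
conn = ibm_db.connect(
    "DATABASE=SAMPLE;HOSTNAME=db2host;PORT=50000;PROTOCOL=TCPIP;UID=user;PWD=secret;",
    "", "",
)

# Prepare the parameterized INSERT once; FOOBAR and its columns are placeholders.
stmt = ibm_db.prepare(conn, "INSERT INTO FOOBAR (COL1, COL2) VALUES (?, ?)")

# Read the CSV; csv.reader yields strings, so convert non-character columns as needed.
with open("/path/to/xyz.csv", newline="") as f:
    rows = tuple(tuple(row) for row in csv.reader(f))

# Send all parameter tuples to the server in one execute_many call
# (for very large files, call it in batches instead).
ibm_db.execute_many(stmt, rows)
ibm_db.close(conn)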

You could CALL the ADMIN_CMD procedure. ADMIN_CMD has support for both LOAD and IMPORT. Note that both commands require the loaded/imported file to be on the database server.

The example is taken from the DB2 Knowledge Center:

CALL SYSPROC.ADMIN_CMD('load from staff.del of del replace
 keepdictionary into SAMPLE.STAFF statistics use profile
 data buffer 8')
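
If you want to issue that CALL from Python, a minimal sketch with ibm_db could look like the following; the connection string is a placeholder, and staff.del has to be readable by the DB2 instance on the server.

import ibm_db

# Hypothetical connection details -- replace with your own.
conn = ibm_db.connect(
    "DATABASE=SAMPLE;HOSTNAME=db2host;PORT=50000;PROTOCOL=TCPIP;UID=user;PWD=secret;",
    "", "",
)

# ADMIN_CMD runs on the database server, so staff.del must live there.
ibm_db.exec_immediate(conn, """CALL SYSPROC.ADMIN_CMD('load from staff.del of del replace
 keepdictionary into SAMPLE.STAFF statistics use profile
 data buffer 8')""")

ibm_db.close(conn)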

CSV to DB2 with Python

Briefly: One solution is to use an SQLAlchemy adapter and Db2's External Tables.

SQLAlchemy:

The Engine is the starting point for any SQLAlchemy application. It's "home base" for the actual database and its DBAPI, delivered to the SQLAlchemy application through a connection pool and a Dialect, which describes how to talk to a specific kind of database/DBAPI combination.

[Diagram from the SQLAlchemy documentation: the Engine sits on top of a Pool and a Dialect, which connect to the DBAPI and the database]

Where above, an Engine references both a Dialect and a Pool, which together interpret the DBAPI's module functions as well as the behavior of the database.

Creating an engine is just a matter of issuing a single call, create_engine():

dialect+driver://username:password@host:port/database

Where dialect is a database name such as mysql, oracle, postgresql, etc., and driver is the name of a DBAPI, such as psycopg2, pyodbc, cx_oracle, etc.
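
For Db2 the dialect+driver pair is db2+ibm_db; the db2 dialect itself is provided by the ibm_db_sa adapter package. A minimal sketch, with hypothetical credentials and host:

from sqlalchemy import create_engine

# Hypothetical credentials/host; the db2+ibm_db dialect comes from ibm_db_sa.
engine = create_engine("db2+ibm_db://user:secret@db2host:50000/SAMPLE")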

Load data by using transient external table:

Transient external tables (TETs) provide a way to define an external table that exists only for the duration of a single query.

TETs have the same capabilities and limitations as normal external tables. A special feature of a TET is that you do not need to define the table schema when you use the TET to load data into a table or when you create the TET as the target of a SELECT statement.

Following is the syntax for a TET:

INSERT INTO <table> SELECT <column_list | *>
FROM EXTERNAL 'filename' [(table_schema_definition)]
[USING (external_table_options)];

CREATE EXTERNAL TABLE 'filename' [USING (external_table_options)]
AS select_statement;

SELECT <column_list | *> FROM EXTERNAL 'filename' (table_schema_definition)
[USING (external_table_options)];

For information about the values that you can specify for the external_table_options variable, see External table options.

General example

  • Insert data from a transient external table into the database table on the Db2 server by issuing the following command:
INSERT INTO EMPLOYEE SELECT * FROM external '/tmp/employee.dat' USING (delimiter ',' MAXERRORS 10 SOCKETBUFSIZE 30000 REMOTESOURCE 'JDBC' LOGDIR '/logs' )

Requirements

pip install ibm-db
pip install SQLAlchemy

Python code

The example below shows how it all works together.

from sqlalchemy import create_engine


usr = "enter_username"
pwd = "enter_password"
hst = "enter_host"
prt = "enter_port"
db = "enter_db_name"

#SQL Alchemy URL
conn_params = "db2+ibm_db://{0}:{1}@{2}:{3}/{4}".format(usr, pwd, hst, prt, db)

schema = "enter_name_restore_schema"
table = "enter_name_restore_table"
destination = "/path/to/csv/file_name.csv"

try:
    print("Connecting to DB...")
    engine = create_engine(conn_params)
    engine.connect()  # optional, output: DB2/linux...
    print("Successfully Connected!")
except Exception as e:
    print("Unable to connect to the server.")
    print(str(e))

external = """INSERT INTO {0}.{1} SELECT * FROM EXTERNAL '{2}' USING (CCSID 1208 DELIMITER ',' REMOTESOURCE LZ4 NOLOG TRUE )""".format(
    schema, table, destination
)

try:
    print("Restoring data to the server...")
    engine.execute(external)  # SQLAlchemy 1.x style; in 2.x use engine.connect() and conn.execute(text(...))
    print("Data restored successfully.")
except Exception as e:
    print("Unable to restore.")
    print(str(e))

Conclusion

  • A great solution for restoring large files; specifically, a 600 MB file worked without any problems.
  • It is also useful for copying data from one table/database to another: back up the source table by exporting it to CSV, then load that CSV into DB2 as in the example above (see the sketch after this list).
  • The SQLAlchemy Engine can be combined with other databases such as sqlite, mysql, postgresql, oracle, mssql, etc.
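
As a rough sketch of that copy workflow (schema, table and file names are hypothetical, and the engine is the one created in the example above), the same external-table options can be used to unload a table to CSV and reload it:

# Export the source table to a CSV file, then load that CSV into the target table.
export_sql = """CREATE EXTERNAL TABLE '/tmp/backup.csv'
USING (CCSID 1208 DELIMITER ',' REMOTESOURCE LZ4)
AS SELECT * FROM SRC_SCHEMA.SRC_TABLE"""

import_sql = """INSERT INTO DST_SCHEMA.DST_TABLE SELECT * FROM EXTERNAL '/tmp/backup.csv'
USING (CCSID 1208 DELIMITER ',' REMOTESOURCE LZ4)"""

engine.execute(export_sql)  # table -> CSV
engine.execute(import_sql)  # CSV -> table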

