
How to do SQL Server transactions with Python through SQLAlchemy without blocking access to db tables in SQL Server Management Studio?

I've been stuck on this for a while now and haven't been able to find the right answer/topic on the internet.

Basically, I have a table in SQL Server that I'm trying to 'replace' with an updated version held in a pandas dataframe. I need a transaction for this task so that the original table isn't lost if something goes wrong while transferring data from the dataframe (rollback functionality). I found a solution for this - the SQLAlchemy library. My code for this:

engine = create_engine("mssql+pyodbc://server_name:password@user/database?driver=SQL+Server+Native+Client+11.0")
# engine.begin() opens a transaction that commits on success and rolls back on any exception
with engine.begin() as conn:
    df.to_sql(name='table_name', schema='db_schema', con=conn, if_exists='replace', index=False)

The problem occurs when I try to access tables in this specific database through SQL Server Management Studio 18 while the transaction is running: it somehow blocks the whole database, and nobody can access any of its tables (lock wait timeouts are exceeded). The code above works; I've tried transferring only a small chunk of the dataframe, but the problem still persists, because I ultimately need to transfer a large dataframe.
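
For what it's worth, part of the explanation seems to be what if_exists='replace' actually does. A minimal sketch, assuming the same connection URL as above, that turns on SQLAlchemy's statement logging:

engine = create_engine(
    "mssql+pyodbc://server_name:password@user/database?driver=SQL+Server+Native+Client+11.0",
    echo=True  # log every statement SQLAlchemy emits
)
with engine.begin() as conn:
    # the log shows DROP TABLE + CREATE TABLE + INSERTs, all inside this one
    # transaction; in SQL Server that DDL holds schema locks until COMMIT
    df.to_sql(name='table_name', schema='db_schema', con=conn, if_exists='replace', index=False)

Since those schema locks are held for the entire load, that would explain why other sessions hang until the whole transaction finishes.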

What I've tried:

  1. The concept of isolation levels, but this isn't the right tool here, since isolation levels only govern how a connection reads data that another transaction is already modifying. Example:

    engine = create_engine("mssql+pyodbc://server_name:password@user/database?driver=SQL+Server+Native+Client+11.0", isolation_level="SERIALIZABLE")

  2. Adjusting parameters such as pool_size and max_overflow in the create_engine() call and chunksize in the df.to_sql() call, but they don't seem to have any effect. Example:

    engine = create_engine("mssql+pyodbc://server_name:password@user/database?driver=SQL+Server+Native+Client+11.0", pool_size = 1, max_overflow = 0)

    with engine.begin() as conn:
        df.to_sql(name='table_name', schema='db_schema', con=conn, if_exists='replace', chunksize=1, index=False)

  3. Excluding the schema parameter from the df.to_sql() call doesn't work either.
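
While testing the attempts above, it can help to see what is actually blocked and by whom. A sketch, run from a separate connection while the load is in progress (sys.dm_exec_requests is a standard SQL Server DMV; querying it needs VIEW SERVER STATE permission):

with engine.connect() as conn:
    # list sessions that are currently waiting on another session's locks
    blocked = conn.exec_driver_sql(
        "SELECT session_id, blocking_session_id, wait_type, wait_resource "
        "FROM sys.dm_exec_requests WHERE blocking_session_id <> 0"
    )
    for row in blocked:
        print(row)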

The basic SQL code and functionality I'm trying to achieve for this task would look something like this:

BEGIN TRANSACTION
BEGIN TRY

    DELETE FROM [db].[schema].[table];
    INSERT INTO [db].[schema].[table] <--- dataframe
    COMMIT TRANSACTION

END TRY
BEGIN CATCH

    ROLLBACK TRANSACTION
    SELECT ERROR_NUMBER() AS [Error_number], ERROR_MESSAGE() AS [Error_description]

END CATCH
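
For reference, the TRY/CATCH pattern above maps onto SQLAlchemy fairly directly: engine.begin() commits on success and rolls back automatically when an exception propagates, so the CATCH branch becomes an ordinary Python except block. A sketch, using the same placeholder names as above:

try:
    with engine.begin() as conn:
        conn.exec_driver_sql("DELETE FROM [db].[schema].[table]")
        df.to_sql(name='table_name', schema='db_schema', con=conn, if_exists='append', index=False)
except Exception as exc:
    # by this point engine.begin() has already rolled the transaction back,
    # so this is just the error reporting from the CATCH block
    print(f"Transaction rolled back: {exc}")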

I could create another buffer (staging) table, load the dataframe data into it, and then do a transaction that swaps this table in afterwards, but I'm looking for a solution that bypasses these extra steps. A sketch of that approach follows below anyway.
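
The buffer-table approach would look roughly like this (a sketch; 'table_name_staging' is a hypothetical name). The long load runs outside any transaction, so only the final swap briefly blocks readers:

# load into a staging table first; no long-running transaction is held here
df.to_sql(name='table_name_staging', schema='db_schema', con=engine, if_exists='replace', index=False)

# then swap the tables in one short transaction
with engine.begin() as conn:
    conn.exec_driver_sql("DROP TABLE [db_schema].[table_name]")
    conn.exec_driver_sql("EXEC sp_rename '[db_schema].[table_name_staging]', 'table_name'")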

If there is a better way to do this task, please let me know as well.

As suggested by @GordThompson, the right solution, given that the db table already exists, is as follows:

engine = create_engine("mssql+pyodbc://server_name:password@user/database?driver=SQL+Server+Native+Client+11.0")
# start transaction
with engine.begin() as conn:
    # clean out the existing table (rolled back automatically if anything below fails)
    conn.exec_driver_sql("TRUNCATE TABLE [db].[schema].[table]")
    # append data from df into the existing table
    df.to_sql(name='table_name', schema='schema_name', con=conn, if_exists='append', index=False)
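
One follow-up note: even with TRUNCATE + append, other sessions can still wait while the transaction is open, so shortening the insert itself helps. The mssql+pyodbc dialect supports fast_executemany=True as a create_engine option, which batches the INSERT parameter sets at the driver level (it works best with a newer ODBC driver such as ODBC Driver 17 for SQL Server). A sketch:

engine = create_engine(
    "mssql+pyodbc://server_name:password@user/database?driver=SQL+Server+Native+Client+11.0",
    fast_executemany=True  # pyodbc sends the rows in large batches, shortening the transaction
)
with engine.begin() as conn:
    conn.exec_driver_sql("TRUNCATE TABLE [db].[schema].[table]")
    df.to_sql(name='table_name', schema='schema_name', con=conn, if_exists='append', index=False)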
