Pyodbc: Errors Inserting pickled Python Model to MS SQL Server

Been trying to troubleshoot this one for a while, but running up against a lot of issues. Wondering if anyone has seen this before.

I'm trying to pickle a simple RandomForestClassifier (sklearn) in Python, and use pyodbc to save it to an MS SQL Server database. In particular, I am using an UPDATE statement because I'm updating a model that's been previously trained.

Here's the query I'm using:

RF_serialized = pickle.dumps(RF)

RF_serialized_ins = str(RF_serialized)[1 : ] # doing this to cut off the leading 'b' from 
                                             # Python's byte data, per suggestions from other answers

q = "UPDATE table \
    SET serializedModel = CONVERT(VARBINARY(MAX), {}) \
    WHERE IDa = {} AND \
            IDb = {} AND \
            IDc = {}".format(RF_serialized_ins, "x", "y", "z")

I keep getting an error, though, of the following nonspecific type:

pyodbc.ProgrammingError: ('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server]Syntax error, permission violation, or other nonspecific error (0) (SQLExecDirectW)')

Anybody run into this before? I am positive that the IDs and filters are correct, etc. The datatype of the target column is VARBINARY(MAX) . One idea: is the pickled object too big? The size of the object:

print("Type of python object:", type(RF_serialized))
print("The size of the pickled RF model is:", RF_serialized.__sizeof__())
Type of python object: <class 'bytes'>
The size of the pickled RF model is: 5487942
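
A quick way to rule out the size theory is sketched below. It assumes the documented VARBINARY(MAX) capacity of 2^31 - 1 bytes and uses a small stand-in payload instead of the real model; `VARBINARY_MAX_BYTES`, `payload_len`, `object_size` and `fits` are names made up for this illustration. Note that `__sizeof__()` reports the payload plus Python's object overhead, so `len()` is the better measure of what gets sent to the column.

```python
import pickle

# Stand-in payload; the real one would be pickle.dumps(RF).
payload = pickle.dumps(list(range(1000)))

# Documented capacity of a VARBINARY(MAX) column: 2**31 - 1 bytes.
VARBINARY_MAX_BYTES = 2**31 - 1

payload_len = len(payload)          # bytes actually written to the column
object_size = payload.__sizeof__()  # payload plus Python object overhead
fits = payload_len <= VARBINARY_MAX_BYTES

print("payload bytes:", payload_len)
print("fits in VARBINARY(MAX):", fits)
```

By the same comparison, the roughly 5.5 MB reported above is far below the limit, so size alone is unlikely to explain the error.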

Here is what ended up working (thanks to @Gord Thompson for getting me going in the right direction):

  1. Using parameterized queries with ? placeholders, which is the pyodbc convention, rather than Python's .format() . We ended up changing the query to something like:
q = "UPDATE table \
    SET serializedModel = CONVERT(VARBINARY(MAX), ?) \
    WHERE IDa = CONVERT(uniqueidentifier, ?) AND \
            IDb = CONVERT(uniqueidentifier, ?) AND \
            IDc = CONVERT(uniqueidentifier, ?)"

args = (RF_serialized,
        "x",
        "y",
        "z")

cursor.execute(q, args)
cnxn.commit()
  2. Using CONVERT(uniqueidentifier, ?) rather than trying to put in special characters for strings (eg \\' ) because SQL Server treats GUIDs/unique identifiers as a datatype.
  3. I had a couple of extra queries left over from testing and troubleshooting, and I think a stray .execute() on one of them was interfering with the query I was actually trying to fix.
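
For anyone who wants to see the difference without a SQL Server instance, here is a minimal sketch of the parameterized approach from the first point. It is not the exact code we ran: sqlite3 stands in for pyodbc only because both use the ? placeholder style, the `models` table and the `model` dict are made up for the illustration, and the CONVERT(...) calls are omitted since they are T-SQL specific. It also shows what the original str(...)[1 : ] trick put into the SQL text.

```python
import pickle
import sqlite3

# Stand-in for the trained RandomForestClassifier; any picklable object works.
model = {"n_estimators": 100, "classes": [0, 1]}
payload = pickle.dumps(model)

# What the original attempt spliced into the SQL text: a Python repr,
# quotes and backslash escapes included, which is not a T-SQL binary literal.
broken_literal = str(payload)[1:]

# sqlite3 is used only so the sketch runs without a server (assumption:
# the same pattern applies to a pyodbc cursor, which also uses "?").
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE models (IDa TEXT, serializedModel BLOB)")
cur.execute("INSERT INTO models (IDa, serializedModel) VALUES (?, NULL)", ("x",))

# The bytes object is passed as a parameter, never formatted into the string.
cur.execute(
    "UPDATE models SET serializedModel = ? WHERE IDa = ?",
    (payload, "x"),
)
conn.commit()

# Read the blob back and unpickle it to confirm the round trip.
cur.execute("SELECT serializedModel FROM models WHERE IDa = ?", ("x",))
stored = cur.fetchone()[0]
restored = pickle.loads(stored)
conn.close()
```

The point is that the driver transmits the raw bytes itself, so there is nothing to quote or escape by hand.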
