Pandas ignores dtype in function to_sql, giving TypeError: expecting string or bytes object

I'm processing data from an Excel spreadsheet and uploading it to Oracle. Pandas fails on the to_sql call in one edge case: I have a column, COMMENTS, which usually contains strings, but in one row a user typed a number (e.g. 500).

I am forcing the column to be treated as a string using the dtype argument, so I would expect the number to be uploaded to the Oracle table as a VARCHAR:

dataWrite.to_sql(name = 'FB_DATA_HOURLY', schema = 'schemaName', index = False, con = conWrite, if_exists='append', dtype={'COMMENTS': VARCHAR(length=200)})

However, a TypeError is still raised when it tries to upload this number to the VARCHAR column:

TypeError                                 Traceback (most recent call last)
<ipython-input-99-1864db124148> in readExcel_05142019(version, filepath)
     72     conWrite = oracle_db.connect()
---> 73     dataWrite.to_sql(name = 'FB_DATA_HOURLY', schema = 'schemaName', index = False, con = conWrite, if_exists='append', dtype={'COMMENTS': VARCHAR(length=200)})
     74 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in to_sql(self, name, con, schema, if_exists, index, index_label, chunksize, dtype, method)
   2710             chunksize=chunksize,
   2711             dtype=dtype,
-> 2712             method=method,
   2713         )
   2714 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\sql.py in to_sql(frame, name, con, schema, if_exists, index, index_label, chunksize, dtype, method)
    516         chunksize=chunksize,
    517         dtype=dtype,
--> 518         method=method,
    519     )
    520 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\sql.py in to_sql(self, frame, name, if_exists, index, index_label, schema, chunksize, dtype, method)
   1318         )
   1319         table.create()
-> 1320         table.insert(chunksize, method=method)
   1321         if not name.isdigit() and not name.islower():
   1322             # check for potentially case sensitivity issues (GH7815)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\sql.py in insert(self, chunksize, method)
    754 
    755                 chunk_iter = zip(*[arr[start_i:end_i] for arr in data_list])
--> 756                 exec_insert(conn, keys, chunk_iter)
    757 
    758     def _query_iterator(

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\sql.py in _execute_insert(self, conn, keys, data_iter)
    668         """
    669         data = [dict(zip(keys, row)) for row in data_iter]
--> 670         conn.execute(self.table.insert(), data)
    671 
    672     def _execute_insert_multi(self, conn, keys, data_iter):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\engine\base.py in execute(self, object_, *multiparams, **params)
    980             raise exc.ObjectNotExecutableError(object_)
    981         else:
--> 982             return meth(self, multiparams, params)
    983 
    984     def _execute_function(self, func, multiparams, params):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\sql\elements.py in _execute_on_connection(self, connection, multiparams, params)
    285     def _execute_on_connection(self, connection, multiparams, params):
    286         if self.supports_execution:
--> 287             return connection._execute_clauseelement(self, multiparams, params)
    288         else:
    289             raise exc.ObjectNotExecutableError(self)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\engine\base.py in _execute_clauseelement(self, elem, multiparams, params)
   1099             distilled_params,
   1100             compiled_sql,
-> 1101             distilled_params,
   1102         )
   1103         if self._has_events or self.engine._has_events:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\engine\base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
   1248         except BaseException as e:
   1249             self._handle_dbapi_exception(
-> 1250                 e, statement, parameters, cursor, context
   1251             )
   1252 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\engine\base.py in _handle_dbapi_exception(self, e, statement, parameters, cursor, context)
   1476                 util.raise_from_cause(sqlalchemy_exception, exc_info)
   1477             else:
-> 1478                 util.reraise(*exc_info)
   1479 
   1480         finally:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\util\compat.py in reraise(tp, value, tb, cause)
    151         if value.__traceback__ is not tb:
    152             raise value.with_traceback(tb)
--> 153         raise value
    154 
    155     def u(s):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\engine\base.py in _execute_context(self, dialect, constructor, statement, parameters, *args)
   1224                 if not evt_handled:
   1225                     self.dialect.do_executemany(
-> 1226                         cursor, statement, parameters, context
   1227                     )
   1228             elif not parameters and context.no_parameters:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sqlalchemy\dialects\oracle\cx_oracle.py in do_executemany(self, cursor, statement, parameters, context)
   1126         if isinstance(parameters, tuple):
   1127             parameters = list(parameters)
-> 1128         cursor.executemany(statement, parameters)
   1129 
   1130     def do_begin_twophase(self, connection, xid):

TypeError: expecting string or bytes object
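
As far as I understand, the dtype argument of to_sql only sets the SQL column type used for the table definition; it does not cast the values in the DataFrame, so the integer 500 is still handed to the driver as an integer. One way to confirm that the column holds mixed Python types is a quick check like the sketch below. The sample DataFrame is a hypothetical stand-in for the spreadsheet data, not the real table.

```python
import pandas as pd

# Hypothetical sample standing in for the data read from Excel
dataWrite = pd.DataFrame({"COMMENTS": ["late start", 500, None, "ok"]})

# Count the Python types actually stored in the object column
type_counts = dataWrite["COMMENTS"].map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# Rows whose value is neither a string nor null are the ones likely to break the insert
is_str = dataWrite["COMMENTS"].map(lambda v: isinstance(v, str))
offenders = dataWrite[~is_str & dataWrite["COMMENTS"].notnull()]
print(offenders)
```

Any row listed in offenders would need to be converted to a string before calling to_sql.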

I was able to resolve this problem by converting the values in the column from generic objects to strings, while making sure to leave null values as nulls instead of the string "nan".

# Convert every non-null value (including purely numeric ones) to a string, while preserving nulls
dataWrite['COMMENTS'] = dataWrite['COMMENTS'].where(dataWrite['COMMENTS'].isnull(), dataWrite['COMMENTS'].astype(str))
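
To see the effect without an Oracle connection, the same conversion can be tried on a small frame. This is only a sketch: the helper name stringify_preserving_nulls and the sample data are made up for illustration.

```python
import numpy as np
import pandas as pd


def stringify_preserving_nulls(series):
    """Return a copy where non-null values are strings and nulls stay null."""
    # where() keeps the original value where the condition is True (the nulls)
    # and takes the string-converted value everywhere else
    return series.where(series.isnull(), series.astype(str))


# Hypothetical sample: a string, a number typed by a user, and an empty cell
sample = pd.DataFrame({"COMMENTS": ["late start", 500, np.nan]})
sample["COMMENTS"] = stringify_preserving_nulls(sample["COMMENTS"])

print(sample["COMMENTS"].tolist())
print(sample["COMMENTS"].isnull().tolist())
```

Calling astype(str) on its own would turn the empty cell into the text "nan", which is why the where() wrapper is used here.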
