
Pandas to_sql Trying to Index Nullable Column

I want to set up a job that dumps data into a SQL table every day, overwriting the existing data.

df.to_sql(table_name, engine, schema='dbo', 
          index=True, index_label='IdColumn', 
          if_exists='replace')

However, behind the scenes SQLAlchemy creates the table with IdColumn as a nullable VARCHAR(max), so SQL Server throws an error when it tries to create the index on that column.

It's pretty trivial to truncate the table before I write the data to it, but I feel like there should be a more elegant solution to this problem.

If you want to write the index to the SQL table as a normal column, you can call reset_index before the to_sql call:

df.reset_index().to_sql(table_name, engine, schema='dbo', index=False, if_exists='replace')

The only catch is the name of that column: if you want a custom one, you first have to set the index name ( df.index.name = 'IdColumn' ) or rename the column after the reset_index.
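Putting those two steps together, a minimal sketch (using an in-memory SQLite engine and a made-up table name as stand-ins for the real SQL Server connection and schema):

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical sample data; the real frame comes from the daily job.
df = pd.DataFrame({'value': [10, 20, 30]})

# Name the index so reset_index() turns it into a column called 'IdColumn'.
df.index.name = 'IdColumn'

# In-memory SQLite engine for illustration only; substitute your own
# engine (and pass schema='dbo' if you are on SQL Server).
engine = create_engine('sqlite://')
df.reset_index().to_sql('my_table', engine, index=False, if_exists='replace')
```

Reading the table back then shows IdColumn as an ordinary, fully populated column rather than a database index.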

Consider using the dtype argument, which takes a dictionary mapping DataFrame column names to SQLAlchemy data types. You can try VARCHAR:

import sqlalchemy

df.to_sql(table_name, engine, schema='dbo', 
          index=True, index_label='IdColumn', 
          if_exists='replace',
          dtype={'IdColumn': sqlalchemy.types.VARCHAR(length=255)})

or the generic String type, specifying a length:

from sqlalchemy.types import String

df.to_sql(table_name, engine, schema='dbo', 
          index=True, index_label='IdColumn', 
          if_exists='replace',
          dtype={'IdColumn': String(length=255)})
