
Create a temporary table in MySQL using Pandas

Pandas has a great feature: you can write a DataFrame to a table in SQL.

df.to_sql(con=cnx, name='some_table_name', if_exists='replace', flavor='mysql', index=False)

Is there a way to make a temporary table this way?

As far as I can tell, there is nothing about this in the documentation.

DataFrame.to_sql() uses the pandas.io.sql module built into pandas, which itself relies on SQLAlchemy as a database abstraction layer. To create a "temporary" table in SQLAlchemy, you need to supply a prefix:

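from sqlalchemy import Column, Integer, MetaData, Table

metadata = MetaData()

t = Table(
    't', metadata,
    Column('id', Integer, primary_key=True),
    # ...
    prefixes=['TEMPORARY'],  # emitted between CREATE and TABLE
)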
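To see what the prefix does without touching a database, the DDL can be rendered as a string. This is a minimal sketch using SQLAlchemy's generic compiler; the table name and single column are just the placeholders from the snippet above, and a specific dialect may render the types differently.

```python
from sqlalchemy import Column, Integer, MetaData, Table
from sqlalchemy.schema import CreateTable

metadata = MetaData()
t = Table(
    "t", metadata,
    Column("id", Integer, primary_key=True),
    prefixes=["TEMPORARY"],
)

# Render the CREATE statement with the default (generic) compiler.
ddl = str(CreateTable(t)).strip()
print(ddl)
```

The printed statement begins with CREATE TEMPORARY TABLE, which is what the prefixes argument is for.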

From what I can see, pandas.io.sql does not let you specify the prefixes or easily change the way tables are created.

One way to approach this problem would be to create the temporary table beforehand and then call to_sql() with if_exists="append", all on the same database connection.
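As a rough sketch of that approach, the snippet below creates the temporary table with raw SQL and then appends to it through the same connection. It uses an in-memory SQLite database as a stand-in so that it runs without a server; the table name tmp_scores, its columns, and the sample data are made up for illustration. For MySQL you would pass a MySQL connection URL to create_engine(), and the column types may need adjusting.

```python
import pandas as pd
import sqlalchemy as sa

# Assumption: in-memory SQLite stands in for MySQL so the sketch is runnable.
engine = sa.create_engine("sqlite://")
df = pd.DataFrame({"id": [1, 2, 3], "score": [0.5, 0.7, 0.9]})

# Temporary tables are visible only to the connection that created them,
# so the CREATE, the to_sql() call, and later queries share one connection.
with engine.begin() as conn:
    conn.execute(sa.text(
        "CREATE TEMPORARY TABLE tmp_scores (id INTEGER, score FLOAT)"
    ))
    df.to_sql("tmp_scores", conn, if_exists="append", index=False)
    row_count = conn.execute(
        sa.text("SELECT COUNT(*) FROM tmp_scores")
    ).scalar()

print(row_count)
```

Passing the Connection object (rather than the Engine) to to_sql() is the important part: with an Engine, pandas may check out a different connection that cannot see the temporary table.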


Here is also what I tried: overriding the _create_table_setup() method of pandas.io.sql.SQLTable to pass the prefixes to the Table constructor. For some reason, the table was still created as a non-temporary one. Not sure if it will help, but here is the code I was using: gist. It is kind of hacky, but I hope it at least serves as example code to get you started on this approach.

This may be a bit hacky, and it doesn't technically create a temporary table (it just acts like one), but you could use the @contextmanager decorator from contextlib to create the table when the context opens and drop it when the context closes. It could look something like this:

from contextlib import contextmanager

import numpy as np
import sqlalchemy as sqla
import pandas as pd


@contextmanager
def temp_table(frame, tbl, eng, *args, **kwargs):
    # Create the table on entry; drop it on exit, even if the body raises.
    frame.to_sql(tbl, eng, *args, **kwargs)
    try:
        yield
    finally:
        # eng.begin() commits the DROP when the block exits.
        with eng.begin() as conn:
            conn.execute(sqla.text('DROP TABLE {}'.format(tbl)))

df = pd.DataFrame(np.random.randint(21, size=(10, 10)))
cnx = sqla.create_engine(conn_string)  # conn_string: your database URL

with temp_table(df, 'some_table_name', cnx, if_exists='replace', index=False):
    pass  # do stuff with "some_table_name"

I tested it using Teradata and it works fine. I don't have a MySQL instance lying around to test it on, but as long as DROP statements work in MySQL, it should work as intended.

Easy workaround without fancy magic

This was a quick and easy workaround for me.

Simply apply a regex to the generated SQL to add whatever statements you want.

import io
import re

import pandas as pd

# Get the CREATE TABLE statement that pandas would generate for df
create_table_sql = pd.io.sql.get_schema(df, tmp_table_name)

# Replace the leading `CREATE TABLE` of the generated statement with
# whatever you need.
create_tmp_table_sql = re.sub(
    r"^\s*CREATE TABLE",
    "CREATE TEMP TABLE",
    create_table_sql
)

# Write to the database in a transaction (psycopg2)
with conn.cursor() as cur:
    cur.execute(create_tmp_table_sql)
    # Dump the frame to an in-memory TSV buffer and bulk-load it with COPY
    output = io.StringIO()
    df.to_csv(output, sep="\t", header=False, index=False, na_rep="NULL")
    output.seek(0)
    cur.copy_from(output, tmp_table_name, null="NULL")

Credit to Aseem for a fast way to write to Postgres.
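Note that the snippet above targets Postgres, which accepts TEMP as shorthand; MySQL spells the keyword TEMPORARY. A small helper along these lines could make the rewrite reusable for either database. The function name make_temporary and the sample statement are hypothetical, and the quoting and column types that pandas generates may still need adjusting for MySQL.

```python
import re


def make_temporary(create_table_sql, keyword="TEMPORARY"):
    """Rewrite a leading CREATE TABLE into CREATE <keyword> TABLE."""
    new_sql, count = re.subn(
        r"^\s*CREATE\s+TABLE\b",
        "CREATE {} TABLE".format(keyword),
        create_table_sql,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        # Fail loudly instead of silently passing the statement through.
        raise ValueError("statement does not start with CREATE TABLE")
    return new_sql


# Hypothetical statement, similar in shape to what get_schema() returns.
example = 'CREATE TABLE "scores" (\n"id" INTEGER,\n  "score" REAL\n)'
print(make_temporary(example))
```

Raising on a non-matching statement avoids the case where an unexpected input is executed unchanged and creates a permanent table.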
