Using temporary tables with SQLAlchemy

Question

I am trying to use a temporary table with SQLAlchemy and join it against an existing table. This is what I have so far:

engine = db.get_engine(db.app, 'MY_DATABASE')
df = pd.DataFrame({"id": [1, 2, 3], "value": [100, 200, 300], "date": [date.today(), date.today(), date.today()]})
temp_table = db.Table('#temp_table',
                      db.Column('id', db.Integer),
                      db.Column('value', db.Integer),
                      db.Column('date', db.DateTime))
temp_table.create(engine)
df.to_sql(name='tempdb.dbo.#temp_table',
          con=engine,
          if_exists='append',
          index=False)
query = db.session.query(ExistingTable.id).join(temp_table, temp_table.c.id == ExistingTable.id)
out_df = pd.read_sql(query.statement, engine)
temp_table.drop(engine)
return out_df.to_dict('records')

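This returns no results because the insert statements issued by to_sql do not get run (I think this is because they are run using sp_prepexec, but I'm not entirely sure about that).

I then tried writing out the SQL statements by hand (CREATE TABLE #temp_table..., INSERT INTO #temp_table..., SELECT [id] FROM...) and running pd.read_sql(query, engine). I get the error message:

This result object does not return rows. It has been closed automatically.

I guess this is because the statement does more than just SELECT?

How can I fix this? Either solution would work, although the first would be preferable since it avoids hard-coded SQL. To be clear, I can't modify the schema in the existing database; it's a vendor database.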

Answer 1

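If the number of records to be inserted into the temp table is small or moderate, one possibility is to use a literal subquery or a VALUES CTE instead of creating a temporary table.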

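# MODEL
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class ExistingTable(Base):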
    __tablename__ = 'existing_table'
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)
    # ...

Assume the following data is what would have been inserted into the temp table:

import datetime

# This data is retrieved from another database and used for filtering
rows = [
    (1, 100, datetime.date(2017, 1, 1)),
    (3, 300, datetime.date(2017, 3, 1)),
    (5, 500, datetime.date(2017, 5, 1)),
]

Create a CTE or subquery containing that data:

stmts = [
    # NOTE: optimization to reduce the size of the statement:
    # cast types only in the first row; for the other rows the DB engine will infer them.
    # (SQLAlchemy 1.x list-style select(); in 1.4+/2.0 pass the columns positionally.)
    sa.select([
        sa.cast(sa.literal(i), sa.Integer).label("id"),
        sa.cast(sa.literal(v), sa.Integer).label("value"),
        sa.cast(sa.literal(d), sa.DateTime).label("date"),
    ]) if idx == 0 else
    sa.select([sa.literal(i), sa.literal(v), sa.literal(d)])  # no type cast

    for idx, (i, v, d) in enumerate(rows)
]
subquery = sa.union_all(*stmts)

# Choose one option below.
# I personally prefer B because one could reuse the CTE multiple times in the same query
# subquery = subquery.alias("temp_table")  # option A
subquery = subquery.cte(name="temp_table")  # option B

Create the final query with the required joins and filters:

query = (
    session
    .query(ExistingTable.id)
    .join(subquery, subquery.c.id == ExistingTable.id)
    # .filter(subquery.c.date >= XXX_DATE)
)

# TEMP: Test result output
for res in query:
    print(res)

Finally, get the pandas data frame:

out_df = pd.read_sql(query.statement, engine)
result = out_df.to_dict('records')
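
To make the shape of the resulting SQL easier to picture, here is a minimal sketch that builds a comparable statement by hand and runs it with Python's built-in sqlite3 module. It is an approximation, not the exact SQL that SQLAlchemy emits, and SQLite stands in for SQL Server, so dialect details will differ. The helper name build_cte_sql, the sample contents of existing_table, and the table aliases are assumptions made for illustration; only the rows and the temp_table name come from the answer above.

```python
import datetime
import sqlite3

rows = [
    (1, 100, datetime.date(2017, 1, 1)),
    (3, 300, datetime.date(2017, 3, 1)),
    (5, 500, datetime.date(2017, 5, 1)),
]


def build_cte_sql(n_rows):
    """Build a UNION ALL CTE of n_rows literal rows joined to existing_table.

    Hypothetical helper; expects n_rows >= 1.
    """
    first = "SELECT ? AS id, ? AS value, ? AS date"  # only the first row names the columns
    rest = "SELECT ?, ?, ?"
    cte_body = " UNION ALL ".join([first] + [rest] * (n_rows - 1))
    return (
        f"WITH temp_table AS ({cte_body}) "
        "SELECT e.id FROM existing_table AS e "
        "JOIN temp_table AS t ON t.id = e.id "
        "ORDER BY e.id"
    )


conn = sqlite3.connect(":memory:")
# Stand-in for the vendor table, with made-up contents.
conn.execute("CREATE TABLE existing_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO existing_table (id, name) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)

# Flatten the rows into one parameter list; dates are passed as ISO strings.
params = [x for (i, v, d) in rows for x in (i, v, d.isoformat())]
matched_ids = [r[0] for r in conn.execute(build_cte_sql(len(rows)), params)]
print(matched_ids)
```

Because every value travels as a bound parameter, the statement grows with the number of rows, which is why this approach suits small or moderate row counts.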

Answer 2

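You could try another approach: a Process-Keyed Table.

A process-keyed table is simply a permanent table that serves as a temp table. To permit processes to use the table simultaneously, the table has an extra column that identifies the process. The simplest way to do this is the global variable @@spid (@@spid is the process id in SQL Server).

...

An alternative for the process key is to use a GUID (data type uniqueidentifier).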

http://www.sommarskog.se/share_data.html#prockeyed
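
As a rough illustration of the idea, the sketch below uses Python's built-in sqlite3 and uuid modules, with SQLite standing in for SQL Server. It takes the GUID variant of the process key rather than @@spid. The table name keyed_rows, the column process_key, the function ids_matching, and the sample data are all hypothetical names chosen for this example.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")

# Stand-in for the existing table, with made-up contents.
conn.execute("CREATE TABLE existing_table (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO existing_table (id) VALUES (?)", [(1,), (2,), (3,)])

# The permanent, shared table: process_key tells each caller's rows apart.
conn.execute(
    "CREATE TABLE keyed_rows ("
    "process_key TEXT NOT NULL, id INTEGER NOT NULL, value INTEGER)"
)


def ids_matching(conn, rows):
    """Stage rows under a fresh key, join on that key, then clean up."""
    key = str(uuid.uuid4())  # GUID used as the process key
    try:
        conn.executemany(
            "INSERT INTO keyed_rows (process_key, id, value) VALUES (?, ?, ?)",
            [(key, i, v) for (i, v) in rows],
        )
        cur = conn.execute(
            "SELECT e.id FROM existing_table AS e "
            "JOIN keyed_rows AS k ON k.id = e.id "
            "WHERE k.process_key = ? ORDER BY e.id",
            (key,),
        )
        return [r[0] for r in cur]
    finally:
        # Remove only this caller's rows; other keys are left untouched.
        conn.execute("DELETE FROM keyed_rows WHERE process_key = ?", (key,))


first = ids_matching(conn, [(1, 100), (3, 300), (5, 500)])
second = ids_matching(conn, [(2, 200)])
leftover = conn.execute("SELECT COUNT(*) FROM keyed_rows").fetchone()[0]
print(first, second, leftover)
```

Every statement filters on the key, so concurrent callers never see each other's rows, and the cleanup in the finally block stands in for the automatic disposal a real temp table would provide.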

2 answers

Answer 1: score 12, accepted, 2017-05-31 22:12:26

Answer 2: score 1, 2017-05-31 13:46:32