
pandas dataframe to aws postgresql using sqlalchemy not injecting data to table

I am running this code:

import sqlalchemy

engine = sqlalchemy.create_engine('postgresql://user:pass@aws_db_endpoint:5432/db')
print(bool(engine)) # <- just to keep track of the process

with engine.connect() as conn:
    print(bool(conn)) # <- just to keep track of the process

    df.to_sql('mytable', schema='public', con=conn, if_exists='append')
    print("end") # <- just to keep track of the process

I get the true, true and end, which means the connection is made and df.to_sql is executed.

The problem is that the table in AWS PostgreSQL still has no data at all.

What am I doing wrong here?


Thanks

Using engine.connect like this requires, I think, that you call commit() explicitly. You can see the explanation towards the end of the Basic Usage section of the SQLAlchemy docs:

When the connection is returned to the pool for re-use, the pooling mechanism issues a rollback() call on the DBAPI connection so that any transactional state or locks are removed, and the connection is ready for its next use.

So in this case you need to call conn.commit(). Alternatively, if you change your usage to engine.begin(), the commit is issued automatically unless an exception occurs:

# Commit is called unless an exception occurs.
with engine.begin() as conn:
    print(bool(conn)) # <- just to keep track of the process

    df.to_sql('mytable', schema='public', con=conn, if_exists='append')
    print("end") # <- just to keep track of the process

Example

Here is an example based on the pandas docs. This script runs against an empty database: Base.metadata.create_all(engine) creates the table in the db first, so pandas is forced to append.

import sys
from sqlalchemy import (
    create_engine,
    Integer,
    String,
)
from sqlalchemy.schema import (
    Column,
)
from sqlalchemy.sql import select
from sqlalchemy.orm import declarative_base
import pandas as pd


Base = declarative_base()


username, password, db = sys.argv[1:4]


engine = create_engine(f"postgresql+psycopg2://{username}:{password}@/{db}", echo=False)


class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String(8), index=True)


Base.metadata.create_all(engine)


with engine.begin() as conn:
    df = pd.DataFrame({
        'name': ['User 1', 'User 2', 'User 3']
    })

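    # index_label='id' writes the DataFrame's default RangeIndex (0, 1, 2)
    # into the id primary-key column instead of discarding it.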
    df.to_sql('users', schema="public", con=conn, if_exists='append', index_label='id')


with engine.begin() as conn:
    for user in conn.execute(select(User)).all():
        print(user.name)
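
To run this sketch, the credentials come from the command line, e.g. python users_example.py myuser mypass mydb (users_example.py is a hypothetical file name; since the URL gives no host, psycopg2 connects to a local PostgreSQL instance by default).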
