简体   繁体   English

如何使 Tornado 中的 SQLAlchemy 异步?

[英]How to make SQLAlchemy in Tornado to be async?

How to make SQLAlchemy in Tornado to be async ?如何使Tornado SQLAlchemy async I found example for MongoDB on async mongo example but I couldn't find anything like motor for SQLAlchemy .我在async mongo 示例上找到了 MongoDB 的示例,但我找不到任何类似于motor for SQLAlchemy东西。 Does anyone know how to make SQLAlchemy queries to execute with tornado.gen ( I am using MySQL below SQLAlchemy , at the moment my handlers reads from database and return result, I would like to make this async).有谁知道如何使用tornado.gen执行SQLAlchemy查询(我在SQLAlchemy使用MySQL ,目前我的处理程序从数据库读取并返回结果,我想让这个异步)。

ORMs are poorly suited for explicit asynchronous programming, that is, where the programmer must produce explicit callbacks anytime something that uses network access occurs. ORM 不太适合显式异步编程,也就是说,程序员必须在任何使用网络访问的情况发生时生成显式回调。 A primary reason for this is that ORMs make extensive use of the lazy loading pattern, which is more or less incompatible with explicit async.一个主要原因是 ORM 广泛使用延迟加载模式,这或多或少与显式异步不兼容。 Code that looks like this:看起来像这样的代码:

user = Session.query(User).first()
print user.addresses

will actually emit two separate queries - one when you say first() to load a row, and the next when you say user.addresses , in the case that the .addresses collection isn't already present, or has been expired.实际上会发出两个单独的查询-一个当你说first()来加载一排,而下时,你说user.addresses ,在的情况下,该.addresses集已不存在,或者已经过期。 Essentially, nearly every line of code that deals with ORM constructs might block on IO, so you'd be in extensive callback spaghetti within seconds - and to make matters worse, the vast majority of those code lines won't actually block on IO, so all the overhead of connecting callbacks together for what would otherwise be simple attribute access operations will make your program vastly less efficient too.从本质上讲,几乎每一行处理 ORM 结构的代码都可能在 IO 上阻塞,所以你会在几秒钟内陷入大量的回调意大利面——更糟糕的是,这些代码行中的绝大多数实际上不会在 IO 上阻塞,因此,将回调连接在一起的所有开销都将是简单的属性访问操作,这也将使您的程序效率大大降低。

A major issue with explicit asynchronous models is that they add tremendous Python function call overhead to complex systems - not just on the user-facing side like you get with lazy loading, but on the internal side as well regarding how the system provides abstraction around the Python database API (DBAPI).显式异步模型的一个主要问题是它们为复杂系统增加了巨大的 Python 函数调用开销 - 不仅在面向用户的方面(如延迟加载),而且在内部方面,以及系统如何围绕系统提供抽象Python 数据库 API (DBAPI)。 For SQLAlchemy to even have basic async support would impose a severe performance penalty on the vast majority of programs that don't use async patterns, and even those async programs that are not highly concurrent.对于 SQLAlchemy 来说,即使拥有基本的异步支持,也会对绝大多数不使用异步模式的程序,甚至那些非高度并发的异步程序造成严重的性能损失。 Consider SQLAlchemy, or any other ORM or abstraction layer, might have code like the following:考虑 SQLAlchemy 或任何其他 ORM 或抽象层,可能具有如下代码:

def execute(connection, statement):
     cursor = connection.cursor()
     cursor.execute(statement)
     results = cursor.fetchall()
     cursor.close()
     return results

The above code performs what seems to be a simple operation, executing a SQL statement on a connection.上面的代码执行看似简单的操作,即在连接上执行 SQL 语句。 But using a fully async DBAPI like psycopg2's async extension, the above code blocks on IO at least three times.但是使用像 psycopg2 的异步扩展这样的完全异步 DBAPI,上面的代码在 IO 上至少阻塞了 3 次。 So to write the above code in explicit async style, even when there's no async engine in use and the callbacks aren't actually blocking, means the above outer function call becomes at least three function calls, instead of one, not including the overhead imposed by the explicit asynchronous system or the DBAPI calls themselves.因此,以显式异步风格编写上述代码,即使没有使用异步引擎并且回调实际上没有阻塞,也意味着上述外部函数调用至少变为三个函数调用,而不是一个,不包括施加的开销通过显式异步系统或 DBAPI 调用自身。 So a simple application is automatically given a penalty of 3x the function call overhead surrounding a simple abstraction around statement execution.因此,围绕语句执行的简单抽象,一个简单的应用程序会自动受到 3 倍的函数调用开销的惩罚。 And in Python, function call overhead is everything .而在 Python 中, 函数调用开销就是一切

For these reasons, I continue to be less than excited about the hype surrounding explicit async systems, at least to the degree that some folks seem to want to go all async for everything, like delivering web pages (see node.js).由于这些原因,我对围绕显式异步系统的炒作仍然不那么兴奋,至少在某种程度上,有些人似乎想要对所有事情都采用异步,比如交付网页(参见 node.js)。 I'd recommend using implicit async systems instead, most notably gevent , where you get all the non-blocking IO benefits of an asynchronous model and none of the structural verbosity/downsides of explicit callbacks.我建议改用隐式异步系统,最显着的是gevent ,在那里您可以获得异步模型的所有非阻塞 IO 优点,并且没有显式回调的结构冗长/缺点。 I continue to try to understand use cases for these two approaches, so I'm puzzled by the appeal of the explicit async approach as a solution to all problems, ie as you see with node.js - we're using scripting languages in the first place to cut down on verbosity and code complexity, and explicit async for simple things like delivering web pages seems to do nothing but add boilerplate that can just as well be automated by gevent or similar, if blocking IO is even such a problem in a case like that (plenty of high volume websites do fine with a synchronous IO model).我继续尝试理解这两种方法的用例,所以我对显式异步方法作为所有问题的解决方案的吸引力感到困惑,即正如您在 node.js 中看到的那样 - 我们在第一个减少冗长和代码复杂性的地方,对于像交付网页这样的简单事情的显式异步似乎什么都不做,只是添加可以由 gevent 或类似工具自动化的样板,如果阻塞 IO 在一个像这样的情况(很多大容量网站都可以使用同步 IO 模型)。 Gevent-based systems are production proven and their popularity is growing, so if you like the code automation that ORMs provide, you might also want to embrace the async-IO-scheduling automation that a system like gevent provides.基于 Gevent 的系统已在生产环境中得到验证,并且它们的受欢迎程度正在增长,因此如果您喜欢 ORM 提供的代码自动化,您可能还希望采用像 gevent 这样的系统提供的异步 IO 调度自动化。

Update : Nick Coghlan pointed out his great article on the subject of explicit vs. implicit async which is also a must read here.更新:Nick Coghlan 指出了他关于显式与隐式异步主题的精彩文章,这也是这里必须阅读的内容。 And I've also been updated to the fact that pep-3156 now welcomes interoperability with gevent , reversing its previously stated disinterest in gevent, largely thanks to Nick's article.而且我还了解到, pep-3156 现在欢迎与 gevent 的互操作性,扭转了先前声明的对 gevent 不感兴趣的事实,这在很大程度上要归功于 Nick 的文章。 So in the future I would recommend a hybrid of Tornado using gevent for the database logic, once the system of integrating these approaches is available.因此,将来,一旦集成这些方法的系统可用,我将推荐使用 gevent 作为数据库逻辑的 Tornado 的混合体。

I had this same issue in the past and I couldn't find a reliable Async-MySQL library.我过去遇到过同样的问题,但找不到可靠的 Async-MySQL 库。 However there is a cool solution using Asyncio + Postgres .但是,使用Asyncio + Postgres有一个很酷的解决方案。 You just need to use the aiopg library, which comes with SQLAlchemy support out of the box:你只需要使用aiopg库,它带有开箱即用的 SQLAlchemy 支持:

import asyncio
from aiopg.sa import create_engine
import sqlalchemy as sa

metadata = sa.MetaData()

tbl = sa.Table('tbl', metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('val', sa.String(255)))

async def create_table(engine):
    async with engine.acquire() as conn:
        await conn.execute('DROP TABLE IF EXISTS tbl')
        await conn.execute('''CREATE TABLE tbl (
                                  id serial PRIMARY KEY,
                                  val varchar(255))''')

async def go():
    async with create_engine(user='aiopg',
                             database='aiopg',
                             host='127.0.0.1',
                             password='passwd') as engine:

        async with engine.acquire() as conn:
            await conn.execute(tbl.insert().values(val='abc'))

            async for row in conn.execute(tbl.select()):
                print(row.id, row.val)

loop = asyncio.get_event_loop()
loop.run_until_complete(go())

Updated as mentioned by @cglacet如@cglacet所述更新

Not tornado, but we sort of made SQLAlchemy async in asyncio in the GINO project :不是龙卷风,但是我们做排序异步的SQLAlchemy在ASYNCIO GINO项目

import asyncio
from gino import Gino, enable_task_local
from sqlalchemy import Column, Integer, Unicode, cast

db = Gino()


class User(db.Model):
    __tablename__ = 'users'

    id = Column(Integer(), primary_key=True)
    nickname = Column(Unicode(), default='noname')


async def main():
    await db.create_pool('postgresql://localhost/gino')

    # Create object, `id` is assigned by database
    u1 = await User.create(nickname='fantix')
    print(u1.id, u1.nickname)  # 1 fantix

    # Retrieve the same row, as a different object
    u2 = await User.get(u1.id)
    print(u2.nickname)  # fantix

    # Update affects only database row and the operating object
    await u2.update(nickname='daisy')
    print(u2.nickname)  # daisy
    print(u1.nickname)  # fantix

    # Returns all user objects with "d" in their nicknames
    users = await User.query.where(User.nickname.contains('d')).gino.all()

    # Find one user object, None if not found
    user = await User.query.where(User.nickname == 'daisy').gino.first()

    # Execute complex statement and return command status
    status = await User.update.values(
        nickname='No.' + cast(User.id, Unicode),
    ).where(
        User.id > 10,
    ).gino.status()

    # Iterate over the results of a large query in a transaction as required
    async with db.transaction():
        async for u in User.query.order_by(User.id).gino.iterate():
            print(u.id, u.nickname)


loop = asyncio.get_event_loop()
enable_task_local(loop)
loop.run_until_complete(main())

It looks a bit like, but actually quite different than SQLAlchemy ORM.它看起来有点像,但实际上与 SQLAlchemy ORM大不相同 Because we used only a part of SQLAlchemy core, and built a simple ORM on top of it.因为我们只使用了 SQLAlchemy 核心的一部分,并在其上构建了一个简单的 ORM。 It uses asyncpg underneath, so it is for PostgreSQL only .它在下面使用asyncpg ,因此它仅适用于 PostgreSQL

Update : GINO supports Tornado now, thanks to the contribution of Vladimir Goncharov.更新:GINO 现在支持 Tornado,感谢 Vladimir Goncharov 的贡献。 See docs here 在此处查看文档

I am using tornado with sqlalchemy in next way:我正在以下一种方式将龙卷风与 sqlalchemy 一起使用:


from tornado_mysql import pools
from sqlalchemy.sql import table, column, select, join
from sqlalchemy.dialects import postgresql, mysql

# from models import M, M2

t = table(...)
t2 = table(...)

xxx_id = 10

j = join(t, t2, t.c.t_id == t2.c.id)
s = select([t]).select_from(j).where(t.c.xxx == xxx_id)

sql_str = s.compile(dialect=mysql.dialect(),compile_kwargs={"literal_binds": True})


pool = pools.Pool(conn_data...)
cur = yield pool.execute(sql_str)
data = cur.fetchone()

In that case we are able to use sqlalchemy models, and sqlalchemy tools for constructig queries.在这种情况下,我们可以使用 sqlalchemy 模型和 sqlalchemy 工具来构建查询。

I am using tornado6 with sqlalchemy in next way:我正在以下一种方式将 tornado6 与 sqlalchemy 一起使用:

from tornado.ioloop import IOLoop

def sql_function():
    pass

class Handler(tornado.web.RequestHandler):
    async def post(self):
        args = get_front_end_args()
        result = await IOLoop.current().run_in_executor(None,sql_function,*(args))
        self.write({"result":result})

SQLAlchemy 1.4 supports asyncio natively (in beta for now): SQLAlchemy 1.4 原生支持 asyncio(目前处于测试阶段):

https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM