简体   繁体   中英

Sqlalchemy from_self in an inner query referencing outer aliased table

I have an sqlalchemy query like this:

E = aliased(Entity, name='e')
db.session.query(E) \
    .filter(E.user_id == user_id) \
    .filter(
        db.session.query(Entity) \
            .distinct(Entity.original_id) \
            .order_by(Entity.original_id, Entity.id.desc())
            .filter(Entity.ref_id == E.ref_id) \
            .from_self(Entity.flag) \
            .order_by(Entity.timestamp.desc()) \
            .limit(1).as_scalar()
    ) \
    .order_by(E.timestamp) \
    .all()

It produces (roughly) the following SQL:

SELECT *
FROM entity AS e
WHERE e.user_id = 1 AND (
    SELECT anon_1.entity_flag AS flag
    FROM (
        SELECT DISTINCT ON (entity.original_id)
            entity.flag AS entity_flag, entity.timestamp AS entity_timestamp
        FROM entity, entity AS e # THIS "entity AS e" SHOULD NOT BE HERE
        WHERE
            entity.ref_id = e.ref_id
            ORDER BY entity.original_id, entity.id DESC
    ) AS anon_1
    ORDER BY anon_1.entity_timestamp
    LIMIT 1
) ORDER BY e.timestamp;

Because it somehow adds entity AS e into the inner query FROM clause it causes the WHERE entity.ref_id = e.ref_id not to reference the outer table as it should.

Why is this extraneous entity AS e added to the inner FROM clause and how to overcome it?

It might be that you've found the limits of Query.from_self() :

The automatic aliasing feature only works in a limited way, for simple filters and orderings. More ambitious constructions such as referring to the entity in joins should prefer to use explicit subquery objects, typically making use of the Query.subquery() method to produce an explicit subquery object. Always test the structure of queries by viewing the SQL to ensure a particular structure does what's expected!

In general SQLAlchemy's "auto-correlation" considers FROM objects for correlation from the immediate enclosing query only and if deeper levels of nesting are required, explicit correlation must be used.

On the other hand, in your case this does not help. Adding a call to Query.correlate() does not break through the boundary of Query.from_self() for some reason, though using them together is even mentioned in the documentation:

The correlation arguments take effect in such cases as when Query.from_self() is used, or when a subquery as returned by Query.subquery() is embedded in another select() construct.

A solution is to use explicit subqueries:

In [65]: sq = session.query(Entity.flag, Entity.timestamp).\
    ...:     distinct(Entity.original_id).\
    ...:     order_by(Entity.original_id, Entity.id.desc()).\
    ...:     filter(Entity.ref_id == E.ref_id).\
    ...:     correlate(E).\
    ...:     subquery()

In [66]: q = session.query(E).\
    ...:     filter(E.user_id == 123).\
    ...:     filter(session.query(sq.c.flag).
    ...:            order_by(sq.c.timestamp.desc()).
    ...:            limit(1).
    ...:            as_scalar()).\
    ...:     order_by(E.timestamp)
    ...:            

In [67]: print(q.statement.compile(dialect=postgresql.dialect()))
SELECT e.id, e.ref_id, e.timestamp, e.original_id, e.user_id, e.flag 
FROM entity AS e 
WHERE e.user_id = %(user_id_1)s AND (SELECT anon_1.flag 
FROM (SELECT DISTINCT ON (entity.original_id) entity.flag AS flag, entity.timestamp AS timestamp 
FROM entity 
WHERE entity.ref_id = e.ref_id ORDER BY entity.original_id, entity.id DESC) AS anon_1 ORDER BY anon_1.timestamp DESC 
 LIMIT %(param_1)s) ORDER BY e.timestamp

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM