Python/SQLAlchemy: joining a table to itself does not generate the expected query

What I want to do seems like it should be straightforward. I want to join a table representing data collection stations to itself, in order to track previous iterations of stations deployed in the same location.

In the code below, I have two classes: StationTable and StationTypeTable. The StationTable has two FK relationships -- one to a station type, and another back to the station table.

At the end is the generated SQL that shows the correct join to the StationType table, but no trace whatsoever of the link created by the previous_station column.

What am I doing wrong? Note that this will eventually be used with FastAPI and the async Postgres driver, which may or may not be relevant. Also, I don't need to modify the related tables via the relationship; I only need to read some attributes.

Using SQLAlchemy 1.4, latest version.

from typing import Any

import sqlalchemy as sa
from sqlalchemy import select
from sqlalchemy.orm import registry, RelationshipProperty
from sqlalchemy.schema import ForeignKey
from sqlalchemy.dialects.postgresql import UUID
from sqlalchemy.orm.decl_api import DeclarativeMeta


mapper_registry = registry()


class BaseTable(metaclass=DeclarativeMeta):
    __abstract__ = True
    registry = mapper_registry
    metadata = mapper_registry.metadata

    __init__ = mapper_registry.constructor
    id = sa.Column(UUID(as_uuid=True), primary_key=True, server_default=sa.text("gen_random_uuid()"))


class Relationship(RelationshipProperty):       # type: ignore
    """ Using this class hides some of the static typing messiness in SQLAlchemy. """
    inherit_cache = True        # If this works, should improve performance


    def __init__(self, *args: Any, **kwargs: Any):
        if "lazy" not in kwargs:
            # 'joined' means items should be loaded "eagerly" in the same query as that of the parent, using a JOIN or LEFT
            # OUTER JOIN. Whether the join is "outer" or not is determined by the relationship.innerjoin parameter.
            # We need this to keep loading in async mode from blowing up.
            # https://docs.sqlalchemy.org/en/14/orm/relationship_api.html
            kwargs["lazy"] = "joined"

        super().__init__(*args, **kwargs)


class StationTypeTable(BaseTable):
    __tablename__ = "station_type"

    name = sa.Column(sa.String(255), unique=True, nullable=False)
    description = sa.Column(sa.UnicodeText)


class StationTable(BaseTable):
    __tablename__ = "station"

    name = sa.Column(sa.String(255), unique=True, nullable=False)
    installation_date = sa.Column(sa.BigInteger, nullable=False)
    station_type_id = sa.Column(UUID(as_uuid=True), ForeignKey(StationTypeTable.id), nullable=False)
    previous_station = sa.Column(UUID(as_uuid=True), ForeignKey("station.id"), nullable=True)

    station_type_table = Relationship(StationTypeTable, uselist=False)
    previous_station_table = Relationship("StationTable", uselist=False)   # self join, uselist=False ==> one-to-one



query = select(StationTable)

print(query)


# SELECT station.id, station.name, station.installation_date, station.station_type_id, station.previous_station, 
#        station_type_1.id AS id_1, station_type_1.name AS name_1, station_type_1.description 
# FROM station 
# LEFT OUTER JOIN station_type AS station_type_1 ON station_type_1.id = station.station_type_id

EDIT:

Based on Ian Wilson's reply below, I added the parameter join_depth=1 to the previous_station_table relationship, which did indeed generate the SQL for the relationship, but it is, oddly, "the wrong way around" compared to the station_type_table relationship. Here is the SQL generated with that param:

SELECT station.id, station.name, station.installation_date, 
          station.station_type_id, station.previous_station, 
          station_type_1.id AS id_1, station_type_1.name AS name_1, 
          station_type_1.description, station_type_2.id AS id_2, 
          station_type_2.name AS name_2, station_type_2.description AS description_1, 
          station_1.id AS id_3, station_1.name AS name_3, 
          station_1.installation_date AS installation_date_1, 
          station_1.station_type_id AS station_type_id_1, 
          station_1.previous_station AS previous_station_1 
FROM station 
LEFT OUTER JOIN station_type AS station_type_1 
    ON station_type_1.id = station.station_type_id  -- looks right
LEFT OUTER JOIN station AS station_1 
    ON station.id = station_1.previous_station      -- looks backward, see below
LEFT OUTER JOIN station_type AS station_type_2 
    ON station_type_2.id = station_1.station_type_id

I think that the marked line should be:

LEFT OUTER JOIN station AS station_1 ON station.previous_station = station_1.id

The problem appears to be that self-referential eager loading requires join_depth to be set. I set join_depth=1 and that seemed to fix the query. Because you have an additional eager join, this actually produces three joins instead of two: the second set of stations has to join to the station type as well. The documentation explains join_depth here:

configuring-self-referential-eager-loading
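
As a rough illustration of the change described above, here is a minimal sketch. It assumes simplified stand-ins for the question's models: integer keys instead of UUIDs, a plain declarative_base(), and relationship(lazy="joined") in place of the custom Relationship subclass. Only join_depth=1 comes from this answer; the rest is scaffolding to make the snippet runnable.

```python
import sqlalchemy as sa
from sqlalchemy import select
from sqlalchemy.orm import declarative_base, relationship

# Simplified stand-ins for the question's models (assumption: integer keys
# instead of UUIDs, plain relationship(lazy="joined") instead of the subclass).
Base = declarative_base()


class StationTypeTable(Base):
    __tablename__ = "station_type"

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(255), unique=True, nullable=False)


class StationTable(Base):
    __tablename__ = "station"

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(255), unique=True, nullable=False)
    station_type_id = sa.Column(sa.Integer, sa.ForeignKey("station_type.id"), nullable=False)
    previous_station = sa.Column(sa.Integer, sa.ForeignKey("station.id"), nullable=True)

    station_type_table = relationship(StationTypeTable, lazy="joined")
    # join_depth=1 lets the joined eager load follow the self-reference one level.
    previous_station_table = relationship(
        "StationTable", lazy="joined", uselist=False, join_depth=1
    )


# Compiling the statement to a string is enough to inspect the joins;
# no database connection is needed.
sql = str(select(StationTable))
print(sql)
```

The printed statement should contain the three LEFT OUTER JOINs mentioned above: the station type, the aliased station, and the aliased station's own type.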

The parameter for relationship() is briefly explained here:

sqlalchemy.orm.relationship.params.join_depth
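
On the "backward" join raised in the question's edit: this is not covered in the answer above, but a self-referential relationship() without remote_side is treated as one-to-many by default, which would explain the ON clause joining station.id to station_1.previous_station. A possible fix, sketched here under the same simplifying assumptions (integer keys, station type omitted for brevity), is to mark the primary key as the remote side so the relationship becomes many-to-one.

```python
import sqlalchemy as sa
from sqlalchemy import select
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class StationTable(Base):
    __tablename__ = "station"

    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(255), unique=True, nullable=False)
    previous_station = sa.Column(sa.Integer, sa.ForeignKey("station.id"), nullable=True)

    # remote_side=[id] marks the primary key as the "remote" column, making
    # this a many-to-one link from a station to its predecessor.
    # uselist=False is then implied and can be dropped.
    previous_station_table = relationship(
        "StationTable", lazy="joined", join_depth=1, remote_side=[id]
    )


sql = str(select(StationTable))
print(sql)
```

With remote_side set, the join condition should relate the aliased row's id to station.previous_station, which is the direction the question expected. Checking the printed SQL against your own models is still worthwhile before relying on it.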
