As a background: I'm creating an ORM based on a schema of an already existing database. - This due to the fact that the python application won't be the "owner" of said database.
Now in this database there is a table called " task
" and a table called " task_notBefore__task_relatedTasks
" - this latter is a many-to-many relation between different entries in the " task
" table.
now automap_base()
has an automated detection of these relationships as described here . However this fails for my case, and no relationship is being build.
I then try to manually create the relationship:
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.ext.automap import generate_relationship
from sqlalchemy.orm import sessionmaker, interfaces, relationship
from sqlalchemy import create_engine
class DBConnection:
def __init__(self, connection_url, **kwargs):
self.engine = create_engine(connection_url, **kwargs)
self._Base = automap_base()
self._Base.prepare(self.engine, reflect=True)
self.Task = self._Base.classes.task
self.Order = self._Base.classes.order
self.Poller = self._Base.classes.poller
rel = generate_relationship(self._Base, interfaces.MANYTOMANY, relationship, 'related', self.Task, self.Task,
secondary=self._Base.classes.task_notBefore__task_relatedTasks, backref='notBefore')
self._Session = sessionmaker()
self._Session.configure(bind=self.engine)
self.session = self._Session()
However this still doesn't "do" anything: it doesn't add anything to the self.Task
"class".
How would one do this?
The primary problem in this case is not just the many-to-many relationship, but the fact that it's a self-referential, many-to-many relationship. Because automap
is simply translating the mapped class names to relationship names, it constructs the same name, eg task_collection
, for both directions of the relationship, and the naming collision generates the error. This shortcoming of automap
feels significant in that self-referential, many-to-many relationships are not uncommon.
Explicitly adding the relationships you want, using your own names, won't solve the problem because automap
will still try to create the task_collection
relationships. To deal with this issue, we need to override task_collection
.
If you're okay with keeping the name task_collection
for the forward direction of the relationship, we can simply pre-define the relationship--specifying whatever name we want for the backref
. If automap
finds the expected property already in place, it will assume the relationship is being overridden and not try to add it.
Here's a stripped down example, along with the an sqlite
database for testing.
CREATE TABLE task (
id INTEGER,
name VARCHAR,
PRIMARY KEY (id)
);
CREATE TABLE task_task (
tid1 INTEGER,
tid2 INTEGER,
FOREIGN KEY(tid1) REFERENCES task(id),
FOREIGN KEY(tid2) REFERENCES task(id)
);
-- Some sample data
INSERT INTO task VALUES (0, 'task_0');
INSERT INTO task VALUES (1, 'task_1');
INSERT INTO task VALUES (2, 'task_2');
INSERT INTO task VALUES (3, 'task_3');
INSERT INTO task VALUES (4, 'task_4');
INSERT INTO task_task VALUES (0, 1);
INSERT INTO task_task VALUES (0, 2);
INSERT INTO task_task VALUES (2, 4);
INSERT INTO task_task VALUES (3, 4);
INSERT INTO task_task VALUES (3, 0);
Putting it into a file called setup_self.sql
, we can do:
sqlite3 self.db < setup_self.sql
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, Integer, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base
DeclBase = declarative_base()
task_task = Table('task_task', DeclBase.metadata,
Column('tid1', Integer, ForeignKey('task.id')),
Column('tid2', Integer, ForeignKey('task.id')))
Base = automap_base(DeclBase)
class Task(Base):
__tablename__ = 'task'
task_collection = relationship('Task',
secondary=task_task,
primaryjoin='Task.id==task_task.c.tid1',
secondaryjoin='Task.id==task_task.c.tid2',
backref='backward')
engine = create_engine("sqlite:///self.db")
Base.prepare(engine, reflect=True)
session = Session(engine)
task_0 = session.query(Task).filter_by(name ='task_0').first()
task_4 = session.query(Task).filter_by(name ='task_4').first()
print("task_0.task_collection = {}".format([x.name for x in task_0.task_collection]))
print("task_4.backward = {}".format([x.name for x in task_4.backward]))
task_0.task_collection = ['task_1', 'task_2']
task_4.backward = ['task_2', 'task_3']
If you want to have a name other than task_collection
, you need to use automap
's function for overriding collection-relationship names:
name_for_collection_relationship(base, local_cls, referred_cls, constraint)
The arguments local_cls
and referred_cls
are instances of the mapped table classes. For a self-referential, many-to-many relationship, these are both the same class. We can use the arguments to build a key that allows us to identify overrides.
Here is an example implementation of this approach.
from sqlalchemy.ext.automap import automap_base, name_for_collection_relationship
from sqlalchemy.orm import Session
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, Integer, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base
DeclBase = declarative_base()
task_task = Table('task_task', DeclBase.metadata,
Column('tid1', Integer, ForeignKey('task.id')),
Column('tid2', Integer, ForeignKey('task.id')))
Base = automap_base(DeclBase)
class Task(Base):
__tablename__ = 'task'
forward = relationship('Task',
secondary=task_task,
primaryjoin='Task.id==task_task.c.tid1',
secondaryjoin='Task.id==task_task.c.tid2',
backref='backward')
# A dictionary that maps relationship keys to a method name
OVERRIDES = {
'Task_Task' : 'forward'
}
def _name_for_collection_relationship(base, local_cls, referred_cls, constraint):
# Build the key
key = '{}_{}'.format(local_cls.__name__, referred_cls.__name__)
# Did we have an override name?
if key in OVERRIDES:
# Yes, return it
return OVERRIDES[key]
# Default to the standard automap function
return name_for_collection_relationship(base, local_cls, referred_cls, constraint)
engine = create_engine("sqlite:///self.db")
Base.prepare(engine, reflect=True, name_for_collection_relationship=_name_for_collection_relationship)
Note that the overriding of name_for_collection_relationship
simply changes the name that automap
uses for the relationship. In our case, the relationship is still being pre-defined by Task
. But, the override tells automap
to look for forward
instead of task_collection
, which it finds and therefore discontinues defining the relationship.
Under some circumstances, it would be nice if we could override the relationship names without having to pre-define the actual relationship. On first consideration, this should be possible using name_for_collection_relationship
. However, I could not get this approach to work for self-referential, many-to-many relationships, due to a combination of two reasons.
name_for_collection_relationship
and the related generate_relationship
are called twice, once for each direction of the many-to-many relationship. In both cases, local_cls
and referred_cls
are the same, because of the self-referentiality. Moreover, the other arguments of name_for_collection_relationship
are effectively equivalent. Therefore, we cannot, from the context of the function call, determine which direction we are overriding.
Here is the even-more surprising part of the problem. It appears we cannot even count on one direction happening before the other. In other words, the two calls to name_for_collection_relationship
and generate_relationship
are very similar. The argument that actually determines the directionality of the relationship is constraint
, which is one of the two foreign-key constraints for the relationship; these constraints are loaded, from Base.metadata
, into a variable called m2m_const
. Herein lies the problem. The order that the constraints end up in m2m_const
is nondeterministic, ie sometimes it will be one order; other times it will be the opposite (at least when using sqlite3
). Because of this, the directionality of the relationship is nondeterministic.
On the other hand, when we pre-define the relationship, the following arguments create the necessary determinism.
primaryjoin='Task.id==task_task.c.tid1',
secondaryjoin='Task.id==task_task.c.tid2',
Of particular note, I actually tried to create a solution that simply overrode the relationship names without pre-defining it. It exhibited the described nondeterminism.
If you have a reasonable number of database tables that do not change often, I would suggest just using Declarative Base . It might be a little more work to set up, but it gives you more control.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.