
SQLAlchemy join two models and select top row

Currently I have two tables: Server and Scan.
One server can have many scans (a one-to-many relationship).

What I am trying to achieve is to select a Server and then only the first Scan associated with that Server. The following query:

query = db.session.query(models.Server, models.Scan).outerjoin(models.Server.scans).all()

outputs:

(<Server u'Testing'>, <Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>)
(<Server u'Testing'>, <Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>)
(<Server u'Testing'>, <Scan u'testscan'>)
(<Server u'fasd'>, <Scan u'testscan'>)
(<Server u'fdaafas'>, None)

whereas I only want one "Testing" Server and the most recent Scan.

ADDITIONAL

When I loop through my query like so:

for a in query:
    print a, a.scans.all()

The output is:

<Server u'Testing'> [<Scan u'testscan'>, <Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>, <Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>]
<Server u'fasd'> [<Scan u'testscan'>]
<Server u'fdaafas'> []

The output I want should equal:

<Server u'Testing'> [<Scan u'bbd4f805-3966-d464-b2d1-0079eb89d69708c3a05ec2812bcf'>]
<Server u'fasd'> [<Scan u'testscan'>]
<Server u'fdaafas'> []

You would need to add a subquery that selects the Scan record you want to show, using some criterion. For the toy example below I assume you want the maximum value of some parameter.

I've created tables A and B; A corresponds to Server and B to Scan.

In [2]:

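# Imports and Base are assumed to come from an earlier cell.
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class A(Base):
    __tablename__ = 'A'

    pk = Column('pk', Integer, primary_key=True)
    name = Column('name', String)

class B(Base):
    __tablename__ = 'B'

    pk = Column('pk', Integer, primary_key=True)
    fk = Column('fk', Integer, ForeignKey('A.pk'))
    attr = Column('attr', Integer)

    # Many-to-one link back to A; the backref exposes A.B as the list of rows.
    a = relationship("A", backref='B')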

I inserted some data:

In [10]:

q = session.query(B)
print(q)
for x in q.all():
    print(x.pk, x.fk, x.attr)

q = session.query(A)
print(q)
for x in q.all():
    print(x.pk, x.name)

SELECT "B".pk AS "B_pk", "B".fk AS "B_fk", "B".attr AS "B_attr"
FROM "B"
1 1 1
2 1 2
3 2 0
4 2 4
5 1 4
SELECT "A".pk AS "A_pk", "A".name AS "A_name" 
FROM "A"
1 one
2 two

I then solved your problem by adding a subquery that selects the maximum value of B.attr for every B.fk, i.e. for every A.pk. (In your example it would be the maximum Scan.attr for every Server.)

In [13]:


from sqlalchemy import func
from sqlalchemy import tuple_

s = session.query(func.max(B.attr), B.fk).group_by(B.fk)
print(s)
q = session.query(A, B).outerjoin(B).filter(tuple_(B.attr, B.fk).in_(s))
print(q)
for x in q.all():
    print(x.A.pk, x.A.name, x.B.pk, x.B.attr)

SELECT max("B".attr) AS max_1, "B".fk AS "B_fk" 
FROM "B" GROUP BY "B".fk
SELECT "A".pk AS "A_pk", "A".name AS "A_name", "B".pk AS "B_pk", "B".fk AS "B_fk", "B".attr AS "B_attr" 
FROM "A" LEFT OUTER JOIN "B" ON "A".pk = "B".fk 
WHERE ("B".attr, "B".fk) IN (SELECT max("B".attr) AS max_1, "B".fk AS "B_fk" 
FROM "B" GROUP BY "B".fk)
2 two 4 4
1 one 5 4

NOTE: you don't mention which database you are using, but just in case, please note that the in_ statement with multiple columns does not work in SQLite, at least in older releases (row values were only added in SQLite 3.15.0), which is quite annoying when you try it. With one column only, you could write something like:

s = session.query(func.max(B.attr)).group_by(B.fk)
q = session.query(A, B).outerjoin(B).filter(B.attr.in_(s))

However, depending on your data, you could get more than one B for each A. For example, if B.fk=1 has max(B.attr)=3, and B.fk=2 has max(B.attr)=4 but also a row with B.attr=3, then for B.fk=2 you would get both B.attr=3 and B.attr=4.

If the attribute you use to select the maximum is unique, this is fine. In any case, if you are on a database like PostgreSQL or Oracle you can use in_ with multiple columns.

Hope it helps.
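The pitfall described above can be reproduced with plain SQL. This is a minimal sketch using Python's built-in sqlite3 module rather than SQLAlchemy. The table mirrors the toy B table, and the rows are made up to match the example in the text (they are an assumption, not the data inserted earlier). The second query pairs each fk with its own maximum by joining against the grouped subquery, which is one way to avoid both the extra rows and the multi-column IN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "B" (pk INTEGER PRIMARY KEY, fk INTEGER, attr INTEGER)')
# Illustrative rows: fk=1 has max(attr)=3; fk=2 has max(attr)=4 but also attr=3.
conn.executemany(
    'INSERT INTO "B" (fk, attr) VALUES (?, ?)',
    [(1, 1), (1, 3), (2, 3), (2, 4)],
)

# Single-column IN: any attr equal to *some* group's maximum matches.
single_column = conn.execute(
    'SELECT fk, attr FROM "B" '
    'WHERE attr IN (SELECT max(attr) FROM "B" GROUP BY fk) '
    'ORDER BY fk, attr'
).fetchall()

# Join on both fk and the per-fk maximum: one row per fk (unless the maximum is tied).
paired = conn.execute(
    'SELECT b.fk, b.attr FROM "B" AS b '
    'JOIN (SELECT fk, max(attr) AS max_attr FROM "B" GROUP BY fk) AS m '
    'ON b.fk = m.fk AND b.attr = m.max_attr '
    'ORDER BY b.fk'
).fetchall()

print(single_column)  # [(1, 3), (2, 3), (2, 4)]
print(paired)         # [(1, 3), (2, 4)]
```

The row (2, 3) appears in the first result only because 3 happens to be the maximum of another group.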

EDIT added after comments: If you also want to get the Servers without a Scan, you just need to add an or_ to your query.

In [18]:

from sqlalchemy import func
from sqlalchemy import tuple_
from sqlalchemy import or_

s = session.query(func.max(B.attr), B.fk).group_by(B.fk)
# is_(None) renders the same IS NULL test as == None and is the idiomatic spelling.
q = session.query(A, B).outerjoin(B).filter(or_(tuple_(B.attr, B.fk).in_(s), B.fk.is_(None)))
print(q)
for x in q.all():
    if x.B:
        print(x.A.pk, x.A.name, x.B.pk, x.B.attr)
    else:
        print(x.A.pk, x.A.name)

SELECT "A".pk AS "A_pk", "A".name AS "A_name", "B".pk AS "B_pk", "B".fk AS "B_fk", "B".attr AS "B_attr"
FROM "A" LEFT OUTER JOIN "B" ON "A".pk = "B".fk 
WHERE ("B".attr, "B".fk) IN (SELECT max("B".attr) AS max_1, "B".fk AS "B_fk" 
FROM "B" GROUP BY "B".fk) OR "B".fk IS NULL
2 two 4 4
1 one 5 4
3 three

As you can see, you have to be careful with nulls. Note that outerjoin already performs a left join, which is what you needed, but because of the filter you have to say explicitly that you also want the null rows. As usual, A is Server and B is Scan. Sorry for not using your table names; it makes this much more difficult to read.
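For readability, here is a sketch of the same idea with the Server and Scan names. It is a variant of the query above, not the same query: the per-server maximum goes into the ON clause of the outer joins, so servers without scans survive without an or_, and no multi-column in_ is needed. The table names, the column names, and the integer created column used to pick the most recent scan are all assumptions, since the question does not show the models. It also assumes SQLAlchemy 1.4 or newer.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, and_, create_engine, func
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Server(Base):
    __tablename__ = "server"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    name = Column(String)
    scans = relationship("Scan", backref="server")

class Scan(Base):
    __tablename__ = "scan"  # hypothetical table name
    id = Column(Integer, primary_key=True)
    server_id = Column(Integer, ForeignKey("server.id"))
    name = Column(String)
    created = Column(Integer)  # hypothetical ordering column; larger means newer

engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Server(name="Testing", scans=[Scan(name="old", created=1),
                                      Scan(name="new", created=2)]),
        Server(name="fasd", scans=[Scan(name="testscan", created=1)]),
        Server(name="fdaafas"),  # no scans at all
    ])
    session.commit()

    # Newest "created" value per server.
    latest = (
        session.query(Scan.server_id.label("server_id"),
                      func.max(Scan.created).label("created"))
        .group_by(Scan.server_id)
        .subquery()
    )

    # Outer joins keep servers with no scans; the ON clause keeps only the newest scan.
    rows = (
        session.query(Server.name, Scan.name)
        .select_from(Server)
        .outerjoin(latest, latest.c.server_id == Server.id)
        .outerjoin(Scan, and_(Scan.server_id == latest.c.server_id,
                              Scan.created == latest.c.created))
        .order_by(Server.id)
        .all()
    )
    result = [tuple(row) for row in rows]

print(result)  # [('Testing', 'new'), ('fasd', 'testscan'), ('fdaafas', None)]
```

If two scans of the same server share the maximum created value, both are returned, so the ordering column should be unique per server.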
