
SQLAlchemy query for related posts (order by common many-to-many relationships)

In my Flask app there is a many-to-many relationship between Article and Tags:

article_tags = db.Table("article_tags",
    db.Column('article_id', db.Integer, db.ForeignKey('articles.id')),
    db.Column('tag_id', db.Integer, db.ForeignKey('tags.id')))

class Article(db.Model):
    __tablename__ = 'articles'
    id = db.Column(db.Integer, primary_key=True)
    ...
    tags = db.relationship('Tags', secondary=article_tags,
                           backref=db.backref('articles', lazy='dynamic'),
                           lazy='dynamic')

class Tags(db.Model):
    __tablename__ = "tags"
    id = db.Column(db.Integer, primary_key=True, index=True)
    name = db.Column(db.String(64), unique=True, index=True)

Given a specific article, I need to query all other articles, ranked by the number of tags they have in common with it. For example, from the following set:

Article1.tags = tag1,tag2,tag3,tag4
Article2.tags = tag1,tag3,tag5
Article3.tags = tag1,tag3,tag4,tag5

Given Article1 I would want the query to return:

Common Tags | Article
3             Article3 
2             Article2
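
To make the expected ranking concrete, here is a plain-Python sketch of the same computation using set intersections. It only illustrates the desired output for the three example articles; it is not the database query I am after, and the names `tags_by_article` and `related_articles` are made up for this illustration.

```python
# Plain-Python illustration of the desired ranking (not the SQL solution).
# The dictionary below mirrors the three example articles from the question.
tags_by_article = {
    "Article1": {"tag1", "tag2", "tag3", "tag4"},
    "Article2": {"tag1", "tag3", "tag5"},
    "Article3": {"tag1", "tag3", "tag4", "tag5"},
}


def related_articles(target, tags_by_article):
    """Return (common_tag_count, article) pairs, most shared tags first."""
    target_tags = tags_by_article[target]
    counts = [
        (len(target_tags & tags), name)
        for name, tags in tags_by_article.items()
        # Skip the target itself and articles sharing no tags with it.
        if name != target and target_tags & tags
    ]
    return sorted(counts, key=lambda pair: pair[0], reverse=True)


result = related_articles("Article1", tags_by_article)
print(result)  # [(3, 'Article3'), (2, 'Article2')]
```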

The result would give a fair approximation of the most related posts. Thanks to this article I was able to figure out a query that sorts all articles by their total number of tags, but I need to refine it so that it counts only the tags shared with a given article:

db.session.query(Article, func.count(article_tags.c.tag_id).label('total'))\
          .join(article_tags).group_by(Article).order_by('total').all()

Any help for a good query would be greatly appreciated.

I found a query for this, then converted it to use the ORM.
It uses one subquery.
I used vanilla SQLAlchemy and defined my models slightly differently, so I'd be interested to know if you run into any issues with this.

from sqlalchemy import func

article_id = 1

# Tag ids attached to the given article
sub_stmt = db.session.query(article_tags.c.tag_id)\
                     .filter(article_tags.c.article_id==article_id)

# For every other article, count the tags it shares with the given article.
# group_concat exists on MySQL and SQLite; other backends need an equivalent.
db.session.query(Article.id,
                 func.count(article_tags.c.tag_id).label('total'),
                 func.group_concat(article_tags.c.tag_id).label('related_tags'))\
          .filter(Article.id!=article_id)\
          .filter(article_tags.c.tag_id.in_(sub_stmt))\
          .filter(article_tags.c.article_id==Article.id)\
          .group_by(Article.id)\
          .order_by(func.count(article_tags.c.tag_id).desc()).all()

Result:

Out[179]: [(3, 3, '1,3,4'), (2, 2, '1,3')]
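
For anyone who wants to try the idea without Flask or SQLAlchemy installed, here is a rough sketch of the same logic as plain SQL, run with Python's built-in sqlite3 module against the example data from the question. This is my approximation of the query, not the exact SQL that SQLAlchemy emits. The table layout is assumed from the models above, and the `link` alias and the `RELATED_SQL` name are just for this illustration. I left out group_concat because the order of the concatenated ids is not guaranteed.

```python
import sqlite3

# In-memory database holding the example data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE article_tags (
        article_id INTEGER REFERENCES articles(id),
        tag_id INTEGER REFERENCES tags(id)
    );
    INSERT INTO articles (id) VALUES (1), (2), (3);
    INSERT INTO tags (id, name) VALUES
        (1, 'tag1'), (2, 'tag2'), (3, 'tag3'), (4, 'tag4'), (5, 'tag5');
    INSERT INTO article_tags (article_id, tag_id) VALUES
        (1, 1), (1, 2), (1, 3), (1, 4),
        (2, 1), (2, 3), (2, 5),
        (3, 1), (3, 3), (3, 4), (3, 5);
""")

# The subquery selects the given article's tag ids; the outer query counts,
# for every other article, how many of its tags fall in that set.
RELATED_SQL = """
    SELECT a.id, COUNT(link.tag_id) AS total
    FROM articles AS a
    JOIN article_tags AS link ON link.article_id = a.id
    WHERE a.id != :article_id
      AND link.tag_id IN (
          SELECT tag_id FROM article_tags WHERE article_id = :article_id
      )
    GROUP BY a.id
    ORDER BY total DESC
"""

rows = conn.execute(RELATED_SQL, {"article_id": 1}).fetchall()
print(rows)  # [(3, 3), (2, 2)]
```

Each row is (article id, number of common tags), which matches the first two columns of the result above.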

I wrote about this solution here: https://www.mechanical-meat.com/blog/sqlalchemy-query-related-posts

