
GroupBy and Sum in SQLAlchemy?

I am trying to group a table by a few fields and then sum each group, but the values are getting double counted.

My models are as follows:

class CostCenter(db.Model):
    __tablename__ = 'costcenter'
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    name = db.Column(db.String)
    number = db.Column(db.Integer)

class Expense(db.Model):

    __tablename__ = 'expense'
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    glitem_id = db.Column(db.Integer, db.ForeignKey('glitem.id'))
    glitem = db.relationship('GlItem')
    costcenter_id = db.Column(db.Integer, db.ForeignKey('costcenter.id'))
    costcenter = db.relationship('CostCenter')
    value = db.Column(db.Float)
    date = db.Column(db.Date)

I have been using:

expenses=db.session.query(Expense,func.sum(Expense.value)).group_by(Expense.date).filter(CostCenter.id.in_([1,2,3]))

When I print expenses it shows the SQL statement that follows. It looks correct to me, but I am not that familiar with SQL. The problem is that the values it outputs as sum_1 are being counted multiple times. If I have one item ([1]) in the "in" statement, it sums all three. If I have [1,2], it sums all three and then doubles the result, and if I have [1,2,3], it sums all three and triples it. I am not sure why it is counting multiple times. How do I fix this?

SELECT expense.id AS expense_id, expense.glitem_id AS expense_glitem_id, expense.costcenter_id AS expense_costcenter_id, expense.value AS expense_value, expense.date AS expense_date, sum(expense.value) AS sum_1
FROM expense, costcenter 
WHERE costcenter.id IN (:id_1, :id_2, :id_3) GROUP BY expense.date

Thanks!

There are a few issues here; you don't seem to be querying the right things. It's meaningless to select an Expense object when grouping by Expense.date, because each group collapses many expenses into a single row. There also needs to be a join condition between CostCenter and Expense; otherwise every expense row is paired with every matching cost center, so each expense is counted once per cost center regardless of whether the two are related.
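To see the duplication concretely, here is a minimal sketch that uses Python's built-in sqlite3 module rather than SQLAlchemy. The table layout loosely mirrors the models in the question, but the three sample rows and their values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE costcenter (id INTEGER PRIMARY KEY);
    CREATE TABLE expense (
        id INTEGER PRIMARY KEY,
        costcenter_id INTEGER REFERENCES costcenter(id),
        value REAL,
        date TEXT
    );
    -- Hypothetical sample data: one expense per cost center, all on the same date.
    INSERT INTO costcenter (id) VALUES (1), (2), (3);
    INSERT INTO expense (costcenter_id, value, date) VALUES
        (1, 10.0, '2014-08-01'),
        (2, 20.0, '2014-08-01'),
        (3, 30.0, '2014-08-01');
""")

# No join condition: every expense row is paired with every matching cost center.
cross = conn.execute(
    "SELECT expense.date, SUM(expense.value) "
    "FROM expense, costcenter "
    "WHERE costcenter.id IN (1, 2) "
    "GROUP BY expense.date"
).fetchall()

# Explicit join: each expense row is paired only with its own cost center.
joined = conn.execute(
    "SELECT expense.date, SUM(expense.value) "
    "FROM expense JOIN costcenter ON costcenter.id = expense.costcenter_id "
    "WHERE costcenter.id IN (1, 2) "
    "GROUP BY expense.date"
).fetchall()

print(cross)   # [('2014-08-01', 120.0)]
print(joined)  # [('2014-08-01', 30.0)]
```

With two ids in the IN list, the unjoined query returns twice the total of all three expenses (2 x 60), while the joined query sums only the expenses that belong to cost centers 1 and 2.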

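Your query should look like this (the relationship is named cost_center here, as in the runnable example further down; with the models in the question it would be Expense.costcenter):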

session.query(
    Expense.date,
    func.sum(Expense.value).label('total')
).join(Expense.cost_center
).filter(CostCenter.id.in_([2, 3])
).group_by(Expense.date
).all()

producing this SQL:

SELECT expense.date AS expense_date, sum(expense.value) AS total 
FROM expense JOIN cost_center ON cost_center.id = expense.cost_center_id 
WHERE cost_center.id IN (?, ?) GROUP BY expense.date

Here is a simple runnable example:

from datetime import datetime
from sqlalchemy import create_engine, Column, Integer, ForeignKey, Numeric, DateTime, func
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, relationship

engine = create_engine('sqlite://', echo=True)
session = Session(bind=engine)
# The metadata is left unbound; the engine is passed explicitly to create_all() below.
Base = declarative_base()


class CostCenter(Base):
    __tablename__ = 'cost_center'

    id = Column(Integer, primary_key=True)


class Expense(Base):
    __tablename__ = 'expense'

    id = Column(Integer, primary_key=True)
    cost_center_id = Column(Integer, ForeignKey(CostCenter.id), nullable=False)
    value = Column(Numeric(8, 2), nullable=False, default=0)
    date = Column(DateTime, nullable=False)

    cost_center = relationship(CostCenter, backref='expenses')


# Pass the engine explicitly; bound metadata is not available in SQLAlchemy 2.0.
Base.metadata.create_all(engine)

session.add_all([
    CostCenter(expenses=[
        Expense(value=10, date=datetime(2014, 8, 1)),
        Expense(value=20, date=datetime(2014, 8, 1)),
        Expense(value=15, date=datetime(2014, 9, 1)),
    ]),
    CostCenter(expenses=[
        Expense(value=45, date=datetime(2014, 8, 1)),
        Expense(value=40, date=datetime(2014, 9, 1)),
        Expense(value=40, date=datetime(2014, 9, 1)),
    ]),
    CostCenter(expenses=[
        Expense(value=42, date=datetime(2014, 7, 1)),
    ]),
])
session.commit()

base_query = session.query(
    Expense.date,
    func.sum(Expense.value).label('total')
).join(Expense.cost_center
).group_by(Expense.date)

# first query considers center 1, output:
# 2014-08-01: 30.00
# 2014-09-01: 15.00
for row in base_query.filter(CostCenter.id.in_([1])).all():
    print('{}: {}'.format(row.date.date(), row.total))

# second query considers centers 1, 2, and 3, output:
# 2014-07-01: 42.00
# 2014-08-01: 75.00
# 2014-09-01: 95.00
for row in base_query.filter(CostCenter.id.in_([1, 2, 3])).all():
    print('{}: {}'.format(row.date.date(), row.total))

