
SQL query mixing aggregated results and single values

I have a table with transactions. Each transaction has a transaction ID, an accounting period (AP), and a posting value (PV), as well as other fields. Some of the IDs are duplicated, usually because the transaction was done in error. To give an example, part of the table might look like:

ID    PV    AP  
123   100   2  
123   -100  5  

In this case the transaction was added in AP2, then removed in AP5.

Another example would be:

ID    PV    AP  
456   100   2  
456   -100  5  
456   100   8

In the first example, the problem is that if I am analyzing what was spent in AP2, there is a transaction in there which shouldn't be taken into account, because it was taken out again in AP5. In the second example, the last two transactions shouldn't be taken into account because they cancel each other out.

I want to label as erroneous as many as possible of the transactions that shouldn't be taken into account. To identify these transactions, I want to find the ones with duplicate IDs whose PVs sum to zero (like ID 123 above), or transactions where the PV of the earliest one is equal to sum(PV), as in the second example. This second condition is what is causing me grief.

So far I have:

SELECT *
FROM table
WHERE table.ID IN (SELECT table.ID
                    FROM table
                    GROUP BY table.ID
                    HAVING COUNT(*) > 1
                    AND (SUM(table.PV) = 0
                    OR SUM(table.PV) = <PV of first transaction in each group>))
ORDER BY table.ID;

The bit in chevrons is what I'm trying to do, and I'm stuck. Can I do it like this, or is there some other method I can use in SQL to do this?
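For illustration, here is a minimal sketch of one way the chevron placeholder could be expressed: aggregate per ID in a derived table, then join back to the row in the earliest AP to pick up its PV. It runs on SQLite through Python's `sqlite3` module as a stand-in (not SQL Compact 3.5), the table name `t` and the lowercase column names are assumptions, and it assumes each ID has exactly one row in its earliest AP.

```python
import sqlite3

# In-memory SQLite table holding the two sample IDs from the question.
# The table name "t" is an assumption made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, pv INTEGER, ap INTEGER)")
conn.executemany(
    "INSERT INTO t (id, pv, ap) VALUES (?, ?, ?)",
    [(123, 100, 2), (123, -100, 5),
     (456, 100, 2), (456, -100, 5), (456, 100, 8)],
)

# g holds one aggregated row per ID; f is the single row in the earliest AP,
# which supplies the "PV of first transaction in each group".
summary_sql = """
SELECT g.id, g.n, g.sum_pv, f.pv AS first_pv
FROM (SELECT id, COUNT(*) AS n, SUM(pv) AS sum_pv, MIN(ap) AS first_ap
      FROM t
      GROUP BY id) AS g
JOIN t AS f ON f.id = g.id AND f.ap = g.first_ap
WHERE g.n > 1
  AND (g.sum_pv = 0 OR g.sum_pv = f.pv)
ORDER BY g.id
"""
summary = conn.execute(summary_sql).fetchall()
for row in summary:
    print(row)
```

Both sample IDs qualify: 123 because its PVs sum to zero, and 456 because its earliest PV equals the sum. Selecting only `g.id` would give a list of IDs that could feed the `IN (...)` filter of the original query.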

Edit 1: Btw I forgot to say that I'm using SQL Compact 3.5, in case it matters.

Edit 2: I think the code snippet above is a bit misleading. I still want to mark out transactions with duplicate IDs where sum(PV) = 0, as in the first example. But where the PV of the earliest transaction = sum(PV), as in the second example, what I actually want is to keep the earliest transaction and mark out all the others with the same ID. Sorry if that caused confusion.

Edit 3: I've been playing with Clodoaldo's solution and have made some progress, but still can't get quite what I want. I'm trying to get the transactions I know for certain to be erroneous. Suppose the following transactions are also in the table:

ID     PV    AP  
789    100   2  
789    200   5  
789   -100   8

In this example sum(PV) <> 0 and the earliest PV <> sum(PV), so I don't want to mark any of these out.

If I modify Clodoaldo's query as follows:

    select t.*
    from
        t
        left join (
            select id, min(ap) as ap, sum(pv) as sum_pv
            from t
            group by id
            having sum(pv) <> 0
        ) s on t.id = s.id and t.ap = s.ap and t.pv = s.sum_pv
    where s.id is null

This gives the result:

 ID      PV     AP
123      100    2
123     -100    5
456     -100    5
456      100    8
789      100    2
789      200    5
789     -100    8

Whilst the first 4 transactions are ok (they would be marked out), the 789 transactions are also there, and I don't want them. But I can't figure out how to modify the query so that they're not included. Any ideas?
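As a sketch of one possible modification (not Clodoaldo's query), the rules from Edit 2 can be applied row by row: mark every row of an ID whose PVs sum to zero, and mark all but the earliest row when the earliest PV equals the sum. The example runs on SQLite through Python's `sqlite3` module as a stand-in and has not been tried on SQL Compact 3.5. The aliases `g` and `f` are made up for the sketch, and it assumes one row per ID in the earliest AP.

```python
import sqlite3

# All three sample IDs from the question, loaded into an in-memory table t.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, pv INTEGER, ap INTEGER)")
conn.executemany(
    "INSERT INTO t (id, pv, ap) VALUES (?, ?, ?)",
    [(123, 100, 2), (123, -100, 5),
     (456, 100, 2), (456, -100, 5), (456, 100, 8),
     (789, 100, 2), (789, 200, 5), (789, -100, 8)],
)

# g: per-ID aggregates; f: the row in the earliest AP of each ID.
erroneous_sql = """
SELECT t.id, t.pv, t.ap
FROM t
JOIN (SELECT id, COUNT(*) AS n, SUM(pv) AS sum_pv, MIN(ap) AS first_ap
      FROM t
      GROUP BY id) AS g ON g.id = t.id
JOIN t AS f ON f.id = g.id AND f.ap = g.first_ap
WHERE g.n > 1
  AND (g.sum_pv = 0                                   -- everything cancels out
       OR (g.sum_pv = f.pv AND t.ap <> g.first_ap))   -- keep only the earliest row
ORDER BY t.id, t.ap
"""
erroneous = conn.execute(erroneous_sql).fetchall()
for row in erroneous:
    print(row)
```

On this sample data the query returns both 123 rows and the last two 456 rows. The 789 rows are left out because neither condition holds for that ID.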

SQL Fiddle

select t.* 
from 
    t
    inner join (
        select id, min(ap) as ap
        from t
        group by id
        having sum(pv) <> 0
    ) s on t.id = s.id and t.ap = s.ap

The above gets the valid transactions. If you want the invalid ones, use this:

select t.*
from 
    t
    left join (
        select id, min(ap) as ap
        from t
        group by id
        having sum(pv) <> 0
    ) s on t.id = s.id and t.ap = s.ap
where s.id is null

SQL Fiddle

Try something like this:
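To see what the two queries above return, here is a small check against the sample rows from the question, including the 789 rows from Edit 3. It uses SQLite through Python's `sqlite3` module as a stand-in for SQL Compact 3.5. The table name `t` comes from the queries; everything else (variable names, sorting in Python) is scaffolding for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, pv INTEGER, ap INTEGER)")
conn.executemany(
    "INSERT INTO t (id, pv, ap) VALUES (?, ?, ?)",
    [(123, 100, 2), (123, -100, 5),
     (456, 100, 2), (456, -100, 5), (456, 100, 8),
     (789, 100, 2), (789, 200, 5), (789, -100, 8)],
)

# Shared derived table: earliest AP of every ID whose PVs do not sum to zero.
earliest = """
    select id, min(ap) as ap
    from t
    group by id
    having sum(pv) <> 0
"""
valid_sql = f"""
    select t.* from t
    inner join ({earliest}) s on t.id = s.id and t.ap = s.ap
"""
invalid_sql = f"""
    select t.* from t
    left join ({earliest}) s on t.id = s.id and t.ap = s.ap
    where s.id is null
"""

def by_id_then_ap(row):
    # Rows are (id, pv, ap); sort for a stable, readable listing.
    return (row[0], row[2])

valid = sorted(conn.execute(valid_sql).fetchall(), key=by_id_then_ap)
invalid = sorted(conn.execute(invalid_sql).fetchall(), key=by_id_then_ap)
print("valid:", valid)
print("invalid:", invalid)
```

On this data the first query keeps the earliest row of 456 and of 789, and the second returns every other row. That includes the later 789 rows, which is the behaviour Edit 3 of the question asks to avoid.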

UPDATE
    Transactions
SET
    IsError = 1  -- 1 rather than true: SQL Server Compact has no boolean literals (assumes IsError is a bit column)
WHERE
    EXISTS
    (SELECT
        NULL
    FROM 
        Transactions SubsequentTransactions
    WHERE
        Transactions.ID = SubsequentTransactions.ID
    AND Transactions.AP < SubsequentTransactions.AP
    AND Transactions.PV = -1 * SubsequentTransactions.PV)

I think that will work. I haven't tested it at all, so I'd suggest that you use the WHERE clause in a SELECT statement first to ensure it will only affect the rows you want.

This won't flag negative transactions as errors (you may or may not need to), except in your second example. In your second example there is a third record which cancels the second one if the two are taken in isolation. You may find you need to expand the logic to fully get what you need, but it should get you started.
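Following the suggestion to try the WHERE clause in a SELECT first, here is a minimal sketch that does so against the sample rows from the question. It uses SQLite through Python's `sqlite3` module as a stand-in for SQL Compact 3.5, and the lowercase column names are an assumption of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Transactions (id INTEGER, pv INTEGER, ap INTEGER)")
conn.executemany(
    "INSERT INTO Transactions (id, pv, ap) VALUES (?, ?, ?)",
    [(123, 100, 2), (123, -100, 5),
     (456, 100, 2), (456, -100, 5), (456, 100, 8),
     (789, 100, 2), (789, 200, 5), (789, -100, 8)],
)

# Same EXISTS condition as the UPDATE, used in a SELECT to preview
# which rows would be flagged.
preview_sql = """
SELECT Transactions.id, Transactions.pv, Transactions.ap
FROM Transactions
WHERE EXISTS
    (SELECT NULL
     FROM Transactions SubsequentTransactions
     WHERE Transactions.id = SubsequentTransactions.id
       AND Transactions.ap < SubsequentTransactions.ap
       AND Transactions.pv = -1 * SubsequentTransactions.pv)
ORDER BY Transactions.id, Transactions.ap
"""
flagged = conn.execute(preview_sql).fetchall()
for row in flagged:
    print(row)
```

On this data the preview flags the first row of 123, the first two rows of 456, and also the first row of 789 (it is later offset by the -100 in AP8). That is the kind of case where the logic would need expanding.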
