SQL: Select transactions where rows are not of criteria inside the same table

I have a transactions table:

Transactions
------------
id | account | type | date_time             | amount
----------------------------------------------------
 1 | 001     | 'R'  | '2012-01-01 10:01:00' | 1000
 2 | 003     | 'R'  | '2012-01-02 12:53:10' | 1500
 3 | 003     | 'A'  | '2012-01-03 13:10:01' | -1500
 4 | 002     | 'R'  | '2012-01-03 17:56:00' | 2000
 5 | 001     | 'R'  | '2012-01-04 12:30:01' | 1000
 6 | 002     | 'A'  | '2012-01-04 13:23:01' | -2000
 7 | 003     | 'R'  | '2012-01-04 15:13:10' | 3000
 8 | 003     | 'R'  | '2012-01-05 12:12:00' | 1250
 9 | 003     | 'A'  | '2012-01-06 17:24:01' | -1250

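I want to select all transactions of a given type ('R'), except those that are immediately followed (in date_time order) by a transaction of another type ('A') for the same account.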

So, for the example above, the query should return the following rows:

id | account | type | date_time             | amount
----------------------------------------------------
 1 | 001     | 'R'  | '2012-01-01 10:01:00' | 1000
 5 | 001     | 'R'  | '2012-01-04 12:30:01' | 1000
 7 | 003     | 'R'  | '2012-01-04 15:13:10' | 3000

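(As you can see, row 2 is not shown because row 3 'cancels' it, and likewise row 4 is 'cancelled' by row 6. Row 7 does appear, even though it belongs to account 003 like the cancelled row 2, because this time no 'A' row cancels it. Row 8 does not appear: it is also for account 003 and is cancelled by row 9. Row 9 does not cancel row 7, only the row immediately before it, row 8.)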
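The rule described above can be restated as: keep an 'R' row unless the very next row of the same account, in date_time order, is an 'A'. The following is a minimal plain-Python sketch of that rule over the sample rows, meant only as a reference for checking SQL results. The function name `uncancelled` is hypothetical and not part of the question.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (id, account, type, date_time, amount)
ROWS = [
    (1, "001", "R", "2012-01-01 10:01:00", 1000),
    (2, "003", "R", "2012-01-02 12:53:10", 1500),
    (3, "003", "A", "2012-01-03 13:10:01", -1500),
    (4, "002", "R", "2012-01-03 17:56:00", 2000),
    (5, "001", "R", "2012-01-04 12:30:01", 1000),
    (6, "002", "A", "2012-01-04 13:23:01", -2000),
    (7, "003", "R", "2012-01-04 15:13:10", 3000),
    (8, "003", "R", "2012-01-05 12:12:00", 1250),
    (9, "003", "A", "2012-01-06 17:24:01", -1250),
]


def uncancelled(rows):
    """Return 'R' rows whose next same-account row is not an 'A'."""
    kept = []
    # Sort by account, then date_time (ISO strings sort chronologically).
    ordered = sorted(rows, key=itemgetter(1, 3))
    for _, group in groupby(ordered, key=itemgetter(1)):
        txs = list(group)
        for i, tx in enumerate(txs):
            nxt = txs[i + 1] if i + 1 < len(txs) else None
            if tx[2] == "R" and (nxt is None or nxt[2] != "A"):
                kept.append(tx)
    # Present the result in date_time order, as in the expected output.
    return sorted(kept, key=itemgetter(3))


kept_ids = [tx[0] for tx in uncancelled(ROWS)]
print(kept_ids)  # [1, 5, 7]
```

This matches the three expected rows (ids 1, 5 and 7) listed above.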

I have tried joins and subqueries in the WHERE clause, but I'm really not sure how to write the query.

What I have tried:

Attempt with a join:

   SELECT trans.type as type,
          trans.amount as amount,
          trans.date_time as dt,
          trans.account as acct,
     FROM Transactions trans
INNER JOIN ( SELECT t.type AS type, t.acct AS acct, t.date_time AS date_time
               FROM Transactions t
              WHERE t.date_time > trans.date_time
           ORDER BY t.date_time DESC
          ) AS nextTrans
       ON nextTrans.acct = trans.acct
    WHERE trans.type IN ('R')
      AND nextTrans.type NOT IN ('A')
 ORDER BY DATE(trans.date_time) ASC

This throws an error, because in MySQL I cannot reference values from the outer query inside the joined subquery.

Attempt with a subquery in the WHERE clause:

   SELECT trans.type as type,
          trans.amount as amount,
          trans.date_time as dt,
          trans.account as acct,
     FROM Transactions trans
    WHERE trans.type IN ('R')
      AND trans.datetime <
          ( SELECT t.date_time AS date_time
               FROM Transactions t
              WHERE t.account = trans.account
           ORDER BY t.date_time DESC
          ) AS nextTrans
       ON nextTrans.acct = trans.acct

 ORDER BY DATE(trans.date_time) ASC

This is wrong: I can reference outer values inside the WHERE clause in MySQL, but I can't find a way to filter correctly for what I need.
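One way the WHERE-clause attempt might be completed is with a correlated scalar subquery that fetches the type of the next row for the same account. The sketch below is an assumption-laden illustration, not the asker's final query: it runs the SQL against an in-memory SQLite database through Python's `sqlite3` module as a stand-in for MySQL, and it treats "no next row" as not cancelled via `COALESCE`.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, account TEXT, type TEXT, "
    "date_time TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
    [
        (1, "001", "R", "2012-01-01 10:01:00", 1000),
        (2, "003", "R", "2012-01-02 12:53:10", 1500),
        (3, "003", "A", "2012-01-03 13:10:01", -1500),
        (4, "002", "R", "2012-01-03 17:56:00", 2000),
        (5, "001", "R", "2012-01-04 12:30:01", 1000),
        (6, "002", "A", "2012-01-04 13:23:01", -2000),
        (7, "003", "R", "2012-01-04 15:13:10", 3000),
        (8, "003", "R", "2012-01-05 12:12:00", 1250),
        (9, "003", "A", "2012-01-06 17:24:01", -1250),
    ],
)

# For each 'R' row, the scalar subquery returns the type of the next row
# of the same account; COALESCE maps "no next row" to 'R' (not cancelled).
QUERY = """
SELECT t.id
  FROM transactions t
 WHERE t.type = 'R'
   AND COALESCE((SELECT n.type
                   FROM transactions n
                  WHERE n.account = t.account
                    AND n.date_time > t.date_time
                  ORDER BY n.date_time ASC
                  LIMIT 1), 'R') <> 'A'
 ORDER BY t.date_time ASC
"""

result_ids = [row[0] for row in conn.execute(QUERY)]
print(result_ids)  # [1, 5, 7]
```

The same SQL text should be accepted by MySQL as well, since `LIMIT` is allowed in a scalar subquery there, but that has only been checked here against SQLite.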

Important edit:

I managed to implement a solution, but it now needs serious optimization. Here it is:

SELECT *
  FROM (SELECT t1.*, tFlagged.id AS cancId, tFlagged.type AS cancFlag
          FROM transactions t1
     LEFT JOIN (SELECT t2.*
                  FROM transactions t2
              ORDER BY t2.date_time ASC ) tFlagged
            ON (t1.account=tFlagged.account
                  AND
                t1.date_time < tFlagged.date_time)
         WHERE t1.type = 'R'
      GROUP BY t1.id) tCanc
 WHERE tCanc.cancFlag IS NULL
    OR tCanc.cancFlag <> 'A'

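I join the table with itself, considering only rows for the same account with a greater date_time. The join is ordered by date_time. By grouping by id I manage to get only the first result of the join, which happens to be the next transaction for the same account.

Then, in the outer select, I filter out the rows that have an 'A', since that means the next transaction was effectively a cancellation of it. In other words, if there is no next transaction for the same account, or the next transaction is an 'R', then the row is not cancelled and must be shown in the result.

I got this: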

+----+---------+------+---------------------+--------+--------+----------+
| id | account | type | date_time           | amount | cancId | cancFlag |
+----+---------+------+---------------------+--------+--------+----------+
|  1 | 001     |   R  | 2012-01-01 10:01:00 |   1000 |      5 | R        |
|  5 | 001     |   R  | 2012-01-04 12:30:01 |   1000 |   NULL | NULL     |
|  7 | 003     |   R  | 2012-01-04 15:13:10 |   3000 |      8 | R        |
+----+---------+------+---------------------+--------+--------+----------+

It relates each transaction to the next one for the same account and then filters out the ones that have been cancelled. Success!!

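As I said, the problem now is optimization. My real data has a lot of rows (as is expected of a table holding transactions over time), and for a table of about 10,000 rows right now, this query took 1 min 44 s to return a correct result. I suppose that is down to the joins... (For those who know the protocol here: what should I do? Post a new question and post this as a solution to this one? Or just wait here for more answers?)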
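For the optimization problem described above, one commonly suggested alternative to the self-join is the `LEAD` window function, which reads the next row of each account in a single pass. This is a hedged sketch, not a measured fix: window functions require MySQL 8.0 or later, the composite index (with the hypothetical name `idx_account_date`) is an assumption about what may help, and the demo runs on SQLite (3.25 or later) through Python's `sqlite3` module rather than on MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, account TEXT, type TEXT, "
    "date_time TEXT, amount INTEGER)"
)
# Assumed helper index: lets the engine walk each account in date order.
conn.execute(
    "CREATE INDEX idx_account_date ON transactions (account, date_time)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
    [
        (1, "001", "R", "2012-01-01 10:01:00", 1000),
        (2, "003", "R", "2012-01-02 12:53:10", 1500),
        (3, "003", "A", "2012-01-03 13:10:01", -1500),
        (4, "002", "R", "2012-01-03 17:56:00", 2000),
        (5, "001", "R", "2012-01-04 12:30:01", 1000),
        (6, "002", "A", "2012-01-04 13:23:01", -2000),
        (7, "003", "R", "2012-01-04 15:13:10", 3000),
        (8, "003", "R", "2012-01-05 12:12:00", 1250),
        (9, "003", "A", "2012-01-06 17:24:01", -1250),
    ],
)

# LEAD(type) yields the type of the next row within the same account,
# or NULL when the row is the last one for that account.
QUERY = """
SELECT id
  FROM (SELECT id, type, date_time,
               LEAD(type) OVER (PARTITION BY account
                                ORDER BY date_time) AS next_type
          FROM transactions) AS x
 WHERE type = 'R'
   AND (next_type IS NULL OR next_type <> 'A')
 ORDER BY date_time
"""

lead_ids = [row[0] for row in conn.execute(QUERY)]
print(lead_ids)  # [1, 5, 7]
```

Whether this is actually faster on the real 10,000-row table would need to be measured there.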

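Here is a solution based on nested subqueries. First, I added a few more rows to catch more cases. For example, transaction 10 should not be cancelled by transaction 12, because transaction 11 comes in between.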

> select * from transactions order by date_time;
+----+---------+------+---------------------+--------+
| id | account | type | date_time           | amount |
+----+---------+------+---------------------+--------+
|  1 |       1 | R    | 2012-01-01 10:01:00 |   1000 |
|  2 |       3 | R    | 2012-01-02 12:53:10 |   1500 |
|  3 |       3 | A    | 2012-01-03 13:10:01 |  -1500 |
|  4 |       2 | R    | 2012-01-03 17:56:00 |   2000 |
|  5 |       1 | R    | 2012-01-04 12:30:01 |   1000 |
|  6 |       2 | A    | 2012-01-04 13:23:01 |  -2000 |
|  7 |       3 | R    | 2012-01-04 15:13:10 |   3000 |
|  8 |       3 | R    | 2012-01-05 12:12:00 |   1250 |
|  9 |       3 | A    | 2012-01-06 17:24:01 |  -1250 |
| 10 |       3 | R    | 2012-01-07 00:00:00 |   1250 |
| 11 |       3 | R    | 2012-01-07 05:00:00 |   4000 |
| 12 |       3 | A    | 2012-01-08 00:00:00 |  -1250 |
| 14 |       2 | R    | 2012-01-09 00:00:00 |   2000 |
| 13 |       3 | A    | 2012-01-10 00:00:00 |  -1500 |
| 15 |       2 | A    | 2012-01-11 04:00:00 |  -2000 |
| 16 |       2 | R    | 2012-01-12 00:00:00 |   5000 |
+----+---------+------+---------------------+--------+
16 rows in set (0.00 sec)

First, create a query that gets, for each transaction, "the date of the most recent transaction before it in the same account":

SELECT t2.*,
       MAX(t1.date_time) AS prev_date
FROM transactions t1
JOIN transactions t2
ON (t1.account = t2.account
   AND t2.date_time > t1.date_time)
GROUP BY t2.account,t2.date_time
ORDER BY t2.date_time;

+----+---------+------+---------------------+--------+---------------------+
| id | account | type | date_time           | amount | prev_date           |
+----+---------+------+---------------------+--------+---------------------+
|  3 |       3 | A    | 2012-01-03 13:10:01 |  -1500 | 2012-01-02 12:53:10 |
|  5 |       1 | R    | 2012-01-04 12:30:01 |   1000 | 2012-01-01 10:01:00 |
|  6 |       2 | A    | 2012-01-04 13:23:01 |  -2000 | 2012-01-03 17:56:00 |
|  7 |       3 | R    | 2012-01-04 15:13:10 |   3000 | 2012-01-03 13:10:01 |
|  8 |       3 | R    | 2012-01-05 12:12:00 |   1250 | 2012-01-04 15:13:10 |
|  9 |       3 | A    | 2012-01-06 17:24:01 |  -1250 | 2012-01-05 12:12:00 |
| 10 |       3 | R    | 2012-01-07 00:00:00 |   1250 | 2012-01-06 17:24:01 |
| 11 |       3 | R    | 2012-01-07 05:00:00 |   4000 | 2012-01-07 00:00:00 |
| 12 |       3 | A    | 2012-01-08 00:00:00 |  -1250 | 2012-01-07 05:00:00 |
| 14 |       2 | R    | 2012-01-09 00:00:00 |   2000 | 2012-01-04 13:23:01 |
| 13 |       3 | A    | 2012-01-10 00:00:00 |  -1500 | 2012-01-08 00:00:00 |
| 15 |       2 | A    | 2012-01-11 04:00:00 |  -2000 | 2012-01-09 00:00:00 |
| 16 |       2 | R    | 2012-01-12 00:00:00 |   5000 | 2012-01-11 04:00:00 |
+----+---------+------+---------------------+--------+---------------------+
13 rows in set (0.00 sec)

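Use that as a subquery to get each transaction and its predecessor onto the same row. Use some filtering to pull out the transactions we are interested in, namely 'A' transactions whose predecessors are 'R' transactions that they exactly cancel out: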

SELECT
  t3.*,transactions.*
FROM
  transactions
  JOIN
  (SELECT t2.*,
          MAX(t1.date_time) AS prev_date
   FROM transactions t1
   JOIN transactions t2
   ON (t1.account = t2.account
      AND t2.date_time > t1.date_time)
   GROUP BY t2.account,t2.date_time) t3
  ON t3.account = transactions.account
     AND t3.prev_date = transactions.date_time
     AND t3.type='A'
     AND transactions.type='R'
     AND t3.amount + transactions.amount = 0
  ORDER BY t3.date_time;


+----+---------+------+---------------------+--------+---------------------+----+---------+------+---------------------+--------+
| id | account | type | date_time           | amount | prev_date           | id | account | type | date_time           | amount |
+----+---------+------+---------------------+--------+---------------------+----+---------+------+---------------------+--------+
|  3 |       3 | A    | 2012-01-03 13:10:01 |  -1500 | 2012-01-02 12:53:10 |  2 |       3 | R    | 2012-01-02 12:53:10 |   1500 |
|  6 |       2 | A    | 2012-01-04 13:23:01 |  -2000 | 2012-01-03 17:56:00 |  4 |       2 | R    | 2012-01-03 17:56:00 |   2000 |
|  9 |       3 | A    | 2012-01-06 17:24:01 |  -1250 | 2012-01-05 12:12:00 |  8 |       3 | R    | 2012-01-05 12:12:00 |   1250 |
| 15 |       2 | A    | 2012-01-11 04:00:00 |  -2000 | 2012-01-09 00:00:00 | 14 |       2 | R    | 2012-01-09 00:00:00 |   2000 |
+----+---------+------+---------------------+--------+---------------------+----+---------+------+---------------------+--------+
4 rows in set (0.00 sec)

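From the result above it is apparent that we are almost there: we have identified the unwanted transactions. Using a LEFT JOIN we can filter these out of the whole transaction set: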

SELECT
  transactions.*
FROM
  transactions
LEFT JOIN
  (SELECT
     transactions.id
   FROM
     transactions
     JOIN
     (SELECT t2.*,
             MAX(t1.date_time) AS prev_date
      FROM transactions t1
      JOIN transactions t2
      ON (t1.account = t2.account
         AND t2.date_time > t1.date_time)
      GROUP BY t2.account,t2.date_time) t3
     ON t3.account = transactions.account
        AND t3.prev_date = transactions.date_time
        AND t3.type='A'
        AND transactions.type='R'
        AND t3.amount + transactions.amount = 0) t4
  USING(id)
  WHERE t4.id IS NULL
    AND transactions.type = 'R'
  ORDER BY transactions.date_time;

+----+---------+------+---------------------+--------+
| id | account | type | date_time           | amount |
+----+---------+------+---------------------+--------+
|  1 |       1 | R    | 2012-01-01 10:01:00 |   1000 |
|  5 |       1 | R    | 2012-01-04 12:30:01 |   1000 |
|  7 |       3 | R    | 2012-01-04 15:13:10 |   3000 |
| 10 |       3 | R    | 2012-01-07 00:00:00 |   1250 |
| 11 |       3 | R    | 2012-01-07 05:00:00 |   4000 |
| 16 |       2 | R    | 2012-01-12 00:00:00 |   5000 |
+----+---------+------+---------------------+--------+
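The answer above uses a stricter rule than the question: an 'R' row is dropped only when the immediately following row of the same account is an 'A' whose amount exactly offsets it. As a cross-check of the six-row result, here is a sketch of the same rule written with nested `NOT EXISTS` instead of the `MAX`/`GROUP BY` subquery. It is an alternative formulation, not the answerer's query, and it runs on SQLite through Python's `sqlite3` module as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, account INTEGER, type TEXT, "
    "date_time TEXT, amount INTEGER)"
)
# The extended 16-row data set from the answer.
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "R", "2012-01-01 10:01:00", 1000),
        (2, 3, "R", "2012-01-02 12:53:10", 1500),
        (3, 3, "A", "2012-01-03 13:10:01", -1500),
        (4, 2, "R", "2012-01-03 17:56:00", 2000),
        (5, 1, "R", "2012-01-04 12:30:01", 1000),
        (6, 2, "A", "2012-01-04 13:23:01", -2000),
        (7, 3, "R", "2012-01-04 15:13:10", 3000),
        (8, 3, "R", "2012-01-05 12:12:00", 1250),
        (9, 3, "A", "2012-01-06 17:24:01", -1250),
        (10, 3, "R", "2012-01-07 00:00:00", 1250),
        (11, 3, "R", "2012-01-07 05:00:00", 4000),
        (12, 3, "A", "2012-01-08 00:00:00", -1250),
        (14, 2, "R", "2012-01-09 00:00:00", 2000),
        (13, 3, "A", "2012-01-10 00:00:00", -1500),
        (15, 2, "A", "2012-01-11 04:00:00", -2000),
        (16, 2, "R", "2012-01-12 00:00:00", 5000),
    ],
)

# Drop an 'R' row only if a later 'A' row of the same account offsets its
# amount AND no other row of that account lies between the two.
QUERY = """
SELECT r.id
  FROM transactions r
 WHERE r.type = 'R'
   AND NOT EXISTS (
         SELECT 1
           FROM transactions a
          WHERE a.account = r.account
            AND a.type = 'A'
            AND a.date_time > r.date_time
            AND a.amount + r.amount = 0
            AND NOT EXISTS (
                  SELECT 1
                    FROM transactions m
                   WHERE m.account = r.account
                     AND m.date_time > r.date_time
                     AND m.date_time < a.date_time))
 ORDER BY r.date_time
"""

strict_ids = [row[0] for row in conn.execute(QUERY)]
print(strict_ids)  # [1, 5, 7, 10, 11, 16]
```

Transaction 10 survives because transaction 11 sits between it and transaction 12, and transaction 11 survives because the amounts of 11 and 12 do not sum to zero.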

(Edit 2) Try this:

SELECT trans.tp        AS type,
       trans.id        AS id,
       trans.amount    AS amount,
       trans.date_time AS dt,
       trans.account   AS acct
  FROM Transactions trans
 WHERE trans.tp = 'R'
   AND trans.account NOT IN (SELECT t.account AS acct
                               FROM Transactions t
                              WHERE t.date_time > trans.date_time
                                AND t.tp = 'A'
                                AND t.amount = (trans.amount) - ((trans.amount) * 2)
                              ORDER BY t.date_time DESC);

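I tried this on MSSQL. Please check the logic and try it on MySQL. I assume the logic is that a new transaction is made after the first transaction has been cancelled. In your illustration, id = 7 is created after id = 3 cancels.

I have checked this on MSSQL: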

create table Transactions(id int,account varchar(5),  tp char(1),date_time datetime,amount int)

insert into Transactions values (1,'001','R','2012-01-01 10:01:00',1000)
insert into Transactions values (2,'003','R','2012-01-02 12:53:10',1500)
insert into Transactions values (3,'003','A','2012-01-03 13:10:01',-1500)
insert into Transactions values (4,'002','R','2012-01-03 17:56:00',2000)
insert into Transactions values (5,'001','R','2012-01-04 12:30:01',1000)
insert into Transactions values (6,'002','A','2012-01-04 13:23:01',-2000)
insert into Transactions values (7,'003','R','2012-01-04 15:13:10',3000)


select t.id, t.account, t.date_time, t.amount
from Transactions t
where t.tp = 'R'
and not exists
(
    select account, date_time
    from Transactions
    where tp = 'A'
    and account = t.account
    and t.date_time < date_time
)
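A quick check of the `NOT EXISTS` query above may be useful, because it excludes an 'R' row when any later 'A' row exists for the account, not only when the next row is an 'A'. The sketch below runs the query on the seven rows inserted above and then on all nine rows from the question. It uses SQLite through Python's `sqlite3` module instead of MSSQL, and the helper name `surviving_ids` is hypothetical.

```python
import sqlite3

# The nine sample rows from the question, with the answer's column name tp.
ROWS = [
    (1, "001", "R", "2012-01-01 10:01:00", 1000),
    (2, "003", "R", "2012-01-02 12:53:10", 1500),
    (3, "003", "A", "2012-01-03 13:10:01", -1500),
    (4, "002", "R", "2012-01-03 17:56:00", 2000),
    (5, "001", "R", "2012-01-04 12:30:01", 1000),
    (6, "002", "A", "2012-01-04 13:23:01", -2000),
    (7, "003", "R", "2012-01-04 15:13:10", 3000),
    (8, "003", "R", "2012-01-05 12:12:00", 1250),
    (9, "003", "A", "2012-01-06 17:24:01", -1250),
]

# Same logic as the query above, with an explicit alias on the inner table.
QUERY = """
SELECT t.id
  FROM Transactions t
 WHERE t.tp = 'R'
   AND NOT EXISTS (SELECT 1
                     FROM Transactions x
                    WHERE x.tp = 'A'
                      AND x.account = t.account
                      AND t.date_time < x.date_time)
 ORDER BY t.date_time
"""


def surviving_ids(rows):
    """Load rows into a fresh in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Transactions ("
        "id INTEGER, account TEXT, tp TEXT, date_time TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO Transactions VALUES (?, ?, ?, ?, ?)", rows)
    ids = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return ids


first_seven = surviving_ids(ROWS[:7])
all_nine = surviving_ids(ROWS)
print(first_seven)  # [1, 5, 7]
print(all_nine)     # [1, 5]
```

With all nine rows, id 7 is dropped because the later 'A' row with id 9 exists for account 003, so this query matches the expected output only for the seven-row subset.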

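If the IDs really do correspond to each row's position when ordered by date_time, as they do in the example (and if they do not, such an ID field can be created), you can do this: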

SELECT t1.*
FROM transactions t1 JOIN transactions t2 ON(t2.id = t1.id + 1)
WHERE t1.type = 'R'
  AND NOT((t2.type = 'A') AND ((t1.amount + t2.amount) = 0))

That is, use the ID field to put each row and its successor in the same result row, then filter for the desired properties.

To improve your query, try:

SELECT t1.*, tFlagged.id AS cancId, tFlagged.tp AS cancFlag FROM t t1
LEFT JOIN t tFlagged
ON t1.account = tFlagged.account AND t1.date_time < tFlagged.date_time
WHERE t1.tp = 'R' 
GROUP BY t1.id
HAVING tFlagged.tp is null or tFlagged.tp <> 'A'

It runs much faster... and hopefully gives the same results :P
