
How can I get an accurate count based on max date when joining 3 tables in Oracle, when one of the join fields is many-to-one?

So, I have 3 tables that I am attempting to get counts for, based on a groupid and a task code. I am having a few issues because some of the relationships are many-to-one, which I think is inflating my counts. I will list my 3 tables with the pertinent attributes.

task_table contains:

task_code - I would like to get the count of each one within a group id, using the latest instance based on event date

sol_id - used to join to worktable; many sol_id to one m_id is possible

edate - needed to pick out a single record

cur_id - filtered with cur_id = 1 in the where clause

worktable contains:

sol_id - used to join to task_table

m_id - used to join to grouptable

grouptable contains:

m_id

groupid - used to group the task_code to get the count

I'd like the end result to look like:

group_id    task_count  task
5555        45          A
5555        4           N
5624        67          A
5624        23          O
5624        42          X

I have been running a number of queries, but the counts I am getting back do not look correct. I am concerned that the query is returning more than one instance of the m_id. Here is the query in question:

select c.groupid,
       count(c.groupid) group_count,
       a.task_code
from task_table a
join worktable b on a.sol_id = b.sol_id
join grouptable c on b.m_id = c.m_id
where a.cur_id = 1
  and a.task_code is not null
group by c.groupid, a.task_code;

If I add 'edate = (select max(edate) from task_table)' to the where clause, it returns an empty table.

I am unsure how to incorporate edate to get only the newest record that fits the criteria in the where clause. The reason I think I want this is that there could be more than one sol_id associated with an m_id, so I'd like to include only the newest record with cur_id = 1 in the count. Thank you for your time.

sample data

task_table

task_code  sol_id  edate   cur_id
A          23      6/7/09    1
A          24      6/4/09    1
A          23      6/10/09   0
B          45      6/2/09    1
B          42      6/3/09    1
C          34      10/8/10   0
C          83      9/10/09   1   

work table

sol_id    m_id
23        1234
24        1234
45        1832
42        1343
83        7623

group table

m_id  group_id
1234   A76
1832   Y23
1343   A76
7623   Y23

Looking at these tables, the result should look like the following:

group_id    task_count  task
A76         2           A
Y23         1           C

(A76 should only count sol_id 23 and 42; Y23 should only count sol_id 83.)
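
To reproduce the problem, the sample data above can be loaded with a script along these lines. This is only a sketch: the column types are assumptions, the dates are read as month/day/year, and the table and column names follow the sample tables written with underscores (work_table, group_table, group_id) rather than worktable, grouptable and groupid.

```sql
-- Sketch only: column types are assumed, dates are read as month/day/year.
create table task_table (
  task_code varchar2(10),
  sol_id    number,
  edate     date,
  cur_id    number
);
create table work_table  (sol_id number, m_id number);
create table group_table (m_id number, group_id varchar2(10));

insert into task_table values ('A', 23, date '2009-06-07', 1);
insert into task_table values ('A', 24, date '2009-06-04', 1);
insert into task_table values ('A', 23, date '2009-06-10', 0);
insert into task_table values ('B', 45, date '2009-06-02', 1);
insert into task_table values ('B', 42, date '2009-06-03', 1);
insert into task_table values ('C', 34, date '2010-10-08', 0);
insert into task_table values ('C', 83, date '2009-09-10', 1);

insert into work_table values (23, 1234);
insert into work_table values (24, 1234);
insert into work_table values (45, 1832);
insert into work_table values (42, 1343);
insert into work_table values (83, 7623);

insert into group_table values (1234, 'A76');
insert into group_table values (1832, 'Y23');
insert into group_table values (1343, 'A76');
insert into group_table values (7623, 'Y23');

commit;
```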

So, there's a conflict in your requested result. According to your own sample, A76 should have a task_count of 2: sol_id 23, which has task A, and sol_id 42, which has task B. A query can't return a row like the one in your example result table (a count of 2 labelled with the single task A), because showing the task means grouping by TASK_CODE, and that splits the count of 2 into one row per task code. You can't have it both ways.

To keep only the most recent edate, I did a separate calculation to locate max(edate) per task_code, then joined that back to task_table to obtain the sol_id. If this isn't accurate for your data set, you'll need to determine another way of obtaining max(edate). This works for your sample set.

-- latest edate for each task_code, current records only
with recentTasks as (
  select task_code, max(edate) as recentDate
  from task_table
  where cur_id = 1
    and task_code is not null
  group by task_code
),
-- join back to task_table to pick up the sol_id of each latest record
recentTaskWithSols as (
  select m.task_code, m.recentDate as edate, t.sol_id
  from recentTasks m
  join task_table t on m.task_code = t.task_code and m.recentDate = t.edate
  where t.cur_id = 1
)
-- count the remaining sol_ids per group
select c.group_id,
       count(a.sol_id) as task_count
from group_table c
join work_table b on c.m_id = b.m_id
join recentTaskWithSols a on b.sol_id = a.sol_id
group by c.group_id;

This gives the result:

+-----------+------------+
| GROUP_ID  | TASK_COUNT |
+-----------+------------+
|    A76    |     2      |
|    Y23    |     1      |
+-----------+------------+
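
The same "latest edate per task_code" rule can also be written with an analytic function instead of joining the aggregate back to task_table. The following is a sketch, not a tested replacement: the CTE name ranked_tasks and the column rnk are made up for illustration. RANK() is used so that ties on edate are kept, as they are with the max(edate) join.

```sql
-- Sketch: rank each current task record within its task_code, newest first.
-- ranked_tasks and rnk are illustrative names, not from the original post.
with ranked_tasks as (
  select t.task_code,
         t.sol_id,
         rank() over (partition by t.task_code
                      order by t.edate desc) as rnk
  from task_table t
  where t.cur_id = 1
    and t.task_code is not null
)
-- keep only the newest record(s) per task_code, then count per group
select c.group_id,
       count(a.sol_id) as task_count
from group_table c
join work_table b on c.m_id = b.m_id
join ranked_tasks a on b.sol_id = a.sol_id
where a.rnk = 1
group by c.group_id;
```

On the sample set this should give the same two rows: A76 with 2 and Y23 with 1. If exactly one record per task_code is required even when dates tie, ROW_NUMBER() with an extra tie-breaking column in the order by would be the usual substitute.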

Demo here.
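
If a per-task breakdown is wanted, task_code can be added to both the select list and the group by, at the cost of getting one row per group and task rather than one total per group. The sketch below does that with a correlated subquery. It also suggests a likely explanation for the empty result in the question: an uncorrelated select max(edate) from task_table returns the single latest date in the whole table (10/8/10 in the sample, on a row with cur_id = 0), so no row with cur_id = 1 can match it. The aliases t and x are illustrative.

```sql
-- Sketch: per-group, per-task counts using only the newest current record
-- of each task_code. The subquery is correlated on task_code, so each task
-- is compared against its own latest edate, not the table-wide maximum.
select c.group_id,
       count(t.sol_id) as task_count,
       t.task_code     as task
from task_table t
join work_table b on t.sol_id = b.sol_id
join group_table c on b.m_id = c.m_id
where t.cur_id = 1
  and t.task_code is not null
  and t.edate = (select max(x.edate)
                 from task_table x
                 where x.task_code = t.task_code
                   and x.cur_id = 1)
group by c.group_id, t.task_code
order by c.group_id, t.task_code;
```

On the sample set this should return three rows: A76 with 1 for task A, A76 with 1 for task B, and Y23 with 1 for task C.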
