
How do I self join this table multiple times?

I have a table that lists the following columns:

employee ssn, dependent ssn, last name, first name, relationship, benefit amount

In the relationship column, it will show either employee, child, or spouse.

What I need is the following:

employee ssn, last name, first name, ee benefit, child benefit, spouse benefit.

I need to join the benefit amount column where the employee ssn matches up, but the problem is whenever I do this, it gives me multiples of the same employee, and sometimes the child and/or spouse benefit isn't correct.

Here's what I've written:

SELECT a.[employee ssn], a.[last name], a.[first name], a.[benefit amount], b.[benefit amount] AS "child benefit", c.[benefit amount] AS "spouse benefit"
FROM allenrollments a
JOIN allenrollments b ON b.[employee ssn] = a.[employee ssn]
JOIN allenrollments c ON c.[employee ssn] = a.[employee ssn]
WHERE a.relationship = "Employee" AND b.relationship = "Child" AND c.relationship = "Spouse"

You are running into a common problem: an unintended cross join, or Cartesian product. Both joins match only on the employee SSN, so every child row found for an employee is also combined with every spouse row for that same employee. An employee with three children and one spouse therefore comes back three times.
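To make the multiplication concrete, here is a minimal sketch in T-SQL. The temp table `#allenrollments`, the column types, and the sample rows are all hypothetical, made up only to mirror the columns described in the question.

```sql
-- Hypothetical sample data: one employee, one spouse, two children.
CREATE TABLE #allenrollments (
    [employee ssn]   char(9),
    [dependent ssn]  char(9),
    [last name]      varchar(50),
    [first name]     varchar(50),
    relationship     varchar(10),
    [benefit amount] decimal(10, 2)
);

INSERT INTO #allenrollments VALUES
    ('111111111', NULL,        'Doe', 'Pat',  'Employee', 100.00),
    ('111111111', '222222222', 'Doe', 'Sam',  'Spouse',    50.00),
    ('111111111', '333333333', 'Doe', 'Alex', 'Child',     25.00),
    ('111111111', '444444444', 'Doe', 'Kim',  'Child',     25.00);

-- Same join shape as the question: both joins match on [employee ssn] only.
SELECT a.[first name] AS employee,
       b.[first name] AS child,
       c.[first name] AS spouse
FROM #allenrollments a
JOIN #allenrollments b ON b.[employee ssn] = a.[employee ssn]
JOIN #allenrollments c ON c.[employee ssn] = a.[employee ssn]
WHERE a.relationship = 'Employee'
  AND b.relationship = 'Child'
  AND c.relationship = 'Spouse';
-- Returns 2 rows for the single employee (1 employee x 2 children x 1 spouse).

DROP TABLE #allenrollments;
```

Because these are inner joins, an employee with no spouse row or no child row would drop out of the result entirely, which may be part of why some results look wrong.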

Note that many systems avoid using the SSN as a primary key, for several reasons. They typically use an auto-increment number as the key and use the SSN only as the basis of a lookup to the details. Otherwise the SSN ends up spread throughout your system, and that exposure is bad if the system is ever hacked.

With that said, on to what you appear to have. You are trying to get data for a spouse and also for a child. But what if a family insurance benefit plan covers 5 children? What is your final goal? You might be better off getting all rows, with a column indicating who each row is associated with. You can then run whatever count/sum coverage checks you need once you have the data for the entire family plan.

Also, having spaces in column names is not a good thing. You will always have unbalanced [brackets] to chase down, and you will have to contend with mapping query results onto a class structure, such as when getting a list of things. Having said all that, I would recommend trying:

SELECT 
      emp.[employee ssn] SSN, 
      emp.[last name] EmpLastName, 
      emp.[first name] EmpFirstName, 
      emp.[benefit amount] EmpBenefitAmount,     
      fam.[last name] FamilyMemberLastName, 
      fam.[first name] FamilyMemberFirstName, 
      fam.relationship FamilyRelationship,
      -- put each family member's benefit in the column matching the relationship
      case when fam.relationship = 'Spouse'
           then fam.[benefit amount]
           else 0 end SpouseBenefit,
      case when fam.relationship = 'Spouse'
           then 0
           else fam.[benefit amount] end ChildBenefit
   FROM 
      allenrollments emp
         JOIN allenrollments fam
            on emp.[employee ssn] = fam.[employee ssn]
           -- keep only the non-employee rows on the family side
           AND NOT fam.relationship = 'Employee'
   where
      emp.relationship = 'Employee'

So the first instance of the table (emp) is restricted to the employee who is covered. Joining back to the same table by the SSN then finds all OTHER family members. I don't think you even need to go to the level of separate SpouseBenefit vs ChildBenefit columns. It could just be the benefit amount AS FamilyMemberBenefitAmount.
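If one row per employee is the end goal, as in the question's desired output, the per-row CASE columns can be summed with a GROUP BY. This is a sketch of conditional aggregation, not part of the answer above. The inline `VALUES` rows are hypothetical sample data that stand in for the real `allenrollments` table, and summing all children into a single ChildBenefit figure is an assumption about what is wanted.

```sql
-- Hypothetical rows standing in for the real allenrollments table.
WITH allenrollments AS (
    SELECT *
    FROM (VALUES
        ('111111111', 'Doe', 'Pat',  'Employee', 100.00),
        ('111111111', 'Doe', 'Sam',  'Spouse',    50.00),
        ('111111111', 'Doe', 'Alex', 'Child',     25.00),
        ('111111111', 'Doe', 'Kim',  'Child',     25.00)
    ) AS v([employee ssn], [last name], [first name], relationship, [benefit amount])
)
SELECT
    emp.[employee ssn],
    emp.[last name],
    emp.[first name],
    emp.[benefit amount] AS EmpBenefitAmount,
    -- conditional aggregation: each SUM only counts rows of one relationship
    SUM(CASE WHEN fam.relationship = 'Spouse'
             THEN fam.[benefit amount] ELSE 0 END) AS SpouseBenefit,
    SUM(CASE WHEN fam.relationship = 'Child'
             THEN fam.[benefit amount] ELSE 0 END) AS ChildBenefit
FROM allenrollments emp
-- LEFT JOIN keeps employees who have no dependents at all
LEFT JOIN allenrollments fam
       ON fam.[employee ssn] = emp.[employee ssn]
      AND fam.relationship <> 'Employee'
WHERE emp.relationship = 'Employee'
GROUP BY emp.[employee ssn], emp.[last name], emp.[first name], emp.[benefit amount];
-- One row for the sample employee: SpouseBenefit 50.00, ChildBenefit 50.00.
```

The LEFT JOIN is a deliberate choice here: with an inner join, an employee who has neither spouse nor children would disappear from the result.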

I'm assuming there's some underlying data issue and that is why we're self joining, in which case:

WITH dependants as (
SELECT 
 [employee ssn],
 [benefit amount],
 RANK() 
  OVER (
   PARTITION BY [employee ssn] 
   ORDER BY [benefit amount] DESC
  ) as unique_key
FROM allenrollments
WHERE 
 [relationship] IN('Child', 'Spouse')
)

SELECT 
 emp.[employee ssn],
 emp.[first name],
 emp.[last name],
 emp.[benefit amount] as EmployeeBenefit,
 dep.[benefit amount] as DependantBenefit
FROM allenrollments emp
JOIN 
 dependants dep on emp.[employee ssn] = dep.[employee ssn]
WHERE 
 dep.unique_key = 1 AND 
 emp.relationship = 'Employee'

Here we are using a CTE to build a query that lists all dependent benefits and their associated employee SSNs. The RANK() function gives us something to filter on, since we would otherwise get multiple records per employee SSN when there are multiple dependents. In the outer query we then filter for the 'Employee' relationship and for rank 1, which keeps the dependent with the highest benefit amount for each employee. Note that RANK() assigns tied values the same rank, so two dependents tied for the highest benefit amount would both come back with rank 1.
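Where ties in the benefit amount are possible, swapping in ROW_NUMBER() guarantees at most one dependent row per employee. The sketch below is a variation on the query above, not the original answer. The inline `VALUES` rows are hypothetical sample data, chosen so that the spouse and child tie at the top amount.

```sql
-- Hypothetical rows standing in for the real allenrollments table;
-- the spouse and the child are tied on [benefit amount].
WITH allenrollments AS (
    SELECT *
    FROM (VALUES
        ('111111111', 'Doe', 'Pat',  'Employee', 100.00),
        ('111111111', 'Doe', 'Sam',  'Spouse',    50.00),
        ('111111111', 'Doe', 'Alex', 'Child',     50.00)
    ) AS v([employee ssn], [last name], [first name], relationship, [benefit amount])
),
dependants AS (
    SELECT
        [employee ssn],
        [benefit amount],
        -- ROW_NUMBER() numbers rows 1, 2, 3, ... even when values tie,
        -- whereas RANK() would give both tied rows the value 1
        ROW_NUMBER() OVER (
            PARTITION BY [employee ssn]
            ORDER BY [benefit amount] DESC
        ) AS unique_key
    FROM allenrollments
    WHERE relationship IN ('Child', 'Spouse')
)
SELECT
    emp.[employee ssn],
    emp.[first name],
    emp.[last name],
    emp.[benefit amount] AS EmployeeBenefit,
    dep.[benefit amount] AS DependantBenefit
FROM allenrollments emp
JOIN dependants dep ON emp.[employee ssn] = dep.[employee ssn]
WHERE dep.unique_key = 1
  AND emp.relationship = 'Employee';
-- Returns a single row for the sample employee, with DependantBenefit 50.00.
```

Which of the tied rows receives number 1 is not defined unless the ORDER BY includes a tiebreaker, for example a dependent SSN column. That does not matter here because only the amount is selected.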
