
Selecting unique rows and ignoring nulls in Oracle SQL

There is a table with four columns: client, city, postcode and street. For each client I want to count the number of unique addresses. Unfortunately, some of the columns city, postcode or street can be null, and nulls have to be ignored when rows are compared for the distinct count. So this can't be solved by a simple GROUP BY with COUNT(DISTINCT ...).

For example, given the rows

'client1', 'city1', 'postcode1', 'street1'
'client1', 'city1', 'postcode1', null
'client1', 'city1', null, 'street1'
'client1', null, null, 'street2'

'client1', 'city2', null, 'street1'
'client1', 'city2', null, 'street2'

For my task, the unique addresses should be (edited):

'client1', 'city1', 'postcode1', 'street1'

'client1', 'city2', null, 'street1'
'client1', 'city2', null, 'street2'

(so the answer is 3 unique addresses for client1), but a standard DISTINCT clause treats all six rows as unique; e.g., the rows

'client1', 'city1', 'postcode1', 'street1'
'client1', 'city1', 'postcode1', null
'client1', 'city1', null, 'street1'

are treated as different, whereas for my task they are not different and I want to count them as 1.

Edit after some comments: If we also had

'client1', null, null, 'street3'

then this would be a unique address (since there are no other addresses with 'street3') and should be counted.
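To make the comparison rule concrete, here is a minimal Python sketch of how I read it; it is not Oracle code, and the helper names `is_covered_by` and `unique_addresses` are made up for illustration. A row is dropped when some other row agrees with it on every non-null field and also fills in at least one of its nulls. The sketch handles a single client's rows as (city, postcode, street) tuples.

```python
from typing import Optional, Tuple

# (city, postcode, street) for one client; None stands for SQL NULL.
Address = Tuple[Optional[str], Optional[str], Optional[str]]

def is_covered_by(a: Address, b: Address) -> bool:
    """True if b matches every non-null field of a and fills at least one null of a."""
    agrees = all(x is None or x == y for x, y in zip(a, b))
    more_specific = any(x is None and y is not None for x, y in zip(a, b))
    return agrees and more_specific

def unique_addresses(rows):
    """Keep only the distinct rows that no other row covers."""
    distinct = set(rows)
    return {a for a in distinct if not any(is_covered_by(a, b) for b in distinct)}

rows = [
    ("city1", "postcode1", "street1"),
    ("city1", "postcode1", None),
    ("city1", None, "street1"),
    (None, None, "street2"),
    ("city2", None, "street1"),
    ("city2", None, "street2"),
]

result = unique_addresses(rows)
print(len(result))  # 3

# The extra row from the edit is not covered by anything, so it is counted.
print(len(unique_addresses(rows + [(None, None, "street3")])))  # 4
```

Note that under this reading (null, null, 'street2') disappears because ('city2', null, 'street2') covers it, which matches the expected result above.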

You can use the MIN analytic function as follows:

Select distinct t.client,
       t.city,
       Coalesce(t.postcode,Min(t.postcode) over (partition by t.client, t.city)) as postcode,
       Coalesce(t.street,Min(t.street) over (partition by t.client, t.city)) as street
  From your_table t
 Where t.city is not null;

-- update --

I can think of a self-join solution; check whether it works for you.

Select distinct a.client,
       Coalesce(a.city, b.city) as city,
       Coalesce(a.postcode, b.postcode) as postcode,
       Coalesce(a.street, b.street) as street
  From your_table a left join your_table b
    On a.client = b.client
   And (a.city = b.city or (a.city is null or b.city is null))
   And (a.postcode = b.postcode or (a.postcode is null or b.postcode is null))
   And (a.street = b.street or (a.street is null or b.street is null))
   And a.rowid <> b.rowid;

I've solved my problem in PL/SQL. The code is quite lengthy, so I will give only an outline of the idea in case someone is interested.

  1. We create a table with distinct tuples. This is our input table.

  2. We insert into our output table the tuples with three non-null values, as they cannot be covered by any other tuple with missing values.

  3. We insert into our output table the tuples with two non-null values, filtering out the ones that are covered by tuples from step 2.

  4. We insert into our output table the tuples with one non-null value, filtering out the ones that are covered by tuples from steps 2 and 3.

I think that there are no counterexamples for this solution.
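The four steps can be sketched in Python as below. This is only an illustration of the outline, not the PL/SQL that was actually written; the function names are hypothetical, and I am assuming that "covered" means a tuple already in the output has the same client and the same value in every non-null position. Rows whose city, postcode and street are all null are not mentioned in the outline, so this sketch simply skips them.

```python
from collections import Counter

def non_null_count(row):
    """Number of non-null values among city, postcode, street."""
    return sum(v is not None for v in row[1:])

def covers(kept, row):
    """True if kept has the same client and matches every non-null field of row."""
    same_client = kept[0] == row[0]
    return same_client and all(r is None or r == k for r, k in zip(row[1:], kept[1:]))

def count_unique_addresses(rows):
    distinct = set(rows)                  # step 1: distinct input tuples
    output = []
    for k in (3, 2, 1):                   # steps 2, 3 and 4, most complete first
        layer = [r for r in distinct if non_null_count(r) == k]
        kept = [r for r in layer if not any(covers(o, r) for o in output)]
        output.extend(kept)
    return Counter(r[0] for r in output)  # unique addresses per client

rows = [
    ("client1", "city1", "postcode1", "street1"),
    ("client1", "city1", "postcode1", None),
    ("client1", "city1", None, "street1"),
    ("client1", None, None, "street2"),
    ("client1", "city2", None, "street1"),
    ("client1", "city2", None, "street2"),
    ("client1", None, None, "street3"),
    ("client2", "city1", None, None),     # hypothetical second client
]

counts = count_unique_addresses(rows)
print(counts["client1"], counts["client2"])  # 4 1
```

Processing the layers from most to least complete is what makes the filter sufficient: two distinct tuples with the same number of non-null values cannot cover each other, so each tuple only needs to be checked against the layers already inserted.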
