Group by not detecting duplicates but there are dupes. Strange SQL Server - Azure Synapse database dedicated SQL pool
I have encountered a strange (until I understand the logical reason) behaviour of group by in a SQL Server database. There are many duplicates in a table: when I query it I get duplicate rows, but when I try to find all dupes using the group by or row_number strategy I get 0 records. When I add "Cast" to the group by / row_number, I get the correct list of duplicates.
The datatype is nvarchar for all 3 keys.
Can someone tell me why this is happening?
Added the query and its output:
select top 10 len(VBELN) len_vblen, len(MANDT) , len(posnr) , * from [SRC_SAP_R3].[LIPS] where VBELN = '6316785926'
select cast(MANDT as nvarchar) as "MANDT",cast(VBELN as nvarchar) as "VBELN" , cast(posnr as nvarchar) as "posnr", count(*) from [SRC_SAP_R3].[LIPS]
group by cast(MANDT as nvarchar),cast(VBELN as nvarchar) , cast(posnr as nvarchar)
having count(*)>1;
select cast(MANDT as varchar) as "MANDT",cast(VBELN as varchar) as "VBELN" , cast(posnr as varchar) as "posnr", count(*) from [SRC_SAP_R3].[LIPS]
group by cast(MANDT as varchar),cast(VBELN as varchar) , cast(posnr as varchar)
having count(*)>1;
select MANDT, VBELN ,posnr, count(1) from [SRC_SAP_R3].[LIPS]
group by MANDT, VBELN ,posnr
having count(1)>1;
I tried to repro this in Azure Synapse Analytics. As @Martin Smith said, the len() function ignores trailing spaces when computing the length of a column value. When I tried the datalength() function instead, the trailing spaces were included in the count. Below is the repro.
create table SAP_TAB (VBELN varchar(100));
-- first value has two trailing spaces, matching the datalength of 5 in the result
insert into SAP_TAB values('500  ');
insert into SAP_TAB values('500');
select VBELN,
       len(VBELN) as [length_VBELN],
       datalength(VBELN) as [data_length_VBELN],
       len(cast(VBELN as varchar(10))) as [length_varchar_casted_VBELN]
from SAP_TAB;
Result
VBELN | length_VBELN | data_length_VBELN | length_varchar_casted_VBELN |
---|---|---|---|
500 | 3 | 5 | 3 |
500 | 3 | 3 | 3 |
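If trailing spaces are what separates the keys, a check along the following lines may help confirm it. This is only a sketch against the SAP_TAB repro table above, not a verified fix for the original table: the column aliases (trailing_bytes, VBELN_trimmed, row_count) are made up for illustration, and it assumes the extra characters are ordinary spaces, which is what RTRIM removes.

```sql
-- 1) List rows whose stored value carries trailing spaces.
--    DATALENGTH counts the stored bytes, trailing spaces included,
--    so it differs from the DATALENGTH of the RTRIM-ed value.
SELECT VBELN,
       LEN(VBELN)        AS len_chars,     -- ignores trailing spaces
       DATALENGTH(VBELN) AS stored_bytes,  -- includes trailing spaces
       DATALENGTH(VBELN) - DATALENGTH(RTRIM(VBELN)) AS trailing_bytes
FROM SAP_TAB
WHERE DATALENGTH(VBELN) <> DATALENGTH(RTRIM(VBELN));

-- 2) Group on the trimmed value so that keys differing only by
--    trailing spaces land in the same group.
SELECT RTRIM(VBELN) AS VBELN_trimmed,
       COUNT(*)     AS row_count
FROM SAP_TAB
GROUP BY RTRIM(VBELN)
HAVING COUNT(*) > 1;
```

The same pattern could be applied to MANDT, VBELN and posnr in [SRC_SAP_R3].[LIPS]. Since those columns are nvarchar, DATALENGTH will typically report two bytes per character there, so compare it with the DATALENGTH of the trimmed value rather than with LEN directly.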