简体   繁体   中英

SQL - How to find a value in a tree level data structure

I have two SQL Server tables:

  • Invoice ( invoice )
  • Invoice Relations ( invoice_relation )

invoice table stores all invoice records with a transaction folio.

在此输入图像描述

invoice_relation table stores any relation between invoices.

在此输入图像描述

This is an example of how invoices can be related between each other:

在此输入图像描述

So the goal is to find the " folio " under invoice table given an invoicenumber and a folio but the folio sometimes won't be the folio that the invoice has, so I need to do a search on all the tree relation in order to find if any invoice match the invoice number but also the folio is part of the relation.

For example I have to find the folio and match invoice number of:

  • Folio: 1003
  • Invoice Number: A1122

In my query I would need to first find by folio because it's my invoice table primary key. Then, will try to match A1122 with D1122 that won't match, so then I have to search all the tree structure to find if there is a A1122 . The result will be that the invoice A1122 was found in folio 1000 .

Any clue on how to do this?

Here is a script of how to create the above example tables with data:

CREATE TABLE [dbo].[invoice](
    [folio] [int] NOT NULL,
    [invoicenumber] [nvarchar](20) NOT NULL,
    [isactive] [bit] NOT NULL,
 CONSTRAINT [PK_invoice] PRIMARY KEY CLUSTERED 
(
    [folio] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO


CREATE TABLE [dbo].[invoice_relation](
    [relationid] [int] NOT NULL,
    [invoice] [nvarchar](20) NOT NULL,
    [parentinvoice] [nvarchar](20) NOT NULL,
 CONSTRAINT [PK_invoice_relation_1] PRIMARY KEY CLUSTERED 
(
    [relationid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
INSERT [dbo].[invoice] ([folio], [invoicenumber], [isactive]) VALUES (1000, N'A1122', 1)
GO
INSERT [dbo].[invoice] ([folio], [invoicenumber], [isactive]) VALUES (1001, N'B1122', 1)
GO
INSERT [dbo].[invoice] ([folio], [invoicenumber], [isactive]) VALUES (1002, N'C1122', 1)
GO
INSERT [dbo].[invoice] ([folio], [invoicenumber], [isactive]) VALUES (1003, N'D1122', 1)
GO
INSERT [dbo].[invoice] ([folio], [invoicenumber], [isactive]) VALUES (1004, N'F1122', 1)
GO
INSERT [dbo].[invoice] ([folio], [invoicenumber], [isactive]) VALUES (1005, N'G1122', 1)
GO
INSERT [dbo].[invoice_relation] ([relationid], [invoice], [parentinvoice]) VALUES (1, N'A1122', N'B1122')
GO
INSERT [dbo].[invoice_relation] ([relationid], [invoice], [parentinvoice]) VALUES (2, N'C1122', N'A1122')
GO
INSERT [dbo].[invoice_relation] ([relationid], [invoice], [parentinvoice]) VALUES (3, N'D1122', N'A1122')
GO
INSERT [dbo].[invoice_relation] ([relationid], [invoice], [parentinvoice]) VALUES (4, N'F1122', N'B1122')
GO
INSERT [dbo].[invoice_relation] ([relationid], [invoice], [parentinvoice]) VALUES (5, N'G1122', N'F1122')
GO

I am still not sure what you really want, I had written something similar to JamieD77 which is to find top parent and then walk back down tree but then you get children and granchildren that are not directly related to A1122.....

Here is a way to walk up and down the tree and return all children and parents directly related to an invoicenumber

DECLARE @InvoiceNumber NVARCHAR(20) = 'A1122'
DECLARE @Folio INT = 1003

;WITH cteFindParents AS (
    SELECT
       i.folio
       ,i.invoicenumber
       ,CAST(NULL AS NVARCHAR(20)) as ChildInvoiceNumber
       ,CAST(NULL AS NVARCHAR(20)) as ParentInvoiceNumber
       ,0 as Level
    FROM
       dbo.invoice i
    WHERE
       i.invoicenumber = @InvoiceNumber

    UNION ALL

    SELECT
       i.folio
       ,i.invoicenumber
       ,c.invoicenumber as ChildInvoiceNumber
       ,i.invoicenumber as ParentInvoiceNumber
       ,c.Level - 1 as Level
    FROM
       cteFindParents c
       INNER JOIN dbo.invoice_relation r
       ON c.invoicenumber = r.invoice
       INNER JOIN dbo.invoice i
       ON r.parentinvoice = i.invoicenumber
)

, cteFindChildren as (
    SELECT *
    FROM
       cteFindParents

    UNION ALL

    SELECT
       i.folio
       ,i.invoicenumber
       ,i.invoicenumber AS ChildInvoiceNumber
       ,c.invoicenumber AS ParentInvoiceNumber
       ,Level + 1 as Level
    FROM
       cteFindChildren c
       INNER JOIN dbo.invoice_relation r
       ON c.invoicenumber = r.parentinvoice
       INNER JOIN dbo.invoice i
       ON r.invoice = i.invoicenumber
    WHERE
       c.Level = 0
)

SELECT *
FROM
    cteFindChildren

But Depending on what exactly you are looking for you may actually get a couple of cousins that are not desired.....

--------------Here was a method to find top parent and get the whole tree

DECLARE @InvoiceNumber NVARCHAR(20) = 'A1122'
DECLARE @Folio INT = 1003

;WITH cteFindParents AS (
    SELECT
       i.folio
       ,i.invoicenumber
       ,CAST(NULL AS NVARCHAR(20)) as ChildInvoiceNumber
       ,0 as Level
    FROM
       dbo.invoice i
    WHERE
       i.invoicenumber = @InvoiceNumber

    UNION ALL

    SELECT
       i.folio
       ,i.invoicenumber
       ,c.invoicenumber as ChildInvoiceNumber
       ,c.Level + 1 as Level
    FROM
       cteFindParents c
       INNER JOIN dbo.invoice_relation r
       ON c.invoicenumber = r.invoice
       INNER JOIN dbo.invoice i
       ON r.parentinvoice = i.invoicenumber

)

, cteGetTopParent AS  (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY LEVEL DESC) as RowNum
    FROM
       cteFindParents
)

, cteGetWholeTree AS (
    SELECT
       p.folio
       ,p.invoicenumber
       ,p.invoicenumber as TopParent
       ,p.invoicenumber as Parent
       ,CAST(p.invoicenumber AS NVARCHAR(1000)) as Hierarchy
       ,0 as Level
    FROM
       cteGetTopParent p
    WHERE
       RowNum = 1

    UNION ALL

    SELECT
       i.folio
       ,i.invoicenumber
       ,c.TopParent
       ,c.invoicenumber AS Parent
       ,CAST(c.TopParent + '|' + (CASE WHEN Level > 0 THEN c.invoicenumber + '|' ELSE '' END) + i.invoicenumber  AS NVARCHAR(1000)) as Hierarchy
       ,Level + 1 as Level
    FROM
       cteGetWholeTree c
       INNER JOIN dbo.invoice_relation r
       ON c.invoicenumber = r.parentinvoice
       INNER JOIN dbo.invoice i
       ON r.invoice = i.invoicenumber
)

SELECT *
FROM
    cteGetWholeTree

Your model is broken to begin with. parentinvoice should be in the invoice table. It's a recursive database model....so make the table schema recursive. Have a nullable foreign key in a column that references it's own table. Any time that field (the parent invoice field) is null indicates that it is the primary invoice. Any row having a parent is a piece of invoice.

When you want to find a value in a tree level structure you wrap your initial sql query into a 'SELECT(.....)' statement (creating your own custom selectable table) that filters out what you want. Let me know if you have any questions!

I was a little unclear as to your actual requirements, so I figured a Table Valued Function may be appropriate here. I added a few optional items, and if unwanted, they are easy enough to remove (ie TITLE, Nesting, TopInvoice, TopFolio). Also, you may notice the Range Keys (R1/R2). These serve many functions: Presentation Sequence, Selection Criteria, Parent/Leaf Indicators, and perhaps most importantly Non-Recursive Aggregation.

To Return the Entire Hierarchy

Select * from [dbo].[udf_SomeFunction](NULL,NULL)     

在此输入图像描述


To Return an Invoice and ALL of its descendants

Select * from [dbo].[udf_SomeFunction]('A1122',NULL) 

在此输入图像描述


To Return the PATH of a Folio

Select * from [dbo].[udf_SomeFunction](NULL,'1003') 

在此输入图像描述


To Return a Folio Limited to an Invoice

Select * from [dbo].[udf_SomeFunction]('A1122','1003')

在此输入图像描述

The following code requires SQL 2012+

CREATE FUNCTION [dbo].[udf_SomeFunction](@Invoice nvarchar(25),@Folio nvarchar(25))
Returns Table
As  
Return (
with cteBld as (
      Select Seq  = cast(1000+Row_Number() over (Order By Invoice) as nvarchar(500)),I.Invoice,I.ParentInvoice,Lvl=1,Title = I.Invoice,F.Folio
        From (
              Select Distinct 
                     Invoice=ParentInvoice
                    ,ParentInvoice=cast(NULL as nvarchar(20)) 
              From   [Invoice_Relation] 
              Where  @Invoice is NULL and ParentInvoice Not In (Select Invoice from [Invoice_Relation])
              Union  All
              Select Invoice
                    ,ParentInvoice 
              From   [Invoice_Relation] 
              Where  Invoice=@Invoice
             ) I
        Join Invoice F on I.Invoice=F.InvoiceNumber
      Union  All
      Select Seq  = cast(concat(A.Seq,'.',1000+Row_Number() over (Order by I.Invoice)) as nvarchar(500))
            ,I.Invoice
            ,I.ParentInvoice
            ,A.Lvl+1
            ,I.Invoice,F.folio
      From   [Invoice_Relation] I
      Join   cteBld  A on I.ParentInvoice = A.Invoice 
      Join   Invoice F on I.Invoice=F.InvoiceNumber )
     ,cteR1  as (Select Seq,Invoice,Folio,R1=Row_Number() over (Order By Seq) From cteBld)
     ,cteR2  as (Select A.Seq,A.Invoice,R2=Max(B.R1) From cteR1 A Join cteR1 B on (B.Seq like A.Seq+'%') Group By A.Seq,A.Invoice )

Select Top 100 Percent 
       B.R1
      ,C.R2
      ,A.Invoice
      ,A.ParentInvoice
      ,A.Lvl
      ,Title = Replicate('|-----',A.Lvl-1)+A.Title    -- Optional: Added for Readability
      ,A.Folio
      ,TopInvoice  = First_Value(A.Invoice) over (Order By R1) 
      ,TopFolio    = First_Value(A.Folio)   over (Order By R1) 
 From  cteBld A 
 Join  cteR1  B on A.Invoice=B.Invoice 
 Join  cteR2  C on A.Invoice=C.Invoice 
 Where (@Folio is NULL)
    or (@Folio is Not NULL and (Select R1 from cteR1 Where Folio=@Folio) between R1 and R2)
 Order By R1
)

Final Thoughts:

This certainly may be more than what you were looking, and there is good chance that I COMPLETELY misunderstood your requirements. That said, being a TVF you can expand with additional WHERE and/or ORDER clauses or even incorporate into a CROSS APPLY.

This uses an approach of using a hierarchyid , first generating hierarchyid for every row, then selecting the row where folio is 1003, then finding all ancestors who have an invoicenumber of 'A1122'. It's not very efficient but may give you some different ideas:

;WITH
Allfolios
AS
(
    Select i.folio, i.InvoiceNumber,
          hierarchyid::Parse('/' + 
               CAST(ROW_NUMBER() 
                        OVER (ORDER BY InvoiceNumber) AS VARCHAR(30)
                   ) + '/') AS hierarchy, 1 as level
    from invoice i
    WHERE NOT EXISTS 
        (SELECT * FROM invoice_relation ir WHERE ir.invoice = i. invoicenumber)
    UNION ALL
    SELECT i.folio, i.invoiceNumber,
          hierarchyid::Parse(CAST(a.hierarchy as VARCHAR(30)) + 
                             CAST(ROW_NUMBER() 
                                   OVER (ORDER BY a.InvoiceNumber) 
                               AS VARCHAR(30)) + '/') AS hierarchy, 
          level + 1
    FROM Allfolios A
    INNER JOIN invoice_relation ir
        on a.InvoiceNumber = ir.ParentInvoice
    INNER JOIN invoice i
        on ir.Invoice = i.invoicenumber
),
Ancestors
AS
(
    SELECT folio, invoiceNumber, hierarchy, hierarchy.GetAncestor(1) as AncestorId
    from Allfolios
    WHERE folio = 1003
    UNION ALL
    SELECT af.folio, af.invoiceNumber, af.hierarchy, af.hierarchy.GetAncestor(1)
      FROM Allfolios AF
      INNER JOIN 
            Ancestors a ON Af.hierarchy= a.AncestorId
)
SELECT *
FROM Ancestors
WHERE InvoiceNumber = 'A1122'

Edited for the case highlighted by @jj32 where you wish to find the root element in the hierarchy which folio 1003 is in, then find any descendant of that root which has an invoice number of 'A1122'. See below:

;WITH
Allfolios -- Convert all rows to a hierarchy
AS
(
    Select i.folio, i.InvoiceNumber,
          hierarchyid::Parse('/' + 
               CAST(ROW_NUMBER() 
                        OVER (ORDER BY InvoiceNumber) AS VARCHAR(30)
                   ) + '/') AS hierarchy, 1 as level
    from invoice i
    WHERE NOT EXISTS 
        (SELECT * FROM invoice_relation ir WHERE ir.invoice = i. invoicenumber)
    UNION ALL
    SELECT i.folio, i.invoiceNumber,
          hierarchyid::Parse(CAST(a.hierarchy as VARCHAR(30)) + 
                             CAST(ROW_NUMBER() 
                                   OVER (ORDER BY a.InvoiceNumber) 
                               AS VARCHAR(30)) + '/') AS hierarchy, 
          level + 1
    FROM Allfolios A
    INNER JOIN invoice_relation ir
        on a.InvoiceNumber = ir.ParentInvoice
    INNER JOIN invoice i
        on ir.Invoice = i.invoicenumber
),
Root -- Find Root
AS
(
    SELECT *
    FROM AllFolios AF
    WHERE Level = 1 AND 
    (SELECT hierarchy.IsDescendantOf(AF.hierarchy)  from AllFolios AF2 WHERE folio = 1003) = 1
)
-- Find all descendants of the root element which have an invoicenumber = 'A1122'
SELECT *
FROM ALLFolios
WHERE hierarchy.IsDescendantOf((SELECT TOP 1 hierarchy FROM Root)) = 1 AND
invoicenumber = 'A1122'

this was tricky since you have a separate relation table and the root invoice is not in it.

DECLARE @folio INT = 1003,
        @invoice NVARCHAR(20) = 'A1122'


-- find highest level of relationship
;WITH cte  AS (
    SELECT  i.folio,
            i.invoicenumber,
            ir.parentinvoice,
            0 AS [level]
    FROM    invoice i
            LEFT JOIN invoice_relation ir  ON ir.invoice = i.invoicenumber
    WHERE   i.folio = @folio
    UNION ALL 
    SELECT  i.folio,
            i.invoicenumber,
            ir.parentinvoice,
            [level] + 1
    FROM    invoice i
            JOIN invoice_relation ir  ON ir.invoice = i.invoicenumber
            JOIN cte r ON r.parentinvoice = i.invoicenumber
),
-- make sure you get the root folio
rootCte AS (
    SELECT  COALESCE(oa.folio, c.folio) AS rootFolio 
    FROM    (SELECT     *,
                        ROW_NUMBER() OVER (ORDER BY [level] DESC) Rn
                FROM    cte ) c
             OUTER APPLY (SELECT folio FROM invoice i WHERE i.invoicenumber = c.parentinvoice) oa
    WHERE    c.Rn = 1
),
-- get all children of root folio
fullTree AS (
    SELECT  i.folio,
            i.invoicenumber
    FROM    rootCte r
            JOIN invoice i ON r.rootFolio = i.folio
    UNION ALL 
    SELECT  i.folio,
            i.invoicenumber
    FROM    fullTree ft
            JOIN invoice_relation ir ON ir.parentinvoice = ft.invoicenumber
            JOIN invoice i ON ir.invoice = i.invoicenumber
)
-- search for invoice
SELECT  * 
FROM    fullTree
WHERE   invoicenumber = @invoice

Here's an attempt that flattens out the relations first so you can travel in any direction. Then it does the recursive CTE to work through the levels:

WITH invoicerelation AS
(
select relationid, invoice, parentinvoice AS relatedinvoice
from invoice_relation
union
select relationid, parentinvoice AS invoice, invoice AS relatedinvoice 
from invoice_relation
),

cteLevels AS
(
select 0 AS relationid, invoice.folio, 
   invoicenumber AS invoice, invoicenumber AS relatedinvoice, 
   0 AS Level
from invoice 
UNION ALL
select invoicerelation.relationid, invoice.folio,
    invoicerelation.invoice, cteLevels.relatedinvoice, 
    Level + 1 AS Level
from invoice INNER JOIN
   invoicerelation ON invoice.invoicenumber = invoicerelation.invoice INNER JOIN
   cteLevels ON invoicerelation.relatedinvoice = cteLevels.invoice
      and (ctelevels.relationid <> invoicerelation.relationid)
)

SELECT cteLevels.folio, relatedinvoice, invoice.folio AS invoicefolio, cteLevels.level
from cteLevels INNER JOIN
    invoice ON cteLevels.relatedinvoice = invoice.invoicenumber
WHERE cteLevels.folio = 1003 AND cteLevels.relatedinvoice = 'a1122'

I agree with SwampDev's comment that the parentinvoice should really be in the invoice table. This could also be done without a recursive CTE if you know the maximum number of levels of separation between invoices.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM