简体   繁体   中英

TSQL query that uses function and view is very slow

Ok, first of all thanks in advance if you read through this whole thing as it may be quite painful on several levels.

  1. It's a long post
  2. It's gross
  3. It's going to probably make your brain hurt

But on the plus side, after reading through this whole thing I have a feeling the answer is very obvious and simple, so you have that going for you.

So I'll tell you the problem in a nutshell, and then in more detail:

Nutshell

  • I have a query in SQL Server 2008r2 that is taking a very long time to complete.
  • I have several tables that contain information about a child and its parent.
  • A child in one table can have a parent in another table which then could have a parent in another table (there are only 3 tables).
  • I want to be able to take a child's name as a string and figure out it's heirarchy of ancestors and return that as a period delimited string. So Grandpappy.Grandpa.Dad.Me .
  • I have this all working, it's just taking forever so I'm doing something stupid, or poorly performant, or most likely both.
  • I have NO control over the tables, they are what they are and I can't do anything to them. I created a view and a function (which you will see below) and that is all I can control.
  • The table names and values below are obviously fictitious.

Detailed description

Here are the tables that indicate children and parents. In this example we will be dealing with Fruits, Vegetables, and Planets.

  • A Planet has no parents.
  • A Fruit has a Parent who is a Planet, or a Fruit.
  • A Vegetable has a Parent who is a Fruit, or a Planet, or a Vegetable.

Let's take a look at them...

Table 1 = Planets (I have no parents)

ID, Name
1, Earth
2, Saturn

Table 2 = Fruits (my parent is either a planet or a fruit)

ID, Name, PlanetName, FruitName
1, Kiwi, Earth, null
2, Strawberry, Saturn, null
3, Banana, null, Strawberry

Table 3 = Vegetables (my parent is planet or a fruit or a vegetable)

ID, Name, FruitName, PlanetName, VegetableName
1, Potato, Kiwi, null, null
2, Squash, null, Earth, null
3, Pumpkin, null, null, Potato

Table 4 = BigTable (this will be the one the main slow query is using. It has a column that contains just a child's name and it could be a planet or a fruit or a vegetable)

ID, Name, OneOfTheThree
1, John, Earth
2, Steve, Kiwi
3, Joe, Saturn
4, Jane, Potato

We have our tables and we have our data, what do I want to do now?

I want to create a query that looks at all of the OneOfTheThree values in the BigTable and find out what their lineage is (who there dads, grand parents etc are) and return that to the caller.

So my thought was to do this:

  • Create a view that pulls the three tables (Planet, Fruit, Vegetable) into one single view that shows Name and Parent.
  • Create a function that takes in a Name. It then uses that view to find out who the Parent is for that Name. It then looks to see who the Parent is for that Parent, and on and on until the Parent is null and it stops (because that's the top of the ancestry chain... we made it all the way to Planet, who has no parents).
  • Create a query to query BigTable and then use the above function on BigTable's OneOfTheThree column to get the ancestry of the name in OneOfTheThree.

So I did it as follows:

My view

View = vwEverybodyAndTheirParents

-- Planets
SELECT Name, null AS Parent
FROM Planets
UNION
-- Fruits
SELECT Name, PlanetName AS Parent
FROM Fruits
UNION
-- Vegetables
SELECT Name, CASE WHEN FruitName IS NOT NULL THEN FruitName WHEN PlanetName IS NOT NULL THEN Planet ELSE NULL END AS Parent
FROM Vegetables

Ok, that gives me everything and it's parents. Now for the function to crawl that view and give me the period delimited string of the full ancestry:

My function

CREATE FUNCTION dbo.fnGetMyParent(@NameToGetParentsFor varchar(255))
RETURNS varchar(255)
AS
    DECLARE @InternalName varchar(255)
    DECLARE @ParentName varchar(255)
    DECLARE @ConcatenatedParentStringToReturn varchar(max)

    SELECT @ParentName = Parent
            ,@ConcatenatedParentStringToReturn = Name
    FROM vwEverybody
    WHERE Name = @NameToGetParentsFor

    WHILE @ParentName IS NOT NULL
    BEGIN
        SELECT @InternalName = Name, 
                @ParentName = Parent
        FROM vwEverybody
        WHERE Name = @ParentName

        SET @ConcatenatedParentStringToReturn = RTRIM(InternalName) + "." + RTRIM(@ConcatenatedParentStringToReturn)
    END

    RETURN @ConcatenatedParentStringToReturn
END 

This function works fine (though could be poorly coded and poorly performing?), so with all the above examples if I were to call it like so:

dbo.fnGetMyParent('Potato')

I get back the concatenated string of:

Earth.Kiwi.Potato

The problem

Ok, so now to finally get to the problem... the big query that takes forever:

SELECT Name,
       OneOfTheThree,
       fnGetMyParent(OneOfTheThree) as HeirarchyOfParents
FROM BigTable

I can see why it could take so long as for each value it executes the function which needs to then crawl a view. So...

My questions to you

  • How can I speed this up?
  • Do I need to put an index on the view?
  • Is my approach off, and should I do this differently?
  • If so, what do you recommend?

A BIG THANK YOU if you made it this far!

First of all when using sql you should avoid using loops as much as you can (unless the situation asks for it)

Second, there is no need of the view, or of the function as your query should be easily written in one go.

select 
  bt.Name
  ,bt.OneOfTheThree 
  ,p.Name+'.'+isnull(f.Name,'')+'.'+isnull(v.Name,'')+'.'+bt.Name as HeirarchyOfParents
from BigTable bt
left join Vegetables v
  on bt.OneOfTheThree = v.name
left join Fruits f
  on coalesce(v.FruitName,bt.OneOfTheThree) = f.Name
left join Planets p
  on coalesce(f.PlanetName,v.PlanetName,bt.OneOfTheThree) = p.Name

The last join you can remove if the table is consistent with the others, as it does not bring new information (the planet name is already there).

The improvements that you can bring here are with indexes on the tables, if you are able to do that.

Ok, with the new information, the easiest way I can think of is the following:

;with ftemp as (
  select 
    name as path
    ,PlanetName
    ,name as root
    ,name as name
    ,FruitName as parent
    ,0 as cnt
  from fruits
  union all
  select 
    fruits.name + '.' + ftemp.path
    ,ftemp.PlanetName
    ,root
    ,fruits.name
    ,cnt+1
  from fruits
  join ftemp
    on fruits.name= ftemp.parent
)
,fg as (
  select 
    name
    ,max(cnt) as cnt
  from ftemp
  group by name
)
,f as (
  select 
    ftemp.*
  from ftemp
  join fg
    on  ftemp.cnt = fg.cnt
    and ftemp.name = fg.name
)
,vtemp (same ideea)
,vg (same ideea)
,v (same ideea)
select 
  bt.Name
  ,bt.OneOfTheThree 
  ,p.Name+'.'+isnull(f.Path+'.','')+isnull(v.Path+'.','')+bt.Name as HeirarchyOfParents
from BigTable bt
left join v
  on bt.OneOfTheThree = v.name
left join f
  on coalesce(v.FruitName,bt.OneOfTheThree) = f.Name
left join Planets p
  on coalesce(f.PlanetName,v.PlanetName,bt.OneOfTheThree) = p.Name

With this approach though .. I have no idea on the performance it will yield. So it's up to you to complete the query and test.

Hope it helps.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM