
Efficiently querying MSSQL database

I have been given the task of getting some data from an MSSQL database. I am not the DB owner, and I cannot make any changes or add any indices. I have to work with what I have. (I think the DB designer was on drugs.)

The DB is accessed via a Python script, but I will show pseudocode here, as it's the SQL that's important.

For this there are 5 items of data; let's call them A, B, C, D, and RecipeInstance. In the database, A, B, C, and D are concatenated and stored in a single column as A@B@C@D. There is a one-to-many relationship between 'A@B@C@D' and RecipeInstance.

My 2 tasks are:

1) Given A, B, C, and D, get all the recipes

This is easy enough conceptually, but my query is very slow. Here's my query for this:

SELECT PDEName as recipe
FROM RecipeInstance
WHERE PdeInstanceId
IN (SELECT DISTINCT PdeInstanceId FROM RecipeTableValue WHERE CellValue
IN (SELECT DISTINCT PDEName FROM RunInstance WHERE PdeInstanceId
IN (SELECT PdeInstanceId FROM RunTableValue WHERE CellValue = 'A@B@C@D')))

This query takes 16 seconds. I really need to make it faster. I tried breaking it down into 4 separate queries, but together they still took 16 seconds. There are no useful indices on these tables, and I cannot create any. Can anyone think of any way to make this faster?

2) Given A, B, C, and Recipe, get D

This is more complicated, since there's no relationship back from RecipeInstance to TargetInstance, where D is. Here is what I came up with:

select PdeName as TargetPdeName
FROM TargetInstance
WHERE PdeName like 'A@B@C@%'

# this query returns between 20,000 and 40,000 rows

foreach TargetPdeName returned from the above query
    SELECT PDEName as RecipePdeName
    FROM RecipeInstance
    WHERE PdeInstanceId
    IN (SELECT DISTINCT PdeInstanceId FROM RecipeTableValue WHERE CellValue
    IN (SELECT DISTINCT PDEName FROM RunInstance WHERE PdeInstanceId
    IN (SELECT PdeInstanceId FROM RunTableValue WHERE CellValue = TargetPdeName)))

    if RecipePdeName == Recipe:
        # this is the one we want
        (a, b, c, d) = TargetPdeName.split('@')
        return d

So the problem here is obviously that I have to run tens of thousands of queries, each one taking 16 seconds. Can anyone see how I can traverse this relationship backwards in an efficient manner?

Below are JOIN and EXISTS queries. Try both and let us know how they run.

1)

JOIN version

SELECT DISTINCT reci.PDEName as recipe
FROM RecipeInstance reci
JOIN RecipeTableValue rectv ON reci.PdeInstanceId = rectv.PdeInstanceId
JOIN RunInstance runi ON rectv.CellValue = runi.PDEName
JOIN RunTableValue runtv ON runi.PdeInstanceId = runtv.PdeInstanceId 
WHERE runtv.CellValue = 'A@B@C@D'

EXISTS version

SELECT PDEName as recipe
FROM RecipeInstance reci
WHERE EXISTS (
    SELECT * FROM RecipeTableValue rectv 
    WHERE rectv.PdeInstanceId = reci.PdeInstanceId
    AND EXISTS (
        SELECT * FROM RunInstance runi 
        WHERE runi.PDEName = rectv.CellValue
        AND EXISTS (
            SELECT * FROM RunTableValue runtv 
            WHERE runi.PdeInstanceId = runtv.PdeInstanceId
            AND runtv.CellValue = 'A@B@C@D'
        )
    )
)
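
Before comparing timings, it may be worth confirming that a rewritten query returns the same rows as the original nested IN query. The following is a minimal sketch in Python, not a benchmark: it uses an in-memory SQLite database as a stand-in for the MSSQL server (an assumption made only so the example runs anywhere), and the toy rows and the helper names build_toy_db and recipes are hypothetical. Only the table and column names come from the question.

```python
import sqlite3

# The asker's original query, with the target key as a parameter.
NESTED_IN = """
SELECT PDEName AS recipe
FROM RecipeInstance
WHERE PdeInstanceId IN (
    SELECT PdeInstanceId FROM RecipeTableValue WHERE CellValue IN (
        SELECT PDEName FROM RunInstance WHERE PdeInstanceId IN (
            SELECT PdeInstanceId FROM RunTableValue WHERE CellValue = ?)))
"""

# The JOIN version from the answer, with the same parameter.
JOIN_VERSION = """
SELECT DISTINCT reci.PDEName AS recipe
FROM RecipeInstance reci
JOIN RecipeTableValue rectv ON reci.PdeInstanceId = rectv.PdeInstanceId
JOIN RunInstance runi ON rectv.CellValue = runi.PDEName
JOIN RunTableValue runtv ON runi.PdeInstanceId = runtv.PdeInstanceId
WHERE runtv.CellValue = ?
"""


def build_toy_db():
    """Create a tiny, made-up data set (hypothetical rows, SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE RecipeInstance (PdeInstanceId INTEGER, PDEName TEXT);
        CREATE TABLE RecipeTableValue (PdeInstanceId INTEGER, CellValue TEXT);
        CREATE TABLE RunInstance (PdeInstanceId INTEGER, PDEName TEXT);
        CREATE TABLE RunTableValue (PdeInstanceId INTEGER, CellValue TEXT);
        INSERT INTO RecipeInstance VALUES (1, 'RecipeX'), (2, 'RecipeY');
        INSERT INTO RecipeTableValue VALUES (1, 'Run1'), (1, 'Run2'), (2, 'Run3');
        INSERT INTO RunInstance VALUES (10, 'Run1'), (20, 'Run2'), (30, 'Run3');
        INSERT INTO RunTableValue VALUES
            (10, 'A@B@C@D'), (20, 'A@B@C@D'), (30, 'A@B@C@E');
    """)
    return conn


def recipes(conn, sql, key):
    """Run one of the queries and return the recipe names, sorted."""
    return sorted(row[0] for row in conn.execute(sql, (key,)))


conn = build_toy_db()
in_rows = recipes(conn, NESTED_IN, "A@B@C@D")
join_rows = recipes(conn, JOIN_VERSION, "A@B@C@D")
print(in_rows, join_rows)
```

On this toy data both queries return only RecipeX. The JOIN matches it twice (once per run), which is why that version needs DISTINCT. Matching results on a small sample do not prove equivalence on the real data, so the same comparison would still need to be run against the actual database.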

2) EDIT: To split ti.PdeName by @ and extract the last value, you'll need to define your own function. See "How do I split a string so I can access item x".

JOIN version

SELECT DISTINCT ti.PdeName
FROM RecipeInstance reci
JOIN RecipeTableValue rectv ON reci.PdeInstanceId = rectv.PdeInstanceId
JOIN RunInstance runi ON rectv.CellValue = runi.PDEName
JOIN RunTableValue runtv ON runi.PdeInstanceId = runtv.PdeInstanceId 
JOIN TargetInstance ti ON runtv.CellValue = ti.PdeName
WHERE reci.PDEName = 'MyRecipe'

EXISTS version

SELECT ti.PdeName
FROM TargetInstance ti
WHERE EXISTS (
    SELECT * FROM RunTableValue runtv
    WHERE runtv.CellValue = ti.PdeName
    AND EXISTS (
        SELECT * FROM RunInstance runi
        WHERE runi.PdeInstanceId = runtv.PdeInstanceId 
        AND EXISTS (
            SELECT * FROM RecipeTableValue rectv
            WHERE rectv.CellValue = runi.PDEName
            AND EXISTS (
                SELECT * FROM RecipeInstance reci
                WHERE reci.PdeInstanceId = rectv.PdeInstanceId
                AND reci.PDEName = 'MyRecipe'
            )
        )
    )
)
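
Both queries for task 2 filter only on the recipe name, while the task also fixes A, B, and C. One way to put the pieces together, sketched in Python since the calling script is Python: add the LIKE 'A@B@C@%' prefix filter from the question to the JOIN version, and split off D on the client instead of in a user-defined SQL function. The function name find_d and the toy rows are hypothetical, and an in-memory SQLite database stands in for MSSQL so that the sketch runs. The `?` placeholders are the DB-API "qmark" style; whether the driver in use accepts them is an assumption to verify.

```python
import sqlite3

# JOIN version from the answer, plus the A@B@C@ prefix filter from the question.
FIND_TARGETS = """
SELECT DISTINCT ti.PdeName
FROM RecipeInstance reci
JOIN RecipeTableValue rectv ON reci.PdeInstanceId = rectv.PdeInstanceId
JOIN RunInstance runi ON rectv.CellValue = runi.PDEName
JOIN RunTableValue runtv ON runi.PdeInstanceId = runtv.PdeInstanceId
JOIN TargetInstance ti ON runtv.CellValue = ti.PdeName
WHERE reci.PDEName = ?
  AND ti.PdeName LIKE ?
"""


def find_d(conn, a, b, c, recipe):
    """Return the sorted D values of targets A@B@C@D linked to `recipe`."""
    # Assumes a, b and c contain no LIKE wildcard characters such as % or _.
    prefix = f"{a}@{b}@{c}@%"
    rows = conn.execute(FIND_TARGETS, (recipe, prefix)).fetchall()
    # D is whatever follows the last '@' in the concatenated name.
    return sorted(name.rsplit("@", 1)[1] for (name,) in rows)


# Hypothetical toy data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE RecipeInstance (PdeInstanceId INTEGER, PDEName TEXT);
    CREATE TABLE RecipeTableValue (PdeInstanceId INTEGER, CellValue TEXT);
    CREATE TABLE RunInstance (PdeInstanceId INTEGER, PDEName TEXT);
    CREATE TABLE RunTableValue (PdeInstanceId INTEGER, CellValue TEXT);
    CREATE TABLE TargetInstance (PdeName TEXT);
    INSERT INTO RecipeInstance VALUES (1, 'RecipeX'), (2, 'RecipeY');
    INSERT INTO RecipeTableValue VALUES (1, 'Run1'), (2, 'Run2'), (1, 'Run3');
    INSERT INTO RunInstance VALUES (10, 'Run1'), (20, 'Run2'), (30, 'Run3');
    INSERT INTO RunTableValue VALUES
        (10, 'A@B@C@D1'), (20, 'A@B@C@D2'), (30, 'X@Y@Z@D3');
    INSERT INTO TargetInstance VALUES ('A@B@C@D1'), ('A@B@C@D2'), ('X@Y@Z@D3');
""")

d_values = find_d(conn, "A", "B", "C", "RecipeX")
print(d_values)
```

This replaces the loop of tens of thousands of queries with a single round trip. It returns a list because nothing in the question guarantees that exactly one D matches a given A, B, C, and recipe.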
