
Filtering Autodesk Vault vertical data, getting the newest record for each drawing

Using Microsoft SQL Server Express Edition (64-bit) 10.0.550.0

I'm trying to extract data from an Autodesk Vault server. The SQL needed to get at the required data is beyond my current level of knowledge, so I'm piecing it together from bits found on Google and StackOverflow. Using this excellent answer I was able to transpose the vertical data into a manageable horizontal format.

The Autodesk Vault database stores information about CAD drawings (among other things). The main vertical table dbo.Property holds information about all the different revisions of each CAD drawing. The problem I'm currently facing is that I'm getting too much data. I just want the data from the latest revision of each CAD drawing.

Here's my SQL so far:

select 
    CreateDate,
    EntityID,
    PartNumber,
    CategoryName,
    [Subject],
    Title
from 
(
    select 
        EntityID,
        CreateDate,
        [53] as PartNumber,
        [28] as CategoryName,
        [42] as [Subject],
        [43] as Title
    from 
    (
        select
            p.Value, 
            p.PropertyDefID,
            p.EntityID,
            e.CreateDate
        from dbo.Property as p
        inner join dbo.Entity as e on p.EntityID = e.EntityId
        where p.PropertyDefID in(28, 42, 43, 53)
        and e.EntityClassID = 8
    ) t1
    pivot 
    (
        max(Value)
        for PropertyDefID in([28], [42], [43], [53])
    ) t2
) t3
where PartNumber is not null
and PartNumber != ''
and CategoryName = 'Drawing'
-- (1) additional condition
order by PartNumber, CreateDate desc

Here dbo.Property.Value is of the sql_variant data type. The query above results in a data set similar to this:

CreateDate | EntityID | PartNumber | CategoryName | Subject | Title 
---------------------------------------------------------------------
2016-01-01 |    59046 |      10001 | Drawing      | Xxxxx   | Yyyyy
2016-05-01 |    60137 |      10001 | Drawing      | Xxxxx   | Yyyyy
2016-08-01 |    62518 |      10001 | Drawing      | Xxxx    | Yyyyyy
2016-12-16 |    63007 |      10001 | Drawing      | Xxxxxx  | Yyyyyy
2016-01-01 |    45776 |      10002 | Drawing      | Zzzzz   | NULL  
2016-11-01 |    65011 |      10002 | Drawing      | Zzzzzz  | NULL  
...
(about 23000 rows)
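
Since Value is sql_variant, it can help to check which base type each property actually holds before comparing or converting it. The following is only a sketch on made-up rows in a table variable (@Sample is a hypothetical name, not a Vault object); sql_variant_property and convert are standard T-SQL.

```sql
-- Hypothetical sample values standing in for dbo.Property.Value.
declare @Sample table (Value sql_variant);

insert into @Sample (Value)
values
    (cast(N'Drawing' as nvarchar(50))),
    (cast(10001 as int)),
    (cast('20161216' as datetime));

select
    Value,
    -- Base type stored inside the sql_variant (nvarchar, int, datetime here).
    sql_variant_property(Value, 'BaseType') as BaseType,
    -- Explicit conversion to text, e.g. before filtering or grouping.
    convert(nvarchar(4000), Value) as ValueAsText
from @Sample;
```

Running the same two expressions against the real PropertyDefID values should show whether PartNumber is stored as text or as a number in a given Vault.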

The problem is that I'm getting all revisions of each drawing. In the example above I only want the latest revision of each part number, e.g. the row dated '2016-12-16' for PartNumber=10001.

I have also looked at this answer on how to group and select rows where one of the columns has a max value, but I just can't seem to figure out how to combine the two. I tried adding the following snippet at the commented line in the above query, but it fails on many different levels.

and (PartNumber, CreateDate) in 
(
    select PartNumber, max(CreateDate)
    from t3
    group by PartNumber 
)
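
For what it's worth, that snippet cannot work in T-SQL as written: SQL Server does not accept a row-value list such as (PartNumber, CreateDate) on the left of in, and the derived-table alias t3 is not visible inside a subquery of the where clause. A common alternative is to rank the rows per part number with row_number() and keep rank 1. The sketch below uses sample rows mirroring the table above instead of the real Vault tables; the CTE names Pivoted and Ranked are made up for illustration.

```sql
-- Sample rows standing in for the pivoted result (t3) shown above.
with Pivoted (CreateDate, EntityID, PartNumber, Title) as
(
    select convert(datetime, v.CreateDate), v.EntityID, v.PartNumber, v.Title
    from
    (
        values
            ('20160101', 59046, N'10001', N'Yyyyy'),
            ('20160501', 60137, N'10001', N'Yyyyy'),
            ('20160801', 62518, N'10001', N'Yyyyyy'),
            ('20161216', 63007, N'10001', N'Yyyyyy'),
            ('20160101', 45776, N'10002', null),
            ('20161101', 65011, N'10002', null)
    ) as v (CreateDate, EntityID, PartNumber, Title)
),
Ranked as
(
    select
        CreateDate,
        EntityID,
        PartNumber,
        Title,
        -- Number the revisions of each part, newest first;
        -- EntityID only breaks ties between equal dates.
        row_number() over
        (
            partition by PartNumber
            order by CreateDate desc, EntityID desc
        ) as rn
    from Pivoted
)
select CreateDate, EntityID, PartNumber, Title
from Ranked
where rn = 1
order by PartNumber;
```

On the sample rows this returns the 2016-12-16 row (EntityID 63007) for 10001 and the 2016-11-01 row (EntityID 65011) for 10002. In the real query the pivoted select would take the place of the Pivoted CTE, and PartNumber may need converting from sql_variant to nvarchar first. Ordering by CreateDate also avoids assuming that a higher EntityID always means a newer revision.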

The reason I'm tagging this question "pivot", although the pivoting is already done, is that I suspect the pivoting is what's causing me trouble. I just haven't been able to wrap my head around this pivoting stuff yet, and my SQL optimization skills are seriously lacking. Maybe the filtering should be done at an inner level?

Drawing inspiration from the comment provided by @Strawberry, I kept working and tweaking until I got something that seems to work. I had to use a PIVOT inside a PIVOT for it all to work.

Edit: At first I used views, but then the prerequisites changed, as I had to work with a read-only database user. Fortunately, I was still allowed to create temporary tables.

This is the final result:

if object_id('tempdb.dbo.#Properties', 'U') is not null
    drop table #Properties

create table #Properties 
(
    PartNumber  nvarchar(max),
    [Subject]   nvarchar(max),
    Title       nvarchar(max),
    CreateDate  datetime
)

insert into #Properties
(
    PartNumber,
    [Subject],
    Title,
    CreateDate
)
select 
    convert(nvarchar(max), PartNumber),
    convert(nvarchar(max), [Subject]), 
    convert(nvarchar(max), Title),
    convert(datetime, CreateDate)
from 
(
    select 
        EntityID,
        CreateDate,
        [53] as PartNumber,
        [42] as [Subject],
        [43] as Title
    from 
    (
        select
            p.Value, 
            p.PropertyDefID,
            p.EntityID,
            e.CreateDate
        from dbo.Property as p
        inner join dbo.Entity as e on p.EntityID = e.EntityId
        where p.PropertyDefID in (42, 43, 53)
        and e.EntityClassID = 8
        and p.EntityID in
        (
            select 
                max(EntityID) as MaxEntityID
            from 
            (
                select 
                    EntityID,
                    [28] as CategoryName,
                    [53] as PartNumber
                from
                (
                    select
                        p.Value,
                        p.EntityID,
                        p.PropertyDefID
                    from dbo.Property as p
                    inner join dbo.Entity as e on p.EntityID = e.EntityId
                    where p.PropertyDefID in (28, 53)
                    and e.EntityClassID = 8 -- FileIteration
                ) as t1
                pivot
                (
                    max(Value)
                    for PropertyDefID in ([28], [53])
                ) as t2
            ) as t3
            where CategoryName = 'Drawing'
            group by PartNumber
        )
    ) as t4
    pivot 
    (
        max(Value)
        for PropertyDefID in ([42], [43], [53])
    ) as t5
) as t6
where PartNumber is not null
and PartNumber != ''
order by PartNumber

select * from #Properties
-- search conditions go here

I had to change the suggested join to a where x in (y) because the join was insanely slow (I terminated the query after four minutes). The resulting data set, which takes about 2 seconds to produce, looks promising:

PartNumber | Subject | Title  | CreateDate       | ...
-----------------------------------------------------------------------
100000     | Xxxxxx  | Yyyyyy | 2015-08-17 09-10 | ...
100001     | Zzzzzz  | NULL   | 2015-09-02 15-23 | ...
...
(about 8900 rows)

No more old revisions in the set.
