
How to improve performance of this query?

With reference to "SQL Query how to summarize students record by date?", I was able to get the report I wanted.

I was told that in the real world the student table will have 30 million records. I do have an index on (StudentID, Date). Any suggestions to improve the performance, or is there a better way to build the report?

Right now I have the following query:

;with cte as
(
  select id, 
    studentid,
    date,
    '#'+subject+';'+grade+';'+convert(varchar(10), date, 101) report
  from student
) 
-- insert into studentreport
select distinct 
  studentid,
  STUFF(
         (SELECT cast(t2.report as varchar(50))
          FROM cte t2
          where c.StudentId = t2.StudentId
          order by t2.date desc
          FOR XML PATH (''))
          , 1, 0, '')  AS report
from cte c;

Without seeing the execution plan, it's not really possible to write an optimized SQL statement, so I'll make suggestions instead.

Don't use a cte, as they often don't handle queries with large memory requirements well (at least, in my experience). Instead, stage the cte data in a real table, either with a materialized/indexed view or with a working table (maybe a large temp table). Then execute the second select (the one after the cte) against that staged data to combine it into an ordered list.
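As a rough sketch of the staging idea, the cte body can be written once into a temp table and indexed before the concatenation runs. It assumes the student table and columns from the question; the names #studentreportstage and ix_studentreportstage are made up for illustration, and whether this actually beats the cte depends on the execution plan.

```sql
-- Stage the formatted rows once instead of re-evaluating the cte.
-- #studentreportstage is a hypothetical working table name.
select studentid,
       date,
       '#' + subject + ';' + grade + ';' + convert(varchar(10), date, 101) as report
into   #studentreportstage
from   student;

-- Index the staged rows so the correlated subquery below can seek by studentid.
create clustered index ix_studentreportstage
    on #studentreportstage (studentid, date desc);

-- Same concatenation as the original query, now reading the staged table.
-- The derived table replaces "select distinct" over every row.
select s.studentid,
       (select cast(t2.report as varchar(50))
        from   #studentreportstage t2
        where  t2.studentid = s.studentid
        order  by t2.date desc
        for xml path('')) as report
from   (select distinct studentid from #studentreportstage) s;
```

Pulling the distinct studentid list into a derived table means the concatenation is requested once per student rather than once per row, which is the same shape used in the XML example further down.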

The number of comments on your question indicates that you have a large problem (or problems). You're converting tall and skinny data (think integers, datetime2 types) into ordered lists within strings. Try to think instead in terms of storing the data in the smallest formats available and not manipulating it into strings until afterward (or never). Alternatively, give serious thought to creating an XML data field to replace the 'report' field.

If you can make it work, this is what I would do (including a test case without indexes). Your mileage may vary, but give it a try:

create table #student (id int not null, studentid int not null, date datetime not null, subject varchar(40), grade varchar(40))

insert into #student (id,studentid,date,subject,grade)
select 1, 1, getdate(), 'history', 'A-' union all
select 2, 1, dateadd(d,1,getdate()), 'computer science', 'b' union all
select 3, 1, dateadd(d,2,getdate()), 'art', 'q' union all
--
select 1, 2, getdate() , 'something', 'F' union all
select 2, 2, dateadd(d,1,getdate()), 'genetics', 'e' union all
select 3, 2, dateadd(d,2,getdate()), 'art', 'D+' union all
--
select 1, 3, getdate() , 'memory loss', 'A-' union all
select 2, 3, dateadd(d,1,getdate()), 'creative writing', 'A-' union all
select 3, 3, dateadd(d,2,getdate()), 'history of asia 101', 'A-'

go

select      studentid as studentid
            ,(select s2.date as '@date', s2.subject as '@subject', s2.grade as '@grade' 
            from #student s2 where s1.studentid = s2.studentid for xml path('report'), type) as 'reports'
from        (select distinct studentid from #student) s1;

I don't know how to make the output legible here, but the result set has 2 fields. Field 1 is an integer; field 2 is XML with one node per report. This still isn't as ideal as just sending the result set, but at least it is one result per studentid.
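If the newest-first ordering from the original query still matters, a variant of the query above might look like the following. It assumes the #student temp table from the test case already exists in the same session; the index name ix_student_studentid_date is hypothetical, and the index is only a guess at what would help on a large table.

```sql
-- Assumes #student was created and populated by the test case above.
-- Hypothetical index so each per-student lookup can seek and read in date order.
create clustered index ix_student_studentid_date
    on #student (studentid, date desc);

select s1.studentid,
       (select s2.date    as '@date',
               s2.subject as '@subject',
               s2.grade   as '@grade'
        from   #student s2
        where  s2.studentid = s1.studentid
        order  by s2.date desc          -- newest report first, as in the question
        for xml path('report'), type) as reports
from   (select distinct studentid from #student) s1;
```

Each row still carries one XML value per studentid, with one report element per source row and the date, subject, and grade kept as attributes.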


 