How to improve performance of this query?

With reference to "SQL Query how to summarize students record by date?", I was able to get the report I wanted.

I was told that in the real world the student table will have 30 million records. I do have an index on (StudentID, Date). Any suggestions to improve the performance, or is there a better way to build the report?

Right now I have the following query:

;with cte as
(
  select id, 
    studentid,
    date,
    '#'+subject+';'+grade+';'+convert(varchar(10), date, 101) report
  from student
) 
-- insert into studentreport
select distinct 
  studentid,
  STUFF(
         (SELECT cast(t2.report as varchar(50))
          FROM cte t2
          where c.StudentId = t2.StudentId
          order by t2.date desc
          FOR XML PATH (''))
          , 1, 0, '')  AS report
from cte c;

Without seeing the execution plan, it's not really possible to write an optimized SQL statement so I'll make suggestions instead.

Avoid the CTE here: in my experience, CTEs often don't handle queries with large memory requirements well. Instead, stage the CTE's data in a real table, either with a materialized/indexed view or with a working table (perhaps a large temp table). Then run the second select (the one after the CTE) against that table to combine your data into an ordered list.
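As a rough sketch of the staging idea above (not a tuned solution): the working-table name #studentreport_stage and the index name ix_stage are made up for illustration, and the column types are assumed to match the question's student table. It also selects the distinct student IDs first, so the concatenation runs once per student rather than once per row.

```sql
-- Stage the formatted rows once. #studentreport_stage is a hypothetical name.
select studentid,
       date,
       '#' + subject + ';' + grade + ';' + convert(varchar(10), date, 101) as report
into   #studentreport_stage
from   student;

-- Keep each student's rows together, newest first (index name is an assumption).
create clustered index ix_stage on #studentreport_stage (studentid, date desc);

-- Build one concatenated string per student from the staged rows.
select s.studentid,
       (select t2.report as [text()]        -- [text()] avoids wrapping each value in an element
        from   #studentreport_stage t2
        where  t2.studentid = s.studentid
        order by t2.date desc
        for xml path('')) as report
from   (select distinct studentid from #studentreport_stage) s;
```

Whether this actually beats the CTE version depends on the execution plan and the data, so it is worth comparing both on a realistic volume before committing to either.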

The number of comments on your question suggests that you have a large problem (or several). You're converting tall, skinny data (think integers and datetime2 types) into ordered lists inside strings. Try instead to store the data in the smallest formats available and defer any string manipulation until afterward (or never). Alternatively, give serious thought to creating an XML column to replace the 'report' field.

If you can make it work, this is what I would do (including a test case without indexes). Your mileage may vary, but give it a try:

create table #student (id int not null, studentid int not null, date datetime not null, subject varchar(40), grade varchar(40))

insert into #student (id,studentid,date,subject,grade)
select 1, 1, getdate(), 'history', 'A-' union all
select 2, 1, dateadd(d,1,getdate()), 'computer science', 'b' union all
select 3, 1, dateadd(d,2,getdate()), 'art', 'q' union all
--
select 1, 2, getdate() , 'something', 'F' union all
select 2, 2, dateadd(d,1,getdate()), 'genetics', 'e' union all
select 3, 2, dateadd(d,2,getdate()), 'art', 'D+' union all
--
select 1, 3, getdate() , 'memory loss', 'A-' union all
select 2, 3, dateadd(d,1,getdate()), 'creative writing', 'A-' union all
select 3, 3, dateadd(d,2,getdate()), 'history of asia 101', 'A-'

go

select      studentid as studentid
            ,(select s2.date as '@date', s2.subject as '@subject', s2.grade as '@grade' 
            from #student s2 where s1.studentid = s2.studentid for xml path('report'), type) as 'reports'
from        (select distinct studentid from #student) s1;

I don't know how to make the output legible here, but the result set has two columns: the first is an integer (studentid) and the second is XML with one node per report. This still isn't as ideal as just sending the raw result set, but it does give one row per studentid.
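The query in the question ordered each student's entries by date, newest first, while the XML version above leaves the node order unspecified. A minimal variation that restores that ordering, assuming the same #student test table, could look like this:

```sql
-- Same query as above, with the report nodes ordered newest first.
select      s1.studentid as studentid
            ,(select s2.date as '@date', s2.subject as '@subject', s2.grade as '@grade'
              from #student s2
              where s2.studentid = s1.studentid
              order by s2.date desc
              for xml path('report'), type) as reports
from        (select distinct studentid from #student) s1;
```

Each row then carries a sequence of attribute-only nodes of the form <report date="..." subject="..." grade="..."/>, one per record for that student.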
