Need help optimizing very slow DB2 SQL query touching millions of records

I'm trying to optimize the following SQL, but I'm fairly new to SQL optimization and I'm not making much headway. (I generalized the columns and other identifiers due to company policy.) In its current state, this SQL takes between 1 and 2 minutes to run, depending on load. The VKTINFO table contains about 1 million records and the GNTINFO table contains about 3 million records. Normally 1 to 2 minutes wouldn't be a big deal if this were a batch process, but we have agents who need this information live and as quickly as possible. To make matters worse, our system eventually times out and returns an error to the user. Extending the timeout window is not an option. We have other criteria to search against (e.g. first name, zip code, account type, account status), but when a broad search such as the one below is performed, the query becomes rather slow.

If there are any suggestions or techniques for rewriting this SQL to speed up the select, I would greatly appreciate them. If more information is needed, I will gladly provide as much as I can while still complying with our company policy.

edit: As requested, here are the indexes for the VKTINFO and GNTINFO tables:

  • account_number
  • expiration_date
  • effective_date

Indexes for the gnt_account_info and vkt_account_info:

  • pi_account_num
  • pi_policy_num_gid

Indexes for the gntnad and vktnad tables:

  • nad_account_number
  • nad_name_type

Index for the gntpolrf and vktpolrf tables:

  • xrf_account_number

select
    processing_system,
    total_premium,
    quote_by,
    email_address,
    account_number,
    expiration_date,
    account_state,
    xrf_file,
    customer_name
from
(
    select
        'ABCD' as processing_system,
        total_premium,
        quote_by,
        email_address,
        account_number,
        expiration_date,
        account_state,
        xrf_file,
        customer_name
    from vktinfo
    left outer join vkt_account_info
        on account_number = pi_account_number
    left outer join vktpolrf
        on account_number = xrf_account_number
    left outer join VKTNAD
        on account_number = nad_account_number
        and history_expiration_date = nad_history_expiration_date
        and nad_name_type = 'HA'
    where effective_date >= '2013-02-01'
      and effective_date <= '2013-02-28'
      and customer_name like '_SMITH%'
      and account_state = 'South Carolina'
    union all
    select
        'EFGH' as processing_system,
        total_premium,
        quote_by,
        email_address,
        account_number,
        expiration_date,
        account_state,
        xrf_file,
        customer_name
    from gntinfo
    left outer join gnt_account_info
        on account_number = pi_account_number
    left outer join vktpolrf
        on account_number = xrf_account_number
    left outer join GNTNAD
        on account_number = nad_account_number
        and history_expiration_date = nad_history_expiration_date
        and nad_name_type = 'HA'
    where effective_date >= '2013-02-01'
      and effective_date <= '2013-02-28'
      and customer_name like '_SMITH%'
      and account_state = 'South Carolina'
) a
order by customer_name asc
fetch first 1000 rows only
with ur

I don't have a rock-solid answer for you. But I do have some things you can try. I understand you don't have permissions to get an execution plan.

  • Check with someone who's been there a while, and ask whether you're supposed to be able to run EXPLAIN.
  • You probably need an index on account_state. Rule of thumb: index every column used in a join condition or a WHERE clause. Sometimes multi-column indexes perform better than several single-column indexes.
  • Try moving as much of the subqueries' WHERE clauses as you can to the outer query, and test two things:
    • Use those parts in an ordinary WHERE clause in the outer query.
    • Rearrange the outer query so that, instead of selecting from the UNIONed subquery, you do an inner join on it.
  • Determine whether any of the left outer joins can be replaced by inner joins. The table that stores "nad_name_type" is a likely candidate for an inner join. (Do you understand why?)
  • Test the performance of the subquery when it's implemented as a view. You might need DBA help with that. (If they don't let you run EXPLAIN, they probably don't let you create views, either.)
  • Test the performance of the subquery when it's implemented as a materialized query table. You might need DBA help with that, too.
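
To make a few of these suggestions concrete, here is a rough sketch for the VKTINFO branch only. It rests on guesses, because the identifiers in the question were generalized: I'm assuming account_state and effective_date live in vktinfo and that customer_name lives in VKTNAD. The index names are made up. Treat it as something to test, not as a known fix.

```sql
-- Sketch only. Assumptions not confirmed by the question:
--   * account_state and effective_date are columns of vktinfo / gntinfo
--   * customer_name is a column of VKTNAD
--   * the index names ix_* are hypothetical

-- Multi-column index: the equality predicate first, then the date range.
CREATE INDEX ix_vktinfo_state_eff
    ON vktinfo (account_state, effective_date);

-- Same idea for the second branch of the UNION ALL.
CREATE INDEX ix_gntinfo_state_eff
    ON gntinfo (account_state, effective_date);

-- One branch with VKTNAD joined as an inner join.
-- If customer_name comes from VKTNAD, the LIKE predicate in the WHERE
-- clause already rejects the NULL-extended rows that the outer join
-- would produce, so the result should not change.
SELECT
    'ABCD' AS processing_system,
    total_premium,
    quote_by,
    email_address,
    account_number,
    expiration_date,
    account_state,
    xrf_file,
    customer_name
FROM vktinfo
LEFT OUTER JOIN vkt_account_info
    ON account_number = pi_account_number
LEFT OUTER JOIN vktpolrf
    ON account_number = xrf_account_number
INNER JOIN VKTNAD
    ON account_number = nad_account_number
    AND history_expiration_date = nad_history_expiration_date
    AND nad_name_type = 'HA'
WHERE effective_date >= '2013-02-01'
  AND effective_date <= '2013-02-28'
  AND customer_name LIKE '_SMITH%'
  AND account_state = 'South Carolina';

-- If you are allowed to run it (and the explain tables exist),
-- capture the access plan before and after each change.
EXPLAIN PLAN FOR
SELECT account_number
FROM vktinfo
WHERE effective_date >= '2013-02-01'
  AND effective_date <= '2013-02-28'
  AND account_state = 'South Carolina';
```

Compare timings and plans after each change separately, so you can tell which one actually helped. Note also that the pattern '_SMITH%' starts with a wildcard, so an ordinary index on customer_name is unlikely to narrow that predicate by itself.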
