
SAS Teradata FastLoad issue

Is there a fast way to load data into Teradata? I need to load 350,000 account numbers into Teradata, and the job has been running for about 4.5 hours now.

I am just using a data step. Below is my code. Thank you

libname myid  teradata authdomain=IDWPRD server=IDWPRD database=myid mode=teradata connection=global;

proc delete data=myid.tera1;
run;

proc sql; 
create table out.REQ_1_1_05l as 
select distinct ACCOUNT_NB as ACCT_NB
FROM OUT.REQ_1_1_05;
quit;

data myid.tera1;
set OUT.REQ_1_1_05l ;
run;

Use the bulkload=yes option in your libname statement:

libname myid  teradata authdomain=IDWPRD server=IDWPRD database=myid mode=teradata connection=global bulkload=yes;

data myid.want;
     set have;
run;

Additional performance information specific to Teradata can be found here: http://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a001405937.htm

This is most often the result of a bad practice. If 350,000 records take more than a few minutes, even without a bulk-load utility, that is surprising to me (unless it is a very wide table).

In Teradata, table rows are distributed across Access Module Processors (AMPs). Row distribution depends on the uniqueness of the column defined as the primary index: the more unique the primary index values, the more even the distribution, and vice versa. Uneven distribution of rows across the AMPs results in skewed data.
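To see how a candidate primary index column would spread rows across the AMPs, one option is a pass-through query using Teradata's hash functions. This is only a sketch: it assumes the table myid.tera1 already exists with the column ACCT_NB from the question, and that your SAS release supports CONNECT USING with an existing libref (the alias td is arbitrary).

```sas
proc sql;
  /* Reuse the connection defined by the myid libref. */
  connect using myid as td;

  /* Count the rows that hash to each AMP for the candidate column. */
  select * from connection to td (
    select hashamp(hashbucket(hashrow(ACCT_NB))) as amp_no,
           count(*) as row_cnt
    from myid.tera1
    group by 1
    order by 2 desc
  );

  disconnect from td;
quit;
```

Roughly equal counts per AMP suggest an even distribution; a few AMPs holding most of the rows indicate skew.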

The DATA step below creates a Teradata table with the first column as the primary index. If the first column has few distinct values, a skewed table is created. A skewed table wastes space and can make your queries take an unusually long time to finish.

data myid.tera1;
  set OUT.REQ_1_1_05l;
run;

The data set option dbcreate_table_opts= can define the primary index explicitly. It takes the keywords primary index followed by the column name in parentheses.

 data  myid.tera1
    (dbcreate_table_opts= 'primary index(yourcolumn)');  
  set OUT.REQ_1_1_05l; 
 run;

Please select an appropriate, ideally unique, primary index; this is often the most important thing in Teradata.
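A quick way to judge whether a column is a good primary index candidate is to compare its distinct count with the row count on the SAS side before loading. The sketch below uses the data set and column from the question; substitute your own candidate column.

```sas
proc sql;
  /* If n_distinct is close to n_rows, the column is nearly unique. */
  select count(*)                as n_rows,
         count(distinct ACCT_NB) as n_distinct
  from OUT.REQ_1_1_05l;
quit;
```

In the question the data set was built with select distinct on a single column, so the two counts should match and ACCT_NB would be a natural choice.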

Please look at the paper below, which explains common issues SAS programmers run into when using Teradata.

https://www.lexjansen.com/mwsug/2016/SA/MWSUG-2016-SA11.pdf

You can also use the FastLoad utility as shown below. FastLoad performs bulk loading and makes moving data from SAS to Teradata tremendously fast.

    data  myid.tera1
    (fastload =yes dbcreate_table_opts= 'primary index(yourcolumn)');  
  set OUT.REQ_1_1_05l; 
 run;
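Keep in mind that Teradata FastLoad only loads into an empty table, so a rerun needs the previous copy removed first. A minimal sketch of the full sequence, assuming ACCT_NB (the column from the question) is the chosen primary index:

```sas
/* Drop the previous copy so that FastLoad starts from an empty table. */
proc delete data=myid.tera1;
run;

/* Recreate the table with an explicit primary index and load it with FastLoad. */
data myid.tera1 (fastload=yes dbcreate_table_opts='primary index(ACCT_NB)');
  set OUT.REQ_1_1_05l;
run;
```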

Look at the paper by Jeff Bailey if you want to know everything about SAS and Teradata data movement.

https://support.sas.com/.../EffectivelyMovingSASDataintoTeradata.pdf

Finally, check whether your table myid.tera1 is a SET table. This may not be a major factor, but a SET table does not allow row-level duplicates, so Teradata checks every row for duplicates before inserting it, which adds to the load time. If you run show table in Teradata SQL Assistant, the output tells you whether it is a SET or MULTISET table.
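If you do not have Teradata SQL Assistant at hand, the same check can probably be run from SAS through explicit pass-through. This is a sketch under assumptions: that CONNECT USING is available in your SAS release and that the SHOW TABLE result comes back as a normal result set (long DDL text may be truncated in the output).

```sas
proc sql;
  connect using myid as td;

  /* The returned DDL starts with CREATE SET TABLE or CREATE MULTISET TABLE. */
  select * from connection to td (
    show table myid.tera1
  );

  disconnect from td;
quit;
```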

Add the dbcommit= option to your libname statement. It sets how many rows are inserted before a commit is issued; if a commit is issued after every record, the load is very slow. Play around with this value to find the optimal setting for your configuration.

libname myid teradata authdomain=IDWPRD server=IDWPRD database=myid mode=teradata connection=global dbcommit=5000 ;

https://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a001371531.htm
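dbcommit= can also be given as a data set option, which makes it easy to try different values for a single load without touching the libname statement. A minimal sketch; the value 5000 is just a starting point, not a recommendation:

```sas
/* Commit after every 5000 inserted rows for this load only. */
data myid.tera1 (dbcommit=5000);
  set OUT.REQ_1_1_05l;
run;
```

Comparing the real time reported in the SAS log for a few values shows which setting suits your configuration.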
