A fairly good article on improving sqlldr performance:
http://www.remote-dba.net/teas_rem_util18.htm
1. Preparing the files:
create table L5M.load_01 as
select 1 as u_id ,a.* from (
select * from all_tables where 1=0 )a;
[oracle@qht108 sqlldr]$ cat para.txt
userid=l5m/l5m
control='/home/oracle/sqlldr/control.txt'
data='/home/oracle/sqlldr/data.txt'
log='/home/oracle/sqlldr/log.txt'
bad='/home/oracle/sqlldr/bad.txt'
#errors=0
direct=true
[oracle@qht108 sqlldr]$ cat control.txt
load data
append into table LOAD_01
fields terminated by ',' TRAILING NULLCOLS
(u_id recnum,OWNER,TABLE_NAME,TABLESPACE_NAME,CLUSTER_NAME,IOT_NAME,STATUS,PCT_FREE,PCT_USED,INI_TRANS,MAX_TRANS,INITIAL_EXTENT,NEXT_EXTENT,MIN_EXTENTS,MAX_EXTENTS,PCT_INCREASE,FREELISTS,FREELIST_GROUPS,LOGGING,BACKED_UP,NUM_ROWS,BLOCKS,EMPTY_BLOCKS,AVG_SPACE,CHAIN_CNT,AVG_ROW_LEN,AVG_SPACE_FREELIST_BLOCKS,NUM_FREELIST_BLOCKS,DEGREE,INSTANCES,CACHE,TABLE_LOCK,SAMPLE_SIZE,LAST_ANALYZED date 'yyyy-mm-dd HH24:mi:ss',PARTITIONED,IOT_TYPE,TEMPORARY,SECONDARY,NESTED,BUFFER_POOL,ROW_MOVEMENT,GLOBAL_STATS,USER_STATS,DURATION,SKIP_CORRUPT,MONITORING,CLUSTER_OWNER,DEPENDENCIES,COMPRESSION,DROPPED)
data.txt was generated by exporting the contents of all_tables to a text file and then duplicating it N times, which gives a 232M file.
[oracle@qht108 sqlldr]$ tail -5 data.txt
SYS,SCHEDULER$_WINDOW_GROUP,SYSTEM,,,VALID,10,40,1,255,65536,,1,2147483645,,1,1,YES,N,1,1,0,0,0,7,0,0, 1, 1, N,ENABLED,1,2008-2-17 2:02:38,NO,,N,N,NO,DEFAULT,DISABLED,YES,NO,,DISABLED,YES,,DISABLED,DISABLED,NO
SYS,SCHEDULER$_WINGRP_MEMBER,SYSTEM,,,VALID,10,40,1,255,65536,,1,2147483645,,1,1,YES,N,2,1,0,0,0,8,0,0, 1, 1, N,ENABLED,2,2008-2-17 2:02:38,NO,,N,N,NO,DEFAULT,DISABLED,YES,NO,,DISABLED,YES,,DISABLED,DISABLED,NO
SYS,SCHEDULER$_SCHEDULE,SYSTEM,,,VALID,10,40,1,255,65536,,1,2147483645,,1,1,YES,N,1,1,0,0,0,50,0,0, 1, 1, N,ENABLED,1,2008-2-17 2:02:38,NO,,N,N,NO,DEFAULT,DISABLED,YES,NO,,DISABLED,YES,,DISABLED,DISABLED,NO
SYS,SCHEDULER$_CHAIN,SYSTEM,,,VALID,10,40,1,255,65536,,1,2147483645,,1,1,YES,N,0,0,0,0,0,0,0,0, 1, 1, N,ENABLED,0,2008-2-17 2:02:37,NO,,N,N,NO,DEFAULT,DISABLED,YES,NO,,DISABLED,YES,,DISABLED,DISABLED,NO
SYS,SCHEDULER$_STEP,SYSTEM,,,VALID,10,40,1,255,6
[oracle@qht108 sqlldr]$ du -h data.txt
232M data.txt
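The duplication step can be scripted. This is only a sketch of one way to do it, not the method actually used here: replicate_file is a hypothetical helper, and seed.txt stands for the single export of all_tables (the name is an assumption).

```shell
# Hypothetical helper: write COPIES back-to-back copies of SEED into OUT.
# Usage: replicate_file SEED OUT COPIES
replicate_file() {
    seed=$1; out=$2; copies=$3
    : > "$out"                      # start from an empty output file
    i=0
    while [ "$i" -lt "$copies" ]; do
        cat "$seed" >> "$out"       # append one more copy of the seed
        i=$((i + 1))
    done
}
```

For example, replicate_file seed.txt data.txt 200 followed by du -h data.txt shows the resulting size; raise or lower the copy count until the file is as large as you need.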
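Create a redo_size view to make it easy to check the redo size. Note that it reports redo for the whole system, because there is no way to read v$mystat for the sqlldr session.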
create or replace view redo_size
as
select value
from v$sysstat, v$statname
where v$sysstat.statistic# = v$statname.statistic#
and v$statname.name = 'redo size';
2. Result of a load using the parameter file above as-is.
[oracle@qht108 sqlldr]$ sqlldr parfile=para.txt
SQL*Loader: Release 10.2.0.4.0 - Production on Wed Jul 16 12:34:31 2008
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Load completed - logical record count 311379.
Check the load log: it took 1 minute 17.92 seconds.
[oracle@qht108 sqlldr]$ tail -29 log.txt
Table LOAD_01:
1123237 Rows successfully loaded.
4 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Date cache:
Max Size: 1000
Entries : 83
Hits : 1055798
Misses : 0
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes: 256000
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 1123241
Total logical records rejected: 4
Total logical records discarded: 0
Total stream buffers loaded by SQL*Loader main thread: 233
Total stream buffers loaded by SQL*Loader load thread: 924
Run began on Wed Jul 16 13:48:58 2008
Run ended on Wed Jul 16 13:50:16 2008
Elapsed time was: 00:01:17.92
CPU time was: 00:00:15.66
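To compare runs it helps to turn the row count and elapsed time in log.txt into a throughput figure. A minimal sketch, assuming the log keeps the layout shown above; load_rate is a hypothetical helper name, not part of sqlldr.

```shell
# Hypothetical helper: summarize a sqlldr log as rows, seconds and rows/s.
# Usage: load_rate LOGFILE
load_rate() {
    awk '
        /Rows successfully loaded/ { rows = $1 }   # e.g. "1123237 Rows successfully loaded."
        /Elapsed time was:/ {
            split($NF, t, ":")                     # last field is HH:MM:SS.ss
            secs = t[1] * 3600 + t[2] * 60 + t[3]
        }
        END {
            if (secs > 0)
                printf "%d rows in %.2f s = %.0f rows/s\n", rows, secs, rows / secs
        }
    ' "$1"
}
```

Running load_rate log.txt after the load above works out to roughly 14,400 rows per second.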
3. Add indexes and see what happens
SQL>truncate table l5m.load_01;
SQL>create index l5m.i_load01 on l5m.load_01(u_id,owner,table_name);
SQL>create index l5m.i_load02 on l5m.load_01(tablespace_name,logging);
SQL> select * from redo_size;
VALUE
----------
47991260
Check the load log: it took 2 minutes 55 seconds, more than twice as long as without the indexes.
[oracle@qht108 sqlldr]$ tail -29 log.txt
Table LOAD_01:
1123237 Rows successfully loaded.
4 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Date cache:
Max Size: 1000
Entries : 83
Hits : 1055798
Misses : 0
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes: 256000
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 1123241
Total logical records rejected: 4
Total logical records discarded: 0
Total stream buffers loaded by SQL*Loader main thread: 233
Total stream buffers loaded by SQL*Loader load thread: 924
Run began on Wed Jul 16 14:47:02 2008
Run ended on Wed Jul 16 14:49:57 2008
Elapsed time was: 00:02:55.14
CPU time was: 00:00:16.60
SQL> select * from redo_size;
VALUE
----------
48791144
SQL> select 48791144-47991260 from dual;
48791144-47991260
-----------------
799884
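4. Testing the rows and bindsize parameters
When you set rows for a conventional-path load, you also have to watch the bindsize value. Otherwise sqlldr divides bindsize by the size of each row and replaces your rows setting with the resulting row count. Direct-path loads are not subject to this limit.
4.1 First, the conventional-path load
The parameter file was changed: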
[oracle@qht108 sqlldr]$ cat para.txt
userid=l5m/l5m
control='/home/oracle/sqlldr/control.txt'
data='/home/oracle/sqlldr/data.txt'
log='/home/oracle/sqlldr/log.txt'
bad='/home/oracle/sqlldr/bad.txt'
rows=300
bindsize=2560000
SQL> select * from redo_size;
VALUE
----------
604331212
Part of log.txt follows.
Note that rows=300 was automatically recalculated from bindsize 2560000, giving 202 rows.
Table LOAD_01:
1123237 Rows successfully loaded.
4 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Space allocated for bind array: 2558330 bytes(202 rows)
Read buffer bytes: 2560000
Total logical records skipped: 0
Total logical records read: 1123241
Total logical records rejected: 4
Total logical records discarded: 0
Run began on Thu Jul 17 16:19:25 2008
Run ended on Thu Jul 17 16:27:52 2008
Elapsed time was: 00:08:27.76
CPU time was: 00:01:02.26
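The 202 can be reproduced from the "Space allocated for bind array" line. The sketch below assumes the per-row size is simply the allocated bytes divided by the reported rows; the function names are hypothetical, and the real bind array calculation depends on the field lengths in the control file.

```shell
# Hypothetical helpers for the bind array arithmetic.
rows_that_fit() { echo $(( $1 / $2 )); }   # BINDSIZE ROW_BYTES -> rows sqlldr can hold
bindsize_for()  { echo $(( $1 * $2 )); }   # ROWS ROW_BYTES     -> bindsize needed

row_bytes=$(( 2558330 / 202 ))             # bytes per row, taken from log.txt
echo "bytes per row:             $row_bytes"
echo "rows fitting in 2560000:   $(rows_that_fit 2560000 "$row_bytes")"
echo "bindsize needed, rows=300: $(bindsize_for 300 "$row_bytes")"
```

Under that assumption each row takes 12665 bytes, so bindsize=2560000 holds only 202 rows, and rows=300 would need a bindsize of at least 3799500.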
It generates a lot of redo, which is to be expected for a conventional-path load.
SQL> select * from redo_size;
VALUE
----------
1211475560
SQL> select 1211475560-604331212 from dual;
1211475560-604331212
--------------------
607144348
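To put that figure next to the direct-path run from section 3, the two redo_size deltas can be converted to megabytes and compared. A small sketch using only the values printed above; redo_delta and to_mb are hypothetical helper names, and because redo_size is system-wide, the numbers include whatever else the instance was doing at the time.

```shell
# Hypothetical helpers: difference of two redo_size readings, and bytes -> MB.
redo_delta() { echo $(( $2 - $1 )); }                                # BEFORE AFTER
to_mb()      { awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1048576 }'; }

conv=$(redo_delta 604331212 1211475560)    # conventional path, section 4.1
direct=$(redo_delta 47991260 48791144)     # direct path with indexes, section 3

echo "conventional: $conv bytes ($(to_mb "$conv") MB)"
echo "direct:       $direct bytes ($(to_mb "$direct") MB)"
echo "ratio:        about $(( conv / direct )) to 1"
```

On these numbers the conventional-path load produced several hundred times as much redo as the direct-path load.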
4.2 Try a direct-path load with rows set to 10000
[oracle@qht108 sqlldr]$ cat para.txt
userid=l5m/l5m
control='/home/oracle/sqlldr/control.txt'
data='/home/oracle/sqlldr/data.txt'
log='/home/oracle/sqlldr/log.txt'
bad='/home/oracle/sqlldr/bad.txt'
rows=10000
direct=true
SQL>truncate table l5m.load_01;
SQL> select * from redo_size;
VALUE
----------
1213797372
[oracle@qht108 sqlldr]$ sqlldr parfile=para.txt
SQL*Loader: Release 10.2.0.4.0 - Production on Thu Jul 17 16:44:42 2008
Copyright (c) 1982, 2007, Oracle. All rights reserved.
Save data point reached - logical record count 10000.
Save data point reached - logical record count 20000.
Save data point reached - logical record count 30000.
Save data point reached - logical record count 40000.
Save data point reached - logical record count 50000.
Save data point reached - logical record count 60000.
Save data point reached - logical record count 70000.
Save data point reached - logical record count 80000.
....
You can see that a save happens every 10000 rows. Note that this is a data save, not a commit.
For the difference between commit and save, see:
http://www.itpub.net/thread-1022543-2-2.html
Part of log.txt:
The following index(es) on table LOAD_01 were processed:
index L5M.I_LOAD01 loaded successfully with 1123237 keys
index L5M.I_LOAD02 loaded successfully with 985224 keys
Table LOAD_01:
1123237 Rows successfully loaded.
4 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Date cache:
Max Size: 1000
Entries : 83
Hits : 1055798
Misses : 0
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes: 256000
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 1123241
Total logical records rejected: 4
Total logical records discarded: 0
Total stream buffers loaded by SQL*Loader main thread: 345
Total stream buffers loaded by SQL*Loader load thread: 822
Run began on Thu Jul 17 16:44:42 2008
Run ended on Thu Jul 17 16:48:15 2008
Elapsed time was: 00:03:33.79
CPU time was: 00:00:20.74
SQL> select * from redo_size;
VALUE
----------
1214767672
SQL> select 1214767672-1213797372 from dual;
1214767672-1213797372
---------------------
970300
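5. Testing the unrecoverable parameter. (The indexes were not dropped; all of the following tests are run with the indexes in place.)
Note: unrecoverable goes in the control file, above load data, not in the parameter file.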
[oracle@qht108 sqlldr]$ cat control.txt
unrecoverable
load data
append into table LOAD_01
fields terminated by ',' TRAILING NULLCOLS
(u_id recnum,OWNER,TABLE_NAME,TABLESPACE_NAME,CLUSTER_NAME,IOT_NAME,STATUS,PCT_FREE,PCT_USED,INI_TRANS,MAX_TRANS,INITIAL_EXTENT,NEXT_EXTENT,MIN_EXTENTS,MAX_EXTENTS,PCT_INCREASE,FREELISTS,FREELIST_GROUPS,LOGGING,BACKED_UP,NUM_ROWS,BLOCKS,EMPTY_BLOCKS,AVG_SPACE,CHAIN_CNT,AVG_ROW_LEN,AVG_SPACE_FREELIST_BLOCKS,NUM_FREELIST_BLOCKS,DEGREE,INSTANCES,CACHE,TABLE_LOCK,SAMPLE_SIZE,LAST_ANALYZED date 'yyyy-mm-dd HH24:mi:ss',PARTITIONED,IOT_TYPE,TEMPORARY,SECONDARY,NESTED,BUFFER_POOL,ROW_MOVEMENT,GLOBAL_STATS,USER_STATS,DURATION,SKIP_CORRUPT,MONITORING,CLUSTER_OWNER,DEPENDENCIES,COMPRESSION,DROPPED)
SQL>truncate table l5m.load_01;
SQL> select * from redo_size;
VALUE
----------
49505816
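Check the load log: it took 3 minutes 04 seconds, slightly slower than without unrecoverable. This does not quite match Donald's conclusion, but it is not worth fussing over: Donald only saw a 5% speedup and I saw a 5% slowdown, which is probably down to the server's performance.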
[oracle@qht108 sqlldr]$ tail -29 log.txt
Table LOAD_01:
1123237 Rows successfully loaded.
4 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Date cache:
Max Size: 1000
Entries : 83
Hits : 1055798
Misses : 0
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes: 256000
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 1123241
Total logical records rejected: 4
Total logical records discarded: 0
Total stream buffers loaded by SQL*Loader main thread: 233
Total stream buffers loaded by SQL*Loader load thread: 924
Run began on Wed Jul 16 15:22:02 2008
Run ended on Wed Jul 16 15:25:06 2008
Elapsed time was: 00:03:04.00
CPU time was: 00:00:23.29
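What we care about now is whether less redo was generated. This run also produced 799580 bytes of redo, so it is much the same as without unrecoverable.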
SQL> select * from redo_size;
VALUE
----------
50305396
SQL> select 50305396-49505816 from dual;
50305396-49505816
-----------------
799580
6. Testing the skip_index_maintenance parameter.
[oracle@qht108 sqlldr]$ echo "skip_index_maintenance=true" >> para.txt
[oracle@qht108 sqlldr]$ cat para.txt
userid=l5m/l5m
control='/home/oracle/sqlldr/control.txt'
data='/home/oracle/sqlldr/data.txt'
log='/home/oracle/sqlldr/log.txt'
bad='/home/oracle/sqlldr/bad.txt'
#errors=0
direct=true
skip_index_maintenance=true
SQL>truncate table l5m.load_01;
SQL> select * from redo_size;
VALUE
----------
51052076
The log shows it took only 1 minute 17 seconds, the same as the case without indexes, and the message "SKIP_INDEX_MAINTENANCE option requested" appears.
The following index(es) on table LOAD_01 were processed:
index L5M.I_LOAD01 was made unusable due to:
SKIP_INDEX_MAINTENANCE option requested
index L5M.I_LOAD02 was made unusable due to:
SKIP_INDEX_MAINTENANCE option requested
Table LOAD_01:
1123237 Rows successfully loaded.
4 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Date cache:
Max Size: 1000
Entries : 83
Hits : 1055798
Misses : 0
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes: 256000
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 1123241
Total logical records rejected: 4
Total logical records discarded: 0
Total stream buffers loaded by SQL*Loader main thread: 233
Total stream buffers loaded by SQL*Loader load thread: 924
Run began on Wed Jul 16 16:01:13 2008
Run ended on Wed Jul 16 16:02:30 2008
Elapsed time was: 00:01:16.96
CPU time was: 00:00:16.16
Now look at the amount of redo generated and the state of the indexes.
Because the redo for index maintenance was skipped, far less redo was generated.
SQL> select * from redo_size;
VALUE
----------
51459748
SQL> select 51459748-51052076 from dual;
51459748-51052076
-----------------
407672
SQL> select index_name,status from dba_indexes
2 where owner='L5M' and table_name='LOAD_01';
INDEX_NAME STATUS
------------------------------ --------
I_LOAD01 UNUSABLE
I_LOAD02 UNUSABLE
SQL> select segment_name,blocks from dba_segments where owner='L5M' and segment_type='INDEX';
SEGMENT_NAME BLOCKS
-------------------- ----------
I_LOAD02 8
I_LOAD01 8
As you can see, the indexes are unusable and must be rebuilt before they can be used.
SQL> alter index l5m.i_load01 rebuild;
Index altered.
SQL> alter index l5m.i_load02 rebuild;
Index altered.
SQL> select index_name,status from dba_indexes
2 where owner='L5M' and table_name='LOAD_01';
INDEX_NAME STATUS
------------------------------ --------
I_LOAD01 VALID
I_LOAD02 VALID
SQL> select segment_name,blocks from dba_segments where owner='L5M' and segment_type='INDEX';
SEGMENT_NAME BLOCKS
-------------------- ----------
I_LOAD02 3072
I_LOAD01 6016
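With more than a couple of indexes, the rebuild statements can be generated instead of typed by hand. A sketch only: rebuild_sql is a hypothetical helper that prints the same kind of statement used above for each index name it is given.

```shell
# Hypothetical helper: print one "alter index ... rebuild;" per index.
# Usage: rebuild_sql OWNER INDEX...
rebuild_sql() {
    owner=$1; shift
    for idx in "$@"; do
        printf 'alter index %s.%s rebuild;\n' "$owner" "$idx"
    done
}

rebuild_sql l5m i_load01 i_load02
```

The output can be saved to a script and run from SQL*Plus after checking it.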
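7. Testing the skip_unusable_indexes parameter
I set this parameter to true and then to false and did not see any difference.
8. Testing the commit_discontinued parameter
This parameter defaults to false. It controls whether rows that have already been loaded are committed automatically when the load is interrupted unexpectedly. In my test, with it set to TRUE the data was committed when sqlldr was aborted; with the default of false it was not.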