Stats collection using CTAS or direct path load

One of the new features of 12.1 is the automatic stats collection when performing a Create table as Select action.

But be aware of the site effects when using this feature, one a index is created it is also maintained but only for the first time, when an append is performed or a turncate table and a new CTAS is performed the index related to the table is not updated. This still requires manual interfereance.

The output:

Bulk load will occur if create table as select or insert into … select
Create a table MYEMP


SQL> create table myemp as (
2 SELECT LEVEL empl_id,
3 MOD (ROWNUM, 50000) dept_id,
4 TRUNC (DBMS_RANDOM.VALUE (1000, 50000), 2) salary,
5 DECODE (ROUND (DBMS_RANDOM.VALUE (1, 2)), 1, 'M', 2, 'F') gender,
6 TO_DATE ( ROUND (DBMS_RANDOM.VALUE (1, 28))
7 || '-'
8 || ROUND (DBMS_RANDOM.VALUE (1, 12))
9 || '-'
10 || ROUND (DBMS_RANDOM.VALUE (1950, 2013)),
11 'DD-MM-YYYY'
12 ) dob,
13 DBMS_RANDOM.STRING ('x', DBMS_RANDOM.VALUE (20, 40)) address
14 FROM DUAL
15 CONNECT BY LEVEL <= 5000);

Table created.

SQL> commit;
Commit complete.


SQL> explain plan for create table myemp as (
2 SELECT LEVEL empl_id,
3 MOD (ROWNUM, 50000) dept_id,
4 TRUNC (DBMS_RANDOM.VALUE (1000, 50000), 2) salary,
5 DECODE (ROUND (DBMS_RANDOM.VALUE (1, 2)), 1, 'M', 2, 'F') gender,
6 TO_DATE ( ROUND (DBMS_RANDOM.VALUE (1, 28))
7 || '-'
8 || ROUND (DBMS_RANDOM.VALUE (1, 12))
9 || '-'
10 || ROUND (DBMS_RANDOM.VALUE (1950, 2013)),
11 'DD-MM-YYYY'
12 ) dob,
13 DBMS_RANDOM.STRING ('x', DBMS_RANDOM.VALUE (20, 40)) address
14 FROM DUAL
15 CONNECT BY LEVEL <= 5000);

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
------------------------------------------------------------
Plan hash value: 1600317434


------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | CREATE TABLE STATEMENT | | 1 | 3 (0)| 00:00:01 |
| 1 | LOAD AS SELECT | MYEMP | | | |
| 2 | OPTIMIZER STATISTICS GATHERING | | 1 | 2 (0)| 00:00:01 |
| 3 | COUNT | | | | |
|* 4 | CONNECT BY WITHOUT FILTERING | | | | |
| 5 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |


PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------


4 - filter(LEVEL<=5000) 17 rows selected. Check if stats are gathered for myemp SQL> col table_name for a10
SQL> col column_name for a10
SQL> col owner for a10
SQL> col notes for a20
SQL> select to_char(LAST_ANALYZED,'dd-mm-yyyy hh24:mi:ss'),table_name,owner from DBA_TABLES where TABLE_NAME='MYEMP';


TO_CHAR(LAST_ANALYZ TABLE_NAME OWNER
------------------- ---------- ----------
03-11-2014 08:36:10 MYEMP BCB

Display notes column to display if auto stats is load

SQL> select column_name, num_distinct, histogram, notes
2 from user_tab_col_statistics where table_name='MYEMP';

COLUMN_NAM NUM_DISTINCT HISTOGRAM NOTES
---------- ------------ --------------- --------------------
EMPL_ID 5000 NONE STATS_ON_LOAD
DEPT_ID 5000 NONE STATS_ON_LOAD
SALARY 4997 NONE STATS_ON_LOAD
GENDER 2 NONE STATS_ON_LOAD
DOB 4437 NONE STATS_ON_LOAD
ADDRESS 5000 NONE STATS_ON_LOAD


6 rows selected.

SQL> select num_rows,blocks from user_tables where table_name='MYEMP';

NUM_ROWS BLOCKS
---------- ----------
5000 47

Check index stats when an index is created – happend since 10g if table which create the index is not empty.

SQL> create index myindex1 on myemp(dept_id);
Index created.


SQL> col table_name format a10
SQL> col index_name format a10
SQL> select to_char(LAST_ANALYZED,'dd-mm-yyyy hh24:mi:ss'),index_name,table_name,owner from DBA_INDEXES where TABLE_NAME='MYEMP';

TO_CHAR(LAST_ANALYZ INDEX_NAME TABLE_NAME OWNER
------------------- ---------- ---------- ----------
03-11-2014 08:36:18 MYINDEX1 MYEMP BCB

SQL> select num_rows from user_indexes where index_name='MYINDEX1';

NUM_ROWS
----------
5000

SQL> truncate table myemp;
Table truncated.

Insert again 15000 rows

SQL> insert /*+ append */ into myemp
2 SELECT LEVEL empl_id,
3 MOD (ROWNUM, 50000) dept_id,
4 TRUNC (DBMS_RANDOM.VALUE (1000, 50000), 2) salary,
5 DECODE (ROUND (DBMS_RANDOM.VALUE (1, 2)), 1, 'M', 2, 'F') gender,
6 TO_DATE ( ROUND (DBMS_RANDOM.VALUE (1, 28))
7 || '-'
8 || ROUND (DBMS_RANDOM.VALUE (1, 12))
9 || '-'
10 || ROUND (DBMS_RANDOM.VALUE (1950, 2013)),
11 'DD-MM-YYYY'
12 ) dob,
13 DBMS_RANDOM.STRING ('x', DBMS_RANDOM.VALUE (20, 40)) address
14 FROM DUAL
15 CONNECT BY LEVEL <= 15000; 15000 rows created. SQL> commit;
Commit complete.

Display notes column to display if auto stats is load

SQL> select column_name, num_distinct, histogram, notes from user_tab_col_statistics where table_name='MYEMP';

COLUMN_NAM NUM_DISTINCT HISTOGRAM NOTES
---------- ------------ --------------- --------------------
ADDRESS 15000 NONE STATS_ON_LOAD
DOB 10581 NONE STATS_ON_LOAD
GENDER 2 NONE STATS_ON_LOAD
SALARY 14969 NONE STATS_ON_LOAD
DEPT_ID 15000 NONE STATS_ON_LOAD
EMPL_ID 15000 NONE STATS_ON_LOAD

6 rows selected.

SQL> select num_rows,blocks from user_tables where table_name='MYEMP';

NUM_ROWS BLOCKS
---------- ----------
15000 138

SQL> select num_rows from user_indexes where index_name='MYINDEX1';

NUM_ROWS
----------
5000

Check if stats are gathered for the table myemp

SQL> select to_char(LAST_ANALYZED,'dd-mm-yyyy hh24:mi:ss'),table_name from DBA_TABLES where TABLE_NAME='MYEMP';

TO_CHAR(LAST_ANALYZ TABLE_NAME
------------------- ----------
03-11-2014 08:36:24 MYEMP

Display explain plan for the gather stats.

SQL> explain plan for insert /*+ APPEND */ into myemp
2 SELECT LEVEL empl_id,
3 MOD (ROWNUM, 50000) dept_id,
4 TRUNC (DBMS_RANDOM.VALUE (1000, 50000), 2) salary,
5 DECODE (ROUND (DBMS_RANDOM.VALUE (1, 2)), 1, 'M', 2, 'F') gender,
6 TO_DATE ( ROUND (DBMS_RANDOM.VALUE (1, 28))
7 || '-'
8 || ROUND (DBMS_RANDOM.VALUE (1, 12))
9 || '-'
10 || ROUND (DBMS_RANDOM.VALUE (1950, 2013)),
11 'DD-MM-YYYY'
12 ) dob,
13 DBMS_RANDOM.STRING ('x', DBMS_RANDOM.VALUE (20, 40)) address
14 FROM DUAL
15 CONNECT BY LEVEL <= 15000; Explained. SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Plan hash value: 1600317434

----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 1 | 2 (0)| 00:00:01 |
| 1 | LOAD AS SELECT | MYEMP | | | |
| 2 | OPTIMIZER STATISTICS GATHERING | | 1 | 2 (0)| 00:00:01 |
| 3 | COUNT | | | | |
|* 4 | CONNECT BY WITHOUT FILTERING | | | | |
| 5 | FAST DUAL | | 1 | 2 (0)| 00:00:01 |

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - filter(LEVEL<=15000) 17 rows selected. Insert again 10000 rows SQL> insert into myemp
2 SELECT LEVEL empl_id,
3 MOD (ROWNUM, 50000) dept_id,
4 TRUNC (DBMS_RANDOM.VALUE (1000, 50000), 2) salary,
5 DECODE (ROUND (DBMS_RANDOM.VALUE (1, 2)), 1, 'M', 2, 'F') gender,
6 TO_DATE ( ROUND (DBMS_RANDOM.VALUE (1, 28))
7 || '-'
8 || ROUND (DBMS_RANDOM.VALUE (1, 12))
9 || '-'
10 || ROUND (DBMS_RANDOM.VALUE (1950, 2013)),
11 'DD-MM-YYYY'
12 ) dob,
13 DBMS_RANDOM.STRING ('x', DBMS_RANDOM.VALUE (20, 40)) address
14 FROM DUAL
15 CONNECT BY LEVEL <= 10000; 10000 rows created. SQL> commit;
Commit complete.

Display notes column to display if auto stats is load

SQL> select column_name, num_distinct, histogram, notes from user_tab_col_statistics where table_name='MYEMP';

COLUMN_NAM NUM_DISTINCT HISTOGRAM NOTES
---------- ------------ --------------- --------------------
ADDRESS 15000 NONE STATS_ON_LOAD
DOB 10581 NONE STATS_ON_LOAD
GENDER 2 NONE STATS_ON_LOAD
SALARY 14969 NONE STATS_ON_LOAD
DEPT_ID 15000 NONE STATS_ON_LOAD
EMPL_ID 15000 NONE STATS_ON_LOAD

6 rows selected.

SQL> select num_rows,blocks from user_tables where table_name='MYEMP';

NUM_ROWS BLOCKS
---------- ----------
15000 138

SQL> select num_rows from user_indexes where index_name='MYINDEX1';

NUM_ROWS
----------
5000

As you can see from the above the index is updated only once. After that you always need to collects stats for the index, just like the behavior starting from Oracle 10g.

Leave A Comment