12c New Features – Partial Indexes


I would share a series of posts to cover the 12c New Features in Details. There are several New features included in this release, I would try to cover most of them or atleast the ones that I find interesting and useful.

This article would cover the feature Partial Indexes.

This feature is one of the best gifts with 12c for sites using the Partitioning Option. Partial Index gives us the ability to create both local and global indexes on only a subset of partitions within a partitioned table. Prior to Oracle 12c , you could not create indexes on selective partitions; Indexes always meant on all of the data.  However with Oracle 12c, you can create partial indexes that  contains index data from selective partitions only. Thus CBO would now be aware of the indexing characteristics of individual partitions and can access partitions in based on the query requirement.

To stress more on the point, In some of the sites using partitioning the indexes occupied even more space than the partition in concern as index was made on the whole table, this pretty much violated the point of having indexes, thus these sites can’t afford to have indexes when table is partitioned. But then if the table size is huge and even the partition size is large enough then CBO had a lot of load as there was no other means but to scan the whole partition or in other case to maintain the index alon with scanning the same.

So now we’ll go through a little demo to see how can we use partial index and its benefits:-

Demo:-

Let’s create a table for our demo:-

create table test
(id number, status varchar2(6), name varchar2(30))
INDEXING OFF
partition by range (id) (
partition pf1 values less than (1000001),
partition pf2 values less than (2000001) INDEXING OFF,
partition pf3 values less than (maxvalue) INDEXING ON );

Table created.

The INDEXING clause determines whether or not the partition is to be indexed. It can be set at the table level and so set the default behaviour for the table or at the individual partition/subpartition level.

In the above demo, I’ve set INDEXING OFF at the table level and so indexing by default is not be enabled for the table partitions. Therefore the PF1 partition is not indexed by default. The PF2 partition is explicitly set to also not be indexed but the PF3 index is explicitly set (INDEXING ON) to enable indexing and so override the table level default.

SQL> insert into test select rownum, 'CLOSED', 'Nitish Anand'from dual connect by level<= 3000000;
3000000rows created.
SQL> commit;
Commit complete.
SQL> update test set status = 'OPEN'where id > 2000000and mod(id,10000)=0;
100rows updated.
SQL> commit;
Commit complete.

So we have changed STATUS of a few rows to OPEN. Now lets create an index on the STATUS column.

SQL> create index test_status_i on pink_floyd(status);
Index created.
SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'TEST', estimate_percent=>null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1 FOR COLUMNS STATUS SIZE 5');
PL/SQL procedure successfully completed.
SQL> select index_name, num_rows, leaf_blocks, indexing from dba_indexes where index_name = 'TEST_STATUS_I';
INDEX_NAME NUM_ROWS LEAF_BLOCKS INDEXIN
-------------------- ---------- ----------- -------
TEST_STATUS_I 30000009203FULL

By default, an index will include all partitions in a table, regardless of the INDEXING table clause setting. So this index covers all 3M rows in the table and currently has 9203 leaf blocks. The new INDEXING column in DBA_INDEXES shows us that this index is a FULL (non-Partial) index.

We can of course get the data of interest (STATUS = ‘OPEN’) via this index now:

SQL> select * from test where status = 'OPEN';
100rows selected.
Execution Plan
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-------------------------------------------------------------------
| 0| SELECT STATEMENT | | 964| 24100| 4(0)| 00:00:01| | |
| 1| TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| TEST | 964| 24100| 4(0)| 00:00:01| ROWID | ROWID |
|* 2| INDEX RANGE SCAN | TEST_STATUS_I | 100| | 3(0)| 00:00:01| | |
-------------------------------------------------------------------
However, we can potentially also run a query based on just the last partition as all the ‘OPEN’ statuses of interest only reside in this last partition:
select * from TEST where status = 'OPEN'and id > 2000001;
100rows selected.
Execution Plan
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-------------------------------------------------------------------
| 0| SELECT STATEMENT | | 99| 2475| 4(0)| 00:00:01| | |
|* 1| TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| TEST | 99| 2475| 4(0)| 00:00:01| 3| 3|
|* 2| INDEX RANGE SCAN | TEST_STATUS_I | 100| | 3(0)| 00:00:01| | |
-------------------------------------------------------------------

But still the index includes data from all the table partitions, whereas we were interested to retrieve the less common ‘OPEN’ status that resides in only the last table partition. With 12c, there is now the capability to only index those partitions that are of interest to us, which with proper design can also be implemented such that only those column values of interest are included within an index.

Now lets drop the index that we created earlier and now create a PARTIAL INDEX:-

SQL> drop index test_status_i;
Index dropped.
SQL> create index test_status_i on test(status) indexing partial;
Index created.
SQL> select index_name, num_rows, leaf_blocks, indexing from dba_indexes where index_name = 'TEST_STATUS_I';
INDEX_NAME NUM_ROWS LEAF_BLOCKS INDEXIN
-------------------- ---------- ----------- -------
TEST_STATUS_I 10000003068 PARTIAL

Notice how the index, which is a Global, Non-Partitioned Index, now only has 1M entries (not all 3M as previously) and with 3068 leaf blocks is only 1/3 of what it was previously. The INDEXING column now denotes this as a “Partial” index.

Running the query again now explicitly references the last “active” table partition:-

SQL> select * from pink_floyd where status = 'OPEN'and id >= 2000001;
100rows selected.
Execution Plan
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-------------------------------------------------------------------
| 0| SELECT STATEMENT | | 100| 2500| 4(0)| 00:00:01| | |
|* 1| TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| TEST | 100| 2500| 4(0)| 00:00:01| 3| 3|
|* 2| INDEX RANGE SCAN | TEST_STATUS_I | 33| | 3(0)| 00:00:01| | |
-------------------------------------------------------------------

We see that the index is used as it was previously. By stating with the ID > 2000001 predicate we’re only interested in data that can only reside in the last table partition, the partition with INDEXING ON, the CBO knows the index can be used to retrieve all the rows of interest.

As you can now see this is the major difference now with partial indexes CBO has the visibility of a partial index on a specific partition which gives CBO more power to devise a effective execution plan for the query.

I hope the demo cleared this New 12c feature and helps in building the basics to use this feature in your environments. BUT still there is a lot to understand with Partial Indexing, there needs to smart changes in your application code to maximize the benefits from this feature.

In the above demo would the CBO have considered to use the PARTIAL INDEX if the clause to refer the last active table partition (i.e. where ID >= 2000001)  wasn’t mentioned in the where clause ? What would have been more cost effective doing a Full Partition Scan (Partition Pruning) or PArtial Index Scan ? please comment below to know these answers (but after the homework is done) 🙂