lunes, 13 de febrero de 2017

How to extract the hints we need from a SQL Query.


Let's say you have a SQL query running in one database and you need to run it in another similar one with the same execution plan perhaps only
once just to see its behaviour. Let's say you think "Hey, I could do it if I had the appropiate hints". But you do not know for sure
which hints exactly use. Here is a way to retrieve them!:



For this test, we'll keep things simple. I have tried it in a real environment with a complex query and it worked, anyways,
remember that hints are not orders to the optimizer, but suggestions or petitions (that is, as telling him, "please ,
if you are considering to use this path instead of the actual one, use it"). The optimizer could choose not to follow your sugestions.

We will create two tables in two different databases. In one of them , will create an index and gather statistics.
In the other one, we won't. Then, we will run a query and extract the execution plan.

In both of the databases we run:

create table prueba1(
campo number);

create table prueba2(
campo number);

insert into prueba1 (select 1 from dual connect by level <= 100);

insert into prueba2 (select 1 from dual connect by level <= 100);

insert into prueba1 values(2);

insert into prueba2 values(2);

commit;

So, now we  have two tables with skewed data.

We create an index in PRE1:

create index prueba1_test on prueba1(campo);

Next, we gather statistics in PRE1

PRE1 SQL >exec dbms_stats.gather_table_stats(ownname => 'TEST',tabname =>'PRUEBA1',cascade => true, estimate_percent => 100, method_opt=>'for all columns size 2', degree => 1);
PRE1 SQL >exec dbms_stats.gather_table_stats(ownname => 'TEST',tabname =>'PRUEBA2',cascade => true, estimate_percent => 100, method_opt=>'for all columns size 2', degree => 1);

We do not gather in PRE2


In Database PRE1, we run this query and get the execution plan:

select a.campo, b.campo from prueba1 a, prueba2 b where a.campo =b.campo and b.campo = 2;


PRE1 SQL >select a.campo, b.campo from prueba1 a, prueba2 b where a.campo = b.campo and b.campo = 2;

     CAMPO      CAMPO
---------- ----------
         2          2

PRE1 SQL >select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------                                       --------------------------------------------------
SQL_ID  0tqcandybp0jp, child number 0
-------------------------------------
select a.campo, b.campo from prueba1 a, prueba2 b where a.campo =
b.campo and b.campo = 2

Plan hash value: 1924520590

-------------------------------------------------------------------------------------
| Id  | Operation            | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |              |       |       |     2 (100)|          |

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------                                       --------------------------------------------------
|   1 |  MERGE JOIN CARTESIAN|              |     1 |     6 |     2   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL  | PRUEBA2      |     1 |     3 |     2   (0)| 00:00:01 |
|   3 |   BUFFER SORT        |              |     1 |     3 |     0   (0)|          |
|*  4 |    INDEX RANGE SCAN  | PRUEBA1_TEST |     1 |     3 |     0   (0)|          |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("B"."CAMPO"=2)
   4 - access("A"."CAMPO"=2)

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------                                       --------------------------------------------------



OK. Now, we go to PRE2. The setup is the same, with the difference that we did not create the index PRUEBA1_TEST and we did not gather statistics.


PRE2 SQL> select a.campo, b.campo
  2  from
  3  prueba1 a, prueba2 b
  4  where
  5  a.campo = b.campo and
  6  b.campo = 2;

     CAMPO      CAMPO
---------- ----------
         2          2

PRE2 SQL> select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  a0gmb2r4mq3pp, child number 0
-------------------------------------
select a.campo, b.campo from prueba1 a, prueba2 b where a.campo =
b.campo and b.campo = 2

Plan hash value: 2911335713

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |       |       |     4 (100)|          |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|*  1 |  HASH JOIN         |         |     1 |    26 |     4   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| PRUEBA1 |     1 |    13 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| PRUEBA2 |     1 |    13 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."CAMPO"="B"."CAMPO")
   2 - filter("A"."CAMPO"=2)
   3 - filter("B"."CAMPO"=2)

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


27 rows selected.




Now, let's say that we want to run the query in PRE1, but using the execution plan from PRE2. How do we get the hints we need?  (To know it can be very difficult in a more complex query)

So, we run in PRE2 next query:

PRE2 SQL> select * from table(dbms_xplan.display_cursor('a0gmb2r4mq3pp',null,'OUTLINE'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  a0gmb2r4mq3pp, child number 0
-------------------------------------
select a.campo, b.campo from prueba1 a, prueba2 b where a.campo =
b.campo and b.campo = 2

Plan hash value: 2911335713

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |       |       |     4 (100)|          |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|*  1 |  HASH JOIN         |         |     1 |    26 |     4   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| PRUEBA1 |     1 |    13 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| PRUEBA2 |     1 |    13 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
      OPTIMIZER_FEATURES_ENABLE('12.1.0.2')
      DB_VERSION('12.1.0.2')
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "A"@"SEL$1")
      FULL(@"SEL$1" "B"@"SEL$1")
      LEADING(@"SEL$1" "A"@"SEL$1" "B"@"SEL$1")
      USE_HASH(@"SEL$1" "B"@"SEL$1")
      END_OUTLINE_DATA
  */



PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."CAMPO"="B"."CAMPO")
   2 - filter("A"."CAMPO"=2)
   3 - filter("B"."CAMPO"=2)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)


44 rows selected.



Now, from the outline data section (highlighted in red) , we take the most relevant fields (I took the ones that matches a hint syntax. We can run the same query in PRE1 to check
if there are environment changes (in this case, ALL_ROWS, OPTIMIZER_FEATURES_ENABLE, etc). We can obviate them if they match.

And now, we run the query with those hints in PRE1:

PRE1 SQL>select /*+ FULL(@"SEL$1" "A"@"SEL$1") FULL(@"SEL$1" "B"@"SEL$1") LEADING(@"SEL$1" "A"@"SEL$1" "B"@"SEL$1") USE_HASH(@"SEL$1" "B"@"SEL$1") */  a.campo, b.campo from prueba1 a, prueba2 b where a.campo =b.campo and b.campo = 2;

     CAMPO      CAMPO
---------- ----------
         2          2

PRE1 SQL >select * from table(dbms_xplan.display_cursor());

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------                                       -----------
SQL_ID  4acrmj9afqga3, child number 0
-------------------------------------
select /*+ FULL(@"SEL$1" "A"@"SEL$1") FULL(@"SEL$1" "B"@"SEL$1")
LEADING(@"SEL$1" "A"@"SEL$1" "B"@"SEL$1") USE_HASH(@"SEL$1"
"B"@"SEL$1") */  a.campo, b.campo from prueba1 a, prueba2 b where
a.campo =b.campo and b.campo = 2

Plan hash value: 2911335713

------------------------------------------------------------------------------
| Id  | Operation          | Name    | Rows  | Bytes | Cost (%CPU)| Time     |

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------                                       -----------
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |         |       |       |     4 (100)|          |
|*  1 |  HASH JOIN         |         |     1 |     6 |     4   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| PRUEBA1 |     1 |     3 |     2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| PRUEBA2 |     1 |     3 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."CAMPO"="B"."CAMPO")

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------                                       -----------
   2 - filter("A"."CAMPO"=2)
   3 - filter("B"."CAMPO"=2)


25 rows selected.

PRE1 SQL >

Here we have it!.

As said, I tried with a more complex query (I cannot put it here for confidential reasons), with a couple of nested views and worked.

Remember, hints are suggestions or petitions to the optimizer. Sometimes the optimizer will not pay attention.
If you do not get plan you want, try to copy the statistics from one database to the other one , ( you could use perhaps the pending statistics feature)
and run again.
Also, take care that all the indexes and objects exist in both of the databases, beware of the oracle created indexes (related to constraints, etc)
as the ones whose name start with SYS_CXXX. They usually differ from one database to another.

It happened to me that in one execution, the plan did not match, but at the next one, did. ( i did not go deeper in this, so do not ask! :-p ). Try a couple of times.


Hope you find it useful!!

No hay comentarios: