Thursday, November 22, 2012

To parse or not to parse

To parse or not to parse

When your application sends SQL to the database it has to be parsed. This happens automatically but you can influence the way the Oracle® database parses statements with parameters and bind variables. But when to use bind variables and when to avoid them?

When to avoid is detailed below it is a special case. First this article explains the basic kind of parsings in Oracle database 10g. Then it gives an practical example with a test table and SQL trace to show how the database system works and where the differences are.

When you are not using bind variables and your database uses default parameters your statement will in all probability be hard parsed. That means in a nutshell:
  • syntactical analyse the sql text - is it well formed with the right keywords?
  • semantical analyse the sql text - do the tables/columns exist, are they ambigious ...
  • environmental check which effective parameter has the session.
  • calculation a hash value of the statement and looking in the shared pool for a matching statement.
  • and if not found optimization to find an execution plan.
  • finally the execution of the SQL Statement
Whether your statement is hardparsed or softparsed depends on if there is a matching SQL with the same hash code in the shared pool and it has the same parameters (enviroment). With a matching statement the most expensive step, the optimization or optimal plan generation can be skipped and the exisiting plan can be reused. Whether there is a matching statement or not depends on bind variables. See the two statements:



select * from emp where empno = 1;
select * from emp where empno = 2;

if you send both of them to the database they are both hardparsed because statement one hash code is not equal to the hash code of statement two. In case of bind variables the database compares this statement:

select * from emp where empno = :1;


With whatever number you bind the statement the hashcode is always the same. So depending on the recurrences you can save a lot of hardparses.

The third kind of parsing is the one with the highest performance simply no parse. You can realise it with a programming technique where you prepare a statement and execute it n times without closing the statment and just rebinding the variables.

In the example a simple insert statement on a test table is 10.000 times repeated. Example Code is here:

First we create a test table:
SQL> create table test_tab(name varchar(200), id number(10));

Table created.

Now we have three different functions inserting data into test_tab see the source code in Java(TM) technology. While executing the functions I trace all statement with sql trace. Here are the results of the file:

Hardparse:
insert into test_tab (name, id) 
values
('name0',0)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0         78          0           0
Execute      1      0.00       0.00          0          1          5           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0         79          5           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 61  
********************************************************************************

insert into test_tab (name, id) 
values
('name1',1)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          1          5           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0          1          5           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 61  
********************************************************************************

insert into test_tab (name, id) 
values
('name2',2)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          1          5           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0          1          5           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 61  
********************************************************************************

insert into test_tab (name, id) 
values
('name3',3)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          1          5           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0          1          5           1

...
You have for every statements a different hashcode because the strings are not equal. That means one Hardparse and one execution for every statement.

Softparse:
insert into test_tab (name, id) 
values
(:1,:2)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse    10000      0.14       0.22          0          0          0           0
Execute  10000      1.15       1.20          3      10070      50193       10000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    20000      1.30       1.42          3      10070      50193       10000

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 61  
Due to the bind variables :1 and :2 the Strings are equal we have all Strings summarized 10.000 softparses and 10.000 executions.

Noparse
insert into test_tab (name, id) 
values
(:1,:2)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  10000      1.73       2.03          0          1      27326       10000
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    10001      1.73       2.03          0          1      27326       10000

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 61  
This is the way to execute statements with the highest performance. You have one parse and 10.000 executions. In fact you dont need time for parsing.

This Article applies to:Oracle Database (any Version)
Initial published:22.01.2008
Last change:10.04.2008