Thursday, January 01, 2009

Scaling dynamic SQL

The story

I was working for one of the biggest Russian telcos back then, and the marketing guys came up with their next idea. One of the most frequent things your customers do is check their balances. The idea was to attach an advertisement to each balance response. However, you usually don't want to attach some static message that gets shown to everyone, because your customers simply won't care about it. Think of Google, which became so popular because it prioritizes information by relevance. Pushing something useless at your customers not only annoys them but wastes your bandwidth as well.

What you have to do is targeted adverts. What that means is that you can substantially increase the value by figuring out what your customers would like to see, based on the data about them that you already have in your billing system. And you know literally hundreds of things about any single customer, starting from age, marital status, birthday and active services, and ending with an approximate location based on triangulation data. Think of Google's sponsored links.

The system

Upon receiving a balance request, the system had to find out various things about the requester and then figure out what to show based on the rules from the marketing department. The rules had to offer virtually unlimited flexibility, and the system itself had to be flexible enough to introduce and remove targeting conditions on the fly.

The implementation

What we had was a simple table with PL/SQL predicates, like this:
SQL> create table p
2 (
3 p_id number primary key,
4 p varchar2(4000)
5 );

Table created

SQL> insert into p values (1, 'to_number(:age) between 16 and 18');

1 row inserted
SQL> insert into p values (2, 'to_date(:birthday, ''yyyymmdd'')=trunc(sysdate) or :balance > 1000');

1 row inserted
SQL> insert into p values (3, 'to_number(:balance) between 100 and 200 and :age > 18');

1 row inserted

SQL> commit;

Commit complete
Of course, in reality it was a bit more complex than that, allowing for prioritizing, schedules, etc. As you have probably guessed already, upon receiving a balance request we were binding the data about the customer into the above predicates to figure out which one evaluates to true, which in turn led to the relevant advert. Given that you may be getting around 5000 balance requests each second, and that you have to multiply this by the number of predicates to check, you are already talking about tens of thousands of evaluations per second. In other words, you have to make the entire thing pretty darn efficient.

Soft parsing

Soft parsing is something which many deem to be inevitable as soon as you have to deal with stuff like the above -- storing dynamically generated PL/SQL predicates in a table and executing them at runtime. And this is something we had to avoid because, even with features like session_cached_cursors, a parse is still a parse -- a scalability inhibitor and a CPU waster. Think of not using KIWI.
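
By the way, you don't need a trace to see how much parsing your session is really doing -- the standard session counters are enough. Here is a quick sketch (the statistic names below are the stock v$statname entries); run it before and after an evaluation loop and compare the deltas:

select sn.name, ms.value
  from v$statname sn,
       v$mystat   ms
 where ms.statistic# = sn.statistic#
   and sn.name in ('parse count (total)', 'parse count (hard)');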

What to do?

I'll show you a little trick you can use to break the dreaded marriage of dynamic SQL with parsing. Take a look at the following package:
create or replace package eval is

  -- cursor cache: parsed cursor numbers indexed by the statement text
  type t_cached_cursors is table of number index by p.p%type;
  g_cached_cursors t_cached_cursors;

  -- evaluates a predicate, reusing a cached cursor whenever possible
  function evaluate(
    p in p.p%type,
    p_n in dbms_sql.varchar2_table,
    p_v in dbms_sql.varchar2_table
  ) return boolean;

  -- same as evaluate but parses and closes the cursor on every call
  function evaluate_nc(
    p in p.p%type,
    p_n in dbms_sql.varchar2_table,
    p_v in dbms_sql.varchar2_table
  ) return boolean;

end eval;
/

create or replace package body eval is

  function evaluate(
    p in p.p%type,
    p_n in dbms_sql.varchar2_table,
    p_v in dbms_sql.varchar2_table
  ) return boolean is
    l_cursor number;
    l_res number;
  begin
    -- reuse an already parsed cursor for this statement text if we have one,
    -- otherwise parse it once and remember it for subsequent calls
    begin
      l_cursor:=g_cached_cursors(p);
    exception when no_data_found then
      l_cursor:=dbms_sql.open_cursor;
      dbms_sql.parse(l_cursor, p, dbms_sql.native);
      g_cached_cursors(p):=l_cursor;
    end;

    dbms_sql.bind_variable(l_cursor, 'l_res', l_res);

    -- bind only the variables the predicate actually references
    for i in 1 .. p_n.count
    loop
      if (instr(p, ':'||p_n(i)) > 0)
      then
        dbms_sql.bind_variable(l_cursor, p_n(i), p_v(i));
      end if;
    end loop;

    l_res:=dbms_sql.execute(l_cursor);
    dbms_sql.variable_value(l_cursor, 'l_res', l_res);

    return (l_res=1);
  end;

  function evaluate_nc(
    p in p.p%type,
    p_n in dbms_sql.varchar2_table,
    p_v in dbms_sql.varchar2_table
  ) return boolean is
    l_cursor number;
    l_res number;
  begin
    -- no caching: open, parse, execute and close on every single call
    l_cursor:=dbms_sql.open_cursor;
    dbms_sql.parse(l_cursor, p, dbms_sql.native);
    dbms_sql.bind_variable(l_cursor, 'l_res', l_res);

    for i in 1 .. p_n.count
    loop
      if (instr(p, ':'||p_n(i)) > 0)
      then
        dbms_sql.bind_variable(l_cursor, p_n(i), p_v(i));
      end if;
    end loop;

    l_res:=dbms_sql.execute(l_cursor);
    dbms_sql.variable_value(l_cursor, 'l_res', l_res);
    dbms_sql.close_cursor(l_cursor);

    return (l_res=1);
  end;

end eval;
/
The package has two functions -- evaluate and evaluate_nc; the first one uses a simple caching trick. The idea behind the evaluate function really is that simple -- upon opening and parsing a cursor, place it into an in-memory table indexed by the cursor text for further reuse instead of closing it. Each subsequent call peeks at that in-memory table to see whether there is a cursor it can reuse instead of going through the whole parsing exercise.

The usage is simple as well:
SQL> declare
2 l_n dbms_sql.varchar2_table;
3 l_v dbms_sql.varchar2_table;
4 l_res boolean;
5 begin
6 l_n(1):='age';
7 l_n(2):='birthday';
8 l_n(3):='balance';
9 l_v(1):='16';
10 l_v(2):='20090101';
11 l_v(3):='1500';
12
13 for cur in (select p_id, 'declare l_res number; begin :l_res:=case when ('||p||') then 1 else 0 end; end;' p from p)
14 loop
15 l_res:=eval.evaluate(cur.p, l_n, l_v);
16 dbms_output.put_line('eval p'||to_char(cur.p_id)||': '||case when l_res then 'true' else 'false' end);
17 end loop;
18 end;
19 /
eval p1: true
eval p2: true
eval p3: false
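One caveat worth keeping in mind is that g_cached_cursors only ever grows -- nothing in the package closes the cached cursors. If you ever need to flush the cache explicitly (we never did, see the comments below), a small helper along these lines would do; flush_eval_cache is just a hypothetical sketch, not part of the original solution:

create or replace procedure flush_eval_cache is
  -- hypothetical helper (not part of the original solution): closes every
  -- cursor held in the eval package cache and empties the collection
  l_key p.p%type;
  l_cursor number;
begin
  l_key := eval.g_cached_cursors.first;
  while l_key is not null
  loop
    l_cursor := eval.g_cached_cursors(l_key);
    dbms_sql.close_cursor(l_cursor);
    l_key := eval.g_cached_cursors.next(l_key);
  end loop;
  eval.g_cached_cursors.delete;
end;
/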
Let's do some tests now. Here are two test procedures:
create or replace procedure test_cached(
  p_i in number
) is
  l_p dbms_sql.varchar2_table;
  l_n dbms_sql.varchar2_table;
  l_v dbms_sql.varchar2_table;
  l_res boolean;
begin
  l_n(1):='age';
  l_n(2):='birthday';
  l_n(3):='balance';

  l_v(1):='100';
  l_v(2):='20090101';
  l_v(3):='1000';

  select 'declare /* cached */ l_res number; begin :l_res:=case when ('||p||') then 1 else 0 end; end;'
    bulk collect into l_p
    from p;

  dbms_monitor.session_trace_enable(waits => true, binds => false);
  for i in 1 .. p_i
  loop
    for j in 1 .. l_p.count
    loop
      l_res:=eval.evaluate(l_p(j), l_n, l_v);
    end loop;
  end loop;
  dbms_monitor.session_trace_disable;
end;
/

create or replace procedure test_not_cached(
  p_i in number
) is
  l_p dbms_sql.varchar2_table;
  l_n dbms_sql.varchar2_table;
  l_v dbms_sql.varchar2_table;
  l_res boolean;
begin
  l_n(1):='age';
  l_n(2):='birthday';
  l_n(3):='balance';

  l_v(1):='100';
  l_v(2):='20090101';
  l_v(3):='1000';

  select 'declare /* not_cached */ l_res number; begin :l_res:=case when ('||p||') then 1 else 0 end; end;'
    bulk collect into l_p
    from p;

  dbms_monitor.session_trace_enable(waits => true, binds => false);
  for i in 1 .. p_i
  loop
    for j in 1 .. l_p.count
    loop
      l_res:=eval.evaluate_nc(l_p(j), l_n, l_v);
    end loop;
  end loop;
  dbms_monitor.session_trace_disable;
end;
/
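A simple way to run the two procedures from several sessions at once is to submit background jobs; the snippet below is just a sketch (the original harness is not shown, and DBMS_SCHEDULER would do equally well):

declare
  l_job binary_integer;
begin
  -- submit four background jobs, each doing 100000 iterations of the cached
  -- test; repeat with test_not_cached for the second run
  for i in 1 .. 4
  loop
    dbms_job.submit(l_job, 'test_cached(100000);');
  end loop;
  commit; -- the jobs only start once the transaction is committed
end;
/
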
I tested each of these using four parallel jobs with 100000 iterations (the p_i parameter) each. Here is what I got (using one of the predicates as an example):
declare /* cached */ l_res number; begin :l_res:=case when (to_number(:age) 
between 16 and 18) then 1 else 0 end; end;


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 100000 1.40 1.29 0 0 0 100000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 100001 1.40 1.29 0 0 0 100000

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 40 (recursive depth: 2)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
cursor: pin S wait on X 35 0.01 0.37
cursor: pin S 91 0.00 0.00

declare /* not_cached */ l_res number; begin :l_res:=case when
(to_number(:age) between 16 and 18) then 1 else 0 end; end;


call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 100000 1.00 1.21 0 0 0 0
Execute 100000 2.73 2.74 0 0 0 100000
Fetch 0 0.00 0.00 0 0 0 0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 200000 3.73 3.95 0 0 0 100000

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 40 (recursive depth: 2)

Elapsed times include waiting on following events:
Event waited on Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
library cache: mutex X 116 0.00 0.00
cursor: mutex S 24 0.00 0.00
cursor: pin S 83 0.00 0.00
cursor: pin S wait on X 30 0.01 0.28
That is roughly a threefold improvement. Note that in the first tkprof report we got only one parse, while in the second the number of parses equals the number of executions, exactly as expected.

11 comments:

  1. I recognize the background in your new picture on the blog. :)

    Btw, in the no_cache case the execute time itself doubled, increasing the total by 100%, while parsing itself added another 100%. Right?

  2. Alex,

    that's going to be my "new" office for some time, yes.

    If we take the figures relative to the exec time of the first example, then yes.

    As you can spot, 11g seems to have gone the "mutexed everywhere" way even further. When I implemented this on 9.2.0.6 running on HP-UX the difference was around an order of magnitude, but that system was latching quite a bit already, so the returns on avoiding serialization mechanisms were much greater as well.

  3. Alex, did you try dbms_rule for this task?

  4. Stas, I guess you just figured out one of my blog posts :-)

    http://afatkulin.blogspot.com/2009/01/using-dbmsruleadm-to-evaluate-dynamic.html

  5. Hi,
    I have exactly the same situation. I have an API function that takes three arguments: an object type, a varchar2 and an integer. The DBMS_SQL API does not support object types for binding (9.2), so I am using NDS instead.

    Do you have any idea how I could make Oracle avoid soft parsing these dynamic function calls?

    Regards
    Mennan

  6. Hi,
    Did you try precompiling all the rules into a package and calling the required procedures?

    I mean something like
    case rule_id
      when 1 then
        return pkg_my_rules.rule_1(...);
      when 2 then
        return pkg_my_rules.rule_2(...);
      ...
    end case;
    That works much faster than dynamic SQL (up to 10 times faster).

  7. Vladimir,

    but that will stop this from being dynamic, no?

    Each time you create or modify the rule you'll have to recompile the package as well as the evaluation routine.

  8. Alex Fatkulin,

    You are right. That would not be very dynamic. I guess the most common case is:
    * rules change often during the development
    * rules are almost static in production

    I understand there could be very dynamic queries (like information retrieval queries based on specific criteria, or even SQL coming from the user). That kind of dynamism is very hard, if not impossible, to pre-compile. On the other hand, "rules" are usually quasi-static and they benefit a lot from being pre-compiled.


    I do not see too much trouble in recompiling the package as the rules change.

    In fact, you could even have two copies of the dispatch package so the users could use the first one while you recompile the second. Once the second one is compiled and verified, you could notify the application to switch to the new dispatch implementation. That would require almost zero downtime (I mean no threads would wait for the recompilation to complete).

  9. "* rules are almost static in production"

    Unfortunately in my case this was false. Remember, we're talking about marketing guys. They do this for a living.

    Having the extra code to compile and maintain the package (let alone having a second package for switching purposes) sounds like more stuff and complexity to maintain (at least to me).

  10. How do you clear the cursor cache then?

    I bet the change rate does not exceed one per hour. In fact, I do not believe billing could really change that fast. A more realistic case would be once or twice per month.

    Provided there were "around 5000 balance requests each second", one recompilation each hour would do the magic: it would eliminate the dynamic SQL completely.

  11. In our case the cache was clearing "by itself", as the application was going through a connection pool with time-based expiration.
