Rafu on db: modeling

Showing posts with label modeling. Show all posts

2011-05-06

many to many access path -view

Having a many to many relationship. It is the access path from one table to another. Lets call them cust, ord and prod tables.


cust 1-* ord *-1 prod

There is a need to query prod that has an access path to a certain cust. And we need a database view to implement this. So we want prod out and give a cust to the query as a parameter. The first impression to a developer is to do joins to ord table. That would lead to duplicates in the result. Something that is not desired. It is possible to implement a fine performing view without those duplicates. Here is an example of such.



create table cust (cust_id number constraint cust_pk primary key, cname varchar2(20) not null);

create table prod (prod_id number constraint prod_pk primary key, pname varchar2(20) not null);

create table ord ( ord_id number constraint ord_pk primary key
                 , cust_id constraint ord_cust_fk references cust not null
                 , prod_id constraint ord_prod_pk references prod not null
                 , amount number not null);

create index ord_cust_fk_idx on ord(cust_id,prod_id);

create or replace view custprod as
select cust.cust_id
     , cust.cname
     , prod.prod_id
     , prod.pname
  from cust,prod
 where exists (select 1 from ord where ord.prod_id=prod.prod_id and ord.cust_id=cust.cust_id)
;

select prod_id,pname from custprod where cust_id = :cust;

So there it is. Take only those rows to the from clause you want results from. Use exists on another parts of the access path.
Some data and a query plan.



insert all into cust(cust_id,cname) values (le,le||le)
           into prod(prod_id,pname) values (le,le||le)
 select level le from dual connect by level < 1000; 

begin 
 dbms_stats.gather_table_stats(user,'PROD');
 dbms_stats.gather_table_stats(user,'CUST');
end;
/

insert into ord select rownum,cust_id,prod_id,mod(cust_id,prod_id) from cust, prod where cust_id <= prod_id;

begin 
 dbms_stats.gather_table_stats(user,'ORD');
end;
/


select prod_id,pname from custprod where cust_id = :cust;

--------------------------------------------------------
| Id  | Operation                    | Name            |
--------------------------------------------------------
|   0 | SELECT STATEMENT             |                 |
|   1 |  NESTED LOOPS                |                 |
|   2 |   NESTED LOOPS               |                 |
|   3 |    NESTED LOOPS              |                 |
|   4 |     INDEX UNIQUE SCAN        | CUST_PK         |
|   5 |     SORT UNIQUE              |                 |
|   6 |      INDEX RANGE SCAN        | ORD_CUST_FK_IDX |
|   7 |    INDEX UNIQUE SCAN         | PROD_PK         |
|   8 |   TABLE ACCESS BY INDEX ROWID| PROD            |
--------------------------------------------------------

drop table ord;

drop table prod;

drop table cust;

drop view custprod;

2010-05-18

Types of columns

Just participated C.J.Date two day seminar in 10 hours today. Thanks to the ash from Iceland. Monday was delayed.

Just a small thing picked up from the massive amount of information. To avoid type conversions behind the scenas while doing natural joins avoid using different types for columns named similarily. Just checking that



select * from (
select count(distinct data_type
                      ||'|'||cast(data_length as varchar2(30))
                      ||'|'||cast(data_precision as varchar2(30))
                      ||'|'||cast(data_scale as varchar2(30))
            ) over (partition by column_name) as dis
     , table_name
     , column_name
     , data_type
     , data_length
     , data_precision
     , data_scale 
  from user_tab_columns
) 
where dis > 1
;

Well on the other hand do not use select *. As an analogy I would suggess not to use natural joins. Mr Date suggested to use views on to of base tables. That makes sense. And a good thing here user_tab_columns has also the columns in views included.

2010-04-27

Not overlapping (MV approach)

Presented earlier a not overlapping function based unique indexes approach. In this post I am using a materialized view and a unique constraining that. Several commits seen in here because MV approach makes constraints kind of deferrable. The MV is refreshed at commit phase. As the dirty hack function based unique indexes on the table itself are violated straight at the insert.



DROP TABLE Z CASCADE CONSTRAINTS PURGE;

DROP TABLE YEARS CASCADE CONSTRAINTS PURGE;

DROP MATERIALIZED VIEW Z_MV;

CREATE TABLE Z(Z NUMBER(16) NOT NULL
             , VALIDFROM NUMBER(4) NOT NULL
             , VALIDTILL NUMBER(4) NOT NULL
             , CONSTRAINT FRO2000 CHECK (2000 < VALIDFROM)
             , CONSTRAINT TIL2050 CHECK (VALIDTILL <= 2050)
             , CONSTRAINT FROTIL CHECK (VALIDFROM <= VALIDTILL)
             );


CREATE TABLE YEARS AS 
SELECT 2000+LEVEL TIM FROM DUAL CONNECT BY LEVEL < (2051-2000)
;

CREATE MATERIALIZED VIEW LOG ON Z WITH ROWID
;

CREATE MATERIALIZED VIEW LOG ON YEARS WITH ROWID
;

CREATE MATERIALIZED VIEW Z_MV REFRESH FAST ON COMMIT AS
SELECT Z.ROWID ZRID,T.ROWID TRID,Z.Z,T.TIM 
  FROM Z INNER JOIN YEARS T ON VALIDFROM < T.TIM AND T.TIM <= VALIDTILL 
;

With Oracle 11.1.0.7 and 11.2.0.1 getting
ORA-12054: cannot set the ON COMMIT refresh attribute for the materialized view

No worries. It is about inner join syntax not so widely supported with MVs.



CREATE MATERIALIZED VIEW Z_MV REFRESH FAST ON COMMIT AS
SELECT Z.ROWID ZRID,T.ROWID TRID,Z.Z,T.TIM 
  FROM Z,YEARS T 
 WHERE VALIDFROM < T.TIM AND T.TIM <= VALIDTILL 
;

ALTER TABLE Z_MV ADD CONSTRAINT Z_MV_U UNIQUE (Z,TIM);

CREATE INDEX Z_MV_ZRID_IDX ON Z_MV(ZRID);

ZRID indexed to give the optimizer at least a possibility to do small updates "fast". More about the issue may be read from Alberto Dell'Era's Oracle blog



INSERT INTO Z VALUES(1,2001,2011);

INSERT INTO Z VALUES(1,2011,2011);

COMMIT;

INSERT INTO Z VALUES(1,2010,2012);

COMMIT;

SQL ERROR: ORA-12008: ERROR IN MATERIALIZED VIEW REFRESH PATH
ORA-00001: UNIQUE CONSTRAINT (RAFU.Z_MV_U) VIOLATED


INSERT INTO Z VALUES(2,2049,2050);

COMMIT;

INSERT INTO Z VALUES(2,2049,2050);

COMMIT;

SQL ERROR: ORA-12008: ERROR IN MATERIALIZED VIEW REFRESH PATH
ORA-00001: UNIQUE CONSTRAINT (RAFU.Z_MV_U) VIOLATED



INSERT INTO Z VALUES(2,2010,2012);

COMMIT;

INSERT INTO Z VALUES(2,2001,2049);

COMMIT;

SQL ERROR: ORA-12008: ERROR IN MATERIALIZED VIEW REFRESH PATH
ORA-00001: UNIQUE CONSTRAINT (RAFU.Z_MV_U) VIOLATED


INSERT INTO Z VALUES(2,2014,2017);

COMMIT;

SELECT * FROM Z ORDER BY Z, VALIDFROM;

1 2001 2011
1 2011 2011
2 2010 2012
2 2014 2017
2 2049 2050

SELECT COUNT(*) FROM Z_MV;

16

2010-03-26

Pivoting EAV

If you are responsible for designing a data model and just consider to invent again and create this fine generic entity attribute value structure, maybe you should consider attending some teaching about the issue. For example some available soon by C.J. Date in and near Finland.

Well maybe you have a EAV model that you have to deal with. Example


SQL> create table eav as
  2  select 1 e, 'first' a, 'Timo' v from dual union all
  3  select 1 e, 'last' a, 'Raitalaakso' v  from dual union all
  4  select 1 e, 'nic' a, 'Rafu' v  from dual union all
  5  select 2 e, 'first' a, 'John' v  from dual union all
  6  select 2 e, 'last' a, 'Doe' v from dual
  7  ;

Table created.

SQL> select * from eav;

         E A     V
---------- ----- -----------
         1 first Timo
         1 last  Raitalaakso
         1 nic   Rafu
         2 first John
         2 last  Doe

You should not query it in a basic case using joins. Most possibly you have tens of joins to the same table.


SQL> select la.e, fi.v firs, la.v las
  2    from eav la, eav fi
  3   where la.e=fi.e
  4     and fi.a='first'
  5     and la.a='last'
  6  ;

         E FIRS        LAS
---------- ----------- -----------
         1 Timo        Raitalaakso
         2 John        Doe

It is a pivot you want to do.


SQL> select e, firs, las
  2    from eav
  3    pivot (max(v) for a in ('first' as firs, 'last' as las))
  4  ;

         E FIRS        LAS
---------- ----------- -----------
         1 Timo        Raitalaakso
         2 John        Doe

With the pivot you get the nullable columns also easier without filtering out the whole entity


SQL> select e, firs, las, ni
  2    from eav
  3    pivot (max(v) for a in ('first' as firs, 'last' as las, 'nic' as ni))
  4  ;

         E FIRS        LAS         NI
---------- ----------- ----------- -----------
         1 Timo        Raitalaakso Rafu
         2 John        Doe

Maybe you do not have 11g features available.


SQL> select e
  2       , max(case when a = 'first' then v end) firs
  3       , max(case when a = 'last' then v end) las
  4    from eav
  5   group by e
  6  ;

         E FIRS        LAS
---------- ----------- -----------
         1 Timo        Raitalaakso
         2 John        Doe

And the best thing to do with it might be.


SQL> drop table eav purge;

Table dropped.

2009-10-08

Not overlapping daily

Attending. First day gone. Looking forward for tommorow. Tanel should you call your seminar Basic Oracle Troubleshooting Seminar as you talk about how a certain c program is executing function by function? Recommended attendance to all who troubleshoot Oracle.

Back to basics. Should it not be basics of a rdbms system to launch a trigger on an event that it is instructed to execute. With Oracle the issue is not so obvious. A certain kind of a compound trigger will not fire when called through jdbc. Metalink -soon to retire- bug no 6785707.

Another reason for my previous post. But was i thinking too complex. There is actually no need to track bitwise the used years. The problem is not O 2^n problem, but O n. As n is the number of distinct possible values in the validity interval. How about changing possible values from yearly to daily.



drop table z cascade constraints purge;

create table z(z number(16) not null
              , validfrom date not null
              , validtill date not null
              , constraint tilld check (trunc(validfrom)=validfrom)
              , constraint fromd check (trunc(validtill)=validtill)
              , constraint fro2000 
                     check (to_date('20000101','yyyymmdd') < validfrom)
              , constraint til2050 
                     check (validtill <= to_date('20500101','yyyymmdd'))
              , constraint frotil check (validfrom <= validtill)
              );

begin 
 for i in (
  select 'create unique index z'||d||' 
              on z (case when validfrom <= to_date('''||d||''',''yyyymmdd'') 
                          and to_date('''||d||''',''yyyymmdd'') < validtill 
                         then z 
                         else null 
                     end)' createindex
   from
  (select level l
        , to_char(to_date('20000101','yyyymmdd')+level-1,'yyyymmdd') d 
     from dual 
  connect by level<=to_date('20500101','yyyymmdd')-to_date('20000101','yyyymmdd')
  )
  order by l
  )
 loop
  execute immediate i.createindex;
 end loop;
end;
/

begin
*
ERROR at line 1:
ORA-01792: maximum number of columns in a table or view is 1000
ORA-06512: at line 18

Not a production ready aproach. Howcome index creation states that no more columns for a table?


select count(*) 
  from user_indexes 
 where table_name = 'Z';

  COUNT(*)
----------
       997

select count(*) 
  from user_tab_columns 
 where table_name = 'Z';

  COUNT(*)
----------
         3

select count(*) 
  from user_tab_cols 
 where table_name = 'Z';

  COUNT(*)
----------
      1000

A function based index is creating a invisible virtual column to table. A table may have at least 1000 columns.

Well we have 997 first days covered. How a small inserting test is performing with those 997 indexes?



SQL> insert into z
  2  values(1,to_date('20010101','yyyymmdd'),to_date('20010102','yyyymmdd'));

1 row created.

Elapsed: 00:00:00.50
SQL>
SQL> insert into z
  2  values(1,to_date('20020101','yyyymmdd'),to_date('20020102','yyyymmdd'));

1 row created.

Elapsed: 00:00:00.40
SQL>
SQL> insert into z
  2  values(1,to_date('20020102','yyyymmdd'),to_date('20030101','yyyymmdd'));

1 row created.

Elapsed: 00:00:00.42
SQL>
SQL> insert into z
  2  values(1,to_date('20020201','yyyymmdd'),to_date('20020202','yyyymmdd'));
insert into z
*
ERROR at line 1:
ORA-00001: unique constraint (RAFU.Z20020201) violated


Elapsed: 00:00:01.50
SQL>
SQL> select * from z;

         Z VALIDFRO VALIDTIL
---------- -------- --------
         1 20010101 20010102
         1 20020101 20020102
         1 20020102 20030101

Elapsed: 00:00:01.01
SQL>
SQL> update z
  2     set validfrom=to_date('20010101','yyyymmdd')
  3   where z=1 and validfrom=to_date('20020101','yyyymmdd');
update z
*
ERROR at line 1:
ORA-00001: unique constraint (RAFU.Z20010101) violated


Elapsed: 00:00:01.15

2009-09-28

Not overlapping

Lets have a validity period in a table and a rule not to have overlapping periods. The correct solution is to have an another table having the concurrency lock and a combound trigger to handle the overlapping check. Or might there be an alternative way? A way that does not force us to rely that all code modifying the table remember to use the concurrency lock table before modifying the periods. Here it comes. A great abuse of the "bad hack" function based indexes. No need for the concurrency table or triggers.



drop table z cascade constraints purge;

create table z(z number(16) not null
             , validfrom number(4) not null
             , validtill number(4) not null
             , constraint fro2000 check (2000 < validfrom)
             , constraint til2050 check (validtill <= 2050)
             , constraint frotil check (validfrom <= validtill)
             );

create or replace function f(fro number,til number) 
return number 
deterministic
as
  n number(16);
begin
  n:=0;
  for i in fro..(til-1) loop
    n:= n + power(2,i-2001);
  end loop;
  return n;
end;
/


create unique index z1 on z (case when bitand(1,f(validfrom,validtill)) = 1 then z else null end); 

create unique index z2 on z (case when bitand(2,f(validfrom,validtill)) = 2 then z else null end); 

create unique index z3 on z (case when bitand(4,f(validfrom,validtill)) = 4 then z else null end); 

create unique index z4 on z (case when bitand(8,f(validfrom,validtill)) = 8 then z else null end); 

create unique index z5 on z (case when bitand(16,f(validfrom,validtill)) = 16 then z else null end); 

create unique index z6 on z (case when bitand(32,f(validfrom,validtill)) = 32 then z else null end); 

create unique index z7 on z (case when bitand(64,f(validfrom,validtill)) = 64 then z else null end); 

create unique index z8 on z (case when bitand(128,f(validfrom,validtill)) = 128 then z else null end); 

create unique index z9 on z (case when bitand(256,f(validfrom,validtill)) = 256 then z else null end); 

create unique index z10 on z (case when bitand(512,f(validfrom,validtill)) = 512 then z else null end); 

--You maybe got the idea. Left out indexes z11-z46 ... 

create unique index z47 on z (case when bitand(70368744177664,f(validfrom,validtill)) = 70368744177664 then z else null end); 

create unique index z48 on z (case when bitand(140737488355328,f(validfrom,validtill)) = 140737488355328 then z else null end); 

create unique index z49 on z (case when bitand(281474976710656,f(validfrom,validtill)) = 281474976710656 then z else null end); 

SQL> insert into z values(1,2001,2011);

1 row created.

SQL> insert into z values(1,2011,2011);

1 row created.

SQL> insert into z values(1,2010,2012);
insert into z values(1,2010,2012)
*
ERROR at line 1:
ORA-00001: unique constraint (RAFU.Z10) violated

SQL> insert into z values(2,2049,2050);

1 row created.

SQL> insert into z values(2,2049,2050);
insert into z values(2,2049,2050)
*
ERROR at line 1:
ORA-00001: unique constraint (RAFU.Z49) violated

SQL> insert into z values(2,2010,2012);

1 row created.

SQL> insert into z values(2,2001,2049);
insert into z values(2,2001,2049)
*
ERROR at line 1:
ORA-00001: unique constraint (RAFU.Z10) violated

SQL> insert into z values(2,2014,2017);

1 row created.

2009-09-11

Only one (is_current)

Kimball writes in The Date Warehouse Toolkit about slowly changing dimension type 6. Seen such structures with marked current values in a separate attribute. Like in wikipedia. If the dimension is storing transaction time as durations then the current indicator column might as well be a virtual column in 11g, evaluated to Y if the end day is set to end_date value. No need to update the current indicator column when populating new values for a dimension.

There should be at most one current indicator = Y row for each supplier. To satisfy this requirement it is possible to create a unique constraint to a virtual column. Actually I would replace the Y value with the supplier_key if the end of days is set in the row. And put the unique key to that column as show earlier. Adding such a column to an existing model might be problematic because the ETL or mainetenance software might populate new rows before updating the old ones. Should there exist atomic multi table merge? With 11g and virtual columns this is quite easy to overcome by setting the newly created unique constraint deferrable initially deferred.

But how about 10.2 without virtual columns. Lets dig into the only one, insert first and update then problem with an example without durations.


drop table di purge;

drop materialized view di_mv;

create table di(di_id number(4) primary key
               , sour number(4) not null
               , val number(4) not null
               , is_current number(1) not null);

Each sour should have only one current. This can be acheaved by creating a bad hack function based index.


create unique index d_only_one_u on di(case when is_current = 1 then sour end);

insert into di (di_id,sour,val,is_current) values (1,1,0,0);

insert into di (di_id,sour,val,is_current) values (2,1,1,1);

insert into di (di_id,sour,val,is_current) values (3,2,2,1);

Problematic ETL prosess might want to populate new values first and update the current status at the end of transaction.


insert into di (di_id,sour,val,is_current) values (4,2,3,1);
insert into di (di_id,sour,val,is_current) values (4,2,3,1)
*
ERROR at line 1:
ORA-00001: unique constraint (RAFU.D_ONLY_ONE_U) violated

Unique index is checked immediately and now row is allowed to be inserted.
Glue on commit refreshable materialized view to the model.


drop index d_only_one_u;

create materialized view log on di with rowid; 

create materialized view di_mv refresh on commit as
select sour,is_current, rowid drid
  from di
 where is_current = 1
;

alter table di_mv
add constraint sour_u unique (sour);

insert into di (di_id,sour,val,is_current) values (5,2,3,1);

commit;
commit
*
ERROR at line 1:
ORA-12008: error in materialized view refresh path
ORA-00001: unique constraint (RAFU.SOUR_U) violated

And it behaves like an deferrable initially deferred unique constraint. Uniquenes check is done at the end of a transaction.


insert into di (di_id,sour,val,is_current) values (6,2,3,1);

update di set is_current = 0 where di_id = 3;

commit;

Ok. The downside is that di_mv is using space. May be worth using, if your data population is beeing coded by humans.

2009-09-01

Only one (alternative 2)

10g friendly approach to store primary currencies. No virtual column used. Structure seems a lot like Tom Kyte suggested. Difference is that all country currencies are here populated in the same table and the materialized view replaced with a foreign key constraint.


create table country(country varchar2(2) constraint country_pk primary key
                   , primary_currency varchar(3) not null
                    )
;

create table currency(country references country
                    , currency varchar(3)
                    , constraint currency_pk primary key (currency,country)
                    )
;

alter table country
  add constraint primary_curr_in_country_curr
  foreign key(primary_currency,country)
  references currency(currency,country)
  deferrable initially deferred
;

Populating


insert into country (country,primary_currency) values ('US','USS');

insert into currency (country,currency) values ('US','USD');

insert into currency (country,currency) values ('US','USN');

insert into currency (country,currency) values ('US','USS');

commit;

insert into country (country,primary_currency) values ('FI','EUR');

insert into currency (country,currency) values ('FI','EUR');

insert into currency (country,currency) values ('FI','FIM');

commit;

Changing the default. Easier than before. Less code


update country set primary_currency='USD' where country='US';

commit;

select * from country;
country primary_currency
US      USD
FI      EUR

select * from currency;
country currency
FI      EUR
FI      FIM
US      USD
US      USN
US      USS

Testing the model:



update country set primary_currency='USD' where country='FI';

commit;

ORA-02091: transaction rolled back
ORA-02291: integrity constraint (RAFU.PRIMARY_CURR_IN_COUNTRY_CURR) violated -
parent key not found

insert into country (country, primary_currency) values ('SW','SEK');

commit;

ORA-02091: transaction rolled back
ORA-02291: integrity constraint (RAFU.PRIMARY_CURR_IN_COUNTRY_CURR) violated -
parent key not found

There is only one default currency per country, should be obvious.

Be sure to run unindex on your production model.

2009-08-27

Only one

Last September Tom Kyte wrote a yet another good article in Oracle magazine The Trouble with Triggers. The point about triggers is very good, but the example and the "Correct Answer" I do not like. He is suggesting to model currencies to two separate tables. In my opinion the data model is entirely wrong if the same thing is modeled in many places. "Less code equals fewer bugs. Look for ways to write less code." Here is my approach to the problem without a materialized view and materialized view logs.

The problem itself is that
-a country must have a default currency
-there is only one default currency per country


create table country(country varchar2(2) constraint country_pk primary key);

create table currency(country references country
                    , currency varchar(3)
                    , is_primary varchar2(1) not null 
                                             check (is_primary in ('Y','N'))
                    , primary_country 
                            as (case when is_primary = 'Y' then country end) 
                               virtual 
                        constraint only_one_primary_u unique 
                    , constraint currency_pk primary key (currency,country)
                    )
;

alter table country
  add constraint must_have_at_least_one_primary
  foreign key(country)
  references currency(primary_country)
  deferrable initially deferred;

And that is it. One may see that there is used the "bad hack" behind only_one_primary_u. The unique constraint generates a function-based normal index.
The good thing about this is that constraints are visible in constraint list, not implemented only in a index or a materialized view.


select * 
  from user_constraints 
 where constraint_name 
    in ('ONLY_ONE_PRIMARY_U','MUST_HAVE_AT_LEAST_ONE_PRIMARY');