Monthly Archives: July 2016

Synthetic data – nice little SQL feature

This blog was inspired by a Twitter conversation between Jeffrey Kemp and Franck Pachot about little SQL tricks.
Countless times I’ve created synthetic data to model a case or just play around in general. For quick and dirty and small volumes I prefer “CONNECT BY”. When doing it properly, I use recursive CTEs.

But this puts a smile on my face every time I look at it.

select
    item.getNumberVal() num
from xmltable('1 to 10,20,30,40 to 50') item
;

       NUM
----------
	 1
	 2
	 3
	 4
	 5
	 6
	 7
	 8
	 9
	10
	20
	30
	40
	41
	42
	43
	44
	45
	46
	47
	48
	49
	50

23 rows selected.

Here’s what I usually start with as a base and then tweak the script to my current needs.

drop table t1 purge;
create table t1 (
    id number
  , num_small number (5, 0)
  , num_big number
  , vc2_short varchar2(200 byte)
  , vc2_long varchar2(4000 byte)
  , dt date
  , cl clob
)
  tablespace users
  lob (cl) store as securefile t1_cl_seg (
    tablespace users
    disable storage in row
    retention auto
    nocache logging
    chunk 32K)
;
 
insert /*+ append */ into t1 (id, num_small, num_big, vc2_short, vc2_long, dt)
with datagen as (
  select --+ materialize
      rownum as id
  from
      dual
  connect by level <= 1e4
)
select
    rownum as id
  , trunc(dbms_random.value(1, 100)) as num_small
  , round(dbms_random.value(10000, 1000000), 4) as num_big
  , dbms_random.string('L', trunc(dbms_random.value(10, 200))) as vc2_short
  , dbms_random.string('A', trunc(dbms_random.value(200, 4000))) as vc2_long
  , trunc(sysdate + dbms_random.value(0, 366)) as dt
from
    datagen dg1
  , datagen dg2
where
    rownum <= 1e4
;
 
commit;
 
<<populate_lob>>
declare
  num_rows number;
  cl clob;
  chunk_size integer := 1024; -- 1 KB
  max_cl_size integer := 8192; -- 8 KB
begin
  select count(*) cnt
  into num_rows
  from t1
  ;
  
  for row_num in 1 .. num_rows loop
    dbms_lob.createtemporary(cl, false, dbms_lob.call);
 
    for idx in 1 .. trunc(dbms_random.value(1, (max_cl_size / chunk_size))) loop
      dbms_lob.writeappend(cl, chunk_size, dbms_random.string('A', chunk_size));
      null;
    end loop;
 
    update t1
      set t1.cl = populate_lob.cl
    where id = row_num
    ;
    commit;
 
    dbms_lob.freetemporary(cl);
  end loop;
end populate_lob;
/
 
--alter table t1 modify lob (cl) (cache logging);

I hope above little trick amuses you as much as it did me 🙂
Imagine, it made me blog about it!