{"id":210,"date":"2017-05-13T17:03:33","date_gmt":"2017-05-13T17:03:33","guid":{"rendered":"https:\/\/www.spotonoracle.com\/?p=210"},"modified":"2017-05-13T17:03:33","modified_gmt":"2017-05-13T17:03:33","slug":"scalable-primary-keys-guid-follow-up","status":"publish","type":"post","link":"https:\/\/www.spotonoracle.com\/?p=210","title":{"rendered":"Scalable primary keys (GUID follow-up)"},"content":{"rendered":"<p>In the <a href=\"https:\/\/www.spotonoracle.com\/?p=166\" target=\"_blank\">previous post<\/a> we have seen that GUIDs as primary keys can lead to some serious performance problems. What we need is a primary key that won&#8217;t cause excessive I\/O on inserts and reduce contention on the right hand side index leaf block. And I like it to be RAC aware.<br \/>\nTo build a primary key that addresses above mentioned issue we use three different components:<\/p>\n<ul>\n<li>the Oracle instance number<\/li>\n<li>a one (or two) digit number<\/li>\n<li>Oracle generated sequence number<\/li>\n<\/ul>\n<p>The Oracle instance number is useful in Real Application Cluster setups to minimize data block transfers between nodes on inserts as each instance has its own part of the index tree to work on.<br \/>\nThe one (or two) digit number could be anything that allows for distribution between sessions. For instance &#8220;mod(sid, 10)&#8221; or &#8220;round(dbms_random.value(0, 9), 0)&#8221;. The goal here is that each session works on different parts of the index.<br \/>\nThe Oracle sequence number makes the key values unique.<\/p>\n<p>I&#8217;m going to use the same test setup from the <a href=\"https:\/\/www.spotonoracle.com\/?p=166\" target=\"_blank\">previous post<\/a>. Tables TESTS and RESULTS are the same except that I changed the data type from &#8220;RAW(16)&#8221; to &#8220;NUMBER(10, 0)&#8221;.<\/p>\n<pre class=\"brush: sql; collapse: false; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ncreate table tests (\r\n    test_id number(10, 0) not null\r\n  , vc2_long varchar2(4000 byte)\r\n) tablespace myapp_data;\r\n\r\ncreate unique index tests_idx_01 on tests (test_id) tablespace myapp_index;\r\nalter table tests add constraint tests_pk primary key (test_id) using index tests_idx_01;\r\n\r\ncreate table results (\r\n    result_id number(10, 0) not null\r\n  , test_id number(10, 0) not null\r\n  , dt date\r\n  , vc2_short varchar2(200 byte)\r\n) tablespace myapp_data;\r\n\r\ncreate unique index results_idx_01 on results(result_id) tablespace myapp_index;\r\nalter table results add constraint results_pk primary key (result_id) using index results_idx_01;\r\ncreate index results_idx_02 on results(test_id) tablespace myapp_index;\r\nalter table results add constraint results_fk_01 foreign key (test_id) references tests(test_id) on delete cascade;\r\n<\/pre>\n<p>And we need sequences to get uniqueness.<\/p>\n<pre class=\"brush: sql; collapse: false; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ncreate sequence tests_seq_pk cache 1000 noorder;\r\ncreate sequence results_seq_pk cache 10000 noorder;\r\n<\/pre>\n<p>From the table definition above you can see I&#8217;m using one single number column to hold the three-part primary key values. As we need the three parts put together in a specific order (instance, some number, sequence) we cannot simply add up the numbers. I&#8217;ll use stored functions to concatenate the parts as strings and then cast the result to a number.<\/p>\n<pre class=\"brush: sql; collapse: false; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ncreate or replace function tests_id_distr\r\nreturn number\r\nis\r\nbegin\r\n  return to_number(sys_context('userenv', 'instance') || to_char(mod(sys_context('userenv', 'sid'), 10)) || to_char(tests_seq_pk.nextval) || dummy_java_call);\r\nend tests_id_distr;\r\n\/\r\n\r\ncreate or replace function results_id_distr\r\nreturn number\r\nis\r\nbegin\r\n  return to_number(sys_context('userenv', 'instance') || to_char(mod(sys_context('userenv', 'sid'), 10)) || to_char(results_seq_pk.nextval || dummy_java_call));\r\nend results_id_distr;\r\n\/\r\n<\/pre>\n<p>What&#8217;s the &#8220;dummy_java_call&#8221; at the end there, you ask?<br \/>\nWell, PL\/SQL is blazingly fast \ud83d\ude42 &#8230;and concatenating &#038; casting a few strings and numbers doesn&#8217;t do the Java GUID implementation justice. To get fairly comparable results I made this implementation also do a context switch to the JVM for every key generated. The Java method simply returns an empty string (or NULL if you will in the PL\/SQL context).<\/p>\n<pre class=\"brush: sql; collapse: false; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ncreate or replace and compile java source named &quot;DummyJavaCall&quot;\r\nas\r\npublic class DummyJavaCall {\r\n  public static String create() {\r\n    return &quot;&quot;;\r\n  }\r\n}\r\n\/\r\n\r\ncreate or replace function dummy_java_call\r\nreturn varchar2\r\nis\r\nlanguage java name 'DummyJavaCall.create() return java.lang.String';\r\n\/\r\n<\/pre>\n<p>The test script was just slightly modified to use the new stored functions to get the primary key values. Otherwise it is the same: inserting 1000 Tests in a loop. After each test it inserts 100 Results per Test in the inner loop and commits.<\/p>\n<pre class=\"brush: sql; collapse: false; title: ; wrap-lines: false; notranslate\" title=\"\">\r\nset verify off feedback off\r\nset serveroutput on size unlimited\r\n\r\n&lt;&lt;myapp&gt;&gt;\r\ndeclare\r\n  test_id tests.test_id%type;\r\nbegin\r\n  rrt.stat_cfg_default;\r\n  rrt.stat_start('&amp;1');\r\n  rrt.timer_start();\r\n\r\n  for tid in 1 .. 1000 loop\r\n    insert into tests t (test_id, vc2_short) values (tests_id_distr(), dbms_random.string('A', trunc(dbms_random.value(200, 4000))))\r\n    returning t.test_id into myapp.test_id;\r\n    \r\n    for rid in 1 .. 100 loop\r\n      insert into results r (result_id, test_id, dt, vc2_short) values (results_id_distr(), myapp.test_id, sysdate, dbms_random.string('L', trunc(dbms_random.value(10, 200))));\r\n    end loop;\r\n    \r\n    commit;\r\n  end loop;\r\n\r\n  dbms_output.put_line('Elapsed:' || to_char(rrt.timer_elapsed));\r\n  rrt.stat_stop;\r\nend;\r\n\/\r\n\r\nexit success\r\n<\/pre>\n<p>I again used the shell script from the GUID test case to run 10 concurrent session for each run. The VM and database specification is the exact same for both test cases. To get approximately the same data volume for the indexes I had to run this test case 12 times.<br \/>\nRESULTS_IDX_01 is 34 MB bigger  as in the GUID test case (385 vs 351 MB)<br \/>\nRESULTS_IDX_02 is  6 MB smaller as in the GUID test case (389 vs 395 MB)<br \/>\nAlso the number of leaf blocks are in the same ballpark.<\/p>\n<pre class=\"brush: sql; collapse: false; title: ; wrap-lines: false; notranslate\" title=\"\">\r\nSQL&gt; select segment_name, tablespace_name, blocks, extents, bytes\/1024\/1024 mb from dba_segments where (segment_name like 'TESTS%' or segment_name like 'RESULTS%') order by segment_name desc;\r\n\r\nSEGMENT_NAME    TABLESPACE_NAME  BLOCKS  EXTENTS  MB    \r\n--------------- ---------------- ------- -------- ----\r\nTESTS_IDX_01    MYAPP_INDEX      512     4        4     \r\nTESTS           MYAPP_DATA       41216   322      322   \r\nRESULTS_IDX_02  MYAPP_INDEX      49792   389      389   \r\nRESULTS_IDX_01  MYAPP_INDEX      49280   385      385   \r\nRESULTS         MYAPP_DATA       220160  1720     1720  \r\n<\/pre>\n<pre class=\"brush: sql; collapse: false; title: ; wrap-lines: false; notranslate\" title=\"\">\r\nSQL&gt; select table_name, index_name, uniqueness, blevel, leaf_blocks, distinct_keys, avg_leaf_blocks_per_key, avg_data_blocks_per_key, clustering_factor, num_rows from dba_indexes where index_name in ('TESTS_IDX_01', 'RESULTS_IDX_01', 'RESULTS_IDX_02') order by table_name desc, index_name;\r\n\r\nTABLE_NAME  INDEX_NAME      UNIQUENESS  BLEVEL  LEAF_BLOCKS  DISTINCT_KEYS  AVG_LEAF_BLOCKS_PER_KEY  AVG_DATA_BLOCKS_PER_KEY  CLUSTERING_FACTOR  NUM_ROWS  \r\n----------- --------------- ----------- ------- ------------ -------------- ------------------------ ------------------------ ------------------ --------\r\nTESTS       TESTS_IDX_01    UNIQUE      1       442          120000         1                        1                        92579              120000    \r\nRESULTS     RESULTS_IDX_01  UNIQUE      2       47849        11704420       1                        1                        1082748            11704420  \r\nRESULTS     RESULTS_IDX_02  NONUNIQUE   2       49037        118856         1                        4                        497219             11986476  \r\n<\/pre>\n<p>From this setup the aggregated session statistics show that we constantly get low numbers of physical reads and elapsed times irrespective of the data volume of the underlying tables and indexes.<\/p>\n<p><a href=\"https:\/\/www.spotonoracle.com\/wp-content\/uploads\/2017\/05\/chart-test2.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1542\" height=\"473\" src=\"https:\/\/www.spotonoracle.com\/wp-content\/uploads\/2017\/05\/chart-test2.png\" alt=\"\" class=\"alignnone size-medium wp-image-209\" srcset=\"https:\/\/www.spotonoracle.com\/wp-content\/uploads\/2017\/05\/chart-test2.png 1542w, https:\/\/www.spotonoracle.com\/wp-content\/uploads\/2017\/05\/chart-test2-300x92.png 300w, https:\/\/www.spotonoracle.com\/wp-content\/uploads\/2017\/05\/chart-test2-768x236.png 768w, https:\/\/www.spotonoracle.com\/wp-content\/uploads\/2017\/05\/chart-test2-1024x314.png 1024w, https:\/\/www.spotonoracle.com\/wp-content\/uploads\/2017\/05\/chart-test2-624x191.png 624w\" sizes=\"auto, (max-width: 1542px) 100vw, 1542px\" \/><\/a><\/p>\n<p>I like this solution because it addresses the scalability issues and doesn&#8217;t rely on any fancy Oracle feature (which you might need extra cost options for).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the previous post we have seen that GUIDs as primary keys can lead to some serious performance problems. What we need is a primary key that won&#8217;t cause excessive I\/O on inserts and reduce contention on the right hand side index leaf block. And I like it to be RAC aware. To build a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-210","post","type-post","status-publish","format-standard","hentry","category-general"],"_links":{"self":[{"href":"https:\/\/www.spotonoracle.com\/index.php?rest_route=\/wp\/v2\/posts\/210","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.spotonoracle.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.spotonoracle.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.spotonoracle.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.spotonoracle.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=210"}],"version-history":[{"count":8,"href":"https:\/\/www.spotonoracle.com\/index.php?rest_route=\/wp\/v2\/posts\/210\/revisions"}],"predecessor-version":[{"id":219,"href":"https:\/\/www.spotonoracle.com\/index.php?rest_route=\/wp\/v2\/posts\/210\/revisions\/219"}],"wp:attachment":[{"href":"https:\/\/www.spotonoracle.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=210"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.spotonoracle.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=210"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.spotonoracle.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=210"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}