2012-10: Existence check SQL – Through the ages

 

Before DB2 V7
DB2 V7
DB2 V8

Over the years, we have all been faced with the problem of checking for existence. (not just with DB2 and SQL either – that is for another BLOG!). Now in the days before even SYSIBM.SYSDUMMY1 existed, it was pretty nasty, and in my old firm we created our own single row dummy table for exactly this requirement (and doing date arithmetic etc.). However the programmers of the world often re-invent the wheel time and time again, a simple existence check has also evolved over the years. Here is a brief selection of the various ways that people have done it in the past, and could well be still running today, every one to two seconds, somewhere in the world…

Before DB2 V7

Line 188 in the EXPLAIN Output SELECT COUNT(*) FROM ZIPCODES WHERE COL1 = ‘xxx’ If the count(*) was > 0 – BINGO! (Line 199 WITH UR) Or Line 211 SELECT 1 FROM ZIPCODES WHERE COL1 = ‘xxx’ If the SQLCODE was 0 or -811 – BINGO! (Line 222 WITH UR) Or Line 234 SELECT 1 FROM SYSIBM.SYSDUMMY1 WHERE EXISTS ( SELECT 1 FROM ZIPCODES A WHERE A.COL1 = ‘xxx’) If the SQLCODE was 0 – BINGO! (Line 247 WITH UR) Or Line 261 SELECT 1 FROM SYSIBM.SYSDUMMY1 D WHERE EXISTS ( SELECT 1 FROM ZIPCODES A WHERE A.COL1 = ‘xxx’ AND D.IBMREQD = D.IBMREQD )   If the SQLCODE was 0 – BINGO! Note the use of the “correlated” predicate to actually force good access! (Line 275 WITH UR)

DB2 V7

In DB2 V7 you could then add the FETCH FIRST 1 ROW ONLY. So the scan would stop Line 290 SELECT 1 FROM ZIPCODES WHERE COL1 = ‘xxx’ FETCH FIRST 1 ROW ONLY If the SQLCODE was 0 – BINGO! (Line 302 WITH UR) Adding WITH UR “allegedly” also sped up the processing of SYSDUMMY1 and the possibility to declare a cursor with OPTIMIZE FOR 1 ROW and a single FETCH arrived.

DB2 V8

In DB2 V8 you could now drop the correlation check and still get good performance Same as line 234 SELECT 1 FROM SYSIBM.SYSDUMMY1 WHERE EXISTS ( SELECT 1 FROM ZIPCODES A WHERE A.COL1 = ‘xxx’ ) If the SQLCODE was 0 – BINGO!   As Terry Purcell wrote on listserv many years ago, “The FETCH FIRST 1 ROW ONLY made all other existence checks obsolete”, and so now is the time to dig through all of your old SQL, and see if you can improve even simple existence checks!   Here is the EXPLAIN output of all of these variations in a DB2 10 NF:

---------+---------+---------+---------+---------+---------+---------
LINE   QNO  TABLE_NAME    A   P  CE  TYPE             MS        SU
---------+---------+---------+---------+---------+---------+---------
00188  01   ZIPCODES      R   S  R   SELECT          137        388
00199  01   ZIPCODES      R   S  R   SELECT          137        388
00211  01   ZIPCODES      R   S      SELECT          187        529
00222  01   ZIPCODES      R   S      SELECT          187        529
00234  01   SYSDUMMY1     R   S      SELECT           11         29
00234  02   ZIPCODES      R   S      NCOSUB           11         29
00247  01   SYSDUMMY1     R   S      SELECT           11         29
00247  02   ZIPCODES      R   S      NCOSUB           11         29
00261  01   SYSDUMMY1     R   S      SELECT            2          6
00261  02   ZIPCODES      R   S      CORSUB            2          6
00275  01   SYSDUMMY1     R   S      SELECT            2          6
00275  02   ZIPCODES      R   S      CORSUB            2          6
00290  01   ZIPCODES      R          SELECT            1          1
00302  01   ZIPCODES      R          SELECT            1          1

Access is always a tablespace scan, sometimes Sequential pre-fetch is active. So you can see that now, today the WITH UR makes no difference, and the absolutely best performance is indeed with the FETCH FIRST 1 ROW ONLY code. Just for fun, I ran the above COBOL program calling each of these methods 10,000,000 times… the CPU usage varied from 10.87 seconds for the worst and 10.63 seconds for the best. That is how good DB2 really is: 0.24 / 10,000,000 is a very very small number indeed! Sometimes you can teach an old dog new tricks! Finally here’s a real cutey from about ten years ago, where the task was not to see if a given row existed but if a given TABLE existed.

DECLARE TESTTAB CURSOR FOR
SELECT 'PS_DOES_TABLE_EXIST'
FROM PS_TAX_BALANCE
WHERE 1 = 0
FOR FETCH ONLY

I hope this has been fixed to actually query the DB2 catalog these days, as in DB2 9, it now gets a PRUNED access path.

 ---------+---------+---------+---------+---------+---------+---
 LINE   QNO   TABLE_NAME  A P CE  TYPE       MS       SU
 ---------+---------+---------+---------+---------+---------+---
 00319 01                         PRUNED     0         0

Which would probably really kill the application!!!   As usual, any  questions or comments, then please feel free to mail me!