2022-05 Let’s get hashed

This month I will delve into the wonderful new field QUERY_HASH in the SYSIBM.SYSPACKSTMT Db2 Catalog table.

Isn’t it Old?

The QUERY_HASH column first appeared back in the Db2 11 tables SYSQUERY and DSN_STATEMENT_CACHE_TABLE where it is used to identify dynamic SQL and enable joining between these two tables.

Now it Gets Interesting

In Db2 12 it was added throughout the Catalog to tables SYSDYNQRY, SYSPACKSTMT and DSN_STATEMNT_TABLE, thus adding the availability to Static SQL as well as finishing the Dynamic SQL support.

For Static SQL the column is actually very interesting. I will now show you a list of little queries that you can use to see what it is and how to use it at your site.

First up: Baseline

How many static SQL statements do you have in the Db2 catalog at this moment?

SELECT COUNT(*)
FROM SYSIBM.SYSPACKSTMT A
FOR FETCH ONLY
WITH UR
;

Gives me:

---------+---------+---------+---------+---------+---------+
324675
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100

0 = 0 = 0 ?

But that includes the “dummy” stuff in all packages, so let’s rerun with a better predicate:

SELECT COUNT(*)
FROM SYSIBM.SYSPACKSTMT A
WHERE NOT (A.SEQNO  = 0
       AND A.STMTNO = 0
       AND A.SECTNO = 0)
FOR FETCH ONLY
WITH UR
;

Which shows my true list of actual SQLs:

---------+---------+---------+---------+---------+---------
319015
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100

So how many different hashes do I have?

SELECT COUNT(*)
     , HEX(A.QUERY_HASH) AS QUERY_HASH
FROM SYSIBM.SYSPACKSTMT A
WHERE NOT (A.SEQNO  = 0 
       AND A.STMTNO = 0 
       AND A.SECTNO = 0)
GROUP BY A.QUERY_HASH
ORDER BY 1 DESC
FETCH FIRST 100 ROWS ONLY
FOR FETCH ONLY
WITH UR
;

Gives me:

---------+---------+---------+---------+-----
      QUERY_HASH
---------+---------+---------+---------+-----
73671 00000000000000000000000000000000
27228 40404040404040404040404040404040
12161 00000000000000000000000025B72000
 9821 40C7C5D7C9E240404040404040404040
 7372 00000000000000000000000024CC5000
 5989 000000000000000000000000257CD000
 5324 000000000000000000000000257CB000
 4530 40404070580000800000000000000000
 4265 1372041862047A61177D116D03002220
 4033 00000000000000000000000024E2F000
 3791 000000000000000000000000257EB000
 3114 27B830EAC5C4D44040C7C5D7C9E24040
 3089 7A69031C0174677E6844533C59555533
 2881 7B7F67796A17051E04077C027E142055
 2690 1473780D166A031F575A2F432047382F
 2446 6C166B000A6E13186161751F1A255340
 2264 6D760D7A75066A7A111A691E62592154
 2248 27B8353AC5C4D44040C7C5D7C9E24040
 2098 000000000000000000000000257BA000

Now I was a bit worried about all the low-values and all the 4040 entries but thought “The low-values are probably not bound or not executable or some such.” The spaces were more worrying! Then I noticed the rows with lots of leading zeroes…

Details, Details…

At this point I thought we needed to break down the SQLs between true “real” SQLs and “fake” ones – FETCH, OPEN, CLOSE, SET etc. which are not EXPLAINable. So I added the EXPLAINABLE column to the select to see if I was right:

SELECT COUNT(*)
     , A.EXPLAINABLE
     , HEX(A.QUERY_HASH) AS QUERY_HASH
FROM SYSIBM.SYSPACKSTMT A
WHERE NOT (A.SEQNO  = 0 
       AND A.STMTNO = 0 
       AND A.SECTNO = 0)
GROUP BY A.EXPLAINABLE, A.QUERY_HASH
ORDER BY 1 DESC, 2
FETCH FIRST 100 ROWS ONLY
FOR FETCH ONLY
WITH UR
;

Gives me:

---------+---------+---------+---------+---------+--------
      EXPLAINABLE QUERY_HASH
---------+---------+---------+---------+---------+--------
73485 N           00000000000000000000000000000000
27228 N           40404040404040404040404040404040
12161 N           00000000000000000000000025B72000
 9821 N           40C7C5D7C9E240404040404040404040
 7372 N           00000000000000000000000024CC5000
 5989 N           000000000000000000000000257CD000
 5324 N           000000000000000000000000257CB000
 4530 N           40404070580000800000000000000000
 4055 N           1372041862047A61177D116D03002220
 4033 N           00000000000000000000000024E2F000
 3791 N           000000000000000000000000257EB000

Aha! So I guessed right all these, well over a third of *all* SQLs are not actually explainable and so a QUERY_HASH would be a little bit pointless.

Getting There…

So, now I added a predicate to remove all those:

SELECT COUNT(*)
     , A.EXPLAINABLE
     , HEX(A.QUERY_HASH) AS QUERY_HASH
FROM SYSIBM.SYSPACKSTMT A
WHERE NOT (A.SEQNO  = 0 
       AND A.STMTNO = 0 
       AND A.SECTNO = 0)
  AND NOT A.EXPLAINABLE = 'N'
GROUP BY A.EXPLAINABLE, A.QUERY_HASH
ORDER BY 1 DESC, 2
FETCH FIRST 100 ROWS ONLY
FOR FETCH ONLY
WITH UR
;

Which gives me much better data:

---------+---------+---------+---------+---------+--------
    EXPLAINABLE QUERY_HASH
---------+---------+---------+---------+---------+--------
372 Y           6014030D641C1325583D214B504C3750
372 Y           3E4E5C30101603600A60620574076268
312 Y           70106E0C106E150F7274790C53255340
307 Y           49335C5B4A1C6B6276101914001D6D73
248 Y           53473E64001574120C191862767E1360

Enhanced It All!

Then I enhanced the query to now join back to the SYSPACKAGE and show me columns of interest from there, especially TYPE as I had a suspicion!

SELECT A.COLLID, A.NAME, A.CONTOKEN, A.TIMESTAMP
      ,A.BINDTIME, A.VALID, A.OPERATIVE, A.LASTUSED
      ,A.TYPE, B.STATUS, B.EXPLAINABLE, B.STATEMENT
FROM SYSIBM.SYSPACKAGE  A
    ,SYSIBM.SYSPACKSTMT B
WHERE NOT (B.SEQNO  = 0 
       AND B.STMTNO = 0 
       AND B.SECTNO = 0)
  AND     A.LOCATION    = ''
  AND     A.LOCATION    = B.LOCATION
  AND     A.COLLID      = B.COLLID
  AND     A.NAME        = B.NAME
  AND     A.CONTOKEN    = B.CONTOKEN
  AND     B.QUERY_HASH  = X'00000000000000000000000000000000'
  AND NOT B.EXPLAINABLE = 'N'
ORDER BY 2 , 1
FOR FETCH ONLY
WITH UR
;

Not my TYPE

Scrolling right to the TYPE column:

----+---------+---------+---------+---------+---------+---------+----
VALID OPERATIVE LASTUSED   TYPE STATUS EXPLAINABLE STATEMENT
----+---------+---------+---------+---------+---------+---------+----
Y     Y         2021-12-23      C      Y      DECLARE DB2JCCCURSOR8 C
Y     Y         0001-01-01      H      Y      DECLARE DB2JCCCURSOR1 C
Y     Y         0001-01-01      C      Y      DECLARE DB2JCCCURSOR3 C
Y     Y         2016-09-02 1
Y     Y         2016-09-02 1
Y     Y         2016-09-02 1
Y     Y         0001-01-01 1
Y     Y         0001-01-01 1
Y     Y         2022-05-12 N
Y     Y         2022-05-12 N
Y     Y         0001-01-01 T
Y     Y         0001-01-01 T

Aha! Advanced Triggers (1) , Procedures (N) and Normal Triggers (T). Functions(F) would also be there. There’s a Gotcha here, too: RESTful Services identify themselves like a normal package but with HOSTLANG=’R’, yet they do not have a usable Hash. So now remove all of these from the picture like this:

SELECT A.COLLID, A.NAME, A.CONTOKEN, A.TIMESTAMP
      ,A.BINDTIME, A.VALID, A.OPERATIVE, A.LASTUSED
      ,HEX(B.QUERY_HASH) AS QUERY_HASH
      ,B.STATUS, B.STATEMENT
FROM SYSIBM.SYSPACKAGE  A
    ,SYSIBM.SYSPACKSTMT B
WHERE NOT (B.SEQNO  = 0 
       AND B.STMTNO = 0 
       AND B.SECTNO = 0)
  AND A.LOCATION    = ''
  AND A.LOCATION    = B.LOCATION
  AND A.COLLID      = B.COLLID
  AND A.NAME        = B.NAME
  AND A.CONTOKEN    = B.CONTOKEN
  AND A.TYPE        = ' ' -- ONLY REAL PACKAGES
  AND B.EXPLAINABLE = 'Y' -- ONLY EXPLAINABLE SQL
  AND NOT B.HOSTLANG = 'R' -- NO RESTFUL SERVICES 
ORDER BY 2 , 1
FETCH FIRST 100 ROWS ONLY
FOR FETCH ONLY
WITH UR
;

Now I get the real data out:

---------+---------+---------+---------+---------+---------+---------
COLLID     NAME    CONTOKEN TIMESTAMP
---------+---------+---------+---------+---------+---------+---------
PTFCOLL008 ADMDMEM +oæ (p8  2021-10-08-12.40.14.562716
PTFCOLL008 ADMDMEM +oæ (p8  2021-10-08-12.40.14.562716
PTFCOLL008 BUILDS  ?À ã     2016-12-28-08.50.58.486446
PTFCOLL008 CLEAN   â­â  r4   2016-12-28-08.50.59.911739
PTFCOLL008 CLEAN   â­â  r4   2016-12-28-08.50.59.911739
-+---------+---------+---------+---------+--
BINDTIME                   VALID OPERATIVE
-+---------+---------+---------+---------+--
2022-01-20-15.26.20.128052 Y     Y
2022-01-20-15.26.20.128052 Y     Y
2022-01-20-15.26.20.533383 Y     Y
2022-01-20-15.26.20.639940 Y     Y
2022-01-20-15.26.20.639940 Y     Y
-----+---------+---------+---------+---------+------
LASTUSED   QUERY_HASH                       STATUS
-----+---------+---------+---------+---------+------
2022-03-21 48534A5F7A741F0E75131067686F066A C
2022-03-21 585850457A760F066F091067686F0668 C
2022-02-18 6A0A777E720B7B671E703E40232C584B C
0001-01-01 59283E494E332341514B37572A29376F C
0001-01-01 6F051C041E1C136A0024374B293F3742 C
--------+---------+---------+---------+---------+---
STATEMENT
--------+---------+---------+---------+---------+---
DECLARE PTFTOOL-02 CURSOR FOR SELECT A . PMEMBERNAME
DECLARE PTFTOOL-01 CURSOR FOR SELECT A . PMEMBERNAME
DECLARE GET-MEMBER CURSOR FOR SELECT A . RMEMBERNAME
DECLARE PTFTOOL-02 CURSOR FOR SELECT D . PTFNO , D .
DECLARE PTFTOOL-01 CURSOR WITH HOLD FOR SELECT D . P

Finally, I get my “real” useful HASH data with text.

And Now?

So what can you do with the QUERY_HASH? One simple thing is to just use it to pull out all the duplicate (see note below!) SQLs that you have in different collections or packages:

SELECT A.COLLID, A.NAME, A.CONTOKEN, A.TIMESTAMP
      ,A.BINDTIME, A.VALID, A.OPERATIVE, A.LASTUSED
      ,B.STATUS, B.STATEMENT
FROM SYSIBM.SYSPACKAGE  A
    ,SYSIBM.SYSPACKSTMT B
WHERE NOT (B.SEQNO  = 0 
       AND B.STMTNO = 0 
       AND B.SECTNO = 0)
  AND A.LOCATION    = ''
  AND A.LOCATION    = B.LOCATION
  AND A.COLLID      = B.COLLID
  AND A.NAME        = B.NAME
  AND A.CONTOKEN    = B.CONTOKEN
  AND A.TYPE        = ' ' -- ONLY REAL PACKAGES
  AND B.QUERY_HASH  = X'621B6C6564170F63151C5E45544E4A40'
  AND B.EXPLAINABLE = 'Y' -- ONLY EXPLAINABLE SQL
  AND NOT B.HOSTLANG = 'R' -- NO RESTFUL SERVICES 
ORDER BY 2 , 1
FETCH FIRST 100 ROWS ONLY
FOR FETCH ONLY
WITH UR
;

This shows me all the packages with the selected QUERY_HASH value:

---------+---------+---------+---------+
COLLID           NAME     CONTOKEN
---------+---------+---------+---------+
MDB2VNEX_TEST    DSMALTER ) }_8
RTDX0510RCH      DSMALTER î Ö t Y
RTDX0510_AT      DSMALTER À´ Â ¢
RTDX0510_BE      DSMALTER _K W y
RTDX0510_COLL_AN DSMALTER À´ Â ¢
RTDX0510_DA      DSMALTER î Ö t Y
---------+---------+---------+---------+---------+----
TIMESTAMP                  BINDTIME
---------+---------+---------+---------+---------+----
2019-04-08-14.40.11.723943 2022-01-20-15.23.56.457297
2021-11-10-08.34.22.949593 2022-02-28-09.51.35.203521
2021-11-09-10.25.16.043349 2022-05-10-09.05.47.146861
2021-11-17-06.25.25.922112 2022-03-21-13.44.10.759957
2017-11-09-13.17.35.820393 2022-04-25-14.27.48.875174
2021-12-01-11.36.44.218374 2022-03-21-13.44.49.788358
---+---------+---------+---------+--
VALID OPERATIVE LASTUSED   STATUS
---+---------+---------+---------+--
Y     Y         2021-11-22 C
Y     Y         0001-01-01 C
Y     Y         0001-01-01 C
Y     Y         0001-01-01 H
Y     Y         2021-03-24 C
Y     Y         0001-01-01 H
--+---------+---------+---------+---
STATEMENT
--+---------+---------+---------+---
INSERT INTO DSM_ALTER VALUES ( : H ,
INSERT INTO DSM_ALTER VALUES ( : H ,
INSERT INTO DSM_ALTER VALUES ( : H ,
INSERT INTO DSM_ALTER VALUES ( : H ,
INSERT INTO DSM_ALTER VALUES ( : H ,
INSERT INTO DSM_ALTER VALUES ( : H ,

This does add an extra, pretty neat, element to the DBA’s tool kit.

Useful for You?

Do you think you will be using this feature or just let external tooling handle all of this for your system?

I would be very interested to hear any, and all, of your thoughts!

TTFN

Roy Boxwell

Note: One thing that I have noticed is that the hash algorythm is not actually that good! I get duplicates which are not actually duplicate SQLs when only one – four characters are different (typically table names!) Not really a major problem but something you had all better be aware of!

CategoriesUncategorized