2019-12 Fun with Db2 12 PBR RPN

I have recently enhanced our SpaceAssuranceExpert (SAX) product to automatically help out if partitions, or partitioned indexes, are getting too big for their boots in a productive system.

SAX – What is it?

The idea behind SAX, is to stop any and all chances of getting a dreaded SQLCODE -904 in production, especially “out of the blue”.

Our idea was, in Db2 12, with the new PBR RPN to do an on-the-fly ALTER to the DSSIZE, just like SAX does now with the SECQTY to avoid running out of extents.


A quick look in the manual tells you that :

it is an immediate ALTER (as long as you make the new DSSIZE larger than the old one!) and there are no package invalidations or REORGS required.

This is fantastic!

So I created a nice little PBR and then ran a horrible Cartesian join SPUFI to flood the first partition with data. This join had a TIME card of (,1) to limit it to one second of CPU before getting an Abend S322.

SAX Monitor

The SAX monitor reacted perfectly and did the TP alter to 513 GB DSSIZE (I had an increment size of 512 as a test), and got an SQLCODE -666 (I just love that SQLCODE…), as the INSERT was still running.

This ALTER was then internally queued to be attempted later, like in any other failure case.

All well and good.

Wham! Nasty errors !

Then I did the same for a DPSI on my PBR RPN… Oh dear!I got a nasty IO Error and then an even nastier ROLLBACK loop, meaning I had to cancel the IRLM to stop Db2… (There is an APAR for this problem PH18977.)

Rollback loop?

The ROLLBACK loop was caused by me choosing to use NOT LOGGED as a tablespace attribute. We have a finite amount of log space, and when the transaction was S322’d after one second of CPU,

the rollback could not find one of the archive logs and then we had to cold start Db2

– Not pretty!

Proper test!

I then wrote a couple of little test programs that actually COMMITted after 5000 inserts and then the ALTERs all worked as designed.

Do I worry too much about extended format and extended accessibility?

Next, I worried about the ominous “extended format and extended addressability“ attributes in the DATACLASS for a PBR RPN, and wondered what would happen if a customer has SAX running and it happily ALTERs a TP to say 6 GB when they can only address 4 GB…

IBM to the rescue!

Luckily for us, Db2 development had thought about this!

  • If you attempt to create a PBR RPN (even with a very small size) and your DATACLASS does not have the two attributes set, you get an error message 00D70008 telling you this detail.

  • If you ALTER an existing tablespace to be a PBR RPN and your DATACLASS does not have the two attributes set, then the ALTER works fine. But remember, this is a pending alter and you *must* do a REORG at the TS level with inline TP level copies. This REORG then fails – also with 00D70008.

So, in other words, SAX cannot hurt you here!

What about PBGs?

After all this we also considered PBGs. They have a limit as well – MAXPARTITIONS in their case. So we added an ability to also ALTER MAXPARTITIONS as well. Here you must be more careful though, as these ALTERs are still immediate *but* they invalidate any referring packages!


If you work with ABIND set to YES all is good as the ALTER comes in, Db2 invalidates your packages and the auto rebind happens so fast that you do not even notice it happening, however, if you work with ABIND NO then any packages, even the package actually running and doing the inserts, will fail! If you are just using dynamic SQL then it is 100% OK otherwise – Buyer beware!


I have opened an Analytics Idea (DB24ZOS-I-1057) to try and get this loophole closed, as I cannot see what access path change could be affected by going from 32 to, say, 36 MAXPARTITIONS.

Db2 keeps adding the parts dynamically and nothing happens then… Go figure… Anyway, if you would like it – Please vote for it!

There are already some nice comments attached to it:

  • DP commented

    this is just a limit in the catalog.  So how could access path be affected?  The actual growth of a partition doesn’t invalidate the package so how just changing the limit do so ?
  • BW commented

    I opened a Case on this asking why packages are being invalidated in this situation and it is still open waiting for a reply.
  • BD commented

    With only impact to catalog, not sure how Access Path would be impacted.  Seems wasteful and counter productive to invalidate packages.

Remember – You never stop learning!

As always I would be pleased to hear from you!

Roy Boxwell
Senior Architect

2019-08 FTB (Fast Traversal Block): Just another TLA?

So we get a new TLA (Three Letter Abbreviation) to try and wrap our heads around in the busy busy world of mainframe databases. This time it comes with a twist, as FTB is actually not “Feed The Beast” sadly, but “Fast Traversal Block”.

In reality, it is referred to everywhere as:

FIT (Fast Index Traversal)

All clear on this and still with me? Good then I may begin!

The problem with modern indexes

The core problem these days is time… the time it takes to go through the leaf pages down to the actual data page is just too long. The vast amount of data we have forces upon the index a large number of layers which, in turn, forces a lot of non-leaf page accesses.

Index structure

We all know how indexes look in Db2, but remember they all have a root page (which is always technically speaking a non-leaf page) which points, normally, to non-leaf pages which, in turn, point to either more non-leaf pages – so you are getting extra levels in the index here  – and finally they point to a leaf page which actually contains pairs of keys and RIDs so that Db2 can get the data from the table page.

Which level are you?

One of the quirks of Db2, is that the leaf pages are called Level 0 and then the non-leaf pages directly above them are Level 1. The root page above these is then Level 2 and so you have a “two level” index.

So how many fetches for a random data access?

The first thing Db2 does is read the root page, looking at the root page it sees which non-leaf page must be fetched and it repeats this until it lands on the leaf page and then it fetches the data. So in my two level index it takes four fetches to get to the data. Now imagine you have a seven level index? Yep you end up fetching a ton of non-leafs!

The solution? FTB!

In Db2 12, IBM introduced the Fast Traversal Block to keep a copy of the non-leaf data in a new area, separate from the buffer pools.

The major reasons for the expected performance boost, are the facts that the structure is L2 cache-aware and each page is equal in size to one cache line (256 bytes), and I/O is massively reduced for the “other” non-leaf pages.

One size fits all?

The size of the FTB is determined by the new ZPARM INDEX_MEMORY_CONTROL with valid values AUTO, DISABLE or 10 – 200,000 MBs. By default AUTO is on. With this value, the size of the FTB is either 20% the size of all available buffer pools or 10MB, whichever is the highest.

The devil is in the detail

To actually start working with FTBs, you have to wait until the FTB Daemon has found an eligible index. From all of your indexes there will be quite a few candidate indexes. These must follow the following rules:

  1. It cannot be longer than 64 bytes

  2. It cannot be versioned (So the OLDEST_VERSION and CURRENT_VERSION must be the same!)

  3. It cannot contain a TIMESTAMP column with TIMEZONE

  4. It cannot have more than 2,000,000 leaf pages

Once through that selection list, the Daemon starts seeing what is happening to this index, is it a random select? That’s is a good thing! Is it an index only access? That’s even better! A split! Oh dear, that’s very bad…from all this, every two minutes, the daemon decides whether or not to use FTB for this index.

Control is in your hands

This is all well and good, but there are a lot of people who like to see *exactly* what is going on and where! So IBM introduced a new command:


From this you get to see which indexes are being processed and how much space they are using. The Daemon also spits out console messages so you can see what happened and why. Further there is a new Catalog table SYSIBM.SYSINDEXCONTROL where you can micro-manage exactly which indexes to attempt to force into FTB and which ones to exclude. I would try and avoid this option and just let Db2 do the business!


Always! FTB is limited to 10,000 per member which is probably ok! When a MASS DELETE (or TRUNCATE TABLE) happens then the FTB is removed for the related indexes and, even more importantly, is to keep up with current Db2 maintenance! As always with new things, it takes a while to get the birthing pains out of the way!

INCLUDE can kill you!

Finally, remember that the length cannot be greater than 64 bytes. So if you decide to add that one byte character column to the index to get Index Only access, you might then push the index over the edge and up to 65 bytes…

Before ALTERing indexes always check if they are in the FTB (or were!) and check if your alteration will disallow the FTB!

A little SQL to show you what you have

WITH INPUT (NLEVELS, LENGTH, INDEX_NAME) AS                            
       , SUM(CASE D.COLTYPE                                             
             WHEN 'DECIMAL ' THEN                                       
 -- IF , SEPERATOR           SMALLINT( CEILING(( D.LENGTH + 1,0 ) / 2 ))
                             SMALLINT( CEILING(( D.LENGTH + 1.0 ) / 2 ))
             WHEN 'GRAPHIC'  THEN D.LENGTH * 2                          
             WHEN 'VARG'     THEN D.LENGTH * 2                          
             WHEN 'LONGVARG' THEN D.LENGTH * 2                          
             ELSE D.LENGTH                                              
       + SUM(CASE A.PADDED                                              
             WHEN 'Y' THEN 0                                            
               CASE D.COLTYPE                                           
               WHEN 'VARG'     THEN 2                                   
               WHEN 'LONGVARG' THEN 2                                   
               WHEN 'VARCHAR'  THEN 2                                   
               WHEN 'LONGVAR'  THEN 2                                   
               WHEN 'VARBIN'   THEN 2                                   
               WHEN 'DECFLOAT' THEN 2                                   
               ELSE 0                                                   
       + SUM(CASE D.NULLS                                               
             WHEN 'Y' THEN 1                                            
             ELSE 0    
                END) AS LENGTH  
 FROM SYSIBM.SYSINDEXES        A                                       
     ,SYSIBM.SYSKEYS            C                                       
     ,SYSIBM.SYSCOLUMNS         D                                       
     ,SYSIBM.SYSINDEXSPACESTATS E                                       
 WHERE A.UNIQUERULE     <> 'D'              -- NOT DUPLICATE            
   AND D.COLTYPE        <> 'TIMESTZ'        -- NOT TIMEZONE             
   AND A.TBNAME         = D.TBNAME                                      
   AND A.TBCREATOR      = D.TBCREATOR                                   
   AND A.NAME           = C.IXNAME                                      
   AND A.CREATOR        = C.IXCREATOR                                   
   AND A.NAME           = E.NAME                                        
   AND A.CREATOR        = E.CREATOR                                     
   AND C.COLNAME        = D.NAME                                        
 GROUP BY A.NAME, A.CREATOR , E.NLEVELS, A.NLEVELS)                     
 SELECT NLEVELS, LENGTH , INDEX_NAME                                    
 FROM INPUT                                                             
 WHERE LENGTH <= 64   
 WITH UR              

I have limited it to just show you just the eligible unique, non timezone, non-versioned indexes with a length of 64 or less of course!

Blog time

Here’s an excellent write up from John Campbell with even more technical details:


and from Akiko Hoshikawa: a very good IDUG Blog:


As always I would be pleased to hear from you!

Roy Boxwell
Senior Architect

Rotten Results from RUNSTATS Require Rescue

Do you know the basic rules to ensure access path stability when using RUNSTATS?

Time for another of my “I noticed something strange at a customer site recently” newsletters. Enjoy!


RUNSTATS are good aren’t they?

At this particular site, the RUNSTATS methodology of RUNSTATS was, shall we say, “sub-optimal.” They use an ancient system to decide when to RUNSTATS, and they do tablespace’s and index’s *never* at the same time. Just to complicate matters even more, they never use inline RUNSTATS because “if the REORG abends, the statistics in the DB2 catalog are dead”. Now you are all probably well aware of the scale of the disaster at this site?


Daily fire fighting

Nearly every day, some access path somewhere goes horribly wrong… the under- manned and over-worked DBA group are tasked to find and fix ASAP. Cures range from a quick INDEX create or change, or perhaps even a really needed RUNSTATS or REORG.


Why do the Access Paths go “wrong”?

The real goal is to stop firefighting and to investigate the root cause. Why do so many access paths go wrong on such a regular basis? The answer is the systemic horribleness of RUNSTATS collection. Dynamic SQL is, obviously, very very sensitive to RUNSTATS. For one thing, the statements are kicked out of the cache! The very next time they come back, the DB2 Optimizer redrives the cost calculations and “Hey Presto!” you have a bad access path. Terry Purcell and Pat Bossmann have often said that about 90% of DB2 performance problems stem from bad RUNSTATS. The old adage “garbage in – garbage out” is still true!


Timing is everything

The timing of the RUNSTATS is critical for stable access paths.

Basic rules are:

 1 Only do a RUNSTATS if you really really need to!
a. RUNSTATS are not cheap!
b. The Dynamic Statement cache gets wiped
c. Locks on the Catalog can occur
2Avoid doing RUNSTATS even if RTS says to run one!
a. Lots of people use the incorrect counters to trigger a RUNSTATS. Use the correct ones for the correct Object type
b. Never RUNSTATS LOB spaces – completely pointless work!
c. Even if a MASSDELETE has occurred do you really want to “reset” the DB2 catalog statistics?
d. VOLATILE tables must be handled with *extreme* care!
3Choose your RUNSTATS parameters wisely!
a. Doing a blind “RUNSTATS the world” is just as bad as running an empty RUNSTATS!
b. HISTOGRAM should be used with caution
c. More than a hundred COLGROUPs should start alarm bells ringing

Quite a list here, and it really only shows some “Rules of Thumb”. I’ll bet you all have you own?

Is there a way back from the abyss?

But what happens if you have 1000’s of partitions with terabytes of data and the RUNSTATS was, shall we say, ill-advised or badly timed? Can you go back in time? Hands up those who wants to do a PiT recovery on the production catalog! No takers???

Yes! There is a way back from the abyss

I’ll bet you are all well ahead of me here, but the way to do this is pretty straightforward. You simply acquire our latest tool, RUNSTATS Rescue, to handle it all for you. Or, you could try and reset the data in the DB2 catalog from off-line backups that you happen to have taken before the RUNSTATS that is now killing you. …You did do that, right?

Why a tool?

Apart from the fact that this tool is from us, my firm, just trying to “roll your own” can be a real nightmare. Why?

  • Because you must first find out all of the objects that were touched by the badly performing SQL.
  • Then you must get all of the DB2 Optimizer relevant data back from a point in time before the RUNSTATS executed, and/or the last REBIND(s),
  • and then you must flush the dynamic statement cache and REBIND any static SQL.

Sounds like a lot of work.

What else must you do?

You also have to be transparent and so log what you do. You must allow for the ability to back-out your changes as perhaps you make another access path even worse. And it would be really cool if you could do “on the fly” explains to check that the RUNSTATS really *is* the root of all that evil. Remember that ZPARMS and BUFFERPOOLS also have a major influence on access paths. Even the speed of your machine! It is also a must to then be able to go even further back in time – perhaps as much as a year?

Hang on – What about PLAN STABILITY?

Doesn’t plan stability save you? I hear you all cry. Well, “No” is the short answer! If your package is invalidated by a Schema change (the classics are index drop and recreate or VIEW change), then plan stability does not work anymore. Further, in DB2 12, Dynamic Plan Stability has been announced. Sadly it *also* fails right here as there is no SWITCH PREVIOUS/ORIGINAL support!

It all works together

So, for the static SQL case where the package is not invalided, Plan Stability is good. If not: – RUNSTATS Rescue to the rescue. For Dynamic SQL – RUNSTATS Rescue is the answer.

As always, any questions or comments would be most welcome!


Roy Boxwell


GIVE and TAKE Program


Give and Take 2020

Information on the Give and Take 4,5,6,7

Previous Give & Take

We have “GIVEn” various free-of-charge Use Cases from our SQL WorkloadExpert for Db2 z/OS like:

1 Index Maintenance Costs

2 EXPLAIN Suppression

3 BIF Usage

Limited free-of-Charge Db2 Application

This Program started in Europe, during our 30th anniversary was such a success, that it is now being Extended for the benefit of North American Db2 z/OS sites.

<a href="http://www.seg.de/produkte/db2-zos-produkte/sql-workloadexpert-for-db2-zos/" target="

Index Maintenance Costs, EXPLAIN Suppression, BIF

Limited free-of-Charge DB2 Application

This Program started in Europe, during our 30th anniversary was such a success, that it is now being Extended for the benefit of North American DB2 z/OS sites.

SQL WorkloadExpert for DB2 z/OS (WLX) contains several “Use Cases”. We provided three of them, free of charge, for one month to different sites. In return, we received their results. We’d like to share this with you now.

We have “GIVEn” various free-of-charge Use Cases from SQL Workload Expert for DB2 z/OS like
1  Index Maintenance Costs
2  EXPLAIN Suppression
3  BIF Usage – This last one is still available
We TAKE the anonymized results for research
and will communicate with the local User Groups for discussions

Inspiring experiences

Customer Statements

Read the Customer Comments across the Industry 


  • Health Care
  • Insurance
  • Banking
  • Car Manufacturing

First results from DB2 z/OS sites

1Index Mantenance CostsNearly all the data we got back showed a positive result for created Indexes…
2EXPLAIN SuppressionAbout 10% of SQLs are actually “left over”…
3 BIF Usage When migrating to a new DB2 version, the BIFs are not always compatible and an extreme amount of data is produced.


The difficulty of near-time analysis to track down BIFs within dynamic SQL have been solved with the BIF Usage Use Case…

[Results from DB2 z/OS sites]

Program 3 – BIF Usage –  has now started




BIF CompatibilityDB2 10 compatibility mode
Changes to the STRING formating of decimal data within the CHAR and VARCHAR built-in function and to the CAST specification with CHAR and VARCHAR result types as well as  UNSUPPORTED TIMESTAMP STRINGs.
White PaperFinding BIFsAnd How to Lead a Problem-Free Life With Them in the Future
Navigating the Challenges of moving to a new DB2 Release
Newsletter2015-01 – BIFCIDS – Where’s the BIF?How will you deal with loop-hole usage in production code?
VideoBIF Usage(11min.) Trap  and correct the BIFs that will cause belly-ache one day soon

BIF Usage video

2015-09 A real CLUSTER Buster


Are you using the “default” clustering INDEX or are you defining the correct INDEX with the CLUSTER Attribute?

This newsletter is dedicated to all DDL designers who do their best but then omit that last tiny bit. No-one really notices, or even cares, for years and years until…

Imagine a huge table up in the billions of rows. Imagine now that you have SQL that accesses this table and it must, as always, run fast. So what do you do? You create an index, RUNSTATS it and Hey Presto! Everything is sweet and dandy! Now imagine this happening again and again over time… What you finally end up with is a huge table with billions of rows now with ten indexes! Not too brilliant for insert and update but that is not the point of this newsletter.


An SQL, that had worked perfectly well, suddenly went pear shaped…

So now stop imagining, as this had already really happened at a customer site. We come to the crux: An SQL, that had worked perfectly well, suddenly went pear shaped (belly-up for the non-British English readers!) and started using a two column index with one matching column instead of a six column index with six matching columns! This change in access path caused death by random-IO to occur and it all went horribly wrong.


Now the question is why? What on earth happened for the DB2 Optimizer to make such a terrible decision?

1- RUNSTATS review

First idea was, of course, my favourite – Incomplete or not Full RUNSTATS data. In fact there were “bogus stats” from 2003 in the SYSCOLDIST, but even after all the bogus stats were deleted and a complete RUNSTATS with HISTOGRAM and FREQVAL performed, the access path remained stuck on the “bad” index.


2- DDL review

I then reviewed the DDL that created all of the objects and noticed that none of the indexes was defined with the CLUSTER attribute. The table itself was “as old as the hills,” but all of the indexes had been created and/or altered many times over the last ten years or so.


3- Redefine the “bad” Index as CLUSTER

 The dummy CLUSTER

Now, as we all know, if no index is defined as CLUSTER DB2 picks one to be a dummy CLUSTER when it does a REORG. So you can end up with 100% clustering non-clustered indexes. In this case that was exactly what was happening. The “bad” index was, purely by fluke after many years of index maintenance, the “default” clustering index, however it was a *terrible* choice for a clustering index. Worse still: because of the fact it was non-unique with two columns and therefore small (well small in this case was still 40,000 pages!) it looked positively “good” to the DB2 Optimizer—hence the decision to abandon a six column matching index in favour of a single column one…

“Cleansing” with ALTER, REORG and the DB2 10 INCLUDE syntax

A quick ALTER of the original “first” index to get the CLUSTER attribute, a REORG scheduled for the weekend to get the data into *proper* CLUSTERing sequence and – Bob’s your uncle! Access path swapped back to the “good” index.

Now there’s still work to be done here as ten indexes is about seven too many, if you ask me.With DB2 10 it is possible to use the INCLUDE syntax to weed out some of the extra indexes and thus speed up all usage of this mega-table. But, for the right here and now, the job is done!

So now I am at the end of this sad story of how a little design “error” of just forgetting one little attribute on an index create statement caused major mayhem many years down the line… remember to check *all* of your tables and see if you have any beauties like this in your shop (surely not!)



Here’s a little SQL that will do the job for you:

-- IS DEFINED AS CLUSTER                                               
SELECT A.CREATOR                                                       
FROM SYSIBM.SYSTABLES  A                                               
WHERE NOT A.CREATOR = 'SYSIBM'                                         
  AND NOT EXISTS                                                       
          (SELECT 1                                                    
           FROM SYSIBM.SYSINDEXES B                                    
           WHERE A.NAME       = B.TBNAME                               
             AND A.CREATOR    = B.TBCREATOR                            
             AND B.CLUSTERING = 'Y')                                   
  AND 1 < (SELECT COALESCE(COUNT(*) , 0)                               
           FROM SYSIBM.SYSINDEXES B                                    
           WHERE A.NAME    = B.TBNAME                                  
             AND A.CREATOR = B.TBCREATOR)                              
ORDER BY 1 , 2                                                         

Note that this query excludes the SYSIBM indexes as IBM also forgot to CLUSTER them!


As usual, any comments or questions please mail me!



Roy Boxwell

2015-05 Top 10 Things to Ignore for DB2 z/OS


This newsletter was inspired by a recent article I read in the “Enterprise Systems Magazine” called “Top 10 Ways to Waste Money on CPU”. Why not the Top 10 things to ignore?


DB2 z/OS things you could ignore but most definitely should not!

So here’s my little list, in no particular order, of things you could ignore but most definitely should not!

  1SQL DELETE statements in mega-million  tablesSQL DELETE statements in mega-million  tables when a REORG DISCARD would kill two birds with one stone. (I love that phrase) Anyway, after 500,000 singleton deletes the tablespace probably needs a REORG anyway and so why not do two in one? A bit of a no-brainer really.
  2LOB columnsLOB columns, whose size would *easily* fit inside an inline LOB or even a VARCHAR. LOBs are still slow and cumbersome to use, but inline LOBs are great. If you can use ‘em – do so!
  3BP0 being used for *everything* by default…BP0 being used for *everything* by default… Please split the BP s into groups!!! BP0 is only, and I mean ONLY, for the Catalog and Directory. That way you can actually keep the size low and spare some memory for other BPs. LOB and XML tablespaces get their own BP. Tables and Indexes are split. Sort gets its own. You get the idea ?
  4Utility jobs still based on 1990’s ideasUtility jobs still based on 1990’s ideas. Are you still running a RUNSTATS to see if a REORG is needed? Are you running REORGs without inline RUNSTATS? Are your RUNSTATS using FREQVAL and, if required, HISTOGRAM?
  5Death by “indexiphication”.Death by “indexiphication”. Do you have tables with more than three indexes? Do you have ten or more indexes? Time to look for INCLUDE usage and LASTUSED Timestamps here!
  6PLAN_TABLE explosionPLAN_TABLE explosion. Do you have multiple PLAN_TABLEs in production? Are you REORGing, RUNSTATSing and Image Copying them on a regular basis? Are you purging them of rubbish data on a regular basis?
  7Are your ZPARMs up to date?Have you checked the Rules of Thumb in regard to ZPARMS since they were last set back in the 80’s? Now is the time to do a review of all the ZPARMS to see where you can really get performance boosts. (For example the default SRTPOOL In DB2 10 is now 10,000k but in DB2 V8 and 9 it was just 2,000k)
 8Are you removing garbage from the DB2 Catalog and Directory ?Are you removing garbage from the DB2 Catalog and Directory ? Do you really need all the packages and versions of those packages from 1989 these days? If a table gets RUNSTATSed that these ancient, never executed, packages uses then it should trigger a review of the access paths, which could, of course, flag up problems where no real problem exists.
  9 COMMIT frequency.You never need to check or change this do you…
 10 TrainingIDUG, Insight, and RUGs etc. you can never ever get enough info about how things work and how to make things better.


One thing you should certainly NOT ignore, is my newsletter! I have lots of exciting topics coming up in 2015 and I’ll also let you know about our webinars.

Upcoming Newsletters

  • SOUNDEX and other cool features part 4 – update for DB2 10 & all new for DB2 11
  • BAD Data Day
  • Overloaded Log
  • A real CLUSTER Buster


As usual, any comments or questions are welcome!


Roy Boxwell