2011-04: RTS – data for DBA daily usage

IBM brought in the RTS in DB2 V7 and it was a paradigm-shattering event! Well, at least for me…
What it did, was finally give us DBAs the ability to “see” what is happening in our DB2 sub-systems, without having to run an expensive RUNSTATS utility.

Up until this time, shops had to run 1000’s of RUNSTATS to see if a REORG was needed, and, of course, these RUNSTATS caused massive side effects:

  1. The DSC was invalidated for all RUNSTATSed objects
  2. Automatic REBIND detection software then scheduled a REBIND – which sometimes caused severe performance problems when the access path “went south” or “turned pear-shaped”
  3. Lots of CPU was wasted on non-changed data by RUNSTATS

 

The moment that RTS came into town it changed the whole game. Now you only need a RUNSTATS when you actually need a RUNSTATS and you only need a REORG when you actually need a REORG. (See my list of reasons why you do not actually need a REORG later in this newsletter). Just to complete the picture: You only need a COPY when you actually need a COPY! Lots of shops exploiting the RTS reported huge costs savings from using the real-time data for threshold based maintenance – if you haven’t yet considered the RTS then please do so, as it is a gold mine!

 

RTS tables from DB2 V8 up until DB2 10.

To start with, let us have a look at the RTS tables from DV8-10
As can readily be seen, IBM keeps adding columns to these tables!
Note that all the counter columns are NULLable. The way that the RTS data has been externalized, and which column(s) are updated  -and when – has changed a few times over the years. Now it seems to be very stable and the basic rule is:
When you first create an object (as long as it is not a DEFINE NO one of course!) it gets an entry in the RTS. The counters contain 0, but the relevant utility timestamp is NULL – which stops externalization of the counters (apart from REORGLASTTIME which is set to creation date).
When a RUNSTATS – or a COPY – has been done, then these counters start counting as well.

Now, just looking at the columns gives a great idea of the usefulness of this data: Upfront is a caveat, “The use of REORGLEAFNEAR is not recommended to decide upon a REORG or not”. The REORGLEAFNEAR is basically worthless as a data point. Recently, at a local DB2 user group meeting it was explained that the numbers in this column are really a waste of time and that, “IBM informed us that to use REORGLEAFNEAR in the REORG decision process was incorrect and could cause repetitive, needless, REORGs.” – You have been warned!! But apart from that, the data can help DBAs every day in their work.

 

REORG decisions:

Do not do a REORG
Simple huh? I basically see people all the time doing REORGs of TSs just because “The software told me to do it”. This is no excuse! The only real reason to REORG tablespaces is either to

a) reclaim space,

b) change attributes or

c) re-sort the data.

Now the last one really only makes sense if you do indeed have sequential access of the TS, otherwise, if the access is through an index and is effectively random, why bother with an expensive REORG???

 

RUNSTAT decisions:

Do not do a RUNSTATS
Equally simple! I see 1000’s of RUNSTATS conducted for absolutely no good reason. They kill your Dynamic Statement Cache (DSC), they can kill your access paths for REBINDs and BINDs, and what if you ran it when the table was empty but then along comes a LOAD of 2,000,000 rows??? Yup, a RUNSTATS run must be aware of the usage and environment and should be well timed. Basically, the rule is to run a RUNSTATS when the DB2 Optimizer benefits from the fresh statistics and generates a new, better access path. This may be a challenge if you don’t have the right database maintenance solution, but it’s another gold mine if you do!

 

COPY decision:

Do COPYs on-the-fly. Use SHRLEVEL CHANGE and run your ICs during the day. At 08:30 a.m. you have copied the world and your SLA’s for RECOVERY are great. What happens when you have a problem at 05:30 p.m.? I guess your SLA will *not* be met, right? What you must do, is learn to not be afraid of image copies during the day. They can save your life, but don’t do them on a fixed schedule. Do them when they’re needed!

At the INDEX level the RTS data can be used for the same set of Utilities as above.

 

IX REORG decisions:

The REORG of an index can bring real plus points! If REORGPSEUDODELETES is greater than 10% of TOTALENTRIES or REORGNUMLEVELS is greater than zero, then it is definitely worth doing an index REORG to clear out the junk and get the index back into tip-top condition. This should not change your access paths, if they were bound when the index was “fit”, but it will improve your elapsed time as the number of IOs will go down dramatically.

 

IX RUNSTAT decisions:

Do not do a RUNSTATS
Same as TS RUNSTATS, but with the added hint of never actually doing just an Index RUNSTATS on its own. If you do this, then the data stored in the DB2 catalog will start to “drift” apart and, eventually, even the DB2 Optimizer will “see” this and start to make bad decisions based on inaccurate data. If you can, always try to do a consistent full RUNSTATS at the TP or TS level. BTW, if you’d like to check your DB2 catalog for inconsistency, please try our Statistics HealthCheck Freeware. As the name says, it’s FREE! !
Also note that a RUNSTATS at the index level will only invalidate access paths in the DSC that actually currently use this index, so if the optimizer has picked a tablespace scan or another index, then just running a RUNSTATS index will not enable a new access path for existing statements – another reason to run at the TS or TP level!

 

IX COPY decision:

Do not do index COPYs on their own! Personally I am no fan of COPY YES indexes. The REBUILD INDEX utility is so fast these days, (I have examples of a 17,000,000 row table with 17 indexes and the rebuild took three minutes elapsed…). and in a data sharing system you can even bring the system crashing down around you if you do too many ALTERs at the same time, so be very careful! But, if you feel you must have COPY YES indexes to reduce CPU consumption during RECOVERY, then do the TS and the IX at the same time, in the same copy statement of course!

In the IBM manual “Performance Monitoring and Tuning Guide”, look at the chapter “Real-time Statistics” to see which thresholds IBM uses to determine when a utility should be run – and, of course, make sure you tailor them to suit your site, if you don’t have a set it and forget it solution like our RealTime DBAExpert! Solutions like that also manage the RTS. Orphans build up in the RTS and can cause problems, so every now and again make sure that only databases and tablespaces that actually exist are also in the RTS. Watch out for DSNDB01, Work Databases and the RTS itself, as the first two are *in* the RTS but the RTS is *not* in the RTS.

 

Finally, a quick run through some of my favourite columns in the RTS:

DBNAME & NAME – Remember that the DSNDB01 is in the RTS, but watch out for SYSLGRNX NACTIVE and do not forget that the RTS is not in the RTS, of course!
EXTENTS – This is the number of extents in the *last* linear dataset, not the total over all datasets.
INSTANCE – If using the CLONE system, then each CLONE has two entries in the RTS, so you must always also join to the INSTANCE column in the SYSIBM.SYSTABLESPACE to make sure you are looking at real data and not behind-the-scenes data – also for indexes, of course!
LASTUSED – A great place to begin the “unused index analysis saga”.

And, of course, the new DB2 10 fields, first the TABLESPACE ones:
REORGCLUSTERSENS – The number of times data has been read by SQL statements that are sensitive to the clustering sequence – in other words CLUSTER is important!
REORGSCANACCESS – The number of times data is accessed for SELECT, FETCH, searched UPDATE, or searched DELETE. Great for REORG decisions!
REORGHASHACCESS – The number of times data is accessed using hash access for SELECT, FETCH, searched UPDATE, searched DELETE, or used to enforce referential integrity constraints. Great for checking whether your usage of HASH Access is bringing dividends or not.
HASHLASTUSED – When it was last actually used. The docu states TIMESTAMP but I always see a DATE column here, just like normal index LASTUSED.
DRIVETYPE – Does this object live on a normal disk (HDD) or on a solid state drive (SSD)? Only for the first created object dataset.
LPFACILITY – Does the disk unit that this object lives on support the new High Performance List Prefetch facilty Y or N?
STATS01 – Some internally used data for IBM

 

And now the INDEX ones:

REORGINDEXACCESS – The number of times the index was used for SELECT, FETCH, searched UPDATE, searched DELETE, or used to enforce referential integrity constraints. Great for REORG decisions as, if this number is high, then a REORG may not bring much in performance terms. Extra bonus data is that for hash overflow indexes, this is the number of times DB2 has used the hash overflow index – which is a *BAD* thing and should be controlled!
DRIVETYPE – Does this object live on a normal disk (HDD) or on a solid state drive (SSD)? Only for the first created object dataset.
STATS101 – Some internally used data for IBM. I just hope it is nothing to do with the Orwellian room 101!
Feel free to send me your comments and ask questions.

TTFN,
Roy Boxwell
Senior Architect