2011-02: XML Table Spaces – How they work and what you must take care of, or “LOBs the Next Generation”

 

XML has become more and more interesting for DB2 over the years.  Starting with DB2 9 it has also become much easier to use!

One of the weird things about XML is that the table spaces come in three (yes three!) different flavors depending on the base table space/table definition. Therefore, they are a bit “trickier” than a “normal” table space.

Our newsletter this month covers the relatively straight-forward rules for creating these.

 

Segmented table spaces:

If you create a segmented table space with an XML column type, DB2 will automatically:

  1. Add a column to the base table “DB2_GENERATED_DOC_ID_FOR_XML” BIGINT.
  2. Create an XML Partition by Growth (PBG) universal table space, (the name will be Xyyynnnn where yyy is the first three UTF-8 bytes of the base table space, and nnnn four numeric’s starting with 0000).
  3. Create an XML table in the above space. The name will be Xyyyyyyyyyyyyyyyyyynnn, where the y-y are the first 18 UTF-8 characters of the base table name, if it already exists.  nnn is used as three numeric’s starting at 000 until a unique name is found.
  4. Create a document ID index on the implicit column DB2_GENERATED_DOC_ID_FOR_XML in the base table. The name will be I_DOCIDyyyyyyyyyyyyyyyyyynnn, where the y-y are the first 18 UTF-8 characters of the base table name, if it already exists. nnn is used as three numeric’s starting at 000 until a unique name is found.
  5. Create a node ID index on each XML column in the base table. The name will be I_NODEIDyyyyyyyyyyyyyyyyyynnn, where the y-y are the first 18 UTF-8 characters of the XML table name, if it already exists. nnn is used as three numeric’s starting at 000 until a unique name is found.

 

PBG spaces:

If you create a Partitioned by Growth (PBG) table space with an XML column type, DB2 does the same as above, but with a default DSSIZE of 4GB – also for *all* of the indexes.

 

PBR spaces:

If you create a Partitioned by Range (PBR) table space with an XML column, DB2 will create as many PBR XML table spaces for you with their associated indexes but with a possibly larger DSSIZE.
This can all lead to a spectacular growth in storage space. Imagine a 4096 partition space, which is then altered to have two XML columns…the mind boggles!

 

What does this boil down to?

  • When using XML spaces make sure your back-up and recovery plans work.
  • Coordinate space management proactively.
  • Use COMPRESS wisely. XML table spaces inherit the COMPRESS attribute or can be ALTERed and then REORGed to enable compression. They compress very well.
  • Monitor your 16k Bufferpool usage because this is always used by XML table spaces.
  • Watch out if you copy data from subsystem to subsystem! XML data tends not to copy across too well, (e.g., DSN1COPY does not support it) because the XML descriptive data in the XSR system tables tends to be “forgotten”…

 

New from IBM:

Take a look at IBM’s RedBook “Extremely pureXML in DB2 10 for z/OS, SG24-7915-00”, which is full of great info about the use and maintenance of XML data.

 

Life of an XML column example:

  • Create segmented table space XMLCUSTO.
  • Create table XMLCUSTOMER (columns CID BIGINT and INFO XML).
  • DB2 adds the column DB2_GENERATED_DOC_ID_FOR_XML BIGINT, and it creates table space XXML0000.
  • DB2 creates the XML table XXMLCUSTOMER containing the three columns:
     DOCID         BIGINT
     MIN_NODE_ID   VARBIN(128)    NOT NULL
     XMLDATA       VARBIN(15850)  NOT NULL
  • DB2 creates the unique index I_DOCIDXMLCUSTOMER on the base table with the following column:
    DB2_GENERATED_DOC_ID_FOR_XML

 

  • Finally, DB2 creates the clustering, not padded, non-unique index I_NODEIDXXMLCUSTOMER on the XML table with columns:
     DOCID
     XMLDATA

 

A table INSERT:

INSERT INTO XMLCUSTOMER VALUES (06001, XMLPARSE(DOCUMENT '
 <customerinfo><name>John Smith</name>
 <addr country="United States"><street>327 Smith Ave</street>
 <city>Glendale</city><prov-state>California</prov-state>
 <pcode-zip>91208-06001</pcode-zip></addr>
 <phone type="work">888-111-0001</phone></customerinfo>'));

 

A SELECT:  

SELECT CID
FROM BOXWELL.XMLCUSTOMER
WHERE XMLEXISTS ('//addr[city="San Jose"]' passing INFO)

You will also find useful information about XML basics, XML integration in DB2 9, 10 and XML index design in our presentation entitled “XML Episode 9 – the return of the hierarchical Empire –“.

 

If you would like to sample SQL to create and populate some example XML spaces, please contact us.

 

Feel free to send me your comments and ask questions.

TTFN,
Roy Boxwell
Senior Architect