Table Segmentation in Content Manager OnDemand
Introduction
To keep database queries fast, OnDemand uses a concept called "database table segmentation". The term "table segmentation" refers to splitting extremely large tables into smaller tables, called "segments", for performance, ease of maintenance, or both. Although the latest versions of database engines can do this natively, there was no built-in support for it when CMOD was created, so CMOD still uses its own old-style segmentation to achieve the scalability and speed that customers require.
When an end user performs a search, CMOD runs it against one or more tables, chosen by the date range of the data each table contains. This date range is called the "segment date".
Before DB2 supported table segmentation natively, the Content Manager OnDemand developers decided to split index data into tables of 10 million rows each. This method keeps search performance linear, because only the tables containing documents in the date range you're looking for (for example, 3 months or 1 year) are actually searched.
To complete queries as quickly as possible, it's important to minimize the number of tables that are searched. Each additional table means more CPU work and more input/output (I/O), which delays the response to the end user.
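The idea of searching only the tables whose segment dates overlap the requested date range can be sketched in a few lines of Python. This is an illustration of the principle only, not CMOD's actual implementation: the Segment class, the tables_to_search function, and the table names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Segment:
    table_name: str  # hypothetical table name, for illustration only
    start: date      # earliest segment date held in the table
    end: date        # latest segment date held in the table


def tables_to_search(segments, query_start, query_end):
    """Return the names of the tables whose date range overlaps the query range."""
    return [
        s.table_name
        for s in segments
        if s.start <= query_end and s.end >= query_start
    ]


# Three made-up segments covering one year of data.
segments = [
    Segment("AG1_1", date(2023, 1, 1), date(2023, 4, 30)),
    Segment("AG1_2", date(2023, 5, 1), date(2023, 8, 31)),
    Segment("AG1_3", date(2023, 9, 1), date(2023, 12, 31)),
]

# A three-month search that falls inside one segment touches one table.
print(tables_to_search(segments, date(2023, 6, 1), date(2023, 8, 31)))

# A one-month search that straddles a segment boundary touches two tables.
print(tables_to_search(segments, date(2023, 4, 15), date(2023, 5, 15)))
```

Note that the second search covers a shorter period than the first but reads twice as many tables, which is why the alignment between segment boundaries and typical search ranges matters.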
Optimizing Segment Size
One way to improve query performance is to match the volume of data you ingest each month with the segment size.
If your monthly volume is more than 10 million documents for a single Application Group, use this formula to estimate your optimal segment size:
Number of Individual Documents loaded per month * 1.10 = Max Rows per database table
The 1.10 in the formula gives you 10% additional room for growth. If your growth rate is higher, adjust the multiplier accordingly. If your volumes increase suddenly, you should review the setting. A changed value takes effect for the next database table that is created.
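As a worked example of the formula, here is a minimal Python sketch. The function name max_rows_per_table and the sample volumes are made up for illustration; the growth allowance is passed as a whole-number percentage so the arithmetic stays in integers and the result is rounded up to a whole row.

```python
def max_rows_per_table(docs_per_month, growth_percent=10):
    """Estimate Max Rows per database table from the monthly document volume.

    growth_percent=10 corresponds to the 1.10 multiplier in the formula.
    Integer arithmetic avoids floating-point rounding; the result is rounded up.
    """
    return (docs_per_month * (100 + growth_percent) + 99) // 100


# Hypothetical Application Group loading 12 million documents per month.
print(max_rows_per_table(12_000_000))      # 10% headroom
print(max_rows_per_table(12_000_000, 25))  # 25% headroom for faster growth
```

With 12 million documents per month, the 10% allowance gives 13,200,000 rows per table, and a 25% allowance gives 15,000,000.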