Migrating From Filenet to Content Manager OnDemand
Revision as of 18:16, 6 September 2015
Scope
These are some recommendations and tips from my experience migrating Filenet data to Content Manager OnDemand. Since I'm not a Filenet admin, Filenet itself is outside the scope of this article -- it covers only the OnDemand-specific tips and tricks that make your migration easier.
Nomenclature
This is the worst part of migrating between Filenet and CMOD, as some terms are used in both systems, but have different contexts and meanings. The process is complex enough without the added headaches of misunderstandings brought about by ambiguous terms.
Filenet Nomenclature
- Document Class (aka "DocClass")
- Defines the metadata used to find individual reports.
Content Manager OnDemand Nomenclature
- Application Group (aka "App Group" or simply "AG")
- A way of combining many different reports into a single group of data, organized by business need: accounting reports for accounting, and operations reports for your operations teams. Reports that are bundled into an Application Group need to have the same index fields, storage hierarchy, and retention (i.e., expiration) handling.
- Application (aka "App")
- Defines the type of document (AFP, Line data, PDF, Image), and how to collect search criteria (aka 'indexes') for storage in the database.
- Multiple Applications of any data type (e.g., AFP for customer statements, Line data reports generated by a mainframe, special letters or notices in PDF format, and incoming faxes stored as TIFF images) can belong to the same Application Group.
- Folder
- The Folder in OnDemand abstracts the internal complexities of Application Groups and Applications, and presents users with the fields that they can search (which were populated into the database by the Application) and sets limits on their queries (maximum number of returned hits, fields required for searches, etc.)
Considerations
These are items that should be given priority. Getting it right at this stage will mean a faster, easier, cheaper transition to CMOD at the end of the day.
Converting Document Classes
Content Manager OnDemand ("CMOD") has an entirely different architecture than the Filenet products. In CMOD vernacular, an 'Application' is analogous to an individual report. But in CMOD, the top of the hierarchy is the 'Application Group' -- a grouping of Applications (aka 'reports') where the index fields, storage, and retention requirements are all the same. Properly defined Application Groups can have multiple Applications (again, 'reports') that belong to it. The most rational way to design Application Groups is to combine reports together that fulfill a specific business need. Human Resources reports shouldn't intermingle with Accounts Payable (even if they have the same index fields), and are kept logically separate by keeping their reports in separate Application Groups.
Quantify index usage
OnDemand doesn't like to have indexes defined in the Application Group without a corresponding value appearing in the reports it processes -- it also wastes space inside the database. It seems common in the Filenet world to assign a report to a Document Class that has indexes configured that simply don't exist anywhere in the report. Yes, you can assign default values to the empty fields to get ACIF to stop complaining, but if you want to do this right, you'll want to look into your index usage. Not just which fields you're populating most often, but also which fields your end users are searching on. Eliminating unused fields from Application Group definitions will streamline indexing, reduce storage costs, and reduce complaints from end users at the end of the day.
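One way to start quantifying index usage is to measure how often each index field is actually populated in the metadata you export. The following is a minimal sketch, not a Filenet or OnDemand tool: it assumes you can get the exported index metadata into one dictionary per document, and the field names (`account`, `branch`, `report_date`) are hypothetical.

```python
from collections import Counter

def index_fill_rates(records, fields):
    """Return the fraction of records in which each field has a non-blank value."""
    filled = Counter()
    total = 0
    for record in records:
        total += 1
        for field in fields:
            value = record.get(field)
            if value is not None and str(value).strip():
                filled[field] += 1
    if total == 0:
        return {field: 0.0 for field in fields}
    return {field: filled[field] / total for field in fields}

# Hypothetical exported index metadata, one dict per document.
sample = [
    {"account": "1001", "branch": "", "report_date": "2015-09-01"},
    {"account": "1002", "branch": None, "report_date": "2015-09-02"},
    {"account": "", "branch": "", "report_date": "2015-09-03"},
    {"account": "1004", "branch": "", "report_date": "2015-09-04"},
]

rates = index_fill_rates(sample, ["account", "branch", "report_date"])
# List the least-populated fields first; these are candidates for removal.
for field, rate in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{field}: {rate:.0%} populated")
```

Fill rates only tell you what is being populated. Which fields end users actually search on still has to come from query logs or from talking to the users.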
Transfer in Original Formats
For some Filenet installations, upstream servers (or intermediate file transfer systems) convert report data (from EBCDIC to ASCII) and change the formatting of the report. Content Manager OnDemand doesn't need any data transformation, and can ingest EBCDIC reports (of fixed record length, stream, or variable record lengths) directly and without conversion. Some conversion tools (I'm looking at you, MQ Series File Transfer Edition) can be configured to change the report so drastically that CMOD can't properly index it.
Wherever possible, remove any data conversion and deliver report data to OnDemand in its original format.
This means that you may require two different Applications ("Report Definitions") for each report -- one for the report in its original format (EBCDIC) and one for the converted version (ASCII). For this reason alone, you should always define an Application ID Field in EVERY Application Group.
Image Overlays for Reports
If a report has a graphic "overlay" (like, an image with boxes around columns, or shaded bars, or graphic logos) this should be documented as early in the process as possible. In order for these overlays to be displayed on all platforms, line data reports will need to be converted to AFP. This will require any overlay graphics not in AFP format to be converted -- a process which can take a considerable amount of time to complete, especially if there is not someone available to do the translation 'in-house'.
Review Report Types & Audience
There's no better time to review the contents of reports, and confer with end users to determine which reports should be stored, indexed, managed, and disposed of in the same manner. Put on your Business Analyst cap and strap on your most comfortable telephone headset, because this is the most time-consuming and manual part of the whole process.
Consider how you'll ingest the data
During the export process, consider the information you'll need to get the exported data into Content Manager OnDemand.
Provide report names
In order to get specific reports into OnDemand, you need to provide the name of a report (likely as an Application). Make life easier for yourself by including the name of the report in the file names you output.
Group reports in chronological order
Due to the way table segmentation works in OnDemand, you'll want to load the data in chronological order. When you name files, consider including a date field in YYYY-MM-DD format, so the files can be sorted into date order at load time. This ensures that when the production server goes live, end users will get fast database queries.
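The naming advice above can be sketched in a few lines of Python. This is only an illustration: the `REPORT_YYYY-MM-DD.dat` pattern, the `.dat` extension, and the report names are assumptions, not anything OnDemand requires.

```python
import re
from datetime import date

# Hypothetical naming convention: <report name>_<YYYY-MM-DD>.dat
NAME_PATTERN = re.compile(r"^(?P<report>.+)_(?P<date>\d{4}-\d{2}-\d{2})\.dat$")

def export_name(report, report_date):
    """Build an export file name carrying the report name and an ISO date."""
    return f"{report}_{report_date.isoformat()}.dat"

def parse_name(filename):
    """Split an export file name back into (report name, date)."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return match.group("report"), date.fromisoformat(match.group("date"))

def load_order(filenames):
    """Sort file names oldest first, then by report name within a day."""
    def sort_key(name):
        report, day = parse_name(name)
        return (day, report)
    return sorted(filenames, key=sort_key)

names = [
    export_name("GLTRIAL", date(2015, 9, 6)),
    export_name("AP_AGING", date(2014, 12, 31)),
    export_name("GLTRIAL", date(2015, 1, 15)),
]
for name in load_order(names):
    print(name)
```

Because the date is zero-padded and written year first, a plain text sort on the date portion gives the same order as a true date sort, which is why this format is convenient in shell scripts as well.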
Concatenate Reports
Concatenating reports together means fewer loads (and less overhead, as each load can represent up to 10k in metadata). It also means you'll get better compression for storage. Depending on the volume of data for a particular report, you may be able to group reports together by month -- and this also works perfectly with the point above, keeping groups of data with similar dates together inside database tables.
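As a rough sketch of grouping by month, the following Python concatenates daily export files into one file per report per month, oldest first. It assumes the hypothetical `REPORT_YYYY-MM-DD.dat` naming convention and assumes the report format tolerates plain byte-for-byte concatenation; verify that for your own data before relying on it.

```python
import shutil
from collections import defaultdict
from pathlib import Path

def group_by_month(paths):
    """Group files named <report>_<YYYY-MM-DD>.dat by (report, 'YYYY-MM')."""
    groups = defaultdict(list)
    # Sorting by name keeps each report's files in date order.
    for path in sorted(paths, key=lambda p: p.name):
        report, day = path.stem.rsplit("_", 1)
        groups[(report, day[:7])].append(path)
    return groups

def concatenate_months(paths, out_dir):
    """Write one <report>_<YYYY-MM>.dat file per group; return the new paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for (report, month), members in sorted(group_by_month(paths).items()):
        target = out_dir / f"{report}_{month}.dat"
        with target.open("wb") as out:
            for member in members:
                # Copy bytes unchanged so EBCDIC data is not altered.
                with member.open("rb") as src:
                    shutil.copyfileobj(src, out)
        written.append(target)
    return written
```

A call such as `concatenate_months(Path("export").glob("*.dat"), "monthly")` (both directory names are placeholders) would leave the original files untouched and write the monthly files alongside them.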
Produce output in manageable batches
When producing output, remember that you'll likely need to transfer this data between systems, possibly across the network, and onto different operating systems. Different archiving and compression tools have limitations (65,535 files for .zip archives without the ZIP64 extensions, and 2GB file size limits for older versions of gzip and bzip2), and you don't want to lose too much time or effort if a file transfer is interrupted. It's best to produce manageable, similarly-sized batches that you can use to develop Applications, test loads, and promote from your Development to Quality Assurance ("QA") and Production Servers.
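A simple way to get similarly-sized batches is to walk the files in load order and start a new batch whenever a size or file-count limit would be exceeded. The sketch below is one possible approach; the limits and file names are placeholders to tune for your own tools and network.

```python
def make_batches(files, max_bytes, max_files):
    """Split (name, size) pairs into batches without reordering them.

    A new batch starts when adding the next file would exceed max_bytes
    or when the current batch already holds max_files entries. A single
    file larger than max_bytes is placed in a batch of its own.
    """
    batches = []
    current, current_bytes = [], 0
    for name, size in files:
        too_big = current_bytes + size > max_bytes
        too_many = len(current) >= max_files
        if current and (too_big or too_many):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        batches.append(current)
    return batches

# Hypothetical files with sizes in bytes, already in chronological order.
files = [("a.dat", 600), ("b.dat", 500), ("c.dat", 400), ("d.dat", 2000), ("e.dat", 100)]
for number, batch in enumerate(make_batches(files, max_bytes=1000, max_files=10), start=1):
    print(number, batch)
```

Keeping the files in their original order means each batch still covers a contiguous date range, so batching doesn't undo the chronological grouping described earlier.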
Order of Operations
In order to find problems with reports as quickly as possible, follow these steps in order:
- Build a test / development Content Manager OnDemand Server
- Make sure you have some extra temporary storage space to queue up incoming report data
- Begin delivering duplicates of the report data to CMOD, in its original format.
- Make some test Application Groups and get some practise indexing these reports, and figuring out any strange or non-standard report types.
- Do your research -- everything under 'Considerations'.
- Document your new structure - create new Application Groups, select the reports that will belong to them, and identify your sample data.