Tuesday, March 18, 2008

Monitoring OLAP Builds

Recently I was working on a project where the customer’s server was in out South Africa office and I was running various build configurations testing data loading performance. In the past when I have been monitoring build times I always used SQLDeveloper’s excellent auto refresh facility. Firstly, it is worth downloading the “Scripts for OLAP DBAs” created by Jameson White (who also contributes to this blog and is very active on the Oracle OLAP Wiki).

http://www.oracle.com/technology/products/bi/olap/OLAP_DBA_scripts.ZIP

These scripts are extremely valuable during tuning exercises. Jameson also provided me with two additional scripts for monitoring the XML_LOAD_LOG table, which holds all the build messages generated during a data load process. In the SQLDeveloper tree below you can see all these scripts



(Note, the OWB team has also provided a set of predefined scripts/reports as well). Here is the main report for the XML_LOAD_LOG table that generates a report based on the whole table:



The code for this report is here:

select XML_LOADID as "Load ID"
, XML_RECORDID as "Record ID"
, XML_AW as "AW"
, XML_DATE as "Date"
, TO_CHAR(XML_DATE, 'HH24:MM:SS') as "Actual Time"
, substr(XML_MESSAGE, 1, 9) as "Message Time"
, substr(XML_MESSAGE, 9) as "Message"
from olapsys.xml_load_log order by 1 desc, 2 desc

The other report allows you to focus on a single job, which is passed to the report as a parameter:



and the code for this report is here:

select XML_LOADID as "Load ID"
, XML_RECORDID as "Record ID"
, XML_AW as "AW"
, XML_DATE as "Date"
, TO_CHAR(XML_DATE, 'HH24:MM:SS') as "Actual Time"
, substr(XML_MESSAGE, 1, 9) as "Message Time"
, substr(XML_MESSAGE, 9) as "Message"
from olapsys.xml_load_log
where XML_LOADID = :i_LoadId
order by 1 desc, 2 desc

Once you have installed the reports into SQLDeveloper you can then use the auto refresh feature to keep each report up to date. SQLDeveloper lets you set the refresh rate for 5, 10, 15, 20, 25, 30, 60, or120 seconds



This works well if you have a good connection to your remote server and most importantly you can stay connected during the build process. Unfortunately, my connection to South Africa was very slow and sometimes it was necessary to run a job in background mode and just disconnect and walk away. Which causes a problem of know when the build has actually completed?

To help resolve this I created some utilities to help resolve this particular issue. Using the utl_smtp package that is part of the database I created a routine that would scan the XML_LOAD_LOG table and once a build was complete it would send me an email.

For ease of use I created three procedures to monitor:
  • Dimension builds
  • Cube builds
  • AW builds
Depending on what is being monitored the title and body of the email changes accordingly. For example when monitoring a dimension build:

The email that is sent has the title:

Data Load for PRODUCTS finished at 15:43:41

And the message body can contain either just three simple lines:

07-MAR-08 14:03:20 Started Loading Dimension Members for PRODUCTS.DIMENSION (1 out of 1 Dimensions).
07-MAR-08 14:03:20 Started Loading Dimension Members for PRODUCTS.DIMENSION (1 out of 1 Dimensions).
Total Time 00:00:00


Or the body of the message can contain the complete XML_LOAD_LOG for that job, for example:

07-MAR-08 14:03:33 Completed Build(Refresh) of SH_OLAP.SH_AW Analytic Workspace.
07-MAR-08 14:03:22 Finished Updating Partitions.
07-MAR-08 14:03:21 Started Updating Partitions.
07-MAR-08 14:03:21 Finished Loading Dimensions.
07-MAR-08 14:03:21 Finished Loading Attributes.
07-MAR-08 14:03:21 Finished Loading Attributes for PRODUCTS.DIMENSION. 6 attribute(s) LONG_DESCRIPTION, PACK_SIZE, SHORT_DESCRIPTION, SUPPLIER_ID, UNIT_OF_MEASURE, WEIGHT_CLASS Processed.
07-MAR-08 14:03:21 Started Loading Attributes for PRODUCTS.DIMENSION (1 out of 1 Dimensions).
07-MAR-08 14:03:21 Started Loading Attributes.
07-MAR-08 14:03:21 Finished Loading Hierarchies.
07-MAR-08 14:03:21 Finished Loading Hierarchies for PRODUCTS.DIMENSION. 1 hierarchy(s) STANDARD Processed.
07-MAR-08 14:03:20 Started Loading Hierarchies for PRODUCTS.DIMENSION (1 out of 1 Dimensions).
07-MAR-08 14:03:20 Started Loading Hierarchies.
07-MAR-08 14:03:20 Finished Loading Dimension Members.
07-MAR-08 14:03:20 Finished Loading Members for PRODUCTS.DIMENSION. Added: 0. No Longer Present: 0.
07-MAR-08 14:03:20 Started Loading Dimension Members for PRODUCTS.DIMENSION (1 out of 1 Dimensions).
07-MAR-08 14:03:20 Started Loading Dimension Members.
07-MAR-08 14:03:19 Started Loading Dimensions.
07-MAR-08 14:03:19 Attached AW SH_OLAP.SH_AW in RW Mode.
07-MAR-08 14:03:19 Started Build(Refresh) of SH_OLAP.SH_AW Analytic Workspace.
Total Time 00:00:14

The code will monitor both foreground and background jobs but for the background jobs access to the scheduler is required. For both types of job access to DBMS_LOCK.SLEEP function to allow the code to continuously loop while the job continues to process. During each loop of checking to see if the specified load has completed the DBMS_LOCK.SLEEP forces the monitoring process to sleep for 60 seconds (if anyone knows of a better way to do this please let me know).

Overview of the Code
The monitoring code is split into two packages with associated procedures:
  • AW_Monitor
    • Dim_Build
    • Cube_Build.
    • Aw_Build
    • Send_Complete_Log
    • Send_Mail
  • Monitor_Sched_Process
    • Create_Job
    • Drop_Job
Dim_Build
This procedure monitors the build process for a dimension, looking for the string 'Finished Loading Members for ' to determine if the build has completed.

Cube_Build

This procedure monitors the build process for a dimension, looking for the string 'Finished Auto Solve for Measure' to determine if the build has completed.

Aw_Build

This procedure monitors the build process for a dimension, looking for the string 'Completed Build(Refresh) of ' to determine if the build has completed.

Send_Complete_Log

Emails the complete log file for a build

Send_Mail

This procedure sends an email containing just the Start and End messages from the build being monitored

Create_Job

This procedure creates a new job within DBMS_SCHEDULER but does not enable the job

Drop_Job

This removes the job from DBMS_SCHEDULER


What to Monitor?
There are three monitoring options:
  • Dimension
  • Cube
  • AW
All this really does is determine the definition of the string used in the search criteria within the XML_MESSAGE column. If you want to know when a specific dimension has completed its refresh then use DIM_BUILD procedure. If you want to know when a specific cube has completed its refresh then use CUBE_BUILD procedure. The final procedure, AW_BUILD, monitors the refresh of the AW, which can be useful if you are refreshing lots of cubes and/or dimensions within a single job.

How to Monitor a Foreground Job
Monitoring a foreground job is relatively easy. If you want to run a job that maintains a dimension called product and then have an email sent once the refresh of the dimension has completed then you would use the DIM_BUILD procedure. The parameters for each procedure are much the same. You need to provide:
  • Schema name
  • AW Name
  • Object name (dimension name or cube name)
  • Report Type (Summary or Full)
  • Job Name (if the monitor process is being scheduled)
So the command would be as follows:

EXEC AW_MONITOR.DIM_BUILD('SH_OLAP', 'SH_AW', 'PRODUCTS', 'SUMMARY', null);


How Monitor a Background Job It is important to schedule the monitoring of XML_LOAD_LOG to start after the AW job has started. Therefore, you need to set the time passed to CREATE_JOB procedure to a point in time after the BuildDate details in the AW XML script.
  • Job Name
  • Script to run
  • Date and Time to run
  • Job Description
So the command would be as follows:

exec monitor_sched_process.create_job('MONITOR_PROD_1','aw_monitor.dim_build(''SH_OLAP'',''SH_AW'', ''PRODUCTS'', ''SUMMARY'', ''MONITOR_PROD_1'')', '07-MAR-2008 15:46:00', 'Starts the monitor of PRODUCTS dimension build');

It is important to schedule the monitoring of XML_LOAD_LOG to start after the AW job has started. Therefore, you need to set the time passed to CREATE_JOB procedure to a point in time after the BuildDate details in the AW XML script.

You can review the job details via the Scheduler Jobs page in Enterprise Manager. Here you can see an AW build process is scheduled to run at 3:45 and the monitoring job, ‘MONITOR_PROD_1’, is scheduled to run at 3:46.



You can use the features in Enterprise Manager to halt the job at any point in time via the delete button. Once the job itself has completed, i.e. the email is sent, the job is stopped and removed from the job queue.

Possible Code Changes
Before running the code you may need to change the recipient, from, and mail server details in the AW_MONITOR package. Each of the monitoring procedures has a call to SEND_MAIL procedure that includes the name of the “To” part of the email. This would need to be changed unless you want to send all your emails to me.

send_mail('keith.laker@oracle.com', v_title, v_body);

The procedure SEND_MAIL has the following lines that need to changed

msg_from VARCHAR2(50) := 'keith.laker@oracle.com';
mailhost VARCHAR2(30) := 'mail.oracle.com';


The Code

A note of caution - I am not a brilliant PL/SQL coder, therefore, I am sure most of the code I have created can be improved. I am not going to post all the code here, as I suspect it will cause problems. At the moment I cannot find a convenient location to host the Zip file containing the two PL/SQL packages, therefore, if you want the code send me an email (keith.laker@oracle.com) and then I will send you the zip file.

There are two basic packages:
  • AW_MONITOR
  • MONITOR_SCHED_PROCESS
The AW_MONITOR package contains the following procedures:
  • Dim_Build - This procedure monitors the build process for a dimension, looking for the string 'Finished Loading Members for ' to determine if the build has completed.
  • Cube_Build - This procedure monitors the build process for a dimension, looking for the string 'Finished Auto Solve for Measure' to determine if the build has completed.
  • Aw_Build - This procedure monitors the build process for a dimension, looking for the string 'Completed Build(Refresh) of ' to determine if the build has completed.
  • Send_Complete_Log - Emails the complete log file for a build
  • Send_Mail - This procedure sends an email containing just the Start and End messages from the build being monitored
Example code:
monitoring a build for a specific dimension. The parameters are schema, AW Name, Dimension Name, email type:
exec aw_monitor.dim_build('SH_OLAP', 'SH_AW', 'PRODUCTS', 'SUMMARY');

monitoring a build for a specific cube. The parameters are schema, AW Name, Cube Name, email type:
exec aw_monitor.cube_build('SH_OLAP', 'SH_AW', 'SALES', 'SUMMARY');

monitoring a build for a specific AW. The parameters are schema, AW Name, Cube, email type
exec aw_monitor.aw_build('SH_OLAP', 'SH_AW', 'SUMMARY');

The above procedures all finish by sending an email to the specified recipients, where the body of the email can either be a summary of the build (just the start and end times) or the complete build log for the dimension, cube or AW.

exec send_mail('keith.laker@oracle.com', 'Data Load for PRODUCT finished at 12:00pm', 'Start at...., Finished at.....);

exec send_complete_log(795, 'keith.laker@oracle.com', 'Data Load for PRODUCT finished at 12:00pm');

The result is an email delivered to your Inbox:

Then depending on the parameter that controls the body of the email, the body will contain either the complete log from XML_LOAD_LOG table:

or simply a summary containing the start and end times for the build process:


The MONITOR_SCHED_PROCESS package contains the following procedures:
  • create_job - This procedure creates a new job within DBMS_SCHEDULER but does not enable the job
  • drop_job - This removes the job from DBMS_SCHEDULER

Example code:

exec monitor_sched_process.create_job('AW_DIM_MONITOR',aw_monitor.dim_build('SH_OLAP','AH_AW', 'PRODUCTS', 'AW_DIM_MONITOR', 'SUMMARY'), '23-JAN-2008 12:15:00', 'Starts the monitor of the PRODUCTS dimension build');

exec monitor_sched_process.drop_job('AW_DIM_MONITOR');

No comments: