Monday, December 29, 2008

Now Available! Two new Oracle OLAP Demonstrations

Two new Oracle OLAP demonstrations have been added to the Oracle OLAP product page on OTN:

Fast Answers to Tough Questions Using Simple SQL - Oracle OLAP is a world-class analytic engine embedded in the Oracle Database. OLAP cubes and dimensions are easily accessible through a star schema style model. Using very simple SQL, Oracle OLAP delivers fast answers to tough analytic questions. This demonstration shows how to query OLAP cubes using several tools, including Oracle Business Intelligence Enterprise Edition, Application Express and SQL Developer.

Transparently Improving Query Performance with Oracle OLAP Cube MVs - Oracle OLAP cubes may also be deployed as materialized views. Summary queries written against the base fact tables can transparently leverage the fast query performance delivered by Oracle OLAP - without any changes to the application's query. The Oracle Optimizer automatically rewrites queries to the cubes when appropriate. This demonstration shows how Oracle Business Intelligence Enterprise Edition seamlessly benefits from this capability, and then provides an "under the covers" view of how the improvement is achieved.
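
For those who have not seen query rewrite to cube MVs in action, the effect is the kind of thing sketched below (the table names, columns and parameter settings are purely illustrative, not taken from the demonstration): the query is written against the relational star schema only, yet the optimizer can answer it from the cube.

-- Illustrative sketch only: hypothetical SALES_FACT / TIME_DIM star schema.
-- Cube MV rewrite typically needs rewrite enabled and TRUSTED (or STALE_TOLERATED) integrity.
ALTER SESSION SET query_rewrite_enabled = TRUE;
ALTER SESSION SET query_rewrite_integrity = TRUSTED;

SELECT t.calendar_year, SUM(f.sales_amount) AS sales
FROM   sales_fact f
JOIN   time_dim   t ON f.time_id = t.time_id
GROUP  BY t.calendar_year;

-- Checking the execution plan (EXPLAIN PLAN / DBMS_XPLAN) shows whether the
-- cube materialized view, rather than SALES_FACT, was used to answer the query.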

Tuesday, December 23, 2008

Oracle OLAP Newsletter - December 2008

The latest Oracle OLAP newsletter, December 2008, has been posted to the OLAP Home Page on OTN and is available by clicking here.

As usual, it contains very useful information about what is happening in the world of Oracle OLAP and includes regular features such as the OLAP skills corner and DBA tips, as well as useful links for those wanting to download the software or get training or assistance.

The featured customer this time is JD Sports in the UK. This is an excellent example of Oracle OLAP being used as part of a wider Data Warehouse solution, and a significant endorsement of Oracle's Data Warehouse strategy of bringing smart, embedded analytics to the data. It also highlights the benefits of an embedded OLAP server: JD Sports has been able to leverage other advanced Oracle technologies such as Real Application Clusters (to deliver scalability and availability), whilst additionally benefiting from many of the features taken for granted by Oracle's RDBMS customers (DW-integrated security, storage, backup, transaction control, etc.) but often not so easily delivered by stand-alone OLAP engines.

This customer feature is also a good example of the scalability of the Oracle OLAP engine itself. Whilst far from being the largest implementation (there are customers managing several terabytes of data in Oracle OLAP cubes), this example shows how relatively large volumes of DW data can be loaded and aggregated in Oracle OLAP, and how the cube compression and partitioning features first introduced in 10g OLAP have completely changed the game compared to what was previously possible. Taken in isolation, loading 300 million source records is not a major achievement (not for Oracle OLAP anyway); neither is having a 10-dimensional cube, nor aggregating across 28 hierarchical levels. What is more impressive is doing all three of these things combined in a single cube (which is only one cube out of a total of six), and still being able to deliver all of the key benefits you would associate with a well-implemented OLAP system - fast query performance and lots of advanced calculations (literally hundreds in this case) serving a reasonably sized user community.

All things considered, it is easy to see why Oracle OLAP is a key, strategic component of the Oracle Data Warehouse platform.

Let's hope for some more customer features in the near future.

Greetings of the season to everyone!

BTW - you can have the OLAP newsletter sent directly to your email inbox each quarter by following the link at the top of the current newsletter (Unsubscribe/Subscribe to this Newsletter).

Sunday, December 21, 2008

Get hands-on with 11g OLAP

You may have already noticed, but over the past couple of weeks some new 11g OLAP training material has been published on the OTN OLAP home page.

Two new tutorials have been added to the popular Oracle By Example (OBE) series.

The first is titled 'Building OLAP 11g Cubes' and covers using Analytic Workspace Manager (AWM) 11g to build and load an OLAP cube.

The second is titled 'Querying OLAP 11g Cubes' and is a guide to querying a cube via SQL, both directly using OLAP Cube Views, and indirectly using Cube Materialized Views.

Supporting both tutorials is a new sample schema, which gives you the opportunity to get hands-on and experiment in your own environment. Remember that patch level 11.1.0.7 is required, and always check the recommended release details for your chosen operating system.

Tuesday, December 16, 2008

ComputerWeekly.com : E.on transforms financial insight with Oracle OLAP Option

http://www.computerweekly.com/Articles/2008/12/11/233863/e.on+transforms+financial+insights+with+bare-knuckle.htm

The article above, published last week in ComputerWeekly, follows an earlier customer profile posted on OTN. E.on, one of Europe's leading energy suppliers, has gone live with a financial transformation software project that will give it an accurate view of the profits generated by each customer and provide better forecasting of future demand. E.on has accounts with over eight million customers in the UK. To quote directly from the project leader Lawrence Edwards: "It (accurate forecasting of demand, and, therefore, margin) is a massive problem because of the data volumes and complexity. Others have tried and failed, but we persevered."

It provides excellent proof points for the use of Oracle OLAP to solve business problems that require rapid data loading and analysis of large data volumes. The E.on system is currently operating at about 6Tb and is growing rapidly, with 150Gb of new data added every 14 hours. There is a great quote from Lawrence in the OTN PDF explaining what they have achieved using Oracle OLAP Option:

"The analytical power, centralized administration, and scalability of Oracle OLAP have allowed us to process and present data in a way that was not previously possible. This has provided us with an unprecedented depth of understanding of customers’ energy use and the demand for all our products and offerings"

New 11g OLAP Cube Materialized Views tutorial posted onto OTN

Another new tutorial has been added to the Oracle OLAP home page on OTN.

This tutorial is titled 'Oracle OLAP 11g: Setting Up Cube Materialized Views for Query Rewrite'

The tutorial describes how to enable cubes as Cube Materialized Views, and how to enable and troubleshoot Query Rewrite using Analytic Workspace Manager 11.1.0.7. It is intended as a quickstart for intermediate developers.

Monday, December 8, 2008

Oracle Database 11g: OLAP Essentials - First dates announced

Following the announcement last week about the new Oracle OLAP 11g Oracle University Training Course, the dates and locations for the first classes have been announced.

The very first class will be in Bridgewater, New Jersey, US from 20-Jan-2009 through to 22-Jan-2009.

The first class in Europe will be in Reading, UK from 21-Jan-2009 through to 23-Jan-2009.

The code for the course is D70039GC10 and more details on both events can be found on the Oracle University web site

Be sure to register early if you wish to attend, as places are sure to be in high demand.

Thursday, December 4, 2008

New! Oracle OLAP 11g Oracle University Training Course

A brand new OLAP 11g training class has been added to the Oracle University schedule.

Here is a brief synopsis:

Oracle OLAP 11g, a fully-integrated component of Oracle Database 11g, provides a full featured multidimensional data model and calculation engine that is easily accessible to any SQL based business intelligence application or tool.

In this course, students learn to progressively build an OLAP data model to support a wide range of business intelligence requirements. Students learn to design OLAP cubes to serve as a summary management resource for existing SQL table queries. Students also learn to leverage the power of Oracle OLAP by adding rich analytic content to your data model.

Students learn to create sophisticated reports of OLAP data by using simple SQL queries. Students also create and execute OLAP queries in SQL Developer, Oracle Application Express (APEX), and in Oracle BI Enterprise Edition. Students learn to implement cube security, including how to authorize access to cube data and methods for scoping user views of data. Finally, students learn to design OLAP cubes for performance and scalability.

Learn To:

* Design and create an Oracle OLAP data model
* Enable query rewrite to OLAP Cube MVs for relational summary management
* Easily create OLAP calculations that enrich the analytic content of your data model
* Query OLAP data using simple SQL
* Implement cube security
* Efficiently design cubes for performance and scalability

More details and scheduling information can be found on the Oracle University Website

Tuesday, November 25, 2008

New 11g OLAP tutorial posted onto OTN

A new tutorial has been added to OTN.

The tutorial is aimed at newcomers to Oracle OLAP and is a guide to creating and populating an 11g OLAP cube.

This is perfect for people who are looking for a gentle introduction to the Analytic Workspace Manager OLAP administration tool and who want to understand the basic steps in building an 11g OLAP cube.

Tuesday, October 21, 2008

New article on 11g OLAP Cube-Organised Materialized Views published onto OTN

Oracle ACE Director Arup Nanda has published a series of articles on OTN covering important new features in Oracle Database 11g, titled 'Oracle Database 11g: Top Features for DBAs and Developers'.

The series includes a feature on 'Data Warehousing and OLAP' which looks at how Cube-Organized Materialized Views can be implemented alongside other features to deliver a compelling platform for data warehousing.

With Oracle's data warehousing proposition featured heavily in the news at the moment, following the recent Oracle Open World announcement of the availability of the Exadata Storage Server and Database Machine, this is a well-timed reminder that Oracle OLAP is a core part of that data warehousing proposition.

Friday, October 10, 2008

New Oracle OLAP White Paper released to OTN

A new Oracle OLAP white paper has been released to OTN titled "Using Oracle Business Intelligence Enterprise Edition with the OLAP Option to Oracle Database 11g"

This contains a guide on how to configure the OBIEE metadata layer to leverage the 11g Oracle OLAP option, both indirectly via OLAP cube based materialized views, and directly via OLAP cube views. For those working with 10g OLAP (cube views only), the best guide to configuring OBIEE is found in the online tutorial on OTN.

Personally, I think that it is great that this white paper captures an explanation of how to write SQL that is optimised for Oracle OLAP cube views. This is something I find customers initially struggle with - they write what they believe is a simple query and then cannot understand why the performance is not good.

This is because there are a few golden rules to writing optimal SQL for OLAP cube views, and while they are simple to understand, they are not obvious to those who are new to the technology. I hope to write a more detailed blog entry on this subject very soon, but for the time being take a look at the white paper (particularly pages 10 & 11) to see what I mean.
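
To give a flavour of the kind of thing involved (this is my own illustrative sketch with made-up view and column names, not an example from the paper): because the cube already holds the aggregated data, you generally select from the cube view joined to its hierarchy views and filter on the level you want, rather than re-aggregating detail rows yourself.

-- Illustrative names only - see the white paper for the real worked examples.
SELECT t.calendar_year,
       c.sales
FROM   sales_cube_view    c
JOIN   time_calendar_view t ON c.time = t.dim_key
WHERE  t.level_name = 'CALENDAR_YEAR';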

More 11.1.0.7 ports now available

The Oracle Database 11g Release 11.1.0.7.0 Server Patch has now been released for several other ports, including Windows 32-bit.

The full list of operating systems currently supported is:
  • Linux x86
  • Linux x86-64
  • Solaris (SPARC) (64-bit)
  • IBM AIX (64-bit)
  • HP-UX Itanium
  • Microsoft Windows (32-bit)
I would recommend that all 11g OLAP users apply the 11.1.0.7 patch as soon as it is available for their operating system.

I would also recommend that all 11g OLAP users upgrade their AWM client to the 11.1.0.7A release which can be downloaded from Metalink or OTN

As always, the best source of information for recommended releases and patches is the Oracle OLAP certification page on OTN

I'm now going to download the Windows 32-bit patch (all 1.5GB of it!), and give it a roadtest....

Monday, October 6, 2008

Analytic Workspace Manager 11.1.0.7A released to Metalink

The 11.1.0.7A version of AWM has been released to Metalink as patch 7420490

It includes important fixes and new features including:
  • the ability to add multiple languages to a single analytic workspace
  • individual aggregation definitions may now be defined for each measure of a cube
  • the Create Dimension user interface has been modified to allow levels of the dimension to be created at the same time as the dimension
  • the functionality of dimension and cube mapping has been enhanced to allow the application to refresh the definitions of database objects interactively to reflect the current state of database schema tables

To take advantage of all the new fixes and features, the Oracle Database 11g Release 11.1.0.7.0 Server Patch must be installed as well. This is currently only available for Linux 32-bit & Linux 64-bit, but other ports are likely to be available soon.

Monday, September 29, 2008

Oracle OLAP Newsletter - September 2008

The latest edition of the excellent Oracle OLAP Newsletter has just been released here

Highlights this time include a customer feature on Oss Council in the Netherlands who use Oracle OLAP for management reporting, details of the new 11.1.0.7 release, and a guide to delivering summary management through cube materialized views.

Sunday, September 28, 2008

Oracle Open World 2008 - HP-Oracle Database Machine launch

The big news at Oracle Open World last week was that Oracle CEO Larry Ellison used his keynote, entitled "Extreme. Performance." to talk about Data Warehousing and launch the HP-Oracle Database Machine featuring the innovative new Oracle Exadata storage server.

You can learn all about these exciting new products on the oracle.com web site, so I won't bore you with that here, except to tell you that the performance of the Database Machine that we have seen from our Beta test customers is deeply impressive (the "10x" claims in the ads are pretty conservative from what I've seen).


The market reaction to the news that Oracle, together with HP, is now offering a 'DW Appliance' with superior performance, based on proven hardware components and including Oracle Database 11g, has reflected a common view that I heard while answering questions from customers at Open World: that the rationale for purchasing one of the niche vendors' DW Appliances, with their narrow sweet spot and less capable database software, is now more questionable than ever.

But why would anyone say that? Surely 'performance' is a game of leap-frogging, and if/when one of the DW Appliance vendors releases a newer faster machine before HP-Oracle does, won't that mean that the Oracle Database Machine's advantage is short-lived?

Not at all.

The HP-Oracle Database Machine is a DW Appliance like no other. This is because Oracle Database has a depth and breadth of functionality and power that none of the others can claim. When you purchase a Database Machine you get Oracle Database EE, Real Application Clusters and Partitioning pre-installed and preconfigured, with ASM being used to manage the storage grid. All of the features included there are available for use as soon as you plug in your new machine. But you can leverage even more than that, of course.


And this is a key point - all the capabilities, features and options available in Oracle Database 11g are available on the Database Machine. Security, High Availability, Manageability, support for a Mixed Workload, and Embedded Analytics - all of it. And because this is standard Oracle Database, it is easy to run any of your applications on it - no specialised knowledge of rarely used niche RDBMSs required. Anything that runs on your existing Oracle servers will run on the Database Machine without change.

This includes OLAP, of course. Oracle Database OLAP is there too - pre-installed along with the rest of Oracle Database EE. You just need to license it for use on the DB Machine when you choose to use it. And this is great news, as it adds multidimensional calculation and analysis sophistication to an already awesome piece of kit.

The majority of the Oracle OLAP Option customers I have met who report disappointing performance turn out to be I/O bound on their servers. That is, the server and storage they are using are out of balance, constraining the ability of the Database to process the data. The HP-Oracle Database Machine (like some of the Optimized Warehouses also available from Oracle and its other hardware partners) provides excellent I/O performance, thanks to the balanced, high-speed InfiniBand interconnects between the storage and the database servers included in the machine, and is optimised for data warehousing queries across the spectrum - OLAP included. Good I/O performance translates directly into even more effective OLAP implementations.

So, the HP Oracle Database Machine is a great fit for Oracle Database OLAP Option, and other embedded analytics features of the Oracle Database, including Data Mining, SQL Analytics and Statistical Functions, and the OWB (Oracle Warehouse Builder) Data Profiling and Data Quality features. Delivering BI applications that leverage these powerful capabilities is also much easier thanks to them being pre-installed on the Database Machine.

And Oracle OLAP is a great fit for the Database Machine, too. Which leads to the other question I have been asked by a couple of people: if Database Machine makes queries really really fast, does it mean that OLAP is no longer needed? Can I just dump all the data into tables and ignore all the other optimisations for Data Warehousing that Oracle Database provides?

This question misunderstands the primary reason that people invest in OLAP systems. It is not only about performance, but also (especially) about the calculation capability, and the ease with which even the most complex of business calculations can be expressed. Many business calculations are difficult to do in SQL on regular relational tables. Some are still not even possible. And in turn, many BI tools resort to the transfer of large amounts of data across the network to mid-tier servers, or even the client, where the calcs are performed. Database Machine will allow them to pull the raw data much faster than before, but you still have a more complex architecture than you need, and network performance will be impacted. The OLAP Option makes time series, shares, indexes, ratios and so on that businesses use on all their performance dashboards really easy to define, and really efficient to process. In the Database.

As regular readers of this blog know, the OLAP Option provides sophisticated multidimensional calculation and query functionality, accessible by pretty much any tool via a simple SQL query. There are hundreds of multidimensional-aware analytic calculation functions delivered by the OLAP Option - best-of-breed capability. This in turn leads to further performance benefits, of course, but especially to phenomenal ease of use and cost of ownership improvements. If all your KPIs are calculated within the Database and surfaced to your BI tools, BI apps and dashboards as simple columns that can be SELECTed, you reduce to trivial levels the amount of work you need to do in each BI tool's metadata layer with regard to (re)defining those calculations.
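
To make that concrete, here is the sort of thing I mean (all of the view, column and measure names below are made up for illustration): calculated measures such as a prior-period change or a share of total, defined once in the AW, simply show up as extra columns in the cube view.

-- Illustrative names only: SALES is a stored measure, the other two are
-- calculated measures defined in the AW and exposed as ordinary columns.
SELECT t.dim_key  AS month,
       p.dim_key  AS product,
       c.sales,
       c.sales_chg_prior_period,
       c.sales_share_of_total
FROM   sales_cube_view      c
JOIN   time_calendar_view   t ON c.time    = t.dim_key
JOIN   product_primary_view p ON c.product = p.dim_key
WHERE  t.level_name = 'MONTH'
AND    p.level_name = 'DEPARTMENT';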

Other data warehouse vendors - appliance and non-appliance alike - cannot do this, and instead force you into further fragmenting your information asset across multiple servers and engines, with the added complications to the BI infrastructure that brings. The other vendors will require you to purchase, install, build and manage separate standalone multidimensional data marts, usually from third-party vendors. None of them have embedded analytics into the core of the database like Oracle has, and none provide simple, high-performance SQL access to the results of these analytics.

The combination of extremely fast performance for I/O-intensive queries (which characterise the work of some data warehouse users and are typically the queries targeted by the DW appliance vendors) with the multidimensional calculation power of the OLAP Option (commonly consumed by the masses via interactive dashboards and the like, as well as used by analyst users), all in an easy to install, pre-configured, balanced hardware platform, is very compelling.

Exciting times for Oracle BI and Data Warehousing.

Wednesday, July 9, 2008

Article : Closing the Ad Hoc Query Performance Gap for Good

I found an interesting article today amongst my Google alerts. It relates to a topic of conversation at the recent TDWI conference in May about the issue of ad hoc query performance within data warehouse environments. The article, on the whole, was very good (in my opinion) and included comments by Oracle's vice-president of database marketing, Willie Hardie.

The full article in Enterprise Systems is available here.

The article examines the reasons why business users feel their queries are taking too long and what steps companies are taking to try to improve query performance. The basic reasons for poor query performance were given as two issues:
  • success of pervasive BI - so more users are running ad hoc queries
  • more data - users want access to more and more data, both in terms of level of detail and time span.
I would add a third reason: increasing sophistication. Simply presenting users with reports that show revenue and expenses for the latest month, quarter and year to date is not adequate in today's highly competitive environment. Business users want to know about trends - this period vs last period, this year vs last year, this period vs the same period last year, like-for-like, shares, ranks, forecasts, customer segmentation, market basket analysis. The list goes on and on and on.

Here is a direct quote from the article:

...The RDBMS -- or, more specifically, the Oracle RDBMS -- is an unmatched analytic workhorse, Hardie argues. Oracle is one of the biggest data warehousing players in the business, he points out, and the Oracle database powers some of the largest DWs in existence.

"The Oracle database is proven to be the fastest database out there for both transactional systems and data warehousing systems, across all scales, from small to extremely large systems. You ask any Oracle customer out there and they'll all give you the same answer: Oracle is the fastest database out there on the market right now," he claims.

What I would add is that the Oracle Database is the only database with embedded multidimensional OLAP, which is fine-tuned for ad hoc query performance, query scalability and, most importantly, calculation power. As I stated above, it is no longer simply about what happened in the last trading period; BI analysis is now all about comparisons, trends and KPIs (calculations) at aggregate levels, with the ability to drill right through to the lowest level of detail.

As the article quite clearly acknowledges, there is a trend towards opening up the corporate data warehouse to more and more users, which means query performance and scalability are becoming increasingly important. Only Oracle Database has specific built-in optimisations, such as OLAP and data mining, to meet these growing requirements.

For those of you new to Oracle OLAP Option and Oracle Data Warehousing you can get more information from these links:

Oracle OLAP Option on OTN
Oracle OLAP Option Forum on OTN
Oracle OLAP Option Wiki
Oracle OLAP Option on Oracle.com
Oracle Data Warehousing on OTN
Oracle Data Warehousing on Oracle.com
Optimised Warehouse Initiative

Monday, June 2, 2008

Best Practice Tips : SQL Access to Oracle DB Multidimensional AW Cubes (#2)

One of the most useful features introduced with Oracle Database OLAP is the ability for the powerful multidimensional calculation engine and the performance benefits of true multidimensional storage in the Analytic Workspace (AW), to be accessed and leveraged by simple SQL queries.

This single feature dramatically increases the reach and applicability of multidimensional OLAP – to a vast range of BI query and reporting tools, and SQL-based custom applications that can now benefit from the superior performance, scalability and functionality of a first class multidimensional server, but combined within the Oracle Database with all the other advantages that derive from that.


This post is the second in a series that I will use to share some general best practice tips to get the most out of this feature, so that you can deliver even better solutions to your business end-users:

Best Practice Tip #2: General AW Object Naming Conventions for dimensions, levels, hierarchies and attributes…(Oracle Database 10g and 11g)

The following advice will result in relational views over your AW that are much easier to understand and use. It makes the implementation much cleaner to visualise, and easier for other users to understand what they are looking at. It also saves a lot of typing for developers who are writing their own SQL queries!

The objective is to ensure that the generated column names in your views are easy to read, and also to avoid the possibility that generated column names may get truncated to fit within the limits for a column name in Oracle Database (when that happens your views get really ugly really quickly). Finally, it has the additional desirable side effect of making it easier and therefore quicker to do the mappings in AWM because the screens are less cluttered with long-winded object names!

Note: this advice applies both to Oracle Database 10g OLAP (eg views created by the AWM10g View Generator Plug-in) and to Oracle Database 11g OLAP, where views are auto-generated (eg when creating your Standard Form AW via AWM11g).

Here is the idea:
  • Keep the names used for dimensions, levels, hierarchies, and attributes as short as possible, while still meaningful of course.
  • If possible (simply for readability in the resulting relational view and column names), avoid the use of the "_" character, especially in dimension, hierarchy, level and attribute names.
  • If possible (also recommended if Oracle OLAP API clients such as OracleBI Spreadsheet Add-in, OracleBI Discoverer Plus OLAP and OracleBI Beans will be used on the same AW), create the AW in its own schema.

Don't be seduced into thinking it is a good idea to put "DIM" in the name of everything that is a dimension, or "ATT" into the name of all the attributes. You don't need to do this. The AW knows which objects are what, and you can very simply query the AW if you need, for example, to find out the names of all the dimensions in an AW. (Another topic for another day is to walk through all the Data Dictionary stuff that helps with this.)
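
In 11g, for example, the OLAP data dictionary views make this a one-liner (a minimal sketch against USER_CUBE_DIMENSIONS; check the exact column list in your release):

-- List the dimensions defined in the current schema's analytic workspaces.
SELECT dimension_name
FROM   user_cube_dimensions
ORDER  BY dimension_name;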

In other words: If you have a Product Dimension, it is self-evidently a dimension, so clogging up its name with "_DIM" or "_DIMENSION" is just extra wear and tear on your keyboard!


Example:


To illustrate the impact this advice can have, here are two Product dimensions which, apart from the fact that one follows the best practice advice and one does not, are identical (the example is from an Oracle Database 11g AW; you can click on the picture to see it full size):

First – two ways I could have created my Product Dimension:




Second – what the resulting dimension views for the Main hierarchy would look like in each case:


Third – how much harder it is to read and write the SQL to query the AW’s dimension as a result:

Which of these functionally identical examples is easier to read, easier to understand and easier to query?

I rest my case. Giving a bit of thought to the way you build your AW before you build it nearly always pays dividends later.

Saturday, May 31, 2008

Best Practice Tips : SQL Access to Oracle DB Multidimensional AW Cubes (#1)

One of the most useful features introduced with Oracle Database OLAP is the ability for the powerful multidimensional calculation engine and the performance benefits of true multidimensional storage in the Analytic Workspace (AW), to be accessed and leveraged by simple SQL queries.

This single feature dramatically increases the reach and applicability of multidimensional OLAP – to a vast range of BI query and reporting tools, and SQL-based custom applications – BI and operational – that can now benefit from the superior performance, scalability and functionality of a first class multidimensional server, but combined within the Oracle Database with all the other advantages that derive from that. Bottom line: if you have a tool or application that can (a) connect to an Oracle Database instance, and (b) fire simple SQL at that Database, then you can get benefit from the AWs in that tool or application.

This post is the first of a series that I will use to share some general best practice tips to get the most out of this feature, so that you can deliver even better solutions to your business end-users.

If any of you have tips and advice of your own that we can share, please contact us – we’ll be happy to publish your good ideas and experience with this feature of Oracle Database OLAP.

Anyway. Enough pre-amble. Let’s get on with it. Here goes:

Best Practice Tip #1: Creating your views (Oracle Database 10g and 11g)

Basically the first tip in the series boils down to two things:

1) Always build your AWs to Oracle Database OLAP ‘Standard Form’. This is what happens if you build them with AWM, OWB (10g-only at the time of this post, but support for 11g target AWs is due in OWB very soon), or the supplied AW API if you need to programmatically build and maintain your AW.
2) Use the freeware "View Generator" plug-in for AWM10g to build your 10g views, and leverage the automatically generated views in 11g, unless you have a very good reason not to.

Together, if you follow this advice you will save a lot of time on your project, and also increase your ability to support the application going forward. And it will be a lot easier for others (such as Oracle Support, or your local friendly Oracle OLAP Consultant) to help you if you have any problems.

More detail:

In Oracle Database 10g, there is nothing to stop you coding your own views using the SQL OLAP_TABLE() function. And, if you have an entirely custom-built AW, this is pretty much your only option. However, if you have developed your AW to Oracle's OLAP Standard Form specification you can save yourself the time by using a handy dandy little plug-in for AWM10g. The plug-in is free shareware for AWM10gR2 and can be downloaded from here, with the associated ReadMe here.

The plug-in steps you through a simple wizard within AWM, allowing you to choose which measures etc. you need, and then creates the views for you (storing the biggest lump of syntax - the 'limitmap' parameter, which describes which AW objects show up in which columns of your view - inside the AW itself, in a multi-line text variable/measure).

In Oracle Database 11g, while OLAP_TABLE() is still available for you to use if you like (and sometimes it is perfect for your needs as it has lots of very clever hooks by which you can trigger various OLAP actions whenever a user selects from the view), for most cases, the new CUBE_TABLE() function added in Database 11g is much easier and therefore recommended.

CUBE_TABLE() views are what AWM11g automatically creates for you when defining the objects inside the AW. Assuming you have a valid Standard Form 11g Database AW, such as you might build in AWM11g, CUBE_TABLE() is much, much easier to use than OLAP_TABLE().

For example, the entire syntax required to create a Dimension View, for a specified hierarchy of that Dimension in an AW (not that I even have to type any of this in, as the AWM tool does it automatically for me) is as follows:

CREATE OR REPLACE FORCE VIEW MYDIM_MYHIER_VIEW AS
SELECT *
FROM TABLE( CUBE_TABLE('MYSCHEMA.MYDIM;MYHIER') );

How easy is that?!

All you need to know about your AW is the name of the hierarchy (MYHIER), the dimension (MYDIM) and the schema the AW is built in (MYSCHEMA). All the object mappings that you would have to tell OLAP_TABLE about in the limitmap parameter are handled automatically, as a result of improvements in Database 11g's Data Dictionary (which is now fully aware of the details of the contents of the AW).
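
A cube view is no harder - a minimal sketch, assuming a cube called MYSALES in the same MYSCHEMA (the name is illustrative):

CREATE OR REPLACE FORCE VIEW MYSALES_CUBE_VIEW AS
SELECT *
FROM TABLE( CUBE_TABLE('MYSCHEMA.MYSALES') );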

Here (below) is what an example Product Dimension looks like in AWM11g, and the resulting View:



Note that the OLAP Option only allows one dimension or cube (and therefore dimension view, or cube view) of a given name in each schema. For this reason, our recommendation is that each AW be built in its own schema if possible. This will allow you, if you ever need to, to have a PROD dimension or SALES cube in more than one unrelated AW. This tip will come up again in an upcoming post on best practice AW design and naming conventions.

Monday, April 21, 2008

Tuning Guidance for OLAP 10g

My assumption with this posting is that you are familiar with all the basic OLAP terms such as dimensions, levels, hierarchies, attributes, measures, cubes, etc. If this is not the case, then go to the Oracle Wiki and check out this link:

http://wiki.oracle.com/page/Oracle+Olap+Option

Most people, when they approach OLAP for the first time, create a data model that either takes too long to build or too long to query. The "too long to query" problem is usually the first to arise, and in trying to solve it they create the second problem, "too long to build". There is a balance that needs to be achieved when designing OLAP data models: the balance between pre-solving every level across all dimensions, which increases build time, and providing users with fast query performance. Most people assume there is a direct relationship between the number of levels that are pre-solved and query performance: as one goes up so does the other; pre-solve more levels and query performance improves. Therefore, the answer to poor query performance is to pre-solve all levels across all dimensions, correct? Yes and no. Most systems do not have an infinite window for building cubes. Fortunately, using the Oracle OLAP Option it is possible to balance the amount of time taken to build a cube and still ensure excellent query performance. How is this achieved?

Oracle OLAP is the most powerful and scalable OLAP server on the market. Because OLAP is inside the database it inherits all the native scalability, security and performance of the Oracle database, and it is because the database is so fast and scalable that there is a tendency to ignore certain design principles when building an OLAP data model. If the original design and methodology are sound then tuning is very quick and easy to manage. But there is no silver bullet to make OLAP go faster; as one of our OLAP gurus states: there is no "_OLAP_TURBO_MODE=YES" setting for the init.ora.

What follows is a series of recommendations and observations based on my experience on various OLAP projects to help optimize OLAP builds. This is not the authoritative guide to tuning OLAP data models; just my thoughts.

When asked to tune an existing OLAP data model I break the work up into five sections:



Tuning a data load needs to be done step by step. Trying to rush things and changing too many settings at once can simply create more problems than it solves. It is also important to start at the beginning with the hardware and only look at the database instance itself last. Most DBAs will be tempted to rip open the init.ora file and start tweaking parameters in the hope of making the build run faster.

However, the area that is likely to have the biggest impact is refining (or possibly even changing) the implementation of the logical model. When making changes that improve build performance, you should also check the impact on query performance to ensure the amount of time taken to return a query is still within acceptable limits.

Below are the steps I use when I am asked to analyse the build performance of an OLAP schema. But before you start a tuning exercise, I would recommend reading the 2-Day Performance Tuning Guide that is now part of the database documentation suite. It provides a lot of useful information and is available as an HTML document and a PDF document. The PDF can be downloaded and stored on your laptop/memory stick etc. for easy reference.


Part 1 - Analysis of Hardware
In any situation the first challenge in a tuning exercise is to ensure the foundation for the whole solution is solid. This tends to be the biggest challenge because it can involve working with a number of hardware and software vendors. Trying to make sure your environment is based on an adequate configuration can be time consuming and risky, and will probably end in a compromise between performance, scalability, manageability, reliability and, naturally, price.



Configurations can be difficult to analyse, and most of the time this analysis degenerates into each vendor in the hardware stack blaming the other vendors and/or the database.

Step 1 – Check Patches
When analysing an existing environment, make sure all the latest firmware, drivers and O/S patches have been applied. Refer to the Oracle database installation guide, Metalink, and the hardware vendors' web sites for more details.

Step 2 – Determine Workload
In a good environment you should expect to load about 1 million rows per minute via OLAP. This is the benchmark. Check the XML_LOAD_LOG table from previous builds to determine whether this is being achieved. Here is a log from a data load for the Common Schema AW, based on a relatively simple view that joins two fact tables together to load three measures. Approximately 900,000 records are loaded in 57 seconds.
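
To pull the same log yourself, something along these lines works (a sketch assuming the default OLAPSYS.XML_LOAD_LOG logging table; the column names are from memory, so DESCRIBE the table first if in doubt):

-- Assumed column names - adjust after checking DESCRIBE olapsys.xml_load_log.
SELECT *
FROM   olapsys.xml_load_log
ORDER  BY xml_loadid, xml_recordid;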



In this case, we could conclude this is a reasonable starting point for the next phase of the tuning exercise. However, don't forget that the performance initially listed in XML_LOAD_LOG can be influenced by a number of factors; but if the data source is a table or a very simple view, then 1 million rows a minute should be achievable. Anything less tends to indicate some sort of I/O issue, or possibly the use of inefficient SQL to extract data from the source.

The ADDM analysis of I/O performance partially depends on a single argument, DBIO_EXPECTED, that describes the expected performance of the I/O subsystem. The value of DBIO_EXPECTED is the average time it takes to read a single database block, in microseconds. Oracle uses a default value of 10 milliseconds, which is appropriate for most modern hard drives. If your hardware is significantly different, such as very old hardware or very fast RAM disks, consider using a different value. To determine the correct setting for the DBIO_EXPECTED parameter, perform the following steps:
  1. Measure the average read time of a single database block read for your hardware. Note that this measurement is for random I/O, which includes seek time if you use standard hard drives. Typical values for hard drives are between 5000 and 20000 microseconds.
  2. Set the value once for all subsequent ADDM executions. For example, if the measured value is 8000 microseconds, you should execute the following command as the SYS user:
     EXECUTE DBMS_ADVISOR.SET_DEFAULT_TASK_PARAMETER( 'ADDM', 'DBIO_EXPECTED', 8000);
Also review the Performance Tuning Guide, Chapter 8: I/O Configuration and Design.
Parallel vs Serial Processing
As part of this step some consideration needs to be given to parallel vs serial processing. I find most people will start by running a build in serial mode and then assume that if it takes X amount of time to process in serial mode, running in parallel will naturally take X divided by the number of parallel jobs. Of course this is true only up to a point: there is a definite tipping point in parallel processing where the law of diminishing returns sets in very quickly. As a starting point, if I am going to process a load in parallel I will start with a job queue of "number of CPUs - 1". This is usually a good starting point and, depending on where the bottlenecks start appearing (CPU waits vs I/O waits), I may increase or decrease this figure during testing.

Parallel processing is a very useful tool for improving performance, but you need to use partitioned cubes, and the data being loaded must map across multiple partition keys, for the load to actually run in parallel. As before, this is not a silver bullet that will simply make everything run faster. It needs to be used carefully.


Step 3 – Determine Best Reference Configurations
What can be useful is to work from a set of known configurations designed to provide a stated level of performance. Oracle has worked with a number of hardware vendors to provide documented configurations for data warehouse solutions. These configurations can be used as benchmarks and/or recommendations for your environment.
Each configuration combines software, hardware, storage, I/O and networking into an optimized environment for different scales of customer data warehouse requirements. Using extensive customer experience and technical knowledge, Oracle and its hardware partners have developed configurations for data warehouses with varying raw data sizes, concurrent user populations and workload complexity. Because these reference configurations are suited to different profiles, customers can select the one that best suits their business and price/performance requirements. And since they're built on scalable, modular components, these reference configurations enable customers to aggressively pursue incremental data warehouse growth.

One of the key questions for the performance tuning exercise is: are you just tuning the model based on today's data volumes, or should the exercise look to maximise the performance of future load volumes? This is a very tricky area to manage and difficult to plan and test, which is why having a referenceable configuration is so important. Why? Because the reference configurations provide a clear upgrade path, and levels of performance are certified along that upgrade path.

Step 4 – Match/Compare/Contrast with Existing Configurations
In reality, your hardware configuration is usually going to be a given: changing it is likely to be a costly and time-consuming exercise. However, it should not be ignored. If you have followed the recommendations below and tuned your data model accordingly, and the load and aggregation phase is still too long, then a full hardware review may in fact be needed and upgrades may need to be purchased. Hopefully, the net result of this whole exercise is some sort of cost/benefit report outlining the expected performance improvements against the additional hardware costs.


Part 2 - Analysis of Dimensions
The second stage is to review the logical model for the dimensions. This stage is largely to confirm the dimensions are correctly implemented and the source data is of good quality. Most of this analysis is really just making sure there are no big issues within the various dimensions, and possibly making small changes based on experience from various projects. But it is important to ensure the dimensions are of good quality before moving on to review the cubes - there is no point building a house on a sand bank and then wondering why it gets washed away. Good foundations are needed.



Step 1 - Analysis of Attributes
Many customers implement dimensions when they really only need attributes. This usually happens when they are migrating from a legacy OLAP server to Oracle OLAP. We have a customer at the moment with an existing data model in a legacy OLAP server based on 60 dimensions. Reviewing the queries the users make against the data model, it became clear that many of these dimensions were in fact simple attributes within Oracle OLAP. This can have a significant impact on the design of the related cubes and the whole loading process, since fewer dimensions within a cube will improve both load and aggregation times.

Can Oracle OLAP support extremely large dimensional models? Yes it can. The engine will support up to 256 dimensions within a single cube, and within the AW you can have as many dimensions as you need. The key point here is that each cube can have its own dimensionality. Oracle OLAP does not implement hyper-cubes where every cube has to share the same dimensionality - one of the key benefits of Oracle OLAP over other legacy OLAP engines is that it supports cubes of different dimensionality.

There is an excellent customer case study in the December 07 OLAP Newsletter that examines how one customer manages a very large data model based on lots of dimensions. The E.ON data model contains multiple cubes (subject areas), each with between 6 and 12 dimensions. The cubes are updated weekly, with many millions of rows loaded and aggregated, at about 1 million rows updated in the cube per minute. For more information, read the complete review by clicking here. There are many other customers with even bigger models.

From a UI perspective you need to think very carefully about the number of dimensions within a cube. Many UI studies have shown that business users find it increasingly difficult to interpret the results from a dataset with more than nine dimensions. Although, as the above case shows, it is possible for some users to interact with larger, more complex models providing the information is presented in a usable format, it is worth spending some time clarifying the exact dimensionality of each cube.

If you think about a typical crosstab layout, a nine-dimensional cube results in one row-edge dimension, one column-edge dimension and seven page-edge dimensions, plus the measure dimension. That is a huge amount of information to absorb and, in my opinion, makes constructing queries very difficult. Another issue that frequently occurs as the number of dimensions increases is the game of "hunt-the-data". Even with only nine dimensions in a cube, it is likely the data set will be extremely sparse, and drilling down only one or two levels across a couple of dimensions can result in crosstabs with little or no data. Some BI tools try to mask this problem by providing NA and/or zero row filters. The net result is usually a "no rows returned" message appearing in the body of the report at regular intervals.

My main recommendation is: check the number of dimensions in your model and, for the sake of your users, try to keep the number within each cube down to something intelligible - for example, approximately nine. This is not a hard and fast rule, just a recommendation, but do read the E.ON case study as well. If you are presented with a data model, do not be afraid to challenge the dimensionality of the cubes within it. Make sure all the dimensions within a cube are really required, because I can guarantee some are simply basic attributes.

Step 2 - Analysis of Level Keys
There is not much that needs to be done here, except to make sure you use surrogate keys unless you are certain the dimension keys are unique across all levels. This is not always the case, and using a surrogate key is a good way to ensure your hierarchies are correctly populated. OLAP creates a surrogate key by prefixing the original source key with the level identifier. Therefore, from a storage perspective it makes sense to keep the level identifier as short as possible. For example, don't create a level identifier such as "PRODUCT_SUB_CATEGORY_SKU_IDENTIFIER". There is a limit of 30 characters for level names. In practice I have seen issues with both data loading and aggregation where very large dimension keys (i.e. greater than 400 characters) have been created.

In practice I recommend using simple level identifiers such as L1, L2, L3, although this does make writing SQL statements a little more challenging via the SQL views, as the level identifier is used in the column name along with the dimension, and it is not exactly obvious what each column contains when they are called PRODUCT_L1, PRODUCT_L2, etc.

Surrogate vs Natural Keys
The use of surrogate keys is an interesting area. On some projects it has been found that build performance improves when surrogate keys are not used. This makes sense, since the source data for the cube has to be reformatted at load time to ensure the key is valid. In some cases the amount of time required to manipulate the incoming key values may be minimal; in other cases it has had a significant impact on load performance - the "1 million rows a minute" benchmark was not achieved, and reverting to natural keys did improve load times. If the data source can be guaranteed to provide unique keys across all levels, it is probably worth switching to natural keys. But be warned - you cannot switch between surrogate and natural keys if the cube already contains data.

Step 3 - Quantitative Analysis of Members
This is an important step, as it will allow us to determine which levels to pre-aggregate within the cube. In most cases the default skip-level approach to pre-solving levels within a cube is a reasonable starting point, but it is possible to design a much better model by analysing the number of members at each level and the average number of children for each level.

Let's look at two real customer examples:

Dimension 1


A change as simple as this could have a huge impact on the amount of time taken to aggregate a cube.

However, in some cases the OLAP compression feature can be useful in allowing you to pre-compute additional lower levels within a hierarchy for little or no additional cost, because the sparsity of the data allows higher levels to be compressed out of the cube. If you have a situation where there is almost a 1:1 relationship between a level and the next level down in the hierarchy, it would make sense to pre-compute that level, since the compression feature will compress out the redundant data. For example:

Dimension 2


In this example the hierarchy is relatively flat and the number of children returned at each level varies quite a lot. But at the lowest levels there is likely to be a large number of instances where a parent has only a single child, and in these situations the compression feature can compress out the repeated values. Therefore, it might make sense to solve levels L5 and L4.

Dimension 3


In this example the hierarchy shows the normal pyramid shape and is definitely bottom-heavy. But the upper levels contain relatively few members and drilling typically returns very few members. The default skip-level approach for this dimension may in fact be pre-solving too many levels. In practice it may take two or three builds to determine the best levels to pre-solve, with a good starting point being:
  • Run 1: L7, L5, L2
  • Run 2: L7, L6, L1
  • Run 3: L7, L4, L2
This dimension shows that it may be necessary to schedule multiple runs to test these various scenarios. Again we need to consider the impact of using compression, which allows OLAP to solve additional levels very cheaply.

Step 4a - Hierarchy Validation
Always, always check that your hierarchies are functioning correctly. This involves using the Data Viewer feature within AWM. You should make sure the dimension is drillable and that selecting each level in turn returns the correct result set.





A better approach is actually to make the database do the work, but this requires some additional SQL commands to be executed against the source tables. Ideally, try to create a relational dimension over the source table(s). Normally, the relational dimension object is used by query rewrite, which in this case we are not really concerned with for 10gR2 (in 11g the story is quite different, as a cube can be registered as a materialized view and used for query rewrite). But it does allow us to use the DBMS_DIMENSION.VALIDATE_DIMENSION procedure, which verifies that the relationships specified in a dimension are valid. The rowid of any row that is found to be invalid will be stored in the table DIMENSION_EXCEPTIONS in the user's schema. The procedure looks like this:

DBMS_DIMENSION.VALIDATE_DIMENSION (
   dimension     IN VARCHAR2,
   incremental   IN BOOLEAN  := TRUE,
   check_nulls   IN BOOLEAN  := FALSE,
   statement_id  IN VARCHAR2 := NULL );

Note that before running the VALIDATE_DIMENSION procedure, you need to create a local table, DIMENSION_EXCEPTIONS, by running the supplied script utldim.sql. If the VALIDATE_DIMENSION procedure encounters any errors, they are placed in this table, and querying the table will identify the exceptions that were found. To query it you can use a simple SQL statement such as this:

SELECT * FROM dimension_exceptions
WHERE statement_id = 'Product Validation';


However, rather than querying this table directly, it may be better to use the rowid of each invalid row to retrieve the actual rows that generated the errors. In this example, the dimension PRODUCTS is checking a table called DIM_PRODUCTS. To find the rows responsible for the errors, simply link back to the source table using the rowid column to extract the row(s) causing the problem, as in the following:

SELECT * FROM DIM_PRODUCTS
WHERE rowid IN (SELECT bad_rowid
FROM dimension_exceptions
WHERE statement_id = 'Product Validation');
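
To make Step 4a concrete end to end, here is roughly what the relational dimension over DIM_PRODUCTS might look like, together with the validation call (a sketch only - the level and column names are hypothetical, so adjust them to your own source table):

CREATE DIMENSION products
   LEVEL sku      IS (dim_products.sku_id)
   LEVEL subcat   IS (dim_products.subcategory_id)
   LEVEL category IS (dim_products.category_id)
   HIERARCHY main (sku CHILD OF subcat CHILD OF category)
   ATTRIBUTE sku DETERMINES (dim_products.sku_desc);

-- Create DIMENSION_EXCEPTIONS once by running utldim.sql, then validate
-- (full validation, checking nulls, tagged with the statement_id used above).
EXECUTE DBMS_DIMENSION.VALIDATE_DIMENSION('PRODUCTS', FALSE, TRUE, 'Product Validation');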

Step 4b - Hierarchy Order
The order of hierarchies within a dimension can have a significant impact on query performance. When solving levels at run time, the OLAP engine will use the last hierarchy in the list as the aggregation path. Consider this example using a time dimension:



Let's assume we pre-compute the levels Month and Quarter, but decide not to pre-compute the Year level, because the main hierarchy used during queries is the Julian Year-Quarter-Month-Day hierarchy and, therefore, the total for each Year should only need to be derived by adding up 4 Quarter values. In fact, the aggregation engine looks at all the hierarchies to find the lowest common level across all hierarchies, which in this case is Day. It then selects the last hierarchy in the list containing the level Day - in this case the Week hierarchy. Therefore, the value for each dimension member at the Year level will be the result of adding up 365/366 values and not simply 4 Quarter values.

The obvious question is: why? The answer is to ensure backward compatibility with the Express ROLLUP command, from which the AGGREGATE command is derived. When the AGGREGATE command was introduced, one of our requirements was that it produced numbers that matched those of ROLLUP; thus, in cases where an aggregate node was declared in multiple hierarchies, we always produced numbers based on the LAST definition of the node, because that would be the number that matched the procedural approach taken by ROLLUP. Because of this behaviour, an alternative approach to hierarchy ordering might be as follows:


Now the run-time aggregation for Year will be derived from the level Quarter, which has been pre-computed, and the result will be returned much faster.

Step 5 - Check the Data Quality
This last step is probably the most important, especially as OLAP-style projects tend to be scheduled once all the ETL has been completed. You should never take the quality of the source data for granted. Ideally, use the Data Quality option of Warehouse Builder (which is a costed option for OWB) to analyse the source data for each dimension and make sure the data is of a reasonable quality. Things to check are:
  • Consistent data type
  • Number of distinct values
  • Min and Max values
  • Domain members
  • Number of members not present in the fact table
OLAP stores all members as data type text. Even if there are inconsistent data types within the source data, everything gets converted to text. This can mask issues where unusual dimension members are included in the source data, such as -9999 or XXXX. In many cases the data owners are completely unaware these values exist, or, worse still, they are included deliberately to allow the data to balance correctly and are used as journal buckets. It may not be possible to remove these values, but it is important to know they exist and equally important to clarify whether they are in fact needed.

The last check is an interesting one, especially if you are using that dimension as a partition key. If you are creating lots and lots of empty partitions that will never contain data, should those members even be loaded? In a recent project I identified a dimension that contained over 300,000 leaf-node members, but the main fact table only had data for 50% of those members. The obvious question is: why load 150,000-plus members if you are never going to post data to them?
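
A couple of simple SQL checks cover most of the list above even without OWB (the DIM_PRODUCTS / SALES_FACT names and columns are hypothetical, purely for illustration):

-- Distinct values, min and max for a level key (spot -9999 or XXXX style members).
SELECT COUNT(DISTINCT sku_id), MIN(sku_id), MAX(sku_id)
FROM   dim_products;

-- Dimension members that never appear in the fact table.
SELECT COUNT(*)
FROM   dim_products d
WHERE  NOT EXISTS (SELECT 1 FROM sales_fact f WHERE f.sku_id = d.sku_id);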


Part 3 - Analysis of Cubes
The next stage is to review the data model for each cube in turn. This is the area where the biggest improvements in load time are likely to be achieved.


Step 1 - Analysis of Storage Model
It is important to assign an efficient storage model to a cube, as this will have a significant impact on both the load and aggregation times.

  • Make sure compression is enabled.
  • Data type should be either DECIMAL or INTEGER
    • Warning: do not use NUMBER, as this uses approximately 3.5 times the storage compared to DECIMAL, yet it is the default. NUMBER requires 22 bytes and DECIMAL requires 8 bytes (see the OLAP Application Developer's Guide, 10.2.0.3, Chapter 7, Aggregating Data).
  • Try not to use Global Composites. There is little need for this feature, except in very special cases where you need to optimise the retrieval of rows via SQL access and want to report only non-NA and/or non-zero rows. Note: if you are using compression it is not possible to use the Global Composites feature, even though in AWM 10gR2 the option box remains enabled after you select compression. (In 11g there are database events you can use to optimise the retrieval of non-NA/zero rows. See the posting by Bud Endress on the OLAP Blog: Attribute Reporting on the Cube using SQL.)


Step 2 - Analysis of Sparsity Model
Management of sparsity within a cube is critical. First, the order of the dimensions is very important. When using compression, which should always be enabled, order the dimensions from the one with the fewest members to the one with the most members. The most common question is: should time be dense or sparse?

Answer – it depends. This is where you need a deep understanding of the source data, and the data quality features in OWB can really help in this type of situation. In some models time works best dense and in others it works best sparse. This is especially true when time is used as the partition dimension. Therefore, you need to plan for testing these different scenarios.
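As a rough first pass, before reaching for the Sparsity Advisor described below, you can compare the number of populated dimension-key combinations in the fact table with the theoretical maximum. This is only a sketch; SALES_FACT and its key columns are assumed names.

-- Rough sparsity estimate: populated combinations vs. the full cartesian space
-- (SALES_FACT and its key columns are illustrative names)
SELECT COUNT(*) AS populated_cells,
       COUNT(DISTINCT time_id) * COUNT(DISTINCT prod_id) * COUNT(DISTINCT cust_id)
         AS potential_cells,
       ROUND(100 * COUNT(*) /
             (COUNT(DISTINCT time_id) * COUNT(DISTINCT prod_id) * COUNT(DISTINCT cust_id)), 4)
         AS pct_populated
FROM   sales_fact;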



There is a Sparsity Advisor package in the database, which analyses the source data in relational tables and recommends a storage method. The recommendations may include the definition of a composite and partitioning of the data variable. The Sparsity Advisor consists of these procedures and functions:
  • SPARSITY_ADVICE_TABLE Procedure
  • ADD_DIMENSION_SOURCE Procedure
  • ADVISE_SPARSITY Procedure
  • ADVISE_DIMENSIONALITY Function
  • ADVISE_DIMENSIONALITY Procedure
The Sparsity Advisor also provides a public table type for storing information about the dimensions of the facts being analyzed. I have to say this is not the friendliest package ever shipped with the database, but it can be useful in some situations. To use the Sparsity Advisor you need to follow these five steps:
  1. Call SPARSITY_ADVICE_TABLE to create a table for storing the evaluation of the Sparsity Advisor.
  2. Call ADD_DIMENSION_SOURCE for each dimension related by one or more columns to the fact table being evaluated. The information that you provide about these dimensions is stored in a DBMS_AW$_DIMENSION_SOURCES_T variable.
  3. Call ADVISE_SPARSITY to evaluate the fact table. Its recommendations are stored in the table created by SPARSITY_ADVICE_TABLE. You can use these recommendations to make your own judgements about defining variables in your analytic workspace, or you can continue with the following step.
  4. Call the ADVISE_DIMENSIONALITY procedure to get the OLAP DML object definitions for the recommended composite, partitioning, and variable definitions, or
  5. Use the ADVISE_DIMENSIONALITY function to get the OLAP DML object definition for the recommended composite and the dimension order for the variable definitions for a specific partition.


The OLAP Reference manual provides an example script for the GLOBAL demo schema to analyse the relational fact table. The amount of information required does seem a little excessive given that most of it could be extracted from the various metadata layers – maybe some bright person will create a wrapper around this to simplify the whole process.

On the whole I still find the majority of models work best with everything sparse, and so far I have only found a few cases where load and aggregation times improved when time was marked dense. But as with all tuning exercises, it is always worth trying different options, as there is no one-size-fits-all tuning solution with OLAP.

Step 3 - Analysis of Partition Model
Partitioning is managed at both the logical and physical levels. At the logical level, it is possible to partition a cube using a specific level to split the cube into multiple chunks. At the physical level, it is possible to partition the actual AW$ table and associated indexes that form the AW.

Logical Partitioning
Always start by using partitioning. Why? Because partitioning allows the cube to be broken down into smaller segments – much like relational table partitioning. This can help improve the aggregation phase of a build because the engine is able to load more related data into memory during processing. It also allows you to use the parallel update features of Oracle OLAP during a build. But there are some things to consider when setting up partitioning. When using partitioning you should:

  • Try to select a dimension that has balanced partitions, such as Time
  • Try to select a dimension level that is not too volatile; this is one of the reasons for electing to use a dimension such as time.
  • Select the Level based on the information collected during Step 3 of the analysis of dimensions. In 10g, the levels above the partition key are solved at run time (this is resolved in 11g) so select the level for the partition key carefully.
    • When selecting the partition key, consider the impact this will have on the default partition, which contains all the levels above the partition level. For example, partitioning on a level such as Day might generate nice small partitions, but the default partition will contain all the higher-level members such as Week, Month, Quarter and Year, making it very large.
It might be necessary to experiment with different partition keys to get the right balance between stored and run time aggregation. For example, if you partition using a Time dimension then Month is usually a good level to select as the key since each year only needs to aggregate 12 members to return a total, but if you have 30 years of data and most reports start at the year level displaying all 30 years the run time performance might not be acceptable. In this case the level Quarter or even Year might be a better option.
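A simple query against the source schema can help confirm whether a candidate partition level produces reasonably balanced partitions. The sketch below assumes illustrative table and column names (SALES_FACT and TIME_DIM joined on a day-level key).

-- Fact rows per candidate partition member (Month), to check partition balance
-- (SALES_FACT, TIME_DIM and their columns are assumed names)
SELECT t.month_id, COUNT(*) AS fact_rows
FROM   sales_fact f
JOIN   time_dim  t ON t.day_id = f.time_id
GROUP  BY t.month_id
ORDER  BY t.month_id;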

Parallel vs Serial Processing
Logical partitioning is required for cubes where you want to enable parallel processing. But be warned: running a job in parallel may not improve processing times, and using too many parallel processes can have the opposite effect. Used wisely, however, parallel processing can drastically improve processing times provided the server is not already CPU bound. As a starting point I always begin testing by setting the value MaxJobQueues to "number of CPUs - 1" in the XML file for the definition of the build. In some cases even this might be too high, and reducing the figure can actually improve processing times. Tuning AW parallel processing is exactly the same as tuning relational parallel processing – you need to determine where the point of diminishing returns sets in, which can be a combination of:

  • CPU loading
  • I/O bandwidth
  • Cube design

Do not assume that throwing parallel resources at a performance issue will resolve the whole problem; managed carefully, though, parallel processing can provide a significant improvement in performance.
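For reference, MaxJobQueues is just an attribute on the build step in the XML saved by AWM. The fragment below is a hedged illustration for a 4-CPU server; apart from MaxJobQueues, the element and attribute names shown follow a typical AWM-generated build script rather than this specific example, and the cube name is illustrative.

<!-- Illustrative AWM build definition: MaxJobQueues set to CPUs - 1 on a 4-CPU server -->
<BuildDatabase AWName="SH_OLAP.SH_AW" BuildType="EXECUTE" RunSolve="true"
               CleanMeasures="false" CleanAttrs="false" CleanDim="false"
               TrackStatus="true" MaxJobQueues="3">
  <BuildList XMLIDref="SALES.CUBE" />
</BuildDatabase>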


Physical Partitioning
The aim of relational (physical) partitioning is to allow you to control the tablespace for each partition, thus distributing the load across multiple disks; to spread data across a variety of disk types (see the information on ILM on OTN); and to enhance query performance, since it is possible to direct specific queries to a smaller subset of data.

Some, but not all, of this applies to an AW. From a tablespace perspective it is probably easier to use ASM to manage and distribute the storage of an AW across multiple disks, as opposed to creating a partitioned AW$ table spread across multiple tablespaces. The reason the AW$ table is partitioned is to optimise LOB performance: each partition has its own LOB index, which manages its storage, and a separate slave process can update each partition.

The relational table that acts as a container for the AW, AW$xxxxx, can be partitioned to break the AW into more physical chunks, which can reduce contention for locks on the relational objects (the AW$ table and related indexes) during parallel data loading jobs. By default each AW is created using a range partition key of gen# and 8 subpartitions. The DDL below is from a default AW created via AWM. Note the clauses that manage the partitions and sub-partitions:
  • PARTITION BY RANGE ("GEN#")
  • SUBPARTITION BY HASH ("PS#","EXTNUM")
  • SUBPARTITIONS 8

CREATE TABLE "BI_OLAP"."AW$SH_AW"
("PS#" NUMBER(10,0),
"GEN#" NUMBER(10,0),
"EXTNUM" NUMBER(8,0),
"AWLOB" BLOB,
"OBJNAME" VARCHAR2(256 BYTE),
"PARTNAME" VARCHAR2(256 BYTE))
PCTFREE 10 PCTUSED 40 INITRANS 4 MAXTRANS 255
STORAGE(
BUFFER_POOL DEFAULT)
TABLESPACE "BI_OLAP"
LOB ("AWLOB") STORE AS (
DISABLE STORAGE IN ROW CHUNK 8192 PCTVERSION 0
CACHE
STORAGE(
BUFFER_POOL DEFAULT))
PARTITION BY RANGE ("GEN#")
SUBPARTITION BY HASH ("PS#","EXTNUM")
SUBPARTITIONS 8
(PARTITION "PTN1" VALUES LESS THAN (1)
PCTFREE 10 PCTUSED 40 INITRANS 4 MAXTRANS 255
STORAGE(
BUFFER_POOL DEFAULT)
TABLESPACE "BI_OLAP"
LOB ("AWLOB") STORE AS (
DISABLE STORAGE IN ROW CHUNK 8192 PCTVERSION 0
CACHE READS LOGGING
STORAGE(
BUFFER_POOL DEFAULT))
( SUBPARTITION "SYS_SUBP16109"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16110"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16111"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16112"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16113"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16114"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16115"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16116"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP") ,
PARTITION "PTNN" VALUES LESS THAN (MAXVALUE)
PCTFREE 10 PCTUSED 40 INITRANS 4 MAXTRANS 255
STORAGE(
BUFFER_POOL DEFAULT)
TABLESPACE "BI_OLAP"
LOB ("AWLOB") STORE AS (
DISABLE STORAGE IN ROW CHUNK 8192 PCTVERSION 0
CACHE
STORAGE(
BUFFER_POOL DEFAULT))
( SUBPARTITION "SYS_SUBP16117"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16118"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16119"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16120"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16121"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16122"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16123"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP",
SUBPARTITION "SYS_SUBP16124"
LOB ("AWLOB") STORE AS (
TABLESPACE "BI_OLAP" )
TABLESPACE "BI_OLAP") ) ;

The best overall approach here is to ensure you have the correct number of sub-partitions to reduce contention during updates. For example, if you have a cube with three years of data partitioned using the level Month, it would be sensible to add an additional 36 subpartitions to the AW$ table to spread the load and reduce contention during parallel updates. You can add more sub-partitions quickly and easily as follows:

alter table aw$test modify partition ptnn add subpartition ptnn_009 update indexes;
alter table aw$test modify partition ptnn add subpartition ptnn_010 update indexes;

Therefore, I recommend adding additional subpartitions at the physical level to match the number of logical partitions within the cube.

It is possible to go to the next level (if you really feel it is necessary) and directly manage the DDL used to create the AW; there are a number of commands that allow you to control the default tablespace and the number of partitions. You can increase the number of sub-partitions within each gen# partition using the AW CREATE command, as shown here:

exec dbms_aw.execute('aw create owner.aw_name partitions N segmentsize N K|M|G');

Note the keyword "partitions" actually refers to the number of subpartitions.
It is also possible to define a target tablespace for the AW via the DBMS_AW.AW_ATTACH procedure:

DBMS_AW.AW_ATTACH (
awname IN VARCHAR2,
forwrite IN BOOLEAN DEFAULT FALSE,
createaw IN BOOLEAN DEFAULT FALSE,
attargs IN VARCHAR2 DEFAULT NULL,
tablespace IN VARCHAR2 DEFAULT NULL);

For example, the following SQL statement creates the AW GLOBAL_PROGRAMS as the last user-owned analytic workspace in tablespace USERS:

SQL> EXECUTE DBMS_AW.AW_ATTACH('global_programs', true, true, 'last', 'USERS');

AWM 10gR2 (10.2.0.03A) also allows you to define the tablespace when you create the AW, but the tablespace name is not included in the XML definition of the AW.
If you think you need to get right down to the base DDL level to control the allocation of tablespaces used by the AW, then you will need to define the AW$ table manually. The easiest method is to take the DDL from the original AW$ table and modify it with your own placement statements for the tablespaces. To get the DDL for an AW (table and index) you can either use SQL Developer or use the DBMS_METADATA package as follows:

set heading off;
set echo off;
set pages 999;
set long 90000;
spool aw_ddl.sql
select dbms_metadata.get_ddl('TABLE','AW$SH_AW','SH_OLAP') from dual;
select dbms_metadata.get_ddl('INDEX','SH_AW_I$','SH_OLAP') from dual;
spool off;

These statements show the exact DDL used to generate the AW$ table and its associated index. Once you have the DDL you can then modify the tablespace statements for each sub-partition to spread the load across different tablespaces and hence data files. But it is much easier to use ASM to manage all this for you.

There is one major issue with manually creating an AW – the standard form metadata is not automatically added to the AW and there is no documented process for achieving this. The only reliable solution I have found is to first create the AW via AWM and then export the empty AW to an EIF file. This EIF file will then contain the standard form metadata objects. Once you have deleted and re-created the AW based on your specific tablespace and subpartition requirements, the standard form metadata can be added back by importing the EIF file. Not the prettiest of solutions but it works – at least with 10gR2.


Step 4 - Analysis of Aggregation Model
The Summarize To Tab
This is where the biggest improvements to build time are likely to be uncovered. The “Summarize To” tab allows you to select the levels to pre-solve. Based on the analysis of the number of members and children at each level it should be possible to tune the levels to pre-solve only the most important levels.

This step will require lots of testing and many builds to finally arrive at the best mix of levels.



The Rules Tab
If you are using the same aggregation method across all dimensions, such as SUM, the aggregation engine will optimise the processing order by solving the dimensions in reverse order, from highest cardinality to lowest cardinality. Despite this, I always order the dimensions manually on the Rules tab anyway.

Where you are using different aggregation methods across the various dimensions it is important to ensure the dimensions are in the correct order to return the desired result. If you change the order to improve aggregation performance where different aggregation methods are used, check the results returned are still correct. Getting the wrong answer very quickly is not a good result.


Step 5 - Analysis of Data Quality
This is another area that can have a huge impact on load times. There are three key things to consider:
  • Number of NA cells
  • Number of Zero cells
  • Sparsity patterns
Quite often I see situations where hundreds of thousands of NA or zero values are loaded into a cube and then aggregated. In a recent customer situation, over 40% of the data being loaded was either NA or zero. Removing just those records from the data load saved a huge amount of time in both loading and aggregating that data set. In some cases it may be necessary to load a zero balance because the value "0" actually means something, and having a cell appear as NULL in a report does not convey the same meaning. If this is the case, there are much better ways of managing zero balances than loading and aggregating them across all the various hierarchies just to return a value of 0. My recommendation is to remove all zero and NA/null rows from the source fact table, as in the sketch below.
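A minimal sketch of this filtering, assuming an illustrative fact table called SALES_FACT with a single measure column:

-- Illustrative source view that strips NULL and zero balances before the cube load
-- (SALES_FACT and its columns are assumed names, not from the original example)
CREATE OR REPLACE VIEW vw_sales_fact_load AS
SELECT time_id, prod_id, cust_id, amount_sold
FROM   sales_fact
WHERE  amount_sold IS NOT NULL
AND    amount_sold <> 0;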

Where there is a need to show a zero balance, create a separate cube, load only the zero balances into that cube, but do not aggregate the data. Use a formula to glue the non-zero balance data to the zero balance data, such as:

Nafill(CUBE1_NON_NA_DATA, CUBE2_ZERO_BALANCE_DATA)

This will significantly improve the performance of the main cube since the aggregation engine only has to deal with real balances.

Sparsity patterns are important when you have a cube that contains a large number of measures, all sourced from the same fact table. In one situation, a customer had designed two cubes, with about 30 measures in one cube and two measures in the other. The source fact table contained 75 million rows, and the data load was taking about ten hours for just three years of data. Looking at the data and executing various SQL counts to determine the number of NULL cells and zero cells for each measure, it was clear there were five different sparsity patterns within the fact table.
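The counts themselves need nothing more exotic than conditional aggregation; a hedged sketch, with illustrative table and measure names, might look like this:

-- Profile NULL and zero counts per measure column to spot distinct sparsity patterns
-- (SALES_FACT and the measure columns are assumed names)
SELECT COUNT(*)                                                AS total_rows,
       SUM(CASE WHEN amount_sold   IS NULL THEN 1 ELSE 0 END)  AS amount_nulls,
       SUM(CASE WHEN amount_sold    = 0    THEN 1 ELSE 0 END)  AS amount_zeros,
       SUM(CASE WHEN quantity_sold IS NULL THEN 1 ELSE 0 END)  AS quantity_nulls,
       SUM(CASE WHEN quantity_sold  = 0    THEN 1 ELSE 0 END)  AS quantity_zeros
FROM   sales_fact;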

By breaking the single cube into five different cubes, creating views over the base fact table to return only the relevant columns for each cube, and removing all NA and zero values, the amount of data loaded each month declined to the values shown below:



This change, combined with changes to the selection of pre-aggregated levels, reduced the build and aggregation time by over 50% with little impact on query performance.

It is critical to fully understand the source data and how it is stored. As the number of measures within a cube increases it is likely that the number of times an NA or Zero value appears will also increase. Breaking a large cube up into smaller more focused chunks in this type of scenario can provide significant benefits.

Part 4 - Analysis of Source Schema Queries
When loading data into a cube from a relational source schema you should be able to achieve about 1 million rows updated in the cube per minute. If you are not seeing that level of throughput from the source table/view, you need to look at:
  • Hardware issues
  • Cube design issues
  • Query design issues
The first two issues have already been covered. This section reviews the tuning of the query that fetches the data from the relational source table/view into the cube.



Tuning the queries used to load dimension members and data into cubes can be very important. When either a data load or a dimension load is executed, a program is created containing the SQL used to fetch the data from the relational table. It is important to make sure the SQL being executed is as efficient as possible. By using views as the source for your mappings it is relatively easy to add hints to ensure the correct execution path is used, as in the sketch below. Note: with 11g this can cause problems if the cube is to be exposed as a materialised view, because for query rewrite to function the cube must be mapped to the underlying fact table that appears in the end-user query.
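For example, a load view along these lines lets you influence the access path without touching the generated load program; the table, columns and hint degree are illustrative assumptions.

-- Illustrative mapping view with hints to control the execution plan of the load query
CREATE OR REPLACE VIEW vw_sales_cube_load AS
SELECT /*+ full(f) parallel(f 4) */
       f.time_id, f.prod_id, f.cust_id, f.amount_sold, f.quantity_sold
FROM   sales_fact f;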

Step 1 - Analysing SQL Statements
To optimise the SQL executed during a load you can use either, or both, of the following approaches:
  • Enterprise Manager – via Tuning Packs
  • Manual analysis
If you are comfortable using PL/SQL and understand a little about OLAP DML you can follow the manual approach. However, I expect most people will use Enterprise Manager, as it makes the whole process so simple. Note that the Tuning Pack is a costed option for EM, so check your license agreement before you start using the Enterprise Manager approach.

Enterprise Manager
Enterprise Manager can be used to monitor the results from a SQL statement. The Performance Tab provides the environment for tuning SQL statements as well as monitoring the operation of the whole instance. The easiest way to find the SQL statement used by the data load process is to search for a SELECT statement against the view/table used in the mapping. The SQL can quite often be found in the “Duplicate SQL” report at the bottom of the Top Activity page:



Once you have found the SQL statement, clicking on the SQL statement listed in the table will present a complete analysis of that statement and allow you to schedule the SQL Tuning Advisor. The output from the Advisor includes recommendations for improving the efficiency of that statement. Below is the analysis of the resources used to execute the product dimension SQL statement:



Scheduling the Advisor
Scheduling the advisor to analyse your SQL statement is very simple. Click on the button in the top right corner of the SQL Details screen. This will launch the SQL Advisor where you need to provide:
  • A description for the job
  • The scope, either limited or comprehensive (there are on-screen notes to help you make this decision)
  • The time and date to run the Advisor, since it might not be possible to run it immediately
Once the Advisor has completed its review, it is possible to look at the recommendations that have been generated. After you have implemented the recommendations it is then possible to view the explain plan for your query:


Note: These features are costed extensions to the Enterprise Manager console and cannot be used on a production system unless your customer has bought these extensions.

Manual Tuning
So how do you capture the SQL being executed during a data load? In 10g, during a load process an OLAP DML program is created called '___XML_LOAD_TEMPPRG'.
This program contains the code used during the build process and it is relatively easy to capture this code either via Enterprise Manager or manually at the end of the build.
(For a data load in 11g, look at the CUBE_BUILD_LOG’s “output” column. The table is in the AW’s schema).

To manually capture the program code (in 10g), create your own job to manually execute a data load. For example below is a job to refresh the members in the dimension Products. Note the first three lines and last four lines that control the dumping of the program code so we can capture the SQL.


set serveroutput on
exec dbms_aw.execute('aw attach SH_AW rw first');
exec dbms_aw.execute('cda BI_DIR');

call SQL file to refresh cube

exec dbms_aw.execute('outfile loader.txt');
exec dbms_aw.execute('DSC ___XML_LOAD_TEMPPRG')
exec dbms_aw.execute('outfile eof');
exec dbms_aw.execute('aw detach SH_AW');

The resulting program looks like this; the generated SQL is embedded in the cursor declaration and the subsequent SQL IMPORT command:

DEFINE ___XML_LOAD_TEMPPRG PROGRAM INTEGER
PROGRAM
variable _errortext text
trap on HADERROR noprint
sql declare c1 cursor for -
select SH.VW_PRODUCTS_DIM.PROD_ID, -
SH.VW_PRODUCTS_DIM.PROD_DESC, -
SH.VW_PRODUCTS_DIM.PROD_DESC, -
SH.VW_PRODUCTS_DIM.PROD_PACK_SIZE, -
SH.VW_PRODUCTS_DIM.PROD_WEIGHT_CLASS, -
SH.VW_PRODUCTS_DIM.PROD_UNIT_OF_MEASURE, -
SH.VW_PRODUCTS_DIM.SUPPLIER_ID -
from SH.VW_PRODUCTS_DIM -
where -
(SH.VW_PRODUCTS_DIM.PROD_ID IS NOT NULL)
sql open c1
if sqlcode ne 0
then do
_errortext = SQLERRM
goto HADERROR
doend
sql import c1 into :MATCHSKIPERR SH_OLAP.SH_AW!PRODUCTS_PRODUCT_SURR -
:SH_OLAP.SH_AW!PRODUCTS_LONG_DESCRIPTION(SH_OLAP.SH_AW!ALL_LANGUAGES 'AMERICAN') -
:SH_OLAP.SH_AW!PRODUCTS_SHORT_DESCRIPTION(SH_OLAP.SH_AW!ALL_LANGUAGES 'AMERICAN') -
:SH_OLAP.SH_AW!PRODUCTS_PACK_SIZE -
:SH_OLAP.SH_AW!PRODUCTS_WEIGHT_CLASS -
:SH_OLAP.SH_AW!PRODUCTS_UNIT_OF_MEASURE -
:SH_OLAP.SH_AW!PRODUCTS_SUPPLIER_ID
if sqlcode lt 0
then do
_errortext = SQLERRM
goto HADERROR
doend
sql close c1
sql cleanup
return 0
HADERROR:
trap on NOERR1 noprint
sql close c1
NOERR1:
trap off
sql cleanup
call __xml_handle_error(_errortext)
END

Once you have the statement you can use SQL Developer's explain plan feature to determine the execution plan. When dealing with cubes, it is likely the source fact table will be partitioned; therefore, you need to ensure partition elimination is occurring correctly.



If additional hints need to be added to make the query more efficient, these can be added to the view definition. This approach does not automatically generate recommendations, so you will need a solid grasp of SQL tuning to ensure your query uses the optimal execution plan.

Step 2 – Managing Sort Resources
Sorting the source data is quite important for both dimensions and facts. By default, OLAP sorts dimension members alphabetically in ascending order based on the long description. Therefore, it makes sense for the relational source to provide the data in the required order, especially for the dimension loads.

Optimising cube loads requires making sure the source data is sorted in the same order as the dimensions are listed within the partitioned composite. This will be the same order as shown on the Implementation Details tab.
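As a rough sketch, and assuming illustrative table and column names, the load view could simply return the data pre-sorted in that order (the actual order must come from your cube's Implementation Details tab):

-- Illustrative load view returning rows sorted to match the composite's dimension order
CREATE OR REPLACE VIEW vw_sales_cube_load_sorted AS
SELECT time_id, cust_id, prod_id, amount_sold
FROM   sales_fact
ORDER  BY time_id, cust_id, prod_id;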

OLAP load operations are sort intensive. You may need to increase the OLAP SortBufferSize option (which AWs use instead of the database SORT_AREA_SIZE setting) to try and ensure the various sorting operations during a load are performed in memory and not on disk. The default setting is 262,144 bytes. As part of a load process you can increase the amount of sort memory available as follows:

exec DBMS_AW.EXECUTE('SortBufferSize=10485760');

Executing this command before starting a data load will increase the amount of memory allocated to sorts, in this case approximately 10MB. To permanently set SortBufferSize to 10MB for an AW, issue the following commands:

exec DBMS_AW.EXECUTE('aw attach my_aw_name rwx');
exec DBMS_AW.EXECUTE('SortBufferSize=10485760');
exec DBMS_AW.EXECUTE('update');
exec DBMS_AW.EXECUTE('commit');
exec DBMS_AW.EXECUTE('aw detach my_aw_name');

Or you can simply set the option before executing the XML job definition:

set serveroutput on
exec dbms_aw.shutdown;
exec dbms_aw.startup;
exec dbms_aw.execute('aw attach SH_AW rw first');
exec DBMS_AW.EXECUTE('SortBufferSize=10485760');

call SQL file to refresh cube

exec DBMS_AW.EXECUTE('SortBufferSize=262144');
exec dbms_aw.execute('update;commit');
exec dbms_aw.execute('aw detach SH_AW');
exec dbms_aw.shutdown;
exec dbms_session.free_unused_user_memory;

For more information on this subject, refer to the next section on monitoring system resources.


Part 5 - Analysis of the Database
There are a number of areas that are important when tuning a data load process. The areas outlined in this section will really just tweak the performance and may or may not result in significant improvements, but they can provide the "icing on the cake" in terms of extracting every last ounce of performance.



Step 1a – Relational Storage Settings
Make sure logging is switched off on the tablespace used to store the AW. Since the AW does not support redo there is no point in generating it. Also make sure you have enough space within the tablespace before you start a build; a lot of time can be consumed extending the tablespace if you are not careful.

If you are using Data Guard, it will not be possible to switch off redo. The alternative is to increase the redo log size to between 100M and 500M, modify the LOG_BUFFER parameter to around 10M to allow for more efficient index LOB creation, and try to move TEMP, UNDO and the redo logs to the fastest disks.
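Where Data Guard is not in play, the logging change and the free-space check are both one-liners. The tablespace name below is only illustrative (it reuses BI_OLAP from the earlier DDL listing).

-- Switch off logging on the tablespace holding the AW (tablespace name is illustrative)
ALTER TABLESPACE bi_olap NOLOGGING;

-- Check free space in that tablespace before starting a build
SELECT tablespace_name, ROUND(SUM(bytes)/1024/1024) AS free_mb
FROM   dba_free_space
WHERE  tablespace_name = 'BI_OLAP'
GROUP  BY tablespace_name;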

Step 1b – AW Storage Settings
If the cubes within an AW contain a large number of partitions, then performance can be improved by adding additional physical partitions to the AW. The AW should be logically partitioned and modelled well, and then also physically partitioned, as this improves update performance by reducing index LOB contention. For example, if the main data cube contains 36 months of data and is logically partitioned by month in the AW, then the physical partitioning of the AW should match the number of logical partitions. To override the default of eight partitions it is necessary to manually define the AW and set the required number of partitions as shown here:

SQL> exec dbms_aw.execute('aw create scott.product_AW partitions 36');

However, this approach does create some additional complications regarding the creation of standard form metadata. This metadata is required to make the AW visible to AWM and other OLAP-aware tools. In the vast majority of cases it will be necessary to create a standard form metadata compliant AW. See Part 3 - Analysis of Cubes, Step 3 - Analysis of Partition Model, for more information.


Step 2 - Temp Storage Settings
Pre-allocating space within the temp tablespace prior to running a build can yield some performance improvements. When pre-allocating space make sure the temp tablespace is not set to auto-extend and the correct (most efficient) uniform extent size is used. The procedure below will pre-allocate approximately 1.5GB of TEMP tablespace; alter the loop counter (for i in 1..130400) as required. Make sure your default temporary tablespace/group is not set to auto-extend unlimited – it should be fixed to the required size.

create or replace procedure preallocate_temp as
amount integer := 26;
buffer varchar2(26) := 'XXXXXXXXXXXXXXXXXXXXXXXXXX';
done boolean := false;
out_of_temp exception;
position integer := 10240;
pragma exception_init(out_of_temp,-01652);
tmppre clob;
begin
dbms_lob.createtemporary(tmppre, true, dbms_lob.session);
dbms_lob.open(tmppre, dbms_lob.lob_readwrite);
for i in 1..130400
loop
exit when done;
begin
dbms_lob.write(tmppre, amount, position, buffer);
exception when out_of_temp then done := true;
end;
position := position + amount + 10240;
end loop;
dbms_lob.close(tmppre);
dbms_lob.freetemporary(tmppre);
end;
/

conn prealltemp/oracle
exec preallocate_temp;
disc;


Step 3 - ADDM Report
ADDM (Automatic Database Diagnostic Monitor) is a self-diagnostic engine built into the Oracle Database kernel, which automatically detects and diagnoses common performance problems, including:
  • Hardware issues related to excessive I/O
  • CPU bottlenecks
  • Connection management issues
  • Excessive parsing
  • Concurrency issues, such as contention for locks
  • PGA, buffer-cache, and log-buffer-sizing issues
  • Issues specific to Oracle Real Application Clusters (RAC) deployments, such as global cache hot blocks and objects and interconnect latency issues
An ADDM analysis is performed after each AWR snapshot (every hour by default). The results are saved in the database and can be viewed using either Oracle Enterprise Manager or SQL*Plus. For tuning OLAP data loads, ADDM is always a good place to start. In addition to diagnosing performance problems, ADDM recommends possible solutions. When appropriate, ADDM recommends multiple solutions, which can include:

  • Hardware changes
    • Adding CPUs or changing the I/O subsystem configuration
  • Database configuration
    • Changing initialization parameter settings
  • Schema changes
    • Hash partitioning a table or index, or using automatic segment-space management (ASSM)
  • Application changes
    • Using the cache option for sequences or using bind variables
  • Using other advisors
    • Running the SQL Tuning Advisor on high-load SQL statements or running the Segment Advisor on hot objects

ADDM's benefits are not limited to production systems; even on development and test systems it can provide an early warning of potential performance problems. Typically the results from an ADDM snapshot are viewed via various interactive pages within Enterprise Manager, as shown below:



Alternatively you can access ADDM reports using the SQL*Plus command line by calling the new DBMS_ADVISOR built-in package. For example, here's how to use the command line to create an ADDM report quickly (based on the most recent snapshot):

set long 1000000
set pagesize 50000
column get_clob format a80
select dbms_advisor.get_task_report(
task_name, 'TEXT', 'ALL')
as ADDM_report
from dba_advisor_tasks
where task_id=(
select max(t.task_id)
from dba_advisor_tasks t, dba_advisor_log l
where t.task_id = l.task_id
and t.advisor_name='ADDM'
and l.status= 'COMPLETED');

The 'ALL' parameter generates additional information about the meaning of some of the elements in the report. The most interesting section of the report relates to the "Findings" for each issue. This outlines the impact of the identified problem as a percentage of DB time, which corresponds to the expected benefit, assuming the problem described by the finding is resolved by taking the recommended action.

In the example below the recommendation is to adjust the sga_target value in the parameter file:

FINDING 3: 5.2% impact (147 seconds)
---------------------------------------
The buffer cache was undersized causing significant additional read I/O.
RECOMMENDATION 1: DB Configuration, 5.2% benefit (147 seconds)
ACTION: Increase SGA target size by increasing the value of parameter "sga_target" by 24 M.
SYMPTOMS THAT LED TO THE FINDING:
Wait class "User I/O" was consuming significant database time. (5.3% impact [150 seconds])
...


For more information on this feature, refer to the Oracle Database 2 Day + Performance Tuning Guide, 10g Release 2 (10.2).

For the HTML version, click here for 10gR2 and here for 11g.
For the PDF version, click here for 10gR2 and here for 11g


Step 4 - Dynamic Performance Views
Each Oracle database instance maintains a set of virtual tables that record current database activity and store data about the instance. These tables are called the V$ tables. They are also referred to as the dynamic performance tables because they store information relating to the operation of the instance. Views of the V$ tables are sometimes called fixed views because they cannot be altered or removed by the database administrator. The V$ tables collect data on internal disk structures and memory structures, and they are continuously updated while the database is in use. The SYS user owns the V$ tables. In addition, any user with the SELECT_CATALOG_ROLE role can access them. The system creates views from these tables and creates public synonyms for the views. The views are also owned by SYS, but the DBA can grant access to them to a wider range of users.

Among these are tables that collect data on OLAP operations. The names of the OLAP V$ views begin with V$AW:
  • V$AW_AGGREGATE_OP
    • Lists the aggregation operators available in the OLAP DML.
  • V$AW_ALLOCATE_OP
    • Lists the allocation operators available in the OLAP DML.
  • V$AW_CALC
    • Collects information about the use of cache space and the status of dynamic aggregation.
  • V$AW_LONGOPS
    • Collects status information about SQL fetches.
  • V$AW_OLAP
    • Collects information about the status of active analytic workspaces.
  • V$AW_SESSION_INFO
    • Collects information about each active session.
For tuning, the two most important views from this list are:

V$AW_CALC
This reports on the effectiveness of various caches used by Oracle OLAP and the status of processing by the AGGREGATE function. Oracle OLAP uses the following caches:

  • Aggregate cache: An internal cache used by the aggregation subsystem during querying. It stores the children of a given dimension member, such as Q1-04, Q2-04, Q3-04, and Q4-04 as the children of 2004.
  • Session cache: Oracle OLAP maintains a cache for each session for storing the results of calculations. When the session ends, the contents of the cache are discarded.
  • Page pool: A cache allocated from the User Global Area (UGA), which Oracle OLAP maintains for the session. The page pool is associated with a particular session and caches records from all the analytic workspaces attached in that session. If the page pool becomes too full, then Oracle OLAP writes some of the pages to the database cache. When an UPDATE command is issued in the OLAP DML, the changed pages associated with that analytic workspace are written to the permanent LOB, using temporary segments as the staging area for streaming the data to disk. The size of the page pool is controlled by the OLAP_PAGE_POOL initialization parameter.
  • Database cache: The larger cache maintained by the Oracle RDBMS for the database instance.

Because OLAP queries tend to be iterative, the same data is typically queried repeatedly during a session. The caches provide much faster access to data that has already been calculated during a session than would be possible if the data had to be recalculated for each query.

The more effective the caches are, the better the response time experienced by users. An ineffective cache (that is, one with few hits and many misses) probably indicates that the data is not being stored optimally for the way it is being viewed. To improve runtime performance, you may need to reorder the dimensions of the variables (that is, change the order of fastest to slowest varying dimensions).


V$AW_LONGOPS
This view will identify the OLAP DML command (SQL IMPORT, SQL FETCH, or SQL EXECUTE) that is actively fetching data from relational tables. The view will state the current operation based on one of the following values:
  • EXECUTING. The command has begun executing.
  • FETCHING. Data is being fetched into the analytic workspace.
  • FINISHED. The command has finished executing. This status appears very briefly before the record disappears from the table.
Other information returned includes: the number of rows already inserted, updated, or deleted and the time the command started executing.
For more information refer to the Oracle OLAP Option Users Guide, Section 7 Administering Oracle OLAP – Dynamic Performance Views.
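A quick way to monitor an in-flight load is simply to query these views from a second session; the exact column lists vary by release, so the sketch below just selects everything.

-- Monitor fetch progress and cache effectiveness from a second session during a build
SELECT * FROM v$aw_longops;
SELECT * FROM v$aw_calc;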


Step 5 - Init.Ora Parameters
Checking that the RDBMS parameters are appropriately tuned for your OLAP environment is relatively easy. Fortunately, in 10g the majority of init.ora parameters are managed dynamically; however, a few parameters that may need to be changed are:

SORTBUFFERSIZE
This should be increased, since OLAP AWs use this option instead of SORT_AREA_SIZE. So, for every AW, increase it as follows:

exec DBMS_AW.EXECUTE('aw attach SCOTT.MYAW rwx');
exec DBMS_AW.EXECUTE('shw sortbuffersize');
262,144
exec DBMS_AW.EXECUTE('SortBufferSize=10485760');
exec DBMS_AW.EXECUTE('shw sortbuffersize');
10,485,760
exec DBMS_AW.EXECUTE('aw detach SCOTT.MYAW');

OLAP_PAGE_POOL_SIZE
This should be set to 0 or left unset so that the dynamic page pool is enabled and managed by the database (it will be sized at up to 50% of the PGA). However, if you have more than 8GB of memory available, consider setting the parameter manually: around 256MB is a good value for data loading, and around 64MB when multiple users are querying concurrently.
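For example (a hedged sketch – size the values to your own system), the parameter can be set at system or session level:

-- Let the database manage the page pool dynamically
ALTER SYSTEM SET olap_page_pool_size = 0;

-- Or pin it to 256MB for a dedicated data-load session (value is in bytes)
ALTER SESSION SET olap_page_pool_size = 268435456;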