Monday, December 10, 2012

Webinar: Using XMLA with Cognos and Oracle OLAP Cubes

When:  Thursday, Dec 13, 2012 at 9:00am PST / 12:00pm EST / 6:00pm CET.

To attend:    Sign up here.
 
 
If you use a business intelligence tool such as IBM Cognos, MicroStrategy or SAP BusinessObjects Analysis that uses XMLA to connect to multidimensional data sources, check out a free webinar by Simba Technologies which offers a "sneak peek" of the Simba XMLA Provider for Oracle OLAP.  The Simba XMLA Provider for Oracle OLAP is an XMLA version of the Simba MDX Provider for Oracle OLAP, the gold standard in MDX connectivity to Oracle OLAP.   (The Simba MDX Provider for Oracle OLAP allows MDX-based clients such as Microsoft Excel PivotTables to query Oracle OLAP cubes.  The XMLA version allows clients that use XMLA rather than ODBO to connect to Oracle OLAP.)

Simba will demonstrate IBM Cognos using the XMLA provider to query Oracle OLAP cubes.  Here's a brief outline of the session.

See how:
  • Familiar business intelligence applications such as IBM Cognos can connect to an Oracle OLAP cube.
  • Ad-hoc querying and data analysis can be performed directly in IBM Cognos on your OLAP data.
  • The most advanced application available for responding to XMLA requests enables users to interactively build reports, drill into details, and slice and dice data.
  • Connectivity can be established without the need to install any software on the client machine.
    Simply connect to the XMLA service and everything works!
See you there!


Thursday, June 14, 2012

Creating Oracle BI Presentation Layers from Oracle Cubes

Beginning with Oracle BI 11.1.1.5, the Oracle BI Administration tool has the ability to import Oracle cubes, dimensions and hierarchies into the Physical layer of the Oracle BI repository.  This provides an alternative to the Oracle BI plug-in for Analytic Workspace Manager.  Each method works somewhat differently and each has certain advantages that might make it the best choice for different situations.

The Oracle BI plug-in for Analytic Workspace Manager pushes metadata into the Oracle BI repository using UDML code.  The result is a ready-to-use subject area with the Physical, Business Model and Presentation layers fully built for you.  It creates hierarchies in the Business Model layer, but it does not add hierarchical columns to the Presentation layer.  You can easily add hierarchical columns to the Presentation layer by dragging hierarchies from the Business Model layer.

Hierarchical column in the channel dimension.

The Oracle BI plug-in for Analytic Workspace Manager creates a Physical layer that maps to hierarchy and cube views (these views make the dimensions, hierarchies and cubes look like a star schema to SQL).  In the Business Model, it assigns the SUM aggregation operator to measures selected from the cube view.  This approach allows SQL to aggregate data above the cube.  This is useful for queries that filter first and then aggregate, and for attribute breakouts.  For example:

  • Report by Year where Quarter of Year is 3 or 4.  Data at the Year level will be the aggregate of child quarters 3 and 4.
  • Report by Year, Color and Size where Color and Size are attributes of Item.  Oracle will SELECT year, color, size, SUM(sales) .... GROUP BY year, color, size.

In both cases, Oracle BI will get the highest aggregate level data it can from the cube and do some additional aggregation in SQL.
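
The second case above can be sketched in SQL roughly as follows.  This is only an illustration of the shape of the query, not the SQL that Oracle BI actually generates.  All view and column names are hypothetical, and the cube is assumed to have just two dimensions (time and item) to keep the example short.

```sql
-- Sketch: attribute breakout aggregated in SQL above the cube.
-- All view and column names here are hypothetical.
SELECT t.year_name,
  i.color,
  i.size_name,
  SUM(f.sales) AS sales
FROM time_hier_view t,
  item_hier_view i,
  sales_cube_view f
WHERE t.dim_key  = f.time
AND i.dim_key    = f.item
AND t.level_name = 'YEAR'   -- highest time level the cube can supply for this report
AND i.level_name = 'ITEM'   -- color and size are attributes of item, so select item level rows
GROUP BY t.year_name,
  i.color,
  i.size_name;
```

The cube supplies data already aggregated to the Year level, and SQL does the remaining aggregation from Item up to each Color and Size combination.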

When cubes are imported using the Oracle BI Administration tool, the Physical layer is mapped directly to the cube and dimensions, bypassing the cube and hierarchy views.  The SQL created in this case selects from the OLAP_TABLE table function.  Oracle BI has optimized this SQL to take advantage of some features in OLAP_TABLE, which offers better query performance with hierarchical columns in some situations.

When the cube is imported with the Oracle BI Administration tool, aggregation is set to 'External Aggregation'.  This disables the ability to create queries that require aggregation above the cube.  This is good in that it prevents users from creating queries that might not work as they expect in some cases, but it also prevents filter-before-aggregate and attribute breakout queries.

Also, the import method only creates the Physical layer.  It does not create the Business Model and Presentation layers.  You do that by dragging the Physical layer to the Business Model layer and then dragging the Business Model to the Presentation layer.  That's all quick and easy, but there will be some things you will want to update in the Business Model and Presentation layers.  For example, you might delete some columns you don't really need and arrange columns in hierarchical order.  I've posted the paper Creating an Oracle BI Presentation Layer from Oracle Cubes on OTN.  This paper will walk you through the process of refining the Business Model and Presentation layers.

My recommendation is to try each method, learn about the strengths of each and choose which is best for any particular application.  Since all the data comes from the same cube, there's nothing wrong with using both methods.

Tuesday, May 22, 2012

Introduction to Oracle OLAP Web Presentation Series

I've posted a series of three videos introducing Oracle OLAP.  This is a great series for people who are interested in learning about what Oracle OLAP is and what it's used for.  I suggest viewing them in order.  Here are the links:

Oracle OLAP Overview:  Part 1 - Architecture
Oracle OLAP Overview:  Part 2 - Key Features
Oracle OLAP Overview:  Part 3 - Use Cases

Monday, May 14, 2012

Creating Custom (Calculated) Members

If you have ever wanted to report on the total of two or more dimension members, you have probably done so in SQL using SUM ... GROUP BY.

Let's look at an example using the OLAPTRAIN schema.  Suppose you want the total of the states in the northeastern United States.  In OLAPTRAIN, that would include Massachusetts, Rhode Island, New York and Pennsylvania.

If you did this with SQL, it might look something like this.

SELECT t.calendar_year_long_descr AS calendar_year,
  g.country_long_description      AS country,
  p.all_products_long_descri      AS all_products,
  c.all_channels_long_descri      AS all_channels,
  SUM(f.sales)
FROM time_calendar_view t,
  product_standard_view p,
  geography_regional_view g,
  channel_sales_channel_view c,
  sales_cube_view f
WHERE t.dim_key                 = f.time
AND p.dim_key                   = f.product
AND g.dim_key                   = f.geography
AND c.dim_key                   = f.channel
AND t.level_name                = 'CALENDAR_YEAR'
AND p.level_name                = 'ALL_PRODUCTS'
AND c.level_name                = 'ALL_CHANNELS'
AND g.level_name                = 'STATE_PROVINCE'
AND g.state_province_long_desc IN ('Massachusetts','Rhode Island','New York', 'Pennsylvania')
AND t.calendar_year_long_descr  = 'CY2009'
AND g.country_long_description  = 'United States'
GROUP BY t.calendar_year_long_descr,
  g.country_long_description,
  p.all_products_long_descri,
  c.all_channels_long_descri;

And the query would return:

CY2009    United States    All Products    All Channels    10828869.07

A different way to approach this is to create a custom member that is the aggregate of other members.  In this case, the custom member is added to the dimension and can be used just like any other dimension member.  The only real difference is that a custom member is not within a hierarchy and does not belong to a level.  The advantages are that custom members are available to all users (unless you control access, more on that later), work with all of the cube's aggregation rules (e.g., first, last, hierarchical weighted average and so on), work seamlessly with calculated measures and are available in all tools (e.g., Excel PivotTables).

Custom aggregates are created using the dbms_cube.import program.  Note that the dimension keys are numeric in OLAPTRAIN.  (Sorry for posting this sample as an image ... blogger wasn't happy about displaying XML.  To view the full example, open the image in a new tab or window.)




I can view custom members in the dimension or hierarchy view with a query such as this:

SELECT * FROM geography_view WHERE member_type = 'C';

I can query the cube view with a simplified query (note the lack of SUM ... GROUP BY).

SELECT t.long_description AS time,
  g.long_description      AS geog,
  p.long_description      AS product,
  c.long_description      AS channel,
  f.sales
FROM time_calendar_view t,
  product_standard_view p,
  geography_regional_view g,
  channel_sales_channel_view c,
  sales_cube_view f
WHERE t.dim_key                 = f.time
AND p.dim_key                   = f.product
AND g.dim_key                   = f.geography
AND c.dim_key                   = f.channel
AND t.level_name                = 'CALENDAR_YEAR'
AND p.level_name                = 'ALL_PRODUCTS'
AND c.level_name                = 'ALL_CHANNELS'
AND g.long_description          = 'Northeast US'
AND t.calendar_year_long_descr  = 'CY2009';

And the query returns Northeast US rather than a modified version of United States.

CY2009    Northeast US    All Products    All Channels    10828869.07

If it is appropriate for the custom members to be visible to all users, there's nothing else to do.  If you want to control access to custom members, Virtual Private Database is probably a good choice.
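
As a rough idea of what such a policy could look like, here is a minimal sketch that uses the standard DBMS_RLS package to hide custom members in the geography hierarchy view from everyone except one user.  This is not the sample script mentioned below.  The function name, policy name and the ANALYST1 user are hypothetical, and it assumes the policy is placed on the GEOGRAPHY_REGIONAL_VIEW used in the queries above.

```sql
-- Sketch only.  HIDE_CUSTOM_MEMBERS and ANALYST1 are hypothetical names.
CREATE OR REPLACE FUNCTION hide_custom_members (
  p_schema IN VARCHAR2,
  p_object IN VARCHAR2)
RETURN VARCHAR2
AS
BEGIN
  -- A NULL predicate means no restriction for this user.
  IF SYS_CONTEXT('USERENV', 'SESSION_USER') = 'ANALYST1' THEN
    RETURN NULL;
  END IF;
  -- Everyone else sees only members that are not custom members.
  RETURN '(member_type IS NULL OR member_type <> ''C'')';
END;
/

BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'OLAPTRAIN',
    object_name     => 'GEOGRAPHY_REGIONAL_VIEW',
    policy_name     => 'HIDE_CUSTOM_MEMBERS',
    function_schema => 'OLAPTRAIN',
    policy_function => 'HIDE_CUSTOM_MEMBERS',
    statement_types => 'SELECT');
END;
/
```

Because the cube view is joined to the hierarchy view, rows for hidden custom members drop out of cube queries that go through that join.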

If you would like a sample script that creates a few custom members and sets up a VPD policy to control access, send me an email at william.endress@oracle.com.







Friday, April 13, 2012

Fine Tuning Incremental Updates using LOAD PRUNE

If you are like most people, you probably use the LOAD_AND_AGGREGATE cube script that is automatically created by Analytic Workspace Manager. Fine tuning the update process simply involves filtering the fact table for new or changed rows.

With a little bit of effort, you can improve update times by writing your own cube processing script. You can also use MV log tables to automatically capture changes made to the fact table and use them as the data source for cube updates.
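
For readers who have not used MV logs, here is a minimal sketch of the idea.  The fact table name (SALES_FACT) and its column names are hypothetical, so adjust them to your own schema.  Oracle names the log table MLOG$_ followed by the table name by default.

```sql
-- Sketch only: SALES_FACT and its columns are hypothetical names.
CREATE MATERIALIZED VIEW LOG ON sales_fact
  WITH ROWID (day_key, product, customer, channel, sales)
  INCLUDING NEW VALUES;

-- Rows inserted into SALES_FACT after the log is created are recorded here.
SELECT day_key, product, customer, channel, sales
FROM mlog$_sales_fact
WHERE dmltype$$ = 'I';
```

A view along the lines of the second query could then be mapped to the cube in place of the full fact table, so an update reads only the captured changes.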

AWM defines the LOAD_AND_AGGREGATE script and makes it the default script of the cube. If you don’t specify a different script, LOAD_AND_AGGREGATE is automatically used as shown in the following example (note that the script references OLAPTRAIN.SALES_CUBE but does not include a USING clause).

BEGIN
DBMS_CUBE.BUILD('OLAPTRAIN.SALES_CUBE','C',false,4,true,true,false);
END;
/

This script will run the LOAD PARALLEL and SOLVE PARALLEL commands. What this means is that for each partition, the database will LOAD data from the fact table/view and then SOLVE (aggregate) data. If you have specified a value for parallel that is greater than 1, partitions will be processed in parallel (in the example above, 4 processes). AWM also provides the ability to set the refresh method (C, or complete, in the above example).
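
The positional arguments in the call above are easier to read in named notation.  The sketch below is intended to be the same call, using the parameter names from the DBMS_CUBE.BUILD documentation; check them against your database release.

```sql
BEGIN
  DBMS_CUBE.BUILD(
    script               => 'OLAPTRAIN.SALES_CUBE',
    method               => 'C',     -- complete refresh
    refresh_after_errors => FALSE,
    parallelism          => 4,       -- number of parallel processes
    atomic_refresh       => TRUE,
    automatic_order      => TRUE,
    add_dimensions       => FALSE);
END;
/
```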

LOAD_AND_AGGREGATE is a good choice for a full build, but it might not be the best choice for an incremental update. If you are simply updating the cube with changes within a few recent partitions (e.g., yesterday or this month), the LOAD PRUNE command is probably better than LOAD PARALLEL.

LOAD PRUNE will first query the fact table or view, using a SELECT DISTINCT, to determine which partitions will have new data. It will then generate LOAD commands only for those partitions that will have records loaded into them.

Let’s run through an update scenario. Make the following assumptions:

* The time dimension has months for 2008 through 2012 and the cube is partitioned by month. The cube will have 60 partitions.

* You have loaded data into the cube for January 2008 through March 2012.

* It’s now time to load data for April 2012. This data has been inserted into the fact table.

* You have mapped the cube to a view. For the April 2012 update, you have added a filter to the view so that it returns data only for April.
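
The last assumption, together with the partition check that LOAD PRUNE performs, might be sketched like this.  All table, view and column names are hypothetical, and the SQL that the database actually generates for LOAD PRUNE will look different.  The point is only that the distinct list of months is very short.

```sql
-- Sketch only: SALES_FACT, SALES_FACT_UPDATE_VIEW and MONTH_ID are hypothetical names.
CREATE OR REPLACE VIEW sales_fact_update_view AS
SELECT *
FROM sales_fact
WHERE month_id = '2012-04';

-- Conceptually, LOAD PRUNE starts by asking which partitions have incoming rows.
SELECT DISTINCT month_id
FROM sales_fact_update_view;
```

With the filter in place, the second query returns a single month, so only one of the 60 partitions needs a LOAD.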

If you use the LOAD_AND_AGGREGATE script and choose the FAST SOLVE refresh method, the database will really do the following:

BEGIN
DBMS_CUBE.BUILD('OLAPTRAIN.SALES_CUBE USING (LOAD PARALLEL, SOLVE PARALLEL)','S',false,4,true,true,false);
END;
/

With LOAD PARALLEL, the database will process the LOAD command for each partition (all 60). Since it’s selecting from a view that’s filtered out all but April 2012, 59 partitions will have no new or changed data. Although it doesn’t take a long time to load 0 rows and figure out that a SOLVE is not required, it still adds up if there are a lot of partitions.

With LOAD PRUNE, the database will determine that a LOAD is only required for April 2012. The LOAD step is skipped for all other partitions. While you will still see the SOLVE for all partitions, it doesn’t really do any work because no rows were loaded into the partition. An example using LOAD PRUNE follows.

BEGIN
DBMS_CUBE.BUILD('OLAPTRAIN.SALES_CUBE USING (LOAD PRUNE, SOLVE PARALLEL)','S',false,2,true,true,false);
END;
/

If you would like a script that walks through a complete example using the OLAPTRAIN schema, including the use of an MV log table to automatically capture changes to the fact table, send me an email at william.endress@oracle.com with a link to this posting.

Tuesday, April 3, 2012

Excel and OLAP: ODBC vs. MDX

A question that I often get is "what is the difference between using ODBC and the MDX Provider for Oracle OLAP (from Simba Technologies) to query Oracle cubes?" Given that the Oracle cube is easily queried with SQL, it's a reasonable question.

The answer really boils down to leveraging metadata and automatic query generation.

With ODBC, it's up to the Excel user to write a SQL query to fetch data from the cube. Data can be returned in tabular format or a pivot table. When the data is viewed in a pivot table, Excel will aggregate data, sometimes with unexpected results. For example, Excel might choose to aggregate a measure such as Sales with COUNT or might try to SUM a measure such as Sales YTD Percent Change. Neither makes any sense. It's up to the user to get it right.

With the MDX Provider, Excel understands what all the columns mean. It understands dimensions, hierarchies and levels. It understands the difference between a key and a label. It knows what a measure is. It allows the server to calculate the data. Query generation is automatic. Business users just choose hierarchies and measures and the MDX Provider does the rest.

Here's a list of some of the advantages of using the MDX Provider for Oracle OLAP as compared to using ODBC and writing your own SQL.


Monday, April 2, 2012

Oracle OLAP Exadata Performance Demonstration

For a great paper on Oracle OLAP running on Exadata, see:

http://www.oracle.com/technetwork/database/options/olap/olap-exadata-x2-2-performance-1429042.pdf

The Executive Overview section of this paper provides an introduction:

This paper describes a performance demonstration of the OLAP Option to the Oracle Database running on an X2-2 Exadata Database Machine half rack. It shows how Oracle OLAP cubes can be used to enhance the performance and analytic content of the data warehouse and business intelligence solutions, supporting a demanding user community with ultrafast query and rich analytic content.

The demonstration represents users of a business intelligence application using SQL to query an Oracle OLAP cube that has been enhanced with a variety of analytic measures. The cube contains data loaded from a fact table with more than 1 billion rows.

Utilizing Exadata features such as Smart Flash Cache, Oracle Database supported a community of 50 concurrent users querying the cube with queries that are typical of those executed from a business intelligence tool such as Oracle Business Intelligence Enterprise Edition.

With each user querying the database non-stop (without waits between queries), median query times ranged from .03 to .58 seconds, average query times ranged from .26 to 2.32 seconds, and 95 percent of queries returned in 1.5 to 5.5 seconds, depending on the type of query.

Query performance can be attributed to highly optimized data types and Exadata Smart Flash Cache. Cubes are designed for fast access to random data points, using features such as array-based storage, cost-based aggregation, and joined cube scans. Exadata Smart Flash Cache contributes significantly to cube query performance, virtually eliminating IO wait for the high volume, random IO typically seen with cube queries.