Thursday, April 10, 2008

Attribute Reporting on the Cube using SQL

In my last post, I noted that nearly reporting on the 11g cube occurs within SQL-based relational reporting tools such as Oracle Business Intelligence Enterprise Edition (OBI EE), Business Objects, etc. That's no surprise given that these tools dominate the BI market and that the OLAP Option is positioned as making SQL-based BI tools better with improved query performance and analytic content. While the focus is how OLAP cubes make SQL-based BI application better, it shouldn't be overlooked that SQL-based tools can make the OLAP Option better.

A perfect example is attribute reporting. The cube has never naturally supported attribute reporting. For example, if a cube is dimensioned by time, product and customer and product has attributes such as buyer and marketing manager it has never been easy to report sales by time, buyer, marketing manager and customer with just the cube.

One of the reasons for integrating the cube into the Oracle Database is that it blends the dimension storage and calculation models with the relational query model (that is, SQL). Dimensional calculations can be defined in the cube and relational calculations can be added above the cube in a query.

Getting back to our example, the SQL view of the product dimension (with the buyer and marketing manager attributes) will look something like this:

desc product_classes_view;

DIM_KEY VARCHAR2(100)
LEVEL_NAME VARCHAR2(30)
LONG_DESCRIPTION VARCHAR2(100)
SHORT_DESCRIPTION VARCHAR2(100)
TOTAL_PRODUCT_LONG_DESCR VARCHAR2(100)
TOTAL_PRODUCT_LONG_DES_1 VARCHAR2(100)
TOTAL_PRODUCT_DAVE VARCHAR2(100)
CLASS_LONG_DESCRIPTION VARCHAR2(100)
CLASS_LONG_DESCRIPTION1 VARCHAR2(100)
FAMILY_LONG_DESCRIPTION VARCHAR2(100)
FAMILY_LONG_DESCRIPTION1 VARCHAR2(100)
ITEM_BUYER VARCHAR2(100)
ITEM_MARKETING_MANAGER VARCHAR2(100)
ITEM_PACKAGE VARCHAR2(100)
ITEM_LONG_DESCRIPTION VARCHAR2(100)
ITEM_LONG_DESCRIPTION1 VARCHAR2(100)
PARENT VARCHAR2(100)
TOTAL_PRODUCT VARCHAR2(100)
CLASS VARCHAR2(100)
FAMILY VARCHAR2(100)
ITEM VARCHAR2(100)


A sampling of the data follows.

SELECT dim_key,
item_long_description,
item_buyer,
item_marketing_manager
FROM
product_classes_view
WHERE
level_name = 'ITEM';



If I want to break out sales by report sales by time, buyer, marketing manager and customer I query those columns and use SQL group by. E.g.,


SELECT
t.calendar_year_long_descr,
p.item_buyer,
p.item_marketing_manager,
SUM(f.sales)
FROM
time_calendar_view t,
product_classes_view p,
customer_shipments_view cu,
channel_channels_view ch,
units_cube_view f
WHERE
t.dim_key = f.time
AND p.dim_key = f.product
AND cu.dim_key = f.customer
AND ch.dim_key = f.channel
AND t.level_name = 'CALENDAR_YEAR'
AND p.level_name = 'ITEM'
AND cu.level_name = 'TOTAL_CUSTOMER'
AND ch.level_name = 'TOTAL_CHANNEL'
GROUP BY
t.calendar_year_long_descr,
p.item_buyer,
p.item_marketing_manager,
cu.total_customer_long_desc,
ch.total_channel_long_descr
ORDER BY
calendar_year_long_descr;




There are two things you want to keep in mind when doing any aggregation above the cube in SQL:
  • Make sure the aggregation method used in SQL is compatible with the aggregation method used in the cube. For example, if the cube aggregates using SUM, then using SUM in SQL makes sense. If the cube aggregates using a hierarchical weighted average, it probably doesn't make sense to SUM in SQL.
  • Make sure it is appropriate to aggregate the measure in SQL. For example, it makes sense to aggregate measures such as sales, sales prior period, etc. using SUM in SQL. It probably doesn't make sense to aggregate measures that are percentages or rankings.

In products like OBI EE I like to set up the presentation layer to make it clear which columns are levels and which are attributes. This helps highlight the fact that they are different. I also like to group different types of measures into seperate presentation tables. While this is convenient in general, this also highlights differences between types of measures. For example, a report designer can easily be told not to include ranking measures in a report that includes attribute columns.

In the illustration below, note the presentation tables group columns in to level, attributes and measure types.



In products like OBI EE, I can now effortlessly make reports broken out by attribute that include many types of OLAP calculations.


No comments: