Let's Drill: Inner Workings

Saturday, December 10, 2011



This is the first of two entries documenting the process of adding drilling capabilities to OLAP-Faces. This entry starts explaining my reasons to rewrite the org.olap4j.query package and later describes the drilling capability added to that rewrite.

In the following entry I will apply this new package to add drilling capabilities to the <olap:cellSet> JavaServer Faces component and will provide you with a working example.

A Hard Decision: Rewrite the query Package


The current version of the specification provides the class DrillDownOnPositionTransform to drill a positioned member within a CellSet, it should work like:

// Create a query

Query q = new Query("myquery", salesCube);

QueryDimension productDim = q.getDimension("Product");

QueryDimension measuresDim = q.getDimension("Measures");

q.getAxis(Axis.ROWS).addDimension(productDim);

q.getAxis(Axis.COLUMNS).addDimension(measuresDim);

q.validate();

CellSet cs = q.execute();



// Generate the MDX to for drilling

DrillDownOnPositionTransform drillTransform =

new DrillDownOnPositionTransform(Axis.ROWS, 0, 0, cs);

SelectNode drilledMdx = drillTransform.apply(q.getSelect());



// Execute new query to get the drilled CellSet

OlapStatement stmt = salesCube.getSchema().getCatalog().getMetaData()

             .getConnection().createStatement();

CellSet drilledCs = stmt.executeOlapQuery(drilledMdx);

Unfortunately that class is not implemented in the released version of olap4j (1.0.0.445) and I’ve not been able to find a simple enough implementation; mainly because such an implementation must accept as input a generic MDX expression.

I think that drilling in a cell set is a must, so my proposal is: adding drilling support into the QueryAxis class. I’ll rewrite a highly simplified version of the org.olap4j.query package to produce a proof of concept for this approach.

Initially I’m going to ignore dimension selections, change notifications and query filtering (filtering has no sense without selections).

My proposed use case for drilling is

// Create a query

Query q = new Query("myquery", salesCube);

QueryDimension productDim = q.getDimension("Product");

QueryDimension measuresDim = q.getDimension("Measures");

q.getAxis(Axis.ROWS).addDimension(productDim);

q.getAxis(Axis.COLUMNS).addDimension(measuresDim);

CellSet cs = q.execute();



// Drill it

Member defaultMember =

   productDim.getDimension().getDefaultHierarchy().getDefaultMember();

q.getAxis(Axis.ROWS).drill(defaultMember);

      

CellSet drilledCellSet = q.execute();



In this code all the classes in org.olap4j.query package have been replaced by classes in the es.cgalesanco.olap4j.query package.

Translating QueryAxis.drill() into MDX


Drilling a one-dimensional axis is easy: just use DrilldownMember MDX function passing the initial member set for the dimension as the first parameter, the set of drilled members as the second parameter and request for recursive drill resolution.

For example:

DrillDownMember(

  {[Store].[All Stores]},    

  {[Store].[All Stores],[Store].[USA],[Store].[USA].[OR]},

  RECURSIVE)

Produces an axis with the following structure

|-All Stores

  |-Canada

  |-Mexico

  |-USA

    |-CA

    |-OR

    | |-Portland

    | |- Salem

    |-WA

An easy way to extend this to multi-dimensional axes is

1.       generate a DrillDonwMember call, grouping all the drills having the same prefix (drill specifications with the same length differing only in the last element)

2.       put these expressions in  CrossJoins to generate a set of tuples as required for the axis

3.       compute the union of all the previously generated cross joins and hierarchize

So for an axis with two dimensions ([Store] and [Store Type]) and the following sequence of drills

1.       [Store].[All Stores]

2.       [Store].[USA], [Store Type].[All Store Types]

3.       [Store].[USA]

4.       [Store].[All Stores].[Store Type].[All Store Types]

The generated MDX will be

Hierarchize(

  Union(

    CrossJoin(

      DrillDownMember(

        {[Store].[All Stores]},

        {[Store].[All Stores],[Store].[USA]},

        RECURSIVE

      ),

      [Store Type].[All Store Types]

    ),

    CrossJoin(

      {[Store].[USA]},

      DrillDownMember(

        {[Store Type].[All Store Types]},

        {[Store Type].[All Store Types]},

        RECURSIVE

      )

    ),

    CrossJoin(

      {[Store].[All Stores]},

      DrillDownMember(

        {[Store Type].[All Store Types]},

        {[Store Type].[All Store Types]},

        RECURSIVE

      )

    )

  )

)

This is a basic algorithm and clearly optimizable, but it will do for my proof of concept.

QueryAxis’ New Methods


So our revamped QueryAxis will contain an additional list of drilled positions, supported with these new methods

void drill(Member[] drilledPos);

This method adds drilledPos to the list of drilled positions. If that position was already drilled it’s a no-op.

void undrill(Member[] drilledPos);

This method removes drilledPos from the list of drilled positions.

boolean isDrilled(Member[] drilledPos);

This method returns a boolean value indicating if drilledPos is in the list of drilled positions.

List<Member[]> listDrilledPositions();

This method returns the list of drilled positions.


2 comments :

  1. Cesar,

    I'm very pleased that you decided to do this project. There is a lot of good stuff in JPivot and a rewrite like this can help liberate it.

    I am assuming that this is an open source project. Have you decided on a name for the project? Where is the code released? What license is it?

    One of the "features" of JPivot's architecture is separate models for Mondrian versus XMLA. One of the things I had in mind when I created olap4j was to have a single model, based on olap4j. Looks like you are making that happen.

    When you design the query model, be sure to work in terms of hierarchies, not dimensions. In mondrian 4, it will be common for dimensions to have many hierarchies. For example, Gender and Marital Status would typically be hierarchies within the Customer dimension. So, for instance, your getDimension methods should probably be getHierarchy.

    I support your decision to rewrite the query model. However, some advice to make it maintainable and reusable. Keep it as a separate module as far as you can. And write extensive unit tests for that module. The unit tests should not require any of the other modules; maybe there could be several tests for each operation, each of which starts with a model, applies the operation, then results in either an error (if the operation is invalid) or a piece of MDX.

    That way we could one day absorb your model into MDX, or at least move over some of the code.

    ReplyDelete
  2. Hi Julian
    Of course, this is an open source project, probably it will be released under the Eclipse License.
    I had started a project in SourceForge, named olap-faces, but I'm gonna keep it frozen till I have something with a minimum of usability.
    Thanks for the advice about dimensions, I initially designed QueryDimension based on hierarchies (because my models do have dimensons with several hierarchies), but finally used dimensions to keep "interface compatibility" with olap4j.
    Take a look on the SourceForge project, it has a separate maven module for query and do includes unit tests.

    ReplyDelete