As promised, here is the first pattern. If you like this pattern but think something is missing that would help you understand it better, please drop me an email: arnon at rgoarchitects.com. Naturally, any other comments are also welcome :)




Getting an SOA right is very hard, not so much because of the technical problems (we know how to deal with those, don't we?), but because it is very hard to figure out where to put the service boundaries while keeping the right business alignment. Assuming you somehow managed that, the real fun begins: you now have to produce reports, dozens and dozens of reports. Many reports will fall within the boundaries of a single service (if you have a good partition), but many others will require combining data from several services. For example, in a Telco scenario you may have Customer, Billing and Provisioning services (a real-life example would have dozens of additional services). Now a customer calls customer care and you want the CRM to show everything about her: what outstanding invoices she has, what equipment and services (GPRS, UMTS, friends and family etc.) she has, what her status as a customer is (loyal, VIP, senior citizen …), what service requests are open, and so on. Things get much more complicated when you need to summarize or group data from multiple services.

 

How do you get a decent cross business entities report with the data scattered about in all those services?

 

One possible solution is to create the report at the consuming end (e.g. the UI): visit all of the services involved, then do all the grouping, cross-cuts etc. there. This solution is not very good from the performance perspective (you need to fetch more data than needed and then post-process it). It is also problematic from the flexibility perspective: each service involved has to expose interfaces that return the data for the specific query (otherwise you move even more data).
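
To make the cost of this first option concrete, here is a minimal sketch in Python. The two service functions (get_customers, get_invoices) and their data are hypothetical stand-ins for remote calls, not real service contracts. The point is that the consumer has to pull every row from both services before it can join and group them.

```python
from collections import defaultdict

# Hypothetical stand-ins for remote service calls. In a real system each
# of these would be a network round trip returning contract data.
def get_customers():
    return [
        {"customer_id": 1, "status": "VIP"},
        {"customer_id": 2, "status": "loyal"},
        {"customer_id": 3, "status": "VIP"},
    ]

def get_invoices():
    return [
        {"customer_id": 1, "amount": 120.0, "paid": False},
        {"customer_id": 1, "amount": 80.0, "paid": True},
        {"customer_id": 2, "amount": 40.0, "paid": False},
        {"customer_id": 3, "amount": 60.0, "paid": False},
    ]

def outstanding_by_status():
    """Join and group on the consumer side: every row crosses the wire."""
    status_of = {c["customer_id"]: c["status"] for c in get_customers()}
    totals = defaultdict(float)
    for invoice in get_invoices():
        if not invoice["paid"]:
            totals[status_of[invoice["customer_id"]]] += invoice["amount"]
    return dict(totals)
```

Even in this toy version the paid invoices are fetched only to be thrown away, unless the Billing service adds an interface specifically for this query.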

 

Another option is to go straight to the data. You may still need to hit multiple database servers, but the performance will be better. The problem is that this throws your service boundaries down the drain and introduces a lot of dependencies.

 

A third option is to create interim services ("Entity Aggregation"). This works fine as long as you have real business reasons to do the aggregations (there is an overhead in adding business logic to handle the aggregated data) and as long as you only have a few of those (or you might end up with a single "service" with all the business).

 

Create an Aggregated Reporting Service by building an Operational Data Store (ODS) to enable creating sophisticated reports on otherwise dispersed data.

 

[Figure: AggreagatedReporting.PNG]


 

 

The ODS is similar in concept to a data mart, i.e. the data is subject-based, integrated, scrubbed etc. The main differences, however, are that the data is up to date and that there is little or no history.

 

For incoming data, the Aggregated Reporting Edge transforms contract data into reporting data. The service updates the ODS by scrubbing the data (the scrubbing can be limited unless the data has to go on to a data mart / data warehouse), then integrating it and de-normalizing it into subjects. Incoming report requests fill in the parameters of the pre-prepared reports.
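
The flow described above could look roughly like the following Python sketch, which uses an in-memory SQLite database as a stand-in for the ODS. Everything named here is an assumption made for illustration: the message shapes, the customer_overview subject table, the handler names and the single pre-prepared report. A real implementation would use the actual contracts of the Customer and Billing services.

```python
import sqlite3

# In-memory SQLite database standing in for the ODS (illustration only).
ods = sqlite3.connect(":memory:")
ods.execute("""
    CREATE TABLE customer_overview (      -- one de-normalized "subject" row
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        status      TEXT,
        outstanding REAL NOT NULL DEFAULT 0
    )
""")

def _ensure_row(customer_id):
    # Messages from different services may arrive in any order.
    ods.execute(
        "INSERT OR IGNORE INTO customer_overview (customer_id) VALUES (?)",
        (customer_id,),
    )

def on_customer_changed(msg):
    """Edge: scrub Customer contract data and integrate it into the subject."""
    _ensure_row(msg["customer_id"])
    ods.execute(
        "UPDATE customer_overview SET name = ?, status = ? WHERE customer_id = ?",
        (msg["name"].strip(), msg["status"].strip().upper(), msg["customer_id"]),
    )

def on_invoice_issued(msg):
    """Edge: fold Billing contract data into the same subject row."""
    _ensure_row(msg["customer_id"])
    ods.execute(
        "UPDATE customer_overview SET outstanding = outstanding + ? "
        "WHERE customer_id = ?",
        (msg["amount"], msg["customer_id"]),
    )

# Pre-prepared reports: a request only supplies a name and parameters.
REPORTS = {
    "outstanding_by_status": (
        "SELECT status, SUM(outstanding) FROM customer_overview "
        "WHERE outstanding >= ? GROUP BY status ORDER BY status"
    ),
}

def run_report(name, *params):
    return ods.execute(REPORTS[name], params).fetchall()
```

A report request such as run_report("outstanding_by_status", 50) then runs entirely inside the ODS, over customers who owe at least 50, without touching the Customer or Billing services. Note that the sketch only ever reads from the ODS on the reporting side, in line with the read-only nature of the pattern.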

 

One problem with Aggregated Reporting is that it is not a Business Service (i.e. it is a technical solution rather than a business-oriented one). However, since the data in Aggregated Reporting is read-only, unlike in Entity Aggregation, this doesn't affect the business.

 

Aggregated Reporting is easier to implement when combined with Inversion of Communication.

 

Aggregated reporting with Data Mart/Data Warehouse

 

Instead of storing only recent operational data, this version also keeps the data in a data mart / data warehouse, which increases the depth and complexity of the queries that can be executed against the service. The downside is the increased complexity of setting up the data mart, both from the operational costs perspective (e.g. additional storage) and from the design and development perspective (you need to think about long-term aspects, indexing etc.), since you also need to scrub the data and consider the structure of your schemas much more carefully.

 

 


 

Sidebar: Operational Data Store (ODS)

The ODS is probably the best-kept secret of data warehousing technology. It has been around almost as long as the data warehouse, but it isn't as famous.

The data in the ODS is operational: live data, not static data. The ODS can be thought of as the cache memory of the data mart / data warehouse.

It is important to note that while it doesn't need the same amount of planning and set-up as a data mart, an ODS still requires careful planning in order to bring real business value.

 

The figure below shows the classical usage of an ODS in an OLTP/Data Mart environment.

 

[Figure: ODS.PNG]

Originally it was thought there would be four types of ODS:

 

Class I - Near real-time synchronization of the ODS with operational data from the OLTP databases. A Class I implementation is the preferred type for the Aggregated Reporting pattern.

Class II - Update the ODS every four hours or so

Class III - Overnight updates of the ODS

Class IV - the ODS is updated from  the data mart / data warehouse
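
One way to read the four classes is as a refresh policy for the ODS. The Python sketch below is only an illustration of that reading. The names (OdsClass, MAX_STALENESS, refresh_due) are made up for this example, and the five-second figure for Class I and the 24-hour figure for Class III are assumptions standing in for "near real-time" and "overnight".

```python
from datetime import datetime, timedelta
from enum import Enum

class OdsClass(Enum):
    I = "near real-time synchronization from OLTP"
    II = "updated every four hours or so from OLTP"
    III = "overnight updates from OLTP"
    IV = "updated from the data mart / data warehouse"

# Illustrative maximum staleness per class (the exact values are assumptions).
# Class IV is left out on purpose: it is fed by the warehouse, not by a clock.
MAX_STALENESS = {
    OdsClass.I: timedelta(seconds=5),
    OdsClass.II: timedelta(hours=4),
    OdsClass.III: timedelta(hours=24),
}

def refresh_due(ods_class, last_refresh, now):
    """Return True when a clock-driven ODS is older than its class allows."""
    limit = MAX_STALENESS.get(ods_class)
    if limit is None:
        return False  # Class IV: refreshed when the warehouse pushes data
    return now - last_refresh >= limit
```

In practice a Class I ODS would more likely be updated by pushing each change as it happens than by polling on a timer; the staleness limit here just expresses how fresh the data is expected to be.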

 

In reality there are more variants - for example a powerful (and complex to build) option is to merge a Class IV ODS with one of the other Classes and get.

 

 


 