January 7, 2007
@ 11:02 PM

[based on a few posts from my DDJ blog]

Implementing Business Intelligence (BI) solution on top of Service Oriented Architecture (SOA) is not a simple feat. A recent survey by Ventana Research shows that "...only one-third of respondents reported they believe their internal IT personnel have the knowledge and skills to implement BI services.". There's a good reason for that since there an inherent impedance mismatch between BI and SOA which takes some effort to overcome. The purpose of this paper is to look to explain the problem as well as look at the possible solutions.

Service-Oriented Architecture is about autonomous loosely coupled components. These traits gives you lots of benefits such as greater flexibility and agility but it also means that services have private data. Data that you don't want to expose to the outside as exposing it will decrease autonomy and increase coupling. This is why services only expose data and processes via contracts rather then exposing their internal structure.

That is all fine until you start to think about business intelligence. The cornerstone of any business intelligence initiative is gathering, collecting and consolidating data from all over the place. Once you have the data, you can use tools to analyze it, data mine it, slice, splice, aggregate, and whatnot. Traditionally BI builds on ETL (Extract, Transfer, Load) which goes directly to the database of the involved sources.

And here lies the problem: On the one hand we have services that want to keep their data private, and on the other we have a datamart or warehouse that wants that data badly.

What are our options?

  • If you go with traditional ETL, you introduce coupling into your service.
  • If you only rely on contracts that were constructed for business processes you may be missing out on important data.
  • If you build a specific contract that exposes "all" the data you are back at the point-to-point integration -- solving point-to-point integration is one of the reason we want SOA in the first place.

The second option seems to be the most reasonable choice of the three -- but it also has several problems. One problem is that the BI needs to know about all the contracts. The second was already mentioned -- important data might be missing. The third problem is that the BI system need to fetch data from the services which means it may miss out on data in the intervals between request. On the other hand, too frequent requests and you can congest your network easily as well as cause DOS on your own services.

Clearly we need a fourth option

In my opinion, the best way to tackle BI in SOA is to add publication messages into the contract. By "publication messages", I mean that the service will publish its state either in a periodic manner or per event to anyone who listening. This is a service communication pattern which I call "Inversion of Communications" since it reverse the request/reply communication style which is common for SOA.

To make the solution complete, you can add additional requests/reply or request/reaction messages to allow consumers to retrieve initial snapshots. Following this approach, you get an event stream of the changes within the service in a manner that is not specific for the BI. In fact, having other services react on the event stream can increase the overall loose coupling in the system - for instance by caching results of other services

Why is this better than the other three approaches? For one , you can get a good picture of what happens within the service. However the contract is not specific for the BI and can be used by other services to cache the service state (thus increasing their own autonomy), for reporting (you can see an early draft of the aggregated reporting pattern), and for BI purposes. By working against a steady stream of events, the BI platforms can Analise treands, keep history and get the complete picture they need.

The approach above is sometimes referred to as "Event Driven Architecture" (EDA) and while I (and others) see EDA as another facet of SOA, not everyone agrees. Gartner, for instance, sees EDA as another paradigm and SOA just for request/reply, or client/server. Recently, however, they published a paper that calls the approach described here as "Advanced SOA". I tend to agree more with the "advanced SOA" definition and don't see a contradiction with EDA and the SOA definitions. We are still using the same components and the same relations only adding an additional message exchange pattern into our toolbox.

A note on implementation: If you are implementing SOA over an ESB that is rather easy to implement as most ESBs support publishing events out of the box. Using the WS* stack of protocols, you have WS-BaseNotification, WS-BrokeredNotification and WS-Topic set of standards. If you are on the REST camp, then I guess you will need to implement publish/subscribe by yourself.

Once you have event streams on the network, The BI components grab that data scrub it as much as they like and push it to their datamarts and data warehouses. However, event steams can also enable much more complex and interesting analysis of real time events and real time trend data using complex event processing (CEP) tools to get real-time business activity monitoring (BAM)

You can also get post as as a presentation down loadable from the papers section on my site or directly from here. (The download is about 3MB.)



 
Thursday, January 25, 2007 5:46:20 PM (GMT Standard Time, UTC+00:00)
Many people think of SOA as synchronous RPC (mostly over Web Services). Others say EDA is SOA. And there are many people who say that the best of EDA and SOA is combined in SOA 2.0.

Everybody will agree that there is a request-and-reply pattern and a publish-and-subscribe pattern. It is easy to see that both patterns are an inverse of each other. In my article 'How EDA extends SOA and why it is important' I explained the differences between the two patterns and when to use the one or the other.

Because of the completely different nature and use of the two patterns, it is necessary to be able to distinguish between the both and to name them. You might say making such a distinction is a universal architectural principle. Combining both of the patterns into an increment of the version number of one of them is - IMHO - not a very clever act. I believe it is appropriate and desirable to use the acronyms SOA and EDA to make this distinction, because SOA and EDA are both positioned in the same architectural domain; SOA focusing on (the decomposition of) business functions and EDA focusing on business events.
Saturday, January 27, 2007 7:59:16 AM (GMT Standard Time, UTC+00:00)
There are so many opinions regarding Service Oriented Architecture. These opinions are wondering
around the same thing calling it by different names. Unlike jack, I see it the way
Arnon does. Request/Reply is just another (simple) way of connecting to a service, running
an operation and getting its result back.
What is called EDA is just another method of connectivity. It is more complex since it is
asynchronous but it enables you to do things differently and extends your capabilities.
When speaking of SOA, EDA is no longer a good representation of the solution. We are not
dealing with an entire different Architecture but, in my opinion, an extension to what we
call today SOA. Adding Pub/Sub to your implementation of SOA does not make it EDA, it is
still SOA equipped with some new capabilities, but it is still SOA.
Friday, February 02, 2007 2:24:13 PM (GMT Standard Time, UTC+00:00)
I see SOA and EDA is two architectural styles. SOA has (de)composition of business services as a foundation and EDA has business events as a foundation. It is not about implementation, but about viewpoints. In a strong design both styles are combined. Of course it is possible to implement a request/reply style of interaction using pub/sub: simple separate the issuer of a request and the consumer of the reply on one side and do the same with the consumer of the request and the producer of the reply. RSS is an example of implementing a pub/sub using request/reply. But that's not the issue. It's about a higher level of abstraction to recognize two inverse patterns. No matter how you name them. I named them SOA and EDA, which is perpectly suitable to me. But you may combine the both patterns in one acronym: SOA. My question then will be: how do you - as an architect - distinguish between the two in your communication?
Comments are closed.