September 7, 2007
@ 02:07 PM
In the previous post on the subject I wrote that the RDBMS is dead. I didn't mean that it is dead dead, but rather that it isn't well built to meet some of the newer challenges, like linear scalability, high availability, etc.
Well, it is one thing hearing it from me - and it is another thing hearing it from someone like Michael Stonebraker.
Michael was the main architect of the Ingres prototype project at UC Berkeley just one year after Codd's paper (9 years before Oracle was released and more than a decade before the commercial version of Ingres was released).

Well, that was in 1970. In 2007, Michael wrote:
 
In short, the world of 2007 is radically different from the world of the late 1970s. However, none of the major vendors have performed a complete redesign to deal with this changed landscape. As such they should be considered legacy technology, more than a quarter of century in age and "long in the tooth".
Among the new needs Michael cites are intelligence DBMSs (which need a lot of relations), textual and semi-structured data, etc. He also said (promoting his own product) that 2007 customers expect high availability and linear scalability.
Michael's main point is that specialization can provide significant performance enhancements vs. the one-size-fits-all approach of RDBMSs. He gives his product (Vertica) as an example of how a column-oriented database (vs. the RDBMS row orientation) can outperform RDBMSs by a factor of 50. Google's Bigtable is another example.
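To make the row vs. column distinction a bit more concrete, here is a minimal Python sketch. It is only an illustration of the two layouts, not how Vertica or Bigtable are actually implemented; the sample data and the function names (to_columns, total_row_store, total_column_store) are made up for this example.

```python
from typing import Dict, List

# Hypothetical sample data: three sales records in row orientation.
rows: List[dict] = [
    {"id": 1, "region": "east", "amount": 120.0},
    {"id": 2, "region": "west", "amount": 80.0},
    {"id": 3, "region": "east", "amount": 50.0},
]


def to_columns(records: List[dict]) -> Dict[str, list]:
    """Pivot row-oriented records into one list per column."""
    columns: Dict[str, list] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_row_store(records: List[dict], field: str) -> float:
    # Walks every whole record, even though only one field is needed.
    return sum(record[field] for record in records)


def total_column_store(columns: Dict[str, list], field: str) -> float:
    # Reads a single list and never touches the other columns.
    return sum(columns[field])


columns = to_columns(rows)
```

Both functions return the same total. The difference is in how much data has to be touched to get it: an aggregate over one attribute in the column layout reads only that attribute, which is the kind of access pattern analytic queries tend to have.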

Interesting...


 
June 24, 2007
@ 09:20 AM
A few months ago I wrote here about solving the mismatch between Service Oriented Architecture (SOA) and Business Intelligence (BI) (see the papers and articles section). Recently I got the following question from Ben:
One major question I have is around large data sets. As an experienced BI/DW architect and developer I have worked on a number of large scale data warehouses. Retrieving large data sets (i.e. millions of records) doesn't seem to fit well into SOA. As you state in your article, we could have another point-to-point interface, where the service which houses data we need gets a request and writes out a batch file (xml or plain ascii text). Then using typical ETL, we grab the file and load it. The underlying source system (service) can use optimization in generating a large data set (vs. record by record) and the data warehouse can correspondingly load in bulk.
Like most architectural questions, the answer is "it depends".
For instance, if you run a run-of-the-mill ETL as a one-time setup, then it is just that - a one-time setup - and I, personally, don't see any contradiction between that and SOA goals or tenets.

I do think that it is better to enhance SOA with EDA interactions to provide a long-term solution to the BI problem. You can also have a dedicated component that aggregates the information that flows in with these events and builds batch files suited to the ETL you used during the setup phase (mentioned above).
It is true, though, that moving an SOA which is already in place to EDA is no small feat, but adding EDA layers does not have to mean that the old interfaces go away - especially not immediately (remember to treat services as products).
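A rough Python sketch of such an aggregating component might look like the following. This is only one possible shape, under several assumptions of mine: the class name BatchAggregator, the fixed batch size, and the CSV output are all hypothetical, and a real component would also need to deal with durability, ordering and failures.

```python
import csv
import io
from typing import Dict, List, TextIO


class BatchAggregator:
    """Hypothetical component: collects business events and emits ETL batches."""

    def __init__(self, fields: List[str], batch_size: int) -> None:
        self.fields = fields          # columns the ETL job expects
        self.batch_size = batch_size  # events per batch file
        self._buffer: List[Dict[str, object]] = []

    def on_event(self, event: Dict[str, object]) -> bool:
        """Buffer one event; return True once a full batch is ready."""
        # Keep only the fields the ETL needs; missing ones become None.
        self._buffer.append({name: event.get(name) for name in self.fields})
        return len(self._buffer) >= self.batch_size

    def flush(self, out: TextIO) -> int:
        """Write the buffered events as CSV and return the row count."""
        writer = csv.DictWriter(out, fieldnames=self.fields)
        writer.writeheader()
        writer.writerows(self._buffer)
        count = len(self._buffer)
        self._buffer.clear()
        return count
```

A service's event subscriber would call on_event for every incoming event and, when it returns True, call flush with an open file that the existing ETL job then picks up. The services themselves keep publishing events and never need to know that a data warehouse is listening.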

If you have a business that generates millions of records on a daily basis, then the situation is more complicated. Now you have to weigh the trade-offs between "compromising" SOA by adding a dedicated interface (or a backdoor to the database) for the ETL and the implications (performance, bandwidth, transition costs, ROI, etc.) of pushing that information with EDA.
I personally believe in pragmatism and the "no-silver-bullet" approach, so I can't say that EDA is always the best solution (as an aside, this is part of the reason I write my book as patterns and not as "best-practices guidance"). You may find that ETL is the best trade-off in your situation. Yes, I know that it isn't a definitive answer, but real life is (usually) a little more complicated than black-and-white solutions. As architects we need to find the best trade-off for the situation at hand.