April 27, 2006
@ 02:58 PM

[crosspost from Dr. Dobb's Portal]

Test Driven Development (TDD) is, in a nutshell, writing a unit test up front--making it fail, making it work, refactoring, and repeating until the product is finished. (If this is new to you, read more at testdriven.com.)

So with TDD you get a bunch of unit tests that are also proven as regression tests. That's pretty cool.

TDD also lets you work in small increments while maintaining the working code. That's even cooler.

And lastly TDD has a very good influence on design:

  1. It encourages loose coupling. When you want to make something testable, you want to remove the dependencies it has so you can test it by itself.
  2. It makes you think about the interface of the unit under test--how is the interface going to look?
  3. It makes you think about how the unit under test would be used--that is, the behavior of what you are writing (or designing).
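To make these three points concrete, here is a minimal sketch in Python. The language and every name in it (`InvoiceNotifier`, `FakeMailer`, the invoice fields) are assumptions made up for illustration. The test is written first and pins down the interface and the expected behavior; the mail dependency is passed in so the unit can be tested by itself.

```python
class FakeMailer:
    """Test double standing in for a real mail gateway (hypothetical)."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject):
        # Record the call instead of sending anything.
        self.sent.append((to, subject))


def test_only_overdue_invoices_trigger_mail():
    """Written first: it fails until InvoiceNotifier exists and behaves."""
    mailer = FakeMailer()
    notifier = InvoiceNotifier(mailer)
    invoices = [
        {"customer": "alice@example.com", "days_overdue": 3},
        {"customer": "bob@example.com", "days_overdue": 0},
    ]
    assert notifier.notify_overdue(invoices) == 1
    assert mailer.sent == [("alice@example.com", "Invoice overdue")]


class InvoiceNotifier:
    """Unit under test, written after the test above in order to satisfy it."""

    def __init__(self, mailer):
        # The dependency is injected, so no real mail server is needed in tests.
        self._mailer = mailer

    def notify_overdue(self, invoices):
        """Mail every customer with an overdue invoice; return how many were mailed."""
        count = 0
        for invoice in invoices:
            if invoice["days_overdue"] > 0:
                self._mailer.send(invoice["customer"], "Invoice overdue")
                count += 1
        return count
```

The test can be run with a runner such as pytest or simply by calling the function. Note how writing it first forced two design decisions: the shape of `notify_overdue` and the fact that the mailer comes in from the outside.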

Sounds great to me. I think TDD is a great way to do the detailed design. You specify the results (interface + behavior), then implement that design. One thing I don't buy though is that TDD alone will produce an "emergent design" for the whole system. The way I see it is that you have to do some design up-front (assuming your system is not a trivial one) since TDD, being a coding technique, keeps you working at sea-level.

There's also a fundamental matter of scale. It might be possible in theory to start that 100 man-year project as a single object, then refactor it in baby steps until you get the perfect system, but I believe that if you don't work at a higher level of abstraction (vs. code), you will not be able to partition the system in a reasonable time. This was true when we moved from assembly code to higher-level languages, which enabled us to write much more complex software--and it is true today as we need to answer the ever-changing requirements of modern enterprises.

To sum up, TDD is good for testing and it is a good design methodology for the detailed design level. It can be used to drive the overall design on smaller projects--but on larger systems we need additional methods and tools to cope with the overall design and architecture.


 
April 25, 2006
@ 10:28 PM

Scott Bellware attacks Microsoft's cluelessness about modern development methodologies and tools by talking about the 3 (in?)famous typecasts (the Mort, Elvis & Einstein personas) that Microsoft uses to model developers.

 

While I agree with a lot of the points Scott makes, I think that (unfortunately) there are individuals and organizations that are suited to the Mort-Elvis-Einstein approach - where people are not as smart or competent as Scott is and can't handle agility. There are also situations where agility cannot be practiced, e.g. clients who insist on fixed-price projects and waterfall-ish milestones (where, against our better judgment, we were forced to do a lot of up-front planning that had to be reworked later…), safety-critical systems etc.

 

I much prefer the direction Scott took in an earlier post where he talked about a missing persona - Hugo the Agilist. I think Microsoft will be making a grave mistake if it does not pay attention to the needs of the growing community of developers who prefer agile methods and practices.


 
Tags: Everything | General

April 20, 2006
@ 02:25 PM

Dr. Dobb's has recently launched a new portal site, where they want industry experts to post their views. Well, I guess it is not limited to experts only, as they've also invited me to write there - I'll be writing the blog on software architecture and design.

 

I am going to post there on a daily basis (read: ~5 posts a week). Naturally these are going to be shorter posts that try to highlight and comment on (hopefully) interesting things related to architecture and design (my thoughts, other people's posts, news etc.).


I am still trying to decide on the balance between this blog and the new one, but I guess most longer posts will go here (though I may cross-post them) and whitepapers and presentations will continue to be posted here. Also note that the new blog has a wider scope, as it also talks about design.

 

You will find my new blog "If you build it…will they come" here.


 
Tags: Everything | General

April 18, 2006
@ 07:11 PM

There are 113 domain models of all sorts of things over at the eclipse.org site. I guess most of the models don't have any practical value (what will I do with a metamodel of COBOL?), but there are several interesting ones of things like RSS, WSDL, and SDM.

 

It is also interesting to note that the models are expressed in several forms, including MS DSL models, UML and images. The transformation from format to format was done by code.

 

(Found via Steve Cook)


 

As promised, here is the first pattern. If you like this pattern but think something is missing that would help in understanding it better, please drop me an email: arnon at rgoarchitects.com. Naturally, any other comments are also welcome :)




Getting an SOA right is very hard, not so much because of the technical problems (we know how to deal with those, don't we?), but rather because it is very hard to figure out where to put the borders and keep the right business alignment. Assuming you somehow managed that, the real fun begins - you now have to produce reports, dozens and dozens of reports. Many reports will fall within the boundaries of a single service (if you have a good partition); however, many reports will also require combining data from several services. For example, in a Telco scenario you may have Customer, Billing and Provisioning services (a real-life example would have dozens of additional services). Now a customer calls customer care and you want the CRM to show everything about the customer: what outstanding invoices she has, what equipment and services (GPRS, UMTS, friends and family etc.) she has, what her status as a customer is (loyal, VIP, senior citizen…), open service requests etc. Things get much more complicated when you need to summarize or group data from multiple services.

 

How do you get a decent cross business entities report with the data scattered about in all those services?

 

One possible solution would be to create the report at the consuming end (e.g. the UI): visit all of the services involved, then do all the grouping, cross-cuts etc. This solution is not very good from the performance perspective (you need to get more data than needed and you have to post-process it). It is also problematic from the flexibility perspective: each service involved has to expose interfaces to get the data for the specific query (otherwise you mobilize even more data).
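A minimal sketch of this consumer-side approach, in Python, may make the cost visible. The two service classes are hypothetical in-memory stand-ins for remote services, and their method names and record fields are assumptions, not part of any real contract.

```python
from collections import defaultdict


class CustomerService:
    """Hypothetical stand-in for a remote Customer service."""

    def list_customers(self):
        return [
            {"id": 1, "name": "Dana", "status": "VIP"},
            {"id": 2, "name": "Eli", "status": "regular"},
        ]


class BillingService:
    """Hypothetical stand-in for a remote Billing service."""

    def list_invoices(self):
        return [
            {"customer_id": 1, "amount": 120.0, "paid": False},
            {"customer_id": 1, "amount": 80.0, "paid": True},
            {"customer_id": 2, "amount": 50.0, "paid": False},
        ]


def outstanding_by_status(customers, billing):
    """Report built at the consuming end: total unpaid amount per customer status."""
    # Every customer and every invoice crosses the wire, although the report
    # only needs one number per status.
    status_of = {c["id"]: c["status"] for c in customers.list_customers()}
    totals = defaultdict(float)
    for invoice in billing.list_invoices():
        if not invoice["paid"]:
            totals[status_of[invoice["customer_id"]]] += invoice["amount"]
    return dict(totals)
```

Avoiding the bulk transfer would mean adding a query-specific operation to each service, which is exactly the flexibility problem described above.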

 

Another option is to go straight to the data. You may still need to hit multiple database servers to get to the data, but the performance will be better. The problem is that this throws your service boundaries down the drain and introduces a lot of dependencies.

 

A third option is to create interim services ("Entity Aggregation"). This works fine as long as you have real business reasons to do the aggregations (there is an overhead in adding business logic to handle the aggregated data) and as long as you have only a few of those (or you might end up with a single "service" with all the business in it).

 

Create an Aggregated Reporting Service by building an Operational Data Store (ODS) to enable creating sophisticated reports on otherwise dispersed data.

 

AggreagatedReporting.PNG


 

 

The ODS is similar in concept to a data mart, i.e. the data is subject-based, integrated, scrubbed etc. The main differences, however, are that the data is up to date and that there is little or no history.

 

For incoming data, the Aggregated Reporting Edge performs the data transformations from contract data into reporting data. The service updates the ODS by scrubbing the data (this can be limited unless the data has to go on to a data mart / data warehouse) and then integrating it and denormalizing it into subjects. Incoming report requests fill in the parameters of the pre-prepared reports.
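The flow just described can be sketched roughly as follows. This is only an illustration under stated assumptions: the ODS is a plain in-memory dict rather than a real data store, and the message shapes (`customerId`, `fullName`, `amount`) and method names are hypothetical, not an actual service contract.

```python
class AggregatedReportingService:
    """Minimal sketch: the ODS holds one denormalized row per customer subject."""

    def __init__(self):
        self._ods = {}  # customer_id -> denormalized "customer" subject row

    def _row(self, customer_id):
        # Create the subject row on first sight of a customer id.
        return self._ods.setdefault(
            customer_id,
            {"name": None, "status": None, "outstanding": 0.0, "open_requests": 0},
        )

    # Incoming data: contract data is transformed into reporting data.
    def on_customer_changed(self, msg):
        row = self._row(msg["customerId"])
        row["name"] = msg["fullName"].strip()  # light scrubbing
        row["status"] = msg["status"].upper()  # normalize to one spelling

    def on_invoice_issued(self, msg):
        self._row(msg["customerId"])["outstanding"] += msg["amount"]

    def on_invoice_paid(self, msg):
        self._row(msg["customerId"])["outstanding"] -= msg["amount"]

    def on_service_request_opened(self, msg):
        self._row(msg["customerId"])["open_requests"] += 1

    # Incoming report requests: parameters for pre-prepared reports.
    def customer_overview(self, customer_id):
        row = self._ods.get(customer_id)
        return dict(row) if row is not None else None

    def outstanding_by_status(self, min_amount=0.0):
        totals = {}
        for row in self._ods.values():
            if row["outstanding"] > min_amount:
                totals[row["status"]] = totals.get(row["status"], 0.0) + row["outstanding"]
        return totals


# Usage: updates arrive from the Customer, Billing and customer care services.
svc = AggregatedReportingService()
svc.on_customer_changed({"customerId": 1, "fullName": " Dana Levi ", "status": "vip"})
svc.on_customer_changed({"customerId": 2, "fullName": "Eli Cohen", "status": "regular"})
svc.on_invoice_issued({"customerId": 1, "amount": 120.0})
svc.on_invoice_issued({"customerId": 1, "amount": 80.0})
svc.on_invoice_paid({"customerId": 1, "amount": 80.0})
svc.on_invoice_issued({"customerId": 2, "amount": 50.0})
svc.on_service_request_opened({"customerId": 1})
overview = svc.customer_overview(1)
```

Because the rows are kept current as updates arrive and hold no history, a cross-service report becomes a local read of the ODS instead of a round of calls to every service.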

 

One problem with Aggregated Reporting is that it is not a Business Service (i.e. it is a technical solution rather than a business-oriented one). However, since the data in Aggregated Reporting is read-only (unlike in Entity Aggregation), this doesn't affect the business.

 

Aggregated Reporting is easier to implement when combined with Inversion of Communication.

 

Aggregated reporting with Data Mart/Data Warehouse

 

Instead of just storing recent operational data, this version enhances the depth and complexity of queries that can be executed against the service. The downside is the increased complexity in setting up the data mart - both from the operational costs perspective (e.g. additional storage) and from the design and development perspective (you need to think about long term aspects, indexing etc.) as you also need to scrub data and consider the structure of your schemas much more carefully.

 

 


 

Sidebar: Operational Data Store (ODS)

The ODS is probably the best kept secret of data warehousing technology. It has been around almost as long but it isn't as famous.

The data in the ODS is operational - live data and not static data. The ODS can be thought of as the cache memory of the data mart / data warehouse.

It is important to note that while it doesn't need the same amount of planning and set-up as a data mart, an ODS still requires careful planning in order to bring real business value.

 

The figure below shows the classical usage of an ODS in an OLTP/Data Mart environment.

 

ODS.PNG

Originally it was thought there would be 4 types of ODS:

 

Class I - Near real-time synchronization of the ODS with operational data from the OLTP databases. An implementation of Class I is the preferred type for the Aggregated Reporting pattern

Class II - Update the ODS every four hours or so

Class III - Overnight updates of the ODS

Class IV - the ODS is updated from the data mart / data warehouse

 

In reality there are more variants - for example a powerful (and complex to build) option is to merge a Class IV ODS with one of the other Classes and get.

 

 


 
April 13, 2006
@ 10:29 PM

I decided to write a short series of blog posts on SOA patterns. These patterns are not usable only for SOA; however, I have found them particularly useful in implementing SOAs.

 

This isn’t an exhaustive list of patterns - on the contrary, I'll try not to repeat patterns which are well known (like Entity Aggregation http://patternshare.org/default.aspx/Home.PP.EntityAggregation )

 

I am a little busy these days (e.g. I have to complete an architecture document for one of my projects), so this post will only introduce the (first batch of) patterns, and the following posts in the series will expand on each one (i.e. explain what to do, usage context, consequences etc.). Then, if I get good feedback, maybe I'll publish some more.

 

So, what patterns are we talking about here?

Well:

 

  • Gateway - How do you scale a service without exposing too many endpoints?

 

  • Inversion of Communication - How do I get the data from other services without too much coupling?

 

  • Biztalkize - How do I control volatile behavior inside the service?

 

  • Aggregated Reporting - How do you get a decent cross business entities report with the data scattered about in all those services?

 

  • Transparent Emergence - How do I know where to find a service?

 

  • Decoupled invocation - How can I handle peaks and high-loads without my service failing?

 

  • Orchestrated Choreography - How do I expand the behavior of hard-to-change services (e.g. legacy systems exposed as services)?

 

Well, I hope this sparks enough interest to make you follow the rest of the posts on this subject :)


 

Uncle Bob (apparently Robert C. Martin?) writes about Architecture as a secondary effect.

The article postulates that: <Quote>
  1. The main goal of architecture is flexibility, maintainability, and scalability.
  2. But we have learned that the kind of unit tests and acceptance tests produced by the discipline of Test Driven Development are much more important to flexibility, maintainability, and scalability.
  3. Therefore architecture is a second order effect and tests are the primary effect

 

I had not thought about this before this round table. Here we were, a bunch of architects and designers, strongly debating the role and procedure of architecture, and the conclusion we come up with is that all the effort and struggle we go through results in a secondary improvement in flexibility, maintainability, and scalability. Writing tests (writing them first) has the primary effect

 

</Quote>

I think Bob is missing/downplaying one very important aspect - and that is level of abstraction.

 

  • Many agree (me included) that code is the final design artifact.
  • Many agree (again, me included) that TDD is a powerful design technique (you may want to check out TDD Misconceptions or Rocky Lhotka vs. the world as summed up by Jeremy Miller)

 

However, since code is (obviously) detailed design, you just can't go straight to code (test code or otherwise). In order to cope with a large/complex problem you need to tackle the problem at higher levels of abstraction first. Moreover, you may need to go through several abstraction levels before you start coding anything. Unfortunately there aren't any real options to test models*.

 

In my opinion, for any but the most trivial system you cannot escape designing the architecture in other (non-code) models.

 

The way I see it, the correct approach is to:

  1. Work iteratively
  2. Test early - i.e. make sure, as soon as possible, that the architecture you designed really works (see, for example, my post on evaluating architecture in code)

 

So - Is Architecture a secondary effect?

No, sorry, but I really, really don't think so.



* There are some options to allow simulation and validation (i.e. tests) of models in the embedded world (e.g. http://www.embeddedplus.com/EmbPlusSMST.php or http://www.ilogix.com/sublevel.aspx?id=286) - however I find that these approaches don't scale well to IT problems (which have a lot more variables and are usually much larger than embedded systems) or even to complex embedded systems. You just have to specify too much before you get a usable simulation, rendering the whole effort useless.



 
April 9, 2006
@ 10:16 PM

One of the roles of the software architect is to act as a mentor/coach. After reviewing some of the designs of one of my projects' teams, it seemed the time was ripe for doing just that. Thus, last week I gave them a presentation on the basics of good OO design, which I thought might also be of interest to other people (you can download a copy here - 312KB).

 

The presentation starts with the 7 deadly sins of software design:

  • Rigidity – make it hard to change
  • Fragility – make it easy to break
  • Immobility – make it hard to reuse
  • Viscosity – make it hard to do the right thing
  • Needless Complexity – over design
  • Needless Repetition – error prone
  • Not doing any

 

It is interesting to note that just yesterday I read an interesting piece on what makes good design (i.e. looking from the positive side) by James Shore (found via Sam Gentile).

 

 

The main part of the presentation demonstrates the 5 basic design principles (drafted by people like Robert C. Martin and Barbara Liskov):

  • OCP open-closed principle - a class should be open for extension but closed for modifications
  • SRP single responsibility principle - a class should have a single responsibility
  • ISP interface segregation principle - there should be separate interfaces for different consumer types
  • LSP Liskov substitution principle - basically design by contract - a sub-class should fulfill the same expectations its superclass sets
  • DIP dependency inversion principle - classes should depend on abstractions, class consumers should depend on abstractions and abstractions shouldn't depend on details.

 

These principles are the basis for some of the techniques widely used today - a few examples include:

  • Inversion Of Control - builds on OCP
  • Dependency Injection - a mechanism to allow DIP
  • Contract First - builds on LSP and DIP
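As a rough illustration of how these principles and techniques fit together, here is a small Python sketch. It is not taken from the presentation; the class names (`Formatter`, `CsvFormatter`, `PipeFormatter`, `ReportGenerator`) are invented for the example.

```python
from abc import ABC, abstractmethod


class Formatter(ABC):
    """The abstraction that both the high-level class and the details depend on (DIP)."""

    @abstractmethod
    def format(self, rows):
        """Turn an iterable of row tuples into a single string."""


class CsvFormatter(Formatter):
    def format(self, rows):
        return "\n".join(",".join(str(v) for v in row) for row in rows)


class PipeFormatter(Formatter):
    """Added later without modifying ReportGenerator (OCP)."""

    def format(self, rows):
        return "\n".join(" | ".join(str(v) for v in row) for row in rows)


class ReportGenerator:
    """High-level policy: it knows nothing about concrete formatters."""

    def __init__(self, formatter: Formatter):
        # Dependency Injection: the collaborator is handed in from outside.
        # Any subtype that honors the Formatter contract can be substituted (LSP).
        self._formatter = formatter

    def render(self, rows):
        return self._formatter.format(rows)
```

For example, `ReportGenerator(CsvFormatter()).render([(1, "a"), (2, "b")])` produces two comma-separated lines, and switching to `PipeFormatter` changes the output format without touching `ReportGenerator`.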

 

At the end of the day, following these principles helps manage class dependencies and increases overall loose coupling and cohesion, thus increasing the overall quality of the design. It sometimes amazes me how much using just a few simple rules can improve the maintainability, flexibility and usefulness of designs.