(This is part III of a series - see Part I, Part II)

We need to face it: there are no absolute truths when it comes to software architecture (I guess that's part of the reason the term always looks so fuzzy). Should we use REST? It depends. Should we use O/R mapping or direct database access? It depends. Sometimes even a big ball of mud can be a good option. The good news is that we can always answer "It depends" to any architectural question and always be correct. The bad news is that it is our role to figure out what it depends on and come up with a viable trade-off.

The fact that everything is a trade-off doesn't mean that there aren't cases where the trade-off is simple. E.g. if you just have to build a couple of data entry screens and a simple database, you probably shouldn't spend a couple of months evaluating your options; just write the damn thing in your spare time. Nevertheless, just hacking it can mean it won't be extensible. If that's what you needed then cool; if not, maybe you should have considered that.

This is one of the reasons I don't like the term "best practices" - it sends out a "you don't have to think anymore" message, which is oh-so-incorrect. Patterns, on the other hand, don't send out this message, as they also include a discussion of where to use them as well as their limitations and pointers to other patterns. Unless, of course, patterns are seen as an end goal rather than a means, in which case they look pretty close to "best practices".

To sum up this post - everything is a trade-off. The most important bit here is to keep that in mind, even when the choices seem obvious. Awareness is the key to better decisions.







 
August 31, 2008
@ 06:59 PM
Software architecture is not a three-layer diagram (UI - Business logic - Data)! As an architect you need to consider the project/solution at hand from a lot of different angles and take care of all sorts of concerns: technical, team, managerial and even esoteric ones.

  • Technical - you need to consider things like threading models, data flow, data structures, testability, security and user interface. A solution is only as strong as its weakest link. If the UI is great but the system is not stable, you will fail. If the system scales well but the security is flawed, you will fail.
  • Team - you have to understand the limitations and capabilities of your teams and their structure. If everybody knows COBOL, maybe Rails is not a good option. If the teams are spread out geographically, you need to make sure you partition the system into chunks that will allow them to progress as independently as possible.
  • Managerial - this brings questions like: What's the budget for the project? How much time do we have? Can you really plan on using a rule engine if it eats 80% of your budget (maybe?)? Can you plan a cool piece of infrastructure if it will take 4 months to build?
  • Esoteric - sometimes you even need to consider less common stuff. There aren't general examples here since it's, well, esoteric, but a couple of examples I've seen include a project where we had to check how much power our hardware needed (since it was to be deployed on a truck), and another project where we designed a multi-monitor UI and had to figure out whether it is better to design the UI for side-by-side monitors or for one atop the other.
You should note that you will most probably not be able to master all the needed domains. You do need to be aware that they exist and work with other experts to cover the other bases of the solution.



 
As a follow-up to the previous post: there are a few things I consider important for software architects. I wouldn't go as far as to call them axioms, but I still think they are worthwhile.
  • Think holistically
  • It is always a trade-off!
  • Pay attention to your soft skills
  • Get SPAMMED for architecture
I wrote a lot about the fourth one (see for instance this article in Dr. Dobb's). I'll expand a little on the others in the next few posts.


 
August 30, 2008
@ 09:44 PM
I recently stumbled on "97 Things - Things every software architect should know" (via Bobby Woolf). This is a list of axioms for architects (which will eventually be a book by O'Reilly) edited by Richard Monson-Haefel. While I don't agree with all the axioms, and I feel some of them overlap a bit (e.g. one on trade-offs and one on balancing), there's a lot of great stuff there.
For instance, here are a few of the ones I like:



 
Following my latest post on evolving the architecture, Dru asked me for more details on our RESTful control channels.
For one, you can take a look at slide 25 of my presentation on REST, which talks about the Sessions resource. The Sessions resource returns an AtomPub feed of the currently active sessions, and if you follow a link to a session you get its current status, the URIs of the participating resources etc.
I guess the more interesting questions (especially in light of all the ongoing REST debate we now see) are:
  1. Why rely on REST for the control channel?
  2. Why not use REST for the whole system?
So, why is REST a good option for the control channel?

  • The REST architectural style in general, and REST implementations using web standards (HTTP, AtomPub etc.) in particular, bring a lot of benefits in integration (what is easy for humans to understand is easier to implement).
  • Another reason for REST (over HTTP) is standardization across languages and platforms. Every language and platform I've used has an implementation that allows sending and receiving HTTP messages. We have a few components running on Linux and others running on Windows, and we're planning even more heterogeneity down the road.
  • Lastly, REST allows for easy debugging and run-time interaction. This proved invaluable during system integration tests, where we could easily understand the current state of each of the components in the system as well as the general picture.
OK, if everything is so good, why not use REST for the whole system? Well, because like any architecture or architectural style (especially when incarnated in a technology), REST has things that it does well and things that it doesn't (personally, I don't buy the Only Good Thing(tm) for anything, or as Brooks puts it, there's no silver bullet).
Let's look at message exchange patterns, for instance. REST over HTTP supports the request/reply pattern.
This works extremely well in many business situations. For instance, if we have an Order service (or resource, for that matter) and we need to calculate the discount for a specific customer, we can go to the Customer service, get her current status and check whether she is a VIP customer, a senior citizen etc.
There are, however, places where it doesn't work as smoothly. Returning to our Order, let's consider what happens once the order is finalized and we need to both start handling it (notify the warehouse?) and invoice it.
The Order service does not care about these notifications; it isn't its business.
My favorite way to solve this is to introduce business events (incorporating Event Driven Architecture) so that the interested parties get notified. Another common way to solve this is to introduce some external entity to choreograph or orchestrate it (BPM etc.). Both options have different constraints and needs compared with REST. In my organization we have a lot of processes that lend themselves to event processing much better than they do to REST over HTTP (though the implementation might end up aligned with the REST architectural style - I am not sure yet).
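
To make the business-events idea a bit more concrete, here is a minimal in-process sketch. Everything in it is an assumption made for illustration: the `EventBus` class, the `order.finalized` topic name and the two subscribers are hypothetical, and a real system would sit on messaging infrastructure rather than a Python dictionary.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class EventBus:
    """Hypothetical in-process stand-in for real messaging middleware."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # The publisher does not know who (if anyone) is listening.
        for handler in list(self._subscribers[topic]):
            handler(event)


def finalize_order(bus: EventBus, order_id: str) -> None:
    # The Order service only announces the fact; it never calls
    # the warehouse or invoicing directly.
    bus.publish("order.finalized", {"order_id": order_id})


bus = EventBus()
handled: List[str] = []
bus.subscribe("order.finalized", lambda e: handled.append(f"warehouse:{e['order_id']}"))
bus.subscribe("order.finalized", lambda e: handled.append(f"invoicing:{e['order_id']}"))
finalize_order(bus, "42")
```

The point of the sketch is only the direction of knowledge: adding a third interested party means one more `subscribe` call and no change to `finalize_order`.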

Another reason not to use REST is when you have to integrate with stuff that isn't RESTful. For instance, we need to integrate with systems that use RTP and other such protocols, so we are bound to that - and we are a startup with "green field" development. In an established enterprise the situation is much more complicated.

To sum up, in my opinion, when you take a holistic view of a complete business you are bound to see places where different architectural principles are a good fit. Architectural styles (and architectural patterns) are tools you can use to solve the challenges. There are places where a hammer is a great fit, but it is also wise to make sure the toolset has more than just a hammer.

PS

It isn't that you can't do events with REST over HTTP. E.g. you can implement the events as an Atom feed and have the "subscribers" check this feed every once in a while (the way this blog works). A subscriber can even check the HTTP headers before getting the whole feed. Still, push is a more natural implementation for this, for various reasons: you don't have to know where to find the event source, you can more easily improve latency (when needed) etc.
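
The polling variant can be sketched roughly as follows. To keep it runnable without a network, the server is faked: `FakeFeedServer` and `PollingSubscriber` are hypothetical names, and the feed is a plain list rather than real Atom XML. The only real HTTP mechanics assumed are the `ETag` / `If-None-Match` headers and the `304 Not Modified` status used for conditional GETs.

```python
from typing import Dict, List, Optional, Tuple


class FakeFeedServer:
    """Hypothetical stand-in for an HTTP endpoint exposing events as a feed."""

    def __init__(self) -> None:
        self.entries: List[str] = []

    def etag(self) -> str:
        # Assumption: the validator simply changes whenever an entry is added.
        return f'"v{len(self.entries)}"'

    def get(self, headers: Dict[str, str]) -> Tuple[int, Dict[str, str], Optional[List[str]]]:
        # Conditional GET: answer 304 with no body if the client is up to date.
        if headers.get("If-None-Match") == self.etag():
            return 304, {"ETag": self.etag()}, None
        return 200, {"ETag": self.etag()}, list(self.entries)


class PollingSubscriber:
    def __init__(self, server: FakeFeedServer) -> None:
        self.server = server
        self.last_etag: Optional[str] = None
        self.seen = 0  # number of entries already processed

    def poll(self) -> List[str]:
        headers = {"If-None-Match": self.last_etag} if self.last_etag else {}
        status, response_headers, body = self.server.get(headers)
        if status == 304 or body is None:
            return []  # nothing new, and no feed body was transferred
        self.last_etag = response_headers["ETag"]
        new_entries = body[self.seen:]
        self.seen = len(body)
        return new_entries


server = FakeFeedServer()
subscriber = PollingSubscriber(server)
first = subscriber.poll()
server.entries.append("order 42 finalized")
second = subscriber.poll()
third = subscriber.poll()
```

Note what the sketch makes visible: the subscriber has to know where the feed lives, and an event is only noticed on the next poll, which is exactly the latency and discovery cost mentioned above.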

 
Retrospectives: every "agile" team does retrospectives. What are retrospectives anyway?

A retrospective is a meeting where the team looks back at and inspects the past in order to adapt and improve the future.

Agile or not, our team does a retrospective at the end of each iteration (every two weeks in our case). We try to look at what worked, what didn't, how we are meeting our goals, how the product is going etc. These meetings provide a lot of value in steering us in the right direction.
Ongoing retrospectives that look at the near past allow for suppleness and adaptation to change, and they are very powerful at that. However, it is sometimes worthwhile to reflect over longer periods of time.

One area where a longer perspective is important is the architecture of the project. When evolving an architecture you run the risk of accepting wrong decisions, mostly because architectural decisions have long-term implications, while YAGNI, time constraints and life in general drive you toward short-term gains.

Again, taking an example from my current project: working towards the first release, we made a few major decisions during development, e.g.
  • federated resource management - Taking into consideration the fallacies of distributed computing we decided that we'd have local resource managers that will take care of resource utilization and allocation. The resource managers will have a hierarchy where they'd communicate with each other to gain the "bigger picture"
  • Introduce Parallel Pipelines - handle image understanding by dividing the work between specialized components.
  • RESTful control channel - to use a "lingua franca" between all component types so that we can easily integrate across platforms and languages
  • local failure handling - resources and components handle failure by themselves
  • Communication technology (WCF in our case) is isolated from the business logic by an Edge Component
  • etc.
Once we finished delivering the first release, we took a few "days off" to consider what we had done thus far. We updated our quality attribute list based on what we learned from working with the system and from looking at some customer scenarios, studied the things we liked and didn't like in the design and architecture of the working system, and revised a few of our decisions. For instance:
  • We found that in rushing to a working system we introduced some excess coupling to a specific technological solution (for video rendering). We initiated a few proofs of concept and found out how to both isolate the technology from the rest of the system and allow more technology choices.
  • We found that some of the data flows were not as clean as we thought they'd be - adding new features caused more resource interactions than we expected when we partitioned the resources. We redefined some of the resource roles to get less message clutter (and higher cohesion).
  • The federated resource management works well, but introduces needless latency in session initiation. We have now opted to introduce "Active Services", which are more autonomous.
  • Add a Blogjecting Watchdog in addition to local failure handling, both to increase the chances of failure identification and recovery and to get a better picture in a centralized Service Monitor.
  • The RESTful control channel worked well and will continue in later releases.
  • Some of the scale issues will be handled by introducing "Virtual Endpoints", while others will continue to use autonomous endpoint creation and liveliness dissemination (hopefully learning from the mistakes of others).
  • etc.
The result of these and the other decisions we've made is a rework plan that will (hopefully, anyway) make our overall solution better.
What we see is that we evolved our architecture as we went forward. While all the decisions we made seemed right at the time we made them, only by reviewing them from a wider perspective (an architecture retrospective) could we identify the decisions that we need to change and the ones that we have to enhance. The insights you gain after working on a project for a while are much better than the initial thoughts you have or the understanding you reach in the initial iterations.
I think it is essential to review the architecture once you've gained more experience with the realities of the system you write (vs. the perceived realities you have at the outset).

By the way, if you work with a waterfall approach your situation is worse. In that case you make your decisions before you write any code, so you don't even have the benefit of POCs and working code to enhance your insights.


PS
If you have the MEAP version of SOA Patterns you can read more on the patterns I've mentioned here: Active Service in chapter 2, Blogjecting Watchdog in chapter 3, Service Monitor in chapter 4, Parallel Pipelines in chapter 3 and Edge Component in chapter 2.


 
DZone recently published an interview with me on my SOA Patterns book. Along with the interview you can also download chapter 2 of the book (I think you need to be a DZone member to actually download it).

Chapter 2 includes the Edge Component, Service Host, Active Service, Transactional Service and Workflodize patterns. Additional downloads related to the book include
Lastly, you can download the first version of chapter 1, which I mention in the interview, and the slides of a presentation on a few of the patterns from Dr. Dobb's Architecture and Design World last year.


 
July 24, 2008
@ 09:49 AM
Every Thursday we have this "happy hour", you know, beers, snacks etc. Every other week or so we also try to make it educational and, after socializing for a while, hear a presentation or a webcast.

I used this week's slot to present the REST architectural style. I think the presentation turned out pretty well so I thought I'd share it online (note it is a 6M ppt).


 
July 12, 2008
@ 10:30 PM
My friend Gunnar Peterson asked for my opinion on SOA and security concerns. Here's what I wrote him:

In a paper I wrote a couple of years ago I examined the relevancy of the “fallacies of distributed computing” defined by Peter Deutsch almost 20 years ago. Writing about the “Network is Secure” fallacy, I wrote that after all these years you would think that the fact you cannot assume the network is secure would be a no-brainer. Alas, it still happens all the time - and that's for "regular" distributed systems.

In my opinion, assuming the network is secure for an SOA is not only naïve but negligence, pure and simple. The whole premise of moving an organization to SOA is connectedness and integration. So, unless your SOA fails, it will be connected to other systems. Whether you are building RESTful systems, WS-* SOAs, EDAs or any combination of these architectural styles, if you don't treat the service boundary as a border and secure it - you will be sorry…

Security in SOA should be considered at the "grand-scheme" level, with issues like authentication and authorization, but also at the single-service level, looking at issues like DDoS, SQL injection, elevation of privilege and what not. A trivial thing like exposing a transaction beyond service boundaries can translate to an attacker denying service in your system simply by locking out your database. Again, this is just a simple example.

The other thing about security is that you have to consider it early. Patching security in "later on" can have devastating effects on a system's capabilities, especially in areas related to performance. I have even seen military systems that had to go through serious rework just because security was added as an afterthought instead of being handled early on.


 
[It has been a little rough last week between a looming milestone @ work and my son fracturing his elbow @ home but hopefully I'll be back to the regular schedule this week]

Stateless services are da bomb, right? They are easy to scale (since they have no state you can deploy as many as you like), they are easy to reuse (no state - no baggage) and what not.
The only problem with that is that the state doesn't really go away. Stateless services just suffer from NIMBYism ("Not in my back yard") when it comes to state. A stateless service needs to be stateful when it performs its action, and since the state is not there, it has to get it from somewhere.

There are basically two approaches to getting the state into the stateless service.
The common way is to make the state someone else's problem (usually that would spell a database). With this approach the stateless service performs queries (database or otherwise) to get the state from the 3rd party. This is problematic in many ways, e.g.
  • You need to pay network tax for getting the state (remember the fallacies of distributed computing...)
  • If that someone else is a single source (such as a database) it can easily become a barrier to scalability (I wrote about the RDBMS problem in the RDBMS is dead). If it isn't a single source you need to go to multiple sources, so the network problem multiplies.
  • You need to pay network tax for putting the state back in the state repository
The other way to get the state is to put the state on the message - the "document" approach. This approach is superior to the previous one as you get to piggyback the data on the request. This is a good example of stateless communications*, which, as a side effect, can save the stateless service the problems mentioned above.
The "state on the message" approach works when the handling of messages is serialized, i.e. only one "station" in the flow can make changes to the state at any one time. Unfortunately this only works for a subset of the interactions you can have. In most cases multiple consumers need to get to the same data or coordinate.
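
Here is a small sketch of the "state on the message" approach, under made-up assumptions: the `OrderDocument` fields, the two stations and the 10% VIP discount are all hypothetical and only serve to show the shape of the idea. Each station is a pure function: it takes the document, returns an updated copy, and keeps nothing between calls.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class OrderDocument:
    """Hypothetical message that carries all the state a station needs."""
    order_id: str
    items: Tuple[Tuple[str, int], ...]  # (item name, price in cents)
    customer_status: str = "regular"
    discount_cents: int = 0
    total_cents: int = 0


def apply_discount(doc: OrderDocument) -> OrderDocument:
    # Station 1: no database lookup, the customer status travels with the message.
    subtotal = sum(price for _, price in doc.items)
    rate_percent = 10 if doc.customer_status == "vip" else 0  # assumed rule
    return replace(doc, discount_cents=subtotal * rate_percent // 100)


def compute_total(doc: OrderDocument) -> OrderDocument:
    # Station 2: works only on what station 1 put on the document.
    subtotal = sum(price for _, price in doc.items)
    return replace(doc, total_cents=subtotal - doc.discount_cents)


incoming = OrderDocument("42", (("book", 3000), ("pen", 500)), customer_status="vip")
result = compute_total(apply_discount(incoming))
```

Because the stations run one after the other on a single copy of the document, any number of instances of each station can be deployed. The sketch also hints at the limitation: if two stations had to update the same document concurrently, there would be no single copy to pass along.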

You can also combine the two approaches and sometimes get good results.
Another way altogether is to look at stateful services, which I'll talk about in the next post.



* Many times people fail to make the distinction between stateless services and stateless communications - I'll expand on that in another post.


 
Simon @ CodingTheArchitecture recently asked "How big is your software architecture document? (and who reads this stuff anyway?)"
He notes that in a UG meeting most of the attendees had SADs that were more than 50 pages long.
It will probably not be too surprising if I say that in my opinion the answer is that it depends. Reflecting back on some of my past projects, I had SADs that ranged from a 200+ page "write-only"* document to a lean document of less than 10 pages. The sizes matched the intended usage of the documents. Take the two extremes mentioned: the first was a huge mission-critical project with a specific requirement from the customer to have an "official" SAD, and it was written to satisfy a project milestone (PDR). The second was an agile project where the architecture document was a working document, written some 10 iterations into the development to highlight some of the emergent guidelines.

What is common to all the SADs I wrote (or was responsible for) is that they all tried to grasp the essence of the design, all used multiple viewpoints to describe the solution, all were focused on quality attributes and all explained the rationale behind the decisions.
  • If you drone on endlessly with details, you can't see the forest for the trees.
  • If you don't use multiple views, you are likely to miss important aspects of the solution.
  • If you aren't focused on quality attributes, then you are most likely documenting design and not architecture.
  • And if you don't explain the rationale, then the document doesn't have a lot of added value beyond the code itself.
In any event, the important thing here is that when it comes to Software Architecture Documents "size doesn't matter" :). What matters is that the SAD satisfies the purpose it was written for.


*While this particular SAD was rather long, it also had a section that helped potential readers find the relevant chapters, so that it could actually be usable and not just serve as a "door stopper".

 
Someone calling himself r r left the following comment on part IV of my series of posts on SOA definition:

"I keep trying to read this series on SOA unfortunately suffers from the same disease as the rest of literature on the subject. stays general to a comfortable level so it can't really be applied anywhere, tends to complicate things where is not clear if it's needed, and encourages philosophical debate on what ultimately is a business (and so concrete) requirement. Meanwhile the serious (IMO) issues stay untouched - how does one actually approach an integration project with functionality, performance and security in mind. Which should be the standards used (considering the tens of standards on WS out there). How granular should the WS be (I'm done with answers like "not too much, but enough", or "well, depends on your project"). "
Before I talk a little about the "serious issues" mentioned above, I want to point out that the point of this series of posts, as stated in the first post, is to take a formal / semi-academic look at SOA. I started these posts as a reaction to a comment that Pete Lacey left on my blog stating that my view of SOA (as published in "What is SOA anyway?") does not demonstrate that SOA is an architectural style. I don't pretend that this is some fully thought-out academic dissertation or anything, but I do try to look at the architectural roots of SOA.

That said, let's take a look at the more interesting parts of this comment. First, the thing that bothers me about this reaction is (what seems to me to be) the quest for final and concrete recipes. For instance, consider the comment on service granularity:
"How granular should the WS be (I'm done with answers like "not too much, but enough", or "well, depends on your project")"
The problem is - it does depend! And if you forgive me for taking another philosophical detour, if you try to provide a hard definition for service granularity you get something like the heap paradox: when you remove individual grains from a heap of sand, is it still a heap when one grain remains? So while it is obvious that hiding a complete system as a single service is wrong and that exposing every little object as a service is wrong (even though for some inexplicable reason Juval Lowy seems to think that the latter is good practice), it isn't really obvious when you get too granular.

Nevertheless, it is not a pure guess either. You can use some guidelines and measure them against your specific project/system/enterprise needs. Personally, the set of guidelines I use is based on the fallacies of distributed computing:
  1.  The network is reliable
  2.  Latency is zero
  3.  Bandwidth is infinite
  4.  The network is secure
  5.  Topology doesn't change
  6.  There is one administrator
  7.  Transport cost is zero
  8.  The network is homogeneous
Since a service edge is a boundary which may be (and usually is) accessed remotely, you need to think about the incoming and outgoing interactions of the service in light of the fallacies stated above. If the proper behavior of the service depends on one of them holding true, there's probably something wrong.

Regarding the other questions (how do you approach a real system), well, if you pardon me for banging my own drum, that's exactly why I started to write up my experience on these matters as patterns. For instance, if we look at the Saga pattern (one of the patterns I published online), you'd see that it talks about achieving distributed consensus in a transaction-like manner. I talk about the problems of using distributed transactions etc., offer an architectural solution (the saga) and then discuss relevant technology issues (e.g. WS-BusinessActivity) as well as its implications from a quality attributes perspective (integrity and reliability). Nevertheless, even these patterns aren't an end-all solution. Different circumstances require different solutions.
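
For readers who haven't met sagas, here is a deliberately simplified sketch of one common reading of the idea: a sequence of local steps, each paired with a compensating action that is run if a later step fails. This is not the pattern as written up in the published text; `run_saga`, the step names and the in-memory log are all assumptions made for illustration, and a real saga would span services and messages rather than function calls.

```python
from typing import Callable, List, Tuple

# (step name, action, compensating action) - all hypothetical
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"compensated:{done_name}")
            return False
        log.append(f"done:{name}")
        completed.append(step)
    return True


def decline_payment() -> None:
    raise RuntimeError("payment declined")


log: List[str] = []
succeeded = run_saga(
    [
        ("reserve_stock", lambda: None, lambda: None),
        ("charge_card", decline_payment, lambda: None),
        ("ship", lambda: None, lambda: None),
    ],
    log,
)
```

There is no lock held across the steps; consistency is reached by undoing work, which is the trade-off a saga makes compared with a distributed transaction.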
Both my previous job and my current one involve building a scalable solution on top of algorithmic engines. In my previous job I managed the construction of a biometric solution that allows using multiple biometrics. In my current job I manage the development of a mobile visual search solution. On the surface both need to get some data, run a few algorithms and produce an answer, yet these systems have very different quality attributes. The first system had to handle very large databases and hundreds of queries, with an emphasis on modifiability and security; the current one needs millions of queries, almost no database and low latencies, with an emphasis on usability. These differences result in radically different solutions, with different services, different interactions, use of different patterns etc. There's no "one right answer" (tm).


 
This post is part of a series of posts trying to define SOA as an architectural style. In the previous post I talked about SOA and the Layered architecture style (which generated a couple of follow-ups - one on layered architecture in general, one on its importance for SOA and one on layers in enterprise architecture vs. solution architecture).

The next architectural style SOA builds on is Pipes and Filters. Unlike Layers and Client/Server, which I described in previous installments, Pipes and Filters is not also a base style for REST. Basically, this style is where SOA and REST begin to diverge.
The pipes and filters architectural style defines two types of components - yep, you've guessed it, Pipes and Filters.
Filters - independent processing steps. They are constrained to be autonomous of each other and not share state, control thread etc.
Pipes - interconnecting channels.


Each filter exposes a relatively simple interface where it can receive messages on an inbound pipe, process them and produce messages on outbound pipes. The idea behind this is to allow easy composability, thus allowing greater usage (also known as "reuse" - I'll discuss the difference in another post). Systems are composed of several filters working together, filters can be replaced with newer versions (provided they keep the same interface) etc.
On the downside, the overall latency is increased, since to accomplish a task you have to move from filter to filter.
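
As a toy illustration of the style (not of SOA itself), here is a sketch using Python generators. The filters are generator functions that share no state, and the "pipe" is simply the iterator handed from one to the next. The three filters and the `compose` helper are made up for this example.

```python
from typing import Callable, Iterable, Iterator

Filter = Callable[[Iterable[str]], Iterator[str]]


def drop_blank(messages: Iterable[str]) -> Iterator[str]:
    for message in messages:
        if message.strip():
            yield message


def normalize(messages: Iterable[str]) -> Iterator[str]:
    for message in messages:
        yield message.strip().lower()


def tag(messages: Iterable[str]) -> Iterator[str]:
    for message in messages:
        yield f"processed:{message}"


def compose(*filters: Filter) -> Filter:
    """Wire filters together; the wiring lives outside the filters themselves."""
    def pipeline(messages: Iterable[str]) -> Iterator[str]:
        stream: Iterable[str] = messages
        for one_filter in filters:
            stream = one_filter(stream)
        return iter(stream)
    return pipeline


pipeline = compose(drop_blank, normalize, tag)
output = list(pipeline(["  Hello ", "", "WORLD"]))
```

Swapping `normalize` for a newer version only requires that it keeps the same interface, and every message still has to pass through all three filters before it comes out, which is where the added latency comes from.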

The pipes and filters style brings to SOA things like the autonomy of services, the sense of explicit boundaries. For instance, this is the basis for why you wouldn't want to do distributed transactions across service boundaries, which I blogged about several times before.

The pipes part of "pipes and filters" also means that the wiring can be taken care of outside of the services themselves and that you can control it externally. This works well with the use of middleware (a service bus). Additionally, Fielding (you know, the REST guy) also mentions that
"One aspect of PF styles that is rarely mentioned is that there is an implied "invisible hand" that arranges the configuration of filters in order to establish the overall application. A network of filters is typically arranged just prior to each activation, allowing the application to specify the configuration of filter components based on the task at hand and the nature of the data streams (configurability). This controller function is considered a separate operational phase of the system, and hence a separate architecture, even though one cannot exist without the other."
Which is the harbinger of the orchestration/choreography aspects of SOA.

So as you see, pipes and filters is one of the important pillars of SOA. In the next part (unless I have to clarify things about this post) I'll talk about the last architectural style SOA builds upon: "Distributed Agents".


 
March 6, 2008
@ 09:14 PM
Jack Van Hoof has a different view than I have on the difference between Tiers and Layers. I am not sure I agree with his view, but it still provides an interesting read. I think the main difference between our respective views is that Jack looks from the enterprise-architecture angle, which gives him layers like
  • Technical infrastructure - OS, directory Services etc.
  • Application infrastructure - Apps, Portals, DBMSs
  • Application Landscape - SOA, EDA
  • Business Processes - BPM
Jack uses the term tier for layers within the same level of abstraction. For instance, he gives the following examples:
"E.g. the layer of business services may be arranged in the tiers: front-office, mid-office and back-office. At the next lower layer, the application layer, services may be arranged in the tiers: UI, business logic and data persistency. The interaction of services between two tiers may be bidirectional (but may also be constrained to unidirectional). "
The perspective I have (or at least try to maintain in this blog) is that of solution/product line architecture - basically living within Jack's application layers. So in my view I want to differentiate between having a UI and business logic live on the same machine vs. having them distributed around the world. So I guess in the end both perspectives have their place, and the problem, like many other times, is overloaded terms.


 
Great news. Two of my friends and fellow DDJ bloggers, Eric Bruno and Udi Dahan, have agreed to join my (now our) SOA Patterns book, which will be published by Manning.

Both Udi and Eric are competent and experienced architects who have experience designing SOAs. On the technology side, Udi (“The Software Simplist”) specializes in .NET development, e.g. his nServiceBus framework - which is a very good example of an endpoint-ware ServiceBus (vs. middleware ServiceBuses, which is what most ESBs are). Eric, on the other hand, is a Java and C++ expert. Eric is the author of Java Messaging (one of the best books on JMS and web services) and also has a lot of experience in financial systems. Together, the three of us bring a lot of real-life experience of building large and complicated systems into this project.

The current game plan is for Eric to focus on the SOA pitfalls (“anti-patterns”) part of the book, Udi to provide a “putting-it-all-together” chapter, and for me to cover what’s left. I am sure, however, that their experience and insight will also help make the other parts of the book (even?) better.

If you are not familiar with the book, you may want to take a look at the first chapter and/or some of the published patterns like Saga, Service Firewall, Gridable Service, Edge Component and (a very early draft of) the Aggregated Reporting pattern. You can also take a look at the slides from my "SOA Patterns" presentation at Dr. Dobb's Architecture & Design World last year, which illustrate some additional patterns.


 
February 23, 2008
@ 08:38 PM
In the previous post I mentioned a couple of questions on SOA and layers Udi left on an older post I made:

1. How does this [layers - ARGO] play with two services talking with each other? One pubs to the other's subs? The other requests to the first's response?
2. How valuable is the layered abstraction?


1. As I explained in the previous post. Layers does not necessarily mean unidirectional relation from a top layer to a lower level one - it does mean that a layer can only know a layer that is diretly above or below it. In other words the bidirectional interaction between two services  i.e. the request, reactions, events etc. flowing between them do not violate the layered style constraints.

2. So, how valuable is the layered abstraction to SOA? The short answer - very :). Again, as I mentioned in the previous post, the main reason layers don't seem that valuable is that they've been misrepresented and misused. Layers bring added flexibility to SOA. The fact that a service or any other SOA component cannot see beyond the next layer enables things like the ServiceBus, Edge Component, Service Firewall etc. Without layers it would be harder to have autonomous services, as other services could (potentially) have access to the innards of the service, adding more coupling and preventing independence.



 
February 5, 2008
@ 11:29 PM
Following the third part of my "Defining SOA" posts, Udi Dahan left the following two questions:
How does this [layers - ARGO] play with two services talking with each other? One pubs to the other's subs? The other requests to the first's response?
How valuable is the layered abstraction?

Considering that Udi and I usually see eye to eye, I guess that if I managed to get him confused, more clarification is warranted :) I'll do that in two posts: this one, which explains the concept of layers, and the next, which explains why it is paramount to SOA (and answers the two questions).

Usually when I review an architecture, one of the first (sometimes the only) architectural artifacts I am shown is a "layered diagram" of the architecture, e.g. something like the following:


These sorts of half layers/half block diagrams, with or without the common 3 layers of "UI", "Business" and "Data" (which also appear in the diagram above), give the whole idea of layers a bad name.

The key differentiator between layers and just a bunch of blocks is the limitation on the allowed communication paths between the components (layers vs. blocks). In the previous post I quoted an old (2005) definition I had for layers, where I said the following:
"Typically a layer is allowed to call only the layer below it and be called only by the layer above it (but there are variants e.g. a layer can call to any layer below it; vertical layers that can call multiple layers; etc. -- all is fine as long as the layers communication paths are limited by some rules)."
Alas, I was too quick on the copy/paste and everything in the brackets is wrong. It should actually say "but there's another variant where layers are allowed to call the layer above it or below it". The other variants (like a layer that can call any layer below it) just muddy the water, make it hard to distinguish between layers and regular components, and thus make layers seem unimportant. Consider the following diagram:


So, in the above diagram the relations are that Component D and Component C know each other, and Component D is made of two layers (A and B). Note that a more proper representation should also explain the relations allowed between the layers, i.e. whether they are unidirectional or bidirectional (unless there's a common convention in the project).

Why is the distinction between layers and other types of components important? Because layering gives you some benefits which "just components" don't:
1. Layers allow information hiding, since we don't know the inner workings of what's beyond the layer.
2. Layers allow separation - things beyond the immediate layer are hidden from each other. This means that things which are beyond the layers are loosely coupled in a way that allows for flexibility and the addition of capabilities, for instance adding a firewall between your computer and the internet.
3. Layers allow changing the abstraction level - since layers are hierarchical in nature, moving through the layer "stack" you can increase or decrease the level of abstraction you use. This allows expressing complex ideas with simple building blocks. The best known example of this is the TCP/IP stack, which moves from an abstraction level close to the hardware at the network interface layer to application level protocols such as HTTP.
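
The communication constraint that separates layers from "just a bunch of blocks" can be sketched in a few lines of Python. This is only an illustration: the layer names and the `may_call` helper are made up for the example, and the `bidirectional` flag stands for the variant where a layer may also call the layer directly above it.

```python
# Hypothetical layer stack, ordered from top to bottom (names are illustrative).
LAYERS = ["application", "transport", "internet", "network_interface"]


def may_call(caller: str, callee: str, bidirectional: bool = False) -> bool:
    """Return True if `caller` is allowed to talk to `callee`.

    Strict layering: a layer may call only the layer directly below it.
    Bidirectional variant: it may also call the layer directly above it.
    """
    distance = LAYERS.index(callee) - LAYERS.index(caller)
    if distance == 1:  # callee sits directly below caller
        return True
    return bidirectional and distance == -1  # callee sits directly above


print(may_call("application", "transport"))  # adjacent, downward
print(may_call("application", "internet"))   # skips a layer
print(may_call("transport", "application", bidirectional=True))
```

Once a rule like "any layer may call any layer below it" is allowed, the check degenerates into an ordinary dependency graph, which is why that variant makes layers hard to tell apart from regular components.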


On the downsides - layers hurt performance by adding latency. Also too many layers introduce added complexity to the overall solution (e.g. it is harder to debug).

It is interesting to note that interfaces are in fact leaky layer abstractions (vs., for example, SOA contracts, which are not leaky): when you use an interface you still need to instantiate the object which is otherwise hidden behind the "layer" (interface). This is basically the reason we want something like dependency injection (DI), to help make the abstraction complete, and why in languages like Ruby, where the contract abstraction is complete, you don't need DI (I discussed Ruby and DI in another post).
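
To make the leak concrete, here is a minimal Python sketch (all class names are invented for the illustration). The first consumer depends on an interface but still has to name the concrete class in order to instantiate it; the second receives the implementation from outside, which is the gap dependency injection closes.

```python
from abc import ABC, abstractmethod


class MessageStore(ABC):
    """The interface, i.e. the "layer" the consumer is supposed to see."""

    @abstractmethod
    def save(self, message):
        ...


class InMemoryStore(MessageStore):
    """One concrete implementation hidden behind the interface."""

    def __init__(self):
        self.messages = []

    def save(self, message):
        self.messages.append(message)


class LeakyConsumer:
    """Uses the interface, but must know the concrete class to create it."""

    def __init__(self):
        self.store = InMemoryStore()  # the leak: a concrete type is named here


class InjectedConsumer:
    """Knows only the interface; the caller supplies the implementation."""

    def __init__(self, store):
        self.store = store

    def handle(self, message):
        self.store.save(message)


store = InMemoryStore()  # wiring happens outside the consumer
consumer = InjectedConsumer(store)
consumer.handle("hello")
print(store.messages)
```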

Another issue which I mentioned here is the difference between logical layers and physical (or potentially physical) layers. I usually call the first kind layers and the latter tiers. Logical layers are local and can assume a lot about their neighboring layers. Tiers, or physical layers, can be distributed, which carries a lot of implications (something I discussed here recently in relation to MS Volta).





 
January 26, 2008
@ 11:31 PM
David J. DeWitt and Michael Stonebraker are at it again. There was a lot of buzz on the internet after their previous post (here is what I had to say about it).
Their first point on the new post tries to counter the claim that MapReduce is not a database so it shouldn't be judged as one. They claim that it isn't a matter of apples and oranges but rather
 " We are judging two approaches to analyzing massive amounts of information, even for less structured information."

The problem with that is they continue from there to define a problem in database terms and then show how MapReduce will not be as good as a database in solving it - well, duh.
The fact that isolated queries may run better in a pre-indexed database should come as no great surprise. As I noted in the previous post on the subject - MapReduce can be used to create the appropriate index or partition the data into smaller chunks that would be easier to use to answer the type of queries David and Michael mention.
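
As a rough illustration of that point, the sketch below uses a map step and a reduce step to build a small inverted index that later lookups can use. It runs in a single process and all names (`DOCUMENTS`, `map_document`, `reduce_postings`, `build_index`) are hypothetical; a real MapReduce framework would distribute the two steps across machines.

```python
from collections import defaultdict

# Hypothetical input: (document id, text) records.
DOCUMENTS = [
    (1, "layers hide details"),
    (2, "services hide state"),
]


def map_document(doc_id, text):
    """Map step: emit a (word, doc_id) pair for every word in one document."""
    return [(word, doc_id) for word in text.split()]


def reduce_postings(word, doc_ids):
    """Reduce step: merge all document ids seen for one word into a sorted list."""
    return word, sorted(set(doc_ids))


def build_index(documents):
    """Run map, group by key, then reduce, all in one process."""
    grouped = defaultdict(list)
    for doc_id, text in documents:
        for word, emitted_id in map_document(doc_id, text):
            grouped[word].append(emitted_id)
    return dict(reduce_postings(word, ids) for word, ids in grouped.items())


index = build_index(DOCUMENTS)
print(index["hide"])
```

The index is the output of the batch job; answering an isolated query against it afterwards is a plain dictionary lookup.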
As Mark Chu-Carroll explains, Map/Reduce and databases don't solve the same kind of problems.

Also, what happens when the database is constantly updated?! I don't mind how scientifically accurate the measurements are that say databases scale like nothing else. I am more comfortable with the empirical experience of companies like Amazon, Digg, Google and eBay, who found they had to shard their data to support their scalability needs and not use distributed transactions/distributed databases.


 
This post is part of a series of posts trying to define SOA as an architectural style. In the previous post I talked about how SOA builds on the Client/Server architectural style. In this post I'll talk about how SOA builds on the architectural style of Layered System.

Layered System, or the Layered architectural style, is one of the most basic and widely used architectural styles. Here is a definition of Layered architecture I posted in the past:
The layered style is composed of layers (the components), each of which provides facilities and has a specific role. The layers have communication paths / dependencies (the connectors).

In a layered style a layer has some limitations on how it can communicate with other layers (the constraints). Typically a layer is allowed to call only the layer below it and be called only by the layer above it (but there are variants e.g. a layer can call to any layer below it; etc. - all is fine as long as the layers communication paths are limited and restricted by some rules).
SOA takes the strict layers definition and restricts the knowledge of one service to only the service interface/contract of the other services. This means services cannot be aware of, or care about, the internal structure of other services. This helps with introducing the "boundaries are explicit" tenet (although it builds on more than just layering).

The layered nature of SOA means you can also add additional layers between the services. One very common example is adding a servicebus (e.g. using an ESB or tools like NServiceBus); other examples include load balancers, firewalls (see the Service Firewall pattern) etc. Naturally, when you add intermediary layers, services don't talk to each other directly but rather accept the services (such as routing, message persistence etc.) of the intermediary layer.
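
Here is a toy, in-process sketch of such an intermediary layer. It is not how NServiceBus or any ESB works internally; the `ServiceBus` class, the two services and the `order.placed` topic are all invented for the example. The point is only that neither service holds a reference to the other, and that routing and (a stand-in for) persistence live in the layer between them.

```python
from collections import defaultdict


class ServiceBus:
    """Toy in-process bus: routes messages by topic and keeps a log of them."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self.log = []  # stand-in for message persistence

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        self.log.append((topic, message))
        for handler in self._subscribers[topic]:
            handler(message)


class BillingService:
    """Knows the bus and the message contract, not the ordering service."""

    def __init__(self, bus):
        self.invoiced = []
        bus.subscribe("order.placed", self.on_order_placed)

    def on_order_placed(self, message):
        self.invoiced.append(message["order_id"])


class OrderingService:
    """Publishes to the bus; has no reference to any consumer."""

    def __init__(self, bus):
        self._bus = bus

    def place_order(self, order_id):
        self._bus.publish("order.placed", {"order_id": order_id})


bus = ServiceBus()
billing = BillingService(bus)
OrderingService(bus).place_order(42)
print(billing.invoiced)
```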

It should be noted that in the context of SOA the layers are, in most cases, actually tiers. The difference is that tiers provide (potential) physical separation whereas layers provide logical separation. When a layer is actually a tier, it has extensive implications for the level of trust between the tiers (see my post "Tier is a natural boundary" for more details).

The next post in the series will talk about the "Pipe and Filters" style  and SOA. This is the first place where the REST architectural style and SOA diverge.


 
David DeWitt and Michael Stonebraker write about MapReduce in "The Database Column". Now I usually like what Michael Stonebraker writes (e.g. his piece on the RDBMS demise which I also wrote about myself). However I can't say that this time around.
David and Michael write that MapReduce is a big step backwards. Before I talk about what they write, here is a (very high level) reminder of what Map/Reduce is.
MapReduce, as Google's Jeffrey Dean and Sanjay Ghemawat explain, is a way to get automatic parallelization and distribution along with fault tolerance, monitoring and I/O scheduling for tasks that need to work on complete datasets. MapReduce uses two functions:
  • Map - multiple instances of which run in parallel, each processing a key/value pair and producing a set of grouping key(s) and intermediate values.
  • Reduce - which runs per grouping key and merges the intermediate values into a set of merged outputs (usually one).
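
The two functions can be illustrated with the classic word-count example. This is a single-process sketch, not Google's implementation: `run_map_reduce` is a made-up driver that stands in for the framework's parallel scheduling, grouping and fault tolerance.

```python
from collections import defaultdict
from itertools import chain


def map_fn(key, value):
    """Map: take one key/value pair (file name, contents), emit (word, 1) pairs."""
    return [(word.lower(), 1) for word in value.split()]


def reduce_fn(key, values):
    """Reduce: merge all intermediate values of one grouping key into one output."""
    return sum(values)


def run_map_reduce(records, mapper, reducer):
    """Toy sequential driver; a real framework would run mappers in parallel."""
    intermediate = chain.from_iterable(mapper(k, v) for k, v in records)
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


counts = run_map_reduce(
    [("a.txt", "map reduce map"), ("b.txt", "Reduce")],
    map_fn,
    reduce_fn,
)
print(counts)
```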
David and Michael claim that MapReduce is
1. a step backwards because it doesn't build on Schema
2. a poor implementation because it doesn't use indexes
3. not new
4. missing features - like bulk load, indexing, updates, transactions, integrity constraints, referential integrity, views
5. incompatible with DBMS tools - like report writers, BI tools, replication tools, design tools

Well, if anything, it seems that David and Michael don't really understand what MapReduce is. As I noted above, MapReduce is a way to go over complete sets in an efficient, distributed manner. In fact it can even be used to build the index of a traditional RDBMS. It isn't really competing with databases, relational or other. Yep, comparing MapReduce and databases is the apples and oranges thing...

I guess they might have meant to talk about another Google tool called BigTable, which is at least sort of a column database (Michael's company also makes a column database) for storing structured data in a highly distributed, high performance way. However, David and Michael would still be wrong, as BigTable is proprietary and targeted at a specific purpose, so it isn't supposed to solve the same problems as a general purpose database. Not to mention that it is highly scalable (ever heard of Google's search engine ;) ) and does support things like indexes, updates etc.

Also, as I mentioned in the "RDBMS is dead" post, the internet proved that RDBMS features (like transactions etc.) can only scale so much. While databases focus on the Consistency and Availability parts of the CAP conjecture and the ACID tenets, internet scale systems pick Partition tolerance and Availability and the BASE tenets instead.


 
Sam Gentile and I exchanged a few blog posts on the definition of SOA. In the latest installment Sam disagrees with me that SOA should first be looked at in the pure architectural sense, without bundling in the business and enterprise aspects.
In a nutshell, I have two main reasons to prefer looking at SOA, at its core, as a pure architectural style.
The first is that when you bundle in enterprise-wide aspects of implementing SOA you lose out on the option (or the audience) of using it to solve more local problems (i.e. at the product/solution level) with the same principles that bring the benefits at the enterprise scale.
The other reason I have for separating the concepts is that the business-encompassing definitions tend to be fluid, hand-waving ones and cannot be measured for compliance.
Consider the definitions Sam quotes from Thomas Erl's books:
"SOA establishes an architectural model that aims to embrace the efficiency, agility, and productivity of an enterprise by positioning services as the primary means through which solution logic is represented in support of the realization of strategic goals associated with service-oriented computing." (emphasis by Sam)

SOA represents a model in which functionality is decomposed into small, distinct units (services), which can be distributed over a network and can be combined together and reused to create business applications. [3]
Now what the hell is that? These are all noble goals, but shouldn't this be the goal of any enterprise architecture? What makes SOA unique in this sense?
Also, how do these definitions help us build services? What makes a service a service? Why is (or isn't) any web-enabled component a service?
Definitions that distance themselves from the architectural roots seem to me like smoke and mirrors and contribute to the general confusion around SOA, to the point where even people like Harry Pierson wonder why we should even bother defining it.

Personally, I still think it is worthwhile defining *** (the architectural style formerly known as SOA) since, as I mentioned earlier, it is (in my opinion) a useful architectural style for building distributed systems, whether the distributed system is a solution, a product, a product line or a complete enterprise.





 
December 29, 2007
@ 10:49 PM

Sam Gentile comments about my attempts to define SOA (Part I, Part II, more to come..) and says that

"That's all well and true, but any definition of SOA must encompass the business drivers and business reasons, as SOA is not really about technology. It is about a better alignment of business and IT through business processes and services. The goal is to create a dynamic, more Agile and Dynamic IT that can respond quickly to new business opportunities and threats by quickly assembling new capabilities from putting together composite applications (and even Mash-ups) from reusable business services..."

I am sorry Sam, but I beg to differ: not about the importance of the business drive behind implementing SOA, but about what SOA is. The culprit, in my opinion, is terminology overloading.

SOA, as I said in the above mentioned post and numerous other times, is first and foremost an architectural style. As an architectural style it offers several architectural benefits and poses several architectural constraints. This has nothing to do with business drivers. It has to do with defining components, relations, attributes on relations and components, as well as constraints. Now you can take this set of rules and use (or misuse) it as you like, in the context of a subsystem, a single project, a product line or an enterprise - this is your choice.

Applying SOA, on the other hand, has everything to do with the business. I'll take Sam's post word for word, but instead of using the word SOA I would prefer using the term SOA initiative. An SOA initiative is the effort of applying SOA in a wide context for an enterprise, aiming to increase the alignment of IT and the business etc. I would have to say, though, that in my experience such an effort would rarely use SOA alone. It would also include other distributed architectural styles that help with decoupling and loose coupling, like EDA and REST, to name a couple.


By the way, SOA has nothing to do with technology either. You can implement SOA using WS-*, AtomPub, MSMQ or CORBA, just as you can implement REST with quite a few technologies. It so happens that WS-* is a common implementation technology for SOA and that HTTP is a common implementation technology for REST, but both styles live independently of the technologies.


 
In the previous post  on defining SOA I claimed that SOA is an architectural style building on 4 other architectural styles. The first one of these is Client/Server.
Describing client/server is easy - not because I am such a genius (far from it) but because it has already been done numerous times before. Let's take a look at the definition from Roy Fielding in his famous dissertation (the link is to chapter 3; REST is defined in chapter 5 if you are interested):

The client-server style is the most frequently encountered of the architectural styles for network-based applications. A server component, offering a set of services, listens for requests upon those services. A client component, desiring that a service be performed, sends a request to the server via a connector. The server either rejects or performs the request and sends a response back to the client. A variety of client-server systems are surveyed by Sinha [123] and Umar [131].

Andrews [6] describes client-server components as follows: A client is a triggering process; a server is a reactive process. Clients make requests that trigger reactions from servers. Thus, a client initiates activity at times of its choosing; it often then delays until its request has been serviced. On the other hand, a server waits for requests to be made and then reacts to them. A server is usually a non-terminating process and often provides service to more than one client.

Separation of concerns is the principle behind the client-server constraints. A proper separation of functionality should simplify the server component in order to improve scalability. This simplification usually takes the form of moving all of the user interface functionality into the client component. The separation also allows the two types of components to evolve independently, provided that the interface doesn't change.

The basic form of client-server does not constrain how application state is partitioned between client and server components. It is often referred to by the mechanisms used for the connector implementation, such as remote procedure call [23] or message-oriented middleware [131].

SOA takes from the Client/Server style the two roles, i.e. in each interaction one party is the client (what I call the service consumer) and the other is the server (the service), which handles the request coming from the client*. Unlike traditional client/server, the roles are held only for a particular set of interactions - a given interface that the service exposes. In another set of interactions the roles can be reversed, and a component that once was a server can now act as a client, even working with the very same component that was previously its client.
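
A small Python sketch of roles that are held per interaction rather than per component (the `Component` class and the two instances are hypothetical, chosen only for the illustration):

```python
class Component:
    """A component that exposes one interface and may consume another's."""

    def __init__(self, name):
        self.name = name
        self.trace = []  # records the role played in each interaction

    def handle(self, request, sender):
        """Server role: react to a request coming from a client."""
        self.trace.append(f"server for {sender.name}")
        return f"{self.name} handled {request}"

    def call(self, other, request):
        """Client role: initiate an interaction with another component."""
        self.trace.append(f"client of {other.name}")
        return other.handle(request, sender=self)


orders = Component("orders")
inventory = Component("inventory")

orders.call(inventory, "reserve item")       # orders is the client here
inventory.call(orders, "stock level alert")  # roles reversed
print(orders.trace)
```

In the first interaction `orders` consumes `inventory`; in the second the very same pair swaps roles, and neither needs to know anything about the other beyond the `handle` interface.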

Like REST, SOA takes the constraint of separation of concerns, which allows the service and its service consumers to evolve independently (as long as the interface is kept).
In order to support this, a service should take care of all its internal state without exposing that state or its internal structures outside of the service. This also allows the service to scale behind the interface, but for that we also need constraints and capabilities from the next architectural style, layered system, which I'll discuss in the next installment on this subject.


* You can compose SOA with other architectural styles to get different behaviors. E.g. compose SOA and EDA and you can have the service also push data. This isn't, however, something SOA manifests in its basic form.


 
December 20, 2007
@ 12:24 AM
Wes Dyer, one of the principal people behind the Volta tier-splitting, was kind enough to leave a comment on my previous post. Here is one quote from that comment:

"I do want to clear up a few things about Volta that we apparently
didn't make clear enough. We do not believe that you can develop an
application as if it will run on a single tier and then just sprinkle a
few custom attributes here and there and be done with it. More than
anything else, programmers need brains. Volta does not claim that
programmer brains can be checked at the door. When the programmer wants
to divide the application across a particular boundary then things like
network latency, new failure modes, concurrency, etc. need to be
considered at that boundary. What Volta does is make expressing the
transition between boundaries easier. It reduces the accidental
complexity of writing all of the boilerplate code to express the
programmer's intention. This allows the programmer to focus on the
essential complexity of his problem domain -- figuring out how to write
effective code for that particular tier boundary."

For one, it is good to hear that the architects behind Volta have a deeper understanding of distributed computing challenges, even if the first version doesn't seem to show it. I didn't use MS Volta enough to say whether the problem is indeed not with its inherent capabilities and design (let's just take Wes's word for that). I am also not against saving the boilerplate code (though I would personally favor libraries over code generation and try to keep the "generation" gap, i.e. the amount of generated code or the distance between the abstraction and the next concrete level, to a minimum).
Lastly, I am also in favor of trusting that developers have brains and that it is ok to give developers "sharp tools". So if all is good, where's the problem?

The problem is that you have to make it "easy to do the right thing" and still provide the means to do the more complicated, less safe things. When I teach my young kids (and I can objectively say they are very smart :) ) to use a knife, I don't hand them the razor-sharp butcher knife first. They start with the plastic ones. When they've mastered that, they can try something more dangerous. When you allow distributing something at the flick of an attribute and put a marketing blurb on the site that makes it compelling to use, you create the wrong impression for the less experienced folks.

In one project whose architecture I reviewed, the (very talented) architect/developer designed his own distributed transactions system (he shouldn't have been doing that in the first place, but that's for another post). When designing it he built in a lot of ways to control the transaction behavior, including the option to allow transaction participants to prevent rollback without failing the transaction. Circumventing the transaction was as easy as making it work properly. Are there edge cases where you may need to have one participant violate the ACIDness of the transaction? I guess so, but that is not the general rule. Most of the time when you commit a transaction you expect it to be ACID. If for some reason it didn't behave that way, you want to know about it, even if it didn't actually fail. When you don't make it easy to do the right thing you get unexpected behaviors, you get hard-to-explain bugs, you get slow performance etc.
Developers using tools, smart as they may be, don't usually go and read all the source code of the tool/framework they are using (maybe they should). If two options are just as easy to use, it seems safe to assume they are equally right. Things which are unsafe should be clearly marked as such to prevent misuse by inexperienced users. This is especially true for tools that are targeted at common use and meant to ease the life of inexperienced developers.
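
One way to picture "easy to do the right thing" in an API is sketched below. It is not the system from that project and it does not implement real rollback; `run_transaction`, `TransactionAborted` and the `unsafe_ignore_failures` flag are all assumptions made up for the illustration. The default path aborts on any failure, while the risky path has to be requested by a loudly named flag and still reports what went wrong.

```python
class TransactionAborted(Exception):
    """Raised when any participant fails and the transaction is aborted."""


def run_transaction(participants, unsafe_ignore_failures=False):
    """Run every participant; by default any failure aborts the whole thing.

    `unsafe_ignore_failures` is deliberately verbose: carrying on after a
    failure is possible, but the caller has to ask for it by name and gets
    a report of the failures instead of silence.
    """
    failures = []
    for participant in participants:
        try:
            participant()
        except Exception as error:  # broad on purpose for this sketch
            if not unsafe_ignore_failures:
                raise TransactionAborted(str(error)) from error
            failures.append(str(error))
    return failures


def ok():
    pass


def broken():
    raise ValueError("disk full")


try:
    run_transaction([ok, broken])  # the easy path is the safe one
except TransactionAborted as aborted:
    print("aborted:", aborted)

print(run_transaction([ok, broken], unsafe_ignore_failures=True))
```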


 
December 11, 2007
@ 11:27 PM
I got a couple of emails with questions regarding my previous post on Volta. So here's another go at explaining why dynamic-tiering is not a good move - this time in technicolor.

Let's start with a simple illustration. The diagram below represents a typical local component (A) in its environment. As a component that works locally, it has access to other local components which it interacts with. These can be objects it created by itself or objects that were injected into it. The likely design for local components is to have a chatty interaction. After all, objects can talk to instances of other objects quite easily.



Now enter Volta (or any other such framework - and I've seen a few; I am ashamed to say I even wrote one about 15 years ago), which says we'll just mark the things we want to execute on a different server and everything will be fine. What you get is something like the illustration below:



We have the same number of interactions, only now all the interactions between A and its (used to be) near environment require serialization, network interaction, possibly encryption, authentication, authorization and what not. You can imagine that this type of interaction can have a heavy hit on performance and scalability if it wasn't designed for in advance.
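
The cost can be made visible with a toy Python sketch. Nothing here is Volta's API: `RemoteProxy` is a made-up stand-in for a tier boundary that simply counts round trips, and `Customer` is an invented component with both fine-grained getters and one coarse-grained call.

```python
class RemoteProxy:
    """Stand-in for a tier boundary: every call through it is one round trip."""

    def __init__(self, target):
        self._target = target
        self.round_trips = 0

    def invoke(self, method, *args):
        self.round_trips += 1  # serialization, network hop, auth would go here
        return getattr(self._target, method)(*args)


class Customer:
    """A component originally designed for local, chatty use."""

    def __init__(self):
        self.name, self.street, self.city = "Ada", "1 Main St", "Haifa"

    def get_name(self):
        return self.name

    def get_street(self):
        return self.street

    def get_city(self):
        return self.city

    def get_summary(self):
        """Coarse-grained call designed with the boundary in mind."""
        return {"name": self.name, "street": self.street, "city": self.city}


chatty = RemoteProxy(Customer())
for getter in ("get_name", "get_street", "get_city"):
    chatty.invoke(getter)

chunky = RemoteProxy(Customer())
chunky.invoke("get_summary")

print(chatty.round_trips, chunky.round_trips)
```

Marking the component as remote does not change the number of calls the chatty client makes; only redesigning the interaction around the boundary does.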

This is a bit of hand-waving, so let me also give you an example from a real project. About 3 years ago I was invited to consult on a project. This was the kind of project that interacts with real things like sensors etc. I'll use an automated irrigation system to illustrate its architectural components. One type of component is "Things"; these represent real devices you can interact with, like sprinklers, soil sensors etc. Things represent the logical state of the real devices and cannot talk to each other. When two Things need to interact, e.g. we want to turn on the sprinkler if the soil is dry, we introduce another architectural component, which we'll call "Interaction", that looks at the state of the Things and can then act upon it. The last major type is "Services" (not services in the SOA sense), e.g. we can have a Service that reads the weather. Services can't interact with Things directly, but they can interact among themselves and they can interact with "Interactions". This particular system had dozens of Things and hundreds of "Interactions" and "Services", and the tier/process boundaries were as follows:


Interactions have to know about changes both in Things and Services, so messages keep flying around this system to keep the Interactions in sync as well as to propagate decisions made by Interactions. The outcome of this "smart" design is that every status change in a Thing results in an order of magnitude more messages to react to the change in status. I was brought in to find a way to get an in-order, reliable message flow fast enough between the different tiers. I did my best and left. What they didn't want to listen to, and the better solution, is to give a lot of thought to related Things, Interactions and Services and bundle them together into "tierable" components. The interactions within these "chunks" would be local and would then inflict a whole lot fewer messages on the system. In our example it makes sense to bundle the four components (sprinkler etc.) into a single tier, and possibly the same process, and increase the overall performance significantly while also giving us more cohesive boundaries.




(as a side note I'll just mention that I ran into someone who is part of this particular pr