I am working on moving my blog to WordPress, and as part of the effort I am cleaning up and rearranging some of my older posts. Since my readership has increased substantially since I started blogging, I think some of them are worth republishing. I think that the series on SPAMMED, my software architecture meta-framework, falls into that category.

Overview
There is very little guidance on how one can go about designing and developing an architecture for a software project. The SPAMMED architecture framework (SAF) aims to help fill this gap. SAF is a set of activities that an architect can follow when she sets out to design an architecture. These activities help the architect keep abreast of the project's needs and the drivers that affect the architecture. The activities of SAF include:

  • Stakeholders – identify the stakeholders: anyone with a vested interest in the project (end users, clients, project managers, developers etc.). These are the people you will have to explain your architecture to, and the people whose concerns the architecture will have to satisfy (or, more likely, balance). Thus, the first step is to identify and rank them.
  • Principles – list principles, goals and constraints. These are the properties you wish your architecture to have (or lack) based on your previous experience. Constraints can also come from your stakeholders (e.g. management decided that all projects should be .NET, a deadline etc.).
  • Attributes – discover quality attributes, the non-functional requirements that (once prioritized) serve as the guide for the overall goodness of the system (performance, availability, scalability etc.).
  • Model – model (and document if needed) the architecture as seen from different viewpoints (the list of viewpoints is stakeholder driven). Examples of viewpoints include package diagrams, deployment diagrams, DB schema etc.
  • Map – technology mapping, buy vs. make decisions etc.
  • Evaluate – Since architecture is the set of decisions that are hardest to change, it is worthwhile to spend some time trying to evaluate whether they are indeed correct before commencing on
  • Deploy – Software architectures are not a fire-and-forget thing. As an architect you still have to make sure that the guidelines set are indeed followed and, even more importantly, that the architecture chosen indeed matches the project’s needs and doesn’t have to be reworked.
Download the SAF introduction presentation (pdf) or watch it on SlideShare. You can also read the DDJ article and all the posts I made on the subject here:



 
Tags: Requirements  | Software Architecture | SPAMMED Process

April 28, 2010
@ 11:41 AM

After a long hiatus, I guess it is time for another SOA anti-pattern to see the light. It is probably also a good time to remind you that I am looking for your insights on this project. In any event, I hope you’ll find this anti-pattern useful and, as always, comments are more than welcome (do keep in mind this is an unedited draft :) ).

-------------------------------------

There are many unsolved mysteries; you’ve probably heard about some of them, like the Loch Ness monster, Bigfoot etc. However, the greatest mystery, or so I’ve heard, is getting the granularity of services right… Kidding aside, getting right-sized services is indeed one of the toughest tasks in designing services – there’s a lot to balance here, e.g. the communications overhead, the flexibility of the system, reuse potential etc. I don’t have the service granularity codex, and deciding the best granularity depends on the specific context and decisions (e.g. the examples in the Knot anti-pattern above). It is an easier task to define what shouldn’t be a service; for instance, calling all of your existing ERP system a single service should definitely be shunned. The Nanoservices anti-pattern talks about the other extreme: services that are too small.

Consider, for instance, the “calculator service” which appears in samples web-wide (I’ve personally seen examples in .NET, Java, PHP, C++ and a few more). A basic desk calculator, as we all know, supports several simple operations like add, subtract, multiply and divide and sometimes a few more. Implementing a calculator service isn’t very complicated - Listing 10.1 below, for example, shows part of the WSDL for a Java calculator service that, lo and behold, accepts two numbers and adds them.

Listing 10.1 Excerpt from the WSDL of a stateless calculator service example. The sample only includes the data needed for the “Add” operation. The add operation accepts two numbers and returns a result (http://cwiki.apache.org/GMOxDOC21/jaxws-calculator-simple-web-service-with-jax-ws.html)

<wsdl:types>
  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              xmlns="http://jws.samples.geronimo.apache.org"
              targetNamespace="http://jws.samples.geronimo.apache.org"
              attributeFormDefault="unqualified" elementFormDefault="qualified">
    <xsd:element name="add">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="value1" type="xsd:int"/>
          <xsd:element name="value2" type="xsd:int"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="addResponse">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="return" type="xsd:int"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:schema>
</wsdl:types>

<wsdl:message name="add">
  <wsdl:part name="add" element="tns:add"/>
</wsdl:message>

<wsdl:message name="addResponse">
  <wsdl:part name="addResponse" element="tns:addResponse"/>
</wsdl:message>

<wsdl:portType name="CalculatorPortType">
  <wsdl:operation name="add">
    <wsdl:input name="add" message="tns:add"/>
    <wsdl:output name="addResponse" message="tns:addResponse"/>
  </wsdl:operation>
</wsdl:portType>

<wsdl:binding name="CalculatorSoapBinding" type="tns:CalculatorPortType">
  <soap:binding style="document" transport="http://schemas.xmlsoap.org/soap/http"/>
  <wsdl:operation name="add">
    <soap:operation soapAction="add" style="document"/>
    <wsdl:input name="add">
      <soap:body use="literal"/>
    </wsdl:input>
    <wsdl:output name="addResponse">
      <soap:body use="literal"/>
    </wsdl:output>
  </wsdl:operation>
</wsdl:binding>

<wsdl:service name="Calculator">
  <wsdl:port name="CalculatorPort" binding="tns:CalculatorSoapBinding">
    <soap:address location="http://localhost:8080/jaxws-calculator/calculator"/>
  </wsdl:port>
</wsdl:service>

Calculator services can be even more advanced and have memory. Consider listing 10.2 below, which shows an interface definition for a .NET (WCF) sample that uses workflow services and accepts a single value at a time.

Listing 10.2 A service contract definition for a stateful calculator service (http://msdn.microsoft.com/en-us/library/bb410782.aspx). The service accepts a single number at a time and remembers the former state from operation to operation.

[ServiceContract(Namespace = "http://Microsoft.WorkflowServices.Samples")]
public interface ICalculator
{
    [OperationContract()]
    int PowerOn();

    [OperationContract()]
    int Add(int value);

    [OperationContract()]
    int Subtract(int value);

    [OperationContract()]
    int Multiply(int value);

    [OperationContract()]
    int Divide(int value);

    [OperationContract()]
    void PowerOff();
}

The calculator service (both versions of it) is a very fine-grained service. Naturally, or hopefully anyway, the calculator examples are just oversimplified services used to demonstrate SOA-related technologies (JAX-WS in the first excerpt and WCF and WF in the second one). The problem arises when we see this level of granularity in real-life services.

1.1.1 Consequences

Problem? Why is “fine granularity” a problem anyway? Isn’t SOA all about breaking down monolithic “silos” into small reusable services? More so, the finer grained a service is, the less context it carries. The less context a service carries, the more reuse potential it has – and reuse is one of the holy grails of SOA, isn’t it? The calculator service above seems like the epitome of a reusable service. There’s no doubt we can reuse it over and over and over.

Reuse is indeed a noble goal (I’ll leave discussing how real it is for another occasion). The problem with fine-grained services, however, is the network. Services are consumed over networks – both local (LANs) and remote (extranets, WANs etc.). The result is that services are bound by the limitations and costs incurred by those networks. Trying to disregard these costs is exactly what ailed most, if not all, RPC distributed system approaches that predated SOA (CORBA, DCOM etc.). The calculator service and other similarly sized services are nanoservices.

Nanoservice is an anti-pattern where a service is too fine grained. A nanoservice is a service whose overhead (communications, maintenance etc.) outweighs its utility.

So how can nanoservices harm your SOA? Nanoservices cause many problems, the major ones being poor performance, fragmented logic and overhead. Let’s look at them one by one.

Every time we send a request to a service we incur a few costs: serialization on the caller, moving from the caller process to the OS network service, translation to the underlying network protocol, traveling on the network, moving from the OS network service to the called process, and deserialization on the called process – and that’s before adding security (encryption, firewalls etc.), routing, retries etc. Modern networks and servers can make all this happen rather fast, but if we have a lot of nano-services running around these numbers add up to a significant performance nightmare.
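To make these per-call costs a bit more tangible, here is a rough Python sketch. It does not open a socket; it only imitates the marshalling part of a call with the standard json module. The function names (local_add, simulated_remote_add) and the steps list are made up for illustration and are not taken from the calculator samples.

```python
import json


def local_add(a, b):
    """In-process call: no marshalling, no network."""
    return a + b


def simulated_remote_add(a, b, steps):
    """Imitates only the marshalling part of a service call.

    'steps' is a list that records each overhead step so it can be counted.
    A real call would add OS hand-offs, network latency, security and retries.
    """
    request = json.dumps({"value1": a, "value2": b})  # serialize on the caller
    steps.append("serialize request")
    payload = json.loads(request)                     # deserialize on the service
    steps.append("deserialize request")
    result = payload["value1"] + payload["value2"]    # the actual business logic
    response = json.dumps({"return": result})         # serialize the response
    steps.append("serialize response")
    value = json.loads(response)["return"]            # deserialize on the caller
    steps.append("deserialize response")
    return value


steps = []
assert simulated_remote_add(2, 3, steps) == local_add(2, 3)
print(len(steps), "marshalling steps for one addition")
```

Even in this toy version a single addition is wrapped in several marshalling steps, and none of the network-related costs listed above are included yet.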

Nano-services cause fragmented logic, almost by definition. As we break what should have been a meaningful, cohesive service into minuscule steps, our logic is scattered between the bits that are needed to complete the business service. The fact that you need to haul over several services to accomplish something meaningful also spells increased chances of the Knot anti-pattern, mentioned above.

Proliferation of nanoservices also causes development and management overhead. Just look at the amount of WSDL needed to define the calculator service in listing 10.1 above, and for what? A service that adds a couple of numbers… There is a relatively fixed overhead associated with managing a service. This includes things like keeping track of a service in a service registry, making sure it adheres to policy, writing the cruft (things we have to write around the business logic) for configuring it etc. Having nano-services around means we have to do this a whole lot more times (i.e. per service) compared with having fewer, coarser-grained services.

The point of overhead outweighing utility that appears in the nano-services definition above is subtle but important. The fact that a contract does not have a lot of operations means we want to make sure we don't have a nano-service, but it doesn't automatically mean that it is one. For instance, a fraud detection service contract might only accept transaction details and decide whether to authorize the transaction, deny it or move to further investigation. However, the innards of this service involve a complex process like running the details in a rule engine, checking for fraudulent behavior patterns, matching to black lists etc. In fact, fraud detection is such a complicated issue that fraud detection solutions are actually whole systems, and an SOA-based one would itself comprise several services.
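A minimal Python sketch of the "narrow contract, rich innards" idea follows. The class and field names, the amount threshold and the "unusual country" rule are all assumptions invented for illustration; a real fraud detection service would sit on top of a rule engine and far richer data.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTHORIZE = "authorize"
    DENY = "deny"
    INVESTIGATE = "investigate"


@dataclass
class Transaction:
    account: str
    amount: float
    country: str


class FraudDetectionService:
    """One public operation, several internal checks (all rules are made up)."""

    def __init__(self, blacklist, home_countries):
        self._blacklist = set(blacklist)
        self._home_countries = home_countries  # account -> usual country

    def assess(self, tx):
        # Black list match: deny outright.
        if tx.account in self._blacklist:
            return Decision.DENY
        # Hypothetical behavior-pattern checks; each hit raises suspicion.
        suspicious = 0
        if tx.amount > 10_000:
            suspicious += 1
        if self._home_countries.get(tx.account) != tx.country:
            suspicious += 1
        return Decision.INVESTIGATE if suspicious else Decision.AUTHORIZE
```

The contract has a single operation, yet the utility behind it is substantial, which is why a small contract alone does not make a nano-service.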

The other side of the equation is also true: a comprehensive contract does not guarantee a service is not a nano-service. For instance, in the initial iterations of a system I designed we developed a resource management service. It supported some very nice operations like getting the status of all the services in the system, running sagas and, of course, allocating services. Allocating services meant that whenever an event went out that needed a (new) service instance to handle it, we had to make a call to the resource manager to get one. This provides neat centralized management, but also a performance bottleneck that slows the whole system. To solve this we went with distributed resource management, but that’s beyond the scope of this discussion. The point, however, is that the utility of the resource management service (e.g. easy management of running sagas) was not worth the overhead associated with it (the number of calls and the performance hit on the system) – hence a nano-service.

1.1.2 Causes

From a more technical point of view, we get to nanoservices from not paying attention to at least a couple of the fallacies of distributed computing. Mentioned in chapter 1, the fallacies of distributed computing are a few false assumptions that are easy to make and prove to be wrong and costly down the road. Specifically, we are talking here about assuming that

§ Bandwidth is infinite – Even though bandwidth gets better and better, it is still not infinite within a specific setup. For instance, in one project we were sending images over the wire and distributing them to computational services (a la map/reduce – see also Gridable Service in chapter 3). Things were working OK when we sent small images, but when we sent larger images we realized we were sending them as bitmaps and not as much more compact JPEGs, which put a burden on the backbone of our switches that it wasn’t ready for.

§ Transport cost is zero – As explained in the previous section, every over-the-wire call incurs a lot of costs vs. a local call (also see figure 10.5 below). The costs of the transport can be considered both in terms of the time it takes to make each of these calls and in terms of the real dollar value attached to making sure you have enough bandwidth (connections, routers, firewalls) to handle the traffic incurred.

Figure 10.5 Local objects can “afford” to have intricate interactions with their surroundings. A similar functionality delivered over a network is more likely than not to cause poor performance because of the network related overhead.
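As a back-of-the-envelope illustration of the bandwidth point, the Python sketch below computes the size of an uncompressed bitmap and how long it occupies a link. The image dimensions, the 1 Gbit/s link and the 10:1 JPEG compression ratio are assumptions picked for the arithmetic, not figures from the project mentioned above.

```python
def bitmap_bytes(width, height, bits_per_pixel=24):
    """Size of an uncompressed bitmap, ignoring headers and row padding."""
    return width * height * bits_per_pixel // 8


def transfer_seconds(size_bytes, link_bits_per_second):
    """Time the payload alone occupies the link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


# Assumed numbers: a 4000x3000 image at 24 bits per pixel on a 1 Gbit/s link.
raw = bitmap_bytes(4000, 3000)   # 36,000,000 bytes
jpeg = raw // 10                 # assumed 10:1 compression ratio
link = 1_000_000_000

print("bitmap:", raw, "bytes,", transfer_seconds(raw, link), "seconds")
print("jpeg:  ", jpeg, "bytes,", transfer_seconds(jpeg, link), "seconds")
```

Under these assumptions every bitmap keeps the link busy roughly ten times longer than its JPEG equivalent, and that is before multiplying by the number of images and the number of services they are distributed to.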

Another reason we get nano-services, at least for beginners, is poor examples. As noted, the calculator services above are taken from real examples provided by various vendors. SOA newcomers and/or people without a lot of distributed systems development experience can easily take these samples at face value and go about implementing services with similar granularity. The fact that web-service frameworks mostly map service calls to object method calls makes this even more tempting.

Nano-services are also an inherent risk when applying the orchestrated choreography pattern. Adding an orchestration engine that is capable of controlling flow and is external to the services tempts us to think that we can use it to drive all flow, as little as it may seem. Couple this with the fact that the smaller the services are, the more “reusable” they are (less context) and, again, you may end up with a lot of nano-services on your hands.

Lastly, since the nano-services boundary is soft (remember utility vs. overhead weight), behaviors that look promising at design time can prove to be nano-services as we move along (like the resource manager example above). This can be acceptable if your SOA is developed iteratively (see 10.2.4 exceptions below), but it still means that we have to come up with ways to refactor nano-services.

1.1.3 Refactoring

There are basically two main ways to solve the nano-services problem. One, which is relatively easy, is to group related nano-services into a larger service. The second option, which is more complicated, is to redistribute its functionality among other services. Let's take a look at them one by one.

On one project I was working on we needed to send out notifications to users and admins via SMS messages. Since the software component that did the actual SMS dissemination was a 3rd party app, we decided to create a simple service (not unlike an OO adapter) that accepts requests for SMS and talks to the 3rd party software. A nano-service was born; it even got a nice little name, Post Office Service (ok, ok, the original name was Spam Server but I thought it would look bad in presentations :) ).

Why is this a nano-service? Well, it really doesn’t do much, it would be even simpler to package this as a library that other services can use, and it does carry all the management overhead of maintaining another system service.

What we did about it was to add similar functionality to the service so it also learned to send emails, tweets and MMSs. A serendipitous effect of this was that now instead of sending a request like TweetMessage or SendSMS to this service we could now raise more meaningful events such as SystemFailureEvent and have the service make decisions on how to alert administrators based on the severity of the problem etc. So combining the related functionality helped make the overall service even more meaningful.
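The shift from channel-specific requests to meaningful events can be sketched roughly as follows in Python. SystemFailureEvent is the event named above; the severity scale, the channel thresholds and the in-memory sent list (standing in for the real SMS, email and Twitter gateways) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SystemFailureEvent:
    component: str
    severity: int  # assumed scale: 1 = minor, 2 = major, 3 = critical


class PostOfficeService:
    """Decides how to alert administrators instead of exposing one call per channel."""

    def __init__(self):
        self.sent = []  # (channel, text) records instead of real gateways

    def _send(self, channel, text):
        self.sent.append((channel, text))

    def handle(self, event):
        text = f"{event.component} failed (severity {event.severity})"
        channels = ["email"]            # always leave a written trail
        if event.severity >= 2:
            channels.append("sms")      # page someone for major problems
        if event.severity >= 3:
            channels.append("tweet")    # hypothetical public status update
        for channel in channels:
            self._send(channel, text)
        return channels
```

Callers now raise a single event and the service owns the alerting policy, which is the extra utility that the original SendSMS-only service lacked.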

Unfortunately it isn’t always possible to take the functionality of nano-services and find suitable “other services” (nano or right-sized) that can assimilate them. In those cases getting rid of a nano-service is more of an exercise in redesign than a refactoring. For instance, in a project we’ve built we had a services allocation service (SAS). The SAS’s role was to know about other services’ location, health status and utilization and, upon a request such as beginning a saga (see chapter 5 for the saga pattern), decide what service instances should be used. The service also provided “reporting” capabilities for active sagas, services utilization etc. This might not sound like a nano-service, and at first we thought so too, but as the project progressed we found that being a central hub, as seen in figure 10.6 below, made the SAS a performance bottleneck, incurring additional costs (in latency) on a lot of the calls and interactions made by other services. The utility of the SAS, finding what service instance to talk to, was outweighed by the cost – yep, it is a nano-service after all.

Figure 10.6 An example of a nano-service. The SAS service is a performance bottleneck as a lot of calls go through it. It provides an important service but its costs are too dear.

To solve the SAS problem we had to put in quite a lot of work. The solution, essentially, was to move to distributed resource management, so that each service had some knowledge of what the world looks like and could decide by itself what service instance to talk to.
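The difference between the two approaches can be sketched in Python as below. This is only an illustration of the idea; the text does not describe how the services actually learned about each other, so the on_status callback (services publishing their load to one another) and the "pick the least loaded instance" rule are assumptions.

```python
class CentralAllocator:
    """Central hub: every allocation is one more remote call to the hub."""

    def __init__(self, utilization):
        self.utilization = dict(utilization)  # instance name -> load (0..1)
        self.calls = 0

    def allocate(self):
        self.calls += 1  # stands in for the network round trip to the hub
        return min(self.utilization, key=self.utilization.get)


class LocalDirectory:
    """Each service keeps its own, possibly slightly stale, view of the world."""

    def __init__(self):
        self.view = {}

    def on_status(self, instance, load):
        # Assumed mechanism: called when another service publishes its status.
        self.view[instance] = load

    def pick(self):
        # Local decision, no extra call on the request path.
        return min(self.view, key=self.view.get)
```

The trade-off is that the local view can lag behind reality, so a service may occasionally pick an instance that is no longer the best choice, in exchange for removing the hub from every interaction.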

To sum up this section, sometimes it is easy to notice that something is a nano-service; chances are that in these cases it would also be easy to take the functionality and group it with related functionality in other services. However, on other occasions the fact that a service provides too little benefit is not as apparent and only becomes clear as we move along. In those cases it is also harder to fix the problem. One question we still need to cover is whether there are any situations where we would go with a nano-service even if we know it is one at the outset.

1.1.4 Known Exceptions

When is it OK to have nano-services? When you are starting out. When your approach to SOA is evolutionary and you don’t plan everything in advance (something that rarely works anyway, but that’s another story), there’s a good chance that the first versions of the services you build will not show a lot of business benefit, but they will already need the full overhead of a service. The post office service above is a good example of that: starting out, it only dealt with a single type of message and it didn’t do a whole lot with it either.

The post office service is also a good example of another reason to have a nano-service, which is when you want to build an adapter or bridge to other systems, be they legacy systems or 3rd party ones. In these cases you need to weigh the advantage of using a service vs. building the same functionality as a library that can be used within services, but in many cases keeping the flexibility and composability of SOA can triumph over the overhead associated with having an additional service to manage.

Lastly, one point to keep in mind is that NanoServices is a rather soft pattern and the value of a small service can radically change from system to system or even in a certain system as time and requirements progress. It is worthwhile questioning our assumptions and looking at the services that we grow from time to time to validate the usefulness of what we’re building.


 
Tags: SOA | SOA Patterns | Software Architecture

It has been quite a while since I added anything new to the book. I have my reasons (some would probably say excuses :) ), mainly that finding the energy and time to write is very hard with a wife, 3 kids and a startup.

Anyway, I’ve been talking with Manning lately trying to figure out what to do with this project. I was quite amazed to learn that 1000 or so of you purchased the MEAP edition even though it only contains 5 chapters and hasn’t been updated in a long time. (By the way, I’ve also recently learned that book pirating sites are offering the book for download, but that’s another story.) Anyway, we’re trying to decide what we want to do, and this is where I’d love to hear some feedback from you:

1. Do you think the book is still relevant today?

2. Do you think the book should be restarted from scratch to reflect recent

3. Do we need to cancel the book and end this fiasco, or push it through to completion?

4. How valuable do you find the information in the book so far?

I am ready to put in the time to add enough patterns/anti-patterns to make the book releasable, but since I know it takes oodles of time I don’t really have, I really want to know I am not wasting my time.

Please leave a comment or drop me an email if you have anything to say.

Thanks (and thanks for your continued patience so far)


 
Tags: SOA | SOA Patterns | Software Architecture

April 1, 2010
@ 09:58 AM

There seems to be some backlash building up against NoSQL, with posts like Ted Dziuba’s “I Can't Wait for NoSQL to Die” or Dennis Forbes’ “The Impact of SSDs on Database Performance and the Performance Paradox of Data Explodification (aka Fighting the NoSQL mindset)”.

These are interesting articles to read and yes, RDBMSs are not going the way of the dodo yet (I even said that in “The RDBMS is dead”, which, by the way, was written before NoSQL was coined, but I digress). Nevertheless, other options, namely various NoSQL choices, prove to be easier to work with and are a better fit in some circumstances.

A few times (OK, very few times) that’s because of sheer size (the “Google”s), but what about the rest of us? Should we just stick to good old RDBMSs?

If you have a good understanding of RDBMSs (and/or a great applicative DBA working in your team) then in many cases the answer would be yes. I had a chance to review a couple of systems recently that deal with large amounts of data (in the petabytes). Both systems had very good DBA teams working on the solution, to the point that even ORM was not that important for the solution.

However, in many other cases NoSQL can be a better option. For instance, let’s take the Digg case Dennis Forbes talks about. Assume, for the sake of argument, that Dennis completely nailed Digg’s requirements(*) and his single computer + SSD solution is a reasonable solution for Digg. Still, the fact is that Digg’s team were not able to solve their problem with an RDBMS, and the same team was able to pull it off with a NoSQL solution. So even if we say the team is mediocre (something they are probably not), NoSQL made it easier to solve a problem they couldn’t have solved otherwise – that’s a good thing in my book.

There are additional reasons to use NoSQL besides sheer size and better alignment with developers’ mind-share, though. The more important ones are to get cheaper (vs. comparable commercial RDBMS solutions) scalability and availability. Again, if we look at the Digg switch to Cassandra, they cited “the lack of redundancy on the write masters is painful” and the administrative overhead as main reasons.

In any event, it is important to keep in mind that NoSQL is not a silver bullet. You need to assess whether or not it is suitable for the problem you have at hand. For instance, NoSQL solutions place emphasis on different parts of the CAP theorem than RDBMSs do; is this what you need?

 

 

* I think that Dennis oversimplifies Digg’s situation since they also, for example, have heavy concurrent writes etc.


 
Tags: Software Architecture | Trends

March 23, 2010
@ 10:29 PM

A few days ago I was contacted by Lianping Chen, a doctoral researcher from the Irish Software Engineering Research Centre. Lianping is doing research on “how to elicit architectural significant requirements” and he asked me a few questions which, I thought, might be interesting to a wider audience.


1. Do you agree that architecture design and requirements elicitation are usually in parallel or have a big time overlap? In other words, Architectural design usually starts before requirements elicitation ends, and usually a big time overlap exists between these two activities.

While it seems to me Lianping is thinking mostly about large waterfall (or waterfall-ish) projects, I’d say this is true for all kinds of projects (agile, lean, iterative and waterfall). The tradeoff usually is between waiting for some unknown point in time where the requirements will be set, known, approved and whatnot, and starting to move forward. Architecture usually needs to start rolling when the data (in the sense of what exactly the architectural requirements are) is incomplete. This is exactly why I think architecture should evolve over time (see the previous two posts on “Evolving Architectures” – part I, part II).

In my experience, in large projects, it is beneficial for the architects to be part of the requirements elicitation team. It is true that some architectural requirements can be derived from “pure” functional requirements. However, most of the architectural requirements can (and, in my opinion, should) be formed by a deliberate effort. I personally like the scenario-based approach of creating a utility tree. Note that this does not contradict the point above, which says that some of these requirements will probably be off from the real requirements which will surface during the project.


2. Do you agree that when performing architecture design, making some decisions depends on the outcome of some other decisions? In other words, a sequence/order exists for the set of architectural decisions that need to be made.

Yes, some architectural decisions depend on others. For example, technological choices can change other architectural decisions – after all, technologies usually come with their own set of architectural decisions made by the people that created them. On the other hand, it is important to understand that some decisions can progress or be made in parallel as well. For example, UI architecture design can progress and form while the distribution architecture is still being debated.


3. Do you agree that, in most cases, for a particular architectural decision, only a small portion of the whole requirements actually influences the decision making? In other words, there is a mapping between the architectural decisions you need to make and the requirements required when you make those decisions.

Right, most of the functional requirements drive design and the implementations. Only a relatively small part of the requirements drive the architecture.


4. Do you agree that requirements engineers usually do not know clearly what items/aspects of requirements are architectural significant (i.e., it is not easy to distinguish architectural significant requirements from normal non architectural significant requirements)? Thus, they may ignore/miss some items of requirements (some of them may just look like trivial details) that are actually architectural significant during their requirements elicitation.

Yes, this is natural. “Requirement engineers” (or product owners in agile projects) are mostly concerned with the business aspects of the system - and rightfully so. It is important to have architects involved in analyzing requirements. Also, as I mentioned in a previous answer, it is even more important to have architects work with the different stakeholders (req engineers included) to specifically elicit architectural requirements. Sessions dedicated to quality attributes and forming them as scenarios in the system can help bridge this gap (the scenarios provide the connection to the functional reqs, and architectures are built around quality attributes).


5. Do you agree that requirements engineers usually are not aware of which items/aspects of requirements are required urgently by architects and which items/aspects of requirements can be scheduled a bit later to elicit? Thus, they may first elicit the requirements required for making the decisions in the tail of the decision making sequence, and schedule the elicitation of requirements required for the architectural decisions in the front of the decision sequence to the end of the requirement elicitation phase.

I am not even sure architects are fully aware of which items and requirements are important :). In any event, as mentioned above, there are basically two types of architectural requirements. The first kind is requirements that can be elicited by direct, intentional analysis of the system from a quality attributes perspective (thinking about scenarios where security, availability, scalability etc. come into play). For example, in my current system (xsights), under extensibility – effort to change (data), we have the scenario “Under normal conditions, refreshing the system’s data (links, interactions etc.) shall not require a system restart.”

The other type of architectural requirements is derived from functional requirements, i.e. specific functional requirements whose implementation can have significant implications for the system. For example, in a previous version of our system we had the requirement to handle 3G video calls. Components that participate in this need to be stateful (as there is constant streaming of video in and out).

The conclusions are that, again, architects need to be involved in the requirements gathering effort, and architects need to lead specific session(s) that involve eliciting requirements based on quality attributes analysis. Architectures need to evolve over time, as requirements significant to the architecture may (and a lot of times do) pop up during the development effort.


6. Do you agree the following statement: If the requirements engineers are informed of what items/aspects of requirements are required and when they are required by the architects for making architectural decisions, the requirements engineers will be able to properly schedule their requirements elicitation activities and elicit the required requirements in enough detail and precision. So they will have higher chance to provide the required requirements to the architects before the architects make those architectural decisions.

Yes, to some extent, but architects’ involvement, as mentioned above, would probably yield better results (in my opinion, of course).


7. What are the main issues in eliciting architectural significant requirements? What can researchers do to solve these issues?

I think I already answered that – but if not, feel free to ask for clarifications.


 
Tags: Agile | Q&A | Software Architecture | SPAMMED Process

This is part II of a series on agile architecture. You can read part I here.

In the previous installment I provided a definition for software architecture and raised the apparent friction between the up front design implied by software architecture and the YAGNI approach and deferred requirements prompted by agile development in  the large. This installment take a look at an additional angle of the problem which is the difference between design and architecture (while architecture is a type of design I cam calling design the code or close abstractions of it   that don’t yet fall under “architecture” as defined in the previous post). The difference between the two, as the title suggests, is that while design can be emergent, Architecture, unfortunately needs to evolve.

Unless you’ve been living under a rock, you probably already know about Test Driven Development (TDD) or its cousin BDD. If you’ve actually used TDD, you’ve probably noticed the impact it has on the actual code. Writing code only to cater for defined tests, coupled with refactoring that keeps the design tight, gets us to a result that is, more often than not, simpler than the one we’d get by going through the more traditional design-first route (you can take a look at Uncle Bob’s Bowling Kata for a simple, albeit synthetic, sample).

In fact, a more accurate expansion of the TDD acronym is Test Driven Design. Yes, the tests are there to make sure the code adheres to the specified requirements. However, the iterative process (test, code and especially refactor) makes the design emerge. The emergent design effect works very well with the goals and tenets of Agile and Lean methodologies, as it helps eliminate wasteful future-proofing code that we don’t need, eliminates the waste of extra components (due to pre-design) etc.

This sounds great, so it is very tempting to take that to the architecture level. After all, architecture is a type of design, so wouldn’t it be efficient to do TDA (Test Driven Architecture) as well? I think that in theory it might be possible. However, as the old adage goes, “Theory and practice are the same – at least in theory”. In practice, the problem is that architecture is global and has solution-wide consequences. Time and size (of work) constraints make us want to set the playing field for the solution as soon as possible. Architecture decisions are hard to postpone on the one hand and have extensive influence on the other, e.g. a buy vs. build decision; 3-tier vs. SOA; RDBMS vs. a NoSQL solution etc. If we try to have the architecture grow the way “regular” design can, the ripple effects will be devastating to the development process.

What we have to do instead is evolve the architecture. Start with something that is in the ballpark and then, when we know more about the problem, make changes, i.e. evolve the architecture over time. This is beneficial since requirements also tend to change over time, so if we find a way to evolve the architecture we can deal with that as well.

In the next part I’ll try to give a couple of examples of how this can actually be done, i.e. how you can evolve an architecture.

 

 

 

 

*A better name is Test Driven Design, but that’s not how it started, as far as I know anyway.


 
Tags: Software Architecture | TDD

I’m writing a short series of posts for the MS Israel MCS blog (in Hebrew) and I thought I’d translate them to English, as it seems to me they are interesting enough.

In this series I am going to talk about Evolutionary Architecture, or some of the aspects of dealing with software architecture in agile projects. The topic is interesting since architecture and agile seem to have some conflicting forces at work. To better understand that, let’s start by defining software architecture.

There are many definitions for software architecture, with the simplest one (attributed, I think, to Kent Beck) being that software architecture is what software architects do. Leaving aside the fact that (unfortunately) sometimes software architects are very far from building software architectures, the definition doesn’t tell us much. There are many definitions around; some are good and some are bad. My current definition is:

“Software architecture is the collection of decisions affecting the system’s quality attributes; which have global effects and are hardest to change. Software architecture provides the frame within which the design (code) is built.”

Let’s review the components of this definition:

  • “Affecting the system’s quality attributes” – I’ve written a lot about quality attributes in the past. In a nutshell, quality attributes (often going by the name “non-functional reqs.”) include aspects of the system like scalability, security, availability etc. Architectural decisions have a direct effect on the system’s ability to meet these types of goals.
  • “Global effects” – Design decisions affect the module or the class where they happen. Decisions with macro effect (e.g. choosing a technology or a scaling approach) can completely alter a solution.
  • “Hardest to change” – The most interesting part of the definition, at least in regard to evolutionary architecture. The definition mentions that the code is built within the frame and rules set by the architectural decisions. Changes in these decisions can have significant consequences. As an (oversimplified) example, you can’t take a standalone system developed in Access and move it to a service oriented, Hadoop based solution without major changes in the code, data flows and what not.

The definition of software architecture above seems to prescribe that, for best results, we need to do a lot of up-front design to get the architecture right. If that’s true then we have a severe mismatch with agile and/or lean, where handling requirements and design up-front is a big no-no (YAGNI, “you ain’t gonna need it”, comes to mind). Is there any way to make them work together?

I think yes, and as you’ve probably guessed, the answer is evolving the architecture over time. While this may sound simple, it isn’t. I’ll try to give a few strategies to make that work later in the series. Before that, in the next part, we’ll examine why design can be emergent while architectures need to evolve.


 
Tags: Agile | Software Architecture

Moving to architectures like SOA that increase the number of overall “moving parts” or components in the system means that reliability is going down. It is simple math really – if you have 10 components, each with a 0.99 reliability, then the total reliability is 0.99^10 ≈ 0.904, and that’s before we take into account messages traveling over the wire and the network’s reliability (or lack thereof). What this does is leave us trying to build reliable systems from a (growing) bunch of unreliable components. I know, I know, there’s nothing new here. We’ve been using techniques like redundancy, statelessness etc. to help mitigate this since the beginning of time. With these techniques we decrease the “Mean Time Between Failure” (MTBF), since there are more parts that can fail, but increase the “Mean Time Between Critical Failure” (MTBCF), i.e. the system’s overall MTBF.
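
To make the arithmetic above concrete, here is a minimal Python sketch. It assumes component failures are independent, which real systems rarely guarantee, and the function names are made up for illustration.

```python
from math import prod


def serial_reliability(reliabilities):
    """Reliability of a chain in which every component must work."""
    return prod(reliabilities)


def redundant_reliability(reliability, copies):
    """Reliability of a group that works as long as at least one copy works."""
    return 1.0 - (1.0 - reliability) ** copies


# Ten components in series, each 0.99 reliable (the example from the text).
chain = serial_reliability([0.99] * 10)

# The same chain when every component is deployed as a redundant pair.
paired_chain = serial_reliability([redundant_reliability(0.99, 2)] * 10)

print(f"plain chain:  {chain:.3f}")
print(f"paired chain: {paired_chain:.3f}")
```

The paired chain has twice as many parts that can fail, yet the chance that the chain as a whole fails drops sharply, which is the MTBF versus MTBCF trade-off in a nutshell.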

Another aspect of reliability (and reliability calculations) is MTTR or “Mean Time To Repair”, which in software mainly has to do with how long it takes before we know something is wrong. The usual approach to that is monitoring, which I’ve written about in the past (e.g. the blogjecting watchdog pattern). In this post I want to expand a little on another approach which, while not common in IT systems, can be useful at times.

Enter the BIT – which is short for “Built In Test”. BIT is a technique I picked up when I worked on multi-disciplinary systems that also included embedded systems. Each and every one of the embedded systems we developed (or integrated into the solution) supported BIT. Actually, they usually supported several types of BIT, at least PBIT, CBIT and IBIT:

  • PBIT – Power-On Built In Test – usually a short test the system runs to make sure all of its components are ready to go. You actually saw this one a lot of times since this is what motherboards do as you turn them on (all the blips and lights etc.)
  • CBIT – Continuous Built In Test – Make sure the system is functioning, even when it isn’t really busy so we’ll know about problems before we actually try to use the system
  • IBIT – Initiated Built In Test – provides a way to find out exactly what’s wrong when one of the other test types failed

BIT is very understandable for embedded systems; after all, these are closed boxes with limited access to their innards and inner workings. But isn’t that also true for SOAs? After all, we are building a bunch of black boxes that interact to provide some business benefit. How can we be sure that everything is working fine, especially when we don’t fully control some of the parts?

As mentioned above, a system, especially a distributed one, is built from relatively unreliable components. A continuous test helps us make sure things are working as expected. What we are doing is taking some of the code we wrote to run integration and acceptance tests (which run a scenario end-to-end), deploying it as a service in the system, which we call a “liveliness check”, and having it run periodically. Every time the liveliness check runs it sends a notification (a Twitter message) so we know the test itself works. If it fails, it sends more notifications (Twitter, email, SMS etc.) to an administrator.

This liveliness or CBIT serves as an early warning system. Since the end result is known in advance we can have a pretty good idea if something went wrong. E.g. we know how much time it should take for a test Id, we know what the result of that image is etc. The fact that it works even when the system is in low utilization means we can find out about problems and deal with them before they happen to end-users. That’s a big plus for us.
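
A liveliness check along these lines could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the function and parameter names are invented, the notification channels are plain callables rather than Twitter or SMS clients, and scheduling (a timer, cron, etc.) is left to the caller.

```python
import time
from dataclasses import dataclass


@dataclass
class CheckResult:
    ok: bool
    detail: str


def run_liveliness_check(scenario, expected, max_seconds,
                         heartbeat, alert, clock=time.monotonic):
    """Run one end-to-end scenario and report on it.

    scenario    -- callable that drives the system and returns its answer
    expected    -- the answer we know in advance
    max_seconds -- how long the scenario is allowed to take
    heartbeat   -- called on every run, so we know the check itself is alive
    alert       -- called only when the run fails
    """
    started = clock()
    try:
        actual = scenario()
    except Exception as exc:  # any failure of the scenario is a finding
        result = CheckResult(False, f"scenario raised {exc!r}")
    else:
        elapsed = clock() - started
        if actual != expected:
            result = CheckResult(False, f"expected {expected!r}, got {actual!r}")
        elif elapsed > max_seconds:
            result = CheckResult(False, f"took {elapsed:.2f}s, limit is {max_seconds}s")
        else:
            result = CheckResult(True, f"ok in {elapsed:.2f}s")

    heartbeat(f"liveliness check ran: {result.detail}")
    if not result.ok:
        alert(f"liveliness check FAILED: {result.detail}")
    return result


if __name__ == "__main__":
    # Stand-ins for the real scenario and notification channels.
    outcome = run_liveliness_check(
        scenario=lambda: "known-answer",
        expected="known-answer",
        max_seconds=5.0,
        heartbeat=print,
        alert=print,
    )
    print(outcome.ok)
```

Because both the expected answer and the time budget are known in advance, a wrong result, a slow result and an outright exception can all be reported the same way.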

The advantage over regular monitoring solutions (this is not an either/or – monitoring is also needed) is that you know the specific business scenarios are working properly, which gives higher confidence that things are OK than knowing that a specific server or service is running.

On the flip side, the downside of adding a periodic liveliness check is that it adds complexity to the system. In our case, we have to add a process to clean up the traffic data added by the test messages. Also, while we try to make the system behave as usual as much as possible, certain parts of the system have to know about the test messages and handle them differently. Again, in our case the reporting has to know to disregard test messages and not count them. This is even more problematic in other types of systems; for instance, if you simulate an order, you don’t want the purchase order to actually go out to a supplier.
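
One way to keep that extra handling contained is to tag the test traffic explicitly. The sketch below is hypothetical (the record type, its fields and both functions are invented for illustration) and only shows the idea of a flag that reporting and clean-up can key on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficRecord:
    request_id: str
    billable_units: int
    is_test: bool = False  # set by the liveliness check on its own messages


def billable_total(records):
    """Reporting view: test traffic is ignored rather than counted."""
    return sum(r.billable_units for r in records if not r.is_test)


def purge_test_traffic(records):
    """Clean-up job: drop the records the liveliness check left behind."""
    return [r for r in records if not r.is_test]


records = [
    TrafficRecord("r-1", 3),
    TrafficRecord("liveliness-1", 1, is_test=True),
    TrafficRecord("r-2", 2),
]
print(billable_total(records))           # counts only r-1 and r-2
print(len(purge_test_traffic(records)))  # the test record is gone
```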

To sum this up, adding a liveliness check as part of the system to create a continuous built-in test can increase your confidence that things are working as they should. It can also help you identify problems earlier. Like everything in life, it doesn’t come without tradeoffs, and you should weigh the benefits against the costs before utilizing it in your systems.


 
Tags: SOA | SOA Patterns | Software Architecture

November 15, 2009
@ 12:07 PM

Jesse Ezell  left the following comment on my previous WCF rant (Windows Trick-or-treat Foundation)

You wouldn't expect WCF to take care every TCP/IP registry setting as well would you? At some point, the things WCF exposes have to stop so transport specific settings come into play. What would really suck is if WCF completely abstracted every detail of every transport and came up with new names for things like cookies and specific http request and response headers in the name of creating a "unified" experience across every transport.
The point of WCF is to give you a unified communication API. You don't have to change your code to switch from HTTP, to MSMQ, to TCP, etc. However, the point of WCF is not to hide all the specifics of each transport from you. There are a lot of transport specific options in WCF that you won't find in a WSDL file. However... none of these settings actually require you to change any code to take advantage of them. IMO, WCF does a really good job of providing a uniform communication API and limiting transport specific details to configuration settings. Even in this case, you were able to resolve the issue by configuration settings rather than code changes.
If you had used http request manually, this is one of very few settings you would actually have control over via configuration. What about other popular competing APIs? Look at something like NServiceBus. NServiceBus has a transport model that abstracts communication to some degree... but what happens when you want to change the format of the messages on the wire? For example, maybe you want to switch from raw XML messages to messages with SOAP envelopes on an endpoint or limit the depth of XML node hierarchies to protect against XML attacks. Maybe you want to switch to a completely new transport that was provided by a vendor that has never seen your product. Maybe you want to add compression, or chunking, or certificate based security. Can you do that all via configuration files without ever touching code in any other .NET API? Maybe this is my own ignorance... but I don't know of any .NET communication API that offers even half of the flexibility of WCF.

Well, I would actually expect WCF to do one of two things: either provide a complete API for all communication needs (next version of .NET communication, unified communication model and all that...) and retire the other .NET communication libraries, or provide a thin layer of abstraction that makes it clear when you need to move to the specific underlying protocols.

What we’ve got now is something that isn’t quite there on the first and way away from the second – which means that when you try to do serious stuff with WCF you hit these unexpected (ok now they are) snags where you don’t know where to go – until you realize that this is a specific thing regarding TCP this or HTTP that which is not readily apparent and is not well documented or even worse you need to set it outside of WCF altogether.

In my experience, you can’t, in fact, “just change the binding” and expect everything to work unless you are doing very simple stuff. For instance, if you move from an HTTP binding to TCP you’d find that the channels are suddenly getting closed after periods of inactivity and you need to “keep them alive”, or if you move in the other direction, from TCP to HTTP, you’d find that the size of messages gets larger by an order of magnitude, etc.

Not to mention the “training wheels” approach to setting defaults (at least some of it is fixed for .NET 4), which I talked about a few times in the past. There are also cryptic error messages that make you scratch your head looking for the configuration item you need to set. For instance, if you send a large message (>8K) you won’t see any problem on the sending side, but you’d get the following:

“An exception of type 'System.ServiceModel.Dispatcher.NetDispatcherFaultException' occurred in mscorlib.dll but was not handled in user code

Additional information: The formatter threw an exception while trying to deserialize the message: There was an error while trying to deserialize parameter http://tempuri.org/:ClientPrintResult. The InnerException message was 'There was an error deserializing the object of type System.String. The maximum string content length quota (8192) has been exceeded while reading XML data. This quota may be increased by changing the MaxStringContentLength property on the XmlDictionaryReaderQuotas object used when creating the XML reader. Line 2, position 40523.'.  Please see InnerException for more details.”

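I am sure all of you immediately understood that you need to set ReaderQuotas, MaxReceivedMessageSize and MaxBufferSize on the binding, e.g.:

// requires: using System.ServiceModel; (WebHttpBinding lives in System.ServiceModel.Web.dll)
var binding = new WebHttpBinding()
        {
            // Raise the XML reader quotas; MaxStringContentLength is the one named in the error above
            ReaderQuotas = { MaxStringContentLength = 20 * 8192, MaxArrayLength = 20 * 8192 },
            // Largest message the channel will accept (the default is 64 KB)
            MaxReceivedMessageSize = 20 * 8192,
            // In buffered transfer mode this should match MaxReceivedMessageSize
            MaxBufferSize = 20 * 8192
        };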

Don’t get me wrong, WCF isn’t all bad or anything like that but it does get annoying like hell at times.

As for the reference to other frameworks: I can’t speak for NServiceBus because I didn’t write it (though I think the answer would be the same), but if I consider the communication framework I did write for xsights (which builds on WCF, by the way), it is not as presumptuous as WCF. It does not pretend to be an all-encompassing communication layer for .NET, and it is built to provide a specific architectural approach, which is why it shouldn’t be judged by the same standards.


 
Tags: .NET | Software Architecture | WCF

November 13, 2009
@ 05:42 PM

There’s no Architecture in Business Service Orientation! There, I’ve said it: there are no two types of SOA. I am not trying to say that business-level service orientation doesn’t exist or isn’t valuable. However, I am trying to say that labeling it SOA harms both Service Orientation at the business level and SOA (a.k.a. “technical SOA”).

For the record, here are my definitions for both Business Service Orientation and Service Oriented Architecture (SOA).

Business Service Orientation is an IT paradigm at the enterprise level that aims to componentize and partition the business’s software and to get composability and flexibility (and thus achieve better business and IT alignment etc.). Service Orientation can be implemented using various enterprise architecture practices (around governance, portfolio management etc.) as well as various software architectures including (but not limited to) SOA, EDA, BPM, REST and combinations of them.

Service Oriented Architecture is an architectural style for building systems based on interacting coarse grained autonomous components called services. Each service exposes processes and behavior through contracts, which are composed of messages at discoverable addresses called endpoints. Services’ behavior is governed by policies which are set externally to the service itself. SOA is derived from four predating architectural styles, namely Client/Server, Layered System, Pipes and Filters and Distributed Agents.

To reiterate: calling Business Service Orientation SOA serves only to muddy the water and make both terms nebulous. If you just have to have a TLA, just call them BSO and SOA.

PS

I know that in “What is SOA anyway” I also refer to two SOAs. I didn’t have enough confidence to say it bluntly when I wrote that paper, but I did emphasize Service Orientation for BSO and Architecture for SOA. Also, regarding a more formal definition of SOA that shows how it is derived from the four other styles: I started explaining that quite some time ago (Intro, Client-Server, Layered, Pipes and Filters). I still need to explain Distributed Agents and summarize (about time I did that).


 
Tags: SOA | Software Architecture

Yesterday I gave a talk on SOA patterns at the European Virtual Alt.Net user group. You can find the recording of that talk here as well as download a PDF of the slides.

Before I talk a little about the substance, I want to say a few words about Office Live Meeting (the platform used for the presentation). To sum this up in one word, the experience was horrid. It took me more than 35 minutes just to upload my presentation. Then I had to switch to Windows XP (a VM in Parallels) to speak, since it has a problem with Windows 7 (low sound volume). However, the worst thing is that throughout the presentation I constantly lost control of the slide progress (i.e. couldn’t move the slides forward), which was very distracting.

Anyway, ignoring all that, I think overall the presentation is still beneficial and addresses a few interesting and challenging issues like flexibility, reporting and management of SOA. If I am to sum up the presentation, I’d say that when you build a system on SOA you get a system built of (relatively) a lot of components of questionable reliability. You can reap a lot of benefits in the flexibility department, but you have to address several challenges in the performance, availability, management (etc.) departments. Additionally, you need to look at the overall solution from a holistic viewpoint, since different parts of the solution can push in different directions or only cover part of the picture.

Lastly, thanks to Jan and Colin for organizing the event and to all the attendees for giving me an hour and a half of their time.


 
Tags: .NET | SOA | SOA Patterns | Software Architecture

September 25, 2009
@ 11:35 PM

I read this blog by Joel Spolsky on the “Duct Tape Programmer”, where he praises dropping any libraries or testing or proper coding in favor of getting something out there:

“He is the guy you want on your team building go-carts, because he has two favorite tools: duct tape and WD-40. And he will wield them elegantly even as your go-cart is careening down the hill at a mile a minute. This will happen while other programmers are still at the starting line arguing over whether to use titanium or some kind of space-age composite material that Boeing is using in the 787 Dreamliner.”

While I suspect the main reason Joel wrote the post a little provocatively is to promote the book mentioned there (hey, it worked, I am writing about that...), I guess some people will take this at face value, so I’ll respond anyway.

Let’s start with the good stuff

  • Focus on shipping software – Tools, pretty code, fancy frameworks, patterns etc. are no excuse for not shipping your product. 100% agree
  • Overengineering is bad for your (project’s) health – We’ve been through that before – complex frameworks and big up front design are bad, all bad
  • Don’t use a technology/framework just because it is there – if you don’t have a real problem then KISS (keep it simple stupid)
  • Simple design is, well, less complex than complex design – so it is usually preferable

In a similar fashion to Joel’s post, I wrote that even the “Big Ball of Mud” which is an architectural nightmare can be considered a pattern for a pragmatic approach to building working software.

However, I also added an important caveat which is missing from Joel’s post:

“This is probably not acceptable in the long term  - but it can be a good option for short term if you are aware that that's what you are doing and willing to treat what you get as "Throwaway code".”

If you can’t afford to throw away the code (or are not sure you will be able to), it is still OK to cut some corners in order to get things done. However, you should keep in mind something Uncle Bob said a few days ago: “A Mess is not a Technical Debt.”

Oh, and by the way, duct tape isn’t a good analogy anyway :) – in the first Gulf War, the government in Israel recommended that people prepare their houses for Saddam’s future missile attacks by duct-taping the windows and other openings. The only reason it actually did work was that there were hardly any missiles shot. So yeah, duct tape may seem to be working and everything, but you’d better not put that to a real test.


 
Tags: Design | Programming | Software Architecture

I noticed that the images and code samples are a little off on the blog (I have to admit I just pasted it from Word, and we all know the great HTML that produces…). To help remedy this, I am also making this pattern available in PDF form.

The next pattern I am going to publish is actually an anti-pattern called “NanoServices”, which, as the name implies, is about making services too small. I hope to have that ready early next week. Next after that would be the “Aggregated Reporting” pattern. Aggregated Reporting is aimed at solving the dispersed data problem that autonomy and a lot of services create.

Any thoughts (on the pattern or otherwise) are welcome.


 
Tags: .NET | Java | SOA | SOA Patterns | Software Architecture

September 8, 2009
@ 06:53 PM

1.1 Reservation

When you use transactions in “traditional” n-tier systems life is relatively simple. For instance, when you run a transaction and an error or fault occurs, you abort the transaction and easily roll back any changes – getting back your system-wide consistency and peace of mind. The reason this is possible is that a transaction isolates changes made within it from the rest of the world. One of the base assumptions behind transactions is that the time that elapses from the beginning of the transaction until it ends is short. Under that assumption we can afford the luxury of letting the transaction hold locks on our resources (such as databases) and mask changes from others while the transaction is in progress. Transactions provide four basic guarantees – Atomicity, Consistency, Isolation and Durability, usually remembered by their acronym, ACID.

Unfortunately, in a distributed world, SOA or otherwise, it is rarely a good idea to use atomic short lived transactions (see the Cross-Service Transactions anti-pattern in chapter 10 for more details). Indeed, the fact that cross service transactions are discouraged is one of the main reasons we would consider using the Saga pattern in the first place.

One of the obvious shortcomings of Sagas is that you cannot perform rollbacks. The two conditions mentioned above, locking and isolation do not hold anymore so you cannot provide the needed guarantee. Still, since interactions, and especially long running ones, can fail or be canceled Sagas offer the notion of Compensations. Compensations are cool; we can’t have rollbacks so instead we will reverse the interaction’s operation and have a pseudo rollback. If we added one hundred (dollars/units/whatnot) during the original activity we’ll just subtract the same 100 in the compensation. Easy, right?

1.1.1 The Problem

Wrong – as you probably know, it isn’t easy. Unfortunately, there are a number of problems with compensations. These problems come from the fact that, unlike ACID transactions, the changes made by the Saga activities are not isolated. The lack of isolation means that other interactions with the service may operate on the data that was modified by an activity of other sagas, and render the compensation impossible. To give an extreme example, if a request to one service changes the readiness status of the space shuttle to “all-set” and another service caused the shuttle to launch based on that status, it would be a little too late for the first service to try to reverse the “all-set” status now that the “bird has left the coop”. A more down to earth (pardon the pun) business scenario is any interaction where you work with limited resources e.g. ordering from a, usually limited, stock.
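
The lack of isolation is easy to reproduce in a few lines. The Python sketch below is purely illustrative (the Stock class and compensate_add are invented names, not part of any saga framework): one saga adds 100 units, another interaction consumes part of them, and the “subtract the same 100” compensation can no longer be carried out.

```python
class Stock:
    """A shared, non-isolated resource that several sagas can touch."""

    def __init__(self, units=0):
        self.units = units

    def add(self, amount):
        self.units += amount

    def subtract(self, amount):
        if amount > self.units:
            raise ValueError(f"cannot subtract {amount}, only {self.units} left")
        self.units -= amount


def compensate_add(stock, amount):
    """Pseudo rollback for an earlier add(): subtract the same amount."""
    stock.subtract(amount)


stock = Stock()
stock.add(100)       # saga A's original activity
stock.subtract(80)   # saga B already consumed part of what A added
try:
    compensate_add(stock, 100)  # saga A is cancelled and tries to undo
    compensated = True
except ValueError as exc:
    compensated = False
    print(f"compensation failed: {exc}")
```

With an ACID transaction, saga B would never have seen the 100 units until saga A committed, so the undo would always have been possible.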

Consider, for instance, the scenario in figure 6.1 below. A customer orders an item. The ordering service requests the item from the warehouse as it wants to ship the item to the customer (probably by notifying another service). Meanwhile on the warehouse service the item ordered causes a restocking threshold to be hit which triggers a restocking order from a supplier. Then the customer decides to cancel the order – now what?

Figure 6.1 Chapter 6 focus is about connecting Services with Service consumers in the levels and layers beyond the basic message exchange patterns.

Should the restocking order be cancelled as well? Can it be cancelled under the ordering terms of the supplier? Also, a customer requesting the item between the ordering and the cancellation might get an out-of-stock notice, which will cause him to go to our competitors. This can be especially problematic for orders which are prone to cancellations, like hotel bookings, vacations etc.

Another limitation of compensations and the Saga pattern itself, for that matter, is that it requires a coordinator. A coordinator means placing trust in an external entity, i.e., outside (most) of the services involved in the saga, to set things straight. This is a challenge for some of the SOA goals as it compromises autonomy and introduces unwanted coupling to the external coordinator.

The question then is

How can we efficiently provide a level of guarantee in a loosely coupled manner while maintaining services’ autonomy and consistency?

We already discussed the limitations of compensations, which of course are one of the options for solving this challenge. Again, one problem is that we can’t afford to make mini changes, since we will then be dependent on an external party to set the record straight. The other problem with compensations is that we expose these “semi-states”, which are essentially the internal details of the services, to the outside world. Increasing the footprint of the services’ contract, especially with internal detail, makes the services less flexible and more coupled to their environment (see also the White Box Services anti-pattern in chapter 10).

We’ve also mentioned that distributed transactions are not the answer, since they both lock internal resources for too long (a Saga might go on for days) and put excess trust in external services, which may be external to the organization.

This seems like a quagmire of sorts. Fortunately, real life has already found a way to deal with a similar need for fuzzy, half guarantees – reservations!

1.1.2 The Solution

Implement the Reservation pattern and have the services provide a level of guarantee on internal resources for a limited time.

Figure 6.2 The Reservation pattern. A service that implements reservation considers some messages as “reserving”, in which case it tries to secure an internal resource and sends a confirmation if it succeeds. When a message is considered “confirming”, the service validates that the reservation still holds. In between, the service can choose to expire a reservation based on internal criteria.

The Reservation pattern means there will be an internal component in the service that will handle the reservations. Its responsibilities include

§ Reservation - making the reservation when a message that is deemed “reserving” arrives. For instance, when an order arrives, in addition to updating some durable storage (e.g. a database) with the order, it needs to set a timer or an expiration time for the order confirmation. Alternatively, it can set some marker that the order is not final.

§ Validation – making sure that a reservation is still valid before finalizing the process. In the ordering scenario mentioned before that would be making sure the items designated for the order were not given to someone else.

§ Expiration – marking reservations invalid when the conditions change. E.g. if a VIP customer wants the item I reserved, the system can provision it for her. It should also invalidate my reservation, so when I finally try to claim it the system will know it’s gone. Expiration can also be timed, as in “we’re keeping the book for you until noon tomorrow”.

Reservations can be explicit, i.e. the contract would have a ReserveBook action, or implicit. In the case of an implicit reservation the service decides internally which messages are considered “reserving” and which are considered “confirming”, e.g. an action like Order will trigger the internal reservation and an action like closing the saga will serve as the confirming message. When the reservation is implicit, the service consumer implementation will probably be simpler, as the consumer designers are likely to treat reservation expiration as a “simple” failure, whereas when it is explicit they are likely to track the reservation state themselves.
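
The three responsibilities listed above (reservation, validation, expiration) can be sketched as a small in-memory component. This is a simplified illustration under stated assumptions, not the book’s implementation: ReservationManager and its methods are invented names, state is kept in memory rather than in durable storage, and the clock is injectable only to make the behaviour easy to demonstrate.

```python
import itertools
import time


class ReservationManager:
    """Hands out time-limited reservations on a fixed pool of identical items."""

    def __init__(self, stock, ttl_seconds, clock=time.monotonic):
        self._available = stock
        self._ttl = ttl_seconds
        self._clock = clock
        self._deadlines = {}  # reservation id -> time at which it lapses
        self._ids = itertools.count(1)

    def reserve(self):
        """Reservation: secure one item, or return None if nothing is free."""
        self._drop_expired()
        if self._available == 0:
            return None
        self._available -= 1
        reservation_id = next(self._ids)
        self._deadlines[reservation_id] = self._clock() + self._ttl
        return reservation_id

    def confirm(self, reservation_id):
        """Validation: True only if the reservation still holds; the item is then taken for good."""
        self._drop_expired()
        return self._deadlines.pop(reservation_id, None) is not None

    def expire(self, reservation_id):
        """Expiration on demand (e.g. a VIP needs the item): free the item again."""
        if self._deadlines.pop(reservation_id, None) is not None:
            self._available += 1

    def _drop_expired(self):
        """Timed expiration: release every reservation whose deadline has passed."""
        now = self._clock()
        lapsed = [rid for rid, deadline in self._deadlines.items() if deadline <= now]
        for rid in lapsed:
            self.expire(rid)
```

A consumer that calls confirm() after the deadline simply gets False back, which matches the idea that an implicit reservation’s expiration can be treated as an ordinary failure.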

Reservations happen in business transactions world-wide every day. The most obvious example is making a ordering a flight. You send in a request for a room (initiate a saga) saying you’d arrive on a certain date, say for a conference, and check out on another (complete the saga). The hotel says ok, we have a room for you (reservation) – provided you confirm your arrival by a set-date (limited time). Even if everything went well, you may still arrive at the hotel, only to find out your room has been given to another person (limited guarantee). The idea of the reservation pattern is to copy this behavior to the interaction of services so that services that support reservations offer a sort of “limited lock” for a limited time and with a limited level of guarantee. Limited level of guarantee, means that like real life, services can overbook and then resolve that overbooking by various strategies such as fist come, first served; VIP first served etc

It is easy to see Reservation applied to services that handle “real-life” reservations as part of their business logic, such as a ordering service for hotels (used in the example above) or an airline etc., However reservations are suitable for a lot of other scenarios where services are called to provide guarantees on internal resources. For instance, in one system I built we used reservations as part of the saga initiation process. The system uses the Service Instance pattern (see chapter 3) where some services are stateful (the reasons are beyond the scope of this discussion). Naturally, services have limited capacity to handle consumers (i.e. an instance can handle n-number of concurrent sagas/events).

This means that when a saga is initialized, all the participants of the saga need to know the instances that are part of the saga. As long as a single service instance initiates sagas everything is fine. However, as illustrated in figure 6.3 below, when two or more services (or instances) initiate sagas concurrently they may (and given enough load/time they will) both try to allocate the same service instance to their respective sagas. In the illustration we see that both Initiator A and Initiator B want to use Participant A and Participant B. Participant A has a capacity of 2 so everything is fine for both Initiators. Participant B, however, has capacity for only one saga, so at least one of the sagas will have to fail the allocation, i.e. not start.


Figure 6.3 : Sample for a situation that can benefit from the reservation pattern

The reservation pattern enabled us to manage this resource allocation process in an orderly manner by implementing a two-pass protocol (somewhat similar to a two-phase commit). The initiator asks each potential participant to reserve itself for the saga. Each participant tries to reserve itself and notifies the initiator whether it succeeded, so in the above scenario A would say yes to both and B would say yes to one of them. If the initiator gets an OK from all the involved services (within a timeout) it tells all the participants which specific instances take part in the saga (i.e. it initiates the saga).

The participants only reserve themselves for a short period of time. Once an internally set timeout elapses, the participants remove the commitment independently. As a side note, I’ll just say that the initiator and other saga members can’t assume that a participant will be there just because it is “officially” part of the saga, and the system still needs to handle the various failure scenarios. The Reservation pattern is used here only to help prevent over-allocation and it does not provide any transactional guarantees.
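The two-pass protocol described above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions: the `Participant` and `Initiator` classes, their method names and the in-process calls are hypothetical stand-ins for real services exchanging messages, time is passed in explicitly to keep the example deterministic, and (unlike the more lenient behavior discussed later) an expired hold is simply refused at confirmation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical participant with a fixed capacity of concurrent sagas.
class Participant {
    private final int capacity;
    private final long holdMillis;
    private final Map<String, Long> holds = new HashMap<>(); // sagaId -> expiry time of the reservation
    private final Set<String> active = new HashSet<>();      // sagas that were confirmed

    Participant(int capacity, long holdMillis) {
        this.capacity = capacity;
        this.holdMillis = holdMillis;
    }

    // First pass: hold a slot for the saga for a limited time.
    synchronized boolean reserve(String sagaId, long now) {
        holds.values().removeIf(expiry -> expiry <= now); // expired holds release themselves
        if (holds.size() + active.size() >= capacity) {
            return false;
        }
        holds.put(sagaId, now + holdMillis);
        return true;
    }

    // Second pass: turn a still-valid hold into participation in the saga.
    synchronized boolean confirm(String sagaId, long now) {
        Long expiry = holds.remove(sagaId);
        if (expiry == null || expiry <= now) {
            return false;
        }
        active.add(sagaId);
        return true;
    }
}

// Hypothetical initiator: starts the saga only if every participant reserved itself.
class Initiator {
    static boolean tryStartSaga(String sagaId, List<Participant> participants, long now) {
        for (Participant p : participants) {
            if (!p.reserve(sagaId, now)) {
                return false; // holds already taken are not rolled back; they expire on their own
            }
        }
        for (Participant p : participants) {
            if (!p.confirm(sagaId, now)) {
                return false; // still a failure scenario the system has to handle
            }
        }
        return true;
    }
}
```

Note that a failed allocation leaves nothing to clean up: the initiator does not release the holds it obtained, it just lets them lapse, which is the "limited lock for a limited time" idea in miniature.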

A reservation is somewhat like a lock and thus it “somewhat” introduces some of the risks that distributed locks present. These risks aren’t inherent in the pattern but can easily surface if you don’t pay attention during implementation (e.g. using database locks for the implementation).

The first risk worth discussing is deadlock. Whenever you start reserving anything, especially in a distributed environment, you introduce the potential for deadlocks. For instance, if both participants had capacity for a single saga, and initiator A contacted participant A first and participant B next while initiator B used the reverse order, we would have a potential deadlock. There are several mechanisms that prevent that deadlock. The first is inherent to the Reservation pattern: the participants release the “lock” themselves. However, if, for example, there is a retry mechanism to initiate the sagas (as both would fail after the timeout) and the same resources are allocated over and over, there may be a deadlock after all.

Another risk to watch out for when implementing Reservations is Denial of Service (whether malicious or as a byproduct of misuse). DoS can happen for reasons similar to those discussed for deadlock (i.e. if you incur a deadlock you also have a DoS). Another way is exploiting the reservations by constantly re-reserving. Depending on the reservation time-out, regular firewalls might fail to detect the DoS, so you may want to consider using a Service Firewall (chapter 4) to help mitigate this threat.

Besides the risks discussed above, another thing to pay attention to is that when you introduce Reservation, you are likely to add network calls. The system discussed above, for example, introduced another call to tell the saga members which instances are involved in the saga.

In addition to the Service Firewall pattern mentioned above, another pattern related to Reservations is the Active Service pattern (see chapter 2). The Active Service pattern can be used to handle reservation expiration when expiration is implemented with timers. Note, however, that it is sometimes better, resource-wise, to handle expiration passively rather than actively, as we’ll see when looking at implementation options in the next section.

1.1.3 Technology Mapping

Unlike a lot of the patterns in this book, the Reservation pattern is more a business pattern than a technological one. This means there isn’t a straight one-to-one technology mapping to make it happen. On the other hand, code-wise, the pattern is relatively easy to implement.

One thing you have to do is to keep a live thread at the service to make sure that when the lease or reservation expires someone will be there to clean up. One option is the Active Service pattern mentioned above. You can also use technologies that support timed events to provide the “wakeup service” for you. For instance, if you are running in an EJB 3.0 server you can use single-action timers, i.e. timers that raise their event only once, to accomplish this. Code listing 6.1 below shows a simple code excerpt that sets a timer to go off based on the time received in the message. Other technologies provide similar mechanisms to accomplish the same effect.

Code Listing 6.1 Setting a single-action timer based on the time received in a message (using JBoss)

import javax.annotation.Resource;
import javax.ejb.MessageDrivenContext;
import javax.ejb.Timer;
import javax.ejb.TimerService;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.ObjectMessage;

public class TimerMessage implements MessageListener {

    @Resource
    private MessageDrivenContext mdc;

    // ...

    public void onMessage(Message message) {
        try { // #1
            if (message instanceof ObjectMessage) {
                ObjectMessage msg = (ObjectMessage) message;
                TimerDetailsEntity details = (TimerDetailsEntity) msg.getObject();
                TimerService timerService = mdc.getTimerService();
                // Timer createTimer(Date expiration, Serializable info) #2
                Timer timer = timerService.createTimer(details.Date, details);
            }
        } catch (JMSException e) {
            e.printStackTrace();
            mdc.setRollbackOnly();
        } catch (Throwable te) {
            te.printStackTrace();
        }
    }

    // ...
}

(Annotation) <#1 some vanilla code to process a message and get the interesting entity out of it >

(Annotation) <#2 Here is where we set the single action timer based on the info in the message we’ve just got>

Timer based cancellation, as described above, might be an overkill if the reservation implementation is simple. For instance the Reservation in listing 6.2 below (implemented in C#) is used by the participants discussed in the Saga and reservation sample discussed in the previous section.

Code Listing 6.2 Simple in-memory, non-persistent reservation

public Guid Reserve(Guid sagaId)
{
    try
    {
        Rwl.TryWLock();
        var isReserved = Allocator.TryPinResource(localUri, sagaId);
        if (!isReserved) // #1
            return Guid.Empty;
        // Some code to set the expiration #2
        return sagaId; // #3
    }
    finally
    {
        Rwl.ExitWLock();
    }
}

(Annotation) <#1 The allocator is a resource allocation control, which manages, among other things, the capacity of the service. If we didn’t succeed in marking the service as belonging to the Saga, we can’t allocate the service to the specific Saga>

(Annotation) <#2 Here is where we need to add code to mark when the reservation expired, the previous example (6.1) used timers , we’ll try to do something different here>

(Annotation) <#3 successful reservation returns the SagaId this assures the caller that the reply it got is related to the request it sent – a simple Boolean might be confusing >

Since the Reservation in listing 6.2 does not involve heavy service resources (like, say, a database), we can implement passive handling of reservation expiration, which will be more efficient than a timer-based one. Listing 6.3 below shows a revised reservation implementation which removes timed-out reservations before it accepts a new one. Note that an expired reservation can still be committed if no other reservation occurred in between or the capacity of the service is not exceeded.

Code Listing 6.3 passive reservation expiration handling (added on top of the code from listing 6.2)

public Guid Reserve(Guid sagaId)
{
    try
    {
        Rwl.TryWLock();
        RemoveExpiredReservations(); // #1
        var isReserved = Allocator.TryPinResource(localUri, sagaId);
        if (!isReserved)
            return Guid.Empty;
        OpenReservations[sagaId] = DateTimeOffset.Now + MAX_RESERVATION; // #2
        return sagaId;
    }
    finally
    {
        Rwl.ExitWLock();
    }
}

private void RemoveExpiredReservations()
{
    var reftime = DateTimeOffset.Now;
    // Materialize the expired keys first so the dictionary is not modified while it is being enumerated
    var expiredIds = (from item in OpenReservations where item.Value < reftime select item.Key).ToArray();
    foreach (var id in expiredIds)
    {
        OpenReservations.Remove(id);
        Allocator.FreePinnedResources(id);
    }
}

(Annotation) <#1 Added a small method (RemoveExpiredReservations, which also appears in the listing) to clean up expired reservations. This method is run every time the service needs to handle a new reservation request. Note that there is no timer involved; reservations are only cleaned up if there is a new reservation to process>

(Annotation) <#2 Instead of a timer the reservation is done by marking down when the reservation will expire>

The code samples above show that implementing Reservation can be simple. This doesn’t mean that other implementations can’t be more complex, for example if you want or need to persist the reservation or distribute a reservation between multiple service instances. At its core, though, it shouldn’t be a heavy or complex process.

Another implementation aspect is whether reservations are explicit or implicit. Explicit reservation means there will be a distinct “Reserve” message. This usually means there will also be a “Commit” type message and that the service or workflow engine that requests the Reservation might find itself implementing a 2-phase-commit type protocol, which isn’t very pleasant, to say the least.

The other alternative is implicit reservation, where the service decides internally when to reserve, under what conditions to commit the reservation and when to reject it. As usual, the tradeoff is between a simple implementation for the service and a simple implementation for the service consumer.

1.1.4 Quality Attributes

As usual, we wrap up the pattern by taking a brief look at some business drivers (or scenarios) that can lead us to use the reservation pattern.

In essence, the main driver for reservation is the need for commitment from resources, and since it is a complementary pattern to Sagas it also has similar quality attributes. As mentioned above, Reservation helps provide partial guarantees in long-running interactions, thus the quality attribute that points us toward it is Integrity.

Quality Attribute (level 1) | Quality Attribute (level 2) | Sample Scenario
Integrity | Correctness | Under all conditions, failure to receive payment within 5 business days will cancel the order and shipping
Integrity | Predictability | Under normal conditions, the chances of a customer getting billed for a cancelled order shall be less than 5%

Table 6.2 Reservation pattern quality attribute scenarios. These are the architectural scenarios that can make us think about using the Reservation pattern.

Reservation is a protocol-level pattern that involves an exchange of messages between service consumers and services. The next pattern is one of the enablers of such message exchange. It is also one of the more confusing patterns, since a lot of the commercial offerings that include it also include a gazillion other capabilities. Yes, I am talking about the ServiceBus.


 
Tags: .NET | Java | SOA | SOA Patterns | Software Architecture

A lot has been written and said about multiple use (or reuse, depending on your definition) of services. I want to touch on one aspect of this in this post.

As a general rule, the more generic or small something is, the easier it is to use it in different contexts. For example, hash tables are used all over the place in a lot of programs. The hash table is a generic container and carries very little in terms of business context, so it is very easy to use. A corollary to this rule is that the more specific something is, the harder it is to use it in different contexts. Unfortunately (from the “use” point of view) specific domain logic is exactly what we strive to have with SOA. The value of services is derived from the business value they can generate. To add insult to injury, there’s also a limit on how small we’d want a service to be. Since talking to a service means communicating over a network, if we make the service too small the overhead of getting to it (serialization, network traffic, security etc.) can outweigh its utility (an anti-pattern I call nano-services).

Well, one thing you can try to do is remove the business context from the services. Before you flame me about how this contradicts my previous statement that services’ value comes from the business value or domain know-how they provide, you should note that I said “business context” and not business logic.

Let me try to clarify this with a concrete example from my current system. At xsights we provide image identification services for mobile devices. For instance, when you see a movie ad you can take a picture with your mobile and send it to us via MMS, a specific client or a video call, and we provide related information such as the trailer, where to buy tickets etc. Our initial offering supported only video calls (for business reasons irrelevant to this post). In a video call you have a constant stream of incoming video from the handset (10-15 frames per second), so we (try to) identify frames as they come. We mostly use event-driven architecture over SOA, so (a partial) flow looks something like the following (the events occur in the context of a saga). An extractor service listens on an RTP stream, extracts and preprocesses images and raises a FrameArrived event on each new frame. An Identification GW decides how to handle an incoming frame and directs it to one or more algorithmic workers (this isn’t event driven). After a successful identification the Identification GW raises a LinkFound event, and a Call Flow service takes it from there:

image 

If we didn’t get an identification within a timeout we can ask the user to better aim the camera or whatnot (behavior controlled by the CallFlow service).

When we first added support for MMS we wanted to use the same identification logic. There’s a slight difference though: in a video call you have a constant stream of low-quality images, whereas in an MMS you get a single high(er) quality shot. To add support for MMS we needed to add some logic to the identifier so that it would know whether the origin of the image is an MMS message or a video call. If it is the former, the identifier needs to raise a “failed to identify” event when it finishes processing the image (the video call can use a timeout instead).

But that’s the wrong way to do it, since now we need to know which sagas are MMS sagas and which are video call ones. Not to mention that we would probably need some other “special” logic to handle clients (which indeed we needed). If we go down this lane and add more and more business context to the identifier, we make it less autonomous. Even though we are using events, they are no longer events about the business of the service (like “FrameArrived” from the extractor); they are system context events (“MMSIdentificationFailed”). Our identifier is gaining more and more “reasons to change” and is becoming tightly coupled to specific contexts. So yes, we are using it over and over again, but the cost of doing so gets higher with each such reuse.

What’s a better way? Remove the business context from the service and focus on keeping the business logic and rules. In this case that would be a NoMatchForFrame event for each failed frame. In an MMS-related saga there would be a service that listens to this event; in a video-call-related saga no service will listen to the event*. Once the business context is removed, our identification GW focuses only on its core business activity (routing images to algorithmic workers and notifying the world on success/failure). Adding support for client behavior becomes much easier in this case; in fact the identification GW doesn’t need any changes to support this scenario.

To sum up this post – if you want to increase the chance of using services in different contexts, you should strive to move the context-specific bits outside of the services. This will simplify the services themselves as well as increase their autonomy.
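To make the idea concrete, here is a small Java sketch of context-free events with per-saga-type wiring. It is not our actual communications framework: `EventRouter`, `IdentificationGateway` and all method names are hypothetical, and the "wiring" is just an in-memory map. The point it illustrates is that the gateway raises the same NoMatchForFrame event for every failed frame and never looks at the saga type.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for a communications framework that wires events per saga type.
class EventRouter {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();
    private final Map<String, String> sagaTypes = new HashMap<>(); // sagaId -> saga type

    // The saga initiator is the only place where the context (saga type) is specified.
    void startSaga(String sagaId, String sagaType) {
        sagaTypes.put(sagaId, sagaType);
    }

    void subscribe(String sagaType, String eventName, Consumer<String> handler) {
        subscribers.computeIfAbsent(sagaType + "/" + eventName, k -> new ArrayList<>()).add(handler);
    }

    // Delivers the event to the subscribers wired for this saga's type; returns how many received it.
    int publish(String sagaId, String eventName) {
        String key = sagaTypes.get(sagaId) + "/" + eventName;
        List<Consumer<String>> handlers = subscribers.getOrDefault(key, new ArrayList<>());
        for (Consumer<String> handler : handlers) {
            handler.accept(sagaId);
        }
        return handlers.size();
    }
}

// The gateway knows its own business event and nothing about MMS or video calls.
class IdentificationGateway {
    private final EventRouter router;

    IdentificationGateway(EventRouter router) {
        this.router = router;
    }

    int frameNotMatched(String sagaId) {
        return router.publish(sagaId, "NoMatchForFrame");
    }
}
```

With this wiring an MMS saga can subscribe a handler to NoMatchForFrame while a video call saga subscribes nothing, and supporting a new kind of client means adding subscriptions, not touching the gateway.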


* Our communications framework allows for different event wiring (or routes) depending on the saga “type”, so the event won’t actually even fire in a video call, as the communications framework will identify that there aren’t any subscribers. This is very good from the service point of view, as it allows the service to fire events and let the communications framework worry about the context. The saga initiator is the only place where the context has to be specified (I’ll expand on this in another post).
 
Tags: SOA | SOA Patterns | Software Architecture

June 23, 2009
@ 10:03 PM

In one of my previous posts (Rest: good, bad and ugly), I made a passing comment about how I think using CRUD in a RESTful service is a bad practice. I received a few comments and questions asking why I say that – so what’s wrong with CRUD and REST?

On the surface, it seems like a very good fit (both technically and architecturally), however scratch that surface, and you’d see  that it isn’t a good fit for either.

REST over HTTP is the most common (almost only) implementation of the REST architectural style - to the point REST over HTTP is synonymous with REST. I would say most of the people who think of REST in CRUD terms, think about mapping of the HTTP verbs.

CRUD stands for Create, Read, Update and Delete, the four basic database operations. Some of the HTTP verbs, namely POST, GET, PUT and DELETE (there are others, like OPTIONS or HEAD), seem to have a 1-1 mapping to CRUD. As I said earlier, they don’t. The table below briefly contrasts HTTP verbs and CRUD.

Verb | CRUDdy Candidate | Actually
GET | SELECT (Read) | Get a representation of a resource. While it is very similar to SELECT, it also has a few features beyond an out-of-the-box SELECT, e.g. by using If-Modified-Since (and similar modifiers) you might get an empty reply.
DELETE | Delete | Maps well
PUT | Update | PUT looks like an update but it isn’t, since (1) you have to provide a complete replacement for the resource (again similar to update but not quite) and (2) you can use PUT to create a resource (when the URI is set by the client)
POST | Insert | It can be used to create a resource, but it should be a child/subordinate one. Furthermore, it can be used to provide a partial update to a resource (i.e. not resulting in a new URI)
OPTIONS | ? | Get the available ways to continue considering the current state of the resource
HEAD | ? | Get the headers or metadata about the resource (which you would otherwise GET)

The way I see it,  the HTTP verbs are more document oriented than database oriented (which is why document databases like CouchDB are seamlessly RESTful). In any event, what I tried to show here is that while you can update, delete and create new resources the way you do that is not exactly CRUD in the database sense of the word – at least when it comes to using the HTTP verbs.

However, the main reason CRUD is wrong for REST is an architectural one. One of the base characteristics(*) of REST is using hypermedia to externalize the state machine of the protocol (a.k.a. HATEOAS – Hypermedia as the Engine of Application State). The URI to URI transition is what makes the protocol tick (the transaction implementation by Alexandros discussed in the previous post shows a good example of following this principle).

Tim Ewald explains this  nicely (in a post from 2007…) :

… Here's what I came to understand. Every communication protocol has a state machine. For some protocols they are very simple, for others they are more complex. When you implement a protocol via RPC, you build methods that modify the state of the communication. That state is maintained as a black box at the endpoint. Because the protocol state is hidden, it is easy to get things wrong. For instance, you might call Process before calling Init. People have been looking for ways to avoid these problems by annotating interface type information for a long time, but I'm not aware of any mainstream solutions. The fact that the state of the protocol is encapsulated behind method invocations that modify that state in non-obvious ways also makes versioning interesting.

The essence of REST is to make the states of the protocol explicit and addressable by URIs. The current state of the protocol state machine is represented by the URI you just operated on and the state representation you retrieved. You change state by operating on the URI of the state you're moving to, making that your new state. A state's representation includes the links (arcs in the graph) to the other states that you can move to from the current state. This is exactly how browser based apps work, and there is no reason that your app's protocol can't work that way too. (The ATOM Publishing protocol is the canonical example, though its easy to think that its about entities, not a state machine.)

If you are busy with inserting and updating (CRUDing) resources you are not, in fact, thinking about protocols or externalizing a State machine and, in my opinion, miss the whole point about REST.
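As a rough illustration of what "externalizing the state machine" means, here is a minimal Java sketch. Everything in it is made up for the example (the `OrderResource` class, the state names, the link relations and the `/orders/42` URIs), and there is no HTTP involved; the only thing it tries to show is that each representation lists the transitions that are legal from the current state, and the client moves forward only by operating on one of those URIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical order resource whose representation carries the links out of its current state.
class OrderResource {
    private String state = "open";

    // Link relation -> URI for the transitions available right now.
    Map<String, String> links() {
        Map<String, String> links = new LinkedHashMap<>();
        links.put("self", "/orders/42");
        if (state.equals("open")) {
            links.put("payment", "/orders/42/payment");
            links.put("cancel", "/orders/42/cancellation");
        } else if (state.equals("paid")) {
            links.put("receipt", "/orders/42/receipt");
        }
        return links;
    }

    // The client changes state by operating on a URI it was given; anything else is refused.
    boolean follow(String uri) {
        if (!links().containsValue(uri)) {
            return false;
        }
        if (uri.endsWith("/payment")) {
            state = "paid";
        } else if (uri.endsWith("/cancellation")) {
            state = "cancelled";
        }
        return true;
    }
}
```

Compare this with a CRUD-style "update order set status = paid": there the client has to know the legal transitions in advance, which is exactly the hidden protocol state the quote above warns about.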

CRUD services lead to and promote the database-as-a-service kind of thinking (e.g. ADO.NET Data Services), which, as I explained in another post last year, is a bad idea since:

  1. It circumvents the whole idea about "Services" - there's no business logic.
  2. It is exposing internal database structure or data rather than a thought-out contract.
  3. It encourages bypassing real services and going straight to their data.
  4. It creates a blob service (the data source).
  5. It encourages minuscule demi-services (the multiple "interfaces" of said blob) that disregard a few of the fallacies of distributed computing.
  6. It is just client-server in sheep's clothing.

The main theme of this and the previous post is that if we try to drag REST to the same old, same old stuff we always did, we won’t really get that many benefits. In fact, the “old” ways of doing that stuff are probably more suitable for the job anyway, since they have been in use for a while now and they are “tried and tested” (“You can’t win an argument with an idiot, he’ll just drag you down to his level and beat you with experience” …). REST is just a different paradigm than RPC, ACID transactions and CRUD.


* I know I sound like a broken record on that, but our industry has a history of diluting terms to a point where they almost stop being useful (SOA comes to mind..). The way I see it you can have 3 levels on your way to REST over HTTP:

  • You can be using HTTP and XML/JSON – this is level 1 or “Using standards”.
  • You can be using the HTTP verbs properly and/or applying document oriented communications – this is level 2 or “Rest-like” interface
  • You can conform to all REST constraints and be at level 3 or “RESTful”.

All levels can be useful and bring you merit but only the 3rd is REST


 
Tags: REST | SOA | Software Architecture | Trends

June 15, 2009
@ 11:10 PM

 

Yesterday I read an interesting paper called “RETRO: A RESTful Transaction Model”. On the good side, I have to say, it is one of the best RESTful models I’ve seen thus far. The authors took special care to satisfy the different REST constraints, unlike many “RESTful” services (e.g. twitter, which returns identifiers and not URIs). On the downside, I think a distributed transaction model is bad for REST, or in other words I don’t see a reason for going through this effort and jumping through all these hoops.

Why?

For the same reasons transactions are wrong for SOA and WS-AtomicTransaction is wrong for SOAP web services:

  • Service Boundary – RESTful or otherwise, it is a trust boundary. Atomic transactions require holding locks, and holding them on behalf of a foreign service opens a security hole (it makes it much easier to mount a denial of service attack)
  • You cannot assume atomicity between two different entities or resources. Esp. when these resources belong to different businesses.
  • Transactions introduce coupling (at least in time)
  • Transactions hinder scalability – It isn’t that you can’t scale but it is much harder

For REST it is even worse. Since using hypermedia as the engine of state change means that the hypermedia actually describes the protocol, we clutter the business representations (the representations of real business entities like customer, order etc.) with transactional nitty-gritty. As the authors say:

“our model explicitly identifies locks, transactions, owners and conditional representations as explicit, linkable resources. In fact, every significant entity in our model is represented as a resource in order to comply with this constraint.”

This also means that programming the resources themselves will get much more complicated.

I think that if you want to reap the benefits of REST you should keep the protocol simple and focus on the business and technical merits you can get, not bog it all down with needless complexity. It seems to me that RETRO is a good mental exercise to show that transactions can be RESTful. I think, however, that it is overkill for RESTful implementations.

RESTful architectures will be better off with BASE (Basically Available, Soft state, Eventually consistent) and/or ACID2 (Associative, Commutative, Idempotent and Distributed) models, or at least the Saga model (which the authors intend to tackle next), which is a better candidate (IMHO) for achieving distributed consensus.


 
Tags: REST | SOA | Software Architecture

I recently read a post by Tim Bray where he states that building on web technologies lets you get away with believing some of the fallacies of distributed computing.

I personally think he is a little optimistic in that claim.

On “The network is reliable” – Tim says that the connectionless nature of HTTP helps (it does) and that GET, PUT and DELETE being idempotent helps as well. I say that GET, PUT and DELETE are idempotent only if the people implementing the server side make them so, i.e. if they consider the fallacy. The fact that the HTTP specification says they should be idempotent doesn’t automatically make each implementation compliant.
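A tiny Java sketch shows how easily a server-side implementation can break that promise. The `CounterStore` class and both handlers are hypothetical and there is no HTTP stack here; imagine each method as the code a server runs when a PUT arrives, and a client that retries the same request because the first response got lost.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store standing in for the server side of a resource.
class CounterStore {
    private final Map<String, Integer> resources = new HashMap<>();

    // Idempotent handler: a full replacement, so repeating the request leaves the same state.
    void putReplace(String uri, int value) {
        resources.put(uri, value);
    }

    // Non-idempotent handler wrongly mapped to PUT: every retry changes the result.
    void putIncrement(String uri, int delta) {
        resources.merge(uri, delta, Integer::sum);
    }

    int get(String uri) {
        return resources.getOrDefault(uri, 0);
    }
}
```

Sending the replacing PUT twice leaves the resource as it was after the first call, while sending the incrementing one twice doubles the effect. Nothing in the protocol stops a developer from writing the second kind.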

On “ Latency is Zero” – Tim says the web makes it worse – but, he claims, users got used to that. Even if they did I think that users are just part of the picture since the programmable web is also making strides. Also as Tim says it is actually worse. Not to mention that “Latency isn’t constant” either

On “Bandwidth is infinite” – Again Tim agrees that it is worse but says people learn to take note of it. Again, learning that it is there doesn’t mean the fallacy is gone, just that people are less likely to presume it.

On “The Network is secure” – Tim says it’s probably the “least-well-addressed by the web” – no argument here.

On “Topology doesn’t change” – Tim says URIs help mitigate it. Again, Tim is assuming people make URIs permanent or will always return a temporary or permanent redirect when a URI changes – good luck with that.

On “There is one administrator” – Tim says that yes, that’s the case, but who cares. Well, an example I usually give is the time I deployed an ASP.NET application which worked for a while, until the hosting company decided to change their policy to partial-trust (the app needed full-trust). When that happens to you, you care. If you mash up with someone else, you care, etc.

On “Transport cost is Zero” – Tim says it is the same as for Bandwidth – i.e. worse.

On “The network is homogeneous” – Tim says that this is the “web’s single greatest triumph”. I actually agree with that, as long as all of you stick to using the web’s ubiquitous standards (HTTP, XML/JSON). If you have parts of your application that can’t use those, you still need to pay attention.

One thing I am really  puzzled by is Tim’s conclusion :

“If you’re building Web technology, you have to worry about these things. But if you’re building applications on it, mostly you don’t.”

Since even according to him only 4 fallacies are covered by the web… (I think only 1)

In any event, I agree that the web standards and REST in particular, do contain guidelines that take into consideration the fallacies. However it is still up to developers to understand the problems they’ll create if they don’t follow these guidelines. Assuming that that is indeed the case, is well, overly optimistic in my experience.

You can also read a paper I published a few years ago which explains the fallacies  and why they are still relevant today.


 
Tags: REST | SOA | Software Architecture

Michael Poulin @ ebizq doesn’t like the Active Service pattern. I suggest you read his post first, but in a nutshell Michael sees two possible ways to understand the term Active Service:

“a) service view - a service that actively looking for companions to complete its own task
b) consumer view – a service which triggers its own execution by itself”

…and he doesn’t like either…

I think that both of these definitions aren’t that far off… and I like both :)

The way I see it there are two concerns here:

1. Are services only reactive (“passive”)  ? - i.e. The service only “works” when it gets a request from a service consumer (user/another service/an orchestration engine) ? If the service also has at least one thread working to do internal stuff (e.g. scavenging outdated data, pre-fetching data from other service etc.) then that’s what I call an Active Service (option “b” above)

2.  How do services get data they need to complete a request when they actually get a request – There are many possibilities here: events, pub/sub, an orchestration engine that takes care of that, services that check for a known contract in a registry and then go to that service, even hardcoded. The options where the service looks for other services (e.g. using a registry) is option "a” above.

So basically all the options are valid: a service can be a+b, just a, just b or neither and, in my eyes, these are orthogonal concerns.

Regarding pre-fetching – I think this can be beneficial as a way to achieve caching. Note that if you control both sides and you’ve got the needed infrastructure then it is probably better to push changes (eventing or pub/sub) but that’s not always the case.

In the comment I left on Michael’s blog I talked about different strategies for services “There are several strategies for that - one is to take that knowledge out of the service (e.g. using choreography or orchestration), providing a subscription and/or wiring infrastructure i.e. something that will tell you where to find certain contracts, hard coding , registry , using uniform interfaces (e.g. REST) etc.”

Let’s take a concrete (albeit very, very simplistic) scenario to illustrate some of the approaches.

Business scenario: When a customer makes an order we want to give a 5% discount to preferred customers. A customer gets preferred status upon a business decision (annual orders of 1M$ or knowing the CEO or whatever) and the status lasts for a year from the date it was granted.

For the sake of this discussion say we have two services (again this is overly simplified) an Ordering service and a Customer service.

Here are a few technical options

Technical Scenario 1.

The customer places an order and the ordering service talks to “the” customer service to check if the customer deserves a discount. If she does, the ordering service updates the order with the discount and presents it to the customer to finalize the order.

Technical Scenario 2.

Same as 1, with the ordering looking for a service that matches the customer contract it knows about

Technical Scenario 3

The ordering service asks “the” Customer service twice a day for a list of discounts and caches the result. When the user sends her order, it calculates the price and presents it to her.

Technical Scenario 4

Same as 3, with the ordering looking for a customer service (not using a known service)

Technical Scenario 5

The customer service sends a message to known subscribers whenever a customer's status changes. The ordering service listens for these messages and updates its internal cache. When the customer places her order, the ordering service hits the cache for the discount.

Technical Scenario 6

Same as 5, but publishing an event to unknown subscribers

Technical Scenario 7

The customer service publishes an event with the discounts (or changes in discounts) twice a day. The ordering service listens for these events and updates its internal cache. When the customer places her order, the ordering service hits the cache for the discount.

Technical Scenario 8

The customer order is passed to an orchestrating service, which hits a customer service for a discount and then passes all the data to an ordering service

There are quite a few more options and variants on the options listed but which one is best?

Yeah, you’ve guessed it: it depends. It depends because each option has its own strengths and weaknesses, which suit different circumstances. It also depends on the available infrastructure, on the structure of other services, on whether the services are internal or external, etc.

For instance, scenario 1 is less flexible than most of the others but it is simple to implement. There is coupling in time between ordering and customer (both have to be up for the order to complete). Scenario 4 needs to solve the problem of finding other services (e.g. using some kind of registry, or other services “pushing” their existence, or whatever), but when a customer makes her request the ordering service (most likely) already has all the info needed to process that request, which makes the ordering service more autonomous. As a side note, the fact that different approaches to the same end-goal work in different situations is why I decided to write patterns in the first place.
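To make the trade-off between scenarios 1 and 3 a bit more tangible, here is a minimal Python sketch. All the names (CustomerService, OrderingScenario1, OrderingScenario3, refresh) are hypothetical and made up for illustration, and the "services" are plain in-process objects rather than anything that talks over a network.

```python
DISCOUNT = 0.05  # the 5% discount from the business scenario


class CustomerService:
    """Hypothetical stand-in for the Customer service."""

    def __init__(self, preferred):
        self.preferred = set(preferred)
        self.up = True  # flip to False to simulate an outage

    def is_preferred(self, customer_id):
        if not self.up:
            raise ConnectionError("customer service is down")
        return customer_id in self.preferred

    def list_preferred(self):
        if not self.up:
            raise ConnectionError("customer service is down")
        return set(self.preferred)


class OrderingScenario1:
    """Asks the customer service on every order (coupled in time)."""

    def __init__(self, customers):
        self.customers = customers

    def price(self, customer_id, amount):
        rate = DISCOUNT if self.customers.is_preferred(customer_id) else 0.0
        return round(amount * (1 - rate), 2)


class OrderingScenario3:
    """Pre-fetches the discount list and answers orders from its cache."""

    def __init__(self, customers):
        self.customers = customers
        self.cache = set()

    def refresh(self):
        # An active service would call this from its own thread on a
        # schedule (twice a day in the scenario).
        try:
            self.cache = self.customers.list_preferred()
        except ConnectionError:
            pass  # keep serving from the last good copy

    def price(self, customer_id, amount):
        rate = DISCOUNT if customer_id in self.cache else 0.0
        return round(amount * (1 - rate), 2)
```

With the customer service down, OrderingScenario1.price raises, while OrderingScenario3.price keeps answering from a cache that may be stale. That is the autonomy versus freshness trade-off in a nutshell.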

Lastly, in case you are wondering the scenarios are:

1 – choreography with pre-known (configured or hardcoded) companion services

2 – choreography with “active service” of type a (ordering is active)

3 – choreography with “active service” type b (ordering is active)

4 – Choreography with “active service” type a + b (ordering is active)

5 – pub/sub (e.g. using an ESB)

6 – eventing

7 – eventing with “active service” type b (customer is active)

8 - orchestration


 
Tags: SOA | SOA Patterns | Software Architecture

As I mentioned in the previous post, I got a few interesting questions lately. The first was from Colin, regarding developing a customized solution for the Blogjecting Watchdog pattern vs. integrating with or developing for a commercial monitoring suite (e.g. Unicenter, OpenView etc.). The second question was from Dru, on running multiple versions of services (e.g. during an upgrade) with active Sagas in the background. I think these questions are interesting enough to be answered as blog posts. Also, since both questions are related to the Blogjecting Watchdog pattern, I thought it would be better to first explain what it actually is.

So here it is :)

Blogjecting Watchdog

Achieving availability is a multi-layered effort. I’ve already talked about how services should be autonomous (see for example the Active Service pattern in chapter 2); the Blogjecting Watchdog pattern looks at another aspect of autonomy. It shows how a service can proactively try to identify faults and problems and heal itself when it finds them.

1.1 The Problem

The Service Instance pattern (see section 3.4), for example, demonstrates a strategy a service can implement to cope with failure. The question is: is that enough? Is it enough for the service to try to cope with everything by itself? My answer is no. For one, once we have dealt with a failure within the service, the service's ability to cope with the next failure is probably diminished. For example, if we found a failure in a server and moved to a standby server, the new server does not have another standby to move to if another fault occurs.

Additionally, the failure might be too much for the service to overcome by itself, like a switch going down. So we would want something external that looks after the service and can help it (see the Service Monitor pattern in chapter 4).

To increase the service's autonomy and the overall availability of our SOA, we need both to try to identify and repair problems and to be able to notify the world about the service’s current status.

The question is then:

How can we identify and attend to problems and failures in the service and increase service availability?

One option is to try to infer the state of the service from the way it looks from the outside. Yes, this is as crude as it sounds. You call the service and it doesn't respond, so you know it is down; you call the service expecting a reply in 5 seconds and get it in 10, so you understand that the service is congested. This is not a very good option, as the external behavior only gives us coarse knowledge of the service's state. For example, if the service has a decent fault tolerance solution, we wouldn't know that anything happened, when in truth the service's ability to handle the next fault might not exist anymore.

Another way is to install agents on the service's servers. This will give you a much better picture of what happens than the option above. For example, you will also be able to get trend information (e.g. you can watch how much disk space is left and alert when it is getting low). There are several problems with this solution. One is that you need to actively install software on the service's servers, which both decreases the service's autonomy and creates a management hassle of its own. Another is that you still only get an external view of the service's behavior (you just gain access to more information). There are also situations (see for example the Mashup pattern in chapter 7) where not all the services are under your control and you cannot access their hardware.

Yet another option is to actively question the service about its state. This has one big advantage over the two previous options, since you also get inside information: what the service itself has to say about its state. This enables the service to communicate trends in the problems that will actually make it fail. For example, if the service does not write any information to the local disk, low disk space is not a problem at all; if this is the disk where the database is located, it is very much a problem. The solution is not perfect, since it is the observer's responsibility to go after the information. If the observer does not sample the service often enough, it can miss vital information.

As I mentioned earlier, we want something that will help increase the service’s autonomy, so a better approach in this regard would be for the service to watch over itself.

1.2 The Solution

Watching over itself is also not enough, since, as we said, we also need the “world” to know what is happening with the service. Thus a combined solution is to:

Implement the Blogjecting Watchdog pattern and have the service actively monitor its internal state, try to heal itself and continuously publish its state and other important indicators.


Figure 3.14 The Blogjecting Watchdog pattern. The blogjecting component sends the reports out and listens for requests. The watchdog component monitors the status of the business service, tries to heal stray components and logs any failure.

The pattern revolves around a single idea: increasing the service's responsibility by using two complementary concepts, reporting and self-healing. The first is the Blogjecting concept, where the service implements the Active Service pattern (see chapter 2 for more details) and has a component which is in charge of monitoring the service's state. The component also publishes the service's state (see the Publish/Subscribe interaction pattern in chapter 6) on a cyclic basis or when something meaningful occurs. It is important to note that the fact that the service actively publishes its state doesn't mean it cannot also respond to inquiries regarding its health (akin to leaving a comment on a blog and getting a response from the author).

What are Blogjects

The term Blogjects was coined by Julian Bleecker back in 2005 (Bleecker, 2005) to describe "edgy designed objects that report themselves, or expose their experiences in some fashion", or in other words Blogject == Objects that blog. Julian Bleecker's vision for Blogjects is wider than the one suggested here. Julian's vision is for things that participate in the Web 2.0 sense of social-web or even further than that. To use Julian’s words: “Forget about the Internet of Things as Web 2.0, refrigerators connected to grocery stores, and networked Barcaloungers. I want to know how to make the Internet of Things into a platform for World 2.0. How can the Internet of Things become a framework for creating more habitable worlds, rather than a technical framework for a television talking to an reading lamp?” I highly recommend taking a look at the full paper “A Manifesto for Networked Objects – Cohabiting with Pigeons, Arphids and Aibos in the Internet of Things” (Bleecker, 2006) to get the full picture.

 

The second concept that plays in the Blogjecting Watchdog pattern is the watchdog. The idea here is to have a component that listens in on the information gathered and published by the blogject component and then acts on that information in a meaningful way to increase the reliability and availability of the service. The possibilities for implementing self-healing are endless; two simple examples of self-healing actions are restarting failed components and cleaning temporary files.

Watchdogs

Watchdog (actually watchdog timer) is a term borrowed from the embedded systems world. A watchdog is a hardware device that counts down to zero, and when it gets there it resets the device. To prevent this reset the application has to “kick the dog” before the timer runs out. If the application does not reset the counter, it means that the application has hung, and the idea is that the reset will fix that.
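For readers who haven't met one, the countdown-and-kick idea can be sketched in software with Python's threading.Timer. This is only an illustration: the class name WatchdogTimer and its methods are made up here, and a real hardware watchdog resets the device rather than calling a callback.

```python
import threading


class WatchdogTimer:
    """Software analogue of a hardware watchdog timer (illustrative only)."""

    def __init__(self, timeout_seconds, on_expire):
        self.timeout_seconds = timeout_seconds
        self.on_expire = on_expire  # called if nobody kicks the dog in time
        self._timer = None
        self._lock = threading.Lock()

    def kick(self):
        """Restart the countdown; a healthy application calls this regularly."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.timeout_seconds, self.on_expire)
            self._timer.daemon = True
            self._timer.start()

    def stop(self):
        """Disarm the watchdog (e.g. on a clean shutdown)."""
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
```

The application's main loop would call kick() on every iteration; if the loop hangs for longer than timeout_seconds, on_expire fires and can restart the stuck component.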

 

How is the Blogjecting Watchdog pattern better than the other options mentioned above?

Even if we just consider the blogjecting part of the pattern, we can see several advantages over the other approaches. The Blogjecting Watchdog combines the benefits of an agent that actively monitors the service's health with the internal knowledge of what is and isn't important for the service's continuity. Unlike the external agents solution, with Blogjects the service retains its autonomy. The autonomy is increased even further when you add the self-healing features of the watchdog. The end result is a service which is more resilient (and thus has higher availability) and which lets the world know both its current state and future trends.

In one project I was working on we inherited a situation where there were interdependencies between executables installed on different servers (within a service). For example, when one process was down on server A, the objects running on server B could not function well, and there were other such dependencies (this isn’t the brightest design, but sometimes you have to compromise; in this case there was no time or budget to redesign these applications). What we ended up with is something like the situation in figure 3.15 below:
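As a rough illustration of how the two halves fit together, here is a minimal Python sketch of a single monitoring cycle. The class name BlogjectingWatchdog and the probes/healers/publish parameters are assumptions made for this example; the pattern itself does not prescribe any particular API.

```python
class BlogjectingWatchdog:
    """One monitoring cycle: probe, try to heal, publish the result."""

    def __init__(self, probes, healers, publish):
        self.probes = probes    # name -> callable returning True when healthy
        self.healers = healers  # name -> callable that attempts a repair
        self.publish = publish  # callable that receives the status report

    def check_once(self):
        report = {}
        for name, probe in self.probes.items():
            if probe():
                report[name] = "ok"
                continue
            healer = self.healers.get(name)
            if healer is not None:
                healer()  # watchdog part: attempt self-healing
            # Probe again so the report reflects the state after healing.
            report[name] = "healed" if probe() else "failed"
        self.publish(report)  # blogjecting part: tell the world
        return report
```

An active service would run check_once() from its own thread on a cyclic basis, and publish could be anything from writing a log entry to sending an SNMP trap or posting a message to a monitoring endpoint.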


Figure 3.15 A sample deployment of a blogjecting watchdog. The daemons on the servers monitor the running components on each server. The Watchdog Edge exposes the current state both through a web-services API and as SNMP traps.

The watchdog agents on each of the server nodes monitor the components. The agents communicate amongst themselves to examine the dependencies and the actions taken. The Watchdog Edge component provides a WSDL-based endpoint where other services can query it for the service’s health. It also publishes SNMP traps to an external SNMP monitor (e.g. HP-Openview). As an implementation hint, I suggest keeping the watchdog components in a separate, very simple executable (preferably a daemon that runs when the OS loads). The simpler the component, the lower the risk that it will itself fail (you can of course have a backup in the form of a hardware watchdog). Let’s take a more thorough look at the technology mapping options.

1.3 Technology Mapping

Implementing the Blogjecting Watchdog in an enterprise will usually pre-determine the protocols you have to use for your “blog”. The IT team will most likely have already standardized on one of the leading monitoring suites (CA-Unicenter, HP-Openview, IBM-Tivoli or, if you are an all-Microsoft shop, Microsoft Operations Manager). In these cases you can use the SDK of the monitoring software (e.g. the Unicenter Agent SDK or the MOM management pack developer guides). There are even 3rd party software packages to help you build such agents (for example OC Systems have a Universal Agent that makes it easier to write agents for Unicenter).

Note that this is not always the case, though, and sometimes you do have the freedom to choose your protocols. A few projects I worked on chose to standardize on web services with specific messages for monitoring the health of a service (so we had a specific endpoint for each service where these messages were supported). With the emergence of SOA-specific tools like the ones by Amberpoint and Weblayers you will see more and more WS-* based monitoring.

Other ways of reporting your internal state are to use standards like SNMP (Simple Network Management Protocol) or simply the Windows event logs. An interesting option, which will let your Blogjecting Watchdog literally blog, is to use a product called RSSBus, which is an ESB implementation that uses the RSS protocol for communications. At the time I am writing this the product is still in beta, so I haven’t used it for a serious system yet. Nevertheless, it looks like an interesting direction which I’ll consider when it is released.

Regarding the self-healing part (watchdog), self-healing is still more prevalent in hardware than in software (watchdog timers, RAID, IBM , hot spare memories, hot spare drives etc.). In a sense, any solution that builds on clustering technology also has some of that built in. The virtualization trend will also help in this sense (see the discussion on utility computing in this chapter’s summary). You can already read papers that talk about self-healing web services (G. Kouadri Mostéfaoui, 2006) or see projects that try to look into this problem (e.g. WS-Diamond - DIAgnosability, Monitoring and Diagnosis). Nevertheless, all of them are still in the research phase, and if you want something now you will probably need to implement it yourself. In my experience, it won’t take you too much time to have a basic watchdog up and running, but it will take you some time until you have it predicting problems and acting as an advance warning system.

1.4 Quality Attribute Scenarios

The Blogjecting Watchdog is an interesting pattern (and not just because of its odd name), as it can really help on the way to autonomous computing. The effect of this proactive approach is to increase the overall reliability of the service. A service which is self-healing can overcome (at least) minor problems, which results in better availability overall. Additionally, the monitoring aspects of the Blogjecting Watchdog also help enhance availability by notifying administrators that something is amiss (which enables them to fix it).

Quality Attribute (level 1) | Quality Attribute (level 2) | Sample Scenario

Availability | Failure detection | Upon a failure or degraded performance, the system will alert the system admin (via SMS) within 3 minutes.

Reliability | Increased autonomy | During normal operations, the system will clear all its temporary resources (e.g. files) continuously.

Table 1.1 Blogjecting Watchdog pattern quality attributes scenarios. These are the architectural scenarios that can make us think about using the Blogjecting Watchdog pattern.

Once we introduce a monitor and start to collect data, we can start to find new uses for that data. For example, we can use the information on incoming requests to try to locate attacks on the service. Saved monitoring data can also be used to analyze the service’s behavior over time and predict failures, and thus increase its maintainability.



 
Tags: Q&A | SOA | SOA Patterns | Software Architecture

Earlier today I read a post by Michael Feathers called "10 Papers Every Developer Should Read (At Least Twice)". I knew some of the articles mentioned there and learnt about a few interesting ones. I liked it so much, I thought I'd compile a similar list for software architects, based on stuff I read over the years.

1. The Byzantine Generals Problem (1982) by Leslie Lamport, Robert Shostak and Marshall Pease - The problem with distributed consensus
2. Go To Statement Considered Harmful (1968) - by Edsger W. Dijkstra - Didn't you always want to know why ? :)
3. A Note on Distributed Computing (1994) - by Samuel C. Kendall, Jim Waldo, Ann Wollrath and Geoff Wyant - Also on Michael's list but it is one of the foundation papers on distributed computing
4. Big Ball of Mud (1999) - Brian Foote and Joseph Yoder - patterns or anti-patterns?
5. No Silver Bullet Essence and Accidents of Software Engineering (1987) - Frederick P. Brooks - On the limitations of Technology and Technological innovations.
6. The Open Closed Principle (1996) - Robert C. Martin (Uncle Bob) - The first in a series of articles on Object Oriented Principles (you remember the debate on SOLID...)
7. IEEE1471-2000 A recommended practice for architectural description of software intensive systems (2000) various- It is a standard and not a paper but it is the best foundation for describing a software architecture I know.
8. Harvest, Yield, and Scalable Tolerant Systems (1999) Armando Fox, Eric A. Brewer - That's where the CAP theorem was first defined
9. An Introduction to Software Architecture (1993) - David Garlan and Mary Shaw - one of the foundation articles of software architecture field (although based on earlier work by the two)
10. Who Needs an Architect? (2003) Martin Fowler - Do we or don't we?

I could come up with quite a few more articles not to mention books that aren't in this list. However these are definitely some of the most influential papers I read.




 
Tags: data | Design | OO | Software Architecture

January 25, 2009
@ 11:42 PM
If you read this blog regularly you've probably heard/read about the 8 fallacies of distributed computing once or twice ... you know, the assumptions architects and designers tend to make when designing distributed systems, which prove to be wrong down the road, causing pain and havoc in the project (indeed, my paper explaining them is the second most popular download on my site, with just about 50K downloads).
The fallacies were originally drafted in 1994 by Peter Deutsch (with one more added by James Gosling in 1997), and they still hold true today. I still see designers make these same old mistakes in modern SOAs, RESTful designs and whatnot - but that's not the reason for this post.
What I want to talk about is the second fallacy "Latency is zero".

The more I think about it the more I think this fallacy should be updated to "Latency is zero or constant" (or add another fallacy for "latency is constant" on its own).

What's the difference?

Well, the "latency is zero" fallacy means treating remote "things" as if they are the same as local "things". We can't do that - we need to build the API of remote things to take into account the fact that information takes time to get there (e.g. chatty interfaces vs. chunky interfaces). You can see more on that in a post called "Why arbitrary tier-splitting is bad" I wrote about a year ago.

The "latency is constant" fallacy means thinking that if we send several batches of "stuff" to a remote "thing", they may arrive late but at least they'll arrive in order. To move from "things" and "stuff" to more concrete terms: if you send messages over a network from one service to another, they won't necessarily arrive in order.

But wait, isn't that only true for asynchronous messages? If we make synchronous calls we don't really care about this, now do we? That's only true if you and the service you are consuming are alone in the world. In all other cases (i.e. most of the time), even if you make all your calls synchronous, you can't know what other messages (from other senders) will arrive in between your messages, or how they will affect the service's state.

Unreliable latency can also mean we'll retry a message because we think it was lost, and find out later that the receiver got it multiple times.

These are things you really have to take into account when you make multiple related calls - like, say, in a saga. One thing you can do to help is make messages idempotent (which also helps with the "network is reliable" fallacy). You can also increase latency even more and order the messages on the receiving side, something that happens, for example, when streaming video or audio.
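The "trade latency for order" option can be sketched as a small resequencer: the receiver buffers messages that arrive early and releases them only once the gap before them is filled. This is a minimal Python sketch and it assumes the sender stamps every message with a consecutive sequence number (the seq parameter, which is an assumption of this example); duplicates from retries are simply dropped.

```python
class Resequencer:
    """Buffers out-of-order messages and releases them in sequence."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq  # the sequence number we are waiting for
        self.pending = {}          # early arrivals, keyed by sequence number

    def receive(self, seq, payload):
        """Accept one message; return the payloads now deliverable in order."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate (e.g. a retry of a message we already have)
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

Note the cost: a message that arrives early sits in the buffer until the missing ones show up, which is exactly the extra latency mentioned above.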

What you really need to think about is ACID 2.0. No, I am not talking about the ACID of database transactions but rather about another term I first saw in "Building on Quicksand" (paper (pdf)/ppt) by Pat Helland. In this paper Pat talks about some of the implications of unreliable conditions (such as inconstant latency, failure etc.) on fault tolerance. ACID 2.0 (which apparently was coined by Shel Finkelstein) stands for Associative, Commutative, Idempotent and Distributed, i.e. messages can be processed at least once, anywhere (on the same machine or across several machines), in any order.

That's harsh, but I think that if you are building distributed systems today (SOA or otherwise) you can't ignore it.
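Here is a tiny Python sketch of what the Associative, Commutative and Idempotent parts can look like in practice. It is my own illustration rather than anything taken from the paper, and it assumes every message carries a unique id that always maps to the same amount. State is kept as a dictionary of message id to amount, and partial states are combined by dictionary union.

```python
def merge(state_a, state_b):
    """Combine two partial states (dicts of message id -> amount).

    Union keyed by message id is associative, commutative and idempotent,
    provided a given id always carries the same amount.
    """
    merged = dict(state_a)
    merged.update(state_b)
    return merged


def apply_all(messages):
    """Fold (message_id, amount) pairs into a state, one at a time."""
    state = {}
    for message_id, amount in messages:
        state = merge(state, {message_id: amount})
    return state


def total(state):
    """The business value we care about: the sum of all distinct messages."""
    return sum(state.values())
```

Whatever order the messages arrive in, however many times a retry delivers the same one, and however the partial states are grouped across machines before being merged, total() comes out the same.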






 
Tags: REST | SOA | Software Architecture

January 16, 2009
@ 07:08 PM
In a post called "Rhino Service Bus: Saga and State" Ayende said
"In a messaging system, a saga orchestrate a set of messages. The main benefit of using a saga is that it allows us to manage the interaction in a stateful manner (easy to think and reason about) while actually working in a distributed and asynchronous environment."

I really don't agree with this definition of a saga. The Saga provides a context for a set of messages, to allow managing an effort to reach distributed consensus. It does not "orchestrate" messages (that's what workflows are for) - you can read more on Sagas in an excerpt from my SOA patterns book: Saga pattern.

Here's the comment I left on Ayende's site:
"What you describe is nice except it isn't a Saga it is more of a workflow. The notion of Saga which is originated from databases relates to the overall coordination of state between the different services - or the context for the whole business process.
In the coffee shop example you use that would be the whole "transaction" from the point the customer orders her coffee until she either gets it or the transaction is canceled (e.g. it took too long and the customer leaves or the coffee shop is out of milk etc.)
Unlike database (or distributed) transaction when/if a saga is aborted the different component of the system might not return to their previous state e.g. if the customer complains that the coffee is not good and gets her money back. the milk is not separated back from the coffee beans and returned to the bottle - rather the coffee cup goes to the trash.

Workflow is one strategy a service can take to handle the long running interaction within a saga. In your case the BristaSaga class (which I think should be BristaWF) orchestrate the internal state transitions depending on the different messages that arrive within the saga. In your case you have a hardcoded workflow - but it is also possible to use a workflow engine for the job.

By the way, in the above example you could also use a statemachine instead of a WF to manage the process "
In another comment Kristofer asked me:

Arnon: I'm not 100% sure of how you distinguish a Saga from a Workflow, could you elaborate some more on this?

A Saga involves a number of underlying workflows?
A Saga might as well contain a number of underlying Sagas?

Isn't it just a question of at what level it is initiated?

If a Saga should represent the whole transaction / business process, then who should handle it? Couldn't it be implemented as a Saga, exactly as Ayende describes it, by the initiating service (in this case the ordering)?, which then also is given the responsibility to handle restoring the total state etc of underlying/involved services if the transaction is aborted? The possibility to restore state does of course depend on what the specific Saga is handling, some processes might not be able to "rollback" completely, it's rather a question of rolling back all involved parties to a known/acceptable state."

The answer is that, again, a Saga is similar to a transaction in the sense that it provides a shared context for an attempt to reach distributed consensus. Unlike a transaction, which ensures ACID properties, a Saga does not.
The concept of disseminating that shared context, with each party (service) able to affect whether the saga is aborted or completes successfully, is what I call a saga.
When a saga is aborted, the only thing the coordinator can do is pass the status to the participants. Each of the services is responsible for making its best effort to handle the abort (by rolling back, compensating or whatever).
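To illustrate the point about abort handling, here is a deliberately small Python sketch. The names (Participant, SagaCoordinator, on_abort, on_commit) are invented for this example and are not taken from the book or from our event broker. The coordinator only shares the saga context and reports the outcome; what each participant does on abort is entirely up to it.

```python
class Participant:
    """A service taking part in a saga (illustrative stand-in)."""

    def __init__(self, name, can_complete=True):
        self.name = name
        self.can_complete = can_complete
        self.log = []

    def handle(self, saga_id, message):
        # The saga id is the shared context that travels with every message.
        self.log.append((saga_id, message))
        return self.can_complete  # False means "I vote to abort"

    def on_commit(self, saga_id):
        self.log.append((saga_id, "committed"))

    def on_abort(self, saga_id):
        # Best effort only: roll back, compensate, or accept the new state.
        self.log.append((saga_id, "compensated"))


class SagaCoordinator:
    """Passes the saga context around and reports the final status."""

    def __init__(self, participants):
        self.participants = participants

    def run(self, saga_id, message):
        votes = [p.handle(saga_id, message) for p in self.participants]
        if all(votes):
            for p in self.participants:
                p.on_commit(saga_id)
            return "committed"
        for p in self.participants:
            p.on_abort(saga_id)
        return "aborted"
```

Notice that nothing here guarantees the participants end up in their previous state after an abort, which is the difference from an ACID transaction.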

Workflow is another thing altogether. It keeps a context between calls and means externalizing the decisions on the logic flow from the business logic (usually with a workflow engine). You can use workflows within a service (a pattern I call workflodize) or you can use them externally (a pattern I call orchestrated choreography, e.g. BPM).
You can use either form of workflow to support the implementation of a saga, but you can also implement sagas without workflows.
In our system we use an "event broker" (see www.rgoarchitects.com/.../EventingInWCF.aspx). The event broker infrastructure disseminates the saga context when you raise a saga event. A service that initiated a saga (by sending the first event) can choose to close the saga (commit) or abort it. We don't currently have any workflow-driven services (but some of them use a state machine as an alternative).

(I think the term Saga does not describe Ayende's class, since the "barista" is just one of the participants in the saga; there are other participants.)



 
Tags: SOA | SOA Patterns | Software Architecture

January 12, 2009
@ 08:42 PM
When describing the "known exceptions" to the Knot anti-pattern, I wrote the following:
Starting out on a large project, such as moving an enterprise to SOA, is difficult enough as it is. You can’t figure everything in advance; you need to deliver something – so as Nike says “just do it”. Get something done. You do need to be prepared to let go and redesign further down the road

In a comment to that post, Derrick Gibson wrote:
I have concerns about a "just do it" approach; it belies an assumption that at some point in the future the opportunity will be there to do things a "right way", whereas today time does not permit adherence to this mythical "right way".

One cannot put off til tomorrow that which should be done today. There is no guarantee of any future work to do "enhancements" or "architecture" and there is certainly no guarantee that even if there is a future project, you will be around to work on it. The next team will be starting from scratch and they will be left literally scratching their heads asking, "why did Team Alpha make this decision?"

So, if you make the first assumption that your team has to implement the best architecture it can with the time it has allotted, then will that not lead to other discussions along the way that prevent laying the seeds for this anti-pattern?

For instance, would not the use of a service bus and an approach that says each application makes calls to and receives responses from a service bus, free you from having services that call each other? Now, your services are no longer dependent upon other services or even other back-end data stores, so as new processes are defined and/or new systems are implemented (or others retired), your services remain agnostic to those changes.

This requires your service bus to have the logic which says, "this message needs to be routed here, while that message needs to be routed there." Wouldn't this approach resolve the knot anti-pattern before it ever originates?
The concrete answer to this comment is that a service bus is one of the candidate solutions to solve/circumvent the Knot anti-pattern (as I also mentioned when I described the anti-pattern). The question it begs, however, is how do you know at the outset that the service bus is the right architectural decision for the project?! This question has much wider implications.

In "Who needs an Architect?" (a worthwhile read in itself) Martin Fowler mentions that we can look at architecture as "things that people perceive as hard to change". The conclusion from that is that an architect can do her work better if she doesn't impose these "hard to change things", or does so as late as possible.
My experience is that when you start a "new grounds" project (such as moving an enterprise to SOA) there are a lot of moving parts. What I mean by that is that the uncertainty levels are very high, e.g. the requirements are not set, the understanding of the technology and/or domain is partial, the team is new and whatnot. Making a definitive architectural decision which is "hard to change", has a lot of effect on how you design your system and/or has substantial costs (in licensing, training, adoption etc.) is not necessarily the right decision. In fact, chances are your initial architectural decision will be flawed.

A phrase I once heard from Ivar Jacobson is "plan to throw one away, you will anyway". This is something I try to take with me, and I defer costly decisions if possible, especially considering that initial releases usually suffer from "time-to-market" constraints. To use a cliche - sometimes you need to go slow to go fast. By the way, this is one place where I don't agree with Uncle Bob, who recently said "When is redesign the right strategy? ... Here's the answer. Never."

Like every guidance, this isn't always true. For instance, if this is your n-th similar project and you already know enough about it to say that an architectural pattern X (say service bus) or technology Y (say Hibernate) is good then, yeah go ahead and use that. You still want to consider the "cost to change" though since you can still be wrong.



 
Tags: SOA | Software Architecture

January 5, 2009
@ 08:02 PM
We are going to use some of our test code in production. Yes, you read that right: test code in production. Here are the details.
In our system, among other things, we support visual search in video calls, i.e. an end user calls the system, points the camera at something she is interested in, and (hopefully :) ) gets relevant information. Basically the system is made of several resources (image extraction, identification etc.) that collaborate via an event broker. We have a blogjecting watchdog that makes sure everything is up and running, and we have an applicative recovery service to handle failures.
The watchdog makes sure resources/services are up, and resources report their liveliness and wellness, so we know more about the resources than the fact that they are up. However, we still need a way to make sure that resource instances can collaborate to provide the service.

Enter our automated acceptance tests. Part of our development effort included building a test runner for automated test scenarios, e.g. load tests, verifying algorithm correctness etc. One of these tests is the smoke test (run after each successful build), which includes a sunny-day scenario of a video call as described above. What we're going to do now is build, on top of the test runner and the sunny-day scenario, a "keep-alive" tester that will periodically make test calls to the system (depending on the current load etc.) and make sure that everything is still working correctly.
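To make the idea a bit more concrete, here is a minimal sketch of such a keep-alive loop in Python. It is not our actual test runner: the names (`keep_alive_interval`, `run_keep_alive`), the load metric (a fraction between 0.0 and 1.0) and the interval values are all assumptions made for illustration.

```python
import time
from typing import Callable


def keep_alive_interval(load: float, base_seconds: float = 60.0,
                        max_seconds: float = 600.0) -> float:
    """Seconds to wait before the next test call; the wait grows with load (0.0 to 1.0)."""
    clamped = min(max(load, 0.0), 1.0)
    return base_seconds + (max_seconds - base_seconds) * clamped


def run_keep_alive(smoke_test: Callable[[], bool],
                   current_load: Callable[[], float],
                   on_failure: Callable[[], None],
                   rounds: int,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Run the sunny-day smoke test `rounds` times and return how many runs failed."""
    failures = 0
    for _ in range(rounds):
        try:
            passed = smoke_test()
        except Exception:
            # A crashing test call counts as a failed keep-alive check.
            passed = False
        if not passed:
            failures += 1
            on_failure()  # hypothetical hook, e.g. notify the recovery service
        sleep(keep_alive_interval(current_load()))
    return failures
```

The `sleep` parameter is injectable only so the loop can be exercised without actually waiting; in production you would leave the default.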


So there you have it, an unexpected benefit of automated acceptance tests, who would have thunk it :)



 
Tags: .NET | SOA | Software Architecture | TDD | WCF

The year is almost done so I thought it would be a good time for a short retrospective on what I blogged here. The 13 posts below are the ones I liked best this year. Turns out these posts touch on a lot of different subjects: requirements, software management, agile development, architecture, SOA and programming.



 
Tags: Agile | Project Management | SOA | SOA Patterns | Software Architecture | TDD

December 16, 2008
@ 10:36 AM
An initial draft for the Knot anti-pattern. As usual, any comments are welcome. You can also download it in PDF form.

Everything starts oh so well. Embarking on a new SOA initiative, the whole team feels as if it is pure green field development. We venture on - the first service is designed. Hey look, it's got all these bells and whistles; we are even using XML so it must be good. Then we design the second service, and it turns out the first service has to talk to the second – and vice versa. Then comes a third, which has to talk to the other two. The fourth service only talks to a couple of the previous ones. The twelfth talks to nine of the others and the fourteenth has to contact them all – yep, our services are tangling up together into an inflexible, rigid knot.

 

The above might sound to you like a wacky and improbable scenario - why would anyone in their right mind do something like that?  Let’s take another look, with a concrete example this time, and see how the road to hell is paved with good intentions. In Figure 10.1 below we see a vanilla ordering scenario. An ordering service sends the order details to a stock service, where the items are identified in the stock, marked for delivery and then sent to a delivery service which talks to external shipping companies such as DHL, FedEx etc.




Figure 10.1 a vanilla ordering scenario. An ordering service sends the order to a stock service, which provisions the goods to a delivery service, which is responsible for sending the products to the customer

 

If we think about it more we’ll see that when an item is missing from the stock we probably have to talk to external suppliers, order the missing items and wait for their arrival, so the whole process is not immediate. Furthermore, since the process takes time, it seems reasonable to allow cancelling the process if an order is cancelled. It seems we have two options (see Figure 10.2): either the ordering service asks the two other services to cancel processing related to the order, or the two services call the ordering service before they decide what to do next. Naturally the system wouldn’t stop here. We would want to introduce more services and more connections, e.g. an Accounts Payable service that interacts with the external suppliers, the stock service and the delivery service (since we also need to pay shipping companies) etc.



Figure 10.2 a little more realistic version of the Ordering scenario from figure 10.1. Now we also need to handle missing items in the stock, cancelled orders and paying external suppliers. In this scenario the services get to be more coupled. For instance the Ordering service is now aware of the delivery service and not just the stock service.

 

With each new service we draw more lines going from service to service, and with each new service we update the services’ business logic with the new business rules as well as knowledge of the other services’ contracts.

 

1.1.1 Consequences

Well, so we get more lines going from service to service. That’s normal, isn’t it? After all, if the services don’t talk to each other they won’t be very useful. Isn’t that the whole point of SOA?

 

Well, yes – and no. Yes, it is normal for services to connect to each other. After all, creating a system in an SOA is connecting services together. As for the “no” part, the problem lies with the way we develop these integrations: if you are not careful it is easy to get all the integration lines into a big, ugly mess – a knot.

 

A knot is an anti-pattern where the services are tightly coupled by hardcoded point-to-point integration and context-specific interfaces.

 

For instance, what happens when we want to reuse the ordering service mentioned above? No problem, we just call it from the new context. Alas, the knot prevents us from reusing it without hauling in the rest of the baggage - all the other services we defined above (the stock, delivery etc.). If the new context is not identical in its ordering processes and does not match what we already have, we can’t use it. Or we can’t use it without adding one-off interfaces where we add specific messages for the new context and all sorts of “if” statements to distinguish between the old and the new behavior. Another option is to make this distinction in the original messages, which is either not possible or forces us to make sure the other services are still functioning. In any event it is a big mess.

 

Let’s recap. We moved to SOA to get flexibility, increase reuse/use within our systems and prevent spaghetti point-to-point integration. What we see here is inflexible and hard to maintain; basically it seems like we are back at square one, and we invested gazillions of dollars to get there.

 

 

1.1.2 Causes

How did that happen?  How can a wonderful, open standards, distributed, flexible SOA deteriorate to an unmanageable knot?

 

It is tempting to dismiss the knot as the result of a lack of adequate planning. If we had only planned everything in advance we wouldn’t be in this mess now. Well, besides the point that trying to plan everything ahead of time is an anti-pattern in itself (an organizational anti-pattern, which isn’t in the scope of this book), there’s still a good chance you’d get to a Knot anyway, since the problems are inherent in the way businesses work.

 

If we take a look back at the Integration Spaghetti scenario discussed in chapter 1 (depicted as figure 10.3 below), we can see that the phenomenon was there as well: when our business processes evolve we find we need to interact with information from other parts of the system. The flow of a business process expands to supply that needed information or service and thus the Knot grows.



Figure 10.3 the Knot anti-pattern is similar in both effect and origin to the spaghetti integration in non-SOA environments

 

From the technical perspective, we have two forces working here. One is the granularity of the services. On the one hand, services are sized so that a business process requires several of them to work together. On the other hand, they aren’t so small that each would be an end-node in the process (i.e. only other services would call the service and it would just return a result). Note that this isn’t a bad thing in itself. After all, if each process was implemented by a single service we’d have silos not unlike the ones we try to escape by using SOA, and if we made the services too small we’d fall into another trap (see the Nanoservices anti-pattern later in this chapter). The bottom line is that while granularity is a force that drives us toward the Knot, there’s not a lot we can do about it without getting ourselves into worse problems.

 

The second, stronger, force that pushes a system into a Knot is the business process itself. Since, as we mentioned above, the process flows through the services, the services need to be aware of the flow and then call other services to complete it. In order for a service to call another service it has to know about its contract and about its endpoint. When another business flow goes through that service we not only add the new contracts and endpoints but also the contextual knowledge of which other services to call depending on the process. And that, my friends, is exactly the thing that gets us into trouble – the services tie themselves to each other more and more as we implement more business processes and more flows.

 

Hey, you say, but SOA should have solved all that, surely there is something we can do about it – or is there?

 

1.1.3 Refactoring

 

The previous section explains that most of the problem is caused by having the services’ code determine where to go next and what to do with the results of the services’ processing. If there was only a way to somehow pry these decisions away from the services’ greedy hands…  As you’ve probably guessed there is such a way. In fact there are several such ways and this book lists three of them: the Workflodize pattern (Chapter 2), Orchestrated Choreography (Chapter 7) and Inversion of Communications (Chapter 5). Let’s take a brief look at each of these patterns and see how they help.

 

The Workflodize pattern suggests adding a workflow engine inside the service, both to handle Sagas (i.e. long running operations, see chapter 5) and to add flexibility. The “added flexibility” is the card we want to play here. When we express the connections as steps in the workflow they are not part of our services’ business logic. They are also easier to change in a configuration-like manner. Both of these points are big plusses.
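As a rough illustration of "connections as steps in a workflow", here is a small Python sketch. It is not a real workflow engine and not the pattern's reference implementation; the step names, the `ORDER_FLOW` table and the `run_workflow` function are all made up for this example. The point is only that the "what comes next" decisions live in a data structure that can be changed like configuration, rather than inside the business logic.

```python
from typing import Callable, Dict, List, Optional

# Flow definition as data: step -> {outcome -> next step}; None ends the flow.
# All step names here are hypothetical.
ORDER_FLOW: Dict[str, Dict[str, Optional[str]]] = {
    "reserve_stock": {"ok": "schedule_delivery", "missing": "order_from_supplier"},
    "order_from_supplier": {"ok": "schedule_delivery"},
    "schedule_delivery": {"ok": None},
}

# Business logic per step: each action only reports an outcome,
# it never decides which step runs next.
ACTIONS: Dict[str, Callable[[dict], str]] = {
    "reserve_stock": lambda ctx: "ok" if ctx["in_stock"] else "missing",
    "order_from_supplier": lambda ctx: "ok",
    "schedule_delivery": lambda ctx: "ok",
}


def run_workflow(flow: Dict[str, Dict[str, Optional[str]]],
                 actions: Dict[str, Callable[[dict], str]],
                 start: str,
                 context: dict) -> List[str]:
    """Walk the flow from `start` and return the steps in the order they ran."""
    visited: List[str] = []
    step: Optional[str] = start
    while step is not None:
        visited.append(step)
        outcome = actions[step](context)
        step = flow[step][outcome]  # an outcome the flow doesn't list raises KeyError
    return visited
```

Changing the process (say, adding an approval step before delivery) means editing `ORDER_FLOW`, while the actions themselves stay untouched.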

Still, a better way to solve the service-to-service integration problem is to use an external orchestration engine. The idea of using the Orchestrated Choreography pattern is to enable Business Process Management, i.e. a way for the organization to control and verify that its processes are carried out as intended (you don’t need an orchestration engine for that but it helps…). In the context of solving or avoiding the Knot anti-pattern, Orchestrated Choreography is better than Workflodize since it centralizes and externalizes all the interactions between services and thus effectively removes all the problematic code from the services themselves. Note that there’s a fine line between externalizing flow and externalizing the logic itself (see the discussion in the Orchestrated Choreography pattern, in chapter 7).

 

The third pattern we can use to refactor the Knot is Inversion of Communications. Inversion of Communications means modeling the interactions between services as events rather than calls. Inversion of Communications is, in my opinion, the strongest countermeasure to the Knot. The two patterns mentioned above bring a lot of flexibility in routing the messages between the services. The Inversion of Communications pattern also helps the message designers remove specific contexts from the messages, since when the service’s status is raised as an event it isn’t addressed to any other service in particular. Note that using Inversion of Communications doesn’t negate using either of the two other patterns mentioned above, since once the event is raised we still need to route it to other services and using a workflow engine is a good option for that. Another implementation option is to use an infrastructure that supports publish/subscribe (see the pattern’s description in chapter 5 for more details).

 

Let’s go back to the ordering scenario we mentioned above. As I mentioned, the services grow with needless knowledge of specific business processes. So for instance, the ordering service had to know about both the stock service and the delivery one. Refactored with the Inversion of Communications pattern, the same Ordering service doesn’t have to know about any of the other services. In Figure 10.4 we can now see that the Ordering service sends two business events (new order, cancelled order) and the routing of these messages is no longer the responsibility of the service.



Figure 10.4 the Ordering service using the Inversion of Communications pattern. Now the service doesn’t know/depend on other services directly. It is only aware of the business events of new order and cancelled order, which are relevant to the business function that the service handles
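The refactored Ordering service can be sketched in a few lines of Python. This is only an in-memory toy under stated assumptions: `EventBroker`, `OrderEvent` and the event names `new_order` / `cancelled_order` are hypothetical, and a real system would use a messaging or publish/subscribe infrastructure rather than direct function calls. What the sketch shows is the direction of knowledge: the Ordering service knows only its own business events, and whoever is interested subscribes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class OrderEvent:
    kind: str       # "new_order" or "cancelled_order" (hypothetical event names)
    order_id: str


class EventBroker:
    """Toy in-memory publish/subscribe broker."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[OrderEvent], None]]] = defaultdict(list)

    def subscribe(self, kind: str, handler: Callable[[OrderEvent], None]) -> None:
        self._subscribers[kind].append(handler)

    def publish(self, event: OrderEvent) -> None:
        # Deliver to every handler registered for this kind of event.
        for handler in list(self._subscribers[event.kind]):
            handler(event)


class OrderingService:
    """Raises business events; has no reference to the stock or delivery services."""

    def __init__(self, broker: EventBroker) -> None:
        self._broker = broker

    def place_order(self, order_id: str) -> None:
        self._broker.publish(OrderEvent("new_order", order_id))

    def cancel_order(self, order_id: str) -> None:
        self._broker.publish(OrderEvent("cancelled_order", order_id))
```

Adding an Accounts Payable service later means adding one more subscription; the `OrderingService` class does not change.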

 

Refactorings aside, one question we still need to think about is whether there are any circumstances where having a Knot is acceptable.

 

1.1.4 Known Exceptions

 

In a sense the Knot is a distributed version of an anti-pattern described by Brian Foote and Joseph Yoder as “Big Ball of Mud” – spaghetti code where different parts of the system are tied to each other in unmanageable ways. I mention the connection because the reason “Big Ball of Mud” might be considered a pattern rather than an anti-pattern also applies here:

 

“[when] you need to deliver quality software on time on budget… focus first on features and functionality, then focus on architecture and performance”

 

Starting out on a large project, such as moving an enterprise to SOA, is difficult enough as it is. You can’t figure out everything in advance; you need to deliver something – so as Nike says, “just do it”. Get something done. You do need to be prepared to let go and redesign further down the road. In the current system I’m working on, a visual recognition/search engine for mobile, we went with a “knot” approach for the first release. The simplicity of the implementation (i.e. less investment in infrastructure, ad hoc integration etc.) enabled us to deliver a first working version in less than 6 months. These 6 months also helped us understand the domain we are operating in much better and, more importantly, get to market with the features the business needed on the schedule the business wanted. We spent the next 6 months rewriting the system in a proper way, including applying the Inversion of Communications pattern mentioned above.

 

To sum this up, coding the integration code into services is likely to end as a Knot. It is acceptable to go down this path for a prototype or first version i.e. to show quick results. However you do need to plan/make the time to refactor the solution so you will not get stuck down the road.





 
Tags: SOA | SOA Patterns | Software Architecture

December 8, 2008
@ 10:56 PM
I am (finally) writing some new stuff for my SOA book - working on a few Anti-patterns
  • The Knot - The distributed version of "big ball of mud" basically point to point integration
  • NanoServices - designing/building fine grained services (methods != services)
  • 3-tiered SOA - dressing up 3-tier architecture in SOA clothing (e.g. database as a service)
  • Whitebox Services - exposing internal structure - comes in two flavors: exposing technology and allowing access that doesn't go through contracts
  • Transactional Integration - inter-service transactions (use Sagas instead)
  • RESToid - combining SOA and REST without understanding the full implications of either
I am going to publish one of them (probably the "knot") in a few days but I thought I might be able to get a little feedback before that. I chose to describe anti-patterns in the following format:

  •  Context - Presenting the problem (probably through an example)
  •  Consequences - Explaining what the problem is. i.e. what happens when the anti-pattern is prevalent
  •  Causes - discussion on the forces that lead to the anti-pattern
  •  Refactoring - The patterns (and/or other tips) that can be used to fix the design
  •  Known Exceptions - Are there any contexts where using the anti-pattern is acceptable
I'd be happy to hear any comment you have on the anti-patterns listed above as well as comments on the structure for describing them

Thanks
Arnon


 
Tags: REST | SOA | SOA Patterns | Software Architecture

While I am talking about presentations - if you live in Israel and want to learn about software architecture I recommend you check out the seminar that Rick Kazman will be giving on Dec. 15th and Dec. 16th.

Rick was a researcher at the SEI and is a co-author of "Software Architecture in Practice" (probably the best book on software architecture until "Software Systems Architecture" dethroned it, but I digress) and one of the people responsible for the "Architecture Tradeoff Analysis Method" - ATAM (you can see my presentation on ATAM for an intro).
While I expect the seminar to be on the "process heavy" side (after all ATAM is heavy, not to mention that SEI also brought us CMMI), it is probably the best architecture training you can get in Israel (well, at least until I find the time to make my architecture training program into a reality ;)). In any event SEI's work on quality attributes provides, in my opinion, one of the most important tools for architects to capture and understand architectural requirements, and the section on that is reason enough to attend.

By the way, this seminar is organized by ILTAM, which is an organization that doesn't have much to do with software. Also, judging by the attendance in past events, it seems to be a little-known organization. Yet, over the years, they've managed to bring here some of the best minds in the software field, e.g. Ivar Jacobson, Jim Coplien, Don Ferguson, Philippe Kruchten and now Rick Kazman.



 
Tags: Software Architecture

I got a question from Dru asking for my opinion of two messaging subscription modes - subscription by message (type) and subscription by topic.
The way I see it there are two different usages for topics.

The first use for topics is for grouping or marking related messages. In this scenario you can actually break the subscription into three different levels of  generalization:
  1. Per message- interested parties subscribe to a specific type
  2. Topics - interested parties subscribe to a set of related types
  3. Topics hierarchy - interested parties subscribe to a set of sets
Here, when it comes to topics - on the pro side you get to easily subscribe to a lot of message types and on the con side you get to easily subscribe to a lot of message types...
The less specific the subscription, the harder it is to ensure it would work in open environments (i.e. when different organizations or groups get to integrate with your services). The problem lies in the number of different messages you need to be able to handle/understand/parse and in the control over new types of messages. Getting versioning right with messages is hard enough; when you have a hierarchy, well, that's just much harder.
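The three levels can be sketched with a tiny matching function in Python. This assumes topics are written as dotted paths with the message type as the last segment (e.g. `orders.cancellation.OrderCancelled`); that naming scheme and the `matches` function are inventions for this example, not the convention of any particular product.

```python
def matches(subscription: str, topic: str) -> bool:
    """True if `topic` is the subscription itself or sits beneath it in the dotted hierarchy."""
    return topic == subscription or topic.startswith(subscription + ".")


# Hypothetical message, addressed by hierarchy.topic.message-type
message_topic = "orders.cancellation.OrderCancelled"

per_message = matches("orders.cancellation.OrderCancelled", message_topic)  # level 1: one type
per_topic = matches("orders.cancellation", message_topic)                   # level 2: a set of types
per_hierarchy = matches("orders", message_topic)                            # level 3: a set of sets
```

Note how the broadest subscription (`orders`) will also match any message type added under it later, which is exactly the pro and the con mentioned above.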

The second use for topics is routing. In this scenario a specific message type can be sent using different topics, and the topics basically become part of the meta-data of the message. The supporting infrastructure can then use that meta-data to get messages to different subscribers. For example, in a defense system project I participated in, we used Tibco Rendezvous' support for topics to define interest regions on a closed set of messages, e.g. say you want only the messages related to the Middle East or the ones related to the US etc.
In the current infrastructure I am building I am going to implement something similar to topics (albeit without hierarchies) to allow different routings based on different saga types (so services that stay the same don't have to change their behaviours).
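Here is a small Python sketch of topics used as routing meta-data. It does not use or imitate the Tibco Rendezvous API, and it is not the infrastructure I am building; `Envelope`, `TopicRouter`, the topic names and the subscriber names are all hypothetical. It only shows that the same message type can reach different subscribers depending on the topic it travels under.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Envelope:
    topic: str          # routing meta-data, not part of the message contract
    message_type: str
    payload: dict = field(default_factory=dict)


class TopicRouter:
    """Flat (non-hierarchical) topic router: topic -> names of interested subscribers."""

    def __init__(self) -> None:
        self._routes: Dict[str, List[str]] = {}

    def subscribe(self, topic: str, subscriber: str) -> None:
        self._routes.setdefault(topic, []).append(subscriber)

    def route(self, envelope: Envelope) -> List[str]:
        # Routing looks only at the topic; the message type plays no part.
        return list(self._routes.get(envelope.topic, []))
```

A subscriber interested in one region registers for that topic only, and the message type it has to understand stays the same closed set.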

To sum this up, I would say that in my opinion the latter use for topics is more useful for general purpose use and the first use for topics is more useful in closed systems.

P.S.
if you have an interesting question on SOA or architecture you can send it in and if I think it would interest a wider crowd I'll blog it here
 

Tags: ESB | SOA | Software Architecture | Q&A

Since the last post turned out kind of lengthy (3000+ words) I thought it would be more comfortable to read as a PDF, which you can download directly here or from the Papers, Presentations and Articles section of my site.
Also it is probably a good opportunity to mention that "Architect soft skills" is the fourth and last part of the series of posts:
 



 
Tags: General | Software Architecture

October 26, 2008
@ 11:55 PM

Introduction

There's a lot of discussion about the hard skills software architects need to have; see, for example, the one at the Software Engineering Institute (SEI). Architects need to be familiar with a wide range of technologies and methodologies, understand the software lifecycle, have design experience, some say an architect must write code, and so on and so forth. Indeed, the hard skills are important, very important. However it doesn't stop there. There are also several soft skills that you need to master if you are to be a good architect.

I believe that the minimal skill-set for an architect should include capabilities from the following areas:

  • Leadership. Influencing others to accomplish tasks and follow your guidance.
  • System thinking. Understanding decisions and constraints in the wide scope pertaining to the whole of the solution at hand. This includes the ability to abstract problems.
  • Strategic thinking. Understanding decisions and constraints and their alignment with the overall business of the company.
  • Organizational politics. Understand the environment you operate in and how it influences you.
  • Communications. Making sure you get your point across.
  • Human relations. Understand the "people" aspects or human factors and dynamics. This includes things like pragmatism, understanding team dynamics and personal dynamics

Let’s take a look at them one by one

Leadership

Solutions architects are the "technical managers" of projects. This means they are responsible that all the designs and code are aligned to the functional requirements and that the quality attributes are kept.

However, architects are seldom the direct managers of a development team - and even if the architect is the manager, you still need to inspire your workers. Tyranny? Well, that just doesn't work. You might think that establishing yourself as a technical authority will be enough (that's why they made you the architect in the first place, right?) but it isn't. You need to cultivate your leadership skills as well.

What is leadership anyway? Leadership is about exerting influence on people and increasing the chance that people will follow your vision and decisions. To do that you need to gain the respect of the teams you work with, communicate your vision and designs clearly and be trustworthy.

So how do you do that? Unfortunately, I don't have a definitive answer to that but here are a few things to think about which I think are useful:

  • Provide direction. To lead you need to know where you are going and make the decisions that will get you there.
  • Explain your decisions. Deus ex machina doesn't count. You work with intelligent people, they may not have your experience or the same depth of knowledge but they want to know why they are doing something.
  • Listen to what others have to say. They may actually say something valuable you know :)
  • Don't postpone decisions and don't avoid conflict. This will not make them go away. Do try to manage your conflict though.
  • Motivate people. This can be done by things like mentoring and teaching, allowing people design freedom. (Letting others make the decisions in their relative fields/areas even if the solution they propose is not perfect.)
  • Set an example. E.g., you can sit (pair with) other developers in the team to design/code important things together

Leadership is one of the most important soft skills. As I mentioned earlier, architects are usually not the managers but they do need to lead if they want to ensure the stakeholders’ needs (the system quality attributes) will indeed make it into the solution. To better grasp the quality attributes you’d need “System Thinking”

System Thinking

If you managed to make yourself a leader, it only increases your responsibility to actually know where to go. Two soft skills that can help you with that are System Thinking and Strategic thinking (which is described in the next section)

Microsoft Encarta defines a system as "any collection of component elements that work together to perform a task." (You may want to take a look at some of the characteristics of systems by Donald E. Gray). When we get a problem, that is, a software solution that needs to be built, we tend to think about breaking it down into manageable "parts" (the subsystems/components/services/objects) that make up that solution. This, however, will only take us so far if we don’t also employ "Systems Thinking".

System Thinking originated as a way to think about social systems but has emerged as a way of solving problems for systems in other practices. In essence it is about understanding that system parts that work together behave differently than each part alone ("The whole is greater than the sum of its parts"). This, by the way, is the reason "loose coupling" is such a holy grail--it helps reduce the parts' interdependence and interactions, and thus simplifies the system.

One important trait for understanding systems behavior and component interaction is the ability to create abstractions and models; that is, simplifications of the reality which contain enough detail to be useful (Another thing that is needed is the ability to communicate that to the different stakeholders which is another soft skill I'll expand upon). It is important to remember that mental models limit the perspective we have which is why we need to have several models and why it is beneficial to have more than one person working on a problem

As it happens this is also aligned with my definition of software architecture:

Software architecture is the collection of the fundamental decisions about a software product/solution designed to meet the project's quality attribute requirements. The architecture includes the main components, their main attributes, and their collaboration (i.e. interactions and behavior) to meet the quality attributes. Architecture can and usually should be expressed in several levels of abstraction (depending on the project's size).

This definition gives interactions and the environmental implications on the system as a whole the same weight as designing the parts themselves. If the architect doesn't (or can't) understand the effects of the components playing together the quality attributes (performance, availability etc.) of the system will suffer and the system will not operate as planned.

For an introduction to the subject, I recommend Gerald M. Weinberg's Introduction to General Systems Thinking. It lives up to its name.

The second part of understanding “where to go” is strategic thinking.

Strategic Thinking

While system thinking takes care of the system in its environment, strategic thinking is about understanding where the organization is heading and its long term goals so that the solution being developed is in sync with them. While I guess it is easy to see why this is important for internal projects, I think it is also important for product/solution companies.


Strategic thinking involves the techniques and thinking processes essential to setting and achieving the business's long term priorities and goals. Strategic thinking (understanding the problems) is the preamble to strategic planning -- but we'll leave planning to management and focus on the understanding which is important for the architects.

Ruth Malan and Dana Bredemeyer define the Strategic Thinking soft skill (which they call "Strategic Perspective"):

“[An architect who has strategic perspective (ARGO)] understands the industry, market, customers, competitors, suppliers, partners and capabilities of the business. Identifies opportunities and threats, and actively identifies trends and future scenarios. “

This is a good definition because it really explains why this skill is so important for architects. One of the architect's core roles is to understand what the different stakeholders need, then balance these needs to create a usable, robust solution. Their needs, expressed as quality attributes, are the driving force of the architecture.

Furthermore, if you think about all this "align business and IT" stuff we constantly hear about these days (especially in regard to SOA), it is evident that all the careful planning of how technology and software can help in getting that alignment is useless unless we really have a good understanding of where the business is going and what this alignment really is. Thus, the architect should really “get it”. The architect should first understand what the business is about and where it is going. Then, armed with these understandings and insights, she can translate them into technological and architectural decisions that ensure these needs are met.

In an ideal world, gaining these insights would be enough. However, the architects do not operate in a vacuum. Architects should also understand the organizational forces that can make the solution work (or break)

Organizational Politics

If strategic thinking helps you understand where the organization is going, understanding organizational politics helps you understand how the organization is working.

Consider the following anecdote. A while ago I co-architected a Naval Command and Control system. One of the key elements of that system was a service bus component which we wanted to base on a commercial messaging middleware. We thoroughly explained to the project’s management why messaging was the best choice. Nevertheless, another solution based on a proprietary (and fundamentally flawed) distributed objects middleware was constantly suggested and eventually made a constraint we had to follow. It took us several iterations (and a lot of rework) to prove that a messaging middleware was indeed the (much) better solution for that project. What happened here? Two experienced architects gave a ton of good reasons justifying a technical decision, but somehow that decision was overruled. Why?

Decision making, especially technical decision making, seems like such a logical process. You just look at the alternatives, analyze their merits vs. the problem at hand, and may the best option win. This works out well if you are the king (or work alone, which makes you the king by default) -- otherwise there are other people and they won't necessarily agree with you. One reason for that may be that they really have another solid opinion, in which case you need to negotiate with them, but that has to do with the leadership skill (already mentioned) or the communications skill (discussed later). The other reason for people to disagree is that they may have other interests and agendas, which run much deeper than the positions they externalize (i.e. their disagreement with you).

Organizations (and the larger they are, the more complex they get) tend to get to decisions by employing a system of rules which encompasses a lot of interests on top of the rational reasons to agree or disagree. Understanding Organizational Politics is about understanding these non-rational influences on the decision making processes.

To return to the anecdote above, it didn't take us too long to understand the real motivation there. It turns out both the project leader's boss and a few others recommended buying that flawed component, which also happened to cost a small fortune. As that component was already bought, it had to justify itself by being used everywhere. In this particular case the only way we found to reverse the decision was to prove that it was flawed and to minimize its infiltration into the project (so it would be relatively easy to remove it later). In other cases there might be more cost-effective ways to achieve the desired result.

The first thing to do is understand where you are. This is one of the reasons the first step in the SPAMMED framework is to understand the stakeholders. If you manage to uncover the agendas and interests of the different stakeholders as well as the influence they may have on your project, it will at least help you pin-point your problems.

The tricky part is knowing how to navigate and influence the organization. Tools you can use are interpersonal skills, networking capabilities and schmoozing, and you also need excellent communication skills.

The point I am trying to make here is that even though technical people tend to regard organizational politics as dirty, you cannot afford to dismiss them. Organizational politics can have a severe influence on your project and actions. I didn't talk a lot about how to become a more judicious political animal, since I am probably not qualified enough to do that (you may want to check some of the resources below though).

More resources:

As I mentioned, one of the essentials for actually getting your points across is good communication skills.

Communications Skills

“What we’ve got here is failure to communicate. Some men you just can’t reach…” - this can work for a warden in a movie or as an opening to a song[1], but it is definitely not an excuse for an architect. The architect is a hub of communication between management, users, developers and what not. For an architect, a failure to communicate can mean a conceptual bug down the line if talking to a developer, increased costs and animosity if talking to a project manager, or even cancellation of the project if talking to upper management.

As a senior technical person you already know how to solve problems and translate your ideas into working code. However, as an architect working with teams you have to solve a few more soft problems. One barrier to cross is conveying your ideas to others, i.e. presentation skills.

Presentation skills start with the way you create your slides (e.g. bullet points vs. telling a story) and go well beyond that into how you present, your stance, your interaction with the audience etc. Note that a presentation doesn’t have to be a keynote address at OOPSLA; it can be a by-the-whiteboard stand-up with a colleague. There are different focuses for each type of presentation but the principles are the same.

So assuming we covered presentation skills, is that enough? No, since explaining yourself will only set you off on a journey. You also need to engage in a dialog with the “others” so that you reach an agreement (that you are right, or a negotiated middle ground, or an understanding of your errors). Thus the next component of communication skills is negotiation skills. Negotiation skills will let you defuse situations that would otherwise deteriorate into a bunch of raving lunatics shouting at each other, and instead have a collaborative, common problem-solving experience. That may sound like BS, but if you want to move things in positive directions you need to be prepared. If anything, you should at least know when to stop (a.k.a. BATNA – Best Alternative to a Negotiated Agreement), but there are many more things you can do before that (see more resources below).

More resources:

 

Human Relations

Negotiation skills, leadership and dealing with organizational politics are all, in a sense, aspects of Human Relations. Nevertheless, Human Relations has a few additional aspects which I think we, as architects, should be aware of. Basically there are three main components I want to discuss: team dynamics, personal dynamics and pragmatism.

The first aspect of human relations is team dynamics. There are several models explaining the various stages teams go through as they grow. I think the best known is Bruce Tuckman’s model: Forming-Storming-Norming-Performing. The team members act and interact differently during these stages. As a (technical) leader of a team or teams, you should be aware of these dynamics and behave accordingly. For instance, mentoring and guidance are usually more welcome during the Forming stage, while these efforts might be rejected during the Storming one.

Looking at the team as a whole is one thing. However, the developers and the rest of the stakeholders are all individuals, each with his/her own personality, motivations and what not. Again, there are many theories related to individuals, from Maslow’s pyramid of needs (which talks about what motivates people) to Jung’s personality types, which affect how people interact with each other. E.g. software developers are likely to be Introverted, Sensing, Thinking, Judging/Perceiving (ISTJ/ISTP) so, for instance, they would not like to be told to do things that don’t make sense to them. In any event, I guess my main advice here is to be aware of some of these theories and pay attention to how they manifest themselves in situations you encounter.

In a company I once worked for, I got a small tip from my direct manager. It turned out that when I was talking to people I didn’t hold in high regard it, well, er.. showed. This made them feel uncomfortable working with me, but unfortunately none of us was going away. Being aware of the situation made me improve myself. For instance, I wouldn’t just cut off their speech when they tried to make a point, I tried not to talk down to them, and hey, I even listened sometimes :) This brings me to the last point I want to make on the subject: pragmatism.

Pragmatism - the art of the possible. As a leading technical figure (I assume that's how you got to be an architect) you should be wary of architectural tyranny. That means letting others have their way even if that way is not the great (I am sure) way you’ve managed to come up with. There’s a whole lot of clichés that can be thrown in here, like “there’s more than one way to skin a cat”, “better be smart than right” etc. So what? Sometimes even clichés are right :). As projects span a long time and you have to maintain good relations with most if not all the involved parties, you should be watchful for absolute truths. Sometimes a little bit of pragmatism can go a long way towards making the atmosphere of the project calmer and nicer. If you take into account the person you are talking to or

 

Further reading

  • Team dynamics what to expect and do - Abhishek Agrawal talks about some of the models for team dynamics
  • Collaboration Explained - A book by Jean Tabaka on team collaboration

Summary

The architect’s role goes well beyond the technical skills. I hope that this short paper helped highlight some of the softer skills that are required (in my opinion) for an architect to be successful.

The goal of this paper was not to teach all the soft skills but rather to highlight them and increase awareness of them. I’d be happy to hear if you find any of the stuff here useful.



[1] Cool Hand Luke (1967) and the opening line of Guns N’ Roses’ “Civil War”


 
Tags: General | Papers | Software Architecture

October 18, 2008
@ 10:58 PM
As I mentioned in a couple of previous posts (like "Using REST along with other architectural ), I've been spending the last few weeks writing an Event system over WCF (probably also explains posts on  WCF gotchas like this;) ). Being a communication infrastructure it is still a long way from being completed, but it seems to be stabilizing and I think it turned out nicely so I thought I'd share a few details.

Let's start with the simple part - the usage.
The eventing is built on the idea of a bus (i.e. no centralized components), and the resources/services that want to use eventing have to use a library which I call EventBroker. There are two modes for using the EventBroker. One is "regular" events, which are contextless. This means that consecutive events can reach different services, and there is no context that flows from event to event:

bool raisedEvent = eb.RaiseEvent<SampleEvent>(new SampleEvent());
The second type of events are Sagas, which represent long running interactions. Sagas do have a "best effort" guarantee to reach the same recipients over consecutive calls. You can also End a saga (successful termination), Force End a saga (successful termination by a service that didn't initiate the saga) and Abort a saga (unsuccessful termination). Here is how you raise a saga event:
var evnt = new SampleEvent { data = somevalue};
var SagaId = Guid.NewGuid();
eb.RaiseSagaEvent<SampleEvent>(SagaId, evnt);
If you use the same Saga Id, the events are handled as part of the same saga; if you use a Saga Id that wasn't previously defined, it will initialize a new saga.
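To make the "same Saga Id, same saga" rule a little more concrete, here is a minimal sketch of how such bookkeeping could look. It is not the EventBroker's actual code: SagaRouter, Route, End and the recipient-picking delegate are all hypothetical names I made up for illustration, and the real library surely has to deal with concurrency and failures as well.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: none of these names come from the real EventBroker.
public class SagaRouter
{
    // Saga id -> recipients chosen when the saga was first seen.
    private readonly Dictionary<Guid, IReadOnlyList<string>> _sagas =
        new Dictionary<Guid, IReadOnlyList<string>>();

    // Assumed strategy for choosing recipients for a brand new saga.
    private readonly Func<IReadOnlyList<string>> _pickRecipients;

    public SagaRouter(Func<IReadOnlyList<string>> pickRecipients)
    {
        _pickRecipients = pickRecipients;
    }

    // Known id: reuse the recipients. Unknown id: initialize a new saga.
    public IReadOnlyList<string> Route(Guid sagaId)
    {
        if (!_sagas.TryGetValue(sagaId, out var recipients))
        {
            recipients = _pickRecipients();
            _sagas[sagaId] = recipients;
        }
        return recipients;
    }

    // Ending (or aborting) a saga forgets its recipients.
    // Returns false if the saga was not known.
    public bool End(Guid sagaId) => _sagas.Remove(sagaId);
}
```

The "best effort" part is outside this sketch: if a remembered recipient goes away, something has to pick a replacement.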
The EventBroker translates events to the relevant contract and dispatches the events to the different subscribers. Which brings us to the next part, which I guess is also a little more interesting: how subscriptions are defined.

The first thing to do is to define the event itself.
    // An event is a plain class marked with the (empty) ImEvent interface
    public class SampleClassEvent : ImEvent
    {
        public string DataMember1 { get; set; }
        public int DataMember2 { get; set; }
    }
There aren't any real constraints on the event, except that it has to "implement" the ImEvent interface. It is really an empty interface, but it marks the event as one for the event broker.
Then you have to define an interface for handling the event. The event broker builds on the idea of convention over configuration (an idea popularized by the Rails framework), so it is easy to generate the interface (something I do with a ReSharper template).
    [ServiceContract]
    public interface IHandleSampleClass
    {
        // Operation name = event name without the "Event" suffix
        [OperationContract]
        int SampleClass(SampleClassEvent eventOccured);
    }
The convention is that the interface will have an IHandle prefix followed by the name of the event. It will hold a single operation named like the event (without the Event suffix) and will receive a single parameter, which is the event data. Currently events do return a value (int), but I am thinking about changing it to void and having everything marked as OneWay for added performance.
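The naming convention is mechanical enough to write down as code. The following is just a sketch of the rule as described above, not the broker's implementation; HandlerConvention and its methods are hypothetical names, and the SampleClassEvent here is a stand-in so the snippet compiles on its own.

```csharp
using System;

// Stand-in for the event class shown above (hypothetical, for compilation only).
public class SampleClassEvent { }

// Hypothetical helper that spells out the IHandle naming convention.
public static class HandlerConvention
{
    private const string EventSuffix = "Event";
    private const string ContractPrefix = "IHandle";

    // "SampleClassEvent" -> "SampleClass" (names without the suffix are kept as-is)
    public static string OperationName(Type eventType)
    {
        var name = eventType.Name;
        return name.EndsWith(EventSuffix, StringComparison.Ordinal)
            ? name.Substring(0, name.Length - EventSuffix.Length)
            : name;
    }

    // "SampleClassEvent" -> "IHandleSampleClass"
    public static string ContractName(Type eventType)
    {
        return ContractPrefix + OperationName(eventType);
    }
}
```

With something like this the broker (or a code generator) only needs the event type to know which contract and operation to look for.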

Now, when we create a service which needs to handle events, it does that by specifying which events it handles. E.g.
    [ServiceContract]
    public interface ImSampelResource : ImContract, IHandleSampleClass, IHandleSomeOtherThing
    {
    }
So each contract declares all its subscriptions (by a list of IHandleXXX). It should also include the ImContract interface, which holds all the service operations used by the event broker (e.g. ending sagas etc.).
Services that want to raise events should inherit from a ControlEdge class (a base class Edge component that delegates control events to the event broker).
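Since subscriptions are declared purely through the IHandleXXX interfaces on the contract, they can be discovered with plain reflection. Here is a small sketch of that idea; SubscriptionScanner is a hypothetical name, and the interfaces are empty stand-ins for the ones shown above (without the WCF attributes) so the snippet is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Empty stand-ins for the contracts shown above (hypothetical, for compilation only).
public interface ImContract { }
public interface IHandleSampleClass { }
public interface IHandleSomeOtherThing { }
public interface ImSampleResource : ImContract, IHandleSampleClass, IHandleSomeOtherThing { }

// Hypothetical helper: lists the events a contract subscribes to.
public static class SubscriptionScanner
{
    private const string ContractPrefix = "IHandle";

    public static IReadOnlyList<string> SubscribedEvents(Type contract)
    {
        return contract.GetInterfaces()
            // Only the IHandleXXX interfaces are subscriptions; ImContract is skipped.
            .Where(i => i.Name.StartsWith(ContractPrefix, StringComparison.Ordinal))
            // Strip the prefix to get the event (operation) name.
            .Select(i => i.Name.Substring(ContractPrefix.Length))
            // Sort for a stable result; GetInterfaces makes no ordering promise.
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToList();
    }
}
```

Calling SubscriptionScanner.SubscribedEvents(typeof(ImSampleResource)) gives the two event names, SampleClass and SomeOtherThing.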

There's still the question of how the event broker knows where to find other services. There are several ways this can be done (e.g. a service repository), but since we have blogjecting watchdogs in place anyway, we use them to propagate liveliness (and location) of services.

This sums up this post. It is basically just a little context for several planned posts where I hope to talk about some of the challenges, alternatives and design decisions that led me to the current design. Meanwhile, I'd also be happy to hear any comments, ideas or reactions you may have
 
Tags: .NET | Design | OO | SOA | SOA Patterns | Software Architecture | WCF | xsights

(This is part III of a series - see Part I, Part II)

We need to face it, there are no absolute truths when it comes to software architecture (I guess that's part of the reason the term always looks so fuzzy). Should we use REST? It depends. Should we use OR/M or direct database access? It depends. Sometimes even a big ball of mud can be a good option. The good news is that we can always answer "It depends" to any architectural question and always be correct. The bad news is that it is our role to figure out what it depends on and come up with a viable trade-off.

The fact that everything is a trade-off doesn't mean that there aren't any cases where the trade-off is simple. E.g. if you just have to build a couple of data entry screens and a simple database, you probably shouldn't spend a couple of months evaluating your options; just write the damn thing in your spare time. Nevertheless, just hacking it can mean it won't be extensible. If that's what you needed, then cool. If not, maybe you should have considered that.

This is one of the reasons I don't like the term "best practices" - it sends out a "you don't have to think anymore" message, which is oh-so-incorrect. Patterns, on the other hand, don't send out this message, as they also include a discussion of where to use them as well as their limitations and pointers to other patterns. Unless, of course, patterns are seen as an end-goal rather than a means, in which case they look pretty close to "best practices".

To sum up this post - everything is a trade-off. The most important bit here is to keep that in mind, even when the choices seem obvious. Awareness is the key to better decisions.







 
Tags: Design | Software Architecture

August 31, 2008
@ 06:59 PM
Software architecture is not a three-layer diagram (UI-Business logic-Data)! As an architect you need to consider the project/solution at hand from a lot of different angles and take care of all sorts of concerns: technical, team, managerial and even esoteric ones.

  • Technical - you need to consider things like threading models, data flow, data structures, testability, security and user interface. A solution is only as strong as its weakest link. If the UI is great but the system is not stable, you will fail. If the system scales well but the security is flawed, you will fail.
  • Team - you have to understand the limitations and capabilities of your teams and their structure. If everybody knows Cobol, maybe Rails is not a good option. If the teams are spread out geographically, you need to make sure you partition the system into chunks that will allow them to progress as independently as possible.
  • Managerial - brings up things like: What's the budget for the project? How much time do we have? Can you really plan on using a rule engine if it eats 80% of your budget (maybe?) Can you plan a cool piece of infrastructure if it will take 4 months to build?
  • Esoteric - sometimes you even need to consider less than common stuff. There aren't general examples here since it's, well, esoteric - but a couple of examples I've seen include a project where we had to see how much power the hardware we used needed (since it was to be deployed on a truck), and another project where we designed a multi-monitor UI and had to figure out whether it is better to design the UI for side-by-side monitors or for one atop the other.
You should note that most probably you will not be able to be a master of all the needed domains. You do need to be aware that they exist and work with other experts to cover the other bases of the solution.



 
Tags: Software Architecture

As a follow-up to the previous post, there are a few things I consider important for software architects. I wouldn't go as far as to call them axioms but I still think they are worthwhile.
  • Think Holistically
  • It is always a trade-off!
  • Pay attention to your soft skills
  • Get SPAMMED for architecture
I wrote a lot about the fourth one (see for instance this article in Dr. Dobb's). I'll expand a little about the others in the next few posts.


 
Tags: Software Architecture

August 30, 2008
@ 09:44 PM
I recently stumbled on "97 Things - Things every software architect should know" (via Bobby Woolf). This is a list of axioms for architects (which will eventually be a book by O'Reilly) edited by Richard Monson-Haefel. While I don't agree with all the axioms, and some, I feel, overlap a bit (e.g. one on trade-offs and one on balancing), there's a lot of great stuff there.
For instance, here are a few of the ones I like:



 
Tags: Software Architecture

Following my latest post on evolving the architecture Dru asked me for more details on our RESTful control channels.
For one you can take a look at slide 25 of my presentation on REST which talks about the Sessions resource. The session resource returns an AtomPub feed of the current active sessions and then if you follow a link to a session you get the current status, the URIs of the participating resources etc.
I guess the more interesting questions are (especially in light of all the ongoing REST debate we now see):
  1. Why rely on REST for the control channel
  2. Why not use REST for the whole system
So, why is REST a good option for the control channel?

  • The REST architectural style in general, and REST implementations using web standards (HTTP, AtomPub etc.) in particular, bring a lot of benefits in integration (what is easy for humans to understand is easier to implement).
  • Another reason for REST (over HTTP) is standardization over languages and platforms. Any language and platform I've used has an implementation that allows sending and receiving HTTP messages. We have a few components running on Linux and components running on Windows, and we're planning even more heterogeneity down the road.
  • Lastly, REST allows for easy debugging and run-time interaction. This proved invaluable during system integration test where we could easily understand the current state of each of the components in the system as well as the general picture.
Ok, if everything is so good, why not use REST for the whole system? Well, because like any architecture or architectural style (especially, when incarnated in a technology), REST has things that it does well and things that it doesn't (personally, I don't buy the Only Good Thing(tm) for anything or as Brooks puts it there's no silver bullet).
Let's look at message exchange patterns, for instance. REST over HTTP supports the request/reply pattern.
This works extremely well in many business situations. For instance, if we have an Order service (or resource for that matter) and we need to calculate the discount for a specific customer, we can go to the Customer service, get her current status and check if she is a VIP customer, senior citizen etc.
There are, however, places where it doesn't work as smoothly. Returning to our Order, let's consider what happens once the order is finalized and we need to both start handling it (notify the warehouse?) and invoice it.
The Order service does not care about these notifications; it isn't its business.
My favorite way to solve this is to introduce business events (incorporate Event Driven Architecture) so that the interested parties get notified. Another common way to solve this is to introduce some external entity to choreograph or orchestrate it (BPM etc.). Both options have different constraints and needs compared with REST. In my organization we have a lot of processes that lend themselves to event processing much better than they do to REST over HTTP (though the implementation might end up aligned with the REST architectural style - I am not sure yet).
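Here is a minimal, in-process sketch of the business events idea, just to show the shape of it: the order side publishes one fact and knows nothing about who listens. InProcessEventBus and OrderFinalized are hypothetical names for illustration; a real system would dispatch across service boundaries rather than call delegates in memory.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical business event: a fact that already happened.
public class OrderFinalized
{
    public OrderFinalized(string orderId) { OrderId = orderId; }
    public string OrderId { get; }
}

// Hypothetical in-memory bus; stands in for real event infrastructure.
public class InProcessEventBus
{
    private readonly Dictionary<Type, List<Action<object>>> _handlers =
        new Dictionary<Type, List<Action<object>>>();

    // Interested parties register themselves; the publisher is not involved.
    public void Subscribe<TEvent>(Action<TEvent> handler)
    {
        if (!_handlers.TryGetValue(typeof(TEvent), out var list))
        {
            list = new List<Action<object>>();
            _handlers[typeof(TEvent)] = list;
        }
        list.Add(e => handler((TEvent)e));
    }

    // Notifies every subscriber of this event type and returns how many there were.
    public int Publish<TEvent>(TEvent evt)
    {
        if (!_handlers.TryGetValue(typeof(TEvent), out var list))
        {
            return 0;
        }
        foreach (var handler in list)
        {
            handler(evt);
        }
        return list.Count;
    }
}
```

The warehouse and the invoicing service would each call Subscribe for OrderFinalized; the Order service just calls Publish and moves on, so adding a third interested party does not touch it.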

Another reason not to use REST is when you have to integrate with stuff that isn't RESTful, for instance we need to integrate with systems that use RTP and other such protocols so we are bound to that - and we are a startup with "green field" development. In an established enterprise the situation is much more complicated.

To sum up, in my opinion, when you take a holistic view of a complete business you are bound to see places where different architectural principles are a good fit. Architecture styles (and architectural patterns) are tools you can use to solve the challenges. There are places where a hammer is a great fit, but it is also wise to make sure the toolset has more than just a hammer.

PS

It isn't that you can't do events with REST over HTTP. E.g. you can implement the events as an Atom feed and have the "subscribers" check this feed every once in a while (the way this blog works). A subscriber can even check the HTTP headers before getting the whole feed. Still, push is a more natural implementation for this, for various reasons: you don't have to know where to find the event source, you can more easily improve latency (when needed) etc.
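One common way to avoid fetching the whole feed on every poll is an HTTP conditional GET: send the ETag from the previous response in If-None-Match and let the server answer 304 Not Modified when nothing changed. That is my assumption of what the header check would look like, not something specific to our system. The sketch below only builds the request and interprets the status code; FeedPolling is a hypothetical name and the actual sending (e.g. with HttpClient) is left out.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper for polling an event feed with a conditional GET.
public static class FeedPolling
{
    // lastETag is the quoted ETag from the previous response (e.g. "\"v1\""),
    // or null/empty on the first poll.
    public static HttpRequestMessage BuildRequest(Uri feedUri, string lastETag)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, feedUri);
        if (!string.IsNullOrEmpty(lastETag))
        {
            // Ask the server to send the feed only if it changed.
            request.Headers.IfNoneMatch.Add(EntityTagHeaderValue.Parse(lastETag));
        }
        return request;
    }

    // 200 means a new feed body arrived; 304 (Not Modified) means no new events.
    // Any other status is treated as "nothing to process" in this sketch.
    public static bool FeedChanged(HttpStatusCode status)
    {
        return status == HttpStatusCode.OK;
    }
}
```

Even with this, the subscriber still decides how often to ask, which is where push has the latency advantage.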

 
Tags: REST | SOA | Software Architecture

Retrospectives, every "agile" team does retrospectives. What are retrospectives anyway?

A retrospective is a meeting where the team looks back at and inspects the past, in order to adapt and improve the future.

Agile or not, our team does a retrospective at the end of each iteration (every two weeks in our case). We try to look at what worked, what didn't, how we are meeting our goals, how the product is going etc. These meetings provide a lot of value in steering us in the right direction.
Ongoing retrospectives that look at the near past allow for suppleness and adaptation to change, and they are very powerful at that. However, it is sometimes worthwhile to reflect over longer periods of time.

One area where longer perspective is important is the architecture of the project. Evolving an architecture you run the risk of accepting wrong decisions - mostly because architectural decisions have long term implications, while YAGNI, time constraints and life in general drive you toward short term gains.

Again, taking an example from my current project, working towards the first release, we took a few major decisions during the development e.g.
  • federated resource management - Taking into consideration the fallacies of distributed computing we decided that we'd have local resource managers that will take care of resource utilization and allocation. The resource managers will have a hierarchy where they'd communicate with each other to gain the "bigger picture"
  • Introduce Parallel Pipelines - handle image understanding by dividing the work between specialized components.
  • RESTful control channel - to use a "lingua franca" between all component types so that we can easily integrate across platforms and languages
  • local failure handling - resources and components handle failure by themselves
  • Communication technology (WCF in our case) is isolated from the business logic by an Edge Component
  • etc.
Once we finished delivering the first release, we took a few "days off" to consider what we'd done thus far. We updated our quality attribute list based on our experience working with the system and on some customer scenarios, studied the things we liked and didn't like in the design and architecture of the working system, and revised a few of our decisions. For instance:
  • We found that, rushing to a working system, we introduced some excess coupling to a specific technological solution (for video rendering). We initiated a few proofs of concept and found out how to both isolate the technology from the rest of the system and allow more technology choices.
  • We found that some of the data flows were not as clean as we thought they'd be - adding new features caused more resource interactions than we expected when we partitioned the resources. We redefined some of the resource roles to get less message clutter (and higher cohesion).
  • The federated resource management works well, but introduces needless latency in session initiation. We have now opted to introduce "Active Services", which are more autonomous.
  • Add a blogjecting Watchdog in addition to local failure handling, to both increase the chances of failure identification and recovery and get a better picture in a centralized Service Monitor.
  • The RESTful control channel worked well and will continue in later releases.
  • Some of the scale issues will be handled by introducing "Virtual Endpoints", while some will continue to use autonomous endpoint creation and liveliness dissemination (hopefully learning from the mistakes of others).
  • etc.
The result of these and the other decisions we've made is a rework plan that will (hopefully, anyway) make our overall solution better.
What we see is that we evolved our architecture as we went forward. While all the decisions we made seemed right at the time we took them, only through reviewing them from a wider perspective (an architecture retrospective) did we identify the decisions that we need to change and the ones that we have to enhance. The insights you gain after working on a project for a while are much better than the initial thoughts you have or the understanding you muster in the initial iterations.
I think it is essential to review the architecture once you've gained more experience with the realities of the system you write (vs. the perceived realities you have at the get-go).

By the way, if you work with a waterfall approach your situation is worse. In that case you take your decisions before you write any code, so you don't even have the benefit of POCs and working code to enhance your insights.


PS
If you have the MEAP version of SOA Patterns you can read more on the patterns I've mentioned here: Active Service in chapter 2, Blogjecting Watchdog in chapter 3, Service Monitor in chapter 4, Parallel Pipelines in chapter 3, Edge Component in chapter 2.


 
Tags: Agile | Project Management | REST | SOA | SOA Patterns | Software Architecture

DZone recently published an interview with me on my SOA Patterns book. Along with the interview you can also download chapter 2 of the book (I think you need to be a DZone member to actually download it).

Chapter 2 includes  the Edge Component , Service Host , Active Service , Transactional Service and the Workflodize patterns. Additional downloads related to the book include
Lastly, you can download the first version of chapter 1, which I mention in the interview, and the slides of a presentation on a few of the patterns from Dr. Dobb's Architecture and Design World last year.


 
Tags: SOA | SOA Patterns | Software Architecture

July 24, 2008
@ 09:49 AM
Every Thursday we have this "happy hour", you know beers, snack etc. Every other week or so we also try to make it educational and after socializing for a while hear a presentation  or a webcast.

I used this week's slot to present the REST architecture style. I think the presentation turned out pretty well so I thought I'd share it online (note it is a 6M ppt).


 
Tags: REST | SOA | Software Architecture

July 12, 2008
@ 10:30 PM
My friend Gunnar Peterson asked about my opinion on SOA and security concerns. Here's what I wrote him:

In a paper I wrote a couple of years ago I examined the relevancy of the “fallacies of distributed computing” defined by Peter Deutsch almost 20 years ago. Writing about the “Network is Secure” fallacy, I wrote that after all these years you would think that the fact you cannot assume the network is secure would be a no-brainer. Alas, it still happens all the time - and that's for "regular" distributed systems.

In my opinion, assuming the network is secure for an SOA is not only naïve but negligence, pure and simple. The whole premise of moving an organization to SOA is connectedness and integration. So, unless your SOA fails, it will be connected to other systems. Whether you are building RESTful systems, WS-* SOAs, EDAs or any combination of these architectural styles, if you don’t treat the service boundary as a border and secure it – you will be sorry…

Security in SOA should be considered at the "grand-scheme" level, with issues like authentication and authorization, but also at the single service level, looking at issues like DDoS, SQL injection, elevation of privilege and what not. A trivial thing like exposing a transaction beyond service boundaries can translate to an attacker denying services in your system simply by locking out your database. Again, this is just a simple example.

The other thing about security is that you have to consider it early. Patching security "later on" can have devastating effects on a system's capabilities, esp. in areas related to performance. I have seen even military systems that had to go through serious rework just because security was added as an afterthought instead of being handled early on.


 
Tags: SOA | Software Architecture

[It has been a little rough last week between a looming milestone @ work and my son fracturing his elbow @ home but hopefully I'll be back to the regular schedule this week]

Stateless services are da bomb, right? They are easy to scale (since they have no state you can deploy as many as you like), they are easy to reuse (no state - no baggage) and what not.
The only problem with that is that the state doesn't really go away. Stateless services just suffer from NIMBYism ("Not in my back yard") when it comes to state. A stateless service needs to be stateful when it performs its action, and since the state is not there, it has to get it from somewhere.

There are basically two approaches to getting the state into the stateless service.
The common way is to make the state someone else's problem (usually that would spell a database). With this approach the stateless service performs queries (database or otherwise) to get the state from the 3rd party. This is problematic in many ways, e.g.
  • You need to pay network tax for getting the state (remember the fallacies of distributed computing..)
  • If that someone else is a single source (such as a database) it can easily become a barrier to scalability (I wrote about the RDBMS problem in "the RDBMS is dead"). If it isn't a single source, you need to go to multiple sources, so the network problem multiplies
  • You need to pay network tax for putting the state back in the state repository
The other way to get the state is to put the state on the message - the "document" approach. This approach is superior to the previous one as you get to piggyback the data on the request. This is a good example of stateless communications*, which, as a side effect, can save the stateless service the problems mentioned above.
The "state on the message" approach works when the handling of messages is serialized, i.e. only one "station" in the flow can make changes to the state at any one time. Unfortunately this only works for a subset of the interactions you can have. In most cases multiple consumers need to get to the same data or coordinate.
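To illustrate the "document" approach, here is a small sketch where the message itself carries all the state and each stateless "station" takes a document in and hands a new one out. OrderDocument, Stations and the discount/shipping steps are hypothetical names invented for the example; they are not from any real system.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message that carries the state from station to station.
public class OrderDocument
{
    public OrderDocument(string orderId, decimal total, IReadOnlyList<string> history)
    {
        OrderId = orderId;
        Total = total;
        History = history;
    }

    public string OrderId { get; }
    public decimal Total { get; }
    public IReadOnlyList<string> History { get; }

    // Produces the next version of the document; the original is left untouched.
    public OrderDocument With(decimal total, string step)
    {
        var history = new List<string>(History) { step };
        return new OrderDocument(OrderId, total, history);
    }
}

// Hypothetical stateless stations: everything they need arrives on the message.
public static class Stations
{
    public static OrderDocument ApplyDiscount(OrderDocument doc, decimal percent)
    {
        return doc.With(doc.Total * (1 - percent / 100m), "discount");
    }

    public static OrderDocument AddShipping(OrderDocument doc, decimal fee)
    {
        return doc.With(doc.Total + fee, "shipping");
    }
}
```

Note that this only holds together because the stations run one after the other; two stations changing the same document in parallel would each produce a version that misses the other's change.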

You can also combine the two approaches and sometimes get good results.
Another way altogether is to look at stateful services, which I'll talk about in the next post.



* Many times people fail to make the distinction between stateless services and stateless communications - I'll expand on that in another post.


 
Tags: scalability | SOA | Software Architecture

Simon @ CodingTheArchitecture recently asked "How big is your software architecture document? (and who reads this stuff anyway?)"
He notes that in a UG meeting most of the attendees had SADs that were more than 50 pages long.
It would probably not be too surprising if I say that, in my opinion, the answer is that it depends. Reflecting back on some of my past projects, I had SADs that ranged from a 200+ page "write-only"* document to a lean document of less than 10 pages. The sizes match the intended usage of the documents. Take the two extremes mentioned: the first was a huge mission-critical project with a specific requirement from the customer to have an "official" SAD, and it was written to satisfy a project milestone (PDR). The second was an agile project where the architecture document was a working document, written some 10 iterations into the development to highlight some of the emergent guidelines.

What is common to all the SADs I wrote (or was responsible for) is that they all tried to grasp the essence of the design, all used multiple viewpoints to describe the solution, all were focused on quality attributes and all explained the rationale behind the decisions.
  • If you drone on endlessly with details, you can't see the forest for the trees
  • If you don't use multiple views, you are likely to miss important aspects of the solution
  • If you aren't focused on quality attributes, then you are most likely documenting design and not architecture
  • And if you don't explain the rationale, then the document doesn't have a lot of added value beyond the code itself
In any event, the important thing here is that when it comes to Software Architecture Documents "size doesn't matter" :). What matters is that the SAD satisfies the purpose it was written for.


*While this particular SAD was rather long, it also had a section that helped potential readers find the relevant chapters, so that it could actually be usable and not just serve as a "door stopper".

 
Tags: Software Architecture

Someone calling himself r r left the following comment on part IV of my series of posts on SOA definition:

"I keep trying to read this series on SOA unfortunately suffers from the same disease as the rest of literature on the subject. stays general to a comfortable level so it can't really be applied anywhere, tends to complicate things where is not clear if it's needed, and encourages philosophical debate on what ultimately is a business (and so concrete) requirement. Meanwhile the serious (IMO) issues stay untouched - how does one actually approach an integration project with functionality, performance and security in mind. Which should be the standards used (considering the tens of standards on WS out there). How granular should the WS be (I'm done with answers like "not too much, but enough", or "well, depends on your project"). "
Before I talk a little about the "serious issues" mentioned above, I want to point out that the point of this series of posts, as stated in the first post, is to take a formal / semi-academic look at SOA. I started these posts as a reaction to a comment that Pete Lacey left on my blog stating that my view of SOA (as published in "What is SOA anyway?") does not demonstrate that SOA is an architectural style. I don't pretend that this is some fully thought out academic dissertation or anything, but I do try to look at the architectural roots of SOA.

That said, let's take a look at the more interesting parts of this comment. First, the thing that bothers me about this reaction is (what seems to me to be) the quest for final and concrete recipes. For instance, consider the comment on service granularity:
"How granular should the WS be (I'm done with answers like "not too much, but enough", or "well, depends on your project"")
The problem is - it does depend! And if you forgive me for taking another philosophical detour, if you try to provide a hard definition for service granularity you get something like the heap paradox: when you remove individual grains from a heap of sand one at a time, is it still a heap when only one grain remains? So while it is obvious that hiding a complete system behind a single service is wrong and that exposing every little object as a service is wrong (even though for some inexplicable reason Juval Lowy seems to think that the latter is good practice), it isn't really obvious at what point you become too granular.

Nevertheless it is not pure guesswork either. You can use some guidelines and measure them against your specific project/system/enterprise needs. Personally, the set of guidelines I use is based on the fallacies of distributed computing:
  1.  The network is reliable
  2.  Latency is zero
  3.  Bandwidth is infinite
  4.  The network is secure
  5.  Topology doesn't change
  6.  There is one administrator
  7.  Transport cost is zero
  8.  The network is homogeneous
Since a service edge is a boundary which may be (and usually is) accessed remotely, you need to think about the incoming and outgoing interactions of the service in light of the fallacies stated above. If the proper behavior of the service depends on one of them being true, there's probably something wrong.
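To make the point about not depending on the fallacies a little more concrete, here is a minimal Python sketch of a consumer that does not assume "the network is reliable". Everything in it is an illustrative assumption rather than part of any real library: `call_with_retries`, `RemoteCallError` and the simulated `make_flaky_service` are hypothetical names, and a real service edge would also need to think about timeouts, idempotency and security.

```python
import time


class RemoteCallError(Exception):
    """Raised when the remote call still fails after all attempts."""


def call_with_retries(operation, attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call `operation`, retrying on network-style errors with exponential backoff."""
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError) as error:
            last_error = error
            # Back off before the next attempt, but not after the last one.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise RemoteCallError(f"gave up after {attempts} attempts") from last_error


def make_flaky_service(failures_before_success):
    """Hypothetical stand-in for a remote service that fails a few times first."""
    state = {"calls": 0}

    def service():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise ConnectionError("simulated network failure")
        return "ok"

    return service, state
```

A consumer written this way keeps working when the first couple of calls fail, and fails explicitly (instead of hanging or silently misbehaving) when the service stays unreachable.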

Regarding the other questions (how do you approach a real system), well, if you pardon me for banging my own drum, that's exactly why I started to write up my experience on these matters as patterns. For instance, if you look at the Saga pattern (one of the patterns I published online) you'd see that it talks about achieving distributed consensus in a transaction-like manner. I talk about the problems of using distributed transactions etc., offer an architectural solution (the saga) and then discuss relevant technology issues (e.g. WS-BusinessActivity) as well as its implications from a quality attributes perspective (integrity and reliability). Nevertheless, even these patterns aren't an end-all solution. Different circumstances require different solutions.
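As a rough illustration of the general saga idea mentioned above (and not a reproduction of the published pattern), the following Python sketch runs a sequence of steps, each paired with a compensating action, and undoes the completed steps in reverse order when a later step fails. The names `run_saga`, `SagaFailed` and the booking steps in `demo` are hypothetical and chosen only for the example; a real saga would coordinate separate services through messages rather than local function calls.

```python
class SagaFailed(Exception):
    """Raised after a failed step has been compensated for."""


def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, the compensations of all previously completed
    steps are invoked in reverse order and SagaFailed is raised.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception as error:
            for undo in reversed(completed):
                undo()
            raise SagaFailed(str(error)) from error
        completed.append(compensation)


def demo():
    """Hypothetical travel booking where the last step fails."""
    log = []

    def charge_card():
        raise RuntimeError("card declined")

    steps = [
        (lambda: log.append("reserve seat"), lambda: log.append("release seat")),
        (lambda: log.append("book hotel"), lambda: log.append("cancel hotel")),
        (charge_card, lambda: log.append("refund card")),
    ]
    try:
        run_saga(steps)
    except SagaFailed:
        log.append("saga failed")
    return log
```

Note that there is no lock held across the steps: consistency is reached by compensating, which is the trade-off that distinguishes this approach from a distributed transaction.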
Both my previous job and my current one involve building a scalable solution on top of algorithmic engines. In my previous job I managed the construction of a biometric solution that allows using multiple biometrics. In my current job I manage the development of a mobile visual search solution. On the surface both need to get some data, run a few algorithms and produce an answer, yet these systems have very different quality attributes. The first system had to handle very large databases and hundreds of queries, with an emphasis on modifiability and security; the current one needs to handle millions of queries with almost no database and low latencies, with an emphasis on usability. These differences result in radically different solutions, with different services, different interactions, use of different patterns etc. There's no "one right answer" (tm).


 
Tags: PaperLnx | SOA | SOA Patterns | Software Architecture

This post is part of a series of posts trying to define SOA as an architectural style. In the previous post I talked about SOA and the Layered architecture style (which generated a couple of follow-ups - one on layered architecture in general, one on its importance for SOA and one on layers in enterprise architecture vs. solution architecture).

The next architectural style SOA builds on is Pipes and Filters. Unlike Layers and Client/Server, which I described in previous installments, Pipes and Filters is not also a base style for REST. Basically, this style is where SOA and REST begin to diverge.
The pipes and filters architectural style defines two types of components - yep, you've guessed it, Pipes and Filters.
Filters - are independent processing steps. They are constrained to be autonomous of each other and not to share state, control thread etc.
Pipes - are interconnecting channels


Each filter exposes a relatively simple interface where it can receive messages on an inbound pipe, process them and produce messages on outbound pipes. The idea behind this is to allow easy composability, thus allowing greater usage (also known as "reuse" - I'll discuss the difference in another post). Systems are composed of several filters working together, filters can be replaced with newer versions (provided they keep the same interface) etc.
On the downside, the overall latency is increased, since to accomplish a task you have to move from filter to filter.
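The composability described above can be sketched in a few lines of Python. This is only an in-process illustration under simplifying assumptions: generators stand in for the pipes, the filter names (`parse`, `keep_even`, `square`) are made up for the example, and in an SOA the pipes would be message channels between separately deployed services.

```python
def parse(lines):
    """Filter: turn raw text lines into integers, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield int(line)


def keep_even(numbers):
    """Filter: pass through only the even numbers."""
    for n in numbers:
        if n % 2 == 0:
            yield n


def square(numbers):
    """Filter: square every number it receives."""
    for n in numbers:
        yield n * n


def pipeline(source, *filters):
    """The wiring: connect the filters in order.

    Each filter only sees its inbound stream and shares no state with
    the others, so filters can be reordered or replaced independently.
    """
    stream = source
    for f in filters:
        stream = f(stream)
    return stream
```

For example, `list(pipeline(["1", "2", " ", "4"], parse, keep_even, square))` evaluates to `[4, 16]`, and dropping `keep_even` from the call changes the behavior without touching any filter. The wiring lives in `pipeline`, outside of the filters themselves.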

The pipes and filters style brings to SOA things like the autonomy of services, the sense of explicit boundaries. For instance, this is the basis for why you wouldn't want to do distributed transactions across service boundaries, which I blogged about several times before.

The pipes part of "pipes and filters" also means that the wiring can be taken care of outside of the services themselves and that you can control it externally. This works well with the use of middleware (service bus). Additionally, Fielding (you know, the REST guy) also mentions that
"One aspect of PF styles that is rarely mentioned is that there is an implied "invisible hand" that arranges the configuration of filters in order to establish the overall application. A network of filters is typically arranged just prior to each activation, allowing the application to specify the configuration of filter components based on the task at hand and the nature of the data streams (configurability). This controller function is considered a separate operational phase of the system, and hence a separate architecture, even though one cannot exist without the other."
Which is the harbinger of the orchestration/choreography aspects of SOA.

So as you see, pipes and filters is one of the important pillars of SOA. In the next part (unless I have to clarify things about this post) I'll talk about the last architectural style SOA builds upon, "Distributed Agents".


 
Tags: SOA | SOA Patterns | Software Architecture

March 6, 2008
@ 09:14 PM
Jack Van Hoof has a different view than I have on the difference between Tiers and Layers. I am not sure I agree with his view, but it still provides an interesting read. I think the main difference between our respective views is that Jack takes a look from the enterprise-architecture angle, which gives him layers like
  • Technical infrastructure - OS, directory Services etc.
  • Application infrastructure - Apps, Portals, DBMSs
  • Application Landscape - SOA, EDA
  • Business Processes - BPM
Jack uses the term tier for layers within the same level of abstraction. For instance, he gives the following examples:
"E.g. the layer of business services may be arranged in the tiers: front-office, mid-office and back-office. At the next lower layer, the application layer, services may be arranged in the tiers: UI, business logic and data persistency. The interaction of services between two tiers may be bidirectional (but may also be constrained to unidirectional). "
The perspective I have (or at least try to maintain in this blog) is that of solution/product line architecture - basically living within Jack's application layers. So in my view I want to differentiate between having a UI and business logic live on the same machine and having them distributed around the world. So I guess in the end both perspectives have their place, and the problem, like many other times, is overloaded terms.


 
Tags: SOA | Software Architecture

Great news. Two of my friends and fellow DDJ bloggers, Eric Bruno and Udi Dahan have agreed to join my (now ours) SOA Patterns book which will be published by Manning.

Both Udi and Eric are competent and experienced architects who have experience designing SOAs. On the technology side, Udi (“The Software Simplist”) specializes in .NET development, e.g. his NServiceBus framework, which is a very good example of an endpoint-ware ServiceBus (vs. the middleware ServiceBuses that most ESBs are). Eric, on the other hand, is a Java and C++ expert. Eric is the author of Java Messaging (one of the best books on JMS and web services) and also has a lot of experience in financial systems. Together, the three of us bring a lot of real-life experience of building large and complicated systems into this project.

The current game plan is for Eric to focus on the SOA pitfalls (“anti-patterns”) part of the book, Udi to provide a “putting-it-all-together” chapter , and for me to cover what’s left. I am sure however, that their experience and insight will also help make the other parts of the book (even? ) better.

If you are not familiar with the book, you may want to take a look at the first chapter and/or some of the published patterns like Saga, Service Firewall, Gridable Service, Edge Component and (a very early draft of) the Aggregated Reporting pattern. You can also take a look at the slides from my "SOA Patterns" presentation at Dr. Dobb's Architecture & Design World last year, which illustrate some additional patterns.


 
Tags: .NET | Java | new | SOA Patterns | Software Architecture

February 23, 2008
@ 08:38 PM
In the previous post I mentioned a couple of questions on SOA and layers Udi left on an older post I made:

1. How does this [layers - ARGO] play with two services talking with each other? One pubs to the other's subs?The other requests to the first's response?
2. How valuable is the layered abstraction?


1. As I explained in the previous post, layers do not necessarily mean a unidirectional relation from a top layer to a lower one - they do mean that a layer can only know a layer that is directly above or below it. In other words, the bidirectional interactions between two services (i.e. the requests, reactions, events etc. flowing between them) do not violate the layered style constraints.

2. So, how valuable is the layered abstraction to SOA? The short answer - very :). Again, as I mentioned in the previous post, the main reason layers don't seem that valuable is because they've been misrepresented and misused. Layers bring added flexibility to SOA. The fact that a service or any other SOA component cannot see beyond the next layer enables things like the ServiceBus, Edge Component, Service Firewall etc. Without layers it would be harder to have autonomous services, as other services could (potentially) have access to the innards of the service, adding more coupling and preventing independence.



 
Tags: SOA | Software Architecture

February 5, 2008
@ 11:29 PM
Following the third part of my "Defining SOA" posts Udi Dahan left the following two questions:
How does this [layers - ARGO] play with two services talking with each other? One pubs
to the other's subs?The other requests to the first's response?
How valuable is the layered abstraction?

Considering that Udi and I usually see eye to eye, I guess that if I managed to get him confused, more clarification is warranted :) I'll do that in two posts: this one, which will explain the concept of layers, and the next, which will explain why it is paramount to SOA (and answer the two questions).

Usually when I review an architecture one of the first (sometimes the only) architectural artifact I am shown is a "layered diagram" of the architecture e.g. something like the following:


This sort of half layers/half block diagram, with or without the common 3 layers (which also appear in the diagram above) of "UI", "Business" and "Data", gives the whole idea of layers a bad name.

The key differentiator between layers and just a bunch of blocks is the limitations on the allowed communication paths between the components (layers vs. blocks). In the previous post I quoted an old (2005) definition I had for layers where I said the following
"Typically a layered is allowed to call only the layer below it and be called only by the layer above it (but there are variants e.g. a layer can call to any layer below it; vertical layers that can call multiple layers; etc. -- all is fine as long as the layers communication paths are limited by some rules)."
Alas, I was too quick on the copy/paste and everything in the brackets (bold) is wrong - it should actually say "but there's another variant where a layer is allowed to call the layer above it or below it". The other variants (like a layer that can call any layer below it) just muddy the water, make it hard to distinguish between layers and regular components and thus make layers seem unimportant. Consider the following diagram:


So, in the above diagram the relations are that Component D and Component C know each other, and Component D is made of two layers (A and B). Note that a more proper representation should also explain the relations allowed between the layers, i.e. whether they are unidirectional or bidirectional (unless there's a common convention in the project).

Why is the distinction between layers and other types of components important? Because layering gives you some benefits which "just components" don't:
1. Layers allow information hiding, since we don't know the inner workings of what's beyond the layer.
2. Layers allow separation - things beyond the immediate layer are hidden from each other. This means that things which are beyond the layers are loosely coupled in a way that allows for flexibility and the addition of capabilities, for instance adding a firewall between your computer and the internet.
3. Layers allow changing the abstraction level - since layers are hierarchical in nature, moving through the layer "stack" you can increase or decrease the level of abstraction you use. This allows expressing complex ideas with simple building blocks. The best known example for this is the TCP/IP stack, moving from an abstraction level close to the hardware at the network interface layer to application level protocols such as HTTP.


On the downsides - layers hurt performance by adding latency. Also too many layers introduce added complexity to the overall solution (e.g. it is harder to debug).
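The communication rule that separates layers from "just a bunch of blocks" can be expressed as a small check. The sketch below is a hypothetical helper (the name `layering_violations` and its parameters are assumptions made for this example, not an existing tool): given the layers ordered from top to bottom and a list of caller/callee pairs, it reports every dependency that skips a layer, with an optional flag for the bidirectional variant where a layer may also call the layer directly above it.

```python
def layering_violations(layers, dependencies, allow_upward=False):
    """Return the dependencies that break the adjacent-layer rule.

    layers: layer names ordered from top to bottom.
    dependencies: iterable of (caller, callee) pairs.
    allow_upward: also permit calls to the layer directly above.
    """
    position = {name: index for index, name in enumerate(layers)}
    violations = []
    for caller, callee in dependencies:
        # +1 means "directly below", -1 means "directly above".
        step = position[callee] - position[caller]
        allowed = step == 1 or (allow_upward and step == -1)
        if not allowed:
            violations.append((caller, callee))
    return violations
```

With the common "UI", "Business", "Data" stack, a direct ("UI", "Data") dependency is reported as a violation, while ("UI", "Business") and ("Business", "Data") pass. A diagram whose arrows produce many violations under any such rule is really a block diagram rather than a layered one.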

It is interesting to note that interfaces are in fact leaky layer abstractions (vs. for example SOA contracts, which are not leaky), since when you use an interface you still need to instantiate the object which is otherwise hidden behind the "layer" (interface). This is basically the reason we want something like dependency injection (DI) - to help make the abstraction complete - and why in languages like Ruby, where the contract abstraction is complete, you don't need DI (I discussed Ruby and DI in another post).

Another issue which I mentioned here is the difference between logical layers and physical (or potentially physical) layers. I usually call the first kind layers and the latter tiers. Logical layers are local and can assume a lot about their neighboring layers. Tiers, or physical layers, can be distributed, which carries a lot of implications (something I discussed here recently in relation to MS Volta).





 
Tags: Design | SOA | Software Architecture

January 26, 2008
@ 11:31 PM
David J. DeWitt and Michael Stonebraker are at it again. There was a lot of buzz on the internet after their previous post (here is what I had to say about it).
Their first point on the new post tries to counter the claim that MapReduce is not a database so it shouldn't be judged as one. They claim that it isn't a matter of apples and oranges but rather
 " We are judging two approaches to analyzing massive amounts of information, even for less structured information."

The problem with that is they continue from there to define a problem in database terms and then show how MapReduce will not be as good as a database in solving it - well, duh.
The fact that isolated queries may run better in a pre-indexed database should come as no great surprise. As I noted in the previous post on the subject - MapReduce can be used to create the appropriate index or partition the data into smaller chunks that would be easier to use to answer the type of queries David and Michael mention.
As Mark Chu-Carroll explains, Map/Reduce and databases don't solve the same kind of problems.

Also, what happens when the database is constantly updated?! I don't mind how scientifically accurate the measurements are that say databases scale like nothing else. I am more comfortable with the empirical experience of companies like Amazon, Digg, Google and eBay, who found they had to shard their data to support their scalability needs and not use distributed transactions/distributed databases.


 
Tags: data | scalability | Software Architecture | Trends

This post is part of a series of posts trying to define SOA as an architectural style. In the previous post I talked about how SOA builds on the Client/Server architectural style. In this post I'll talk about how SOA builds on the architectural style of Layered System.

Layered System or Layered architectural style is one of the most basic and widely used architectural styles. Here is a definition of Layered architecture I posted in the past
The layered style is composed of layers (the components) which provide facilities and have specific roles. The layers have communication paths / dependencies (the connectors).

In a layered style a layer has some limitations on how it can communicate with other layers (the constraints). Typically a layer is allowed to call only the layer below it and be called only by the layer above it (but there are variants e.g. a layer can call to any layer below it; etc. - all is fine as long as the layers' communication paths are limited and restricted by some rules).
SOA takes the strict layers definition and restricts the knowledge of one service only to the service interface/contract of the other services. This means that services cannot be aware of, or care about, the internal structure of other services. This helps with introducing the "boundaries are explicit" tenet (although it builds on more than just layering).

The layered nature of SOA means you can also add additional layers between the services. One very common example is adding a service bus (e.g. using an ESB or tools like NServiceBus); other examples include load balancers, firewalls (see the Service Firewall pattern) etc. Naturally, when you add intermediary layers, services don't talk to each other directly but rather accept services (such as routing, message persistence etc.) from the intermediary layer.

It should be noted that in the context of SOA the layers are, in most cases, actually tiers. The difference is that tiers provide (potential) physical separation whereas layers provide logical separation. When a layer is actually a tier it has extensive implications for the level of trust between the tiers (see my post "Tier is a natural boundary" for more details).

The next post in the series will talk about the "Pipe and Filters" style  and SOA. This is the first place where the REST architectural style and SOA diverge.


 
Tags: REST | SOA | SOA Patterns | Software Architecture

David DeWitt and Michael Stonebraker write about MapReduce in "The Database Column". Now I usually like what Michael Stonebraker writes (e.g. his piece on the RDBMS demise which I also wrote about myself). However I can't say that this time around.
David and Michael write that MapReduce is a big step backwards. Before I talk about what they write, here is a (very high level) reminder of what Map/Reduce is.
MapReduce, as Google's Jeffrey Dean and Sanjay Ghemawat explain, is a way to get automatic parallelization and distribution along with fault tolerance, monitoring and I/O scheduling for tasks that need to work on complete datasets. MapReduce uses two functions:
  • Map - multiple instances of which run in parallel, each processing a key/value pair and producing a set of grouping key(s) and intermediate values.
  • Reduce - which runs per grouping key and merges the intermediate values into a set of merged outputs (usually one).
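The two functions can be illustrated with a tiny single-process sketch in Python. This is only meant to show the shape of the programming model, assuming a simple word count as the task: `map_reduce`, `word_count_map` and `word_count_reduce` are names invented for the example, and the parallelization, distribution and fault tolerance that make the real thing interesting are deliberately left out.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Run map_fn over every (key, value) record, group the intermediate
    values by grouping key, then run reduce_fn once per grouping key."""
    groups = defaultdict(list)
    for key, value in records:
        for group_key, intermediate in map_fn(key, value):
            groups[group_key].append(intermediate)
    return {
        group_key: reduce_fn(group_key, values)
        for group_key, values in groups.items()
    }


def word_count_map(doc_id, text):
    """Map: emit (word, 1) for every word in the document."""
    for word in text.lower().split():
        yield word, 1


def word_count_reduce(word, counts):
    """Reduce: merge the intermediate counts for one word."""
    return sum(counts)
```

Calling `map_reduce([("d1", "a b a"), ("d2", "b c")], word_count_map, word_count_reduce)` gives `{"a": 2, "b": 2, "c": 1}`. Because each map call looks at a single record and each reduce call at a single grouping key, the work can be spread over many machines, which is the part a real MapReduce framework takes care of.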
David and Michael claim that MapReduce is
1. a step backwards because it doesn't build on Schema
2. a poor implementation because it doesn't use indexes
3. not new
4. missing features - like bulk load, indexing, updates, transactions, integrity constraints, referential integrity, views
5. incompatible with DBMS tools - like report writers, BI tools, replication tools, design tools

Well, if anything, it seems that David and Michael don't really understand what MapReduce is. As I noted above, MapReduce is a way to go over complete sets in an efficient distributed manner. In fact it can even be used to build the index of a traditional RDBMS. It isn't really competing with databases, relational or otherwise. Yep, comparing MapReduce and databases is the apples and oranges thing...

I guess they might have meant to talk about another Google tool called BigTable - which is at least sort of a column database (Michael's company also makes a column database) for storing structured data in a highly distributed, high performance way. However David and Michael would still be wrong, as BigTable is proprietary and targeted at a specific purpose, so it isn't supposed to solve the same problems as a general purpose database - not to mention that it is highly scalable (ever heard of Google's search engine ;) ) and does support things like indexes, updates etc.

Also, as I mentioned in the "RDBMS is dead" post, the internet proved that RDBMS features (like transactions etc.) can only scale so much. While databases focus on the Consistency and Availability parts of the CAP conjecture and the ACID tenets, internet scale systems pick Partition tolerance and Availability and the BASE tenets instead.


 
Tags: data | scalability | Software Architecture

Sam Gentile and I exchanged a few blog posts on the definition of SOA. In the latest installment Sam disagrees with me that SOA should first be looked at in the pure architectural sense without bundling in the business and enterprise aspects.
In a nutshell, I have two main reasons to prefer looking at SOA at the core as a pure architectural style.
The first is that when you bundle in enterprise-wide aspects of implementing SOA you lose out on the option (or the audience) of using it to solve more local problems (i.e. at the product/solution level) using the same principles that bring the benefits at the enterprise scale.
The other reason I have for separating the concepts is that the business-encompassing definitions tend to be fluid, hand-waving ones and cannot be measured for compliance.
Consider the definitions Sam quotes from  Thomas Erl's books:
"SOA establishes an architectural model that aims to embrace the efficiency, agility, and productivity of an enterprise by positioning services as the primary means through which solution logic is represented in support of the realization of strategic goals associated with service-oriented computing." (emphasis by Sam)"

SOA represents a model in which functionality is decomposed into small, distinct units (services), which can be distributed over a network and can be combined together and reused to create business applications. [3]
Now what the hell is that? These are all noble goals but shouldn't this be the goal of any enterprise architecture ? What makes SOA unique in this sense?
Also, how do these definitions help us build services? What makes a service a service? Why is (or isn't) any web-enabled component a service?
Definitions that distance themselves from the architectural roots seem to me like smoke and mirrors and contribute to the general confusion around SOA - to the point where even people like Harry Pierson wonder why we should even bother defining it.

Personally, I still think it is worthwhile defining *** (the architectural style formerly known as SOA) since, as I mentioned earlier, it is (in my opinion) a useful architectural style for building distributed systems - whether the distributed system is a solution, a product, a product line or a complete enterprise.





 
Tags: SOA | SOA Patterns | Software Architecture

December 29, 2007
@ 10:49 PM

Sam Gentile comments about my attempts to define SOA (Part I, Part II, more to come..) and says that

"That's all well and true, but any definition of SOA must encompass the business drivers and business reasons, as SOA is not really about technology. It is about a better alignment of business and IT through business processes and services. The goal is to create a dynamic, more Agile and Dynamic IT that can respond quickly to new business opportunities and threats by quickly assembling new capabilities from putting together composite applications (and even Mash-ups) from reusable business services..."

I am sorry Sam, but I beg to differ, not about the importance of the business drive behind implementing SOA, but about what SOA is. The culprit, in my opinion, is terminology overloading.

SOA, as I said in the above mentioned post and numerous other times, is first and foremost an architectural style - as an architectural style it offers several architectural benefits and poses several architectural constraints. This has nothing to do with business drivers. It has to do with defining components, relations, attributes on relations and components as well as constraints. Now you can take that set of rules and use (or misuse) it as you like, in the context of a subsystem, a single project, a product line or an enterprise - this is your choice.

Applying SOA, on the other hand, has everything to do with the business . I'll take Sam's post word for word but instead of using the word SOA, I would prefer using the term SOA initiative. An SOA initiative is the effort of applying SOA in a wide context for an enterprise, aiming to increase the alignment of IT and the business etc. I would have to say though,  that in my experience, such an effort would rarely use SOA alone. It would also include other distributed architectural styles that also help with decoupling and loose coupling like EDA and REST to name a couple.


By the way, SOA has nothing to do with technology either. You can implement SOA using WS-*, AtomPub, MSMQ or CORBA, just as you can implement REST with quite a few technologies. It so happens that WS-* is a common implementation technology for SOA and that HTTP is a common implementation technology for REST, but both styles live independently of the technologies.


 
Tags: SOA | Software Architecture | Trends

In the previous post  on defining SOA I claimed that SOA is an architectural style building on 4 other architectural styles. The first one of these is Client/Server.
Describing client/server is easy - not because I am such a genius (far from it) but because it has already been done numerous times before. Let's take a look at the definition from Roy Fielding in his famous dissertation (the link is to chapter 3; REST is defined in chapter 5 if you are interested):

The client-server style is the most frequently encountered of the architectural styles for network-based applications. A server component, offering a set of services, listens for requests upon those services. A client component, desiring that a service be performed, sends a request to the server via a connector. The server either rejects or performs the request and sends a response back to the client. A variety of client-server systems are surveyed by Sinha [123] and Umar [131].

Andrews [6] describes client-server components as follows: A client is a triggering process; a server is a reactive process. Clients make requests that trigger reactions from servers. Thus, a client initiates activity at times of its choosing; it often then delays until its request has been serviced. On the other hand, a server waits for requests to be made and then reacts to them. A server is usually a non-terminating process and often provides service to more than one client.

Separation of concerns is the principle behind the client-server constraints. A proper separation of functionality should simplify the server component in order to improve scalability. This simplification usually takes the form of moving all of the user interface functionality into the client component. The separation also allows the two types of components to evolve independently, provided that the interface doesn't change.

The basic form of client-server does not constrain how application state is partitioned between client and server components. It is often referred to by the mechanisms used for the connector implementation, such as remote procedure call [23] or message-oriented middleware [131].

SOA takes from the Client/Server style the two roles, i.e. in each interaction one party is the client (what I call the service consumer) and the other is the server (the service) which handles the request coming from the client*. Unlike traditional client/server, the roles are held only for a particular set of interactions - a given interface that the service exposes. In another set of interactions the roles can be reversed, and a component that once was a server can now act as a client, even working with the very same component that was previously its client.

Like REST, SOA takes the constraint of separation of concerns, which allows the service and its service consumers to evolve independently (as long as the interface is kept).
In order to support this, a service should take care of all its internal state without exposing that state or its internal structures outside of the service. This also allows the service to scale behind the interface, but for that we also need constraints and capabilities from the next architectural style, layered system, which I'll discuss in the next installment on this subject.


* You can compose SOA with other architectural styles to get different behaviors. E.g. compose SOA and EDA and you can have the service also push data. This isn't, however, something SOA manifests in its basic form.


 
Tags: REST | SOA | SOA Patterns | Software Architecture

December 20, 2007
@ 12:24 AM
Wes Dyer, one of the principal people behind the Volta tier-splitting, was kind enough to leave a comment on my previous post. Here is one quote from that comment:

"I do want to clear up a few things about Volta that we apparently
didn't make clear enough. We do not believe that you can develop an
application as if it will run on a single tier and then just sprinkle a
few custom attributes here and there and be done with it. More than
anything else, programmers need brains. Volta does not claim that
programmer brains can be checked at the door. When the programmer wants
to divide the application across a particular boundary then things like
network latency, new failure modes, concurrency, etc. need to be
considered at that boundary. What Volta does is make expressing the
transition between boundaries easier. It reduces the accidental
complexity of writing all of the boilerplate code to express the
programmer's intention. This allows the programmer to focus on the
essential complexity of his problem domain -- figuring out how to write
effective code for that particular tier boundary."

For one, it is good to hear that the architects behind Volta have a deeper understanding of distributed computing challenges - even if the first version doesn't seem to show it. I haven't used MS Volta enough to say whether the problem is indeed not with its inherent capabilities and design (let's just take Wes's word for that). I am also not against saving the boilerplate code, though I would personally favor libraries over code generation and try to keep the "generation" gap to a minimum (i.e. the amount of generated code, or the distance between the abstraction and the next concrete level).
Lastly, I am also in favor of trusting that developers have brains and that it is ok to provide developers with "sharp tools". So if all is good, where's the problem?

The problem is that you have to make it "easy to do the right thing" and provide the means to do the more complicated, less safe things. When I teach my young kids (and I can objectively say they are very smart :) ) to use a knife, I don't hand them the razor-sharp butcher knife first. They start with the plastic ones. When they've mastered that they can try something more dangerous. When you allow distributing something at the flick of an attribute and put a marketing blurb on the site that makes it compelling to use, you give the wrong impression to the less experienced folks.

In one project whose architecture I reviewed, the (very talented) architect/developer designed his own distributed transactions system (he shouldn't have been doing that in the first place - but that's for another post). When designing this he built in a lot of ways to control the transaction behavior, including the option to allow transaction participants to prevent rollback without failing the transaction. Circumventing the transaction was as easy as making it work properly. Are there edge cases where you may need to have one participant violate the ACIDness of the transaction? I guess so - but that is not the general rule. Most of the time when you commit a transaction you expect it to be ACID. If for some reason it didn't behave that way, you want to know about it, even if it didn't actually fail. When you don't make it easy to do the right thing you get unexpected behaviors, you get hard-to-explain bugs, you get slow performance etc.
Developers using tools, smart as they may be, don't usually go and read all the source code of the tool/framework they are using (maybe they should). If two options are just as easy to use, it seems safe to assume they are equally right. Things which are unsafe should be clearly marked as such to prevent misuse by inexperienced users. This is especially true for tools that are targeted for common use and meant to ease the life of inexperienced developers.
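Here is one way the "easy to do the right thing, unsafe things clearly marked" idea could look in an API. This is only a sketch: `Transaction`, `Participant` and the method names are invented for illustration and are not the design of the system I reviewed.

```python
class TransactionFailed(Exception):
    pass


class Participant:
    """Hypothetical transaction participant with a trivial prepare/commit protocol."""

    def __init__(self, name, will_fail=False):
        self.name = name
        self.will_fail = will_fail
        self.state = "idle"

    def prepare(self):
        return not self.will_fail

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


class Transaction:
    def __init__(self, participants):
        self.participants = list(participants)

    def commit(self):
        """The easy, default path: all-or-nothing, and failures are loud."""
        votes = [p.prepare() for p in self.participants]
        if all(votes):
            for p in self.participants:
                p.commit()
            return
        for p in self.participants:
            p.rollback()
        raise TransactionFailed("a participant could not prepare; everything was rolled back")

    def unsafe_commit_ignoring_failures(self, reason):
        """The sharp tool: loudly named, demands a reason, reports what was skipped."""
        if not reason:
            raise ValueError("a reason is required to bypass atomicity")
        skipped = []
        for p in self.participants:
            if p.prepare():
                p.commit()
            else:
                p.rollback()
                skipped.append(p.name)
        return skipped
```

The two paths are deliberately not equally easy: the unsafe one has an alarming name, needs a justification, and returns the list of participants that did not commit, so the caller still gets to know about it.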


 
Tags: .NET | Design | Software Architecture

December 11, 2007
@ 11:27 PM
I got a couple of emails with questions regarding my previous post on Volta. So here's another go at explaining why dynamic-tiering is not a good move - this time in technicolor.

Let's start with a simple illustration. The diagram below represents a typical local component (A) in its environment. As a component that works locally, it has access to other local components which it interacts with. These can be objects it created by itself or objects that were injected into it. The likely design for local components is to have a chatty interaction - after all, objects can talk to instances of other objects quite easily.



Now enters Volta (or any other such framework - and I've seen a few. I am ashamed to say that I even wrote one about 15 years ago) and says we'll just mark things we want to execute on a different server and everything will be fine. What you get is something like the illustration below:



We have the same number of interactions - only now all the interactions between A and its (used to be) near environment require serialization, network interaction, possibly encryption, authentication, authorization and what not. You can imagine that this type of interaction can take a heavy toll on performance and scalability if it wasn't designed for up front.
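A small Python sketch can show the effect by counting boundary crossings. `RemoteBoundary` is a made-up stand-in that only counts calls; it does no real networking, and `Customer` is a hypothetical neighbor of component A.

```python
class RemoteBoundary:
    """Hypothetical stand-in for a tier boundary: every call that crosses it
    would pay for serialization, a network hop, security checks and so on."""

    def __init__(self):
        self.round_trips = 0

    def call(self, method, *args):
        self.round_trips += 1
        return method(*args)


class Customer:
    """A neighbor of component A, designed for chatty local use."""

    def __init__(self, name, street, city):
        self._name, self._street, self._city = name, street, city

    def get_name(self):
        return self._name

    def get_street(self):
        return self._street

    def get_city(self):
        return self._city

    def get_mailing_label(self):
        # Coarse-grained alternative: one call returns everything needed.
        return f"{self._name}, {self._street}, {self._city}"


customer = Customer("Ada", "1 Main St", "Springfield")

# The chatty design, unchanged, after the neighbor was moved to another tier.
chatty = RemoteBoundary()
label_chatty = ", ".join([
    chatty.call(customer.get_name),
    chatty.call(customer.get_street),
    chatty.call(customer.get_city),
])

# An interface that was designed with the boundary in mind.
coarse = RemoteBoundary()
label_coarse = coarse.call(customer.get_mailing_label)
```

Both variants produce the same label, but the chatty one crosses the boundary three times and the coarse-grained one once. The difference only grows as the object model gets richer.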

This is a bit of hand-waving so let me also give you an example from a real project. About 3 years ago I was invited to consult on a project. This was the kind of project that interacts with real things like sensors etc. I'll use an automated irrigation system to illustrate its architectural components. One type of component is "Things"; these represent real devices you can interact with, like sprinklers, soil sensors etc. Things represent the logical state of the real devices and cannot talk to each other. When two Things need to interact - e.g. we want to turn on the sprinkler if the soil is dry - we introduce another architectural component, which we'll call "Interaction", that looks at the state of the Things and can then act upon it. The last major type is "Services" (not services in the SOA sense), e.g. we can have a Service that reads the weather. Services can't interact with Things directly, but they can interact among themselves and they can interact with "Interactions". This particular system had dozens of Things and hundreds of "Interactions" and "Services". And the tiers/process boundaries were as follows:


Interactions have to know about changes both in Things and Services, so messages keep flying around this system to keep the Interactions in sync as well as to propagate decisions made by Interactions. The outcome of this "smart" design is that every status change in a Thing results in an order of magnitude more messages to react to the change in status. I was brought in to find a way to get an in-order, reliable message flow that is fast enough between the different tiers. I did my best and left. What they didn't want to listen to is that the better solution is to give a lot of thought to related Things, Interactions and Services and bundle them together into "tierable" components. The interactions within these "chunks" would be local and would then inflict a whole lot fewer messages on the system. In our example it makes sense to bundle the four components (sprinkler etc.) into a single tier, and possibly the same process, and increase the overall performance significantly while also giving us more cohesive boundaries.
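The difference between the two partitionings can be sketched by counting the interactions that cross a tier boundary. Everything here is assumed for illustration: the component names, the three interactions, and in particular the "one tier per component type" placement, which is a guess at the kind of split described above, not the project's actual deployment.

```python
def cross_tier_messages(placement, interactions):
    """Count the interactions whose two endpoints live on different tiers."""
    return sum(1 for a, b in interactions if placement[a] != placement[b])


# Who talks to whom in the (hypothetical) irrigation example.
interactions = [
    ("soil_sensor", "watering_rule"),      # Thing -> Interaction
    ("weather_service", "watering_rule"),  # Service -> Interaction
    ("watering_rule", "sprinkler"),        # Interaction -> Thing
]

# Assumed "bad" partitioning: one tier per component type.
by_type = {
    "soil_sensor": "things-tier",
    "sprinkler": "things-tier",
    "watering_rule": "interactions-tier",
    "weather_service": "services-tier",
}

# The bundled, "tierable" alternative: related components live together.
by_cohesion = {name: "irrigation-zone-1" for name in by_type}

remote_before = cross_tier_messages(by_type, interactions)
remote_after = cross_tier_messages(by_cohesion, interactions)
```

With the type-based split every one of the three interactions becomes a remote message; with the bundled placement none of them leaves the tier.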




(as a side note I'll just mention that I ran into someone who is part of this particular project a few days ago - They are still struggling with performance and stability problems...)

Anyway, one could argue that frameworks like Volta would allow you to move from the bad partitioning to the good one more easily - but this is not really so, since when you rearrange the components you also have to remodel the messages that flow between the new partitions.

This is not to say that having the ability to run a system in local and in distributed modes does not have value - as I said in the previous post, it is the assumption that you can easily move this boundary and still get a viable solution that is wrong. Also, if you are going to allow running in local and distributed modes, that doesn't have to spell the "dark magic" of MSIL rewrites and compilations.
In another (SOA) project we designed services so that in a small-scale installation you would be able to instantiate services in the same process. Services were constructed as Active Services (i.e. they have at least one thread of control). If you wanted to let two services run in the same process you just had to write a new ServiceHost and a new ServiceBus. The new ServiceHost has to provide each service its own thread or thread pool, and the ServiceBus has to work in memory by passing message objects around rather than serializing/deserializing and sending them over the network. On a small installation this works better than multiple processes (but not as well as a system designed specifically to run on a single tier). Note that this is the opposite of what Volta does, as it takes a distributed solution and allows it to run locally rather than the other way around.
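The original was .NET code I can't reproduce here, so the following is only a Python sketch of the idea, with made-up class names (`InMemoryServiceBus`, `InProcessServiceHost`). The host gives each service its own thread, and the bus hands over the message objects themselves instead of serializing them.

```python
import queue
import threading


class InMemoryServiceBus:
    """Passes message objects between services in one process; no serialization."""

    def __init__(self):
        self._inboxes = {}

    def register(self, service_name):
        inbox = queue.Queue()
        self._inboxes[service_name] = inbox
        return inbox

    def send(self, service_name, message):
        self._inboxes[service_name].put(message)


class InProcessServiceHost:
    """Gives every hosted service its own thread of control."""

    _STOP = object()  # sentinel that tells a service thread to exit

    def __init__(self, bus):
        self._bus = bus
        self._hosted = []  # (thread, inbox) pairs

    def host(self, service_name, handler):
        inbox = self._bus.register(service_name)

        def run():
            while True:
                message = inbox.get()
                if message is self._STOP:
                    break
                handler(message)

        thread = threading.Thread(target=run, name=service_name)
        thread.start()
        self._hosted.append((thread, inbox))

    def shutdown(self):
        for _, inbox in self._hosted:
            inbox.put(self._STOP)
        for thread, _ in self._hosted:
            thread.join()


bus = InMemoryServiceBus()
host = InProcessServiceHost(bus)

received = []
host.host("billing", received.append)  # the "service" just records its messages

order = {"order_id": 1, "amount": 42}
bus.send("billing", order)
host.shutdown()  # the queue is FIFO, so the order is handled before the stop signal
```

The service code itself does not change between the two deployments; only the host and the bus are swapped. In the in-memory variant the handler receives the very same object that was sent.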

The other part of Volta is the C# to JavaScript cross compiler. This may have a future - but it really depends on the attention Microsoft will put into this direction. Google does something similar on its Android mobile platform, where it takes Java bytecode and translates it into bytecode for the Dalvik VM. But for Google that's a strategic platform. With MS's investments in Silverlight (which I personally prefer), I would guess the effort here would always lag behind (though I hope they'd get it to be better than it is today).

 
Tags: .NET | Design | Everything | SOA | Software Architecture

December 6, 2007
@ 11:43 PM
Microsoft uses the "live labs" to release all sorts of test balloons. Sometimes we get really nifty stuff like Photosynth or SeaDragon. Unfortunately, sometimes we get not-so-bright ideas like Volta.

Ok, so what is Volta? Here's what the project's homepage has to say (emphasis mine):
" The Volta technology preview is a developer toolset that enables you to build multi-tier web applications by applying familiar techniques and patterns. First, design and build your application as a .NET client application, then assign the portions of the application to run on the server and the client tiers late in the development process. The compiler creates cross-browser JavaScript for the client tier, web services for the server tier, and communication, serialization, synchronization, security, and other boilerplate code to tie the tiers together.

Developers can target either web browsers or the CLR as clients and Volta handles the complexities of tier-splitting for you.  Volta comprises tools such as end-to-end profiling to make architectural refactoring and optimization simple and quick. In effect, Volta offers a best-effort experience in multiple environments without any changes to the application."

The idea sounds very compelling - I kid you not. So what's the problem?

The first issue is that, as a platform/framework (MS would say factory), Volta tries to accomplish too much. On the one hand Volta is another go at the web/desktop convergence trend. On the other hand it is supposed to be a solution for "painless" tier-splitting. Both of these tasks are very heavy. My opinion is that the Single Responsibility Principle (while originally defined for objects) applies here. And Volta should choose one thing and try to excel in that.

What's more disturbing to me, is the automatic handling of the "complexities of tier-splitting". Here's another excerpt from the Volta site which further explains the "tier-splitting" concept:
Objective

We have an application that runs in a monolithic environment, say the browser. We want parts of this application to run in other environments, such as servers. We don’t want to litter the application with plumbing code.

Rationale

The standard techniques for distributed applications infuse our code everywhere with information about what parts run where. This makes the code hard to change. Typically, once we make these decisions we can’t change them because it is too expensive. However, environments, requirements, and performance profiles change and we’re stuck with applications that can’t adapt to new realities. We need to separate the concerns about what the application does from the concerns about where parts of the application run.

Without Volta, we are forced to decide where code runs before we know everything it is going to do, in particular before we know the communication frequencies and delays. Development methodologies force us to make irreversible decisions too early in the application lifecycle. Volta gives us the means to delay decisions until we have adequate information to base them on.

Recipe

Volta tier splitting automates the creation of the communication plumbing code, serialization, and remoting. Simply mark classes or methods with a custom attribute that tells the Volta compiler where they should run. Unmarked classes and methods continue to run on the client.

We may base our decisions about tier assignment on any criteria we like, such as performance or location of critical assets and capabilities. Because Volta automates boilerplate code and processes for dispersing code, it is easy for us to experiment with and change assignments of classes and methods to tiers.

Wow, Agile development at its best, allowing us to postpone architectural decisions. That just sounds too good to be true. Well, the problem is that it is too good to be true. Abstracting the network out and providing location transparency without thinking about the implications of distribution is the reason "distributed objects" failed. E.g. here is what Harry Pierson (DevHawk) had to say about distributed objects:
"...back in 2003, mainstream platforms typically used a distributed object approach to building distributed apps. Distributed objects were widely implemented and fairly well understood. You created an object like normal, but the underlying platform would create the actual object on a remote machine. You'd call functions on your local proxy and the platform would marshal the call across the network to the real object. The network hop would still be there, but the platform abstracted away the mechanics of making it. Examples of distributed object platforms include CORBA via IOR, Java RMI, COM via DCOM and .NET Remoting.

The (now well documented and understood) problem with this approach is that distributed objects can't be designed like other objects. For performance reasons, distributed objects have to have what Martin Fowler called a "coarse-grained interface", a design which sacrifices flexibility and extensibility in return for minimizing the number of cross-network calls. Because the network overhead can't be abstracted away, distributed objects are a very leaky abstraction."


So here comes Volta and tells us to just put a [RunAtOrigin] attribute on the code you want on another tier, and if you don't like that you can change it to another place in your application and what not. Note that the notion that you can automate some or maybe even all of the distribution "boilerplate" code may be viable. The problem is in the premise that you can seamlessly move that boundary around. There's a fundamental difference between tiers and layers. Tiers should be treated as a boundary. Volta designers do talk about Security but they seem to forget a few of the other fallacies of distributed computing...



 
Tags: .NET | Agile | OO | Software Architecture | Trends

In the previous installment I talked about the architect and the architectural decisions. I also said (ok, wrote) that architects do more than that. Well, here are a few of the duties I think architects should have (sometimes not exclusively).

Project CTO - Tom Berray has an excellent paper describing 4 models for the role of a CTO. 3 of them can be applied to software architects (within their projects):
  • "Big Thinker" - This is somewhat akin to the role discussed in the previous post.
  • "External Facing Technologist" - I have usually seen this in larger projects, but it is also applicable to smaller ones. There are many occasions where the technical capabilities of the project have to be presented and/or negotiated with external stakeholders. Architects are in a good position to do this as they should have a good understanding of both the business and the technology. Additionally, making architectural decisions already requires the architect to understand the different stakeholders' needs.
The third model is called "Technology Visionary and Operations Manager" - making sure that technology works to deliver business goals - but how is that done?

In their book on organizational patterns, Jim Coplien and Neil Harrison talk about the "Quattro Pro for Windows" (QPW) development team. According to the case study, Borland had a team of 4 architects who worked together to produce what the authors call prototypes*. 6 months later these architects were joined by additional developers to produce the product. During the development the architects kept meeting on a daily basis to coordinate their efforts (sort of like a daily stand-up in a scrum of scrums).

The situation in QPW is probably close to the ideal architect involvement in a project - coding architects who work closely with the team while driving technical and architectural decisions. The availability of multiple architects (but few enough to prevent the "design by committee" effect) also enhances the overall quality of the solution.

Another aspect of the architect's work is to act as a coach/tutor. It isn't enough for the architect to "know best". We already know that architects must also be able to reason about their recommendations/decisions, but that's just part of the story. Helping other team members get better at what they do means that they'd be able to do their job better, come up with their own ideas (and bring more fresh ideas into the discussions) and produce better software. Since the architect is ultimately responsible for the quality of the solution, making others perform better should be a top priority for the architect. Being considered a source of knowledge will help an architect perform his/her role, even when they don't have an architect title.


* Actually, what they did were POCs or spikes (see "Architecture Evaluation in code" for an explanation of the differences)


 
Tags: Agile | Software Architecture

Jeff Atwood (of Coding Horror) writes about Brian Foote and Joseph Yoder's "Big Ball of Mud" paper but completely misses the ball (pardon the pun). Jeff says that the paper describes
"classic architectural mistakes in software development."

while the paper describes the exact opposite. With the exception of the "Big Ball of Mud" pattern itself, which can be seen as an anti-pattern (as, by the way, the authors explain), the rest contain exceptionally good advice on how to prevent problems. Let's look at them one by one:
  • Throwaway Code -  when you just want to make sure something works (prototype, spike etc.) or even a quick fix - you don't write elaborate and heavy code - instead you write something simple to solve the problem. Throwaway code only becomes a problem if you don't actually throw it away...
  • Piecemeal Growth - The waterfallish "big plan" in advance has failed numerous times - so instead we should build incrementally, not over-plan or over-design, build things into libraries only when a problem is proven to be recurrent, refactor etc.
  • Keep it working - The pattern is basically about building often and keeping iterations short, having tests that always prove your code still works as you make changes to it
  • Shearing Layers - This pattern is about identifying related components. We all know change is inevitable - however things don't change at the same rate. For instance we can probably get interfaces to be a little more stable than the implementations behind them. Frameworks evolve at a different rate than the business they support etc. The pattern is about dividing the system so that things that change at similar rates are together i.e. in the same package,  CSCI  or whatever you use to partition your system.
  • Sweeping It Under the Rug - When you find that you do have badly designed or badly implemented code, the pattern suggests you localize and isolate it to a fixed area (e.g. behind a facade) to prevent it from propagating into the rest of the system.
  • Reconstruction - This pattern is about understanding when the code base is so bad that it is better to start over and rebuild from scratch rather than try to patch it. In many ways reconstruction is not a good or easy thing to do; however, the point here is to identify when it is the lesser evil.
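As a tiny illustration of "Sweeping It Under the Rug", here is a sketch of a facade that is the only place allowed to know about a messy function. Both `lgcy_prc` and `Pricing` are invented for this example.

```python
# --- The badly designed code we cannot fix right now (hypothetical). ---
def lgcy_prc(d, m, f=0):
    # Cryptic names, magic mode numbers, and a result smuggled out in a list.
    r = []
    if m == 1:
        r.append(d["a"] * d["q"])
    elif m == 2:
        r.append(d["a"] * d["q"] * (100 - f) / 100)
    return r


# --- The facade: the only place allowed to know about lgcy_prc. ---
class Pricing:
    def total(self, unit_price, quantity):
        return lgcy_prc({"a": unit_price, "q": quantity}, 1)[0]

    def discounted_total(self, unit_price, quantity, percent_off):
        return lgcy_prc({"a": unit_price, "q": quantity}, 2, percent_off)[0]


pricing = Pricing()
plain = pricing.total(10, 3)
discounted = pricing.discounted_total(10, 3, 50)
```

The rest of the system only sees `total` and `discounted_total`, so the mess stays in one place and can later be thrown away or reconstructed without touching its callers.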
Oh, and what about "Big Ball of Mud" itself - essentially, from the architectural perspective, it is indeed an anti-pattern - you have something that is not very maintainable, hard to understand and what not. However, we should keep in mind that the idea behind designing an architecture is not to get the best, cleanest architecture. The idea is to make the right tradeoffs so that you'd be able to deliver the best overall solution under the constraints you face (budget, time, the team's skill and what not). If your biggest constraint is time-to-market and your architect spends all eternity planning the 8th wonder of the world, fire his ass. I'd rather live with a Big Ball of Mud for the first release than not ever make a release...
Big Ball of Mud can be considered a pattern for a pragmatic approach to building working software. This is probably not acceptable in the long term - but it can be a good option for the short term if you are aware that that's what you are doing and are willing to treat what you get as "Throwaway Code".

So again, except maybe Big Ball of Mud, these patterns are not "project pathologies" as Jeff calls them - these are very good ways to keep delivering business value and working software.



 
Tags: Agile | Design | Software Architecture

November 24, 2007
@ 06:34 PM
A few weeks ago I posted a reaction to a post by Pete Lacey that asked what is SOA. In a comment to my post Pete said that my definition isn't good since
"...even according to your definition, an architectural style contains constraints, and to date neither SOA nor web services have been shown to exhibit any constraints"
The idea behind this series of posts is to try to take a slightly more formal look at what I think SOA is. It is based on my thinking over the past few weeks but it is also still a work in progress (so any comments are welcome).

The way I see it SOA is an architectural style which is derived from the following architectural styles:
  1. Client/Server
  2. Layered System
  3. Pipe and Filters
  4. Distributed Agents
Note that if you add to the above statelessness, uniform pipe and filter and a cache you can get a RESTful SOA. This is not REST, as REST itself does not require distributed agents or even pipes and filters (but it does build on client/server and layered system). In other words, not all RESTful systems are SOA, you can build SOAs which are not RESTful and you can build RESTful SOAs.

The main components of SOA are Services, Messages, Contracts and Consumers. Policies also exist, but now I tend to think they are optional. The four architectural styles mentioned above affect the definitions of the different components and the way they interact together.

In the following posts on this subject I'll first take a look at each of the contributing architectural styles and how they affect SOA, and later try to provide a definition that builds on them.


 
Tags: REST | SOA | SOA Patterns | Software Architecture

Back in April I wrote how Adobe AIR (then it was still called Apollo) marks the beginning of the invasion of the web clients into the desktop. Later I wrote about the Java and .NET counterattacks (JavaFX and Silverlight) and then I wrote about Google's answer when Google Gears was announced.

Well Mozilla "Prism" demonstrates that even simple steps can help make this transition.
The main idea behind Prism is to "integrate web-applications into the user desktop experience". Behind this fancy statement we have a very simple solution - the ability to add a desktop/start/quicklaunch shortcut to any web application (or page for that matter) and have that show in a window that is configurable so that it doesn't waste pixels on stuff that is irrelevant for the application (like navigation buttons, address bar etc.) - which makes it better than just adding a shortcut yourself. Simple and elegant. Here's what my Google Reader looks like with Prism:

If you want to start using it, you can just download the prototype for Mac OS X, Linux, and Windows.


 
Tags: Everything | RIA | Software Architecture | Trends

October 30, 2007
@ 11:41 PM
Earlier today Microsoft announced its newest SOA initiative, codenamed Oslo. Here are a few observations I have on this announcement.

Let's start with "what is it?" Well it isn't an "it" per se, since Oslo is a bunch of initiatives within the Microsoft offerings.

For one, it is some of the libraries within .NET 4.0 -- specifically the next versions of WCF and WF.

Secondly, it is a bunch of designers and tools that will be part of Visual Studio (beyond VS 2008).
The most interesting component of Oslo will be a new repository to allow version management of models and services. I guess it is safe to say it will be built upon Team Foundation Server (or a subset of it which will be used by both products).

The last part of the puzzle is of course V.Next of Biztalk and something currently branded as "Biztalk Services '1'". As far as I know Biztalk sells pretty well, but I think it is both too bloated (e.g., think about the hardware needed to run this in a high-performance solution) and builds on the wrong architecture (hub vs. bus). I hope Microsoft makes major updates this time (Biztalk 2004 to Biztalk 2006 mostly innovated around the business activity monitoring. While that's important I think more work on the engine was/is due).

Biztalk services would offer an implementation of some of the SOA patterns I talk about -- service host, workflodize, etc. -- to provide an infrastructure for building services. The relation between "Biztalk 6" and "Biztalk services 1" is not clear from the information provided by Microsoft; hopefully this is just a branding issue and not a tight relation between the products.

On the upside, one of the key persons working on this is Don Ferguson who, before joining Microsoft, was chief architect for IBM's software group. About a year ago I had the chance to hear him talk about SOA and all I can say is that Don is someone who really knows his stuff.

PS: It's amusing to see the press release talk about a "model-driven" approach rather than software factories, but I guess that's just nitpicking.
 
Tags: .NET | Everything | SOA | Software Architecture

Let's assume I convinced you that some projects need architects (see part I). Convinced, you go and hire an architect. Now what?

Let's start by looking at "architectural decisions" - which sure sounds like something we'd want an architect to do. I read once (I think that was Martin Fowler) that an architectural decision is a decision that in hindsight you wish you had made right. If we look at a formal definition of software architecture (say from IEEE 1471) we see that the architecture embodies the fundamental decisions about the system, its components, their relations and their properties. Using this definition, an architectural decision is a fundamental decision about the system (which pretty much explains why we want to make them right etc.)

Well, here are two observations on what I've said thus far. One is that we would want to postpone architectural decisions as much as we can, since changing them will cause us a lot of headache. The problem is that in order to postpone an architectural decision we need to build flexibility into the system, which is an architectural quality in itself - one which might not be at the top of the list if we prioritize it vs. the other architectural qualities we need.

The second observation is that if we "refactor" the pretty language out of both of these definitions, we can see that an architectural decision is basically a guess. Hopefully it is an educated guess, but it is a guess nonetheless. And as the saying often attributed to Niels Bohr goes, it is hard to make predictions - especially about the future.

This is why architects need breadth of knowledge - which helps explain the architect training program I posted about a few weeks ago (see Architect training program Part I and Architect training program Part II). Another aspect is experience. And to get a wider perspective it can be helpful if this experience includes other roles besides developer, such as project manager or business analyst. Another important component is domain knowledge and understanding of the business.

Using all these you (as an architect) may come up with a reasonable architectural decision (e.g. use MVC pattern) and a design to match it and that's it.

Well, actually, not quite, since as I said earlier it is still a guess. Remember, an architectural decision (and any design for that matter) is a mirage no matter how beautiful the PowerPoint slide looks (or whiteboard or UML sketch etc.)

Alas, PowerPoint compilers are still in the making. Which means that as an architect, you must be able to prove your point in writing - that is, coding. While you are at it, you also need to know a thing or two about the technology you are using because it too has an architecture, features etc. which can have a significant effect on the end result. (You can read a little bit more on this in the "Architecture Deployment" paper I published a while ago).

The result of trying to postpone architectural decisions and of ever-changing requirements, along with adding details as we unfold the architectural abstraction level into a working system, is that the architect can't just appear at the inception of a project and disappear afterwards - they need to stick around for the game. This is especially true if you want to have an evolving architecture.

An architect needs to do more than "architectural decisions". There are also additional reasons why the architect should have continuous interaction with the rest of the development team. However that will have to wait for Part III. :)




 
Tags: Agile | Everything | Software Architecture

Well it seems setting up a new company can keep you busy at times - which is my official excuse :) for the quiet last week. Hopefully I'll have a little more free time this week

Note: I am talking in this post about roles and capabilities, i.e. when I say architect I mean someone who has the capacity to be an architect (e.g. as demonstrated in the architect training suggestion I made) and takes that role. On a specific project you may have a person that performs multiple roles, such as an architect and project manager or an architect and developer, while other projects warrant one or more full-time architects.

Not all projects need architects. There, I've said it. Not all projects need architects and I am not talking here just about trivial projects. There are cases (maybe even many cases) where you can get by with what I call "off-the-shelf" architecture - maybe with a few adjustments that any master developer (i.e. seasoned and experienced developer) can handle. For instance a lot of web-sites can do pretty well by using Model-View-Controller (or a derivative of that) along with a simple O/R mapper such as active-record. In fact a lot of them do when they use a framework like Rails that made these architectural choices for them*. Another example is the vanilla 3-tier architecture provided by vendors (such as this one by Microsoft). Yes, when you take something off-the-shelf the result might not be optimal but that doesn't mean it isn't sufficient. You just have to be aware of the tradeoffs...

Another point is that "design" is not an exclusive architect thing. A developer is not a good developer unless she also knows about proper design. Mastery of the technical details of a language without understanding the wider context of design will just help you code a lot of crap faster.

Having said that, we need to consider a few issues: When do you need architects? What do they do? What is their relation to, and interaction with, the developers?

When do you need architects?

It is sometimes a fine line between a project that can get by with an "off-the-shelf" architecture and one that needs an architect. It would be nice if we could have something like a litmus test that would tell us if architects are needed or not. I don't have one. The closest thing I could get to this is something I call the SCLR test (pronounced scaler). SCLR stands for Size, Complexity & Limited Resources.
  • Size - if you are going to have something estimated at 1000 man-years or dozens of teams, I think it is pretty obvious that you can't just use something which isn't made to fit. If anything, there's a need to divide the work between the teams in a way that makes sense so that you don't end up with a big ball of mud. There's also a lot of need to coordinate the efforts and keep the big picture in line. Personally I think that it doesn't have to be a huge project to warrant some architect involvement, since, as Fred Brooks notes, the number of interactions grows rapidly (roughly with the square of the number of people) as we add more people. In my experience trouble starts even with more modest numbers - more than 4 or even 3 different teams working concurrently is probably a good number to start thinking about architects.
  • Complexity - There are many signs of complexity in a project. The vision statement can provide a hint: "Let's design the software to support the next Mars mission", "best CRM platform ever" - an ambitious project will not make do with an "average" architecture. Size (which I already mentioned) is also a sign of complexity; while previously I talked about the size of the project, the size of the data, the number of users etc. are also relevant when we're thinking about complexity. A lot of external interfaces is another sign. Integration doesn't seem very complicated until you actually try to pull it off, and when you have to do a lot of it in a project, that's complex. And there are many other signs.
  • Limited Resources - Naturally every project has limited resources, but limited resources should be considered a sign for architect involvement only when the limits are extreme. When resources are extremely limited the tradeoffs that have to be made are more meaningful, which is why we'd want people who can help with that (i.e. architects). For instance, in a project I worked on in the past we had a lot of availability and performance requirements on one hand, but only so many "U"s in the rack and even limited electricity to make all this magic happen. This turned something which otherwise was a relatively standard IT project into something a lot more challenging.
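
To put a number on the Size point: Brooks's observation in The Mythical Man-Month is usually stated as n(n-1)/2 pairwise communication paths among n parties. The snippet below is only a back-of-the-envelope sketch of that formula; the function name and the team counts are mine, picked for illustration.

```python
def communication_paths(n: int) -> int:
    """Number of distinct pairwise channels among n parties: n(n-1)/2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


# Illustrative team counts only; they are not measurements from any project.
for teams in (2, 3, 4, 8, 12):
    print(f"{teams:>2} teams -> {communication_paths(teams):>2} pairwise channels")
```

Going from 3 teams to 4 doubles the number of channels (3 to 6), which fits the gut feeling that trouble starts at fairly modest numbers.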
Assuming I managed to convince you that some projects can't just choose one of the available blueprints and need some more work, the next step would be to convince you that architects are better suited to solve this than developers. I'll try to do that in the next post in this series, where I'll explain what (I think) architects do and their place in the development team.


* Rails has more than just MVC and Active-Record but that isn't an important point for this discussion



 
Tags: Design | Everything | Software Architecture

October 5, 2007
@ 10:46 PM
Pete Lacey has a post called "What is SOA?" where he defines SOA as follows:
"
  • Network Oriented Computing (NOC): An approach to computing that makes business logic available over the network in a standardized and interoperable manner.
    • Service Oriented Architecture (SOA): A technical approach to NOC that has a non-uniform service interface as its principle abstraction. Today, SOAP/WS-* is the chief implementation approach.
    • Resource Oriented Architecture (ROA): A technical approach to NOC that has an addressable, stateful resource as its principle abstraction. Today, REST/HTTP is the chief implementation approach.
  • Business Service Architecture (BSA): An unnecessary term (also not an architecture) that tries to make the obvious something special. Aka, business analysis. Aka, requirements gathering"
I am sorry but I beg to differ.

The first thing to note (again) is the architecture vs. architecture style differentiation I mentioned in a previous post (you can see a similar definition by Stuart Charlton). Here is a quick reminder:
Software architecture is the collection of the fundamental decisions about a software product/solution designed to meet the project's quality attribute requirements. The architecture includes the main components, their main attributes, and their collaboration (i.e. interactions and behavior) to meet the quality attributes. Architecture can and usually should be expressed in several levels of abstraction (depending on the project's size).
An architectural style is a blueprint that can be used when you design an architecture. An architectural style defines some of the components and their attributes, as well as placing constraints on how they can interact.
My claim is that SOA is an architectural style for distributed computing which puts extra emphasis on the interface (and hence gets the easier interoperability). Ok, if SOA is indeed an architectural style, we should be able to define it as a set of components, interactions and attributes. Well, I already did that a while ago (in a paper called "What is SOA anyway?"). And while it may not be perfect, I think it is a reasonable definition all the same:

"SOA is an architectural style for building systems based on interacting coarse grained autonomous components called services. Each service expose processes and behavior through contracts, which are composed of messages at discoverable addresses called endpoints. Services’ behavior is governed by policies which can be set externally to the service itself. "



You can see the above mentioned paper for a little more detail on each of the components.
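
To make the vocabulary in that definition a little more concrete, here is a minimal sketch of its components (service, contract, message, endpoint, policy) as plain data types. This is only my illustration of how the pieces relate; all the class, field and example names (the "Ordering" service, its address, the policy value) are hypothetical and do not come from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    name: str
    fields: tuple[str, ...]


@dataclass(frozen=True)
class Contract:
    """A contract is composed of the messages a service understands."""
    name: str
    messages: tuple[Message, ...]


@dataclass(frozen=True)
class Endpoint:
    """A discoverable address at which a contract is exposed."""
    address: str
    contract: Contract


@dataclass(frozen=True)
class Policy:
    """Governs behavior; can be set from outside the service itself."""
    name: str
    setting: str


@dataclass
class Service:
    name: str
    endpoints: list[Endpoint] = field(default_factory=list)
    policies: list[Policy] = field(default_factory=list)

    def accepts(self, address: str, message_name: str) -> bool:
        """True if some endpoint at this address has the message in its contract."""
        return any(
            ep.address == address
            and any(m.name == message_name for m in ep.contract.messages)
            for ep in self.endpoints
        )


# Hypothetical example service, for illustration only.
place_order = Message("PlaceOrder", ("customer_id", "items"))
ordering_contract = Contract("OrderingContract", (place_order,))
ordering = Service(
    name="Ordering",
    endpoints=[Endpoint("http://example.invalid/ordering", ordering_contract)],
    policies=[Policy("encryption", "required")],
)
```

Note that the policy sits next to the service rather than inside the contract, which is one way to read "set externally to the service itself".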

ROA, in my opinion, is just a re-branding of REST so that it would be easier to discuss it as an architectural style and not connect it to the HTTP implementation - which is what  a lot of REST proponents are doing.

By the way, as I pointed out before, there are a few other important architectural styles that are related to distributed systems, like Event Driven Architecture, Space-based architecture, peer-to-peer etc.

As for "Business Service Architecture" - I personally like to think about that as an "SOA initiative", as in the strategic decision to try to implement an SOA in an organization while trying to achieve the more nebulous traits like business and IT alignment etc. (which is why it is neither an architecture nor an architecture style).


 
Tags: Everything | Papers | REST | SOA | Software Architecture

In a recent post Steve Vinoski said:

"Frankly, if I were an enterprise architect today, and I were genuinely concerned about development costs, agility, and extensibility, I’d be looking to solve everything I possibly could with dynamic languages and REST, and specifically the HTTP variety of REST. I’d avoid ESBs and the typical enterprise middleware frameworks unless I had a problem that really required them (see below). I’d also try to totally avoid SOAP and WS-*."

It is easy to dismiss this as just another yahoo who goes against conventional wisdom, until you remember that Steve spent more than a decade working at Iona in leading roles like Chief Engineer of product innovations and helped develop some of the middleware standards for OMG and W3C.

Well, I guess that's becoming an epidemic now :) Just recently we had Michael Stonebraker talking about the RDBMS demise and Pat Helland talking about life beyond distributed transactions, and now Steve on ESBs.

That trend aside, I think Steve is throwing the baby out with the bath water. The dream of a single infrastructure for an enterprise is ludicrous enough (remember Peter Deutsch and the "The network is homogeneous" fallacy), but if you drop the "E" from the ESB moniker you get valuable middleware which is very usable in many situations, not just legacy system integration. For instance, one thing that is missing from the "HTTP variety of REST" implementation is reliable messaging; location transparency is also harder to solve with HTTP, etc.

Another problem I have with Steve's current approach is that he is replacing one dogma (ESBs are good) with another (ESBs are bad, use Ruby and REST) - this is not a healthy approach. The solution should match the problem; that's probably the primary reason why we need architects, after all.

 
Tags: ESB | Everything | REST | SOA | Software Architecture

In the previous post on the subject I promised to expand a little more on the suggested content for the "Distributed Systems Architectures Workshop", so here is a short drill-down:

Even though most of the time should be spent on working, designing and evaluating architectures, there's probably a little room for theory.

Module 1 The basics (probably not more than half a day)
What's software architecture
The software architect role
Activities
Scenario based architectural design
(documenting software architectures)
Agile SDLC and architects

Module 2 Distributed Systems background

Understanding the Fallacies of distributed computing

Distributed architecture styles - it is important to understand the different architectural styles that can be used to implement distributed systems. Within this, topics like clustering, computation and data grids, messaging, publish/subscribe etc. should also be discussed.

  • Client-server - The most basic distributed architectural style. It is based on the N=1 premise and isn't fit for most of today's challenges. However, it is still an option for some types of projects.
  • Pipe and Filters - not necessarily a distributed style, but it can be applied in distributed space
  • N-Tier - That's actually a moniker to anything where N>2 but usually it pertains to 3-tier architecture (front-end, server, database) or the internet 4-tier version (client, webserver, application server, database). 
  • Event Driven Architecture
  • Service Oriented Architecture
  • REST 
  • Space-based architectures - like JavaSpaces  and its implementations like Blitz (open source) and Gigaspaces (commercial)
  • Peer-to-Peer - you know that's what all those file sharing tools use
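
As a taste of what one of these styles looks like in code, here is a minimal in-process sketch of Pipes and Filters using Python generators. It is only an illustration under my own assumptions: the filter names and sample data are made up, and in a distributed version the "pipes" would be queues or sockets rather than generator chaining.

```python
from typing import Callable, Iterable, Iterator

# A filter consumes a stream of lines and produces a stream of lines.
Filter = Callable[[Iterable[str]], Iterator[str]]


def strip_blank(lines: Iterable[str]) -> Iterator[str]:
    """Drop lines that are empty or whitespace only."""
    for line in lines:
        if line.strip():
            yield line


def normalize(lines: Iterable[str]) -> Iterator[str]:
    """Trim surrounding whitespace and lowercase each line."""
    for line in lines:
        yield line.strip().lower()


def dedupe(lines: Iterable[str]) -> Iterator[str]:
    """Pass each distinct line through once, keeping first-seen order."""
    seen = set()
    for line in lines:
        if line not in seen:
            seen.add(line)
            yield line


def pipeline(source: Iterable[str], *filters: Filter) -> Iterable[str]:
    """Connect the filters in order; each one only sees the previous output."""
    stream = source
    for f in filters:
        stream = f(stream)
    return stream


# Made-up sample input.
result = list(
    pipeline(["  Order-1 ", "", "order-1", "ORDER-2"], strip_blank, normalize, dedupe)
)
print(result)
```

The style's constraint is visible in the signature: a filter knows nothing about its neighbours, only about the shape of the stream.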

Distributed Consensus

  • 2 phase commit - used by XA and COM+ distributed transactions
  • 3 phase commit - considered a non-blocking protocol (vs. 2PC which is a blocking protocol)
  • Paxos commits
  • Sagas
  • Eventually consistent (BASE) - Basically Available, Scalable/Soft state & Eventually Consistent. An alternative to distributed transactions used by a lot of internet-scale companies (see a post I made on eBay's architecture)
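
For workshop purposes, the happy path of 2 phase commit fits in a few lines. The sketch below is a toy, in-memory version with hypothetical participant names; it deliberately ignores timeouts, crashes and durable logging, which is exactly where the blocking behavior of the real protocol comes from.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Participant:
    """A hypothetical resource manager taking part in the transaction."""
    name: str
    can_commit: bool = True  # stands in for "the local work succeeded"
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote. A real participant would durably log its vote here.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: Iterable[Participant]) -> bool:
    """Coordinator: commit only if every participant votes yes."""
    members = list(participants)
    # Phase 1 (voting): ask everyone to prepare and collect the votes.
    votes = [p.prepare() for p in members]
    decision = all(votes)
    # Phase 2 (completion): broadcast the single global decision.
    for p in members:
        if decision:
            p.commit()
        else:
            p.rollback()
    return decision


ok = [Participant("orders"), Participant("billing")]
mixed = [Participant("orders"), Participant("billing", can_commit=False)]
committed = two_phase_commit(ok)
aborted = two_phase_commit(mixed)
```

A useful discussion starter is to ask what a prepared participant should do if the coordinator disappears between the two phases.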

Module 3 - workshop - most of the days should be focused on actually working to design architectures.
I would think that this would be handled best by working in groups, e.g. having each group focus on one architecture style.
 
The groups would be given a scenario which covers some architectural concern (integrity, performance, scalability, availability etc.) and would try to design strategies to handle the scenario within the constraints of their architectural style. Each group would then present its strategy to the other groups, followed by a facilitated discussion on the pros/cons of each strategy. The scenarios should be based on a large enough story to allow meaningful architectures to emerge (e.g. you can see the 10 scenarios in my SOA Patterns presentation).


Any comments or other ideas for what's needed for this kind of workshop are welcome.


 
Tags: .NET | Everything | Java | Software Architecture

It seems that even the smartest people can get the difference between architecture, architecture styles and technology wrong.
For instance, Anne Thomas Manes points out that Roy Fielding makes this mistake in his REST and Relaxation presentation by mixing an architectural style with technology:
 "Roy is equating SOA with web services. Although a lot of folks use web services to implement services, that's simply an implementation decision"
But then she proceeds to make the exact same mistake:
"So when watching Roy's presentation, replace the term "SOA" with "WS-*", and the discussion will make a lot more sense."
REST is an architectural style; you can implement it with WS-*, which is a technology. It is not the most natural way to use the WS-* standards, but it is doable.

Looking at the same context (i.e. Roy Fielding's presentation), Steve Jones makes a similar mistake, confusing architecture and architecture style.

My definition for software architecture is
Software architecture is the collection of the fundamental decisions about a software product/solution designed to meet the project's quality attribute requirements. The architecture includes the main components, their main attributes, and their collaboration (i.e. interactions and behavior) to meet the quality attributes. Architecture can and usually should be expressed in several levels of abstraction (depending on the project's size).
An architectural style is a blueprint that can be used when you design an architecture. An architectural style defines some of the components and their attributes, as well as placing constraints on how they can interact.

For instance, the REST constraints (taken from Anne's post mentioned above) are:
"Uniform Interface:
  • Resources are identified by only one resource identifier mechanism
  • Access methods (actions) mean the same for all resources (universal semantics)
  • Manipulation of resources occurs through the exchange of representations
  • Actions and representations are exchanged in self-describing messages

Hypertext as the engine of state:

  • Each response contains a partial representation of server-side state
  • Some representations contain directions on how to transition to the next state
  • Each steady-state (page) embodies the current application state"
Architecture styles can be combined to create new architectural styles. Roy Fielding demonstrates this in his famous dissertation, where he shows how REST is a composition of several styles such as Client/Server, Layered System, Stateless etc. As another example (with a lesser degree of precision), I talk about enhancing SOA with EDA in "bridging the gap between BI and SOA".

The last piece of the puzzle is technology. Technology (in the software context) is a set of tools provided by a vendor to enable and support building software solutions. As I've said here numerous times, technologies have their own internal architectures (as they are software solutions themselves), which is why different technologies support different architectural styles and why the alignment of the technology with the architecture chosen for your solution is important.

Yes, this post is all about semantics - but clear meanings are important to prevent confusion, in my opinion anyway.


 
Tags: Everything | REST | SOA | Software Architecture

A side effect of my decision not to become an independent consultant at this time means that I have to shelve some of the projects I was considering. One of these projects was to create a training program for software architects which I was discussing with a couple of training centers here in Israel.
Since it seems I am not going to promote it, I thought I'd share what I think a training program for .NET/Java architects should look like, in the hope that someone will find it useful and promote it (or parts of it).

Soft Skills
The way I see it, architects need a bunch of soft skills to be able to perform their roles.
Here is the list I identified in the past (by the way, I began a series of posts on each of these skills and never got to finish it - maybe it is time that I did :) )
  • Leadership. Influencing others to accomplish tasks and follow your guidance.
  • System thinking. Understanding decisions and constraints in the wide scope pertaining to the whole of the solution at hand. This includes the ability to abstract problems.
  • Strategic thinking. Understanding decisions and constraints and their alignment to the overall business of the company.
  • Organizational politics. Understanding the environment you operate in and how it influences you.
  • Communications. Making sure you get your point across.
  • Human relations. Understanding the "people" aspects or human factors and dynamics. This includes things like negotiation, pragmatism etc.
I am not sure if you can teach all of them, but a few courses that can help (in my opinion) include:

  • Presentation Skills - While getting the architecture and technology right is what matters, if you can't explain it to the different stakeholders you're toast.
  • Strategic Planning - This has to do with the vision thing I expect architects to manifest. Note that having a vision should not be confused with future-proofing a solution. Future-proofing means doing excess work that isn't needed. Having a vision is knowing where you want to end up - it can still be perfectly valid to completely rewrite your applications along the way.


Project Management
While the architect is not the project manager (mostly anyway), I think understanding the constraints coming from the project management point of view is very important. Since most environments call for a mix of agile and formal disciplines (hey, you've got to be pragmatic), I would train architects both in SCRUM and RUP (or some other formal methodology).

Also, while not all environments need this, I would give a 2-day overview of important standards. The first would be IEEE 1471, which defines a standard for documenting software architectures. I would also teach ISO 90003 and CMMI.

It should be noted that ISO 90003 is much better than the previous incarnation (ISO 9000-3), as it basically lets you define what you want to do to cover the different areas. The standard just helps you make sure you think about the various parts of project management (requirements, environment etc.). For instance, I demonstrated how key areas of 90003 can be mapped to SCRUM to get it approved on my last project.


Languages, Design  and Patterns

I would want the .NET/Java architect training program to include at least 2 of the following languages:
Ruby, Scala, Erlang, F#, Python, Groovy, OCaml
The reason for this is that these languages have different design goals than .NET and Java, so learning them gives you additional perspectives and broadens your horizons with other ways of thinking (even if you don't use them in your project directly). You might have noticed that there's no .NET or Java training here. The reason is that it is a prerequisite as far as I am concerned. You should master at least one.

Object Oriented principles - hopefully aspiring architects already know this. However, I often see people who discovered some of the principles by themselves but haven't heard about all of them.
I am talking about principles such as the Liskov Substitution Principle, Open Closed Principle, Single Responsibility Principle, IoC containers, Don't Repeat Yourself and YAGNI (I summarized my opinion on most of them in this paper).

The next step is to cover some design issues like Domain Driven Design, UI design, database modeling and database alternatives (after all, the database is dead, right?).

Advanced design patterns - When most people hear the term design patterns they think about the GoF patterns. There are, however, literally hundreds of design patterns. Some of them are even worth learning :) . For instance, there are patterns for concurrent and parallel systems like Proactor, Reactor, Half-Sync/Half-Async etc.; workflow patterns like Cancelling Partial Join, Recovery Action etc.; and SOA patterns (ok, so I am still working on that :) ).



Architecture Workshops
Another important part of the training, in my opinion, is to do some workshops and actually try to apply some of the material covered.

  • Architecture Evaluation - workshop, 2-3 days - It is probably worthwhile to delve a little into scenario-based evaluation techniques such as LAAAM and ATAM. While I prefer evaluating architecture in code, scenario-based thinking is very valuable for eliciting architectural requirements
  • Distributed Systems Architectures workshop - I'll expand on this in the next post


Lastly, there are also a few miscellaneous subjects like architect 101, the SPAMMED architecture framework , Agile architecture, Behavior Driven Design , common frameworks (though hopefully this would  not be needed ) like Spring/Spring.Net, Hibernate/NHibernate, iBatis  etc.


PS

Note that there are a few architect training programs available out there.
One is offered by the Software Engineering Institute (SEI) and includes 6 courses. The SEI program seems to be focused on the formal sides of architecture, as it includes courses on documenting software architecture and ATAM (you can see an old presentation I have on ATAM here).
Dana Bredemeyer also offers architect training. Dana offers several workshops that cover the software architecture profession.
TOGAF (which is more of an enterprise architecture framework) offers both a certification and courses.
Lastly, IASA is considering creating a software architect program and has a few courses in development.
If you know of any others I'd be happy to hear about them.


 
Tags: .NET | Agile | Design | Everything | Java | OO | Software Architecture

You may have read about the guy who, after spending 2 years on a Ruby on Rails project, switched back to PHP. I am not going to get into the debate of whether Ruby on Rails is better than PHP or whether Ruby is overhyped (probably - but that doesn't mean it isn't any good either. By the way, it is the same with SOA, but I digress).

Reading his post I saw a few quotes that raised a red flag for me such as:
"But at every step, it seemed our needs clashed with Rails’ preferences. (Like trying to turn a train into a boat. It’s do-able with a lot of glue. But it’s damn hard. And certainly makes you ask why you’re really doing this.)"
and
#2 - OUR ENTIRE COMPANY’S STUFF WAS IN PHP: DON’T UNDERESTIMATE INTEGRATION
By the old plan (ditching all PHP and doing it all in Rails), there was going to be this One Big Day, where our entire Intranet, Storefront, Members’ Login Area, and dozens of cron shell scripts were ALL going to have to change..
and
Speaking of tastes: tiny but important thing : I love SQL. I dream in queries. I think in tables. I was always fighting against Rails and its migrations hiding my beloved SQL from me.

What these quotes really say is that this guy doesn't know about "Technology Mapping 101". Here is what I wrote about technology mapping* about 2 years ago (incidentally that's about the same time this guy started his Ruby adventure :) )


"Technology mapping can have a significant impact on the ability to actually implement the architecture. The wrong mapping can make adhering the architectural guidelines very cumbersome and sometimes nearly impossible."
Every technology or framework has its own architecture. This architecture poses constraints and makes certain things easy (like using the ActiveRecord pattern in Rails) and certain things harder (like not using O/R mapping). So, for instance, in the case mentioned above a better technology mapping might have been RBatis (iBatis for Ruby/Rails), which lets you map SQL statements to objects. It is important to note (in Rails' case) that one of the core tenets of Rails is preferring convention over configuration; when you don't follow the conventions you have to work hard(er), as you are working against the framework.
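
To show the difference between the two mapping approaches without dragging in Rails itself, here is a small Python sketch using the standard sqlite3 module. It is not the Rails ActiveRecord API and not the RBatis API; the table, the class names and the `map_query` helper are all hypothetical, made up to contrast "the class hides the SQL" with "you write the SQL and map the rows".

```python
import sqlite3
from dataclasses import dataclass

# Made-up sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 7.5)],
)


class Order:
    """Active-record flavour: the class knows its table and hides the SQL."""
    table = "orders"  # by convention, class Order lives in table "orders"

    def __init__(self, id: int, customer: str, total: float):
        self.id, self.customer, self.total = id, customer, total

    @classmethod
    def where_customer(cls, customer: str) -> list:
        rows = conn.execute(
            f"SELECT id, customer, total FROM {cls.table} WHERE customer = ?",
            (customer,),
        )
        return [cls(*row) for row in rows]


@dataclass
class CustomerTotal:
    customer: str
    total: float


def map_query(sql: str, result_type, params=()):
    """SQL-mapper flavour: hand-written SQL, each row mapped onto an object."""
    return [result_type(*row) for row in conn.execute(sql, params)]


alice_orders = Order.where_customer("alice")
totals = map_query(
    "SELECT customer, SUM(total) FROM orders GROUP BY customer ORDER BY customer",
    CustomerTotal,
)
```

Someone who "dreams in queries" would probably be happier with the second half of the sketch, which is the whole point of choosing the mapping deliberately.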

Another problem with the technology mapping here was his point #2. It is a pity he only saw it after the fact. It can be justifiable to switch everything, but you've got to allow this change to occur iteratively. While I generally dislike the software architecture = building architecture metaphor, using it here does make sense: building software for an existing business (vs. greenfield development) is like building a new intersection. You have got to think about building all the detours that would allow the traffic (business) to continue to operate. Sure, it might run slower in the intermediate phase, but it can't stop altogether.

So again: once you have an architecture that fits your business, take a look at the technology options you have and try to choose the best fit. Whatever you choose, take a look at the implications of that technology and think about the tradeoffs. You may find that you either have to adjust your architecture or change the technology altogether - if you don't, you might find you wasted 2 years of development effort or even more.



* Technology Mapping is one of a set of activities I identified as needed to make sure you have a solid architecture. The activities go by the acronym SPAMMED, and you can read more about them in this article and/or this presentation



 
Tags: Design | Everything | ruby | Software Architecture | SPAMMED Process

There were a few threads about whether SOA is about the technology or not.

In my opinion SOA, and architecture in general, are never about the technology - technology is important, but it is just one variable in the equation. What we are looking for is a way to satisfy as many of the business needs as we can under all the constraints we face.

For instance, a few days ago I got a question in my email box from someone calling himself coldplay. While I don't think the band has somehow got itself interested in Event Driven Architecture, the question itself looked interesting enough. Here is the situation:
Current Setup
Lets take a e.g of a Inventory Stock Reorder point exception with in heterogeneous apps environment(No-SOA and integrations)...
The exception definition was built into the source apps and when the stock dropped below reorder....event registered and led to a exception. Exception was further handled by a rules based engine and a workflow notification raised ..

Planned Setup
Same e,g as above .. Post SOA implementation.. Inventory management is composite service built by orchestrating collaborative services from SAP and Oracle...which have different data model supporting them...
The exception definition requires to be defined outside the native Oracle apps and might have to get some event related information from SAP web service also .. to arrive to a conclusion as to whether this really is a Exception or not ..


Possible Technical approaches:

• Data Persists somewhere in the processing the Exception
• Data doesnt persist
• In mem database used..


My Question now :

1. What do you advise to be used in EDA?  which would reduce network round trips, decrease apps server loads from the above 3 technical outlooks.
2. What is diff between in-mem db and usual processing of apps logic by a apps server

I feel :

• Data persistence would lead to larger commit times and reduce operational efficiency
• If the data doesnt persist... and all validations are executed on the fly... dont you think the current apps servers would die processing ... or if its processing capacity is increased .. is it going to be economically viable alternative.

To be quite honest, I can't really answer the question asked, because it lacks the business context:
What are the implications if events are missed or lost? What's the acceptable latency that would allow the business to operate properly? If the Oracle bits and the SAP bits need a lot of data from each other, then maybe the whole service partitioning is wrong and the services are not cohesive enough? How many business events are expected anyway? How often? And the list goes on.

Once you answer the business questions you can look at the available technology portfolio and ask whether you want an in-memory database, or whether a datagrid would be a better option. Even then the decision is not just technology driven, since when you do a cost/benefit analysis you need to take into account purchasing costs, operational costs, the skillset of the dev team, time to implementation etc.

This is not to say that if you choose a technology that isn't aligned with your architecture you shouldn't reconsider the architecture (or the technology). Also, since each technology product brings to the table its own architecture (with its own constraints and decisions), you probably need to verify the architecture once you make technology choices. But still, at the end of the day it is the business needs that sit in the driver's seat; the rest is just tagging along for the ride.


 
Tags: Everything | SOA | Software Architecture

September 20, 2007
@ 12:25 AM
Another REST related post - this time I want to share a couple of observations I had after reading and listening to Roy T. Fielding's presentations (Roy's presentation from RailsConf 2 days ago, via Pete Lacey).


The first point has to do with a question which is sometimes raised: can you do REST without HTTP? I.e. can you have a RESTful architecture if you don't use the HTTP protocol and, furthermore, don't use the HTTP verbs (GET/PUT/HEAD etc.) as the uniform interface? I talked about it a while ago and I think you can. Listening to Roy's talk, it seems that, at least in the opinion of one of HTTP's architects, the answer is yes as well.

Another point occurred to me while watching Roy's talk; it is related to the "REST magic" post I wrote a little over a week ago. The use of a uniform interface is touted by REST proponents (and Roy himself) as a coupling-reducing formula. After all, if you use a uniform interface you are not coupled to the particular semantics of any resource/service; you already know the capabilities (actually the maximal capabilities) they offer. What ensues is that instead of using a lot of verbs (ReserveRoom, UpdateOrder etc.) you use a lot of nouns (/rooms/, /orders/order1 etc.)

This works extremely well on the "human" web, where my browser can navigate to any-ol'-site without any prior knowledge of what the site is about. When I navigate to Amazon I can buy stuff, when I navigate to the New York Times I can read stuff etc. The catch is that the browser is really dumb about what's going on. I, as a human using the browser, understand the context from the content (well, most of the time anyway ;) ) so the browser can remain decoupled. However, when you translate this to the "programmable" web you usually don't have some mighty AI engine examining the response to understand the context. Instead, you trade the verb coupling, which with WS-* web services would be defined in a contract, for coupling to the nouns (this is not to say the nouns aren't discoverable - they are, thanks to the hypertext or document-oriented communication REST encourages). The end result is pretty similar to what you get when you use verb-based contracts: your software still needs to understand (where "understand" means some level of coupling) what it is doing with the "other" services. Not to mention that you still need to understand the content of the message (sorry - response) to do anything useful with it.
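
Here is a toy sketch of that trade, assuming two in-memory stand-ins for a room reservation service. Everything in it (class names, the path, the payload shape) is hypothetical and there is no real HTTP or WS-* involved; it only shows where each client's knowledge about the service ends up.

```python
class VerbService:
    """Contract-style service: operations are named verbs."""

    def __init__(self):
        self.reservations = []

    def reserve_room(self, room_id: int, guest: str) -> dict:
        self.reservations.append((room_id, guest))
        return {"status": "reserved", "room": room_id}


class ResourceService:
    """Uniform-interface style: a fixed set of verbs applied to named resources."""

    def __init__(self):
        self.resources = {"/rooms/12/reservations": []}

    def post(self, path: str, representation: dict):
        if path not in self.resources:
            return 404, {}
        self.resources[path].append(representation)
        return 201, {"status": "reserved", "room": representation["room"]}

    def get(self, path: str):
        if path not in self.resources:
            return 404, []
        return 200, list(self.resources[path])


def verb_client(service: VerbService) -> dict:
    # Coupled to the operation name "reserve_room" and its parameter list.
    return service.reserve_room(12, "dana")


def resource_client(service: ResourceService) -> dict:
    # Coupled to the noun (the path) and to the shape of the representation.
    # A hypermedia client could discover the path from links instead of
    # hard-coding it, but it would still need to know what the link means.
    status, body = service.post("/rooms/12/reservations", {"room": 12, "guest": "dana"})
    return body


verb_result = verb_client(VerbService())
resource_result = resource_client(ResourceService())
```

Both clients end up with the same answer, and both break if the service renames what they depend on; the knowledge just lives in a different place.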

In any event, while loose coupling is very desirable, we also need to remember that the only way to truly achieve complete decoupling is to not connect components. So some coupling is always needed if we want to produce meaningful systems.

What do you think?


 
Tags: Everything | REST | SOA | Software Architecture

If there's one reason to go to ApacheCon 07 in Atlanta, then it's probably Roy T. Fielding's "a little REST and Relaxation"

Here is the abstract:
"Representational State Transfer (REST) is an architectural style that I developed while improving the core Web protocols (URI, HTTP, and HTML) and leading them through the IETF standardization process. I later described REST as the primary example in my dissertation. Since then, REST has been used (and sometimes abused) by many people throughout the world as a source of guidance for Web application design. But is the REST that we hear about today the same as what I defined in my dissertation, or has it taken on the baggage that comes with an industry buzzword? This talk will provide a real introduction to REST and the design goals behind its evolution as the Web's arhitectural style. This is not about XML-over-HTTP as an alternative to SOAP, nor about "resource-oriented" frameworks that help simplify CRUD operations, but rather about the design goals and trade-offs that influence the development of network-based applications. I will also describe what happens when we relax some of the REST constraints, and how such relaxation is impacting the design of the waka protocol as a replacement for HTTP."
Now all I have to do is find an excuse for my boss... :)

There isn't a whole lot of information available on WAKA (the replacement for HTTP that Roy mentions at the end of the abstract). Below are a few links I managed to find.
And there are a few others, but they are not as interesting (to me anyway). As we can see, this WAKA thing has been in the works for a long time now. Also, replacing something as ubiquitous as HTTP is no small feat. But I guess if anyone can pull this off it would be Roy... As always, only time will tell.

Edited (18/9): it seems that a recent version of Roy Fielding’s presentation  is available online on parleys.com (via Stefan Tilkov)



 
Tags: Everything | REST | SOA | Software Architecture

From time to time I read about the magic that is RESTful services and how they solve anything and everything: scalability, idempotency, simplicity, etc. For instance, in "RESTful Web Services" Sam Ruby and Leonard Richardson say:
 "PUT and DELETE operations are idempotent. if you DELETE a resource, it's gone. If you DELETE it again, it's still gone..." (p.103)
or
"the safe methods, GET and HEAD, are automatically idempotent as well" (p.219)

Another example comes from Anne Thomas Manes who said

"The REST architectural style defines a number of basic rules (constraints), and if you adhere to these rules, your applications will exhibit a number of desirable characteristics, such as simplicity, scalability, performance, evolvability, visibility, portability, and reliability.

The basic rules are:
  • Everything that's interesting is named via a URI and becomes an addressable resource
  • Every resource exposes a uniform interface (e.g., GET, PUT, POST, DELETE)
  • You interact with the resource by exchanging representations of the resource's state using the standard methods in the uniform interface
"

I think such claims  are plainly wrong and misleading.
 
Don't get me wrong, I like the REST approach, since it encourages better service design, e.g. document-oriented message exchange vs. the RPC-like message exchange that the so-called "WS-death-*s" (or actually the tools that support them) encourage.

It also encourages the above-mentioned traits. However, that's exactly the point: REST encourages this thinking; it does not solve scalability or other problems out of the box. You still need to design your services properly.

For instance, if you follow Anne's rules you can still end up with a service that is stateful and performs heavy distributed transactions against multiple databases and systems, i.e. a service that is neither simple, scalable nor performant.

DELETE will only be idempotent if the resource is idempotent (e.g. a specific version of a resource) or the message is idempotent (e.g. requesting the deletion of a specific version). If you are deleting the "recent version", it might have been recreated between your calls, and you are now deleting something completely different. Heck, even a GET (read) message with a single reader can be made non-idempotent if you decide to code something that alters the state of a resource significantly whenever it is read. When you have multiple readers and writers, GET will not be idempotent "automatically", as two consecutive reads can give you two different representations because the resource might have changed in between (again, unless the resources are idempotent).
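
To make the DELETE point concrete, here is a minimal sketch in Python. The `VersionedStore` class and its method names are hypothetical, invented for this illustration only; it is a toy in-memory store, not any real REST framework. It contrasts deleting a specific version (repeating the call changes nothing) with deleting "the latest version" (repeating the call hits whatever happens to be latest at that moment).

```python
class VersionedStore:
    """Toy in-memory resource store; every write creates a new version."""

    def __init__(self):
        self._versions = {}   # resource name -> list of live version numbers
        self._counter = {}    # resource name -> last version number handed out

    def put(self, name):
        number = self._counter.get(name, 0) + 1
        self._counter[name] = number
        self._versions.setdefault(name, []).append(number)
        return number

    def delete_version(self, name, number):
        """Idempotent: names one specific version, so repeats change nothing."""
        live = self._versions.get(name, [])
        if number in live:
            live.remove(number)

    def delete_latest(self, name):
        """Not idempotent: 'latest' is re-resolved on every call."""
        live = self._versions.get(name, [])
        if live:
            live.pop()

    def versions(self, name):
        return list(self._versions.get(name, []))


store = VersionedStore()
store.put("order")                  # creates version 1
store.put("order")                  # creates version 2

store.delete_version("order", 2)
store.delete_version("order", 2)    # retry: nothing more happens
after_specific = store.versions("order")

store.put("order")                  # another writer creates version 3
store.delete_latest("order")        # removes version 3
store.delete_latest("order")        # retry removes version 1, a different thing
after_latest = store.versions("order")
```

After the two specific deletes only version 1 is left; after the two "latest" deletes nothing is left, because the retry removed a version the caller never meant to touch.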

REST is not different from other styles in this respect. For instance, you can do object orientation in C, but working in an OO language encourages object orientation (the opposite is also true: using an object-oriented language does not guarantee that you get an object-oriented design).

At the end of the day, architects should still think about the design if they want to ensure the results match the quality attributes they want to achieve. Some environments/styles/tools will make some quality attributes easier to achieve, but nothing will solve the problems for you.



 
Tags: Everything | OO | scalability | SOA | Software Architecture | REST

September 7, 2007
@ 02:07 PM
In the previous post on the subject I wrote that the RDBMS is dead. I didn't mean that it is dead dead, but rather that it isn't well built to meet some of the newer challenges like linear scalability, high availability etc.
Well, it is one thing hearing it from me, and another thing hearing it from someone like Michael Stonebraker.
Michael was the main architect of the Ingres prototype project at UC Berkeley, just one year after Codd's paper (9 years before Oracle was released and more than a decade before the commercial version of Ingres was released).

Well, that was in 1970. In 2007 Michael wrote:
 
In short, the world of 2007 is radically different from the world of the late 1970s. However, none of the major vendors have performed a complete redesign to deal with this changed landscape. As such they should be considered legacy technology, more than a quarter of century in age and "long in the tooth".
Among the new needs Michael cites are intelligence DBMSs (which need a lot of relations), textual and semi-structured data etc. He also said (promoting his own product) that 2007 customers expect high availability and linear scalability.
Michael's main point is that specialization can provide significant performance enhancements vs. the one-size-fits-all approach of RDBMSs. He gives his product (Vertica) as an example of how a column-oriented database (vs. the RDBMS row orientation) can outperform RDBMSs by a factor of 50. Google's Bigtable is another example.

Interesting...


 
Tags: BI | data | Everything | scalability | Software Architecture

September 4, 2007
@ 11:03 PM
When I began writing SOA Patterns, the first version of chapter 1 was a general introduction to Service Oriented Architecture from the perspective of software architecture. When the editors saw the patterns chapters they felt the chapter wasn't focused enough on patterns, so I rewrote it until it finally molded into the current version.

Nevertheless, I think that the first version has value on its own providing some guidance on the influences on architecture and putting SOA in an architectural context. I will probably edit it a little over the next few days so that it would be standalone (i.e. disconnected from the book). Meanwhile you can download the original version from here



 
Tags: Everything | SOA | SOA Patterns | Software Architecture | Papers

Another great presentation at Architecture & Design world was Neal Ford's presentation on Domain Specific Languages (DSLs) . As the title suggests Neal gave examples both in static languages (Java) and Dynamic ones (Groovy, Ruby).

One interesting observation Neal made was that humans tend to create DSLs in real life whenever they (ok, we :)) have any non-trivial interaction or behavior. Neal gave several examples, such as Starbucks order taking ("Venti Iced Decaf with whip..."), musicians and a few others.

The next important point was contrasting ("classic") APIs and DSLs. The main difference is that in a DSL the context is implicit and not repeated in every statement.

A key  technique for building DSLs Neal mentioned was Fluent Interfaces. Fluent Interfaces means modeling the API so that lines of code are readable English-like sentences. The fluency comes from the easier readability by the interface user.

Fluent Interfaces, now that's a novel idea - what would a fluent interface look like, hmm, wait, I have an idea. Here are 3 samples that come to mind

DIVIDE x BY z GIVING y ROUNDED

INSPECT data REPLACING ALL "foo" BY "bar"

READ someFile AT END SET eof TO TRUE


If you haven't guessed, the statements above are in... Cobol (by the way, pardon the caps; that's the Cobol convention).
So ok, it isn't a new idea, but it is interesting to see it making a comeback.
Anyway, one area where we see a lot of fluent interfaces emerging is configuration (mainly as an alternative to those lengthy XML files). For instance, the following is an excerpt of configuring Restlet components (taken from my Edge Component pattern paper):

Builders.buildContainer()
            .addServer(Protocol.HTTP, portNumber)
            .attachLog("Log Entry")
            .attachStatus(true, "webmaster@mysite.org", "http://www.mysite.org")
            .attachHost(portNumber)
            .attachRouter("/orders/[+")
            .attach("/getAll$", getAllRestlet).owner()
            .attach("/getLast$", getLastOrderRestlet).owner().start();


Note that this example also uses another fluent interface/DSL technique which is method chaining.
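
Here is a minimal sketch of how method chaining is usually wired up, written in Python. The `ServerConfig` class and its `listen_on`, `log_to` and `route` methods are hypothetical names made up for this illustration; this is not the Restlet API. The whole trick is that every configuration method returns the builder itself, so calls can be strung together into something that reads like a sentence.

```python
class ServerConfig:
    """Hypothetical fluent builder, for illustration only (not the Restlet API)."""

    def __init__(self):
        self.port = None
        self.log_name = None
        self.routes = []

    def listen_on(self, port):
        self.port = port
        return self          # returning self is what makes chaining possible

    def log_to(self, name):
        self.log_name = name
        return self

    def route(self, pattern, handler):
        self.routes.append((pattern, handler))
        return self


def get_all_orders():
    return "all orders"


def get_last_order():
    return "last order"


# The context (which server we are configuring) is stated once and then implied.
config = (
    ServerConfig()
    .listen_on(8080)
    .log_to("Log Entry")
    .route("/orders/getAll", get_all_orders)
    .route("/orders/getLast", get_last_order)
)
```

Compare this with the non-fluent alternative, where `config.` would be repeated at the start of every statement.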

Dynamic languages make it even easier to write DSLs since they provide a lot of extension capabilities (see my previous post on OCP in Ruby), are less strict about types, allow reopening classes etc.
One example of a Ruby DSL is RSpec, a framework to support Behavior Driven Development (BDD) in Ruby. The example below shows an excerpt defining specifications for an eight-ball game:

require 'eight_ball'

describe Eight_ball do
    before(:each) do
        @eight_ball=Eight_ball.new
    end
    .
    .
    .
    it "should lose if 8-ball sinked in pocket other than called" do
        [1,2,3,4,5,6,7].each { |val| @eight_ball.sink(:player => "Player1", :Ball => val) }
        @eight_ball.call(:player => "Player1", :pocket => :upper_left)
        @eight_ball.sink(:player => "Player1", :Ball => 8, :pocket =>:middle_left)
        @eight_ball.game_status.should == :ended
        @eight_ball.player_status("Player1").should == :lost
    end
end
 
By the way, Joe Ocampo built a very nice port of rbehave (another Ruby BDD framework) to .NET 3.5 by extending NUnit; the port has a very Ruby-like syntax.
      

Anyway, the DSLs demonstrated by Neal provide a very good example of the difference between dreaming big and actually doing stuff in the small. The counterexample is "Software Factories". As I wrote here about a year and a half ago:
Software Factories is not a new idea  - see for example "Software reuse: From Library to Factory" by M. L. Griss  (published in 1993(!)) which talks about "Software Factories" and "Domain Specific kits": components, frameworks, glue languages etc.  The current Microsoft  incarnation of Software Factories takes a similar approach focusing on Domain Specific Languages, Frameworks but also adding important aspects like multiple viewpoints, patterns and designers. The idea is that  building on modern technologies, as well as learning from the mistakes from sister approaches to code generation (OMG's MDA, in case you are wondering) will enable us to build something that is useable.

Microsoft seems to be taking some steps in the right direction (GAT is probably the best example). Nevertheless there is still a long way to go before we can realize the dream of "factories" for vertical applications


Unlike small code-based DSLs, the modeling-based approaches of software factories, MDA etc. aim too high and thus provide much less value or suffer too much from the generation gap (the generated code is too generalized or too far from the actual needs of the solution). Another problem with software factories/MDA DSLs is the modeling (i.e. diagrams). They say a picture is worth a thousand words. This is true if you treat models as sketches: you can raise the level of abstraction as much as you want and convey ideas with less clutter. However, when you need to make the model specific enough to allow code generation, you get to a stage where it is more convenient to do it in code and rely on a generated or pre-built DSL or framework.

Lastly you can  download Neal's presentation (in PDF form) from his site


 
Tags: .NET | A&D2007 | Design | Everything | ruby | Software Architecture | Java

August 21, 2007
@ 02:58 PM
OK, now that I've got your attention: no, it isn't dead yet, but we can see a whole class of applications (maybe a couple of classes) where the importance of the RDBMS as we know it today is greatly diminished.
In an article I posted recently on InfoQ (which I also mentioned in the post on eBay architecture last week) I discussed the notion of database denormalization on internet-scale sites (such as Amazon, eBay, Flickr etc.). One case for denormalization is immutable data, where there isn't a lot of gain in normalization to begin with.
The other is entity representation vs. speed. The problem is that joins are slow, and sometimes you get to corners where, if you want any kind of decent speed, you need to denormalize. Todd Hoff notes that as well:
The problem is joins are relatively slow, especially over very large data sets, and if they are slow your website is slow. It takes a long time to get all those separate bits of information off disk and put them all together again. Flickr decided to denormalize because it took 13 Selects to each Insert, Delete or Update.
The point is, however, that these "corner cases" are becoming more and more prevalent even in smaller-scale applications, especially when you have complex entities (as is the case with defense systems, for example). Mats Helander recently wrote a post about saving entities to a blob and only adding fields as needed for indexing and identity purposes. Mats also suggests the semi-transparent way of using XML columns, where the database can do something with the otherwise opaque data.
This, in fact, demonstrates that the future of relational data is indeed not totally secure: the leading databases are beginning to treat XML data (which is hierarchical and not relational) as a native citizen, to the point that we can even index XML data.

So far we've seen a trend to denormalize more and to handle non-relational data. What else? Ah, transactions.
I've worked on several systems where the data was constantly updated and actually gave the system's representation of the world outside (of the system); the focus there was on availability and latency. This is also aligned with the approach taken by the large internet sites, which emphasize eventual consistency over immediate consistency.
In distributed systems crashes happen. The RDBMS is a show-stopper when it comes to crashes: if we can't commit, we need to stop and roll back, and only then can we maybe start over. Is this acceptable? There are many scenarios where it is not. I've seen it in defense systems, in communications systems and even in e-commerce systems (if you are not responsive, I'll just go to the competition).
What do you do in the presence of errors? Joe Armstrong suggests the following as the basis for Erlang in his thesis:
To make a fault-tolerant software system which behaves reasonably in the presence of software errors we proceed as follows:

1. We organize the software into a hierarchy of tasks that the system has to perform. Each task corresponds to the achievement of a number of goals. The software for a given task has to try and achieve the goals associated with the task. Tasks are ordered by complexity. The top level task is the most complex, when all the goals in the top level task can be achieved then the system should function perfectly. Lower level tasks should still allow the system to function in an acceptable manner, though it may offer a reduced level of service. The goals of a lower level task should be easier to achieve than the goals of a higher level task in the system.

2. We try to perform the top level task.

3. If an error is detected when trying to achieve a goal, we make an attempt to correct the error. If we cannot correct the error we immediately abort the current task and start performing a simpler task.
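
The three steps above can be sketched roughly as follows. This is Python rather than Erlang, and the function names (`run_with_fallback`, `live_prices`, `cached_prices`, `no_prices`) are made up for this illustration. The sketch also simplifies step 3: it does not attempt to correct an error, it goes straight to the next, simpler task.

```python
def run_with_fallback(tasks):
    """Try tasks from most to least complex.

    Returns (name, result) for the first task that succeeds.
    """
    errors = []
    for name, task in tasks:
        try:
            return name, task()
        except Exception as exc:
            # Abort this task and fall through to the next, simpler one.
            errors.append((name, exc))
    raise RuntimeError(f"all tasks failed: {errors}")


def live_prices():
    # Top level task: full service, but it depends on a remote system.
    raise ConnectionError("pricing service unreachable")


def cached_prices():
    # Simpler task: reduced level of service from possibly stale data.
    return {"widget": 9.99}


def no_prices():
    # Simplest task: the system still functions, without prices.
    return {}


level, prices = run_with_fallback([
    ("live", live_prices),
    ("cached", cached_prices),
    ("none", no_prices),
])
```

Here the top level task fails, so the system ends up running at the "cached" level, which is degraded but still acceptable.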

On top of that, we try to keep any update local, i.e. within a task boundary on the hardware where the task occurred; distributing the transactions is not a good option. I outlined why when I talked about SOA and cross-service transactions, but the reasoning holds here as well.

Well, truth be told, the RDBMS is not dead, and its demise is probably not even around the corner. This also does not mean that there aren't any uses for a database. But that's true for other architectural choices too: whoever said that a single-tier solution is not the right one for very specific types of systems...
RDBMSs succeeded in becoming the de facto standard for building systems because they offer some very compelling attributes: ACID brings a lot of peace of mind. Large-scale systems, low-latency systems and fault-tolerant systems opt for another set of compelling attributes (BASE). The point is that when you design your next solution, maybe conventional database thinking is something you should at least give another thought to, instead of just following dogma.


 
Tags: data | Design | Everything | scalability | Software Architecture

One of the most interesting presentations in Architecture & Design world was the eBay Architecture presentation by Randy Shoup and Dan Pritchett. The presentation was only one hour long, so Randy and Dan didn't cover all the topics in the slides. Here are some of the insights I took from this presentation.

Architecture evolution  -  eBay actually went through several architecture revolutions. Their initial architecture cannot even begin to scale to their current loads. It was, however, a very good fit for their initial quality attributes - specifically, the emphasis on time to market and costs.  This shows the importance of balancing quality attributes. Sure an architectural change is painful but if they'd future proofed too much I doubt they would ever get something working.

V2 demonstrated that a traditional 3-tier architecture would only scale so far. It was nice to see how it evolved, though. Also, with the move from version 2.4 to 2.5 and later to 3 we see eBay learning about CAP the hard way. In its final (current) incarnation eBay's data architecture prefers partitioning and availability over consistency. This doesn't mean they forgo consistency altogether, just that they trade the comfort zone of ACID transactions for the BASE approach, where BASE stands for Basically Available, Scalable/Soft state & Eventually Consistent.
eBay partitions its data on two levels: the first is an SOA-like division by business areas (users, items etc.) and the second is a horizontal partitioning based on access paths. This BASE approach to data was dubbed sharding by Dathan Pattishall (from Flickr and Friendster) (via HighScalability). This approach means things like heavy partitioning, no distributed transactions (also see below), denormalization etc. (you might also want to read the item I wrote on denormalization in InfoQ yesterday).
The bigger implication here is that when it comes to internet scale, the database loses its importance, or as Bill de Hora nicely puts it:
The use of RDBMSes as data backbones have to be rethought under these volumes; as a result system designs and programming toolchains will be altered. When the likes of Adam Bosworth, Mike Stonebraker, Pat Helland and Werner Vogels are saying as much, it behooves us to listen.

As I said, the data architecture of eBay is SOAish: they partitioned their components and data along business lines, and they apply many SOA principles. They don't, however, unite data and components to create a service, and they don't (seem to) have the same contract boundaries that SOA promotes (Randy told me that they are currently contemplating SOA).

Returning to the statement that eBay does not use transactions: "no transactions" seems very controversial, but if we just consider some of the points I made on transactions between services in previous posts, it is the only logical way to ensure scaling. By the way, as can be expected, they do use transactions when they are local (e.g. if the users table is spread over a couple of tables, both will be updated together).

The application layers also follow the segmentation by business areas. eBay caches metadata/immutable data as much as possible and keeps the application stateless (i.e. state comes from the client/DB), e.g. they don't use sessions. The DAL virtualizes the horizontal partitioning mentioned above for the rest of the code.
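
To illustrate what "virtualizing" the partitioning could look like, here is a minimal Python sketch. It is not eBay's design: the `ShardedDal` class, the shard counts and the hash-based routing are all assumptions made for the example (the presentation described partitioning by access paths, and did not go into the routing logic). The point is only that callers name a business area and a key, and never see which shard is used.

```python
import zlib


class ShardedDal:
    """Hypothetical DAL: callers pass an area and a key; shard choice stays internal."""

    def __init__(self, shards_per_area):
        # First level: one group of shards per business area.
        # Each shard is a plain dict standing in for a separate database.
        self._shards = {
            area: [dict() for _ in range(count)]
            for area, count in shards_per_area.items()
        }

    def _shard_for(self, area, key):
        shards = self._shards[area]
        # Second level: horizontal partitioning by key.
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(key.encode("utf-8")) % len(shards)
        return shards[index]

    def save(self, area, key, record):
        self._shard_for(area, key)[key] = record

    def load(self, area, key):
        return self._shard_for(area, key).get(key)


dal = ShardedDal({"users": 4, "items": 8})
dal.save("users", "alice", {"rating": 42})
dal.save("items", "item-1001", {"title": "Vintage lamp"})

loaded_user = dal.load("users", "alice")
missing = dal.load("items", "no-such-item")
```

Note that each `save` touches exactly one shard, which fits the "local transactions only" rule: there is nothing here that would need a distributed transaction.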

It was also interesting to learn that eBay developed its own messaging infrastructure, though Randy and Dan did not provide a lot of details on that.

Development process - It seems that eBay is using some hybrid of feature-driven development with waterfall (i.e. the development is feature by feature, but the development of a feature is waterfallish). They do have a constant delivery rate, which they synchronize using the concept of a train: if you have a feature, it will be added to the train that is scheduled to arrive around the time your feature will be ready. Several features are delivered as a package, which gives a predictable (weekly) delivery schedule. I guess it also gives them some nice metaphors to use, such as a feature that doesn't make it "misses the train", or "the train leaves on time" etc.

The slides of the presentation can be downloaded from Dan Pritchett's site (they are not from the same event but they are pretty much the same slides). You can also read Elliotte Rusty Harold's account of the presentation.
 
Tags: A&D2007 | Everything | SOA | Software Architecture

August 1, 2007
@ 09:52 PM
I won't say anything about my presentations (that's for others to say :) ). The point of this post is just to let you download them. So here they are:
  • SOA Patterns (2.14mb) - Takes a look at different strategies (patterns) to solve common SOA pitfalls
  • Getting SPAMMED for architecture (4.56mb) - Takes a look at the activities architects can/should do when they think about software architectures. The presentation also covers architecture in agile projects.


 
Tags: .NET | A&D2007 | Agile | Everything | SOA | SOA Patterns | Software Architecture | SPAMMED Process

July 22, 2007
@ 08:16 AM
I've got a few emails from people who asked about the commit and consensus protocols (Paxos, 3PC etc.) mentioned in the previous post.
I was going to write something up, but then I found a very good overview by Mark Mc Keown, which provides "A brief history of Consensus, 2PC and Transaction Commit". This short writeup provides links to the most important papers, like Fischer, Lynch and Paterson's (FLP) "Impossibility of distributed consensus with one faulty process" and "Consensus on Transaction Commit" by Jim Gray (who, unfortunately, was lost at sea earlier this year) and Leslie Lamport. In addition to Mark's post and all the papers there, you may want to check out the presentation Jim and Leslie made on Paxos Commit.
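
For readers who just want the gist of 2PC before diving into the papers, here is a deliberately naive Python sketch. The `Participant` class and `two_phase_commit` function are illustrative names of my own; there is no persistence, no timeouts and no recovery, which are exactly the parts the papers are about. In particular, if the coordinator dies between the two phases, prepared participants are left waiting.

```python
class Participant:
    """Toy transaction participant with an in-memory state only."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote. A real participant would persist its vote
        # and hold its locks from here until it hears the outcome.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if everyone committed, False if the transaction was aborted."""
    votes = [p.prepare() for p in participants]   # phase 1: collect votes
    if all(votes):
        for p in participants:                     # phase 2: commit everywhere
            p.commit()
        return True
    for p in participants:                         # phase 2: abort everywhere
        p.abort()
    return False


orders, billing = Participant("orders"), Participant("billing")
ok = two_phase_commit([orders, billing])

stock = Participant("stock", can_commit=False)
shipping = Participant("shipping")
failed = two_phase_commit([shipping, stock])
```

A single "no" vote is enough to abort everyone, including participants that were ready to commit.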

By the way, if, on the other hand, you already know all this (and a little more), then check out Werner Vogels' (Amazon CTO) blog, as there's an interesting job opening for you there.


 
Tags: Everything | Software Architecture

Following the previous post I had a chance to exchange a few emails with Mark Little (the director of engineering in the JBoss division of Red Hat). Mark thinks the topic of transactions and SOA has been beaten to death already and wonders why it needs to resurface (see his post "Is anyone out there?"). I don't see a problem with discussions resurfacing when new people are faced with situations others have already solved (but that's a matter for another post).

Anyway, the reason we're here is that during this conversation Mark made a few interesting observations, and I think the end result is pretty interesting. I decided (with his permission) to post it here. (It is only minimally edited: no deletions, a few additions (in []) and a few time shifts to make it more coherent as a single conversation.)

Mark: From what I can see it's [the arguments on transactions and services are] the same old arguments that have gone round and round, ignoring the important fundamental issues and not doing enough background research.
Sagas are transactional - it's just an Extended Transaction model and not an ACID transaction model. Don't get hung up on the word "transaction", which is way too overloaded in our industry to actually mean anything by itself. Plus, 2PC is a consensus protocol too; it does not impose any other aspect of ACID than the A. Even the D is optional until/unless you want to tolerate failures.

Arnon: I know this is an old argument, but that doesn't mean it isn't worthwhile.

Mark: It isn't worthwhile if people aren't going to listen ;-) I've been involved in these debates so many times over the past 7 years (for Web Services transactions) and longer for extended transactions, that it gets a bit old after a while. Maybe we should create a wiki page and point people at that ;-)?

Arnon: I guess, but you should keep in mind that people who are solely in the .NET camp only got WS-AT recently with Windows Communication Foundation, so you can expect the issues to resurface. By the way, a wiki might not be a bad idea.

[regarding 2PC.] 2PC is a distributed consensus protocol and in principle doesn’t have to be related to ACID transactions. But I think the common view and use of it is for ensuring distributed ACID behavior. Looking back at my experience with XA and COM+ transactions, it seems it does a good job at achieving this ACIDness.


Mark: This is an education issue. The literature is clear on this. People who know and understand transactional protocols don't make the mistake of equating 2PC to ACID properties.

Arnon: Yes, it is an educational issue. But I am not sure that it is that common knowledge. Middleware vendors who build the tools to support these protocols are expected to understand it better; I don’t think it is that widely known outside these circles. Most of the architects I’ve met don’t (maybe it's time to look for new friends ;-))

By the way, as 2PC is not resilient to failures of the coordinator, in a highly distributed environment like SOA it might have been a better idea to go with Paxos commits, if you go down that path at all.

Mark: The reason WS-AT and WS-ACID chose 2PC is: interoperability. All TP monitors support it. Try getting IBM, MSFT, Oracle, BEA etc. to change to Paxos, 3PC, flat-commit, or anything else and you'll be waiting for the heat death of the universe.

Arnon: Can’t argue with that.

Mark: [also] 2PC is resilient to failures if the coordinator eventually recovers. Paxos has its own failure assumptions too: Jim never disputed this. Same as 3PC and other consensus protocols. As with *any* fault tolerance approach (transactions, recovery blocks, replication, etc) it's always probabilistic. All we're doing is making it highly unlikely that the system cannot complete, but we can never make it entirely safe. Even in the airline industry they can "only" go to a probability that failures happen .000000001 ;-)

Arnon: You are probably right that in SOA situations the chances of not getting an ACID transaction are worse than in a controlled environment, which actually makes the situation even worse, since people using WS-AT perceive it as allowing them ACID interaction (e.g. Juval's podcast).


Mark: WS-AT is *all* about ACID, in the same way WS-ACID is about ACID transactions. It is *nothing* to do with SOA though. Web Services are not purely the domain of SOA implementations!

Arnon: I totally agree that Web services and SOA are not directly related and can each exist independently of the other. Again, this is an educational issue, but SOA==Web-services is a very common misconception (I guess the word “service” in web-service doesn’t help ;-) )

[in any event] I think distributed transactions in general should be used carefully, period.


Mark: Absolutely. They are not a global panacea and people who push them as such do more harm than good.

Arnon: WS-AT is more problematic than regular distributed transactions, as by definition in an SOA you do not know who and how many other services will participate in your transactions, so you are much more likely to run into problems.

Sagas, which embrace the temporal shift, don't give an illusion of ACIDness and allow you to focus on achieving distributed consensus while keeping all parties involved consistent. I think that it is a much better option if you need transaction-like behavior.

Mark: For SOA, yes. Although Sagas are only good for a certain type of use case. That's why we've always tried to develop "live documents" that allow people to add new models when/if needed. With a couple of exceptions during the BTP days, there has always been consensus that one size does not fit all (http://www.webservices.org/weblog/mark_little/blackadder_and_the_micro_kernel_approach_to_web_services_transactions).

Web Services *anything*, whether it's WS-AT, WS-Sec, or WS-Addressing, all have their non-SOA aspects because Web Services aren't developed purely with SOA in mind. If that were to happen then Web Services as a technology would lose some of their important benefits immediately.

Arnon: This whole discussion is in the context of SOA (at least from my side); naturally there’s a place for ACID transactions for other uses.

Regarding Sagas - calling them "Extended transactions that are not ACID" is just semantics - my point was that they are not ACID transactions. I think most people equate transactions with ACID transactions as well (but I may be wrong).

Mark: Many people do and that again is an education problem. The term Extended Transactions (don't need to say "that are not ACID") has a well defined meaning in the R&D community. There have been many good models and implementations around Extended Transactions. They really took off in the vendor community through the Additional Structuring Mechanisms for the OTS, back in the 1990's. If you check that out you'll see that it formed the basis of WS-TX and WS-CAF. Even in Jim's original technical report he discussed relaxing all of the ACID properties in a controlled manner to get more flexibility. That was the first extended transaction. In fact, ACID transactions are just one type of extended transaction. There are many many others, including nested transactions, coloured actions, epsilon transactions, sagas etc.

Unless I qualify it beforehand, I try to never use the term "transaction" in isolation because it has different meanings to different people. For example, when talking to developers working in trading infrastructures, a "transaction" isn't an ACID transaction at all. In telcos it's different again.


 
Tags: Everything | SOA | Software Architecture

Evan H asked a question about distributed transactions and services in the MSDN architecture forum:

Are distributed transactions (ie.. WS-Transaction) a violation of the "Autonomous" tenant of service orientation?   Yes or No and Why?  Kudos if you can address concurrency and scalability (in an enterprise with multiple interacting services).

I answered this question back in April when I wrote a couple of posts that explained why cross-service transactions are a bad idea: cross service transactions and some more thoughts on cross service transactions.
Roger Sessions also agrees with this view (well, actually, it seems he wrote about it well before I did :) ):
When the WS-Transaction specification was first proposed, back in 2002, I wrote an article explaining why I thought the idea of allowing true transactions to span services was a bad idea. I published the article in The ObjectWatch Newsletter, #41: http://www.objectwatch.com/newsletters/issue_41.htm. Nothing since then has changed my mind. Atomic transactions require holding locks, and spanning transactions across services requires allowing a foreign, untrusted service to determine how long you will hold your very precious database locks. Bad idea. Just because IBM and Microsoft agreed on something doesn't make it good!

The reason I am bringing this issue back is that Juval Lowy (who wrote the article that triggered my first post on the subject) has recorded an ARCast with Ron Jacobs, where he reiterated the idea that "Transactions is categorically the only viable programming model" and that you should strive to use it whenever you can. It seems Juval admits you sometimes need to use Sagas (which he called "long running transactions"; you can see in my link why I think that's the wrong name). He also agrees that you can use a transactional transport and then only do internal transactions from each service to the transport (a pattern I call "Transactional Service"). However, at the end of the day, he still thinks you should use WS-AtomicTransaction whenever you can.
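
For those who haven't met Sagas before, here is a minimal Python sketch of the idea: each step is a local action paired with a compensating action, and when a step fails the already completed steps are undone in reverse order. The `run_saga` helper and the order/stock/card step names are hypothetical, made up for this illustration; in a real SOA each action would be a message to a service that performs its own local transaction, and compensations would also have to cope with failures.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    On failure, compensate the completed steps in reverse order
    and return False; return True if every step succeeded.
    """
    done = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(done):
                undo()
            return False
        done.append(compensate)
    return True


log = []


def reserve_stock():
    log.append("stock reserved")


def release_stock():
    log.append("stock released")


def charge_card():
    raise RuntimeError("card declined")


def refund_card():
    log.append("card refunded")


succeeded = run_saga([
    (reserve_stock, release_stock),
    (charge_card, refund_card),
])
```

Note that no locks are held across steps: between "stock reserved" and "stock released" other parties can see the intermediate state, which is exactly the isolation that Sagas give up.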

I agree that transactional programming is important. I think it is the simplest programming model (from the developers side). I would probably never write an interaction with a database that is not transactional; I look very favorably at initiatives for in-memory ACI (no Durability) transactions such as the one Ralf talks about.  Until we get to Distributed Transactions...

First, we should note that transactions are not "the only viable" option.As Martin Fowler notes Ebay seems to be doing fine without distributed transactions. Not only that, they abandoned distributed transaction and went "transactionless"because they needed one simple thing... Scalable performance .

In most COM+ scenarios you have a single server or a few internal servers where the distributed transactions happen - and even there you should plan your transactions carefully if you want to get any kind of decent performance. In SOA scenarios the situation is more complicated, as the level of distribution is expected to be higher (even if you don't involve services from other companies). More distribution means longer times to complete transactions (especially if a participant can flow the transaction and extend it). It also increases the chances of failure (see Steve Jones's series of posts on five nines for SOA). In my opinion, the more distributed components you have, the more you want their interaction to be decoupled in time - i.e. the opposite of transactions.
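
To make the failure argument concrete, here is a back-of-the-envelope sketch. It assumes (a simplification for illustration, not a claim taken from the posts linked above) that participants fail independently and that a distributed transaction can only complete when every participant is available. The function name and the 99.9% figure are made up for the example.

```python
def combined_availability(per_service: float, participants: int) -> float:
    """Probability that all participants are up at once, assuming independence."""
    return per_service ** participants


if __name__ == "__main__":
    # Each participant is assumed to be available 99.9% of the time.
    for n in (1, 2, 5, 10):
        print(f"{n:2d} participants -> {combined_availability(0.999, n):.4%}")
```

Under these assumptions every participant you add to the transaction lowers the chance that the whole thing can complete, which is the intuition behind decoupling the interaction in time.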

Juval also said he doesn't buy the denial of service problem I mentioned (supporting a transaction means you allow locks - if an external party doesn't commit, you retain the lock). Juval said he assumes that a solution has both authentication and authorization, so this shouldn't be an issue. For one, I have seen too many projects where security was neglected or quickly patched in at the last moment - so I would hardly assume security. And even with security on, you increase your attack surface.
But that's just half of it. Even if all your service consumers have good intentions, you still don't know anything about their code. SOA is not like the "good old days" where you owned the whole application - this means you cannot trust their security to be ample. You also don't know anything about their code quality. Services are likely (in the general case) to be deployed on different machines, even if they start co-located. I think that a service boundary should be treated as a trust boundary, just like a tier boundary. I strongly believe you should make reduced assumptions about what's on the other side of the service's boundary - and transactions are not reduced assumptions.

SOA and distributed transactions do not go hand in hand - it isn't just autonomy at stake here. It is a problem for performance, scalability and even security. Period.

To finish this post - I would also highly recommend looking at Pat Helland's paper "Life Beyond Distributed Transactions: an Apostate's Opinion" and a post he recently made called "SOA and Newton's Universe", where he explains more eloquently than I ever could why SOA is not a good fit for distributed transactions.



 
Tags: .NET | Everything | SOA | SOA Patterns | Software Architecture

July 9, 2007
@ 10:55 PM
Steve Jones has (yet another) great post called "Le Tour SOA - why support services are critical, but not important".
You should go read the article - but in a nutshell, Steve explains that important services are the ones that bring business value, and critical services are the supporting ones that keep the lights on so the important services can function properly.

While the post has SOA in the title, I think it is more general and also applies to applications or any other IT-generated components. In fact, it can also apply to IT itself, as Nicholas Carr noted in 2003 when he published his paper "IT doesn't matter". Nicholas argues that IT will become akin to electricity and as such be critical for the business to continue operating, but not important. As a side note, I think this might be true for traditional businesses, but not for businesses where IT is the business (such as banks, insurance companies, etc.).

Back to critical vs. important - I think it is important for architects to make this distinction so they can prioritize work and not confuse business value with the semblance of business value that comes from being critical to operations. This doesn't mean you can neglect critical tasks (after all, they are critical...), but it is the important stuff that will bring your business its competitive edge.


 
Tags: Everything | SOA | Software Architecture

July 2, 2007
@ 01:28 PM
A couple of quick observations following the Events and temporal coupling post:

Events, current data and aggregated data all have time-to-live (TTL) aspects:
  • An event's value usually diminishes over time until its TTL is reached
  • Current data usually has a constant value while its TTL lasts (until a new value becomes the current data) - unless we are talking about versioned data, which is a component of, or a step in the direction of, aggregated data
  • Aggregated data has the longest TTL; it is interesting to note that its value increases over time
Also, while the TTL of current data is determined by the producer, the TTLs of both events and aggregated data are determined by consumers.

Yeah, I know these are not earth-shattering observations, but I still think they are interesting.
 


 
Tags: Everything | General | Software Architecture

June 28, 2007
@ 03:23 PM
You raise an event when something interesting happens to you. You think it is important, but you don't care enough to know who is interested. You are even less interested in personally going to each and every interested party and letting them know. So instead, you raise an event and let the poor buggers take care of any implications by themselves. We raise the event "now", when the change happened - it is only important now anyway...


From the point of view of the "poor buggers" - the event consumers - things are more complicated. There are events which are cyclic in nature, like stock price updates, the blips from a sonar, etc. If you missed one, it isn't really important, since you'd get the right information in the next update (actually, that isn't entirely true - see later in this post). Then there are events which only occur once. Sometimes it isn't important for you to listen to them if you are not up and running at the same time. Other times you can't afford to lose an event: for instance, if your ordering service (or component, for that matter) communicates with the invoicing one using events, you don't want to miss the event of a new order, or else you would lose money.

This basically means that the event producer and the event consumer are coupled in time. One way to solve it is to make sure both of these services are available at the same time, i.e. if invoicing crashed, then processing orders should be suspended (note that this doesn't mean that you don't accept orders, just that you don't process them).
OK - maybe we can just raise the event "transactionally". This would probably work, but we need to remember that the event producer doesn't really care about the event consumers, so why would it want to fail because of them?!
Maybe a better way would be to "raise" the event over some reliable transport. This has a few problems. One is that we've passed the problem to the connection between the event producer and the transport. It might be acceptable to have a transaction between the event producer and the transport. However, as I've already said, the producer doesn't care much about the consumers...
We can have persistent subscriptions for existing consumers to prevent events from getting lost. This creates both a minor problem, that new consumers can't see past events, and the risk that existing subscribers disappear and their queues then grow endlessly (or until an administrator removes the subscription).

OK, let's try to look at the problem from a different angle. Looking at the events, what we can really see is that an event has a time-to-live (TTL) as far as the event consumer is concerned. For instance, in the case of cyclic events the TTL is the interval until the next event. Actually, even with cyclic events the TTL might be larger if we are also interested in analyzing trends or abnormal occurrences (which is why I said it isn't entirely true that we don't care about old events). In the case of one-time events the TTL might be indefinite, or even then it might be some definite value (one day, week, year, etc.). Since we can't know the TTLs of consumers, it can be a good idea to make past events available somehow.

Thus, when you design an event-centric architecture like EDA (whether on top of SOA or not) it is important to think about event consumers. We don't want to think about specific consumers, since that negates the benefits of thinking in events, but I would say that you want to think about event consumers in general. After all, your component is also an event consumer (do unto others as you would have them do unto you).

One option, which I already talked about, is to make past events available as a feed. Event consumers can then come at their own leisure and consume past events (this can be in addition to raising the events in real-time). This provides a partial solution, as the maximal TTL is determined by the event producer (after which the event is deleted from the feed). This may be acceptable, but you must be aware of it.
The other option is to log all the events and provide an API to retrieve past events. In a sense the max TTL is still in the hands of the event producer, only if you use a database it would probably be a long time compared with a feed. Alternatively, the events can be logged by a central "always present" event aggregator (in a manner similar to the aggregated reporting pattern I described for SOA).
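
As a rough illustration of the two options above, here is a minimal in-memory sketch of a producer-side event log that keeps events for a producer-chosen maximum TTL and lets consumers catch up at their own pace. Everything in it (the class name EventLog, the events_since method, the injected clock) is hypothetical and only meant to show the shape of the idea; a real feed or database-backed log would look different.

```python
import time
from typing import Any, Callable, List, Optional, Tuple

# (sequence number, timestamp in seconds, payload)
Event = Tuple[int, float, Any]


class EventLog:
    """Hypothetical producer-side log that retains events for max_ttl seconds."""

    def __init__(self, max_ttl: float, clock: Callable[[], float] = time.time) -> None:
        self.max_ttl = max_ttl
        self._clock = clock          # injectable so the sketch is easy to test
        self._events: List[Event] = []
        self._next_sequence = 0

    def publish(self, payload: Any) -> int:
        """Record an event and return its sequence number."""
        sequence = self._next_sequence
        self._next_sequence += 1
        self._events.append((sequence, self._clock(), payload))
        return sequence

    def events_since(self, last_seen: Optional[int] = None) -> List[Event]:
        """Return retained events newer than last_seen (None means all retained)."""
        self._expire()
        if last_seen is None:
            return list(self._events)
        return [event for event in self._events if event[0] > last_seen]

    def _expire(self) -> None:
        # The producer decides the maximal TTL: older events drop out of the log.
        cutoff = self._clock() - self.max_ttl
        self._events = [event for event in self._events if event[1] >= cutoff]
```

A consumer that was down for a while would call events_since with the last sequence number it processed. Anything older than max_ttl is gone, which is exactly the "maximal TTL is determined by the producer" limitation described above.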

To sum all this up - events seem to matter only at the instant in time they are created. We are used to that thinking from building OO systems, where all the components are co-located in the same address space and time (even there I can think of scenarios where we would want past events). In a distributed world events need to have a TTL; the TTLs can vary and are determined by the event consumers. Lastly, as I showed in the paragraphs above, there are several strategies we can use to help solve the event TTL dilemma (and there are probably a few others).


 
Tags: Everything | SOA | Software Architecture

June 24, 2007
@ 09:20 AM
A few months ago I wrote here about solving the mismatch between Service Oriented Architecture (SOA) and Business Intelligence (BI) (see the papers and articles section). Recently I got the following question from Ben:
One major question I have is around large data sets. As an experienced BI/DW architect and developer I have worked on a number of large scale data warehouses. Retrieving large data sets (i.e. millions of records) doesn't seem to fit well into SOA. As you state in your article, we could have another point-to-point interface, where the service which houses data we need gets a request and writes out a batch file (xml or plain ascii text). Then using typical ETL, we grab the file and load it. The underlying source system (service) can use optimization in generating a large data set (vs. record by record) and the data warehouse can correspondingly load in bulk.
Like most architectural questions - the answer is "it depends".
For instance, if you do a run-of-the-mill ETL as a one-time setup, then it is just that - a one-time setup - and I, personally, don't see any contradiction between that and SOA goals or tenets.

I do think that it is better to enhance SOA with EDA interactions to provide a long-term solution to the BI problem. You can also have a dedicated component that aggregates the information that flows in with these events and builds batch files that are suited for the ETL you used during the setup phase (mentioned above).
It is true, though, that moving an SOA which is already in place to EDA is no small feat, but adding EDA layers does not have to mean that the old interfaces go away - especially not immediately (remember to treat services as products).

If you have a business that generates millions of records on a daily basis, then the situation is more complicated. Now you have to think about the trade-offs between "compromising" SOA and adding a dedicated interface (or a backdoor to the database) for the ETL vs. the implications of performance, bandwidth, transition costs, ROI, etc. of pushing that information with EDA.
I personally believe in pragmatism and the "no-silver-bullet" approach, so I can't say that EDA is always the best solution (as an aside, this is part of the reason I write my book as patterns and not as "best-practices guidance"). You may find that ETL is the best trade-off in your situation. Yes, I know that this isn't a definitive answer - but real life is (usually) a little more complicated than black and white solutions. As architects we need to find the best trade-off for the situation at hand.


 
Tags: Everything | SOA | Software Architecture | BI

I thought I had this RESTful web services thing figured out, but following one of the threads on the Yahoo group on Service-Oriented-Architecture I came to the conclusion that maybe I don't.

Steve Jones tried to see if he understands REST by giving an example and that example was corrected by Anne Thomas Manes (who is a research director with the Burton Group which recently stated that the future of SOA is REST).
Here are the examples from the above mentioned thread:
POST http://example.org/customer
HTTP message body contains a representation of "anne"
server creates a subordinate resource called http://example.org/customer/anne

GET http://example.org/customer/anne
returns a representation of "anne"

GET http://example.org/customer/personByName?name=anne
returns a representation of "anne"
or perhaps returns the URI of the "anne" resource
or perhaps returns a list of URIs of all people named "anne"
might also be specified more simply as
GET http://example.org/customer?name=anne

GET http://example.org/customer/personByAge?age=27
returns a list of URIs of people whose age is 27
or perhaps returns a collection of representations of all people aged 27
might also be specified more simply as
GET http://example.org/customer?age=27

PUT http://example.org/customer/anne
HTTP message body contains a representation of "anne"
either creates a new resource called "anne" (if none exists)
or replaces the existing "anne" resource

PUT http://example.org/company/newco
HTTP message body contains a representation of "newco"
either creates a new resource called "newco" (if none exists)
or replaces the existing "newco" resource

If you prefer the server to assign the URI you would instead say

POST http://example.org/company
HTTP message body contains a representation of "newco"
server creates a subordinate resource called http://example.org/company/newco

POST http://example.org/customer/anne?addCompany=http://example.org/company/newco
this would append the newco company reference to the "anne" resource

You can see another example of what I am talking about on Jon Udell's blog, which gives an example from RESTful Web Services, by Leonard Richardson and Sam Ruby, covering how to do a transaction in RESTful style.

If all these are indeed "legal" or "correct" RESTful interactions, I have 2 observations to make.
First, I guess Pat Helland is right when he said "Every noun can be verbed", since I don't see the real difference between having a contract with a PersonsByAge request which returns a document* of Persons and a REST request like "GET http://example.org/customer/personByAge?age=27" or even "GET http://example.org/customer?age=27".

The second observation has to do with the so-called "uniform interface". I would argue that the resources and their attributes (age=27, name="anne") are the interface. The POST, GET, etc. uniform interface does not mean much more than the "uniform" SEND, BROADCAST interface of messaging.
Furthermore, if resources and their attributes are indeed "the interface", then not only does REST not have a uniform contract - it actually has a dynamic one, which changes at run-time as new resources are created - such as the "POST http://example.org/company" which creates a new resource "http://example.org/company/newco" in the example above.
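
To illustrate what I mean by a contract that changes at run-time, here is a toy in-memory sketch. It is not an HTTP server and not anybody's real API: the ResourceStore class, its method names, the status codes it returns and the use of a "name" field to build the new URI are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple


class ResourceStore:
    """Toy stand-in for a server that follows the POST/PUT/GET examples above."""

    def __init__(self) -> None:
        self._resources: Dict[str, dict] = {}

    def put(self, uri: str, representation: dict) -> Tuple[int, str]:
        # PUT: the client picks the URI; create the resource or replace it.
        created = uri not in self._resources
        self._resources[uri] = representation
        return (201 if created else 200), uri

    def post(self, collection_uri: str, representation: dict) -> Tuple[int, str]:
        # POST: the server picks the URI of the new subordinate resource.
        uri = f"{collection_uri}/{representation['name']}"
        self._resources[uri] = representation
        return 201, uri

    def get(self, uri: str) -> Tuple[int, Optional[dict]]:
        if uri in self._resources:
            return 200, self._resources[uri]
        return 404, None

    def known_uris(self) -> List[str]:
        # The set of addressable things a consumer could depend on right now.
        return sorted(self._resources)
```

The four verbs never change, but known_uris() returns a different answer after every successful POST or PUT, which is the part a consumer actually has to know about.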







* I think it is very important for SOA to have document-oriented messages and not RPC ones. I'll blog in a separate post about the differences. For now it suffices to say that the REST hypermedia notion of returning the URIs of all the relevant persons should also be present (one way or another) in a good document-oriented message, even if you are using WS-* or plain messaging as transport.
 
Tags: Everything | SOA | Software Architecture

In addition to the drafts of selected patterns I publish on my site, you can now purchase my book via the Manning Early Access Program (MEAP).
MEAP means you can get chapter drafts as I write them and the complete book when it's done (ebook or printed). Here is Manning's explanation:
"Buy now through MEAP (Manning Early Access Program) and get early access to the book, chapter by chapter, as soon as they become available. You choose the format - PDF or ThoutReader - or both. By subscribing to MEAP chapters, you get an opportunity to participate in the most sensitive, final piece of the publishing cycle by offering feedback to the author. Reader feedback to the author is welcome in the Author Online forum. As new chapters are released, announcements are made in the MEAP Announcement Forum. After all chapters are released, you will be able to download the complete edited ebook. If you order the print edition, we will ship it to you upon release, direct from the bindery, weeks before it is widely available elsewhere."
By the way, this is probably also a good time to mention that I'll be speaking about quite a few of the patterns at Architecture & Design World 2007, which will take place this July.

There is still a lot of work, but I would already like to thank all the people at Manning who helped me get this far, especially Cynthia Kane, my editor (hey, maybe now she'll give me more slack :) ).
Ok, 'nuff blubbering, back to completing chapter 5...


 
Tags: .NET | Everything | SOA | SOA Patterns | Software Architecture

Agile and documentation? What gives? First things first: documentation is not something that is prohibited by the Agile Manifesto. Working software is definitely preferred over "comprehensive documentation", but there can also be some value in documentation.

The first question is why we would want to document anything if we have working software. I think there are several stakeholders, like project newcomers, maintainers, etc., who will be interested in something that will let them get up to speed and give them an overview of what's going on before they delve into the code. You can read more on that in a post I wrote almost a year ago called "Who needs a software architecture document", but in essence the main motivation for documentation is that, assuming the software is successful, it will outlive the team - i.e. the people that built the software will not be the ones that will have to develop, maintain and support it for most of the software's life.

If we agree that we need a software architecture document the question is what to document and when.

There are two main "things" you can document in regard to architecture. The first is the obvious one: the architecture itself. In my experience the most value can be derived from documenting recipes, i.e. how to do stuff that is common in the architecture. These recipes can be a short description of the context and then a pointer to a test (or tests) and an implementation that exercises this. You can think of the recipes as a type of tutorial to the architecture.

Other documentation-worthy elements related to the architecture are an overview and a technology mapping (including what a developer needs to install to start working). The overview allows a newcomer to understand where to find what; the technology mapping allows her to understand what she needs to learn and install to be productive. Note that to be useful the overview should be at a higher level of abstraction than the code - otherwise you run the risk of missing the forest for the trees, or at least not saving any time.

Obviously, documenting any of this before your architecture is more or less stable is useless. As a rule of thumb I would say this can be around the 5th-6th iteration, assuming the team has to grow during the project. If the team stays stable for the duration of the project, this documentation can take place towards the end of the project (though I would probably add recipes to a wiki or something similar during the project as development patterns emerge).

The second "thing" you can document in regard to architecture is the set of options you decided against. In my opinion this is more important than all of the other items mentioned above together. The reason is that while it might take a while to understand well-written software and infer its architecture, it can be done; it is virtually impossible, however, to work out which options were disqualified just by looking at the chosen solution.
Understanding the options that weren't used can save time for the person reading that description, both in understanding why things are the way they are and in not retrying things that didn't work. It can also provide clues to other options when the circumstances change (since, as we all know, requirements change...).
The best time to document the options you decided against is when you opt not to use them - this is when you best remember the "why". For instance, in my current project we use X.509 certificates to authenticate clients, and we decided to use Kerberos tickets to authenticate components within the service. There's a reason for making that translation*, and there's also a reason for making the transition by replacing the client certificate with the edge component's credentials instead of mapping the client's certificate to a Kerberos ticket using an identity provider*. We had two developers spike different options for two weeks until we came to the current solution, instead of the more obvious choice of passing the X.509 certificate from the edge into the service and using the client's credentials. This question is likely to come up when/if someone else takes over the project, when the technology is updated, etc. Again, if we know why we didn't make that choice, we can better decide what to do when the circumstances change.

To sum up, there are a few architecture-related issues that are worth documenting even in agile projects. Some of them can be postponed; some of them are worth documenting a little earlier. In any event it is better to document after the fact and to keep the documentation light.





* It all has to do with limitations of WCF in regard to the transports we use (HTTP, MSMQ and TCP) and the request/reaction pattern (asynchronous communication) we use.



 
Tags: Agile | Everything | Software Architecture

June 6, 2007
@ 10:02 AM
While I am on the topic of REST, it is probably a good time to comment on my (first) post on InfoQ "Debate: Does REST need a Description Language"

Personally, I think there's merit in services publishing their message structures in a machine-readable format. When a service has a machine-readable contract, generated stubs allow you to make the interaction with fewer bugs than hand-crafted interactions. It also makes it easier to test the service itself.

I do agree with Stefan's views on runtime interface dependency, where he said that if a service consumer needs just 20% of the information in a service, it shouldn't be forced to deserialize (i.e. know or care about) the whole message. However, I think this is a weakness of tooling, not of the concept. What if you had a tool that reads the machine-readable contract, allows you to pick the 20% you need, and generates a stub for you that ignores the other 80% and "hand picks" the 20% you need? This is what you would do yourself anyway, and since the code is generated from the service's definition it would be more resilient and less error-prone. This is effectively designing a personalized mini-contract from the published general one. It does mean that when that 20% changes you will be affected, but this is something you'd have anyway.
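
A minimal sketch of what such a "mini-contract" stub could boil down to, under heavy assumptions: the message is JSON, only top-level fields are picked, and the list of needed fields is passed in by hand instead of being generated from a published contract. The make_stub name and the sample field names are invented for the illustration.

```python
import json
from typing import Any, Callable, Dict, Iterable


def make_stub(needed_fields: Iterable[str]) -> Callable[[str], Dict[str, Any]]:
    """Build a reader that keeps only the needed top-level fields of a message."""
    needed = tuple(needed_fields)

    def read(raw_message: str) -> Dict[str, Any]:
        message = json.loads(raw_message)
        missing = [name for name in needed if name not in message]
        if missing:
            # The part of the contract this consumer depends on changed: fail loudly.
            raise KeyError(f"message is missing required fields: {missing}")
        # Everything the consumer did not ask for is simply ignored.
        return {name: message[name] for name in needed}

    return read


if __name__ == "__main__":
    read_order = make_stub(["id", "total"])
    print(read_order('{"id": 7, "total": 12.5, "notes": "gift", "lines": []}'))
```

The consumer is insulated from changes in the fields it never asked for, and only breaks when one of its own fields disappears, which is the trade-off described above.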

I also agree that the WS-* standards and the resulting contracts are complicated (and getting more so). Much of this can probably be attributed to the "design by committee" effect. However, there are also some real challenges that the SOA and ROA architectural styles do not address, and we still need to solve those. Trying to solve these challenges is, by the way, what prompted me to write my SOA patterns book...


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

DevHawk (Harry Pierson) today raised a question I have been toying with myself for a while now - if REST is an architectural style, can it exist without the specific technologies that define it today? Or as Harry put it:
  1. REST is an "architectural style for distributed hypermedia systems".
  2. REST "has been used to guide the design and development" of HTTP and URI.
  3. Therefore REST as an architectural style is independent of HTTP and URI.
  4. Yet, I get the feeling that the REST community would consider a solution that uses the REST architectural style but not HTTP and/or URI as "not RESTful".
What I had in mind for example is to use messaging where the equivalent of the URI would be a topic hierarchy.
Topic hierarchy allows you to have a unique "URI" for each resource.
The next thing we need to take care of are the PUT, GET, POST and DELETE verbs - we can do that by making the verbs part of the message headers.
As an aside, I'll also say that if we try to think about it as an architectural constraint, then we don't necessarily have to use these verbs. A more general rule would say that the verbs are uniform and well known, rather than specific ones.
The rest (no pun intended) of the concerns, like specifying related states etc., can be dealt with by establishing conventions on the message formats.
Is that still REST?! I wonder...
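
For what it's worth, here is a toy sketch of the messaging idea above: the topic plays the role of the URI and the verb travels in a message header. It is an in-process stand-in with no real broker behind it, and the TopicResources class, the "verb" header name and the status strings are assumptions made for the example. POST is left out for brevity.

```python
from typing import Dict, Optional


class TopicResources:
    """Toy stand-in for a broker where each topic identifies one resource."""

    def __init__(self) -> None:
        self._state: Dict[str, dict] = {}

    def handle(self, topic: str, headers: Dict[str, str],
               body: Optional[dict] = None) -> dict:
        # The uniform verb comes from a header rather than from an HTTP method.
        verb = headers.get("verb", "").upper()
        if verb == "PUT":
            self._state[topic] = body or {}
            return {"status": "stored", "topic": topic}
        if verb == "GET":
            if topic in self._state:
                return {"status": "ok", "body": self._state[topic]}
            return {"status": "not-found", "topic": topic}
        if verb == "DELETE":
            removed = self._state.pop(topic, None) is not None
            return {"status": "deleted" if removed else "not-found", "topic": topic}
        return {"status": "unsupported-verb", "verb": verb}


if __name__ == "__main__":
    bus = TopicResources()
    bus.handle("customer/anne", {"verb": "PUT"}, {"name": "anne"})
    print(bus.handle("customer/anne", {"verb": "GET"}))
```

Whether a system built this way deserves the RESTful label is exactly the question Harry raises; the sketch only shows that the constraints can be expressed without HTTP.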

In any event, what worries me the most in regard to REST is the religious manner in which some people seem to treat it. By the way, that is the same phenomenon we see with some of the Agile folks. As for me? Well, I don't really care if I fit one label or the other. I am just paid to deliver working and viable software :), but hey, that's another discussion.


 
Tags: Everything | SOA | Software Architecture

Last month we saw Ajax making a move to the desktop with the aid of Adobe Apollo. Earlier this month I talked about it again when Sun announced JavaFX and after Microsoft officially announced Silverlight.
I guess it was inevitable that Google would also join the fun of blurring the lines between desktop applications and web applications. And indeed, I just read that Google released a beta of Google Gears.

What does Google Gears do? Well, in essence it lets you develop occasionally disconnected web applications by providing an API to store application resources and data locally, as well as WorkerPools that allow you to run tasks in the background (sort of like a collection of processes that host JavaScript and can communicate by messages).

What all this means is that it won't be long before the architectural choice between web clients and desktop clients stops being that important. It will be more about making a technology choice based on the preferred development environment and tooling, as well as designing robust occasionally disconnected applications regardless of that technology. Just look at the architectural guidance Google provides for utilizing Google Gears. You'll see that the components they identify are very similar to the high-level components you are likely to design for any occasionally disconnected desktop application...




 
Tags: Everything | Software Architecture

Yesterday I attended Jim Coplien's presentation on "Organizational Patterns - a Key for Agile Systems Development". Overall I think it was a very good presentation. Jim makes a few interesting claims, some of which are controversial in both the traditional and the agile spaces.
A few examples:
  • Process guidance (ISOs etc.) doesn't work - roles are more stable than processes, processes always change.
  • Jim says that in order to make a change you need to make it at the organizational structure level. The processes will then support these changes
  • TDD is evil - it is just a reincarnation of bottom-up procedural design. It is better to follow "Design by contract"
  • He says XP is not a good methodology (he thinks Scrum is good)
  • etc.
Additionally, he talked about some of the organizational patterns he and Neil Harrison discovered while studying organizations for more than a decade. You can read the top ten patterns on his site.

Jim covered 2 patterns that are related to software architecture: Architect controls product and Architect also implements.
Architect controls product basically says that you should have an architect and that she should make sure the project is heading in the right direction.

Architect also implements - this pattern says that in order for the architect to broaden her leadership without sacrificing depth and pragmatics, she must also participate in the implementation (beyond advising and communicating). Jim gave the example of the development of Borland's Quattro Pro for Windows in 1993, where the team's architect had a daily meeting (akin to Scrum stand-ups) for synchronization and would then go and code with the developers. The Quattro Pro team had 4 architects out of the 12 people on the team. If a third of the development team are architects, I'd say he is right. My experience with most organizations, however, is that you hardly have one architect per project (sometimes you only have one for several projects). In these cases I hardly see the architect writing production code as part of the team, since she would not have time to fulfill her architectural responsibilities. She must know how to code, though, and she must be able to prove her designs in code or be able to offer a candidate implementation if needed (I also wrote about that in the past, see "Should architect's code" part 1, part 2, part 3).

By the way, if you are located in Israel, Jim will be here for a couple of weeks and he is giving a few courses like Agile Architecture, Patterns of Agile Project Management, etc. You can find more information on pacificsoft's site.


 
Tags: Agile | Everything | General | Software Architecture

Udi Dahan writes that ".NET/Java Interop is not a reason for SOA". Udi writes that companies that  need to integrate two technologies turn to web-services and that
"The only problem is that in order for things to work right, they really must have a chatty interface, and flow transaction context between these “services”, and all the other things I describe as anti-patterns"

Udi is right that if you don't rethink and remodel your systems you will (probably) not  have an SOA as you are likely to find your self implementing  anti-patterns such as the ones he mentions.

However, using Web-services does not automatically mean that you are doing an SOA. If you don't think about moving to SOA you can still opt to use web-services as a remoting  or RPC technology to connect two systems. The advantage over the other proprietary products Udi mentions is that web-services are a standard technology. This will work well or fail is orthogonal to the technology choice. It depends on the architectures of the systems you integrate. If you need to flow transaction between the systems you'd also need that even if you cross-compile one of the applications in the other environment.

Another thing I don't agree with is the word "must" that Udi uses. First, while it is likely that older systems have chatty interfaces, it is not a must. The designers of the legacy system may have thought about the consequences of distribution without regard to SOA. Also, you can still wrap an existing system with a service contract (using web-services or any other technology) and not end up with chatty interfaces etc. However, that means the wrapper should have some substance or business logic inside it to mask the old system's behavior. This is especially important if you are thinking about moving to SOA and you take into consideration that the business will not just halt and wait until you are done. You have to think about interim solutions. Such interim solutions can include wrapping a legacy system with an Edge Component and an SOA facade (a pattern I call Legacy Bridge) while you move in the general direction of a full-blown SOA.
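
To illustrate what "a wrapper with some substance" could look like, here is a minimal Python sketch. It is not the Legacy Bridge pattern itself, just an illustration under assumptions: `LegacyOrderSystem`, `PlaceOrder` and `OrderServiceFacade` are hypothetical names, and the legacy API is an invented stand-in for a chatty, stateful interface.

```python
from dataclasses import dataclass


class LegacyOrderSystem:
    """Hypothetical chatty legacy API: several stateful calls per order."""

    def __init__(self):
        self.calls = 0
        self.orders = []
        self._session = None

    def open_order(self, customer_id):
        self.calls += 1
        self._session = {"customer_id": customer_id, "lines": []}

    def add_line(self, sku, quantity):
        self.calls += 1
        self._session["lines"].append((sku, quantity))

    def commit_order(self):
        self.calls += 1
        self.orders.append(self._session)
        self._session = None
        return len(self.orders)  # the legacy order number


@dataclass(frozen=True)
class PlaceOrder:
    """Coarse-grained contract message: the whole order in one document."""
    customer_id: str
    lines: tuple  # tuple of (sku, quantity) pairs


class OrderServiceFacade:
    """Service-side wrapper that hides the chatty conversation from consumers."""

    def __init__(self, legacy):
        self._legacy = legacy

    def handle(self, message):
        # A bit of business logic lives in the wrapper, not in the consumer.
        if not message.lines:
            return {"status": "rejected", "reason": "empty order"}
        self._legacy.open_order(message.customer_id)
        for sku, quantity in message.lines:
            self._legacy.add_line(sku, quantity)
        order_number = self._legacy.commit_order()
        return {"status": "accepted", "order_number": order_number}
```

The consumer sends one message and gets one reply; the four chatty calls stay behind the service boundary, where they can later be replaced without touching the contract.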


 
Tags: .NET | Everything | SOA | SOA Patterns | Software Architecture

It has been a while since I last published a pattern draft - but I guess it is better late than never.
The Saga pattern deals with managing complex interactions between services without the use of atomic transactions, which, as I mentioned in the past, are not a good idea (see "Transactions between services? No, No, No!" and "Some more thoughts on Cross-service transactions").

You can download the draft for the Saga pattern from here.
I'll also add a link to it from the SOA Patterns book section (where you can also download the other pattern drafts I published)

By the way, I am not happy with the current  sketch (the pattern illustration) in this pattern, so it will probably change in later drafts. I would be happy to hear any suggestions you have for improving it.


 
Tags: Everything | SOA Patterns | Software Architecture

May 23, 2007
@ 11:06 PM

I just read Shy Cohen's article (via Nick Malik) in Microsoft's Architecture Journal, entitled "Ontology and Taxonomy of Services in a Service-Oriented Architecture". Shy provides a list of what he calls service types. He identifies two major types: bus services and application services. He then sub-divides them: the bus services are divided into communication and utility services, and the application services are divided into entity, capability, activity and process services. I have to say it was quite alarming to see this coming from someone who was deeply involved in defining Windows Communication Foundation (Indigo).

Where do I start?

Well, for one, the article seems to completely fail to distinguish between services as in Service Oriented Architecture and services as in capabilities or features that an infrastructure provides. The "Communication services" are, for the most part, capabilities that a service infrastructure (such as an ESB) provides, not services you would define in an SOA initiative. And then there's the matter of service granularity and the difference between remote objects and SOA. Take, for instance, the example Shy gives for a "method" (his word) on a Customer service (an entity service):

"An example of a domain-specific operation is a customers service that exposes a method called FindCustomerByLocation that can locate a customer's ID given the customer's address"
Why would a service return a customer ID? This is the kind of call you would make on an object you hold a reference to, not on some remote "something" that also wants to authorize your call and may reside in a different company. This kind of thinking is what made remote objects fail. Gregor Hohpe explained that nicely in a paper called Developing software in a Service Oriented World:
The Transparency Illusion.  Distributed components promised to hide remote communication from the developer by making the remoteness "transparent". While the basic syntactic interaction between remote components can be wrapped inside a proxy object, it turned out that dealing with partial failures, latency, and remote exceptions could not be hidden from the developer. It turned out that 90% transparency was actually worse than no transparency because it gave developers a false sense of comfort.
As a side note, Gregor recently gave a presentation that covers this paper at JavaZone which you can watch online at InfoQ

Returning to Shy's article let's take a look at another quote:
Capability Service may flow an atomic transaction in which it is included to the Entity Services that it uses. Capability Services are also used to implement a Reservation Pattern over Entity Services that do not support that pattern, and to a much lesser extent over other Capability Services that do not support that pattern.
I already explained why cross-service transactions, and especially flowing transactions, are not a good idea in SOA, so I won't do it again here - but you can read about it both here ("Transactions between Services? No, No, No!") and here ("Cross-Service Transactions"). Also, I truly hope Shy didn't mean .NET data sets when he said "In some cases, typically for convenience reasons, Entity Service implementers choose to expose the underlying data as data sets rather than strongly-schematized XML data. Even though data sets are not entities in the strict sense, those services are still considered Entity Services for classification purposes." In any event, the whole decomposition of services into fine-grained "capability", "activity" and "process" services takes no account of the fact that SOA is a distributed architecture... maybe Microsoft is not affected by the fallacies of distributed computing?
*ad nauseam (Latin) - to the point of disgust


 
Tags: .NET | Everything | SOA | Software Architecture

Yesterday, we had a meeting of a few architects, and Memi Lavi gave a nice presentation on architecture usability. The notions of architecture and frameworks/software infrastructure are a little mixed in this presentation, but in essence Memi is right to say that among the quality attributes the architecture has to address you should have modifiability and architecture usage scenarios like "adding a complex form will take less than 2 weeks" or "wiring a new component into the workflow will take less than one day" etc. Reducing the friction* of frameworks is very important. I don't think it is the most important thing, though: if you could have only one of the quality attributes you need to support (performance, correctness, architecture usability, framework friction, system usability, availability, budget etc.), I am not sure you would always pick architecture usability.
Driving home, I thought that essentially software architecture is just
The set of compromises made while trying to maximize the effect of a limited set of prioritized quality attributes
We can't have it all - so we need to prioritize our list of quality attributes and focus on the most important ones.


*Friction (from Ryan via Udi): "friction" is a (subjective) measure of how much the tooling gets in your way when trying to solve a specific-case problem. I've come to evaluate frameworks based on two rough metrics: how far the framework goes in solving the general case problem out of the box and how little friction the framework creates when you have to solve the specific-case problem yourself. When a framework finds a balance between these two areas, we call it "well designed."



 
Tags: Everything | Software Architecture

May 15, 2007
@ 08:16 AM
Pat Helland is back at Microsoft (after a two-year vacation at Amazon) and, more importantly, he has also restarted blogging. I have only met him in person a few times, but he is definitely one of the few people really worth listening to, especially when it comes to distributed computing. Not only does he make interesting observations, he is also capable of explaining them in a crisp and interesting manner. Indeed, it didn't take long (his second post) before he blogged some valuable content. The post is called Memories, guesses and apologies (go read it).

Pat talks about how the notion of time in a distributed environment is subjective, how you can't really know what happened before what, and what we can do about it (I really think you should just go read it :) ).
Another related aspect of the phenomenon Pat mentions is that, at any snapshot in time, the chances of having a single unified truth in a distributed system degrade in proportion to the system's load. I had a chance to work on a few systems where some of the sites were either occasionally connected or connected over low-bandwidth networks. This situation makes the whole notion of guessing the state and compensating and/or apologizing for wrong conclusions much more explicit than in an always-connected, high-bandwidth system. Nevertheless, latency still exists even in connected systems, and you should really be wary of assuming a universal truth - unless you can stop the business long enough to allow complete synchronization.
As I mentioned a few days ago, we can't afford to have cross-service transactions (I also think we can't afford too many distributed transactions in non-SOA architectures, but this is especially true for SOA), which makes things even worse in this sense. One thing we can do in an SOA to achieve distributed consensus is to run a Saga. A Saga, which is a long-running conversation between services, is probably one of the most important interaction patterns for SOA.
You know what? Instead of trying to explain it here in haste, I'll just publish the pattern draft - I'll try to do that before the end of the week.






 
Tags: Everything | SOA | SOA Patterns | Software Architecture

May 13, 2007
@ 08:43 AM
Johanna Rothman wrote about "letting go of BDUF" (Big Design Up Front). One statement she makes is that you can't predict what the architecture needs to be. I can't say I agree with that, since many times you do know something about the project and you do have prior experience that can give you enough confidence to lay some ground rules. I called this Just Enough Design Up Front, or JEDUF for short.

Another statement Johanna made, which I wholeheartedly agree with, is that you should let the architecture evolve. Evolving an architecture sounds very compelling, but it is not a simple feat. Architectural decisions tend to have system-wide implications, which means that if you change one too late in the game you'll have a lot of rewriting and/or refactoring to do.
My strategy to solve that conflict is to:
  1. Set the first one or two iterations as architectural ones. Some of the work in these iterations is to spike technological and architectural risks. Nevertheless, most of the work in architectural iterations is still about delivering business value and user stories. The difference is that the prioritization of the requirements is also done based on technical risks and not just business ones. By the way, writing quality attribute requirements as scenarios makes them usable as user stories and helps customers understand their business value.
  2. Try to think about prior experience to produce the baseline architecture.
  3. One of the quality attributes that you should bring to the table is flexibility - but be wary of putting too much effort into building this flexibility in.
  4. Don't try to implement architectural components thoroughly - it is enough to run a thin thread through them and expand them when the need arises. Sometimes it is even enough just to identify them as possible future extensions.
  5. Try to postpone architectural decisions to the last responsible moment. However, when that moment comes, make the decision. Try to validate architectural decisions by spiking them out before you introduce them into the project.

These steps don't promise that the initial architecture sticks, but in my experience they make it possible to minimize the number of architectural decisions while still having a relatively solid foundation to base your project on.


 
Tags: Everything | Software Architecture

April 30, 2007
@ 07:34 PM
An article I wrote on Business Intelligence (BI) and Service Oriented Architecture (SOA) has just been published on MSDN.
You can find it here http://msdn2.microsoft.com/en-us/library/bb419307.aspx.

The article explains the SOA & BI mismatch and how to bridge it by adding EDA to SOA. (I blogged about it here before, but the article is more ordered and complete.)


 
Tags: .NET | Everything | SOA | SOA Patterns | Software Architecture

April 29, 2007
@ 03:14 PM
Back in January, I took part in an architect panel that Microsoft Israel organized. The panel was led by Ron Jacobs and it featured Udi Dahan, Assaf Jacoby, Coby Cohen, Dudu Benabou and myself. A few days ago Ron edited the recording and turned it into a podcast in his Arcast series.

The panel's focus was on lessons learned from mistakes made in past projects. Egomaniac as I may be :) -- even though you don't get to hear me much in the final edited version -- I think the podcast is worth listening to, as the panel raised some interesting points. You can download the podcast here (don't worry, it is in English even though it was recorded in Israel).

PS

I am the first speaker after the introduction, in case you are wondering.


 
Tags: .NET | Everything | General | Software Architecture

April 20, 2007
@ 11:15 PM
I have seen the following question on one of the forums I follow
"I have studied up on the SOA approach and it all sounds good.  But most articles stop at the theory.

Lets say I sell things.   I have a CustomerProfileService.   The application does CRUD through this service to a back end database.  Its autonomous and isolated.
I have anther service, InventoryItemProfileService.  Again, the application does CRUD through this service to a back end database. It is autonomous from the CustomerProfileService.  Not only may it live on a different DB from the CustomerProfileService, it might exist on a different platform.
Now lets get to the InvoiceService.  Lets say from the client side, I would guess that i would have a CreateInvoice(custID,itemID[] ) method.  The InvoiceService would then call out to the CustomerProfileService for profile that meets the needs of the invoice, then another call out to the InventoryItemProfileService for the item descriptions and such.

Here is the question.  It would seem like in the back end (the db) of the InvoiceService there would be tables to support the customer info and the item info from the invoice.  Where prior to SOA, when everything was in the same db, these requirements would be largely satisfied by joins.  Now a logical join across services just seems radically expensive (everytime you touch the invoice).  hence the need for the customer and item tables local to the invoice service.

Does this sound right?  Just how often does the InvoiceService have to go back to these other supporting services?"
I also got a comment with a similar theme on my Cross Service Transactions post.

I see a few problems with the way the services in the question are modeled (like the CRUDy interface), but in the end it all boils down to the root cause and the real problem: the granularity of the services.

Sure, when "a service" is too small it doesn't make sense to separate its tables from those of other services. It doesn't make sense to have transactions that span only what's internal to the service. It doesn't make sense to pay the price of making the service autonomous (like caching reference data from other services). When the granularity is too small, you will often find that you need a lot of interactions with other so-called services, and you are more likely to have CRUDy interfaces.
You are also more likely to have a slow-performing solution and to suffer from low availability.
Using services at the granularity mentioned above is, in my opinion, a nightmare that would probably make you work very hard to keep the SOA principles in place - or, the more likely option, you would circumvent the principles so that you can get something maintainable, usable and performing (and flip the bozo bit on this whole SOA thing).

So what is the right granularity? Well, it is not a one-size-fits-all kind of thing, but as a rule of thumb I would say anything just shy of a sub-system and up. A service has to have enough meat that it makes sense to have it autonomous; that transactions fit nicely inside its boundaries; that it is worthwhile making it highly available; that you can pass a complete task/document to it and it won't have to talk to a gazillion other services to complete processing it; etc.

If your application's idea of invoices is two tables, one with a header and one with invoice details, then don't make that a service. If invoicing is a sub-system with complex business rules, a lot of options and what-not, then it can be a good candidate.
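
For the case where invoicing is meaty enough to be its own service, the following Python sketch shows one way it could stay autonomous: it keeps a local copy of the reference data it needs and takes a complete task in one call. This is only an illustration under assumptions; `InvoicingService` and its event handlers are hypothetical names, and real services would receive these events as messages rather than method calls.

```python
class InvoicingService:
    """Hypothetical invoicing sub-system that owns its data and its rules."""

    def __init__(self):
        # Local copies of reference data, fed by events from other services,
        # so creating an invoice needs no call-outs (the price of autonomy).
        self._customers = {}
        self._items = {}
        self.invoices = []

    def on_customer_changed(self, customer_id, billing_name):
        self._customers[customer_id] = {"billing_name": billing_name}

    def on_item_changed(self, item_id, description, unit_price):
        self._items[item_id] = {"description": description, "unit_price": unit_price}

    def create_invoice(self, customer_id, item_ids):
        # Snapshot the reference data: an issued invoice must not change
        # when a price or a customer name changes later.
        lines = [dict(self._items[item_id], item_id=item_id) for item_id in item_ids]
        invoice = {
            "customer": dict(self._customers[customer_id]),
            "lines": lines,
            "total": sum(line["unit_price"] for line in lines),
        }
        self.invoices.append(invoice)
        return invoice
```

The whole transaction (reading reference data, applying rules, storing the invoice) fits inside the service boundary, which answers the "how often does the InvoiceService have to go back" question for this sketch: never, on the invoicing path.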

Think about it next time you design a service :)



 


 
Tags: Everything | SOA | Software Architecture

April 17, 2007
@ 01:58 PM

After seeing Juval Lowy's article on WCF transaction propagation in the May issue of MSDN Magazine, I posted "Transactions Between Services? No, No, No!" on my DDJ blog. I got a few comments which I thought warranted a post in their own right.

The previous post was triggered by an article that promoted flowing transactions (i.e. you perform a transaction against one or two services, and then one of the services calls an additional service and it joins the transaction). It is important to say that I think transactions between services should be discouraged regardless of whether transactions are extended automatically. Transaction propagation just makes matters worse.


There might still be some edge cases where you have to have an atomic transaction from a service consumer to the service. I think that in the vast majority of SOA implementations you shouldn't do that, and I would think real hard about the other options before allowing it in my architecture. In general I think cross-service transactions are an antipattern (and that's the way you'd find them documented in my SOA patterns book :) ).

One of the comments I received began with:

"Cross service transactions are a sure way to introduce coupling and performance problems into your SOA." I'm not sure I agree with that thought. Logically speaking, cross service transactions are a must. The question is how to implement them. There are two mechanisms we can use for implementing TXs: (1) ACID TXs; (2) Long-running TXs. The latter is preferable for the cases Arnon is talking about (large geographical distances, multiple trust authorities, and distinct execution environments). ACID TXs are more suitable for what Guy has mentioned (DeleteCustomer service invokes the DeleteCustomerOrder service internally). I agree with Arnon the a-synchronicity is preferable, but we all have encountered use-cases where ACID-ness is required from a business requirement level... [snipped]


One minor point regarding this comment is that I don't like the term "long-running transaction" - these are long-running interactions between services, and I think the term Saga describes them better. Sagas are made of a series of business activities that flow back and forth between services to realize a larger business process. Note that these interactions don't necessarily have transaction-like behavior.
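
To make the idea a bit more concrete, here is a minimal Python sketch of one possible way to coordinate a saga. It is an illustration under assumptions, not the pattern from the draft: `run_saga`, the step tuples and the optional compensating actions are hypothetical names, and real sagas would exchange messages between services rather than call local functions.

```python
def run_saga(steps):
    """Run business activities in order; on failure, compensate completed ones.

    `steps` is a list of (name, action, compensate) tuples. `compensate` may
    be None, because not every activity has (or needs) an undo.
    """
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            # No atomic rollback here: each completed activity, newest first,
            # is asked to compensate in its own business terms.
            for _done_name, undo in reversed(completed):
                if undo is not None:
                    undo()
            return ("compensated", name)
        completed.append((name, compensate))
    return ("completed", None)
```

Note that nothing is locked while the saga runs, and an activity without a compensation simply stays done, which is what separates this from an ACID transaction.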


Which brings me to the more important point: the statement "Logically speaking, cross service transactions are a must". I don't think so. For instance, suppose a service that manages the inventory in a warehouse receives a request for some items and later a cancellation of that request. The first request can trigger the inventory service to order more items from a supplier. Whether or not the cancellation causes a cancellation of the order from the supplier depends on the inventory service's business rules for the inventory levels of the items ordered. It might also depend on whether or not the items have already been received, etc. The cancellation (the "abort") of the original request does not have to translate to an abort (or compensation) at the request receiver. Furthermore, if the service communication model is based on the push model (e.g. using EDA with SOA), the cancellation notice would just be propagated without regard to the inventory service. It is the inventory service's responsibility to understand the ramifications of this event and act accordingly. Even the example given in the comment, "DeleteCustomer service invokes the DeleteCustomerOrder service internally", is not a good candidate for ACID transactions (there's also a problem of service granularity here - I'll talk about it later). When the customer service decides to delete a customer and requests the Orders service to delete the orders, there's a reasonable chance that some of the orders are already paid for but not delivered. In this case the customer service cannot really delete the customer until all the paid orders are resolved. Or maybe the order service is a facade to a nightly batch that does the actual deletion. I know I am just fantasizing with these examples, but the point is that the customer service has no knowledge of the order service or the inventory service above, except for the messages supported in their contracts.
To assume something about the internal behavior is problematic. Even if you know the internal structure at the outset, the whole idea of SOA is that the services can evolve independently of each other...
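
The inventory example can be sketched in a few lines of Python. This is just an illustration under assumptions: `InventoryService`, its event handlers and the reorder rule (order twice the reorder level, drop the supplier order only when stock is back at that amount) are all invented for the sketch.

```python
class InventoryService:
    """Hypothetical inventory service that reacts to events by its own rules."""

    def __init__(self, stock, reorder_level):
        self.stock = dict(stock)
        self.reorder_level = reorder_level
        self.supplier_orders = {}  # sku -> quantity on order
        self._requests = {}        # request_id -> (sku, quantity)

    def on_items_requested(self, request_id, sku, quantity):
        self._requests[request_id] = (sku, quantity)
        self.stock[sku] = self.stock.get(sku, 0) - quantity
        if self.stock[sku] < self.reorder_level and sku not in self.supplier_orders:
            self.supplier_orders[sku] = self.reorder_level * 2

    def on_request_cancelled(self, request_id):
        if request_id not in self._requests:
            return
        sku, quantity = self._requests.pop(request_id)
        self.stock[sku] += quantity
        # The cancellation is not an "abort" of everything the request caused.
        # Whether the supplier order goes away is a local business decision.
        if self.stock[sku] >= self.reorder_level * 2:
            self.supplier_orders.pop(sku, None)
```

The sender of the cancellation neither knows nor cares whether a supplier order exists; it only knows the two messages in the contract.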


Another thought triggered by the example in the comment, this time about the granularity of the services (a DeleteCustomer service vs. a Customer service that also supports deleting customers), is that we should be really conscious of the difference between SOA and other architectures like 3-tier client/server. SOA is actually more distributed than 3-tier - we cross a distribution boundary every time we pass a message from one service to another, and not just when we move a message from the client tier to an application server. We add this distribution to gain advantages in flexibility and agility. However, we should note that this is also a weakness of SOA (consider, for example, that Martin Fowler's first law of distributed object design is "Don't distribute your objects!"), which means we should really pay attention to the way services interact with each other:

  • The granularity of services - having a lot of fine-grained services means there will be a lot of interactions over the wire (even if you don't go out to the network you still have to serialize/deserialize, follow the security policy etc.) rather than internal interactions that are much faster.
  • The granularity of messages - The same considerations should also guide us to create larger and fewer messages. For the example above, instead of a DeleteCustomerOrder message, maybe use something like an UpdateCustomersOrders message that can hold a list of customers and orders and their status changes. By the way, this would also support off-line clients better, since they can accumulate changes.
  • The assumptions we can make on the other service's availability, performance, internal structure, the trust we have for it etc. - We should try to minimize the assumptions we make and concentrate on what can be inferred from the contract. Remember that policies can change externally, so the business logic within a service cannot count on them being constant. This brings us back to the issue of transactions. Every cross-wire interaction increases the chances of failure, and in a transaction one failure invalidates the whole transaction. Every cross-wire interaction within a transaction also increases the length of time we lock internal resources (even if we do trust all the involved parties) - especially if that transaction can extend itself automatically. Also, as I mentioned in the previous post, transactions open the door to denial of service attacks.
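
The message-granularity point from the list above can be sketched as follows. Only the `UpdateCustomersOrders` message name comes from the text; `OrderStatusChange`, `OfflineClient` and `OrdersService` are hypothetical names for this Python illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderStatusChange:
    customer_id: str
    order_id: str
    new_status: str


@dataclass
class UpdateCustomersOrders:
    """One coarse-grained message carrying many status changes."""
    changes: list = field(default_factory=list)


class OfflineClient:
    """Accumulates changes while disconnected and sends them as one message."""

    def __init__(self):
        self._pending = []

    def record(self, customer_id, order_id, new_status):
        self._pending.append(OrderStatusChange(customer_id, order_id, new_status))

    def flush(self):
        message = UpdateCustomersOrders(changes=list(self._pending))
        self._pending.clear()
        return message


class OrdersService:
    def __init__(self):
        self.statuses = {}
        self.messages_received = 0

    def handle(self, message):
        # One trip over the wire, one round of deserialization and security
        # checks, regardless of how many changes the message carries.
        self.messages_received += 1
        for change in message.changes:
            self.statuses[(change.customer_id, change.order_id)] = change.new_status
        return len(message.changes)
```

Three recorded changes cost one cross-wire interaction instead of three, and the client does not need to be online when the changes happen.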

If we want to reap the benefits that are sold under the SOA moniker, like flexibility and agility, we really have to pay attention to this extra distribution and design our services differently than we would components in a 3-tier architecture - but hey, that's why they pay us the big bucks, right ? :)

I should probably also add that building SOAs is not a goal in itself. We can build perfectly good solutions using other architectures - but if we find that we do need SOA (or any other architecture for that matter), we have to pay attention to the way we implement it, both to keep its benefits and to avoid harming other quality attributes like performance, security etc.



 
Tags: .NET | SOA | SOA Patterns | Software Architecture

Back in January I opined that moving to web applications was not the optimal solution to the real problem we have/had with desktop applications, which was installation woes. What we got was a poor UI without installation problems, so we (the software industry) started to solve, all over again, problems like the ones we had when we moved from terminals to graphical UIs etc.
So now we have Rich Internet Applications (RIA), using technologies like AJAX, but they suffer from other problems which, again, we've already been through.

Well, that was the topic of the post in January. Now I've stumbled upon an interesting/amusing twist called Adobe Apollo.
Apollo lets you, yes, you've guessed it, take your RIA applications and deploy them as desktop applications. You can now take your HTML, CSS and AJAX scripts, pack them up as a single file (AIR) and, lo and behold, deploy them on the desktop. You even get these nifty start menu and desktop shortcuts :)

The reason not to dismiss this as a complete waste of time is that what we actually see here is another example of a trend toward convergence of web and desktop UI architectures and programming models. I say "another" because, coming from the desktop direction, Microsoft is doing pretty much the same thing. WPF brings the web programming model, with its markup (XAML) and "code-behind" concepts, to the desktop, and Microsoft is also pushing the same model to the browser with WPF/E.

The difference between Microsoft's and Adobe's solutions is that Adobe is coming from the web side and, as I said, Microsoft is coming from the desktop side. Both companies are striding toward the same goal, and what we are left with is yet another technology war.





 
Tags: .NET | Everything | General | Software Architecture

Following several months of hard work, IASA - the International Association of Software Architects - has finally published the Architect Skill Set Library online. What is the skill set library, you ask? Well, it is what you get when a few dozen architects (your humble servant included) author papers relating their experience of the different aspects of the architect’s role - everything from design, infrastructure and quality attributes to human dynamics (soft skills).

I should probably mention both Nicole Tedesco and Dana Gerow, who orchestrated all this effort and without whom I personally doubt the project would be online now.

By the way, what is even more interesting is that IASA is organizing training material based on this body of collective knowledge - and if this works well we’ll finally have a comprehensive course to train new architects. Meanwhile, you can find all the PDFs online at http://www.iasahome.org/web/home/skillset

P.S.

As I mentioned earlier I also wrote a paper for this project – on Views and Viewpoints which you can download directly from here


 
Tags: Everything | Software Architecture

I was going to try to explain why it has taken me so long since I posted the last pattern draft online when I saw that a couple of my fellow Manning authors had already done that. See Roy Osherove's "Writing a book is like developing Software" and Fabrice Marguerie's "My Writing Experience". My experience is similar - writing a book and writing software have a few things in common, and it seems that the countermeasures of shorter iterations, refactoring (which I guess writers know as rephrasing) and increased inspections work here as well.

Finally, I am back to writing new stuff and I am completing Chapter 4 now. Chapter 4 deals with SOA security patterns, and I've decided to release the "Service Firewall" pattern as a free draft. Note that it is a draft and it can change by the time it gets to publication. For example, the Edge Component, which I published a few months ago, has already gone through an extensive rewrite (maybe I'll post the updated draft...).

The Service Firewall helps deal with malicious "service consumers" and protects the services from several types of attack, for example XDoS (XML Denial of Service) and malicious content, and it helps prevent private information from leaking out of the service.

You can download the draft for  Service Firewall  pattern  from here .


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

March 4, 2007
@ 08:14 PM

Following my post on SOA definition, Alex left the following comment

“One question - how can an organization achieve "agility" through an SOA, if not through "re-use"? Isn't re-use really the ROI for implementing a Service?”

The way I see it, Agility means the ability to change rapidly and it doesn’t have to mean reuse  –for instance, it can come from the ability to replace a component without disturbing other dependent components – though you can say that this is reuse as you are reusing the interface (contract).

When you replace or update a service  you may reuse some or maybe even all of the previous version of a service – as long as the context for that service didn’t change significantly – if it did the granularity of the reusable components will be much smaller than a  “service”. 

I would also note that I think there’s a difference between reuse and use. If you take the same ordering capability and include it in two business processes, that's just using it. I’ve seen reuse of services in product companies, where services were reused with few modifications between two or more solutions, but this isn’t very common.

Regarding the ROI of SOA – it doesn’t have to be reuse, or just reuse. It is also things like easier connectivity, so that you can integrate faster with partners or with newly developed components. Another way to measure ROI is to measure the gains in replaceability and adaptability, so you can respond faster to changing business requirements (e.g., changing what counts as a VIP customer without letting any of the service’s consumers know that something changed).
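
The VIP example can be sketched in Python. This is only an illustration under assumptions: `CustomerService`, `get_customer_status`, `checkout_discount` and the spend thresholds are invented for the sketch.

```python
class CustomerService:
    """Hypothetical service; the VIP rule is internal, the reply shape is the contract."""

    def __init__(self, vip_rule):
        self._vip_rule = vip_rule
        self._yearly_spend = {}

    def record_purchase(self, customer_id, amount):
        self._yearly_spend[customer_id] = self._yearly_spend.get(customer_id, 0) + amount

    def replace_vip_rule(self, vip_rule):
        # A business change made inside the service boundary.
        self._vip_rule = vip_rule

    def get_customer_status(self, customer_id):
        spend = self._yearly_spend.get(customer_id, 0)
        return {"customer_id": customer_id, "is_vip": self._vip_rule(spend)}


def checkout_discount(customer_service, customer_id):
    """A consumer: depends only on the contract field, not on how VIP is decided."""
    status = customer_service.get_customer_status(customer_id)
    return 10 if status["is_vip"] else 0
```

When the business lowers the VIP threshold, only `replace_vip_rule` is called; `checkout_discount` is neither changed nor redeployed, which is the adaptability gain described above.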


 
Tags: Everything | SOA | Software Architecture

February 11, 2007
@ 08:25 PM

Udi has some comments on my SOA definition. Udi says that the definition I provided does not support  the notion of publish/subscribe using topics for services. My answer to this is yes and no :)


First things first: I never said (or at least I never meant to say) that contracts are limited to incoming messages only. Contracts contain incoming and outgoing messages. I probably should have stated it more clearly, though.
Udi says “Contract: Who owns the message type being published? The publisher or the subscriber? Common SOA knowledge would say that the message belongs to the contract of the service that receives it”


I don’t know who “Common SOA knowledge” is. In my opinion, this thinking is wrong even for request/reply: the reply message belongs to the service that sends the reply.


Regarding Endpoints – if the subscribers go to a topic as in “ServiceName\TopicName “ then yes I would call that an Endpoint since this is a well known address consumers (subscribers) go to find messages published by a service


Regarding consumers Udi says “ Is the publishing service “using” the subscriber when it publishes a message? I don’t think so, and the subscriber definitely isn’t using the publisher at that point either. So, we’ve got some inter-service message-based communication going on and it isn’t clear if we even have a service consumer. In fact, if all a service ever did was subscribe to some topics, and publish messages on other topics, it looks like we’d have very loose-coupling but be straying from the common SOA wisdom.”


Maybe that’s just semantics, but I don’t see why the subscriber isn’t using the publisher. The publisher publishes a message on a topic; this is part of its offering. The subscriber chooses to consume that information and maybe do some stuff with it, possibly publishing some other messages. That’s a “using” relationship to me.


Nevertheless, SOA is not a synonym for "distributed system", so there are cases where distributed components that communicate through messages aren’t SOA. For example, publish/subscribe using topics, where the topics are common and shared between components so that multiple services can publish on the same topic, does not, in my opinion, fall under the definition of SOA. This doesn’t mean that this is a bad architecture in any way, but it isn’t SOA either.
As I said in the “What is SOA” posts, for an architecture to be SOA you need autonomous components that publish and accept messages defined in contracts, delivered at an endpoint and governed by policies, to service consumers: no more, but no less either.


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

February 9, 2007
@ 06:50 AM

I've been talking about SOA for a while now; it's finally time to (try to) properly define it.

I've published this as a series of 5 posts on my DDJ blog and I thought it was good enough to be published as a single whitepaper:

"Service Oriented Architecture or SOA for short has been with us for quite a while. Yefim V. Natis, a Gartner analyst, first talked about SOA back in 1996. However it seems that only in the recent year or so SOA has matured enough for real systems based on the SOA concepts to start to appear – or has it? There is so much hype and misconceptions surrounding SOA that we first have to clear them all up before we can explain what SOA is – let alone identify who really uses it...." (Download full PDF (670K))

You can see additional presentations and papers I wrote here


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

January 30, 2007
@ 06:30 PM

[originally published in my DDJ blog]

You may have read my BI and SOA post where I suggested using EDA within SOA to solve the BI/SOA impedance mismatch. Jack Van Hoof made the following comment on that post:

Many people think of SOA as synchronous RPC (mostly over Web Services). Others say EDA is SOA. And there are many people who say that the best of EDA and SOA is combined in SOA 2.0. Everybody will agree that there is a request-and-reply pattern and a publish-and-subscribe pattern. It is easy to see that both patterns are an inverse of each other….

I think that "Synchronous RPC" is not a very good (or useful) definition for SOA (see my series on what is SOA anyway). Nevertheless, I also think that even if all you have is synchronous request/reply you can still implement both asynchronous messaging and EDA.

How can we implement asynchronous messaging?

Option 1: Duplex Channel
Let’s say you are a service consumer. You send me your request. Instead of a reply, I just acknowledge that I got the message. I put the message into a queue and process it in my "spare" time. I then call you with the answer.

Option 2: One-way Channel
Again you send the request. Instead of a reply, I give you a token or a ticket for the answer. When you think it is time, for example when the time promised in the SLA elapses (or whenever), you call me again, give me the ticket, and I look up the answer and give you your reply.

If we hide all this protocol inside some software infrastructure, the applications can see asynchronous messaging even though we have synchronous request/reply on the lower levels.
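Option 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: the `TicketService` class, its method names and the in-memory queue are hypothetical stand-ins, and the "real work" is faked by upper-casing the request.

```python
import uuid
from collections import deque


class TicketService:
    """Hypothetical service that exposes only synchronous calls."""

    def __init__(self):
        self._pending = deque()   # requests waiting to be processed
        self._results = {}        # ticket -> finished answer

    def submit(self, request):
        """Synchronous call: queue the request and hand back a ticket at once."""
        ticket = str(uuid.uuid4())
        self._pending.append((ticket, request))
        return ticket

    def process_one(self):
        """The work the service does in its "spare" time."""
        if self._pending:
            ticket, request = self._pending.popleft()
            self._results[ticket] = request.upper()  # stand-in for real work

    def fetch(self, ticket):
        """Synchronous call: return the answer, or None if it is not ready."""
        return self._results.pop(ticket, None)


service = TicketService()
ticket = service.submit("hello")
first_try = service.fetch(ticket)    # too early, nothing to collect yet
service.process_one()
second_try = service.fetch(ticket)   # the answer is available now
```

Both `submit` and `fetch` return immediately, so the consumer only ever makes plain request/reply calls; the asynchrony lives entirely in when the consumer chooses to come back with the ticket.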

Okay, so what about events? How can we publish events using just request/reply? The previous example would not work, since we could miss out on important events.

If you are reading this blog -- chances are you already have the answer working on your computer -- yes, it is RSS. Think about it: using an RSS reader that polls the server that publishes this blog, you reach out using synchronous request/reply and get all the posts (events) that were added since the last time you asked.

There are a few additional architectural benefits to working this way. For one, the service does not have to manage subscribers. Secondly, the consumer doesn’t have to be there at the moment the event occurs to be able to consume it -- and the management and setup is simpler than using queuing engines.
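The RSS-style interaction described above might look something like this. It is a minimal sketch, not how any particular feed server works: `EventFeed`, `PollingConsumer` and the position-based cursor are assumptions made for illustration.

```python
class EventFeed:
    """Hypothetical service side: an append-only list of events."""

    def __init__(self):
        self._events = []

    def publish(self, payload):
        self._events.append(payload)

    def events_since(self, last_seen):
        """Synchronous request/reply: return the events after position
        last_seen plus the new position the consumer should remember."""
        return self._events[last_seen:], len(self._events)


class PollingConsumer:
    """Consumer side: remembers its own position, like an RSS reader."""

    def __init__(self, feed):
        self._feed = feed
        self._last_seen = 0
        self.received = []

    def poll(self):
        new_events, self._last_seen = self._feed.events_since(self._last_seen)
        self.received.extend(new_events)
        return new_events


feed = EventFeed()
feed.publish("post 1")
feed.publish("post 2")        # published before the consumer even existed
reader = PollingConsumer(feed)
first_batch = reader.poll()   # catches up on everything it missed
feed.publish("post 3")
second_batch = reader.poll()  # only the new event
third_batch = reader.poll()   # nothing new
```

Note that the feed keeps no list of subscribers; each consumer carries its own cursor, which is what lets it be absent when an event occurs and still pick it up later.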


 
Tags: Everything | SOA | Software Architecture

January 25, 2007
@ 11:57 PM

Jack Van Hoof left the following comment on my post on BI & SOA:

"Many people think of SOA as synchronous RPC (mostly over Web Services). Others say EDA is SOA. And there are many people who say that the best of EDA and SOA is combined in SOA 2.0. 

Everybody will agree that there is a request-and-reply pattern and a publish-and-subscribe pattern. It is easy to see that both patterns are an inverse of each other. In my article 'How EDA extends SOA and why it is important' I explained the differences between the two patterns and when to use the one or the other.

 

Because of the completely different nature and use of the two patterns, it is necessary to be able to distinguish between the both and to name them. You might say making such a distinction is a universal architectural principle. Combining both of the patterns into an increment of the version number of one of them is - IMHO - not a very clever act. I believe it is appropriate and desirable to use the acronyms SOA and EDA to make this distinction, because SOA and EDA are both positioned in the same architectural domain; SOA focusing on (the decomposition of) business functions and EDA focusing on business events."

I agree with some of the things Jack says but not all of them. The way I see it, EDA and SOA are two different architectural styles, but I guess that I see it a little differently than Jack does.

EDA is an evolution of the publish-subscribe style and can exist independently of SOA, i.e. you can implement it with other architectural styles. SOA is an evolution of the component-based development style which puts an emphasis on interoperability and adaptability.

However, I don't agree that SOA is "Synchronous RPC". That's just the initial "wave" of SOA implementations, since synchronous interactions are easier to grasp and implement. I think that while adhering to SOA principles you can also implement additional interaction patterns, including asynchronous messages, publish/subscribe and EDA (and combining SOA with EDA is what I suggested for solving BI in an SOA).

 

I don't like the SOA 2.0 term either - but that's just because I don’t see a need for defining a new term :)

 

I'll post more about this once I finish the "What is SOA anyway" series on DDJ, where I explain the way I see SOA.

 


 
Tags: Everything | SOA | Software Architecture

January 7, 2007
@ 11:02 PM

[based on a few posts from my DDJ blog]

Implementing a Business Intelligence (BI) solution on top of a Service Oriented Architecture (SOA) is no simple feat. A recent survey by Ventana Research shows that "...only one-third of respondents reported they believe their internal IT personnel have the knowledge and skills to implement BI services." There's a good reason for that: there is an inherent impedance mismatch between BI and SOA which takes some effort to overcome. The purpose of this paper is to explain the problem and look at the possible solutions.

Service-Oriented Architecture is about autonomous, loosely coupled components. These traits give you lots of benefits, such as greater flexibility and agility, but they also mean that services have private data: data that you don't want to expose to the outside, as exposing it will decrease autonomy and increase coupling. This is why services only expose data and processes via contracts rather than exposing their internal structure.

That is all fine until you start to think about business intelligence. The cornerstone of any business intelligence initiative is gathering, collecting and consolidating data from all over the place. Once you have the data, you can use tools to analyze it, data mine it, slice, splice, aggregate, and whatnot. Traditionally, BI builds on ETL (Extract, Transform, Load), which goes directly to the databases of the involved sources.

And here lies the problem: On the one hand we have services that want to keep their data private, and on the other we have a datamart or warehouse that wants that data badly.

What are our options?

  • If you go with traditional ETL, you introduce coupling into your service.
  • If you only rely on contracts that were constructed for business processes you may be missing out on important data.
  • If you build a specific contract that exposes "all" the data, you are back at point-to-point integration -- and solving point-to-point integration is one of the reasons we want SOA in the first place.

The second option seems to be the most reasonable choice of the three -- but it also has several problems. One problem is that the BI needs to know about all the contracts. The second was already mentioned -- important data might be missing. The third problem is that the BI system needs to fetch data from the services, which means it may miss out on data in the intervals between requests. On the other hand, with too-frequent requests you can easily congest your network as well as cause a denial of service (DoS) on your own services.

Clearly we need a fourth option.

In my opinion, the best way to tackle BI in SOA is to add publication messages into the contract. By "publication messages", I mean that the service will publish its state, either periodically or per event, to anyone who is listening. This is a service communication pattern which I call "Inversion of Communications", since it reverses the request/reply communication style which is common for SOA.

To make the solution complete, you can add additional request/reply or request/reaction messages to allow consumers to retrieve initial snapshots. Following this approach, you get an event stream of the changes within the service in a manner that is not specific to the BI. In fact, having other services react to the event stream can increase the overall loose coupling in the system, for instance by caching results of other services.
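To make the combination of publication messages and an initial snapshot concrete, here is a small sketch. Everything in it is an assumption for illustration: the `CustomerService` and `BiStore` names, the `BalanceChanged` event shape, and the use of in-process callbacks in place of real messaging over an ESB or a notification protocol.

```python
class CustomerService:
    """Hypothetical service that owns customer balances (its private data)."""

    def __init__(self):
        self._balances = {}
        self._listeners = []

    def subscribe(self, callback):
        """Register a listener for publication messages."""
        self._listeners.append(callback)

    def snapshot(self):
        """Request/reply message: a copy of the state the contract exposes."""
        return dict(self._balances)

    def deposit(self, customer, amount):
        self._balances[customer] = self._balances.get(customer, 0) + amount
        event = {"type": "BalanceChanged", "customer": customer,
                 "balance": self._balances[customer]}
        for notify in self._listeners:   # publish to anyone who is listening
            notify(event)


class BiStore:
    """Hypothetical BI-side consumer: snapshot first, then the event stream."""

    def __init__(self, service):
        self.current = service.snapshot()
        self.history = []                # full event history for trend analysis
        service.subscribe(self.on_event)

    def on_event(self, event):
        self.history.append(event)
        self.current[event["customer"]] = event["balance"]


service = CustomerService()
service.deposit("alice", 100)    # happens before the BI side connects
bi = BiStore(service)
service.deposit("alice", 50)
service.deposit("bob", 20)
```

The BI side never touches the service's internal storage; it sees only what the contract publishes, and any other service could subscribe to the same events for caching or reporting.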

Why is this better than the other three approaches? For one, you can get a good picture of what happens within the service. However, the contract is not specific to the BI and can be used by other services to cache the service state (thus increasing their own autonomy), for reporting (you can see an early draft of the aggregated reporting pattern), and for BI purposes. By working against a steady stream of events, the BI platforms can analyze trends, keep history and get the complete picture they need.

The approach above is sometimes referred to as "Event Driven Architecture" (EDA), and while I (and others) see EDA as another facet of SOA, not everyone agrees. Gartner, for instance, sees EDA as another paradigm and SOA as just request/reply, or client/server. Recently, however, they published a paper that calls the approach described here "Advanced SOA". I tend to agree more with the "Advanced SOA" definition and don't see a contradiction between EDA and the SOA definitions. We are still using the same components and the same relations, only adding an additional message exchange pattern to our toolbox.

A note on implementation: if you are implementing SOA over an ESB, this is rather easy to implement, as most ESBs support publishing events out of the box. Using the WS-* stack of protocols, you have the WS-BaseNotification, WS-BrokeredNotification and WS-Topics set of standards. If you are in the REST camp, then I guess you will need to implement publish/subscribe yourself.

Once you have event streams on the network, the BI components grab that data, scrub it as much as they like and push it to their datamarts and data warehouses. However, event streams can also enable much more complex and interesting analysis of real-time events and real-time trend data, using complex event processing (CEP) tools to get real-time business activity monitoring (BAM).

You can also get this post as a presentation, downloadable from the papers section on my site or directly from here. (The download is about 3MB.)



 
Tags: Everything | General | SOA | SOA Patterns | Software Architecture

December 18, 2006
@ 07:17 PM

I've stumbled upon a presentation by Ron Jacobs on the software architect's role (via Shahid Sah's blog) called Architects and the Architecture of Software. In this presentation Ron compares the architect's role to that of an explorer, an advocate and a designer.

I can go for the designer bit, although I don't like the heavy analogy to building architects (I know, I know, I have that as well in my software architecture presentation - but at least it is no longer there in the next version).

However, I would personally replace advocate with a mentor and explorer with a polymath or renaissance man, and add a leader and visionary as well (although Ron mentions that as part of the discussion on the explorer).

An advocate is someone who observes, listens and gives advice - but a mentor is someone who helps others reach the right decisions and helps them learn and evolve. I think that has much more value. I want a Socrates, not an Alan Dershowitz, on my team.

An explorer looks for new grounds and is a bit of a visionary - but a renaissance man is both knowledgeable and inventive. As a development manager I'd rather have someone who knows what he is doing, understands the wider perspective and can find solutions to my problems - and not someone who would take me on a road to uncharted territories. I'd take Leonardo da Vinci over Columbus (who accidentally gave the competitive edge to Spain and didn't even know it) any day.

A visionary and leader is also important - you want someone who is able to look far ahead and who can help your team get there. I guess that is somewhat akin to an explorer (in the sense Ron mentions), but again I'd rather have a Martin Luther King than a Columbus (though being lucky wouldn't hurt).

But hey, that's just my opinion :-)


 
Tags: Everything | Software Architecture

December 15, 2006
@ 10:50 AM

One unique aspect of SOA vs. other architecture styles like Object Orientation, Client/Server or even 3-Tier architecture is that it is built for highly distributed systems. Each and every service is a sub-system in itself; it can run on its own machine and be located anywhere in the world. Many times, the service itself needs to be distributed in its own right. One reason to use distributed computing inside the service is computationally intensive tasks.

 

One of my recent projects was the development of a biometric platform. The platform can be used for many usage scenarios. A simple scenario is an access control system, e.g. authorizing entrance into a secure building or area. This is a relatively simple scenario, as you usually only have to deal with a few thousand people, and as a person requests entry she also declares who she is (e.g. using an RFID card with her ID). In these cases you can go to the database, look up the appropriate record, run the biometric algorithm or algorithms and verify the person is who she says she is. However, the same platform also has to work for other, much more demanding and computing-intensive scenarios. For example, consider a forensics scenario where you have a fingerprint collected at a crime scene. In this case you don’t know who the person you are looking for is, and you have to run your search on basically the whole database, which can contain millions of records. Keep in mind that when you match a biometric template[1] you calculate the probability of a match (based on the internal structure of the template), and that each template weighs about one kilobyte, and you quickly realize that this can be quite a CPU-intensive task.

Sometimes when you develop your SOAs you will have algorithmic tasks or other computationally heavy tasks such as the one mentioned above, and the question is:

 

How can a service handle computationally heavy tasks in a scalable manner?

 

You can get the full pattern from here

[This is an early draft of one of the Performance, Scalability and availability Patterns from my SOA Patterns book]



[1] You can think of a biometric template as a signature or a hash that represents the biometric sample. The template is smaller than the sample but contains enough information to identify the original.

 


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

December 5, 2006
@ 08:16 AM

I've added a section called SOA Patterns on the site, which holds the current draft of the table of contents of the SOA Patterns book I am writing. The section lists the problem each pattern addresses as well as links to published patterns. Also, you can use this to monitor my progress (patterns that have their problem written down already have drafts; the others are in progress or not started).

I am currently working on chapter 4: Security & Manageability patterns (not counting delays mentioned in the previous post).

Also, as I think I've already mentioned, I'll make public at least one pattern per month. If you are interested in a specific pattern (from those which are ready - now chapters 2 & 3), drop me a note and I'll publish the one that gets the most votes.

 


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

December 1, 2006
@ 09:29 PM

My editors at Manning think that chapter 1 of my SOA Patterns book is not good enough.

They basically say that the chapter contains too much theory compared with the other chapters, which contain much more down-to-earth stuff (e.g. Edge Pattern, Aggregated Reporting Pattern, Decoupled Invocation Pattern). They’ve also said that I spend too many pages explaining what architecture is or talking about distributed systems before I get to SOA, which is the topic of the book.

The way I see it, understanding architecture and distributed systems is essential to understanding SOA (from the development side, i.e. when you want to design and build services). For example, the discussion on quality attributes explains how you can use scenarios to find architectural requirements (and each pattern then has a section on relevant scenarios to help you find out whether the pattern is applicable to your needs).

I would be very interested in hearing what you have to say (either as comments here or emails to me) about the chapter’s structure and content (considering most of the book will be patterns like the Edge pattern).

Thanks in advance


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

November 15, 2006
@ 09:33 PM

The business rationale behind going down the SOA road is increasing the alignment of business and IT, so we divide the business into a bunch of business services and everything is just fine. However, the minute we start diving into the SOA implementation details we are swamped by a horde of technologies, cross-cutting concerns (auditing, security, etc.) and whatnot.

For example, in one project I was involved with, we implemented an SOA over a messaging middleware (Tibco's Rendezvous). Just when everything was fine and dandy, along came another project which could potentially use a few of the services. Well, almost: it needed a slightly different contract and it also used a completely different wire protocol, WSE 3.0 (Microsoft's interim solution for the WS-* stack before Windows Communication Foundation). And that's just one simple example; cross-cutting concerns and implementation details are everywhere. The question is then:


How can you handle cross-cutting concerns like multiple technologies, protocols, changing policies etc. while keeping the service focused on its core concern, i.e. the business logic?


You can get the full pattern from here

[This is an early draft of one of the Service Structural Patterns from my SOA Patterns book]


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

November 13, 2006
@ 11:35 PM

I've posted a new paper on the site. Designing a good architecture is not enough. The paper explains how to make sure the architecture is both relevant and followed throughout the project. You can download the paper directly from here.


 
Tags: Everything | Software Architecture | SPAMMED Process

November 5, 2006
@ 12:44 PM

I am going to present SOA in one of our internal forums next week - so I thought it would be a good opportunity to dust-off my SOA presentation and give it a little face lift. You can download a copy from the papers and articles section (or get it directly from here).

As always, any comments are welcome


 
Tags: Everything | SOA | Software Architecture

The draft for the first chapter of my SOA Patterns book is available on-line from Manning Publications Co.

The first chapter talks about software architecture and the inputs the architect can/should use to design one (emphasizing Quality Attributes), explains the challenges of distributed systems and takes a look at SOA from an architectural perspective.

You can download the chapter from here

Any comments are welcome (you can also leave your comments at soa@rgoarchitects.com)


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

October 22, 2006
@ 10:57 PM

Roy Osherove recommended this site today - but he also urged me to write more frequently.

This is probably a good opportunity to explain how posts are divided between my 3 blogs:

  • First, there's the blog on Dr. Dobb's Journal. This blog is published in the "Architecture & Design" section of the DDJ portal. I blog there about 3 times a week. Jon (my editor @ DDJ) prefers a steady stream of posts over longer posts, which means that I break down large subjects (like OO principles, fallacies of distributed computing and the currently running series on the architect's soft skills) into many parts.

  • The second blog is a new one on Microsoft's Israel blogs site. The aim of this site is to bring Architecture content in Hebrew to the Israeli architects (As can be imagined, most of the technical content available is in English, I thought it was important to generate some content in Hebrew as well)

  • The last blog is this one. My current plan for this blog is as follows

    • Cross posting selected posts from the DDJ site
    • Complete articles made by editing and aggregating multi-part blog posts (again, such as the fallacies etc.)
    • Pointers to presentations and articles I publish
    • In the near future, I'll start posting bits of my upcoming SOA patterns book (I am currently writing chapter 3). I've already documented 8 patterns (of more than 50 patterns and about 30 anti-patterns). I plan to publish at least some of the patterns here for review (I am still crossing the t's and dotting the i's with my publisher but I expect this to be finalized soon)

So, Roy, do seven (7) posts in seven days (including 3 on the MS Israel site, 3 on DDJ and this one) qualify as posting often enough? :)


 
Tags: Everything | General | Software Architecture

[Will also be cross-posted on my DDJ blog]

Working on my SOA patterns book, I thought of this rule for contract versioning, which my shameless ego wanted to dub Arnon's Contract Versioning Principle. I was happy playing with this thought until I realized that there isn't some profound new understanding here; this is just an application of LSP to service contracts.

The Liskov Substitution Principle (LSP), which I recently blogged about here as part of a series of posts on Object Oriented Principles, basically states that a subclass should be usable instead of its parent class. To put this in other words, you could say that a subclass should meet the expectations that users of the parent class have formed from the parent class's observable behavior.

So LSP applied to SOA would state that:

When changing the internal behavior of a service, you don't need to create a new version of the contract if, for each operation defined in the contract, the preconditions are the same or weaker and the postconditions (i.e. the outcome of the request) are the same or stronger. In other words, to retain the same contract version, the new version of the service should meet the expectations that consumers of the service have formed from the old version's observable behavior.

For example, let's say you have a customer service and the contract lets you get a customer's VIP status. If you changed the way the VIP status is calculated (e.g. in the old version the customer had to have 1 million dollars in her account, but now she must have 10 million dollars) there's no need to create a new contract version. However, if you introduced a new level of VIP status (e.g. 1 million = Gold, 10 million = Platinum) you do need a new version of the contract.
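The VIP example can be turned into a small check. This is a sketch under assumptions: the status names, the three implementations and the `fits_contract` helper are all hypothetical, and the "contract" is reduced to the set of outcomes a consumer was promised.

```python
CONTRACT_V1_STATUSES = {"VIP", "REGULAR"}   # outcomes consumers were promised


def vip_status_old(balance):
    """Original rule: 1 million dollars makes a VIP."""
    return "VIP" if balance >= 1_000_000 else "REGULAR"


def vip_status_new_threshold(balance):
    """Changed internal rule: 10 million dollars makes a VIP."""
    return "VIP" if balance >= 10_000_000 else "REGULAR"


def vip_status_with_levels(balance):
    """New levels: replies that the old contract never mentioned."""
    if balance >= 10_000_000:
        return "PLATINUM"
    if balance >= 1_000_000:
        return "GOLD"
    return "REGULAR"


def fits_contract(implementation, sample_balances, allowed=CONTRACT_V1_STATUSES):
    """True if every reply stays within the outcomes the contract allows."""
    return all(implementation(b) in allowed for b in sample_balances)


samples = [0, 999_999, 1_000_000, 10_000_000]
old_ok = fits_contract(vip_status_old, samples)
threshold_ok = fits_contract(vip_status_new_threshold, samples)
levels_ok = fits_contract(vip_status_with_levels, samples)
```

Changing the threshold alters which customers get which answer, but every answer is still one a consumer already knows how to handle. The Gold/Platinum version hands consumers values they were never told to expect, which is why it calls for a new contract version.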


 
Tags: Everything | SOA | Software Architecture

October 1, 2006
@ 11:16 PM

I've added a new section on the site, www.rgoarchitects.com/Papers, to allow easy access to all the papers, presentations and articles I have published and will be publishing (e.g. I'll add a paper on architect soft skills in a month or so).

 


 
Tags: Everything | General | SOA | Software Architecture | SPAMMED Process

The October issue of Dr. Dobb's Journal is available and with it my article on the SPAMMED Architecture Framework.

 


 
Tags: Everything | General | Software Architecture | SPAMMED Process

August 24, 2006
@ 10:48 PM

Over the last few months I've posted a series of blogs on DDJ that cover the basic Object Oriented principles (e.g. Single Responsibility Principle, Don't Repeat Yourself, Inversion of Control etc.).

I've assembled all the posts into a single whitepaper which you can get here.

Also, you can download the same (plus a little more) material as a PowerPoint presentation.

 


 
Tags: .NET | Everything | General | Software Architecture

August 21, 2006
@ 08:58 PM

[Crosspost from my DDJ blog]

When talking about multi-tiered architectures, we need to remember that the tier boundary is significant. The tier boundary is where distribution happens and if you remember the "fallacies of distributed computing", you know not to take that lightly.

A tier is a physical boundary (versus an Edge in an SOA which is a logical boundary, for example) and the implications are numerous. For instance, you need to consider:

  • Trust--who do you let in?
  • Security--what do you send out?
  • Performance--you need to serialize to pass the boundary, and remote data is expensive to fetch.
  • Availability--what happens if you crash?
  • Manageability--can anyone see what your state is? Help you recover?
  • Temporal coupling--can you afford to make synchronous calls?
  • and many similar questions.

Yet many times people think passing a tier is as simple as passing a logical layer. I should know. I made this stupid mistake more than 15 years ago in one of the first distributed systems I designed. I planned this beautiful separation of the UI controls from the business logic (I didn't know it was called "MVC" and that someone else had figured it out ages ago, so I was pretty proud of myself). When you clicked on a button you just used metadata to say that the BL should catch it. I had all this wonderful "infrastructure" that handled passing the call to its destination.

But then we wanted to take this n-layer application and put the BL in an "application server" that would handle multiple clients. Oh--now we need to move events over the wire, handle calls from multiple unrelated clients, pass a lot of data back and forth, and what about security... you can imagine the fiasco.

Thus, as Niels Bohr once said, "An expert is a person who has made all the mistakes which can be made in a very narrow field." But you don't have to make the same mistakes. Just remember that a tier is a natural boundary. You know what? You should probably even consider it the edge of a cliff at the end of your application--and be careful not to fall down.


 
Tags: Everything | General | Software Architecture

[crosspost from my DDJ blog]

In a comment to my previous post on Architecture vs. Design, Yoni said:

It seems you are categorizing technical issues as architecture and logical issues as design. I think Martin Fowler's definition of "Making sure important things remain decoupled and easy to change" transverses both categories and is easier to follow.

I have a few things to say about this.

First, I don't categorize technical things as "architectural" and logical ones as "design." What I do say is both "architecture" and "design" are types of design where with one you focus on the wider aspects of the solution and quality attributes, while with the other you focus on local and functional aspects.

I don't see how the definition Yoni brings up is a better way to distinguish between the two. Who is to say what is important and what is not? Isn't decoupling an important trait at all levels, including the so-called "detailed design" level (e.g. utilizing dependency injection at the class level will give you better testability)? Moreover, decoupling is important, but sometimes you need to trade it away to be able to satisfy a higher-priority quality attribute (if you want to meet a project's quality, budget and schedule targets); see my definition of what software architecture is.
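As a small illustration of the dependency injection remark, here is a sketch of how injecting a collaborator at the class level helps testability. The `WelcomeService`, `SmtpMailer` and `FakeMailer` names are made up for this example.

```python
class SmtpMailer:
    """Production collaborator (hypothetical); would talk to a mail server."""

    def send(self, to, body):
        raise RuntimeError("no mail server available in this sketch")


class FakeMailer:
    """Test double that just records what it was asked to send."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))


class WelcomeService:
    def __init__(self, mailer):
        self._mailer = mailer   # the dependency is injected, not constructed here

    def welcome(self, user):
        self._mailer.send(user, f"Welcome, {user}!")


fake = FakeMailer()
WelcomeService(fake).welcome("dana")
```

Because `WelcomeService` never creates its own mailer, a test can hand it the fake and inspect `fake.sent` without any mail server. That is decoupling at the class level, which is the point being made: it matters in detailed design just as much as in architecture.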

Another thing is that it doesn't matter that much where the line between architecture and design passes. The distinction between architecture and design is a semantic one that reminds us that the design of a system needs to be done at several levels of abstraction (provided the system is not too trivial). We need to abstract certain aspects of the system in order to be able to grasp the big picture. You cannot (well, I can't anyway) think about a 100 man-year project only at the class level and still understand how everything will work together. Again, architecture is there to remind us to focus on a level of abstraction that lets us deal with non-local decisions and make sure quality attributes are met. If we cross the line into design, no biggie.


 
Tags: Everything | General | Software Architecture

July 25, 2006
@ 11:27 PM

[edited version of post I made on Dr. Dobbs Portal]

Back in April I provided a definition for "architecture" in one of my first posts on DDJ. I also promised I'd talk about the distinction between architecture and design. Well, that time is now.

When I think about it, I see two basic criteria for distinguishing between architecture and design:

  • Design deals with local decisions, where architecture is broader. For instance, you "design" the interfaces for your classes, but you "architect" the division into tiers.
  • Design is mostly about the functional requirements, while architecture is mostly about quality attributes. You design how a specific workflow will fulfill a certain use case, but you architect the solution to the system's availability.

It is probably quite evident that this distinction only provides blurry borders between architecture and design. For example, say you have a multi-tier solution and you "architect" the UI and say it will implement the MVP pattern. Can this be considered a local decision and thus design, or is this the overall decision (for the UI) and thus architecture?

The way I see it, the exact crossing point from architecture to design is not that important. The point in talking about two distinct activities in the development process is to maintain separation of concerns. You need to handle both to make sure a solution will actually work; whether you do a little design while architecting or a little architecture while designing really doesn't matter. Also, architects should be involved in both activities anyway...


 
Tags: Everything | General | Software Architecture

July 25, 2006
@ 11:15 PM

Udi Dahan, who is one of the best architects I know, has recently created an excellent course on SOA - you can find the details of the syllabus on Udi's site.

I had a chance to work with Udi in the past and the solution we implemented utilized many of the patterns and techniques Udi covers in his course - so these are not just nice theories but rather real stuff that works.

 


 
Tags: Everything | SOA | Software Architecture

[crossposted from DDJ]

Yesterday I attended an interesting presentation on SOA by Dr. Donald F. Ferguson, chief architect for IBM's software group.

I was happy to hear him validate some of my thoughts on SOA (e.g., workflows are better kept inside services rather than outside, transaction boundaries should be inside a service, and so on). He also introduced a couple of things I didn't know much about (for example, OSGi, a SOA platform for networked devices that's not based on web services) and presented some nice insights (for instance, looking at the middleware as an infrastructure service and thus nicely unifying SOA and EDA).

One of the insights Donald presented was the use of heuristics as an aid to modeling and validating architectures. Some of the heuristics he mentioned include:

  • Occam's Razor -- avoid needless repetition
  • Don't create something new if you can compose existing stuff to get the same result
  • Externalize volatility -- don't put in the code things that are likely to change
  • Focus on "name,value" programming not "offset programming" -- make things easy to understand
  • Different is hard
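
Two of these heuristics translate quite directly into code. What follows is only a minimal sketch of how "externalize volatility" and "name,value programming" might look in practice; the setting names, the VAT rate and the order fields are all made up for illustration and do not come from the presentation.

```python
import json

# Hypothetical settings, inlined here only to keep the sketch self-contained.
# In practice they would live in a file or configuration service, outside the code.
SETTINGS_JSON = '{"vat_rate": 0.17, "currency": "USD"}'


def load_settings(raw: str) -> dict:
    """Externalize volatility: values likely to change are read, not hard-coded."""
    return json.loads(raw)


def price_with_vat(net: float, settings: dict) -> float:
    # The rate is looked up by name, so changing it requires no code change.
    return round(net * (1 + settings["vat_rate"]), 2)


# "Offset programming": the meaning of each position is implicit.
order_by_offset = ("A-17", 3, 9.99)
total_by_offset = order_by_offset[1] * order_by_offset[2]

# "Name, value programming": each value is addressed by what it means.
order_by_name = {"sku": "A-17", "quantity": 3, "unit_price": 9.99}
total_by_name = order_by_name["quantity"] * order_by_name["unit_price"]
```

Both totals are the same number; the difference is only in how much a reader has to remember to understand the second line of each pair.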


If you look at heuristics as an abstraction of experience, they can serve as a good tool for keeping yourself on the right track. Some heuristics are universal (maybe the ones mentioned above and a few others like "simplify, simplify, simplify" or "the original statement of a problem is probably not the best one or it may even not be the right one"), but the problem is, as always, deciding (in advance) which heuristics to apply to a problem.

If you are interested in using heuristics as an architect's tool you may want to look at "The Art of Systems Architecting", by Mark Maier and Eberhardt Rechtin. The book discusses the architectures of different system types (collaborative, IT, manufacturing, etc.) and provides heuristics for each of these systems.

Heuristics are a good tool to use when you design an architecture, and in a way the different design principles (e.g., the single responsibility principle) can also be considered heuristics. Nevertheless, it is very important to verify designs by additional methods, such as code and formal evaluation, and not rely on heuristics as the only tool.


 
Tags: Everything | SOA | Software Architecture

Last week I published a 3 part article on O/R mapping on my blog @ Dr. Dobb's Portal. The paper describes the benefits and costs of using O/R mapping and recommends when O/R mapping should be used.

Here it is as a single whitepaper: Architecture Dilemmas - OR Mappin.pdf (228.78 KB)


 
Tags: Everything | General | Software Architecture

[Crossposted from DDJ]

Following the series of posts that discussed whether architects should code (see Part I, Part II and Part III), I thought I'd expand on how I see the software architect's role.

To be a good architect, a person needs to have experience taking a project from inception to production and maintenance, should have project management experience, and should have spent time working directly with customers. On top of that, know-how and experience are needed in design and technology. Having diverse experience is what enables the architect to have a wide view of a project, understand the need for pragmatism, better understand stakeholders' concerns, and understand the implications of design and technology decisions (on scheduling and budgeting, for instance). Having wide experience also puts architects in a unique position where they can really help make projects meet their goals.

Software Architecture deals with the system's quality attributes. This makes the architect the ultimate person responsible for making sure the solution meets these quality goals (and I don't mean quality as in low bug count, but rather the soundness of the solution and its ability to address the various stakeholders' concerns).

To make this quality claim a reality I tend to take a holistic view of the architect's role, which, simply put, means that the architect needs to do whatever is needed to make the project go forward and succeed. Taking this direction toward quality is only possible if the architect has the wide range of experiences mentioned above.

How does that translate to real life? Here are a few examples of "non-architect tasks" from my experience:

  • On one project the system engineering team was working in the wrong direction in regard to analyzing requirements (this cost us six months of delay). I helped introduce the concept of Use Cases and created the project's initial use case model. (You can read the insights I have from my experience in my paper Use Case Methodology for large systems (pdf) )

  • On another occasion one of the project managers was trying to evaluate how a certain technology would help advance the project. I helped her construct the WBS of the activities needed to complete the needed functionality (assuming the technology is in place) and then helped her create the estimates.

  • On another project we had performance bottlenecks related to the technology used (ESRI, displaying vector maps with high refresh rates for displayed objects). I got together with another architect to pin-point the problem, diagnose it, and come up with a solution (mapping ESRI temp files to a RAM-disk).

  • I worked on a project where time was running out and a milestone was looming fast. I helped introduce the project to some SCRUM-like techniques (why only "SCRUM-like"? Remember that Rome was not built in one day either) and by working closely with the PM helped the team reach the milestone successfully (indeed not all the features were completed--but the ones that did work were the features important to the end-user, which made the milestone a success).

And the list goes on and on. And yes, coding can also be a part of this list (though for me it usually only translates to writing short proof of concepts--and for others it may mean coding some of the tasks with the team).

To sum up, a good architect is not just a lead developer on steroids. An architect should have a much wider range of capabilities and experience. Architects are "enablers". They should use their capabilities to help advance the solution and ensure its overall quality.


 
Tags: Everything | Software Architecture

June 4, 2006
@ 10:56 PM

 

I have amassed more than 30 patterns related to SOA (e.g. SOA Patterns - Decoupled Invocation and SOA Patterns - Aggregated Reporting which I previously published here). I have patterns around security, availability, scalability, composition, adding a UI etc. Some of the patterns are original (I think) and some are based on other people's work.

 

I am trying to decide whether or not it would be worthwhile putting all these patterns into a book. Writing a book is a very time consuming task (or so I am told) - so I thought I'd run a quick poll among the readers of this blog to see how many of you would be interested in reading (and buying) this book if it gets published.

I know this is not a representative crowd - but it can give me a (very) rough idea on the interest in such a book.

 

Please send any comments (comments like "forget it, no one would ever want to read anything you write" are also ok) to soa@rgoarchitects.com (or leave a comment here).

 

Thanks in advance - Arnon.


 
Tags: Everything | General | SOA | Software Architecture

May 29, 2006
@ 07:30 PM

[Edited version of post in DDJ]

I have been blogging for about a year now on this blog, and now also on the DDJ blog, so I think it is time to try something with more two-way communication.

Consequently, I am going to run a little experiment for a few weeks and see how it goes.

The idea is as follows: If you have an interesting architectural or design dilemma, drop me an email at ask@rgoarchitects.com. I'll pick one issue per week and post the dilemma (anonymously) on the DDJ blog, plus voice my opinion (and/or suggested solution)--and then everyone else can chime in with their comments and insight, which hopefully will shed some light on the subject.

I'd be interested to hear both your opinions on this initiative and, of course, interesting dilemmas you are facing. Again, send your dilemmas to ask@rgoarchitects.com.


 
Tags: .NET | Everything | SOA | Software Architecture | SPAMMED Process

I just finished blogging a series of posts on the 8 fallacies of distributed computing on my DDJ blog.

I think it turned out pretty well so I aggregated all the posts into a white paper. I hope you find it useful.


 
Tags: Everything | Software Architecture

[crosspost from DDJ]

Reading the comments on my previous two posts on whether architects should code (here and here) as well as the comments on Johanna Rothman's posts (here, here and here) leads me to a few observations:

The first apparent thing is that the issue is a very loaded one. Some people believe it is essential for architects to code, while others (like me) believe that their time is better spent on other issues. (That said, it seems that a small majority of the commenters think architects should code as part of the development team--at least for feedback purposes if nothing else.)

There is a wide consensus (me included) that architects should know how to code and have extensive experience in coding. It is also agreed that architects should be involved in the project--that is, not just drop off the architecture, then disengage.

I still believe that when the project is big enough (that is, big enough to warrant more than one team working on it) the project is better served by the architect getting involved in all the teams, rather than participating as a developer in one of them. If you are an architect and develop as part of the development team you are (or should be anyway) committed--meaning you need to deliver the piece of code under your responsibility at an acceptable quality level, like the other developers. That commitment is exactly why you would be less likely to deliver on your responsibility for the total quality of the project. (I assume some of the differences in opinion can be attributed to disagreement on what software architecture is, at least when compared to design.)

I also think those who think architects must code see the architect as some sort of lead developer. I don't buy that. The architect's role is much broader than that (see also this post by Kevin Seal, which also discusses this issue). I take a holistic view of the architect's role, which is making sure the product is deliverable. This may translate to the architect coding a module or two, but it can also translate to a lot of other things. Examples from my experience as an architect include preparing initial cost estimates, iteration planning, helping debug and test, solving installation problems, analyzing requirements, conducting design and code reviews, design, and prototyping (yes, that's coding, but as I said in the previous posts, that's not writing production code and it doesn't come with deadlines to meet, etc.).

I also liked a comment by Graham Oakses on one of Johanna's posts:

My experience is that an architect is pulled between three poles--the product, the team and the client. The product pole pulls you towards managing the "conceptual integrity" of the design. The team pole pulls you towards mentoring people, helping them build skills, etc (which may mean consciously letting someone write code that you could do much better yourself). The client pole pulls you towards translating between the technical and the client domains (which is often where you get pulled into powerpoint). You need to trade these poles off differently on every project...

To sum up, the answer to "should architects code?" is, like so many things in life, it depends.


 
Tags: Everything | General | Software Architecture

[Crossposted from my DDJ blog]

About the same time I wrote the post on whether architects should code, saying that architects should be able to prototype but shouldn't be part of the dev team (in the sense that the architect shouldn't get coding tasks that result in production code), Johanna Rothman wrote a blog post that claimed architects must code.

Two days ago she posted a more detailed explanation of her view. I agree with most of the points she made:

  1. Architects need to participate in the project (that is, not be some outsider who just drops her architecture on the team and leaves).
  2. The best way to test a design is to code and run it.
  3. It is beneficial for architects to know how to code.
  4. It is important that architects understand the implications of their decisions on the code and developers.

What I don't see is how architects taking coding tasks serves the greater good, versus their monitoring the teams that code and making sure all aspects of the architecture actually fit the problem and work. Again, this may work on smaller projects, but probably not on larger ones.

You may also want to look at two related posts I made in the past:
SAF Architecture Evaluation: Evaluation in Code talks about some of the ways architecture can be validated in code.
SAF Deployment: What to do when the architecture seems stable? talks about the architect's involvement in the project when they think the architecture is "finished".

A couple of points regarding the analogy Rothman uses--that is, architects who design bathrooms for hotels. First, building architects are seldom a good analogy for software architects (I once used the analogy as well); there are far too many differences (maybe I'll blog about that sometime in the future).

This brings me to the second point. The analogy doesn't serve Rothman's point well, since building architects never actually participate in laying bricks or installing bathrooms. The fact that hotel bathrooms are not comfortable means that this quality was low on their priorities. In any event, to verify that a bathroom is usable you don't have to install it, just use it. (If you do take the analogy, you don't have to code it, just stick around to see what's going on.)


 
Tags: Everything | General | Software Architecture

Here is another SOA pattern from the list of patterns I am publishing.


One of the core goals of going with SOA is to enable loose coupling. The request-reply communication pattern, which is very prevalent, inhibits this decoupling. The problem is on the side of the caller or consumer of the service: the consumer is dependent on the timely response of the called service for its normal operations. To help alleviate the consequences of this dependency, the service that is consumed should maintain QOS (Quality of Service) as part of its contract (it doesn't have to be part of the machine-readable contract, but it needs to be defined and adhered to). Consider, for example, an on-line music store. On a normal business day it can have a few thousand purchases nicely distributed around the clock. Then, when a new <name your favorite band here> album debuts, it can see peaks much higher than its usual request load. It still needs to be able to handle all incoming requests or the (potential) buyers will take their business elsewhere.

 

How can I maintain a level of QOS, handle peaks and high-loads without my service failing?

 

One option is to estimate the peak loads and get enough computation power to ensure you can handle them. This causes two problems. One is waste: you can have machines just sitting there twiddling their thumbs, so to speak, yet the idle computers still have purchase, maintenance and operational costs. The other is unexpected loads (e.g. a Harry Potter craze for an Amazon-like site): the estimated load might not be enough.

 

Ensuring QOS gets even more problematic when some of the actions performed in the service access resources/services that are not under the service's control (e.g. talking to a credit card clearing service in the e-commerce example mentioned earlier).

 

Another issue that needs to be taken care of is prioritizing requests. A service most likely handles several types of requests, and not all of them need the same level of QOS. You can set the QOS according to the most demanding request type, but then you may need more resources.

 

Decouple the invocation - separate the reply from the request: acknowledge receipt in the Edge, pass the incoming request to a queue, and load-balance and prioritize behind the queue.

 

 

Making the Edge acknowledge the receipt of the request (for our e-commerce example this can translate to "Your order has been received and is being processed, you will get a confirmation email when the transaction completes") allows hiding operations that take a long time from the service consumers (be they other services or end-users).

 

Writing requests to the Queue is a relatively low-cost operation that can be performed fast, thus allowing the service to absorb request peaks. The actual handling of the incoming requests can be performed more slowly, according to the available resources of the service. The load balancing can be done by setting a different number of readers working against the queue.

 

Making the Queue a Priority Queue (or having several queues according to priority) allows for maintaining different levels of QOS for different message types.
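
To make the moving parts more concrete, here is a minimal in-process sketch of the idea: an Edge that acknowledges immediately, a priority queue behind it, and a configurable number of readers. It uses Python's standard queue and threading modules as a stand-in for real messaging infrastructure, and the request types, priorities and class names are all assumptions made up for the example.

```python
import itertools
import queue
import threading

# Lower number = handled first. The request types are made up for the example.
PRIORITY = {"purchase": 0, "catalog_refresh": 5}


class Edge:
    """Queues incoming requests and acknowledges them right away."""

    def __init__(self, requests):
        self._requests = requests
        self._ids = itertools.count(1)

    def submit(self, kind, payload):
        request_id = next(self._ids)
        # The unique id breaks priority ties in arrival order, so the
        # payload dicts are never compared with each other.
        self._requests.put((PRIORITY[kind], request_id, kind, payload))
        return {"request_id": request_id, "status": "received"}


def _read_loop(requests, handle):
    while True:
        _priority, request_id, kind, payload = requests.get()
        try:
            handle(request_id, kind, payload)  # the slow, real work
        finally:
            requests.task_done()


def start_readers(requests, handle, count):
    """Start `count` daemon readers; more readers drain the queue faster."""
    for _ in range(count):
        threading.Thread(
            target=_read_loop, args=(requests, handle), daemon=True
        ).start()


requests = queue.PriorityQueue()
edge = Edge(requests)
ack = edge.submit("purchase", {"sku": "A-17"})  # returns before any work is done
```

The acknowledgment carries a request id so that a later reaction can be related back to the original request. A real implementation would of course use a durable queue rather than an in-memory one, otherwise acknowledged requests are lost if the process dies.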

Decoupled Invocation can be combined with the Gateway pattern to allow scaling out the service.

 

Decoupled Invocation is enhanced by the use of the Correlated Messages pattern, which helps relate the request and the reactions.

 

Acknowledge in the Service

Sometimes the initial response needs to involve some business logic and is not just an acknowledgment. In this case the Edge doesn't respond; it just passes the request to the service, and the service sends both the initial reaction and the final reaction.

 

 


 
Tags: Everything | SOA | Software Architecture

May 8, 2006
@ 10:55 PM

Michael Platt talks about SOA vs. Web 2.0. He provides several links to blogs and articles that basically claim that SOA is dead and long live the new king, Web 2.0.

 

One thing I have to say is that if indeed the hype around SOA is starting to calm down - this is a very good sign. Finally we can go about adding SOA to our toolset and use it when it is appropriate (not just because management has got to have an SOA). Also it can be a good sign that SOA is maturing.

Another point I'd like to make is that SOA and Web 2.0 are not really related - there is no reason why one should compete with the other. Why would using an AJAX front-end make it impossible to have Services in the backend? (It may be appropriate to have a Client/Server/Service scenario, where the front-ends don't hit the services directly; the other option is Peer/Service. I may talk about these 2 mini-patterns in my SOA pattern series.) Another example where SOA and Web 2.0 can work together is RSS. A service can expose its list of recent changes as an RSS feed (as well as providing the more "traditional" web-services API). Exposing an RSS feed can be one way to implement the Inversion of Communication pattern I mentioned in a previous post.
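
As a rough illustration of the RSS idea, here is a minimal sketch of a service rendering its recent changes as an RSS 2.0 document using only Python's standard library. The function name, the "Orders" service, the example.com link and the shape of the change records are all hypothetical.

```python
import xml.etree.ElementTree as ET


def changes_to_rss(service_name: str, link: str, changes: list) -> str:
    """Render a service's recent changes as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # title, link and description are the required channel elements.
    ET.SubElement(channel, "title").text = f"{service_name}: recent changes"
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Changes published by the service"
    for change in changes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = change["summary"]
        # The guid lets consumers recognize changes they have already seen.
        ET.SubElement(item, "guid").text = change["id"]
    return ET.tostring(rss, encoding="unicode")


feed = changes_to_rss(
    "Orders",
    "http://example.com/orders",
    [{"id": "order-1", "summary": "Order 1 created"}],
)
```

Consumers poll the feed at their own pace and pick up whatever is new, so the publishing service does not need to know who they are.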

 

To sum things up - Web 2.0 may be more hyped today than SOA, but Web 2.0 and SOA can co-exist and actually complement each other.

In any event I think we (as an industry) should focus more on delivering great applications and solutions rather than fight about whose trend-du-jour is fancier or sexier.

 

[Edited]

After writing about the example of using RSS for Service communication, I stumbled today on RSSBus, which is an effort to create an ESB on top of the RSS protocol...


 
Tags: Everything | SOA | Software Architecture

April 27, 2006
@ 02:58 PM

[crosspost from Dr. Dobb's Portal]

Test Driven Development (TDD) is, in a nutshell, writing a unit test up front, watching it fail, making it pass, refactoring, and repeating until the product is finished. (If this is new to you, read more at testdriven.com.)

So with TDD you get a bunch of unit tests that are also proven as regression tests. That's pretty cool.

TDD also lets you work in small increments while maintaining the working code. That's even cooler.

And lastly TDD has a very good influence on design:

  1. It encourages loose coupling. When you want to make something testable, you want to remove the dependencies it has so you can test it by itself.
  2. It makes you think about the interface of the unit under test--how is the interface going to look?
  3. It makes you think about how the unit under test would be used--that is, the behavior of what you are writing (or designing).
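
The three points above can be seen in even a tiny example. The sketch below uses Python's unittest module; the Greeter class and its injected clock are hypothetical names invented for the illustration. Writing the test first is what pushes the time source out of the class and into a parameter.

```python
import unittest


class FakeClock:
    """Test double standing in for a real time source."""

    def __init__(self, hour):
        self._hour = hour

    def current_hour(self):
        return self._hour


class Greeter:
    def __init__(self, clock):
        # The clock is injected rather than read from the system, which is
        # the loose coupling that testability tends to force on the design.
        self._clock = clock

    def greet(self, name):
        part = "morning" if self._clock.current_hour() < 12 else "afternoon"
        return f"Good {part}, {name}"


class GreeterTest(unittest.TestCase):
    # In TDD this test is written before Greeter exists; it pins down both
    # the interface (greet(name)) and the expected behavior.
    def test_greets_by_time_of_day(self):
        self.assertEqual(Greeter(FakeClock(9)).greet("Ann"), "Good morning, Ann")
        self.assertEqual(Greeter(FakeClock(15)).greet("Ann"), "Good afternoon, Ann")
```

Saved as a module, the test can be run with `python -m unittest`. Note that nothing here says anything about how Greeter fits into a larger system, which is exactly the limitation discussed next.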

Sounds great to me. I think TDD is a great way to do the detailed design. You specify the results (interface + behavior), then implement that design. One thing I don't buy, though, is that TDD alone will produce an "emergent design" for the whole system. The way I see it, you have to do some design up-front (assuming your system is not a trivial one) since TDD, being a coding technique, keeps you working at sea-level.

There's also a fundamental matter of scale--it might be possible in theory to start that 100 man-year project as a single object, then refactor it in baby-steps until you get the perfect system. However, I believe that if you don't work at a higher level of abstraction (vs. code), you will not be able to partition the system in a reasonable time. This was true when we moved from assembly code to higher level languages, which enabled us to write much more complex software--and it is true today as we need to answer the ever-changing requirements of modern enterprises.

To sum up, TDD is good for testing and it is a good design methodology for the detailed design level. It can be used to drive the overall design on smaller projects--but on larger systems we need additional methods and tools to cope with the overall design and architecture.


 
Tags: Everything | General | Software Architecture

April 18, 2006
@ 07:11 PM

There are 113 domain models of all sorts of things over at the eclipse.org site. I guess most models don't have any practical value (what will I do with a metamodel of COBOL?) but there are several interesting ones, for things like RSS, WSDL, and SDM.

 

It is also interesting to note that the models are expressed in several forms, including MS DSL models, UML and images. The transformation from format to format was done by code.

 

(Found via Steve Cook)


 
Tags: .NET | Everything | Software Architecture

As promised, here is the first pattern. If you like this pattern but think something is missing that would help you understand it better, please drop me an email: arnon at rgoarchitects.com. Naturally any other comments are also welcome :)




Getting an SOA right is very hard, not so much because of the technical problems (we know how to deal with those, don't we?), but rather because it is very hard to figure out where to put the borders and keep the right business alignment. Assuming you somehow managed that, the real fun begins - you now have to produce reports, dozens and dozens of reports. Many reports will fall within the boundaries of a single service (if you have a good partition); however, many reports will also require combining data from several services. For example, in a Telco scenario, you may have Customer, Billing and Provisioning services (a real-life example would have dozens of additional services). Now a customer is calling customer care and you want the CRM to show everything about the customer: what outstanding invoices she has, what equipment and services (GPRS, UMTS, friends and family etc.) she has, what her status as a customer is (loyal, VIP, senior citizen …), open service requests etc. Things get much more complicated when you need to summarize or group data from multiple services.

 

How do you get a decent cross business entities report with the data scattered about in all those services?

 

One possible solution would be to create the report at the consuming end (e.g. the UI): visit all of the services involved, then do all the grouping, cross-cuts etc. This solution is not very good from the performance perspective (you need to get more data than needed and you have to post-process it). It is also problematic from the flexibility perspective: each service involved has to expose interfaces to get the data for the specific query (otherwise you move even more data around).

 

Another option is to go straight to the data. You may still need to hit multiple database servers to get to the data, but the performance will be better. The problem is that this throws your service boundaries down the drain and introduces a lot of dependencies.

 

A third option is to create interim Services ("Entity Aggregation") - this works fine as long as you have real business reasons to do the aggregations (there is an overhead in adding business logic to handle the aggregated data) and as long as you only have a few of those (or you might end up with a single "service" with all the business logic).

 

Create an Aggregated Reporting Service by building an Operational Data Store (ODS) to enable creating sophisticated reports on otherwise dispersed data.

 

AggreagatedReporting.PNG


 

 

The ODS is similar in concept to a data mart, i.e. the data is subject based, integrated, scrubbed etc. However, the main differences are that the data is up-to-date and that there is little or no history.

 

For incoming data, the Aggregated Reporting Edge performs the data transformations from contract data into reporting data. The service updates the ODS by scrubbing the data (this can be limited unless the data has to go on to a data mart / data warehouse), then integrating it and de-normalizing it into subjects. Incoming report requests fill in parameters for the pre-prepared reports.
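
Here is a minimal sketch of that flow, using an in-memory SQLite database as a stand-in for the ODS. The table layout, the event shapes and the function names are assumptions made up for the example; a real ODS would be fed by the other services' published data and would need far more careful modeling.

```python
import sqlite3

# In-memory database standing in for the ODS; one de-normalized row per customer.
ods = sqlite3.connect(":memory:")
ods.execute(
    "CREATE TABLE customer_summary ("
    " customer_id TEXT PRIMARY KEY,"
    " name TEXT,"
    " open_invoices INTEGER DEFAULT 0)"
)


def _ensure_row(customer_id):
    ods.execute(
        "INSERT OR IGNORE INTO customer_summary (customer_id) VALUES (?)",
        (customer_id,),
    )


def on_customer_changed(event):
    """Data published by the (hypothetical) Customer service."""
    _ensure_row(event["customer_id"])
    ods.execute(
        "UPDATE customer_summary SET name = ? WHERE customer_id = ?",
        (event["name"].strip(), event["customer_id"]),  # light scrubbing
    )


def on_invoice_issued(event):
    """Data published by the (hypothetical) Billing service."""
    _ensure_row(event["customer_id"])
    ods.execute(
        "UPDATE customer_summary SET open_invoices = open_invoices + 1"
        " WHERE customer_id = ?",
        (event["customer_id"],),
    )


def customers_with_open_invoices(minimum):
    """A pre-prepared report; the request only fills in the parameter."""
    rows = ods.execute(
        "SELECT customer_id, name, open_invoices FROM customer_summary"
        " WHERE open_invoices >= ? ORDER BY customer_id",
        (minimum,),
    )
    return rows.fetchall()


on_customer_changed({"customer_id": "c1", "name": " Dana "})
on_invoice_issued({"customer_id": "c1"})
on_invoice_issued({"customer_id": "c1"})
```

The report query touches a single table even though the data originated in two services, which is the whole point of paying for the ODS.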

 

One problem with Aggregated Reporting is that it is not a Business Service (i.e. it is a technical solution rather than a business oriented one). However, since, unlike Entity Aggregation, the data in Aggregated Reporting is read-only, this doesn't affect the business.

 

Aggregated Reporting is easier to implement when combined with Inversion of Communication.

 

Aggregated reporting with Data Mart/Data Warehouse

 

Instead of just storing recent operational data, this version enhances the depth and complexity of queries that can be executed against the service. The downside is the increased complexity in setting up the data mart - both from the operational costs perspective (e.g. additional storage) and from the design and development perspective (you need to think about long term aspects, indexing etc.) as you also need to scrub data and consider the structure of your schemas much more carefully.

 

 


 

Sidebar: Operational Data Store (ODS)

The ODS is probably the best kept secret of data warehousing technology. It has been around almost as long but it isn't as famous.

The data in the ODS is operational - live data rather than static data. The ODS can be thought of as the cache memory of the data mart / data warehouse.

It is important to note that while it doesn't need the same amount of planning and set-up as a data mart, an ODS still requires careful planning in order to bring real business value.

 

The figure below shows the classical usage of an ODS in an OLTP/Data Mart environment.

 

ODS.PNG

Originally it was thought there would be 4 types of ODS:

 

Class I - Near real-time synchronization of the ODS with operational data from the OLTP databases. An implementation of Class I is the preferred type for the Aggregated Reporting pattern.

Class II - Update the ODS every four hours or so

Class III - Overnight updates of the ODS

Class IV - the ODS is updated from the data mart / data warehouse

 

In reality there are more variants - for example a powerful (and complex to build) option is to merge a Class IV ODS with one of the other Classes and get.

 

 


 
Tags: Everything | SOA | Software Architecture

April 13, 2006
@ 10:29 PM

I decided to write a short series of blog posts on SOA patterns. These are not patterns that are only usable for SOA; however, I have found them particularly useful in implementing SOAs.

 

This isn't an exhaustive list of patterns - on the contrary, I'll try not to repeat patterns which are well known (like Entity Aggregation http://patternshare.org/default.aspx/Home.PP.EntityAggregation )

 

I am a little busy these days (e.g. I have to complete an architecture document for one of my projects), so this post will only introduce the (first batch of) patterns. The following posts in the series will expand on each one (i.e. explain what to do, usage context, consequences etc.). Then, if I get good feedback, maybe I'll publish some more.

 

So, what patterns are we talking about here?

Well:

 

  • Gateway - How do you scale a service without exposing too many endpoints?

 

  • Inversion of Communication - How do I get the data from other services without too much coupling?

 

  • Biztalkize - How do I control volatile behavior inside the service?

 

  • Aggregated Reporting - How do you get a decent cross business entities report with the data scattered about in all those services?

 

  • Transparent Emergence - How do I know where to find a service?

 

  • Decoupled invocation - How can I handle peaks and high-loads without my service failing?

 

  • Orchestrated Choreography - How do I expand the behavior of a hard-to-change service (e.g. legacy systems exposed as services)?

 

Well, I hope this sparks enough interest to make you follow the rest of the posts on this subject :)


 
Tags: Everything | SOA | Software Architecture

Uncle Bob (apparently Robert C. Martin?) writes about Architecture as a secondary effect.

The article postulates that: <Quote>
  1. The main goal of architecture is flexibility, maintainability, and scalability.
  2. But we have learned that the kind of unit tests and acceptance tests produced by the discipline of Test Driven Development are much more important to flexibility, maintainability, and scalability.
  3. Therefore architecture is a second order effect and tests are the primary effect

 

I had not thought about this before this round table. Here we were, a bunch of architects and designers, strongly debating the role and procedure of architecture, and the conclusion we come up with is that all the effort and struggle we go through results in a secondary improvement in flexibility, maintainability, and scalability. Writing tests (writing them first) has the primary effect

 

</Quote>

I think Bob is missing/downplaying one very important aspect - and that is level of abstraction.

 

  • Many agree (me included) that code is the final design artifact.
  • Many agree (again me included) that TDD is a powerful design technique (you may want to check out TDD Misconceptions or Rocky Lhotka vs. the world as summed up by Jeremy Miller)

 

However, since code is (obviously) detailed design, you just can't go straight to code (test code or otherwise). In order to cope with a large/complex problem you need to tackle the problem at higher levels of abstraction first. Moreover, there may be a need to go through several abstraction levels before you start coding anything. Unfortunately there aren't any real options to test models*.

 

In my opinion you cannot escape designing architecture in other (non-code) models for any but the most trivial system.

 

The way I see it the correct approach is to

  1. Work iteratively
  2. Test early - i.e. make sure that the architecture designed really works as soon as possible (see, for example, my post on evaluating architecture in code)

 

So - Is Architecture a secondary effect?

No, sorry, but I really, really don't think so.



* There are some options to allow simulation and validation (i.e. tests) of models in the embedded world (e.g. http://www.embeddedplus.com/EmbPlusSMST.php or http://www.ilogix.com/sublevel.aspx?id=286) - however I find that these approaches don't scale well to IT problems (which have a lot more variables and are usually much larger than embedded systems) or even to complex embedded systems. You just have to specify too much before you get a usable simulation, rendering the whole effort useless.



 
Tags: Everything | Software Architecture

April 9, 2006
@ 10:16 PM

One of the roles of the software architect is to act as a mentor/coach. Reviewing some of the designs in one of my projects' teams it seems the time was ripe for doing just that. Thus, last week I gave them a presentation on the basics of good OO design  - which I thought might also be of interest for other people (you can download a copy here - 312KB).

 

The presentation starts with the  7 deadly sins of software design:

  • Rigidity – make it hard to change
  • Fragility – make it easy to break
  • Immobility – make it hard to reuse
  • Viscosity – make it hard to do the right thing
  • Needless Complexity – over design
  • Needless Repetition – error prone
  • Not doing any

 

It is interesting to note that just yesterday I read an interesting piece on what makes good design (i.e. looking from the positive side) by James Shore (found via Sam Gentile).

 

 

The main part of the presentation demonstrates the 5 basic design principles (drafted by people like Robert C. Martin , and Barbara Liskov ):

  • OCP open-closed principle - a class should be open for extension but closed for modifications
  • SRP single responsibility principle - a class should have a single responsibility
  • ISP interface segregation principle - there should be separate interfaces for different consumer types
  • LSP Liskov substitution principle - basically design by contract - a sub-class should fulfill the same expectations its superclass sets
  • DIP dependency inversion principle - classes should depend on abstractions, class consumers should depend on abstractions, and abstractions shouldn't depend on details.
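To make a few of these principles concrete, here is a minimal Python sketch. It is not taken from the presentation; the class names (MessageSender, EmailSender, SmsSender, AlertNotifier) are hypothetical and only illustrate DIP, OCP and SRP working together.

```python
from abc import ABC, abstractmethod


class MessageSender(ABC):
    """Abstraction that both the high-level class and the details depend on (DIP)."""

    @abstractmethod
    def send(self, recipient: str, text: str) -> str:
        ...


class EmailSender(MessageSender):
    def send(self, recipient: str, text: str) -> str:
        return f"email to {recipient}: {text}"


class SmsSender(MessageSender):
    # Added later without modifying AlertNotifier: open for extension,
    # closed for modification (OCP).
    def send(self, recipient: str, text: str) -> str:
        return f"sms to {recipient}: {text}"


class AlertNotifier:
    """Single responsibility (SRP): decide what to say; delivery is delegated."""

    def __init__(self, sender: MessageSender) -> None:
        self._sender = sender  # the dependency is handed in, not created here

    def notify(self, recipient: str, problem: str) -> str:
        return self._sender.send(recipient, f"ALERT: {problem}")


by_email = AlertNotifier(EmailSender()).notify("ops", "disk full")
by_sms = AlertNotifier(SmsSender()).notify("ops", "disk full")
```

AlertNotifier never names a concrete sender, so swapping the delivery mechanism is a change at the point where the object is constructed rather than inside the class.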

 

These principles are the basis for some of the techniques widely used today - a few examples include:

Inversion Of Control - builds on OCP

Dependency Injection - a mechanism to allow DIP

Contract First - building on LSP,DIP

 

At the end of the day, following these principles helps manage class dependencies and increases loose coupling and cohesion, thus increasing the overall quality of the design. It sometimes amazes me how using just a few simple rules can improve the maintainability, flexibility and usefulness of designs so much.


 
Tags: .NET | Everything | General | Software Architecture

March 31, 2006
@ 12:46 AM
For those of you who are following my writing on SAF (or the Milestone 1 post for more details) - here is an introductory presentation to SAF (1.2 Mb). I delivered a slightly modified version of this 2 days ago and someone commented he would like to see more information on strategies to deal with quality attributes. I'd be happy to hear any other comments you may have


 
Tags: Everything | Software Architecture | SPAMMED Process

Last week Beat Schwegler and Ingo Rammer visited Israel with the Software Factories Seminar  (The link is to the same seminar presented in  Finland -  videos are in English)

 

Software Factories is not a new idea - see for example "Software reuse: From Library to Factory" by M. L. Griss (published in 1993(!)) which talks about "Software Factories" and "Domain Specific kits": components, frameworks, glue languages etc. The current Microsoft incarnation of Software Factories takes a similar approach, focusing on Domain Specific Languages and Frameworks but also adding important aspects like multiple viewpoints, patterns and designers. The idea is that building on modern technologies, as well as learning from the mistakes of sister approaches to code generation (OMG's MDA, in case you are wondering), will enable us to build something that is usable.

 

Microsoft seems to be taking some steps in the right direction (GAT is probably the best example). Nevertheless there is still a long way to go before we can realize the dream of "factories" for vertical applications

 

This is evident if you take a look at the current crop of DSL examples. These are either horizontal languages and tools (i.e. not aimed at a specific business domain) like the UIP designer or the GAT4WS that Beat talked about at the event or, even worse, UML designers... ('nuff said).

 

The other thing is, that developing real factories, and getting them right is a really complicated  task, which requires a lot of domain knowledge and effort. In the recent Microsoft Architect's Journal  there's an article by Jack Greenfield and Mauro Regio regarding a software factory for the Health Level 7 ( a standard for Health organization collaboration) - As far as I know this project has been under way for more than a year now (first time I heard about it was Feb. 2005) and yet the article ends with :

 

"Our experience in developing a factory for HL7 collaboration ports

has shown that we need to define better frameworks, tools, and processes

to specify the factory schema, to manage factory configuration

in a flexible and extensible way, and to better understand how

and when domain-specific languages should be used."



My experience with MDA shows similar results. Nevertheless, they continue to say that:

 

"At the same time, initial implementations of extension mechanisms like GAT and

DSL have proven their value, filling significant gaps in software factory

infrastructure, and pointing to future innovation in that area."

 

So, maybe there's hope after all :)


 
Tags: Everything | Software Architecture

I just read an excellent post by Gregor Hohpe talking about the motivation for Event Driven semantics for service communication.

Gregor gives an example of a shipping service listening on order events and address change events to produce shipments.

It is nice to see how well architectural approaches transcend business domains - in the Naval C4I project Udi Dahan and I are working on, we basically try to take the same approach. For example: a Sensors service publishes its status at a predefined interval - the sensor knows if something is wrong with its state. A sensor, however, doesn't know if the problem is important or not. We designed an Alerts service that listens in on status messages; based on (changing) business rules, a certain status may trigger an alert event (which a UI can then choose to display), and a severe alert may result in an SMS alerting a technician to come and have a look.
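The Sensors/Alerts interaction can be sketched in a few lines of Python. This is only an illustration and not the project's code: the in-process EventBus, the topic names, the temperature field and the thresholds are all assumptions made up for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Event = dict
Handler = Callable[[Event], None]
Rule = Tuple[Callable[[Event], bool], str]  # (predicate, severity)


class EventBus:
    """In-process stand-in for whatever messaging infrastructure is really used."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> None:
        for handler in list(self._handlers[topic]):
            handler(event)


class AlertsService:
    """Listens to status events and decides, by rule, whether they matter."""

    def __init__(self, bus: EventBus, rules: List[Rule]) -> None:
        self._bus = bus
        self.rules = rules  # plain data, so the business rules can change at runtime
        bus.subscribe("sensor.status", self._on_status)

    def _on_status(self, status: Event) -> None:
        for predicate, severity in self.rules:
            if predicate(status):  # first matching rule wins
                self._bus.publish(
                    "alert", {"sensor": status["sensor"], "severity": severity}
                )
                return


bus = EventBus()
AlertsService(bus, rules=[
    (lambda s: s["temperature"] > 90, "severe"),   # hypothetical thresholds
    (lambda s: s["temperature"] > 70, "warning"),
])

shown_on_ui: List[Event] = []
sms_sent: List[str] = []
bus.subscribe("alert", shown_on_ui.append)  # a UI choosing to display alerts
bus.subscribe(
    "alert",
    lambda a: sms_sent.append(a["sensor"]) if a["severity"] == "severe" else None,
)

# The sensors only report their status; they never decide what is important.
bus.publish("sensor.status", {"sensor": "radar-1", "temperature": 65})
bus.publish("sensor.status", {"sensor": "radar-1", "temperature": 75})
bus.publish("sensor.status", {"sensor": "radar-2", "temperature": 95})
```

Note that the publisher of the status knows nothing about alerts, UIs or SMS; each listener attaches its own meaning to the event.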

 

However, while this approach is very good for inter-service communication, things aren't as rosy when it comes to interacting with UIs. The point is that UIs are based on interaction, so the request-reply idiom (which should actually be implemented as request-reaction) is much more prevalent. Users really want to know their request is being taken care of.

 

Another lesson we learnt is that since services may go on-line and off-line independently of each other, it is not enough just to support event listening for event aggregators to be up-to-date. One option is to rely on reliable messaging so that any event posted will eventually get to the listener - there are several problems with this approach, for example:

  •  For one, you need a reliable message transport which might be a problem e.g. you may not be able to use JMS/MSMQ between enterprises and/or the protocols you use don't support it (e.g. WS-RM is not durable see here  and here )
  • Even if you have reliable communication,  if one service has been offline for a long period of time (where long is defined by the communication load) - it may be a waste of time (or plainly wrong)  to process old events that are no longer valid

Another option to handle this situation is to include in the contract a request for the current state (the current state can be published using the same message structure used by the matching event). The advantage here is that a server coming on-line can quickly and efficiently get up-to-speed on the current situation.
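Here is a rough Python sketch of the "request for current state" idea. The names (SensorService, StatusBoard, SensorStatus) and the sequence-number check are assumptions for illustration only; the point is that the snapshot reuses the same message structure as the events.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SensorStatus:
    """One message structure, used both for events and for the snapshot."""
    sensor_id: str
    state: str
    sequence: int


class SensorService:
    def __init__(self) -> None:
        self._latest: Dict[str, SensorStatus] = {}
        self._listeners: List[Callable[[SensorStatus], None]] = []
        self._sequence = 0

    def subscribe(self, listener: Callable[[SensorStatus], None]) -> None:
        self._listeners.append(listener)

    def report(self, sensor_id: str, state: str) -> None:
        self._sequence += 1
        status = SensorStatus(sensor_id, state, self._sequence)
        self._latest[sensor_id] = status
        for listener in self._listeners:
            listener(status)

    def current_state(self) -> List[SensorStatus]:
        """Contract operation: the latest status of every sensor, no history."""
        return list(self._latest.values())


class StatusBoard:
    """An event aggregator that may come on-line long after the service."""

    def __init__(self) -> None:
        self.view: Dict[str, SensorStatus] = {}

    def on_status(self, status: SensorStatus) -> None:
        known = self.view.get(status.sensor_id)
        if known is None or status.sequence > known.sequence:
            self.view[status.sensor_id] = status  # ignore anything stale

    def catch_up(self, service: SensorService) -> None:
        for status in service.current_state():  # same handler as for live events
            self.on_status(status)
        service.subscribe(self.on_status)


service = SensorService()
service.report("radar-1", "ok")
service.report("radar-1", "fault")   # happened while the board was off-line
service.report("radar-2", "ok")

board = StatusBoard()
board.catch_up(service)              # one request instead of replaying old events
service.report("radar-2", "fault")   # live events keep flowing afterwards
```

Because the snapshot and the events share a structure, the aggregator needs only one code path to process both.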

 

The event thinking is relatively on par with the take-it-or-leave-it approach to contract construction, but as I said in the previous post on contracts, I think it is more beneficial to know about your consumers and take their input into account.

 

Apropos EDA, I also learnt today that the Micro-Services strategy Udi and I came up with had already been "invented" several years ago. It is called SEDA (Staged Event Driven Architecture); there's a nice presentation explaining it here.


 
Tags: Everything | SOA | Software Architecture

Yoni Commented  on my SAF - Deployment post :

 

"Somewhere in your blog you've mentioned Martin Fowler's "Who needs an Architect" article in a positive way. However, it seems that in contrast to Fowler's notions of an architect who's job is to "remove architecture by finding ways to eliminate irreversibility in software designs" - you are advocating making long-term binding decisions in the initial stages of the development cycle"

 

It is very hard to argue with Martin Fowler (as Gregor Hohpe put it in a recent post "Now who would want to argue with Martin Fowler? His opinions are facts :-)"). However, I think that devising solutions for flexibility (such as the database schema example in the article) is also an architectural decision. What makes things even "worse" is that these decisions are usually made at the expense of other quality attributes (e.g. performance) or other project constraints (e.g. time to market) - meaning that a decision on flexibility is a tradeoff, or in other words an architectural decision "par excellence".

 

Thus - architectural decisions, by definition, are early - whether we like that or not. Furthermore, flexibility is an important quality attribute, one the architect brings to the table (in the same way that other stakeholders bring their concerns - e.g. the project manager who wants the project to end on time and on budget). It is the architect's role to balance quality attributes, understand the tradeoffs and make the decisions that will enable the project to achieve its goals. Many of these decisions have to be made early so that the project can move on; some of these decisions have to be made early to prevent ad-hoc architecture (un-balanced, spur of the moment decisions that will be hard to change). I believe very few decisions can be postponed without doing anything (i.e. without making some active decision to introduce flexibility).


 
Tags: Everything | Software Architecture

I've just found out (via Gianpaolo's blog) that Roger Wolter (former PM of Service Broker) started blogging. He is going to focus on Data in a Service Oriented world. I had a chance to work with Roger for a short time, which was enough to notice that if anyone knows about data, it is him. I guess there is no surprise there, considering his past at Microsoft working on: SQL Server Service Broker, SQL Express, SQL XML Datatype, SOAP Toolkit, SQLXML and COM Plus.

His first post (after the obligatory "hello world" post) is about Service Broker positioning (vs. MSMQ, Biztalk and WCF) - Subscribed


 
Tags: .NET | Everything | SOA | Software Architecture

February 22, 2006
@ 11:17 PM

It took a while but I finally managed to complete describing all the activities of SAF at least at some level of detail.



What is SAF?


There is very little guidance on how one can go about designing/developing an architecture for a software project. The SPAMMED architecture framework (SAF) aims  to help fill this gap. 

 

SAF is a set of activities that an architect can follow when she sets out to design an architecture. These activities help the architect to keep abreast of the project's needs and the drivers that affect the architecture.

 

Overview posts:

Getting SPAMMED for architecture

SPAMMED State chart

 

 

SAF Activities


The Activities of SAF include

  • Stakeholders - identify the stakeholders - anyone with a vested interest in the project (end users, clients, project manager, developers etc.). These are the people you will have to explain your architecture to. These are the people that have concerns that the architecture will have to satisfy (or most likely balance). Thus, the first step is to identify and rank them.

 

  • Principles – list Principles, Goals and Constraints. These are the properties you wish your architecture to have (or lack) based on your previous experience. Constraints can also come from your stakeholders (e.g. management decided that all projects should be .NET, a deadline etc.)

 

  • Attributes – discover quality attributes, the non-functional requirements, that (once prioritized) serve as the guide for the overall goodness of the system (Performance, Availability, Scalability etc.)

 

 

  • Model – model (and document if needed) the architecture as seen from different viewpoints (the list of viewpoints is stakeholder driven). Examples of viewpoints include package diagrams, deployment diagrams, DB Schema etc.

 

  • Map – Technology mapping, buy vs. make decisions etc.

 

  • Evaluate – Since architecture is the set of decisions that are hardest to change, it is worthwhile to spend some time trying to evaluate if they are indeed correct before commencing on

 

  • Deploy – Software architectures are not a fire and forget thing. As an architect you still have to make sure that the guidelines set are indeed followed and even more importantly that the architecture chosen indeed matches the project’s needs and doesn’t have to be reworked.

 

Feedback & Future


I would be very interested to hear any feedback on SAF - you can send  feedback to saf@rgoarchitects.com or just leave a comment

 

In the next SAF posts I am going to talk about SAF alignment with development methodologies (RUP, MSF 4, SCRUM etc.) as well as strategies for following or acting upon the activities.

 

 

 

 

 




 
Tags: Everything | Software Architecture | SPAMMED Process

The last activity of SAF is deployment of the architecture.  This step can make the difference between an ivory-tower architect and one whose designs are actually used in real software projects.

Deployment of the architecture actually  means two things

  1. Verification and feedback loop. - making sure the architecture is actually the right one.
  2. Governance - making sure that the architecture is followed by the designers and developers

Verification and Feedback loop

It is common practice to think of software development as an iterative process. Knowing that software architecture is early and that it encompasses the important decisions which are hard to change, it is probably wise to think of the first few iterations as architectural ones. You would try to work with the team and form the major abstractions, hopefully come up with a blueprint for processes etc. However, the fact that you've decided only the (let's say) first two iterations are architectural ones doesn't mean that new non-functional requirements won't emerge during later iterations. Since practice shows us it is (almost always) futile to try to analyze "all" the requirements before we do any design - this is almost bound to happen.

Be ready to monitor the project's progress after the "architectural iterations" to see and deal with any emergent requirements. As the stakeholders (e.g. product owner, project manager and the architect) try to prioritize tasks/requirements from iteration to iteration, hopefully most of the architectural issues would both surface and be taken care of in the first few iterations (remember - architectural decisions are the ones that are hard to change…) - but unfortunately this isn't always so. I recently audited the architecture of a project (which has been running for more than a year) where they identified the need for transactions, but following some stupid 80/20 rule decided that most of the system does not need them (deadly sin #1) and that they'd be able to easily patch them in later down the road (deadly sin #2) - so it is not enough just to identify a need, you also need to deal with it early.

The first few iterations also serve as a feedback loop on the suitability of the architecture. Sure, you've tried to evaluate the architecture both in code and on paper, but once it is deployed, or used for real - that is its real test. You will probably want to allocate some time to sit with developers and designers and actually see how the architecture is used and the implications of your decisions.

Governance

It isn't enough to design a software architecture, show it off at some architecture review and maybe at product demonstrations. Designers and developers are more than likely to bend your architecture to their convenience - this can happen for a multitude of reasons - for example:

  • They flipped the bozo on you  - they think they know better
  • They don't see the big picture and they try to optimize locally
  • They don't know any better, they just do what they always did
  • They believe they are following your design but they didn't really understand what you've meant (or alternatively - you didn't explain yourself very well)
  • They cut corners e.g. to meet deadlines (just to appease the project manager breathing over their shoulder)
  • Etc.

When the architecture is compromised, it can definitely  have severe  effects on later iterations and the overall stability/usability of the system. It is important to note that  it may even result in harsh implications for the organization - imagine someone circumvents some auditing mechanisms you've put in place to comply with SOX or Basel II.

What this means for the architect is that she cannot disengage from the project completely even once the architecture is stabilized (i.e. after the first few iterations). Architect involvement is needed for design reviews, maybe key code reviews, at milestones etc.

I think it is one of the architect's roles to act as a "comptroller" and oversee that the project stays on track as far as the architecture goes. Unfortunately there aren't too many automated tools to help with that. DSLs are one direction that may help (look for example at the Guidance and Automation Toolkit here and here). I've also recently seen an application that claims it can enforce and monitor SOA policies, e.g. on contracts (for both Java and .NET). I hope we'll see more of these tools in the future - meanwhile it is up to us, architects, to do that as part of our overall responsibility.
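Even without dedicated tools, some governance can be automated with a small script run as part of the build. The following is a minimal sketch of that idea, not a description of any of the tools mentioned above; the layer names and the rule table are hypothetical, and it only inspects Python import statements using the standard ast module.

```python
import ast
from typing import Dict, List, Set

# Hypothetical architectural rule: the UI layer may not import the data layer directly.
FORBIDDEN: Dict[str, Set[str]] = {"ui": {"data"}}


def forbidden_imports(layer: str, source: str, rules: Dict[str, Set[str]]) -> List[str]:
    """Return the imported module names in `source` that `layer` is not allowed to use."""
    banned = rules.get(layer, set())
    found: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # relative imports have no module name
        else:
            continue
        # Compare the top-level package of each import with the banned set.
        found.extend(name for name in names if name.split(".")[0] in banned)
    return found


ui_module_source = "import data.orders\nfrom services import billing\n"
violations = forbidden_imports("ui", ui_module_source, FORBIDDEN)
```

A check like this only catches the crudest violations, but it makes at least part of the "comptroller" role repeatable rather than dependent on someone remembering to review.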

 


 
Tags: Everything | Software Architecture | SPAMMED Process

February 8, 2006
@ 11:26 PM

In the previous post I said "don't bubble exceptions out of your service" -  Ebenezer Ikonne asks  "Well I wonder what the verbiage of the exception should be?  If a null pointer occurred in the service, what message should I return back to the consumer of the service?"


First off, let's consider the meaning of bubbling the exception - what would a remote consumer, sitting on some other company's server, do with a "null pointer" exception?! The consumer doesn't have any control over the resources or life cycle (or anything else for that matter) of the service it is trying to consume. Also, if it depends on the internal problems of the service it consumes, that makes it (the consumer) much less autonomous.

 

So, what's the other option? Well, as I mentioned in my previous post, it is best if the service can "pretend" nothing really happened, e.g. log the incoming message before doing anything and then, if there's an exception, respond (if the contract requires a response by a deadline) with a "got your message, working on it, you'll get a confirmation message soon" sort of reaction. If the exception occurs before the incoming message is saved, then the service probably needs to respond with "out of service, try again soon". If the edge is not up, then you (as a consumer) should (finally) get an exception (the protocol failed - the message you've sent did not arrive).
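As a sketch of what such a service edge might look like (all names here, including ServiceEdge and the status strings, are made up for illustration and a real edge would use a durable store rather than in-memory lists):

```python
from typing import Callable, List


class ServiceEdge:
    """Saves every incoming message before processing and never leaks internal errors."""

    def __init__(self, handler: Callable[[dict], object]) -> None:
        self._handler = handler
        self.message_log: List[dict] = []  # stand-in for a durable message store
        self.pending: List[dict] = []      # saved messages to be retried later
        self.store_available = True

    def _save(self, message: dict) -> None:
        if not self.store_available:
            raise IOError("message store unavailable")
        self.message_log.append(message)

    def receive(self, message: dict) -> dict:
        try:
            self._save(message)
        except Exception:
            # Nothing was saved, so the consumer has to resend later.
            return {"status": "out-of-service", "detail": "try again soon"}
        try:
            result = self._handler(message)
        except Exception:
            # The message is safe; the failure stays an internal matter.
            self.pending.append(message)
            return {"status": "accepted", "detail": "got your message, working on it"}
        return {"status": "done", "result": result}


def double_quantity(message: dict) -> int:
    return message["qty"] * 2  # raises TypeError when qty is None


edge = ServiceEdge(double_quantity)
ok = edge.receive({"qty": 2})
deferred = edge.receive({"qty": None})  # the internal failure is not exposed
edge.store_available = False
rejected = edge.receive({"qty": 3})
```

In none of the three replies does the consumer see an exception type or a stack trace; it only sees reactions that could be part of the contract.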

 

By the way, I think that a somewhat similar principle is true for bubbling exceptions across layers in a layered architecture.


 
Tags: Everything | Software Architecture | SOA

February 6, 2006
@ 11:32 PM

Take a look at Jeff Schneider's blog - for some nice Lego illustrations of composite applications concepts 


 
Tags: Everything | General | Software Architecture

February 6, 2006
@ 11:27 PM

In one of the discussions  on the MSDN Architecture Forum someone  mentioned that when a service invocation results in  an exception his company standard is to:

 

  • Log the error before exiting the service

  • Create a new SOAP error (exception) with minimal info ("The requested service failed") and that error includes (innerException) the original error (This way, if someone receives my error and they are not familiar with my errors - they will get a simple soap error. If they are familiar with my error, they will have the ability to inspect it further)

 

I think this is a very distinct anti-pattern of how to do SOA.

Let's look at a few reasons why:

  1. Log the error - that is fine and everything - maybe it'll help during the post mortem, but the operations people should also be notified somehow. Otherwise you have a dead service there and no one knows

  2. "exiting the service" - Services shouldn't fail - a failed service can mean a lost business opportunity. When you can't service the requests, you should at least be able to keep the "well-known end-point" up and running and let everyone know that the service is, well, "out of service - be back in XYZ". The preferable solution is to still be able to queue incoming requests and handle them later (this may not be possible if part of the policy (or SLA) for the service is to react within a few seconds, but again, in many other circumstances it is very plausible).

  3. SOAP Exception - why throw a SOAP exception? The protocol/communication worked fine...

  4. "...(innerException) the original error" - do not expose internal implementation outside of the service - only what's in the contract - in other words don't, just don't bubble exceptions out of your service.
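Points 1 and 4 in particular can be illustrated with a small Python sketch. The function name handle_safely, the notify_ops callback and the shape of the reply are assumptions for the example; what matters is that the details go to the log and to operations, while the consumer gets only a contract-level reply.

```python
import logging
from typing import Callable

logger = logging.getLogger("ordering-service")  # hypothetical service name


def handle_safely(
    handler: Callable[[dict], object],
    request: dict,
    notify_ops: Callable[[str], None],
) -> dict:
    """Run the handler; on failure log it, tell operations, reply per contract."""
    try:
        return {"status": "ok", "body": handler(request)}
    except Exception as error:
        logger.exception("request %s failed", request.get("id"))  # for the post mortem
        notify_ops(f"{type(error).__name__} while handling request {request.get('id')}")
        # The reply carries no inner exception, only what the contract defines.
        return {"status": "delayed", "retry_after_seconds": 60}


pages_to_operations: list = []
reply = handle_safely(lambda r: r["missing"], {"id": 7}, pages_to_operations.append)
```

Here operations learn about the KeyError immediately, while the consumer has no way to couple itself to the service's internals.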

 

It is very easy to make what seems like a small concession on the purity of your service, but your SOA concept, and loose coupling specifically, can deteriorate very rapidly with even the smallest compromises.


 
Tags: Everything | Software Architecture

January 22, 2006
@ 10:32 PM
Being an Architect is a lonely job. You get to interact with all the stakeholders one can think of, and sure, everybody has an opinion, but in the end you get to make the important decisions by yourself. Even when an organization has several architects, many times only one is assigned to a project at a given time. Well, maybe it is time to move to pair architecting (pair programming redux for architects).

 

Over the past few months I've had the chance to work with another architect (Udi Dahan) on the architecture for a new product line.

 

This actually proved to be a very positive experience:

  • You get informed feedback for ideas
  • You get to look at a problem from more angles
  • Working together helps refine the design (instant reviews and mutual feedback)
  • You can play good cop/bad cop (or bad cop/worse cop :)) vs. The different stakeholders (PM, Devs, SMEs etc.)
  • You can divide the work to get more things done (be less of a bottleneck)
    • It is also easier to work at the different levels required with less context switching ( presenting to non-technical customers, working with programmers, convincing upper management etc.)

 

Now, a few iterations into the project, the architecture is pretty much stabilized, but we're still working together, only now we mostly divide the work between us. I get to do the "fun" stuff - working with the marketing guys; working on the schedule and iterations with the project manager; etc. - while Udi plays the "Architect as a coach" with the developers of the team as well as redesigning the clients (after we gave too much slack to the client team during the previous 2 iterations).

 

Now I wouldn't recommend bringing too many architects into the fray as this can easily degrade to a "design by committee" sort of effort  but it is definitely  beneficial to have more than one architect working on a project.

 


 
Tags: Everything | Software Architecture

About a month ago Microsoft launched two Forums aimed at architects on the MSDN forums

   Architecture Forums: General Forum and Architecture Forums: Modeling and Tools

Here is the introduction as posted by Simon Guest :

"Welcome to the Architecture General Forum. This forum is for discussing general issues and experiences related to architecture and designing solutions on the Microsoft platform.  This forum is moderated by several members of Microsoft's Architecture Strategy Team.  We welcome all questions and comments related to architecture, although we recommend that product specific questions (for example, "I can't get x to install") are directed to more appropriate forums.  "


 
Tags: Everything | Software Architecture

January 18, 2006
@ 10:51 PM

This is actually a bit of old news - last month the renewed IASA (International Association of Software Architects) web-site was launched with a few interesting articles by Scott Ambler, Simon Guest and others.

You can find the site here


 
Tags: Everything | Software Architecture

On the previous post on Architecture evaluation I talked about evaluating a candidate architecture in code. This post is dedicated to evaluation on paper.

 

I remember one system I was working on where I was keen on making the architecture asynchronous and message oriented (it was all circa 2001 by the way). However, I was new on the team and my role (as the project's architect) wasn't well defined, so I had to make many compromises and get wide acceptance in order to get anything going forward. We set up a team to try to come up with a suitable architecture; since each of the team members had his/her own experience, we came out of these meetings with more than 20 (!) different candidate architectures (actually there were fewer architecture variations but they were multiplied by the possible technology mappings). Trying to decide which was the best option to follow, we tried to conduct some sort of a QFD process where several members were in charge of the weights and the rest were in charge of evaluating and scoring the different categories (per option). Like most "design by committee" efforts this also proved doomed from the start - and the option everybody disliked got the highest score. If you are wondering what happened - we scrapped this effort and started from scratch in a more sensible way (which included a detailed prototype) - what's important for the purpose of this post is that it got me thinking that there must be a better way to evaluate architectures. Well, a lot of research and several projects later, I think that there are a few techniques that give much better results.

 

The first methodology I stumbled upon was ATAM (short for Architecture Tradeoff Analysis Method), developed by the SEI.

ATAM is a rather lengthy and formal method for evaluating architectures; it requires a lot of preparation and commitment from the different stakeholders. You can get an overview of the process from the following (~130K) ATAM presentation I prepared a few years ago (while this is probably not the best presentation in terms of presenting it to a crowd (I know better now :)), it does provide a good overview of the 9 ATAM steps).

 

ATAM is explained in more detail in "Evaluating Software Architectures"; the book also details two more evaluation methods: SAAM (which I'll let you read about in the book) and ARID (Active Reviews for Intermediate Designs).

ARID, like ATAM, is a scenario based technique, meaning that as part of the evaluation process you need to identify scenarios where the system's quality attributes (see Quality attributes - Introduction ) occur or manifest themselves. The main idea in ARID is that for each (prioritized) scenario the participants try to draft code that solves that scenario utilizing/following the  design  under test. The results of the effort are then evaluated for ease of use, correctness etc.

There's a good introductory whitepaper on ARID on the SEI's web site.

 

Note that ARID is more suited to agile/iterative development (compared with ATAM) since (as its name implies) it doesn't require the architecture to be completed and finalized up front.

 

While I was working for Microsoft, I stumbled upon another evaluation method called LAAAM (which is now a part of MSF 4.0 for CMMI Improvement). LAAAM, which stands for Lightweight Architecture Alternative Analysis Method, is also scenario based and, like ARID, is a more agile alternative to ATAM.

 

In LAAAM you create a matrix which has scenarios on one dimension and architectural approach, decision or strategy on the other dimension. Each cell is evaluated on three criteria:

  • Fit - how viable is the approach to solve the scenario (including risk, alignment to the organization's standards, etc.)
  • Development Cost
  • Operations Cost
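To show the shape of such a matrix, here is a small Python sketch. The scenarios, approaches and scores are invented, and the 1-5 scale (higher is better, so a cheaper option gets a higher cost score) and the weighted-sum ranking are my own simplifying assumptions, not part of LAAAM as described above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Cell:
    """One matrix cell, scored 1-5 per criterion (assumed scale, higher is better)."""
    fit: int
    development_cost: int
    operations_cost: int


Matrix = Dict[Tuple[str, str], Cell]  # keyed by (scenario, approach)


def rank(matrix: Matrix, scenario_weights: Dict[str, int]) -> List[Tuple[str, int]]:
    """Weighted sum per approach; the aggregation rule is an assumption of this sketch."""
    totals: Dict[str, int] = {}
    for (scenario, approach), cell in matrix.items():
        score = cell.fit + cell.development_cost + cell.operations_cost
        totals[approach] = totals.get(approach, 0) + scenario_weights[scenario] * score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


weights = {"peak load": 3, "add new report": 1}  # hypothetical prioritized scenarios
matrix: Matrix = {
    ("peak load", "queue-based"): Cell(fit=5, development_cost=2, operations_cost=3),
    ("peak load", "direct calls"): Cell(fit=1, development_cost=4, operations_cost=3),
    ("add new report", "queue-based"): Cell(fit=3, development_cost=3, operations_cost=3),
    ("add new report", "direct calls"): Cell(fit=4, development_cost=4, operations_cost=4),
}
ranking = rank(matrix, weights)
```

The value of the exercise is less in the final number than in having to fill every cell, which forces a discussion of each approach against each scenario.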

 

LAAAM was developed by Jeromy Carriere while he was working for Microsoft (he is now working for Fidelity Investments in Boston).

 

SAF works well with all of these techniques, as one of the basic steps is to identify the quality attributes and write down scenarios where these attributes manifest themselves in the system (see Utility Trees - Hatching quality attributes  )

 

To sum things up -

There are several ways to evaluate software architectures on paper - ATAM, ARID, LAAAM and a few others (which I didn't discuss here).

Scenario-based evaluations help verify that quality attributes are being taken care of by the suggested architecture.

Paper-based evaluations can help reduce the number of options to a few (hopefully one or two) leading solutions which can then be evaluated in code (as the previous post on this subject suggested).

 


 
Tags: Everything | Software Architecture | SPAMMED Process

January 3, 2006
@ 08:31 PM

For the first post of 2006 I thought I'd throw in my .2 cents about SOA

(Note: this is 1.5MB ppt)

 


 
Tags: Everything | Software Architecture

Today I attended a presentation on Service Orientation (SO) by Clemens Vasters. The presentation itself was very interesting. Clemens mentioned that SO is about business alignment and loose coupling (two claims I wholeheartedly agree with). He also made some claims about SOA which I only partially agree with (but that's not the point of this post).

 

Clemens used the business practices of the pre-computer era to build the case for service orientation - utilizing several metaphors such as Department = Service and folder+form = message etc.

 

From these metaphors we can basically  infer 2 types of service communication patterns:

  • one way messages  - a service consumer can ask a service to perform some action
  • Request/Reply - A consumer asks for a service and waits for the results (synchronously or asynchronously)

 

 

I guess there's a general limitation to metaphors - since a metaphor is (a kind of) a model of a problem, it doesn't catch/include all the properties or possibilities of the problem. Using a metaphor, we should be careful not to neglect options that are not covered by it.

 

In this particular case, the metaphor makes it hard to see other possible ways for services to communicate. Unlike the business practices of the 1920s, things can be handled in parallel, and a service may not care who deals with the data it publishes, e.g. in a publish/subscribe scenario, or when there's an external (to the service) workflow that takes care of routing data to interested services (read: orchestration in MS's lingo or choreography in IBM's lingo). For example:

Using publish/subscribe communication, an Ordering service may publish any approved order, and both the invoicing service and a logistics service can pick it up; the logistics service, for instance, can trigger a JIT inventory process to ensure that supplies will be available to fulfill the order.
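To make the ordering example a bit more concrete, here is a minimal sketch of publish/subscribe in Python. Everything in it is an assumption made for illustration: the `MessageBus` class, the `order.approved` topic name and the two handler functions are hypothetical, and a real system would sit on a messaging product rather than an in-process dictionary.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class MessageBus:
    """In-process stand-in for a publish/subscribe channel (illustration only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to every subscriber; return how many were notified."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


handled: List[str] = []  # records which services reacted, for demonstration


def invoicing_service(order: dict) -> None:
    # Hypothetical subscriber: would prepare an invoice for the order.
    handled.append(f"invoice for order {order['id']}")


def logistics_service(order: dict) -> None:
    # Hypothetical subscriber: would kick off the JIT inventory process.
    handled.append(f"JIT inventory check for order {order['id']}")


bus = MessageBus()
bus.subscribe("order.approved", invoicing_service)
bus.subscribe("order.approved", logistics_service)

# The ordering service publishes without knowing who is listening.
notified = bus.publish("order.approved", {"id": 42})
```

The point of the sketch is that the publisher never names its consumers: adding a third interested service is one more `subscribe` call and no change to the ordering side.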

 

By the way, I think this point (regarding metaphors) is also valid for the "system metaphor" in Extreme Programming (but that's a point for another post).

 


 
Tags: Everything | Software Architecture

Reading the November issue of Crosstalk I came across an interesting article on managing architectural dependencies using a Dependency Structure Matrix (DSM) rather than dependency graphs induced by UML. The article is called "Dependency Models to Manage Software Architecture" by Neeraj Sangal and Frank Waldman.

The article talks about the value of using the DSM technique for managing architecture evolution as requirements change (analyzing ANT as an example). Apropos "Architecture Evaluation", I think the DSM technique can also be useful in the early stages of architectural design as a way to infer dependencies and to help promote loose coupling.
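For readers who haven't seen a DSM, here is a small sketch of the idea in Python. It is not the method from the article; the module names and dependencies are made up, and I assume the convention that a 1 in row i, column j means "module i depends on module j" (some DSM tools use the transposed convention).

```python
from typing import Dict, List, Set, Tuple


def build_dsm(dependencies: Dict[str, Set[str]]) -> Tuple[List[str], List[List[int]]]:
    """Return (ordered module names, matrix) where matrix[i][j] == 1
    means module i depends on module j."""
    modules = sorted(set(dependencies) | {d for deps in dependencies.values() for d in deps})
    index = {name: i for i, name in enumerate(modules)}
    matrix = [[0] * len(modules) for _ in modules]
    for module, deps in dependencies.items():
        for dep in deps:
            matrix[index[module]][index[dep]] = 1
    return modules, matrix


def mutual_dependencies(modules: List[str], matrix: List[List[int]]) -> List[Tuple[str, str]]:
    """Pairs of modules that depend on each other (a sign of tight coupling)."""
    pairs = []
    for i in range(len(modules)):
        for j in range(i + 1, len(modules)):
            if matrix[i][j] and matrix[j][i]:
                pairs.append((modules[i], modules[j]))
    return pairs


# Hypothetical module names and dependencies, purely for illustration.
deps = {
    "ui": {"business"},
    "business": {"data", "ui"},   # business -> ui is a back-dependency
    "data": set(),
}
modules, dsm = build_dsm(deps)
cycles = mutual_dependencies(modules, dsm)
```

With the matrix in hand, marks that are mirrored across the diagonal (here `business` and `ui`) stand out immediately, which is the kind of early coupling warning I have in mind.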

 


 
Tags: Everything | Software Architecture | SPAMMED Process

 

In the introduction to architecture evaluation I said there are two approaches to evaluating a software architecture. This post talks about the first approach - evaluating an architecture in code.

 

POCs

The first evaluation-by-code tool is the Proof of Concept (POC for short). Building a POC is about building a minimal amount of code implementing a focused area of the architecture or the architecture's technology mapping. The aim of the POC is to help weigh alternatives (when you are contemplating which way to go), lower technical risks or lower stakeholders' anxiety over an architectural choice.

POCs map quite well onto XP's spikes.

 

Let's look at a few POC examples (from my past projects).

 

Example 1: Validate the feasibility of an architectural direction

On one project we inherited an ugly application incorporating its own proprietary CGI web server. The architectural decision was to keep it as a black box and develop the project using a better, more scalable architecture (though we still needed to utilize functionality from the C++ server now and then). The challenge in making this happen was to maintain the session and pass it from the rest of the application (JSP, Servlets and J2EE) on to the C++ server. A (successful) POC that tackled this issue allowed us to advance in the chosen architectural direction, reducing the risk significantly.

 

Example 2: Validate a technology mapping

On another project I worked on (when I was with Microsoft) we analyzed the project's quality attributes and found that there was a need for near-fault tolerance (fail-over in 5 seconds or less). The architectural solution that we decided on was to use an active server and a semi-active one (an on-line, ready-to-take-over server that constantly applies state from the active server)*. For technology mapping we considered several options (e.g. fault-tolerant hardware). One of the options considered was using SQL Server 2005 Database Mirroring to keep the two servers synchronized (DB mirroring gives you a failover of the DB in about 5 seconds or less). I set up a small proof of concept to verify that this direction is viable. I was told that after I left Microsoft, further investigation of the issues found led to Microsoft's decision to postpone Mirroring for now.

 

Example 3: Compare alternatives

We wanted to compare MSMQ vs. an existing distributed object middleware, both in terms of performance and usability (is it developer-friendly?). We crafted 2 POCs, one for each technology, which enabled us to compare the two approaches head-to-head.
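The performance half of such a comparison boils down to running the same workload against each candidate. The sketch below is not the code from that project; `send_via_queue` and `send_via_remote_object` are hypothetical placeholders that a real POC would replace with thin wrappers around the two technologies, and the payload size and repeat count are arbitrary assumptions.

```python
import time
from typing import Callable, Dict


def time_calls(send: Callable[[bytes], None], payload: bytes, repeats: int) -> float:
    """Return the average seconds per call over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        send(payload)
    return (time.perf_counter() - start) / repeats


def compare(candidates: Dict[str, Callable[[bytes], None]],
            payload: bytes = b"x" * 1024,
            repeats: int = 1000) -> Dict[str, float]:
    """Run the same workload against every candidate and collect the timings."""
    return {name: time_calls(send, payload, repeats) for name, send in candidates.items()}


# Stand-ins for the two POCs; each would wrap the real technology under test.
def send_via_queue(payload: bytes) -> None:
    bytes(payload)          # placeholder work


def send_via_remote_object(payload: bytes) -> None:
    bytes(payload)          # placeholder work


results = compare({"queue": send_via_queue, "remote_object": send_via_remote_object})
fastest = min(results, key=results.get)
```

Keeping the harness identical for both candidates is what makes the comparison head-to-head; the usability half of the evaluation still has to come from developers actually writing the two wrappers.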

 

POCs help evaluate alternatives and lower risk in specific areas of the architecture (and design, for that matter). However, POCs will not give you a feel for how the overall architecture will play together - enter prototypes.

 

Prototypes

A prototype is basically a working, simplified model of the system. There are many characteristics that distinguish between different types of prototypes (hi-fidelity/low-fidelity, global/local etc.) - let's focus on two:

  • Horizontal prototype - models wide aspects of a single layer, i.e. many features with little detail. The most common example of a horizontal prototype is a user interface prototype, which is used to test the overall interaction with the system.
  • Vertical prototype - implements some sub-system or a limited set of features across all layers/modules.

 

The vertical prototype is a useful way of evaluating, getting a feel for and understanding how the different components that make up the architecture work in unison, without getting bogged down in all the fine details of the system's functional requirements.

 

Example: Using a prototype to evaluate an architecture alternative.

We were getting ready to embark on a rather large project (we did the prototype around the release of .Net 1.0 and the project is still going on…). We wanted to understand the capabilities  and limitations of .NET. We chose a limited aspect of the system (which we considered the most risky), chose some of the designated team-leaders and took an architect from Microsoft Consulting Services to help us build the "by the book" architecture.

We did a very extensive prototype, total effort of 3-4 man-years including all the preliminary work and the post-mortem analysis. We gained a lot of insights on what .NET can and cannot give us out of the box, we understood the limitations of the components we integrated (e.g. ESRI's limitations in displaying near-realtime moving objects) and we  also used the preliminary prototype (which was a performance hog) as a platform for running POCs for other architectural and technological directions. Additionally, once we solved the performance problems, we also used it as a demo for the client.

By the way, this experience also had some additional positive residual effects, like getting the team leaders up to speed on the (then) new technology, gelling the core team etc.

Taking all the information gathered during the prototype, we were able to design a better, more robust architecture for the project itself (which the architect who came after I left the project managed to mangle - however, that's another story altogether :) )

 

I've found that in most cases exploratory prototypes, or "Throwaway prototypes" are more useful as they really let you get to the crux of the matter quickly, i.e. getting all the components connected the way the architecture dictates to test their interactions and usage. Again, the idea here is to focus on evaluating the architecture, not on the implementation details of the overall solution. Nevertheless, once the architecture is more mature you may choose one of the prototypes and evolve it into the actual system (sort of turning it into an architectural skeleton).

 

Architectural Skeletons

Once you've decided on a candidate architecture (i.e. the architecture you want to use for the project), your first iteration or two should be focused on creating the architecture skeleton (these might not be the very first iterations, as you may have already done a couple or so of prototype iterations).

An architecture skeleton is about implementing the minimal set (bare bones, so to speak) of the project's functionality that is needed to connect all the pieces in a meaningful, integrated way (for example, it can include an implementation of a single thread in a use case or an important story). It is somewhat similar to a prototype, with 2 differences:

  • It has to  implement real functionality of the system (though the functionality is, usually, very thin)
  • You don't throw it away (hopefully anyway)
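As a rough illustration of what "barely running" means, here is a sketch of a skeleton for an imaginary ordering system in Python. The layer names, classes and the single "place an order" story are all assumptions made up for this example; the only thing it tries to show is one thin but real path through every layer, wired the way the architecture prescribes.

```python
from typing import Dict, Optional


class OrderRepository:
    """Data layer: an in-memory store standing in for the real database."""

    def __init__(self) -> None:
        self._orders: Dict[int, str] = {}

    def save(self, order_id: int, item: str) -> None:
        self._orders[order_id] = item

    def find(self, order_id: int) -> Optional[str]:
        return self._orders.get(order_id)


class OrderService:
    """Business layer: real but deliberately thin logic."""

    def __init__(self, repository: OrderRepository) -> None:
        self._repository = repository
        self._next_id = 1

    def place_order(self, item: str) -> int:
        if not item:
            raise ValueError("an order needs an item")
        order_id = self._next_id
        self._next_id += 1
        self._repository.save(order_id, item)
        return order_id

    def describe(self, order_id: int) -> str:
        item = self._repository.find(order_id)
        return f"order {order_id}: {item}" if item else f"order {order_id}: not found"


class OrderEndpoint:
    """Presentation layer: the single entry point the skeleton exposes."""

    def __init__(self, service: OrderService) -> None:
        self._service = service

    def handle(self, request: dict) -> dict:
        order_id = self._service.place_order(request["item"])
        return {"status": "ok", "summary": self._service.describe(order_id)}


# Wire the layers together exactly as the architecture prescribes.
endpoint = OrderEndpoint(OrderService(OrderRepository()))
response = endpoint.handle({"item": "widget"})
```

Later iterations would swap the in-memory repository for the real data store and add stories, but the wiring between the layers stays, which is what separates this from a throwaway prototype.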

 

Most current methodologies (RUP, MSF for CMMI Process Improvement, XP etc.) support the notion of an architectural skeleton (though not using this name). In RUP, for example, you would have the architectural skeleton up and running at the end of the elaboration phase - a running architecture which you can expand and add functionality to in the construction phase.

 

It is important to implement a skeleton (vs. starting to implement the different components and try to integrate them later) as it gives you a relatively early opportunity to actually test if your architecture holds, and it is much better to find errors, especially architectural ones, as early as possible.

 

Summary

I demonstrated  3 "tools" to enable evaluation of architectural decisions in general and the overall architecture in particular:

  • POC - focused on a specific area
  • Prototype - overall architecture with "simulated" behavior
  • Skeleton - "barely running" implementation of the chosen architecture.

 

The problem with these approaches, especially prototypes and skeletons, is that they require a relatively long time as well as resources to implement. We need some additional tools in our evaluation toolset to allow us to focus on the architecture alternatives that are most likely to match our needs.

I think that there are such tools, and in the next post on architecture evaluation I will try to give my view on what they are and how to use them.

 


* other options are active-active and active-passive (e.g. Windows clustering)

 


 
Tags: Everything | Software Architecture | SPAMMED Process

Ok, so you've designed this grand glimmering SOA, and you have a few dozen services, each assigned to a development team.

 

As the iterations go by, the contracts (be they WSDLs, plain old messages, object APIs or whatever) have to be amended. Do not leave the task of changing, aging or maturing them to your developers, team leads or anyone else - I strongly believe that the architect, having defined the boundaries between the major components (services in this case), is also the one responsible for the space between the services. Anything that travels this space is in the realm of the architect*

 

In my opinion, failure to be in charge of the contracts can result in major breakage of the architecture (chatty interfaces, security problems and service dependencies, to name a few of the possible and likely problems).

 


 
Tags: Everything | Software Architecture

I've been a little busy in the last couple of weeks and, unfortunately, didn't have time to blog. To get myself started again, here is a short introductory post on the subject of architecture evaluation:

As I said in "what's software architecture", architecture is both an early artifact and a representation of the significant decisions about the system - or, to sum it up, "Architecture is the decisions that you wish you could get right early in a project." (Ralph Johnson*). That is exactly why I made evaluation one of the key steps in SAF. We want to raise the level of certainty that the direction(s) we are taking are indeed the right ones and that we will not hit a wall later on. This is even more challenging considering that most projects these days are iterative.

But how do you evaluate an architecture (on a given set of quality attributes)? How can you tell if utilizing, say, SOA will yield better results than distributed objects? Is using fault-tolerant hardware better than using a database (on the performance vs. robustness trade-off)? And so on and so forth. Even if we say that flexibility and the ability to cope with and embrace change is the most important trait (read: quality attribute) for our architecture, we still need a way to evaluate which approach will give us the most flexibility.

The way I see it, there are basically two approaches to evaluating software architecture:

  • In code - as in proofs of concept, architectural prototypes and architectural skeletons
  • On paper/discussion based - so-called "formal" methods: ATAM, ARID, LAAAM to name a few

 

I will try to use the next few posts on SAF to elaborate on these issues. The next post on SAF - Evaluation will explain and demonstrate the "in code" methods. The post following that will talk about the paper-based methods, and the third post on the subject will try to contrast the two approaches and talk about their respective pros and cons.

 


*As quoted by Martin Fowler in "Who Needs an Architect?" - which, by the way, makes for very interesting reading on architecture and the architect's role.


 
Tags: Everything | Software Architecture | SPAMMED Process

November 11, 2005
@ 12:35 AM

I've posted in the past about the importance of stakeholders for the architecture design process (in "Stakeholders everywhere"). I recently stumbled upon an interesting article called "Understanding Organizational Stakeholders for Design Success" by Jonathan Boutelle that goes into depth discussing both the importance of stakeholders and the process of mapping their interests and influence (which I also mentioned in my post).


 
Tags: Everything | Software Architecture | SPAMMED Process

The next step in SAF after Modeling is technology Mapping. While mapping is not a part of the architecture per se, it is, in my opinion, an important and sometimes crucial step.

Before I ramble on explaining why I think this is an important step, let me try to define what exactly I mean by "technology mapping".

 

Architecture in essence is technology neutral - it describes the major components (read: objects/services/components etc.) and their interactions - but it doesn't specify which technologies, products or existing assets will be used to implement it. That's where technology mapping steps in.
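In its simplest form a technology mapping is just a table from technology-neutral components to the things that will implement them. The sketch below shows that table in Python; the component names and the buy/make choices are hypothetical and are not taken from any of the diagrams in this series.

```python
from typing import Dict, List

# Technology-neutral components of a (hypothetical) architecture.
components: List[str] = ["presentation", "business logic", "service bus", "data store"]

# One candidate mapping; the product choices are illustrative only.
mapping: Dict[str, str] = {
    "presentation": "web framework (buy)",
    "business logic": "in-house code (make)",
    "service bus": "messaging product (buy)",
}


def unmapped(components: List[str], mapping: Dict[str, str]) -> List[str]:
    """Components that still have no technology assigned."""
    return [c for c in components if c not in mapping]


missing = unmapped(components, mapping)
```

Even a table this small makes the open decisions visible (here the data store), and each entry is a place where a product's own constraints can push back on the architecture.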

 

For example, in a previous post (Architectural Modeling - a First Step ) I presented the following block diagram as a possible view for an architecture (Layer diagram):

 

 

One possible technology mapping for this (view of an) architecture  is depicted in the following diagram:

 

 

Before I continue any further, I should probably say that I think that technology mapping is basically  a design activity and not an architectural one. But wait, if that is true, why am I mentioning this as a step in a framework for building architectures (SAF ..) ?

 

Well, besides the obvious answer, that it helped me create a nice acronym for the process (being the second M in SPAMMED), technology mapping can have a significant impact on the ability to actually implement the architecture. The wrong mapping can make adhering to the architectural guidelines very cumbersome and sometimes nearly impossible.

 

For example, in one of the projects I am currently working on we made the decision to create an SOA (surprise, surprise*). We decided that the various services shall communicate using a service bus, and our intention was to map that to a messaging product (such as Microsoft's MSMQ or Tibco's Rendezvous). As it happens, the powers that be (i.e. upper management) decided that there is no room for investing in a new inter-process communication solution and that the project will be forced to reuse the existing solution. The existing solution, in this case, is a proprietary distributed objects solution developed in house. While our intention was to promote message-oriented development and contracts, the existing middleware, being a distributed object framework, induces chatty RPC-like contracts (you can read more on RPC vs. message-orientation in "Mort gets the message" and a few of the other posts on John Cavnar-Johnson's blog). The implication of this technology decision is that the architecture team has to pay close attention** (on a daily level for the initial iterations) to what the designers and developers do, so as to ensure that the architectural decisions are kept.
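To show what "chatty" means in practice, here is a small Python sketch contrasting the two contract styles. It is not our middleware or any real product's API; `RemoteCustomer`, `CustomerService` and `CustomerDetails` are hypothetical names, and the `round_trips` counters simply simulate what would be network calls.

```python
from dataclasses import dataclass


class RemoteCustomer:
    """Distributed-object style: every getter is a separate remote call."""

    def __init__(self) -> None:
        self.round_trips = 0

    def get_name(self) -> str:
        self.round_trips += 1
        return "Ada"

    def get_street(self) -> str:
        self.round_trips += 1
        return "1 Main St"

    def get_city(self) -> str:
        self.round_trips += 1
        return "Springfield"


@dataclass(frozen=True)
class CustomerDetails:
    """Message style: one self-contained document per request."""
    name: str
    street: str
    city: str


class CustomerService:
    def __init__(self) -> None:
        self.round_trips = 0

    def get_customer_details(self, customer_id: int) -> CustomerDetails:
        self.round_trips += 1
        return CustomerDetails("Ada", "1 Main St", "Springfield")


# Chatty: three calls across the wire to build one address label.
remote = RemoteCustomer()
label_rpc = f"{remote.get_name()}, {remote.get_street()}, {remote.get_city()}"

# Message-oriented: one call returns everything the consumer needs.
service = CustomerService()
details = service.get_customer_details(7)
label_msg = f"{details.name}, {details.street}, {details.city}"
```

Both versions produce the same label, but the object-style contract costs three round trips to the message-style contract's one, and a distributed object framework makes the first style the path of least resistance.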

 

On another project I worked on in the past we designed the solution to use an application server (multi-tiered hardware architecture); however, the solution also had to incorporate ESRI's ArcGIS, which (at the time) only worked in a "two-tier" client/server manner with its underlying data. In order to support the implementation of an application server (which we felt was really needed) we had to (decide to and) develop an independent "entity layer" on top of ArcGIS. Failure to notice the technology mapping implications would have resulted in the architecture not being implemented (and serious scalability issues for that particular project).

 

The second reason for including technology mapping in the architectural process is to help promote reuse in projects by making it an activity at the architectural level (i.e. both early in the process, and making the decision to reuse a component a global decision governing the solution).

Making reuse a first-class citizen in the architectural effort can greatly help promote a product-line approach (examples of additional factors that can help promote product lines include domain-driven design or concepts like software factories).

 

To recap:

  • Technology mapping is about deciding which products, technologies and existing assets are going to be utilized for implementing the architecture.
  • Technology mapping can have a significant impact on the ability to adhere to the architecture, to the point where under certain mapping constraints you may need to reevaluate the architecture (i.e. is it worthwhile to go "against the stream" and implement the architecture using a technology/components whose own architecture conflicts with it?).
  • Deciding which existing assets can be reused at the architectural phase helps to create systematic reuse (vs.  opportunistic reuse) which improves time-to-market and solution quality (assuming you carefully choose assets for reuse )

 

I recommend either making technology mapping a required step in the architectural process or, alternatively, revisiting the architecture once this step occurs as part of the design.

 


*when you get over the incredible amount of hype around it (I believe) SOA does have several distinct business and technological advantages - I'll try to elaborate more on this in one of the next posts

 

**I'll talk more about what happens once the architecture is "released" for public consumption when I cover the D - Deployment phase of SAF


 
Tags: Everything | Software Architecture | SPAMMED Process

October 26, 2005
@ 12:36 AM

I claimed in the past (on my "what's software architecture" post) that Architecture is a type of design - if that is true, an interesting question is do we also have architectural patterns ?

 

I think the answer is yes - there are architectural patterns, and they are also called architectural styles. I actually like this term better as it helps differentiate them from design patterns; for example, I agree with Harry Pierson's observation that many of the patterns in Martin Fowler's PoEAA are indeed technical patterns at an engineering or design level and not architectural patterns.

 

The difference between architecture styles and design patterns is similar to the distinction between architecture and design - an architectural style affects the solution globally, or at least major parts of it, rather than solving local issues. It is interesting to note, though, that some architecture styles were well known long before the notion of design patterns for software was introduced.

 

SEI's architecture glossary defines Architecture Style as

"A specialization of element and relation types, together with a set of constraints on how they can be used."

That's a good start, but it might be a little hard to understand. We can basically say that an architectural style defines a family of solutions in terms of a pattern of structural organization, using a vocabulary of component and connector types (plus constraints on their use). It is also worth mentioning that different styles can be combined to create compound or derived styles.

I'll try to illustrate what a style is using an example:

 

One of the basic architecture styles is "Layered architecture":

 

The layered style is composed of layers (the components), each of which provides facilities and has a specific role. The layers have communication paths / dependencies (the connectors).

In a layered style a layer has some limitations on how it can communicate with other layers (the constraints). Typically a layer is allowed to call only the layer below it and be called only by the layer above it (but there are variants, e.g. a layer can call any layer below it; vertical layers that can call multiple layers; etc. - as long as the layers' communication paths are limited by some rules).
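The constraints of a style are concrete enough to be checked mechanically. Here is a minimal Python sketch of such a check for the strict rule and for the "any layer below" variant; the layer names and the observed calls are made-up examples, and in real life the call data would have to come from some dependency analysis of the code.

```python
from typing import Dict, List, Set, Tuple

# Layers listed top to bottom; the names are illustrative.
LAYERS: List[str] = ["presentation", "business", "data access"]


def violations(calls: Dict[str, Set[str]], strict: bool = True) -> List[Tuple[str, str]]:
    """Return (caller, callee) pairs that break the layering rule.

    strict=True  : a layer may call only the layer directly below it.
    strict=False : a layer may call any layer below it.
    """
    position = {name: i for i, name in enumerate(LAYERS)}
    bad = []
    for caller, callees in calls.items():
        for callee in sorted(callees):
            distance = position[callee] - position[caller]
            allowed = distance == 1 if strict else distance >= 1
            if not allowed:
                bad.append((caller, callee))
    return bad


observed_calls = {
    "presentation": {"business", "data access"},  # skips a layer
    "business": {"data access"},
    "data access": {"business"},                  # calls upward
}
strict_violations = violations(observed_calls)
relaxed_violations = violations(observed_calls, strict=False)
```

Under the strict rule both the layer-skipping call and the upward call are flagged; under the relaxed variant only the upward call is, which is exactly the difference between the two flavors of the style.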

 

You can see the application of layered style all over the place for example: logical software layers (e.g. presentation component, UI Controllers, business processes, business components, data entities, data access layer) , SOA layers (fundamental , Intermediate, Process), physical tiers (e.g. database server, application server, web server and clients)  etc.

 

There are many other architecture styles including, for example, pipe and filter, push based, peer-to-peer, blackboard, MVC, PAC, or the more recent Adaptive Object Models, REST and SOA (some will probably disagree that SOA is a style - but I'll try to explain why I think it is at another opportunity). I'll try to talk a little about some of them in the next posts.


 
Tags: Everything | Software Architecture | SPAMMED Process

Ok, so we've identified stakeholders, set principles and guidelines, found out what our architectural requirements are, and we want to start modeling already - especially considering that architectural modeling is great fun (for techno-geeks like myself anyway). However, before we just start to create an endless flurry of blocks, boxes, arrows and whatnot, it is probably worthwhile to take a few minutes to think about which models we want to create and which views we want to use to communicate them. Otherwise we may end up doing a very good job thinking about and describing something that doesn't interest anyone and gives no real value.

 

IEEE 1471 defines (an architectural) view as a representation of a whole system from the perspective of a set of concerns (where concerns are the key interests of the different stakeholders)

A view is comprised of parts of one or more models to demonstrate how the concerns are covered

A view is (sort of) an instantiation of the pattern defined in a viewpoint (see more below)

 

So, for example, if our concerns are concurrency and timing issues, the matching viewpoint can look at the thread and process model of the solution, and we can use views like process diagrams and timing diagrams to visualize it.

 

One thing we can do (over time) is to create a viewpoint repository and then when faced with a set of concerns  we can easily choose which views to create.

Rich Hilliard suggests that a viewpoint would hold the following information:

  • Viewpoint name
  • The stakeholders addressed by the viewpoint
  • The stakeholder concerns to be addressed by the viewpoint
  • The viewpoint language, modeling techniques, or analytical methods used
  • The source, if any, of the viewpoint (e.g., author, literature citation)

  A viewpoint may also include:

  •  Any consistency or completeness checks associated with the underlying method to be applied to models within the view
  • Any evaluation or analysis techniques to be applied to models within the view
  • Any heuristics, patterns, or other guidelines which aid in the synthesis of an associated view or its models
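A viewpoint repository doesn't need heavy tooling; a list of records with the fields above will do. The Python sketch below is one possible shape for it, with field names I chose myself to mirror the list and with illustrative entries that are not a recommended set.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Viewpoint:
    name: str
    stakeholders: Set[str]
    concerns: Set[str]
    languages: List[str]              # modeling techniques / notations used
    source: str = ""                  # author or literature citation, if any
    checks: List[str] = field(default_factory=list)      # optional consistency checks
    heuristics: List[str] = field(default_factory=list)  # optional guidelines


def select_viewpoints(repository: List[Viewpoint], concerns: Set[str]) -> List[str]:
    """Names of the viewpoints that address at least one of the given concerns."""
    return [vp.name for vp in repository if vp.concerns & concerns]


# Example entries; the contents are illustrative, not a recommended set.
repository = [
    Viewpoint("Concurrency", {"developers"}, {"concurrency", "timing"},
              ["process diagram", "timing diagram"]),
    Viewpoint("Deployment", {"operations"}, {"availability", "scalability"},
              ["deployment diagram"]),
    Viewpoint("Logical", {"developers", "testers"}, {"modifiability"},
              ["class diagram", "package diagram"]),
]
chosen = select_viewpoints(repository, {"timing", "scalability"})
```

Given the concerns of a new project, the lookup returns the viewpoints worth considering, which is the reuse I am after when I talk about growing a repository over time.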

 

If you are just starting out, you can turn to various frameworks to see which views they have and use that as the basis of your repository. For example:

RUP suggests a 4+1 set of viewpoints:

 

Microsoft is currently moving from the MSF 3.0 model (4 spheres - contextual, conceptual, logical, physical - crossed with 4 "views" - business view, applications view, information view and technology view - totaling 16 viewpoints) to the following set of viewpoints:

 

Other examples are DODAF (3 main viewpoints with 26 sub-viewpoints), RM-ODP (5 viewpoints) and the Zachman Framework (with 36 viewpoints).

Lastly, the software architecture document template I've posted also contains a list of possible views.

 

 

What about a minimal set of viewpoints? Well, I don't know about a minimal set - there are, however, a few viewpoints that usually get used:

  • Some sort of layer view (usually a block diagram)
  • A logical view (main classes/packages)
  • A deployment diagram (tiers, zones etc.)
  • A view to show concurrency and timing issues
  • On SOAs there's also a service view (services, policies)

 

To wrap-up -

  • Choosing which views to create (viewpoints) is an important step before embarking on modeling.
  • Viewpoints are chosen according to the stakeholder's concerns (to help make sure the concerns are addressed and to communicate the architecture better).
  • Growing a viewpoint repository is a way to reuse your knowledge of concerns to views mapping.

 
Tags: Everything | Software Architecture | SPAMMED Process

October 6, 2005
@ 10:14 PM
 

Harry Pierson writes in his blog a very interesting piece investigating the tenets of good models.

 

While I tend to agree that there is a place for precise formal models that can be transformed easily to lower levels, I would also like to argue that:

  • I think that imprecise models are also very useful, since at different points in time during development you cannot fully specify all the finest details (even for the "current" level of abstraction), esp. since most projects these days are iterative.
  • Which brings me to the next point - for imprecise models, I don't necessarily think that there's a need to keep (all of) them updated during the development life-cycle. The high-level designs can be replaced by detailed designs and they in turn can be replaced by the code itself - good code explains itself beautifully :) .
  • You should carefully weigh the ROI of creating such a precise model. For example, I happened to work on a large (hundreds of man-years) project where the initial thought was to use a tool (Vitech's Core) for requirements analysis. The benefit was that (if done right) the model created can be "run" using the tool's built-in simulator. After spending more than half a year (of a rather large team) we finally decided to drop this precise model for a much less precise model of use cases, which allows for a varying level of abstraction. It should be noted, though, that the (cross-subsystem) use cases are later refined into a DSL which is actively used for generating cross-subsystem interfaces and simulating missing systems during integration.
  • Another point from the former example is about the timing of requiring the precision. Modeling tools should allow several levels of precision, since in earlier stages you (usually) cannot determine all the bits and bytes that will allow for a "deterministic transformation".

 

 

Just my 2 cents


 
Tags: Everything | Software Architecture

October 5, 2005
@ 10:29 PM

 

I've added a sample skeletal  template for a Software Architecture Document  on the SAF page.

 

A few notes on the template:

  • It is based on the formality level required for large safety-critical projects - most projects do not need this level of formality and can (and should) do with fewer views
  • When there isn't a specific requirement by the customer to create an "official" architecture document, you probably want to have it just barely good enough
  • The view list in this template is not an exhaustive list - these are just the views I've used most often (more on views in the next post :) )

 

 

Any feedback (either as a comment or to my email) is more than welcome


 
Tags: Everything | Software Architecture | SPAMMED Process

September 30, 2005
@ 08:17 PM

Michael Platt talks about the basic questions related to architects (what is architecture?; why are architects needed?; what are architects?; what skills do architects have?; and what are the types of architect). I agree with some of the things he says - especially the discussion on the architect's skills.

I think Michael's definition of architecture is too simplistic (see more below) and I don't agree with his classification of architects (here is what I think on architecture types). For example, he bundles solution architects under technical architects; I believe solution architects also have a lot to do with the problem domain and not just the technical or technological sides of the solution.

 

He also defines a "product architect" :

"Product level architects have an in depth understanding of the use of a specific product in a technical architecture domain such as Lotus notes in the messaging domain. Typical product architects are Exchange, SQL Server (normally as a DBA), Windows, and Networking etc"

I don't think that people in these roles are actually architects - they are definitely specialists and they may very well be experts, but the breadth of their designs is very local in the scope of a complete solution and their skills will never be used on their own. For example, even if you do a data warehouse project, designing the database is (a very important) part of the project, but there's still a lot to do with getting the data in there and deciding what data you want to store.

Architecture talks about things in the global (in the context of a project/solution/product line/enterprise) and design deals with local issues (how to model the UI, optimize the DB, set up Lotus Notes etc.)

 

What do you think?


 
Tags: Everything | Software Architecture

September 28, 2005
@ 11:22 PM
The next step in SAF (after "quality Attributes") is Modeling. Webster's dictionary defines "Model" as (among other things): "3 : structural design; 7 : Archetype; 10b : a type or design of product (as a car); 11 : a description or analogy used to help visualize something (as an atom) that cannot be directly observed". As I mentioned in my "what's software architecture" post, there's no single structure that IS the architecture - this means that we'll have to look at the architecture from different angles (viewpoints). For example, a block diagram such as the diagram below (accompanied by some documentation that explains it) may tell us something about the layers that will be used to solve a problem, but it tells us nothing about the business problem that we are trying to solve.

 

 

We need to augment with more views on the system (such as the next diagram) so that we can better visualize and convey the architecture of the system.

 

 

 

IEEE 1471-2000  "Recommended Practice for Architectural Description of Software Intensive Systems"  defines the relationship between the Stakeholders, their concerns, viewpoints, views and models:

(IEEE 1471 model adapted from a presentation by Rich Hilliard )

 

By the way, the fact that we need to look at the architecture from different viewpoints doesn't necessarily mean that the documentation isn't just POW ("plain old whiteboard") - the formality of the documentation is driven by the project style (agile/formal) and other stakeholder constraints (company standards, customer requirements etc.)

 

Since models participate in views (which in turn conform to viewpoints, which address the stakeholders' needs/interests), I consider choosing the views a prerequisite for modeling. Thus, in the next post on modeling I am going to talk a little about choosing views (suggested minimal set, viewpoint library etc.). Once I get that off the table I'll try to talk a little about architectural styles and patterns, and follow that with some strategies for dealing with quality attributes before moving to the next SAF phase (Mapping).

 

 

 


 
Tags: Everything | Software Architecture | SPAMMED Process

September 19, 2005
@ 11:02 PM

Rich Turner writes about architecture & change. I agree with a lot of what he says.

I often find myself explaining to stakeholders (PMs, developers, etc.) that the only constant thing in the project is that it is changing :)

 

 

One project I am currently working on exhibits this inherent change. For one, the customers don't really know yet what they want: they think that they would be able to use the exact same solution for several very different configurations (ranging from a standalone computer to a server farm with dozens of clients) etc.

What can we, as architects, do to handle these situations? Well, what I usually do is add a lot of modifiability-related quality attributes (see utility trees). These can include requirements for interoperability, adaptability, changeability etc. (An example scenario may be: provided with a verified algorithm, replacing an existing navigation algorithm shall take less than 3 man-weeks).

 

While I will (probably) talk a lot more about strategies for handling quality attributes in posts regarding modeling (as part of my SAF posts), SOA (sans hype…) is a good strategy for coping with flexibility (i.e. changing requirements). The explicit boundaries and contract-first approach help localize changes. The resulting loose coupling makes it easier to replace, add and remove services. Lastly, basing communication on policy helps postpone issues that have to do with the network (QoS, security etc.).

 

 

A completely different angle on the issue of change is that sometimes it can be problematic to allow it, even at the expense of missing some of the customer's changing requirements. An example (maybe the only one) is critical systems, where a defect can result in loss of lives (two polar examples are medical systems on the one hand and weapon systems on the other). In order to be able to ensure software safety (identifying hazards, proper fault tree analysis etc.) there's a need to freeze the requirements for longer periods compared with other types of projects.


 
Tags: Everything | Software Architecture | SPAMMED Process

In previous posts (here and here) it seems I downplayed the importance of functional requirements (vs. quality attributes) for the architecture. Nevertheless, the functional requirements do have a few important roles in shaping and looking at the architecture. One aspect of the functional requirements' role was demonstrated in the scenarios that describe the instantiation of quality attributes within the system. Let's look at a couple of other aspects.

 

As I mentioned in a previous post, no single structure is the architecture. This means that in order to present an architecture you need to describe it from several different angles or viewpoints (I'll spend some time talking about these in the next few posts on SAF). One of these has to do with the domain architecture for the solution. This can include identifying business areas when trying to identify services (in an SOA), identifying major components for a component-based architecture, or even identifying major use cases in a "use case view" (as part of a Unified Process 4+1 approach).

 

Listening to "Almost Cut My Hair" by Crosby, Stills & Nash in the background kind of brings me to the third area where I see functional requirements meeting architectural decisions. When there are extreme/hard/important requirements, you often have to decide whether to cut your hair and bend the entire design to answer that requirement (i.e. make it a global decision, hence an architectural one) or satisfy it with a specialized local solution (i.e. postpone it to the design phase). For example, I once worked on a system where we identified one area of the solution that needed a (relatively) high update rate (low latency and high throughput for updates coming from an external system, processed within the system and delivered all the way to the user's workstation window). While this was both predicted to be frequently used and an essential requirement for the success of the system, the majority of the system's functionality did not have these characteristics. The decision I made was to treat it as a local issue (to be given a specific solution at design time). (Unfortunately?) I moved to another company before the project ended, and the architect that followed me took the opposite decision (i.e. to make the solution for that problem an architectural solution for all the system's "entities"), which resulted in making all the interactions within the system (even the simplest ones) asynchronous. I recently had a chance to see the effects of this decision on the system's schedule, robustness and complexity. Let me just say that the lesson here is that while cutting your hair (making an architectural decision) is not an irreversible decision, you cannot just undo it instantly, and it can take quite a long time (=money) to correct things.

 

To sum things up

  • Functional requirements manifest themselves as part of the utility tree (as the scenarios)
  • It can also be important to view the architecture from the functional perspective
  • Significant/important functional requirements should be weighed for their influence on the system's architecture (should they get a local treatment as part of the design, or affect the system globally?)

 
Tags: Everything | Software Architecture | SPAMMED Process

September 12, 2005
@ 07:29 PM

Grady Booch (one of the "three amigos", the fathers of UML) added a gallery of architecture views from the literature (JPGs) to his Architecture Handbook site.

Note: registration needed


 
Tags: Everything | Software Architecture

In the previous post  about SAF I introduced the concept of quality attributes. I wrote that using a "utility tree" approach is a very good way to identify, document and prioritize quality attributes. The purpose of this post is to expand on this issue

 

As I mentioned before, MSF 4 for CMMI improvement makes use of LAAAM (developed by Microsoft's Jeromy Carriere) for assessing the architecture (which is also a good place to use it, but I'll talk about that when I get to E(valuation) of SAF). LAAAM also builds on a "utility tree"; below are the sub-activities mentioned in the MSF beta bits:

 

  • Examine quality of service requirements and product requirements to determine the key drivers of quality and function in the application.
  • Construct a utility tree that represents the overall quality of the application. The root node in the tree is labeled Utility.
  • Subsequent nodes are typically labeled in standard quality terms such as modifiability, availability, security. The tree should represent the hierarchical nature of the qualities and provide a basis for prioritization.
  • Each level in the tree is further refinement of the qualities. Ultimately the leaves of the tree become scenarios.
  • For each leaf in the utility tree, write a scenario. The scenario is in the form of context, stimulus, and response. For example, "Under normal operation, perform a database transaction in fewer than 100 milliseconds."
  • Open the assessment matrix template. Enter each scenario as a row in the assessment matrix.

 

ATAM (by SEI), another architecture evaluation methodology, talks about a similar process with the addition of prioritization:

 

  • Select the general, important quality attributes to be the high-level nodes
    • E.g. performance, modifiability, security and availability.
  • Refine them to more specific categories
    • All leaves of the utility tree are “scenarios”.
  • Prioritize scenarios
  • Present the quality attribute goals in detail
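
As a rough illustration of the structure both methods describe, a utility tree can be sketched as a small nested data structure: a root labeled Utility, quality attributes below it, refinements below those, and scenarios as the leaves. This is only a sketch; the `Node` class and its `leaves` method are names made up for this example and are not part of LAAAM or ATAM.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a utility tree; a node without children is a scenario leaf."""
    label: str
    children: list["Node"] = field(default_factory=list)

    def leaves(self) -> list["Node"]:
        """Return every leaf (scenario) below this node, left to right."""
        if not self.children:
            return [self]
        found = []
        for child in self.children:
            found.extend(child.leaves())
        return found


# Root is labeled "Utility"; level 1 holds quality attributes,
# level 2 their refinements, and the leaves are scenarios.
tree = Node("Utility", [
    Node("Performance", [
        Node("Latency", [
            Node("Under normal operation, perform a database "
                 "transaction in fewer than 100 milliseconds"),
        ]),
    ]),
    Node("Modifiability", [
        Node("Effort", [
            Node("For a new release, integrate a new component "
                 "implementation in three weeks"),
        ]),
    ]),
])

# Each leaf would become one row in an assessment matrix.
for row, leaf in enumerate(tree.leaves(), start=1):
    print(row, leaf.label)
```

Walking the leaves is what produces the rows for the assessment matrix mentioned in the MSF list; the intermediate levels only exist to organize and later prioritize them.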

 

This post is going to cover writing the scenarios, their prioritization and what's missing from both these methods (since they are evaluation methods) - ways to help us identify which quality attributes to use in the first place.

 

First, before we delve too much into details, here is an example of what the end result might look like (taken from http://www.akqit.ch/w3/pdf/bosch_atam.pdf - I am trying to see what I can publicize from projects I've been involved with, but I guess this will have to be later, i.e. in a separate post)

 

It is hard to explain exactly how you would go about eliciting the quality attributes and their refinements (I think that the best way to do that would be through a workshop, but it's hard to do that over a blog :) ). It does, however, involve the same techniques you would use to elicit any other requirement: building on your past experience from similar systems, but mostly working closely with your stakeholders:

  • Interviews - meeting with individual stakeholders to discuss their view of the system
  • Brainstorming - meetings with multiple stakeholders trying to come up with attributes and scenarios
  • Reading written requirements (if available) - e.g. RFPs, use cases , project risks document etc.

 

To help with the elicitation, I'll try to give you some lists for the first two levels (attributes and refinements) that can serve as a repository or checklist when you are working with the stakeholders.

 

I already provided  a relatively long list of quality attributes to draw from to create level 1 of the tree (though the list is not an exhaustive one) in the previous post .

 

For level 2 of the tree (refinement), consider the following lists for the common quality attributes (most from Applicability of General Scenarios to the Architecture Tradeoff Analysis Method)

 

  • Performance
    • latency
    • deadline
    • throughput
    • jitter
    • miss rate
    • data loss
  • Availability -
    • time period when the system must be available
    • availability time
    • time period in which the system can be in degraded mode
    • repair time
    • boot time
  • Modifiability / Replaceability / Adaptability / Interoperability
    • difficulty in terms of time
    • cost/effort in terms of number of components affected
    • effort
    • money
  •  Efficiency
    • Resource X (CPU/Memory/…) usage on average per unit of time
    • Max usage of Resource
    • Availability of resource over time
  • Usability / Learnability  / Understandability / Operability
    • task time
    • number of errors
    • number of problems solved
    • user satisfaction
    • gain of user knowledge
    • ratio of successful support requests to total requests
    • amount of time/data lost

 

The scenarios are the most important part of the utility tree. The main reason is that the scenarios help us understand the quality attributes needed and, more importantly, by tying the attributes to real instances in the system, the scenarios help make these goals both concrete and measurable.

 

A couple of things that are important to note about scenarios

  • First and foremost - Scenarios should be as specific as possible.
  • Scenarios should cover a range of
    • Anticipated uses of the system (“use case” scenarios) - what happens under normal use
    • Anticipated changes to the system (growth scenarios) - where you expect the system to go and develop
    • Unanticipated stresses to the system ("Soap opera scenarios" or exploratory scenarios, pushing the envelope etc.)

 

 

Scenarios are basically statements that have a context, a stimulus and a response, and describe a situation in the system where the quality attribute manifests itself.

Context - under what circumstances

Stimulus - trigger in Use case lingo

Response - what the system does.

 

Let's look at a few examples to try to clarify this:

 

  • Under normal operation, perform a database transaction in under 100 milliseconds (Use case)
  • Remote user requests a database report via the Web during peak period and receives it within 5 seconds (Use case).
  • Add a new data server to reduce latency in the report scenario above to 2.5 seconds within 1 person-week. (Growth)
  • An intrusion is detected, and the system cannot lock the doors. The system activates the electromagnetic fence so that the intruder cannot escape (Use Case)
  • For a new release, integrate a new component implementation in three weeks. (Growth)
  • Half of the servers go down during normal operation without affecting overall system availability (Soap opera)
  • Under normal operations, queuing orders to a site which is down, system suspends within 10 minutes of first failed request and all resources are available while requests are suspended. Distribution to others is not impacted. (Use case)
  • By adding hardware alone, increase the number of orders processed hourly by a factor of ten while keeping the worst-case response time below 2 seconds (Soap opera)

 

If we take one of these (e.g. "An intrusion is detected, and the system cannot lock the doors. The system activates the electromagnetic fence so that the intruder cannot escape ")

The stimulus - An intrusion is detected

Context - the system cannot lock the doors.

Response - the system activates…

Or another one (Half of the servers go down during normal operation without affecting overall system availability)

Stimulus - Half the servers go down

Context - during normal operation

Response - without affecting overall ...
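
The same decomposition can be written down as a small record, which makes it easy to spot scenarios that are missing one of the three parts. This is a minimal sketch; the `Scenario` class, its `kind` field and the `is_complete` check are assumptions made for illustration, not something ATAM or LAAAM prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    context: str   # under what circumstances
    stimulus: str  # the trigger, in use case lingo
    response: str  # what the system does
    kind: str      # "use case", "growth" or "soap opera"

    def is_complete(self) -> bool:
        """A scenario is only usable when all three parts are spelled out."""
        return all(part.strip() for part in (self.context, self.stimulus, self.response))


intrusion = Scenario(
    context="the system cannot lock the doors",
    stimulus="an intrusion is detected",
    response="the system activates the electromagnetic fence so that the intruder cannot escape",
    kind="use case",
)

outage = Scenario(
    context="during normal operation",
    stimulus="half of the servers go down",
    response="overall system availability is not affected",
    kind="soap opera",
)

for s in (intrusion, outage):
    print(f"[{s.kind}] {s.stimulus} | {s.context} | {s.response}")
```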

 

 

The last step is prioritizing the scenarios. It is common to use 2 criteria (though you can use more):

  • Importance  to system success
    •  High, Medium, Low
  • Risk/difficulty in achieving
    • High, Medium, Low

 

The interesting scenarios (where you would focus) are the ones with high priority: (H,H), (H,M) and (M,H). These will be used as input for the modeling step of SAF.
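
To make the selection rule concrete, here is a small sketch that rates a handful of scenarios and keeps only the focus combinations. The ratings below are invented for illustration (they do not come from a real project), and `focus_scenarios` is a hypothetical helper name.

```python
# Ratings are (importance to system success, risk/difficulty in achieving).
# The H/M/L values here are made up for the sake of the example.
scenarios = [
    ("Database transaction in under 100 milliseconds", "H", "M"),
    ("Integrate a new component implementation in three weeks", "M", "M"),
    ("Half of the servers go down without affecting availability", "H", "H"),
    ("Database report via the Web within 5 seconds at peak", "M", "H"),
    ("Queue orders to a site which is down", "L", "H"),
]

# The combinations singled out above as worth focusing on.
FOCUS = {("H", "H"), ("H", "M"), ("M", "H")}

RANK = {"H": 2, "M": 1, "L": 0}


def focus_scenarios(rated):
    """Keep the focus combinations, most important (then riskiest) first."""
    kept = [s for s in rated if (s[1], s[2]) in FOCUS]
    return sorted(kept, key=lambda s: (RANK[s[1]], RANK[s[2]]), reverse=True)


for text, importance, risk in focus_scenarios(scenarios):
    print(f"({importance},{risk}) {text}")
```

Sorting by importance first and risk second is just one reasonable ordering; with more than two criteria you would extend the tuple used as the sort key.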

 

I'll try to provide  samples based on my experience in one of the future posts.

 


 
Tags: Everything | Software Architecture | SPAMMED Process

I just found this on the new CodeGallery on GotDotNet (via Brad Wilson's blog). The paper makes for interesting reading and discusses some of the issues I was going to cover regarding Architecture Evaluation (I will probably blog about them anyway, but that's just because I like to write :) ).

Also note that the document is labeled "Microsoft Confidential." on the first page - I am guessing that the document status changed and they forgot to remove this notice - but it might also mean the document will not be there for long...


 
Tags: Everything | Software Architecture | SPAMMED Process

September 1, 2005
@ 08:01 PM

Harry Pierson talks about architecture and software architects. He quotes and adopts Alan Cooper's* view of an architect: "...The architect is responsible for determining who the user is, what he or she is trying to accomplish, and what behavior the software must exhibit to satisfy these human goals..."

 

I agree that customer focus is a very important aspect of the architect's work - To quote from WWISA's definition of the architect's role:

"Client advocacy is the cornerstone of the architect’s role ... An architect ceases to be an advocate if tethered to a prescribed set of technologies, tools, or methodologies, only narrowing the solutions available to the client…". Marc Sewell in "The Software Architect's Profession: An Introduction" expands this view comparing software architects to construction architects, i.e. the architect role is to represent the client vs. the construction (development) organization.

 

Nevertheless, I believe that the only way for the architect to accomplish these feats is to do design. No, not "low-level" design of local issues, but yes, the design of the overall system, be that partitioning into business-aligned services as advocated by SOA or identifying the strategies needed to cope with fault tolerance (an availability issue).

Yes, "Coming up with the lists of functional requirements and non-functional constraints is the architecture problem" (as Harry wrote in a previous post) describes an important part of the architect's job, but it is far from being the only part. It is just the preface to the actual work of laying out a solution that can support these requirements and especially the quality attributes (non-functional reqs.). To use Marc Sewell's analogy: just as building architects design blueprints for buildings and naval architects design blueprints for shipbuilding, the software architect has to draw the blueprints which the development teams will use.

Lastly, as Jack Greenfield et al. pointed out in "Software Factories", the model at one level of abstraction is the specification for the next level of abstraction. Thus the requirements are the specification of what the system does (without specifying what the solution is) and the architecture is the specification of what the solution is (without specifying how it should be implemented).

 

It is also important to note that the customer is not the only stakeholder whose concerns have to be considered (and balanced) by the architect (see a previous post I made on stakeholders). This aspect is even more pronounced in "real life", since the architect is more often than not a part of (or hired by) the developing organization and not the client (i.e. you don't usually see a client that hires an architect to represent it vs. the development contractor).

 

Having said that, I would also like to comment on the quote "architecture is design but not all design is architecture". What I meant to imply by that is not that architecture is some sort of "good design", but rather that architecture is a certain level of design that takes into account global decisions which affect the whole system (and identifies key local decisions), and that there are other levels of design which are not architecture (what Harry calls engineering).

 


* Alan Cooper is considered the father of VB. He is also the author of very enlightening books on interaction design: "About Face 2.0" and "The Inmates Are Running the Asylum"

 


 
Tags: Everything | Software Architecture

August 15, 2005
@ 10:16 PM

If you are following this or Udi Dahan's blog, you've probably already read that we are currently working on a project together. The project we're working on has a lot (and I mean a lot) of flexibility requirements (adaptability, replaceability, interoperability etc.), which pretty much pushed us toward service orientation.

Naturally (?), we invested a lot in partitioning our services around logical boundaries, which will allow autonomy and loose coupling. We also tried to make sure this partitioning is aligned with the business goals of both the current project and the future road-map (the same architecture should grow to serve a product line).

The more uncommon (and probably more interesting) design decision we made is regarding the internal structure of the services. We have an additional level of services (i.e. Micro services) that partition the service internally. The big difference, however, is that the alignment of these service boundaries is technical/design driven and not business driven. For example, we can have a micro service to deal with persistence and another to do number crunching (for important algorithms) etc.

What are the benefits of this architectural decision?

  • Unified semantics both inside and outside the service
  • Unified solutions for availability
  • Fine grained control on scale-out strategies and bottleneck handling (scale-out in the service level or the micro-service level is relatively easy)
  • Built in solution for increasing concurrency (remember, partitioning along technical decisions)
  • Contract based negotiations allow for increasing flexibility in identified areas (e.g. set important algorithms as a micro service and it will be easier to upgrade them later to a more advanced version) 
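
To illustrate the last point, here is a minimal in-process sketch of an algorithm sitting behind a contract so that it can later be swapped for a more advanced version. Everything in it is hypothetical: the `SmoothingContract`, the two algorithm versions and the `TrackingService` are invented names, and real micro services would communicate through messages rather than direct method calls.

```python
from typing import Protocol, Sequence


class SmoothingContract(Protocol):
    """Hypothetical contract of a number-crunching micro service.

    It takes a whole batch per call; a per-sample call would make the
    contract chatty and add latency inside the service.
    """

    def smooth(self, samples: Sequence[float]) -> list[float]: ...


class MovingAverageV1:
    """First version of the algorithm: plain 2-point moving average."""

    def smooth(self, samples: Sequence[float]) -> list[float]:
        if not samples:
            return []
        out = [samples[0]]
        for prev, cur in zip(samples, samples[1:]):
            out.append((prev + cur) / 2)
        return out


class ExponentialV2:
    """A later version of the algorithm behind the same contract."""

    def __init__(self, alpha: float = 0.5) -> None:
        self.alpha = alpha

    def smooth(self, samples: Sequence[float]) -> list[float]:
        out: list[float] = []
        for x in samples:
            out.append(x if not out else self.alpha * x + (1 - self.alpha) * out[-1])
        return out


class TrackingService:
    """The business-aligned service; it knows the contract, not the version."""

    def __init__(self, algorithm: SmoothingContract) -> None:
        self.algorithm = algorithm

    def process(self, samples: Sequence[float]) -> list[float]:
        return self.algorithm.smooth(samples)


data = [2.0, 4.0, 8.0]
print(TrackingService(MovingAverageV1()).process(data))
print(TrackingService(ExponentialV2()).process(data))
```

The service code does not change when the algorithm is upgraded, which is the flexibility the contract is meant to buy; the batch-oriented signature is one way to keep the contract from becoming chatty.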

The main cons are additional complexity and increased latency inside the service. Also, there's the risk of partitioning the micro services into services that are too fine-grained and increasing the latency further (watch out for chatty contracts).

Other things to pay attention to: you probably don't want to have fine-grained micro services (just as you wouldn't want them at the services level). Also, it is probably not a very good idea to have too many levels of micro services (it seems that a single level below the service is just right, but I don't have enough data to validate that yet).

By the way, it was Udi who coined the term "Micro Service", so if you decide to use it, send the royalties to him :)

 

 


 
Tags: Everything | Software Architecture

August 10, 2005
@ 09:50 PM

Udi Dahan and I are currently architecting an interesting little-big project (little as it has a very limited scope for a first deliverable on a very tight schedule, and big as the stakeholders require the architecture being drafted now to fit/scale for some very ambitious plans for the future, past this deliverable). It is very interesting working together and seeing how different people have different styles, approaches and even directions when looking at the same set of requirements and problems (and that's when we basically do have a common approach).

Yesterday Udi posted on one such difference, in the area of documenting our architectural designs. Udi thinks it is simpler and more understandable to use sequence diagrams, while I opted for using the UML 2.0 communication diagram.

The main reason for me to choose this diagram (over sequences) was to show off my UML skills (?!) - well, not really. Rather, the reason was that we were trying to describe the overall context of the UI. Using a communication diagram helps clarify the responsibilities of the classes in the (UI) context by showing both the relations of the main classes (actually class instances) and the main flows of messages between them. Using a single diagram allows for understanding (once you get yourself familiar with the syntax) how all the objects collaborate to achieve the mutual goal of making the UI tick (I guess a better name for this diagram is collaboration diagram, but that means something else in UML 1.x).

Also as Scott Ambler mentions in The Object Primer 3rd Edition: Agile Model Driven Development with UML 2 collaboration diagrams are useful when use cases aren't the primary requirements artifact (which is the case in the particular project we're working on).

As for "visual shock" - well, I think the class diagram is less intuitive and straightforward compared with the communication diagram. What exactly is the difference between aggregation and composition, and how is that different from association? Also, when would you use association over dependency? Class diagrams seem to be "simple" because we are used to them. Communication diagrams

Lastly, talking about the "white-board" approach: compared with sequence diagrams, with communication diagrams it is much simpler to rearrange messages and/or to insert a new message in the middle of a sequence (just draw a new message and renumber the other messages).

 


 
Tags: Everything | Software Architecture

August 9, 2005
@ 11:09 PM

Today I read a post by David Ing called "An Overly Long Guide to Being a Software Architect". David talks about different aspects of being a software architect. Among those things he mentions two important soft skills for architects, namely organizational politics and communications. Two additional soft skills (or competencies) that an architect needs are strategic thinking and leadership (there may be some others, but I think these 4 are the main ones).

Dana Bredemeyer measures competencies from 3 viewpoints - what you know, what you do and what you are.
For example looking at Organizational Politics

  • What you know - Who the key stakeholders are, what they want from the business and personal perspectives
  • What you do - Communicate, network, build relations, sell the vision/architecture
  • What you are - comfortable with compromise and conflict, able to look at situations from several viewpoints, articulate, patient, resilient, sensitive to power flow within the organization.

Or if we look at leadership

  • What you know - yourself
  • What you do - mentor others, motivate others, build teams and set their vision, influence decision makers
  • What you are - committed, passionate, credible.

Bredemeyer also supplies competency elaborations (levels for each competency and guidelines on how to advance yourself between the levels) for Strategic alignment, Organizational Politics and Leadership.

Another interesting source on architecture competencies is a book called "Software Architecture - Organizational Principles and Patterns"   by David M. Dikel, David Kane and James R. Wilson.
The authors detail a reference model of the organizational aspects of the software architecture process (vs. for example the SPAMMED Architecture Framework (SAF) which details the more technical aspects of the process).
The model talks about 5 principles:

  • Vision - Deals with the mapping of future value to the constraints on the architecture and how well understood, flexible etc. are the architecture's structure goals
  • Rhythm - the predictability and recurrence in the exchange of deliverables between the architecture group and the architecture consumers.
  • Anticipation - The extent to which the architects predict, validate and adapt the architecture to changing requirements (as well as technologies, competition etc.)
  • Partnering - the extent to which the architects interact with the architecture stakeholders to maximize the value delivered or received by the different parties
  • Simplification - achieving the "zen" (clarification and minimization) so to speak of both the architecture and the organizational environment where it lives.

An example of the patterns and antipatterns that relate to Stakeholders (the first step of SAF):

  • Criterion : The architect continually seeks to understand who the most critical stakeholders are, how they contribute value and what they want.
  • AntiPattern - Phone Doesn't Ring
  • Pattern - Know Thy Stakeholders
  • Criterion: Clear compelling agreements exist between stakeholders
  • AntiPattern: Lip-Synching
  • Pattern: Reciprocity

The book naturally continues to describe what these patterns are :).

The important takeaway from this is that while knowing every nook and cranny of "the framework formerly known as Indigo" (WCF) probably won't harm you, technical competency alone will only take you so far as an architect, and you cannot afford to neglect growing and working on the aforementioned soft skills.

 


 


 
Tags: Everything | Software Architecture | SPAMMED Process

I recently saw a post by James Gosling (via David Strommer) called the Eight Fallacies of Distributed Computing. These are eight assumptions about the network that almost anyone new to distributed computing makes, and which prove to be wrong in the long run (and thus cause big problems and headaches).

I thought I'd try to complement this list by adding a few realities of distributed systems and data:

  1.  Expect a certain level of entropy in the system -  Sites are never fully synchronized (unless you stop new data from pouring in)
  2. You can only afford to cache immutable data
  3. It is very, very hard to be able to scale indefinitely
  4. Observing global state is only possible via control messages
  5. It is hard to achieve distributed consensus (membership in a cluster, total order, commitment etc.)
  6. Expect to debug by log-files

There are probably many others - but these are the first few that came to mind :)

 


 
Tags: Everything | Software Architecture

July 24, 2005
@ 09:18 PM

Architecture Description Languages (ADLs) sound like a great idea when you look at the "spec sheet":

  • ADLs represent a formal way of representing architecture
  • ADLs are intended to be both human and machine readable
  • ADLs allow simulation and analysis of architectures – completeness, consistency, ambiguity, and performance
  • ADLs can support automatic generation of the system (as it is represented by the architecture)

There are quite a few ADLs out there: xADL, ACME/ADML, SSEP, Rapide, Wright, and the list goes on and on.

I am guessing most of you haven't heard of any of those (well, ACME appears in all those Wile E. Coyote vs. Road Runner cartoons, but that doesn't count). I think that the culprit lies in the fact that most (if not all) of these come from the academic world, where the focus lies on the models (in terms of semantics, completeness, rigor etc.) and not on the practical use and applicability to day-to-day issues. Another problem lies in the fact that ADLs are not supported by the mainstream tools used by the industry.

By the way, UML isn't considered an ADL for several reasons, for example the weak integration between the different models, which inhibits automatic analysis.

The reason I am bringing the issue of ADLs up is that the new architecture designers in Visual Studio 2005 (the Application Designer, the Logical Datacenter Designer, the System Designer and the Deployment Designer) actually form a set of DSLs (Domain Specific Languages) that together can be treated as an integrated ADL. Furthermore, Microsoft also provides an SDK for the "System Definition Model", the underlying model behind all these DSLs, which lets interested parties extend and build additional designers.

While this model is not complete in several aspects (some designers, like the Logical Datacenter Designer, are too limited, and there aren't enough views to cover all of the architectural description), it is a good starting point in bringing the ADL concept into a more practical and usable form.


 


 
Tags: .NET | Everything | Software Architecture

This is part II of P is for Principles (and Guidelines and Constraints...) - Iteration I

In the previous entry I've said that it is helpful to provide a list of constraints and principles as it helps in limiting the scope of the solution and directing it toward good (and/or required) practices (Architectural, technological or business aligned). However, I also claimed that a simplistic list is problematic - I'll try to demonstrate this through a couple of real-world examples:

I recently reviewed the software architecture document (SAD) of a rather large software project. I saw that it mentioned that the project uses "service oriented architecture". Reading on, I saw that the architecture builds on a distributed "shared memory" where every client and server has a full image of all the data entities; furthermore, data entities are intertwined without any boundaries whatsoever. When I asked what's service oriented in that, I was told that the underlying distributed "shared memory" engine provides several services like dissemination, scheduling etc. The point here is that a catchy name can mean different things to different people.

On another project some of the stakeholders mentioned it is important that the solution would have an "Open Architecture". Sounds good to me, but what the hell does that mean? Based on open standards? Promoting extensibility (easy to add features)? Promoting replaceability (easy to replace components)? All of the above? Something else?

Furthermore if you don't understand the implications behind each of the principles you name -  it gets very tempting to create a "Buzzword Oriented Architecture"- we want SOA, AOP, Software Factories, Smart clients, GRID, fault tolerance and whatnot... (Note - It is recommended to proceed with caution if you see too many buzzwords in your guidelines/goals. You might be trying to accomplish too much and/or you have too much marketing influence).

So what makes a good (or at least better) description for a principle? I currently use the following template:

Name - something easy to remember (e.g. SOA, Layered Architecture etc.)
Description - What does it mean
Rationale / Benefits -  Why do we want to apply this principle
Implications - What does it mean to use it 
Alternatives - What else - What are the other options we considered and why we didn't use them.
Scope/Exceptions - when and where does it apply

Note: I used to use a simpler template (without Implications and Scope); the current version is based on a template by Ilia Fortunov from Microsoft UK. I can explain that further, but it will probably be easier to understand through examples, so here are a couple from projects I worked on:

Principle Name: Code Generation
Description: Generate specific implementations and allow users to configure the generated code via designers.
Rationale/Benefits: Increase the longevity of the domain model (help separate it from the technical implementation). Reduce bugs via the use of design tools.
Implications: Need to "templatize" solutions and develop code generators (need to check commercial solutions).
Alternatives: 100% object orientation and generic implementations. Rejected due to tight coupling to technology, performance implications and impact on code readability.
Scope/Exception: Domain entities and any "aspect" related implementation (logging, security etc.)

Note: this is for a solution that has to use .NET 1.1. Had it been a solution that relies on .NET 2.0, using Generics might have been a viable alternative.
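
To make the "templatize and generate" implication a little more concrete, here is a minimal sketch of the idea. It is written in Python purely for brevity (the project itself was .NET); the template text, the `generate_entity` function and the `Customer` entity are all hypothetical.

```python
from string import Template

# The "templatized" solution: technology-specific code lives in the template,
# while the domain model is just a name and a list of fields.
ENTITY_TEMPLATE = Template('''class $name:
    """Generated domain entity; regenerate instead of editing by hand."""

    def __init__(self, $args):
$assignments
''')


def generate_entity(name, fields):
    """Generate source code for a domain entity with the given fields."""
    args = ", ".join(fields)
    assignments = "\n".join(f"        self.{f} = {f}" for f in fields)
    return ENTITY_TEMPLATE.substitute(name=name, args=args, assignments=assignments)


source = generate_entity("Customer", ["customer_id", "name"])
namespace = {}
exec(source, namespace)  # compile the generated code so it can be used right away
customer = namespace["Customer"](42, "Ada")
print(source)
```

Changing the technical implementation then means changing the template and regenerating, while the domain description (`"Customer"` and its fields) stays as it is.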

Another example:

Principle Name: Distributed Database
Description: Each site shall have a separate, independent copy of the DB and will have to synchronize its data with connecting sites. (Note that I use the term synchronize and not replicate, as replication is a specific technical solution.)
Rationale/Benefits: The inter-site communication medium is unreliable, and sites have to maintain autonomy.
Implications: The system has to cope with partial data. There's a need for a conflict resolution policy. We need to consider idempotent messages for inter-site communications. We need a distributed primary key management scheme.
Alternatives: Federated database - problematic since (some of the) data will not be available when sites are disconnected. Another solution we weighed is a combination of a centralized server and "off-line" capabilities upon disconnection - rejected as it would be more complicated (based on past experience) and would depend heavily on bandwidth (for on-line work).
Scope/Exceptions: Database layer and inter-site communication.
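
Two of the implications listed above, idempotent messages and distributed primary keys, can be sketched in a few lines. This is an illustration only and not the design used on the project: the `Site` class, the site-prefixed key format and the "highest version wins" conflict policy are assumptions made for the example.

```python
import itertools


class Site:
    """A hypothetical site holding its own independent copy of the data."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.rows = {}        # primary key -> (version, value)
        self.seen = set()     # ids of sync messages already applied
        self._counter = itertools.count(1)

    def new_key(self):
        # Site-prefixed keys cannot collide, even while sites are disconnected.
        return f"{self.site_id}-{next(self._counter)}"

    def write(self, key, value):
        """Store a value locally and return the sync message describing it."""
        version = self.rows.get(key, (0, None))[0] + 1
        self.rows[key] = (version, value)
        return {"id": f"{self.site_id}:{key}:{version}",
                "key": key, "version": version, "value": value}

    def apply(self, message):
        """Apply a sync message; applying the same message twice changes nothing."""
        if message["id"] in self.seen:
            return False
        self.seen.add(message["id"])
        current_version = self.rows.get(message["key"], (0, None))[0]
        # Assumed conflict resolution policy: the highest version wins.
        if message["version"] > current_version:
            self.rows[message["key"]] = (message["version"], message["value"])
        return True


a, b = Site("A"), Site("B")
key = a.new_key()
msg = a.write(key, "order")
first = b.apply(msg)     # applied
second = b.apply(msg)    # duplicate delivery over the unreliable link, ignored
```

A real conflict resolution policy would need more than a version counter (two sites can both produce version 1 of the same row), which is exactly why the principle lists it as an implication to be worked out.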

The next thing we have to deal with is constraints. As I've said in the previous post, constraints originate from the different stakeholders and limit the scope of the solution. To document the constraints in a meaningful way I use the following template:

Name: something easy to remember
Definition: What does it mean
Implications: What does it mean for the architecture? What limitations does it place?
Scope: where will we feel the impact
Origin: who placed this constraint and why

Again, let's look at a couple of examples:

Name: Use .NET 1.1
Definition: (pretty obvious...)
Implications: use tools/products compatible with .NET 1.1; don't rely on .NET 2.0 capabilities (important for the mapping stage of the SAF)
Scope: All the system
Origin: using .NET is a company policy; we need .NET 1.1 since all the existing tools (e.g. ClearCase, XDE) only support 1.1

And another example:

Name: Deadline
Definition: We must have a working deliverable in 5 months.
Implications: Strive to reuse existing assets; try to migrate code from legacy version; try to model (an extensible) simple architecture (don't try to solve everything now - but try to leave flexibility for future growth)
Scope: Mapping stage; Architecture documentation; quality attributes of the solution
Origin: Customer  (time-to-market)

One important thing to remember is that, in this context, we are not interested in all of the project's constraints (they matter, just not here). The meaningful constraints are those that have implications for the architecture.

To summarize: principles and constraints help you limit the scope of the intended solution architecture. Principles are based (mostly) on past experience; constraints must be followed (and originate from the different stakeholders). Using only a catchy phrase to describe either of these can prove problematic (it creates confusion, doesn't really add anything etc.), and it is better to think about the implications of applying the principles and constraints. Lastly, principles (at the first stage) should be taken with a grain of salt, as they may not be suitable for the current requirements. You should be ready to revisit and update them once you know more about the requirements (the end result would be a list of guidelines which are actually used for the solution's architecture).

Both templates (principles and constraints) are available for download on the template section of the SAF page.


 
Tags: Everything | Software Architecture | SPAMMED Process

Whenever you start a new project, even if you think you start from scratch, you don't really start from scratch. You always bring your past experience into the fray. Developing an architecture for a project is no different. The "Principles" stage of the SPAMMED Architecture Framework (SAF) is about bringing in your "lessons learned" and "best practices" as baseline rules or a starting point for the architecture you are trying to build. Laying down a good set of principles will help you limit the scope of the solution and focus on proven tactics.

It is important to note that principles you set should be treated as recommendations - the main reason for that is that they are based on past requirements and experience and not on current requirements.

An example of a principle may be "use layered architecture". Layered architecture is the practice where you define several levels/areas of processing and limit the communication paths between them (common patterns are: a layer can talk only to the layer just below it; a layer can talk to any layer below it etc. - the important characteristic is that communication is restricted). Layered architecture brings a lot of benefits, especially in promoting flexibility in deployment, in modifying implementations etc. Some of you reading this may think this is a trivial principle - however, sometimes it is good to put even trivial assumptions on the table. Furthermore, not everybody agrees that it is always needed (see this for example). Lastly, using a layered architecture has its risks: for example, if the different layers can't scale to the same extent you may find yourself with scalability issues down the road. (I'll give more detailed examples of principles in the next post on this subject.)
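
The two common layering patterns mentioned above can be expressed as a small rule check. This is a minimal sketch in Python; the layer names and the `allowed`/`violations` helpers are hypothetical.

```python
LAYERS = ["ui", "business", "data"]  # top to bottom; hypothetical layer names


def allowed(caller, callee, strict=True):
    """Return True if `caller` may call `callee` under the layering rule."""
    distance = LAYERS.index(callee) - LAYERS.index(caller)
    if strict:
        return distance == 1   # only the layer just below
    return distance >= 1       # any layer below


def violations(calls, strict=True):
    """Return the calls that break the layering rule."""
    return [(a, b) for a, b in calls if not allowed(a, b, strict)]


calls = [("ui", "business"), ("ui", "data"), ("data", "business")]
strict_violations = violations(calls, strict=True)
relaxed_violations = violations(calls, strict=False)
```

Under the strict rule both `("ui", "data")` and `("data", "business")` are flagged; under the relaxed rule only the upward call `("data", "business")` is. Either way the point is the same: communication is restricted, and the restriction can be checked.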

A notion similar to principles, in the sense of limiting the solution scope, is constraints. However, unlike principles, these are not recommendations but limitations you have to follow. Constraints are set by stakeholders (their origin may be company standards, customer standards etc.). There are several types of constraints:

  • Technical - limit the platform, reuse an existing system/solution/component, follow a particular standard, e.g. use Windows, .NET (alternatively use Linux, Java), use Web-services
  • Organizational - follow a particular process, availability of the customer, e.g. use RUP, MSF, company standard #15...
  • Business - time/money (deadline, budget). Another interesting example for this is "application freeze" - when an organization forbids change for a period - something some organizations did just before Y2K. [thanks to Andrew Johnston from www.agilearchitect.org for this one]

So how does it work? Well, you bring in your architecture team along with some of the technical stakeholders, go through one of those "brainstorming" meetings and come up with a list, e.g.:

  • Build on Open standards
  • Reuse
  • SOA
  • Layered Architecture
  • support scale-out
  • .
  • .
  • .

Wait, what's wrong with this picture? Well, for one, it is too simplistic a view (it doesn't say much), and even more importantly it is not accurate (i.e. it can mean different things to different people). In the next post on this subject I will show a better way to document and understand principles, along with a few examples from real projects.

 


 
Tags: Everything | Software Architecture | SPAMMED Process

[thanks to Eliaz Tobias from MS Israel for the link]

The beta version of MSF 4.0 formal is available for download. This is an interactive process which is compliant with CMMI level 3.0 (see http://www.sei.cmu.edu/pub/documents/02.reports/pdf/02tr029.pdf chapter 7 - maturity levels) and supported by the tools (i.e. Visual Studio Team System 2005). This can be great news for organizations like the one I work for these days, which are certified for CMMI 3.0 and can (hopefully) stay compliant with less bureaucracy.

Looking at this process from my perspective (i.e. as an Architect), it also looks interesting. It defines several roles for architects (e.g. at the domain technical level as Subject Matter Experts and at the solution level as an Architect). The process suggested is tailored to/aligned with VS2005 capabilities (and thus somewhat limited); however, many of the steps are both viable and important. For example, it has parts that deal with quality of service requirements (what I call Quality Attributes in SPAMMED). My particular favorite step is "Assess Alternatives (LAAAM)", which I helped introduce ( :) ) in my previous job as an Architect for Microsoft Consulting Services.

 


 
Tags: .NET | Everything | Software Architecture

July 7, 2005
@ 08:32 PM

It is not a secret that user involvement increases the success rates of software projects. We can basically look at the architecture as a mini-project inside a project. The users in that case (as my entry on Stakeholders shows (or tries to)) are the stakeholders.

Richard Demers talks about an Architecture Control Board (on his smalltalk blog - oh my :) ) as a way to increase the involvement of stakeholders in large software projects. The idea is to engage as many stakeholder groups as possible on a periodic basis and to set up a review and change board that approves changes in requirements and also reviews and approves the architecture (I'll talk more about something similar when I get to E - Evaluation in SPAMMED).

Richard lists 13 points in regard to the Architecture Control Board - the most important ones (in my opinion) are

  • the ACB sets the requirements scope for the architecture (which requirements/quality attributes should be accounted for)
  • the ACB reviews and critiques the architecture - it doesn't, however, design it or vote on its correctness.
  • the architecture team has final say on architectural decisions (though an escalation path to upper management should exist)
  • the documentation approved by the ACB is the ultimate deliverable of the architecture team (i.e. the "Software Architecture Document")
  • go read the rest :)

While establishing a "formal" ACB in smaller-scale projects is probably overkill, you may still want to follow some of these tactics on a less formal basis to increase your stakeholders' involvement and, more importantly, their cooperation.

 

 


 
Tags: Everything | Software Architecture | SPAMMED Process

July 6, 2005
@ 04:48 AM

There's a lot of confusion on what Service Oriented Architecture (SOA) is and isn't. Martin Fowler sums it up nicely in his Bliki (Blog+Wiki). Clemens Vasters blogged something similar back in May. Whether you agree or not - one thing is sure - there's so much hype around SOA these days that it is hard to understand the realities.

 


 
Tags: Everything | Software Architecture

The Composite UI Application Block (CAB) is an interesting architectural solution for smart client development. It is essentially a plug-in architecture that brings the WebParts concept to the desktop.

The CAB has the following elements:

  • WorkItems. These are the classes which represent use cases in your application and contain the business logic for those use cases.
  • SmartParts. These are the building blocks of a Windows-based application - similar in concept to WebParts.
  • Workspaces. These are helper classes that can display SmartParts with a uniform style.
  • UIElements. These are elements such as toolbars and menu bars that are shared by SmartParts within the application.
  • Support services. These include:
    • Event broker service to manage the publishing and subscribing of events between SmartParts.
    • State management service to hold shared state for areas within the application. This provides an option to encrypt the state before it is stored.
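
To show what an event broker buys you in a plug-in architecture, here is a generic publish/subscribe sketch in Python. It is not the CAB API; the `EventBroker` class, its method names and the topic string are hypothetical and only illustrate the pattern of parts communicating without referencing each other.

```python
from collections import defaultdict


class EventBroker:
    """A minimal publish/subscribe hub: parts share topics, not references."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a callable to be invoked for every event on `topic`."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        """Deliver `payload` to every handler subscribed to `topic`."""
        for handler in list(self._subscribers.get(topic, [])):
            handler(payload)


broker = EventBroker()
received = []

# One part listens for the selection, another part announces it.
broker.subscribe("customer.selected", received.append)
broker.publish("customer.selected", {"id": 7})
broker.publish("order.created", {"id": 1})  # nobody listens; nothing happens
```

Because the publisher and the subscriber only know the topic name, either one can be replaced or left out of a deployment without touching the other.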

The CTP is now available on GotDotNet (needs .NET 2 beta)


 
Tags: .NET | Everything | Software Architecture

July 4, 2005
@ 06:34 AM

It should come as no surprise that the first pillar of the SPAMMED Architecture Framework (SAF) is the stakeholders - after all, at least some of them are the people/organizations that are the cornerstone of the software project itself.

What exactly is a stakeholder? EIA 632, a standard for System Engineering, defines a stakeholder as "An enterprise, organization, or individual having an interest or a stake in the outcome of the engineering of a system".

Sounds good enough to me :) - but before we delve into how to identify these stakeholders and what to do with that information, we first have to understand why they are important to us as architects. The short answer was already mentioned: stakeholders (the most obvious examples are the Customer and the Project Manager) are what makes the project tick. That, however, is just the beginning of it. One of the primary responsibilities of the software architect (much like the project manager, by the way) is to balance the stakeholders' interests to ensure the success of the project. In the SAF sense the stakeholders are important for several reasons:

  •  The solution is developed to serve their needs or goals (at least some of them i.e. customer, end users, management)
  •  They serve as the source for constraints (and sometimes principles)
  •  Their concerns can help elicit the needed quality attributes
  •  Stakeholders can (and should) help evaluate the architecture
  •  The documentation of the software architecture is targeted at stakeholders

Several stakeholders are pretty common to any project. The following list shows them along with samples of the concerns (or vested interests) they have regarding the project:

  • Customer - Functionality, price 
  • End-User - Ease of use, performance
  • Project Manager - On time delivery, development costs
  • Management - Price, reuse
  • Developers - structure and dependency between components, interesting technology
  • Maintainers - ease of debugging, modifiability
  • Testers - Testability, Traceability
  • Security Analysts - security
  • Project newcomers - structure and dependency between components, traceability
  • Customer’s IT group - ease of installation, stability

This list is a nice starting point, but it is just that - a starting point. There are still a few things we may want to do:

  • Identify additional stakeholders - sometimes there are less common or less obvious stakeholders (e.g. system engineers, shareholders, safety analysts, the general public etc.)
  • Map stakeholders' relevance - Jaap Schekkerman suggests prioritizing stakeholders by power vs. interest. I believe the prioritization should also include the importance of the (stakeholders') concerns.

  • Lastly, you may want to document your stakeholders so as not to forget what they are interested in. This documentation can include the constraints they place, their concerns (translated into quality attributes) and a list of the viewpoints that are needed to satisfy them (i.e. to explain the architecture to them). A template for documenting stakeholders is available from the SAF page.
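
The prioritization step in the list above can be sketched as a simple scoring exercise. This is only an illustration: the 1-5 scales, the scores themselves and the choice to multiply the three factors are assumptions made for the example, not part of any published method.

```python
from dataclasses import dataclass


@dataclass
class Stakeholder:
    name: str
    power: int               # 1 (low) .. 5 (high), assumed scale
    interest: int            # 1 .. 5
    concern_importance: int  # 1 .. 5

    def priority(self) -> int:
        # Assumed scoring rule: a stakeholder ranks high only when all
        # three factors are high, hence a product rather than a sum.
        return self.power * self.interest * self.concern_importance


stakeholders = [
    Stakeholder("Testers", power=2, interest=3, concern_importance=3),
    Stakeholder("Customer", power=5, interest=4, concern_importance=5),
    Stakeholder("End-User", power=2, interest=5, concern_importance=4),
]

ranked = sorted(stakeholders, key=lambda s: s.priority(), reverse=True)
ranked_names = [s.name for s in ranked]
```

With these made-up scores the ranking comes out as Customer, End-User, Testers. The numbers matter less than the conversation needed to agree on them.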

 
Tags: Everything | Software Architecture | SPAMMED Process


Everybody talks about "Software architecture" these days (your humble servant included...) - but what the hell is it?
I mentioned in my first SPAMMED Architecture Framework post that Software Architecture "deals with the major components or structures, their relationships and interactions. It encompasses the major (read hard to change) decisions and their rationale and every system has an architecture (even if it is a default one)" - OK, sounds nice, but is it enough?

Probably not (if I thought it was enough, I wouldn't have written this post...). There are literally dozens (!) of definitions for what (solutions) software architecture is (you can see many of them here). I am not going to quote all of them; instead, let's spend some time looking at some of the more prevalent characteristics found in most definitions:

  • Architecture is Early - It represents (well at least, should represent) the set of earliest  design decisions which are both hardest to change and most critical to get right.
  • Every system has an architecture - even if it is just a default one (i.e. it can be described using reverse engineering), it is still there.
  • Architecture is about breaking a system into components and setting boundaries. It doesn't describe all the components - it usually deals with the major components of the solution. Also, it doesn't describe the complete characteristics of components - it mainly deals with their interfaces or other aspects that have to do with their interactions.

Which brings us to the next point:

  • Architecture is about the relationships and interactions of components. Again, we are interested in the behavior of a component as it can be discerned by the other components interacting with it.
  • Architecture explains the rationale behind the choices (vs. the choices not made). It is important to understand the reasoning as well as the implications of the decisions made in the architecture, since their impact on the project is large. It can also be beneficial to understand which alternatives were weighed and abandoned (for future reference, when/if things need to be reconsidered, and for anyone new to the project who needs to understand the situation).
  • There isn't a single structure that is the architecture - there's a need to look at the architecture from different directions or viewpoints to fully understand it.

There's a very interesting standard called IEEE 1471-2000 "Recommended Practice for Architectural Description of Software Intensive Systems", which defines the relations between the system stakeholders and the different viewpoints of the architecture. It serves as a good reference for understanding how to document an architecture - it also means that identifying the needs of the different stakeholders will tell us a lot about which views (details of the architecture from a specific viewpoint) we need to have.

  • Architecture is the first design artifact where a system’s quality attributes are addressed

Stakeholders are also the main source for these "quality" requirements, and it is the architect's responsibility to balance the quality attributes of the system. This and the previous point are the main reasons that the first step of the SPAMMED process has to do with identifying stakeholders (and I'll elaborate more on this in the next SPAMMED related post - "Stakeholders Everywhere").

Apart from the quality attributes side of the last point, it also basically states that:

  • Architecture is design (but not all design is architecture) 

Which raises another interesting question - what's the difference between architecture and design? But this will have to wait for another post as well.


 
Tags: Everything | Software Architecture | SPAMMED Process

Assuming the previous post made you curious  - here are a few links to get you started:

 


 
Tags: Everything | Software Architecture

June 21, 2005
@ 08:11 PM

The architect doesn't talk, he acts.
When this is done,
the team says, "Amazing:
we did it, all by ourselves!"
(17) (The Tao of software architect - Philippe Kruchten)

On the surface - When it comes to agile development the role of the software architect is a little more blurred.

The most obvious area where architects contribute is the technical architecture. An experienced technical architect can greatly enhance any project by steering the designs in the "best" directions (under the chosen platform constraints). The technical architect can also promote reuse etc.

Additionally, while the requirements change a lot and are not fully defined, the quality attributes of a system are more stable - if you need performance, then you need performance! Moreover, there are some qualities that are inherent to an agile process - for example, you want to put an emphasis on flexibility and maintainability - if your developed solution does not have these, refactoring it is going to be painful. Thus, depending on the size of the project, you may want to use an architect to help set the ground rules in the first couple of iterations.

Another area where architect involvement can be very beneficial is scaling an agile project. A good choice there is to try to break the project into smaller, loosely coupled projects (see this paper for example) - well, this is just what we (architects) live for...

By the way, few agile processes define the architect role up-front; one such process is MSF 4 Agile. Note that MSF (Microsoft Solutions Framework) version 4 comes in two "flavors": one that is aligned with CMMI and the other a (much) lighter version, the aforementioned MSF 4 Agile.

 

How does the SPAMMED process fare with an Agile project? (Surprisingly enough) I would say pretty well.

First off, the architect should be hands-on, i.e. part of the development team (most likely the technical lead).

  • Stakeholders, you would probably want stakeholders on-site for any architecture related  meeting (just as much as you would want the customer on site for other activities)
  • Principles would include things like TDD, Simple Implementations, Refactoring
  • Quality attributes would hold Flexibility and maintainability and a couple or so of the important project qualities (performance, availability..)
  • For the modeling you would choose very few views and would try to focus on ones that are manifested in deliverables
    A good example for this is the Application designer  in Whidbey (see screen shot below) - the result of which is the projects structure for the solution

  • Mapping doesn't really change
  • Evaluation would be focused on proof of concepts and/or skeletal architecture.
  • Deployment - well,  being part of the team... you would notice if your decisions were wrong or circumvented

To sum up - yes, Agile projects can, and many times should, use an architect to help them stay on track - hey, even XP has an architectural spike...

 


 
Tags: .NET | Everything | Software Architecture | SPAMMED Process

June 20, 2005
@ 10:20 PM

I am writing an entry on the architect's role in an agile project and I noticed I am using the term "Technical Architect". My first intention was to go on and explain that within that  post, but I then thought it would be better to give a specific entry that paints the complete picture (well, at least, as I see it :) )

There are three basic classes of software architectures:

  • Infrastructure Architecture - Has to do with presenting a workable infrastructure solution (types, deployment and configuration of servers, LAN, etc.). It basically deals with the overall layout of the "out of the box" solutions that help solve the problem. An example of this is MS's Windows Server System Reference Architecture
  • Business Architecture - Concerned with the business model as it relates to an automated solution. It has to do with the structural part of requirements analysis, and it is usually domain specific (sometimes the job of a business analyst). Domain architecture is ideally technology neutral, although more often than not it is cluttered by technical constraints.
  • Technical Architecture - Specific to technology and the use of this technology to structure the technical views (esp. Technology Mapping) of an architecture. Technical architects usually have a very good understanding of a technology (e.g. .NET, J2EE etc.) and how to best solve problems using that technology.


Additionally there are two compound classes

  • Solutions Architecture - Specific to a particular business area (or project) but still reliant on being a technical focal point for communications between the domain architect, business interests and development.
  • Product Line Architecture - The architecture that has to do with families of related solutions. It is basically the same as solutions architecture, only with an extended scope. A product line architect has to promote reuse and identify commonalities between several solutions while at the same time providing each of the solutions with enough specifics to make it a good, viable solution in its own right.

Lastly, there's the more comprehensive architecture class, which is Enterprise Architecture (EA). EA deals with the governing logic and strategy of a firm's core business processes and IT capabilities. It is a set of recipes (policies and principles) along with technical constraints (that are set for the different solutions within the enterprise). The EA is concerned with cross project/solution architectures and tries to lay down the rules needed to achieve the business standardization and integration requirements of the firm’s operating model.


 
Tags: Everything | Software Architecture

June 18, 2005
@ 08:16 PM

One of the most interesting things that MS is building these days is the set of tools around their software factories initiative.

Software Factories is actually a methodology that builds on the concepts of Domain Specific Languages (DSLs) along with frameworks, patterns and guidance.

The idea is to bring the notion of domain modeling and DSLs from horizontal markets (e.g. the form designer in any modern IDE) to the everyday use of vertical markets (i.e. your next application) and help realize the promise of product lines. Many of the ideas in software factories are not new (see for example this article from 1993) - however, today, as Jack Greenfield et al. state in their book, a number of needed technologies have (finally) matured enough to make it feasible. Furthermore, what's unique in this effort is that Microsoft is making the effort to back the ideas with tools that will enable architects and developers to put them into use.

A software factory needs 3 elements: a schema, a template and a tool to support them (a development environment). The schema is basically the recipe containing the overall information on how things should work together. The template is the set of DSLs, samples and frameworks that are used to create the factory. Lastly, we need the IDE that supports them (e.g. Visual Studio "Whidbey" - although, at least technically, this can also be achieved in Eclipse etc.)

I am, personally, very interested in this initiative. We already use code generation all over the place (DAL, serializations, Data Entities etc) and employ a domain driven approach even today. I would love to have "user-friendly" tools that will let me as an architect achieve these goals more easily.

One final note - yes I am aware of MDA (Model Driven Architecture) but all the MDA tools I've encountered thus far are lacking in their usability. This, by the way, brings me to one caveat with software factories - it is not based on UML 2.0. I would rather have a DSL that is based on stereotypes, tagged values and OCL. This would allow for a specific view using the appropriate designer but also for backward compatibility (and a downgraded view) on other tools. I do agree though, that UML 2.0 has its own share of problems - but that's a matter for another discussion altogether

 


 
Tags: .NET | Everything | Software Architecture

June 11, 2005
@ 08:16 PM

The previous post on SPAMMED introduced the different pillars of the process. This one focuses on their interactions

The state chart below (modeled using Sparx Systems' excellent Enterprise Architect) shows the possible transitions between the different process steps.

Eliciting stakeholders is usually the first step to take. The reason is that stakeholders serve as the base for several of the following steps; for example, stakeholders' concerns serve as a guideline for deciding which views to document during the modeling step.

The next step is documenting principles and goals, which are based on past experience. The real reason, however, that this step follows the stakeholders elicitation is that the step also includes documenting constraints, and most of these are set by the different stakeholders (e.g. use .NET/Java because it is a company standard).

Quality Attributes builds on the former two steps: deciding which quality attributes are important and balancing them is based, again, on the stakeholders' concerns and the principles you've set.

You would then follow this with some modeling and then technology mapping.

Up to this point the process has been pretty much "waterfall-like". However, after you evaluate your former steps, you may find that you need to revise any of the preceding steps. This is why you want to get here as soon as possible. Do not try to complete and "finalize" the architecture and only then perform the evaluation. Developing the architecture is very much like the other parts of the development life-cycle and can benefit greatly from a short feedback loop. Evaluating early helps prevent the architecture from becoming a bottleneck that holds up the whole development process and, even more importantly, from building too much structure on false assumptions/decisions.

Deploying the architecture, i.e. releasing it to the designers and developers, is the next step (assuming the evaluation was OK). Once the architecture is "out there", the realities of the chosen technology/product, changing requirements, budget/talent/time constraints and, most likely, the mere iterative nature of modern software development will probably still make you evaluate and retrace some of your steps. I believe this refactoring is a healthy procedure in the first few iterations of your project (the actual number depends on the methodology used, the iteration length and the project size). Nevertheless, if the number of times the architecture has to be reevaluated and changed, or alternatively the number of changes, is large, it is probably a sign that your architecture is in need of a serious overhaul.
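
The transitions described in the text above can be written down as a tiny lookup. This is a sketch reconstructed from the prose only (forward through the steps, back from Evaluate to any earlier step, and from Deploy back to Evaluate); the step names are shortened and the `next_steps` function is hypothetical.

```python
STEPS = ["Stakeholders", "Principles", "Attributes", "Model", "Map",
         "Evaluate", "Deploy"]


def next_steps(step):
    """Return the steps that may follow `step`, as described in the prose."""
    index = STEPS.index(step)
    if step == "Evaluate":
        # Evaluation can send you back to any preceding step, or on to Deploy.
        return STEPS[:index] + ["Deploy"]
    if step == "Deploy":
        # Once deployed, reality makes you evaluate (and retrace) again.
        return ["Evaluate"]
    # Everything before Evaluate is "waterfall-like": one step forward.
    return [STEPS[index + 1]]


after_map = next_steps("Map")
after_evaluate = next_steps("Evaluate")
after_deploy = next_steps("Deploy")
```

Seen this way, Evaluate is the hub of the process, which is one more reason to reach it early.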


 
Tags: Everything | Software Architecture | SPAMMED Process

June 11, 2005
@ 02:33 PM

Microsoft recently went public with their Architect certification program. While (naturally) they expect knowledge of the Microsoft platform ("about a quarter of the emphasis", to quote the site), it also puts a lot of emphasis on architect competencies and general know-how. Way to go MS :)

On the Java side of the fence there are already architect certification programs e.g. Sun's or BEA's.

There are also two (that I know of) non-vendor specific certifications - one from the Open Group (they also brought us TOGAF - which is worth its own discussion) and the other from the Software Engineering Institute (SEI @ Carnegie Mellon university)

I'd be happy to learn of any other architect certification programs if you know of them.


 
Tags: Everything | Software Architecture

June 11, 2005
@ 06:07 AM

Well, I haven't heard about that one yet - but there are a few other groups and organizations out there. Here are the ones I know about:

 

 

 

 

 

 

 

 

 

 


 
Tags: Everything | Software Architecture

General overview of the SPAMMED process for developing software architectures
 
Tags: Everything | Software Architecture | SPAMMED Process