Moving to architectures like SOA that increase the number of overall “moving parts” or components in the system means that reliability goes down. It is simple math really: if you have 10 components, each with a reliability of 0.99, and all of them have to work, then the total reliability is 0.99^10 or 0.904, and that’s before we take into account messages traveling over the wire and the network’s reliability (or lack thereof). This leaves us trying to build reliable systems from a (growing) bunch of unreliable components. I know, I know, there’s nothing new here. We’ve been using techniques like redundancy, statelessness etc. to help mitigate this since the beginning of time. These techniques add even more components, so we decrease the “Mean Time Between Failures” (MTBF) of the parts taken together, but we increase the “Mean Time Between Critical Failures” (MTBCF), i.e. the MTBF of the system as a whole.
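The arithmetic above can be written out as follows. The second line is an added illustration rather than something from the original post: it assumes two redundant copies of a single 0.99 component that fail independently, and shows why redundancy raises the reliability of that part of the system.

```latex
R_{\text{series}} = \prod_{i=1}^{10} R_i = 0.99^{10} \approx 0.904
\qquad
R_{\text{redundant pair}} = 1 - (1 - 0.99)^{2} = 0.9999
```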

Another aspect of reliability (and reliability calculations) is MTTR or “Mean Time To Repair”, which in software mainly has to do with how long it takes before we know something is wrong. The usual approach to that is monitoring, which I’ve written about in the past (e.g. the blogjecting watchdog pattern). In this post I want to expand a little on another approach which, while not common in IT systems, can be useful at times.

Enter the BIT, which is short for “Built In Test”. BIT is a technique I picked up when I worked on multi-disciplinary systems that also included embedded systems. Each and every one of the embedded systems we developed (or integrated into the solution) supported BIT. Actually they usually supported several types of BIT, at least PBIT, CBIT and IBIT:

  • PBIT – Power-On Built In Test – usually a short test the system runs to make sure all of its components are ready to go. You have seen this one many times, since this is what motherboards do as you turn them on (all the blips and lights etc.).
  • CBIT – Continuous Built In Test – makes sure the system is functioning even when it isn’t really busy, so we’ll know about problems before we actually try to use the system.
  • IBIT – Initiated Built In Test – provides a way to find out exactly what’s wrong when one of the other test types has failed.

BIT is very understandable for embedded systems; after all, these are closed boxes with limited access to their innards and inner workings. But isn’t that also true for SOAs? After all, we are building a bunch of black boxes that interact to provide some business benefit. How can we be sure that everything is working fine, especially when we don’t fully control some of the parts?

As mentioned above, a system, especially a distributed one, is built from relatively unreliable components. A continuous test helps us make sure things are working as expected. What we are doing is taking some of the code we wrote to run integration and acceptance tests (which run a scenario end-to-end), deploying it into the system as a service which we call the “liveliness check”, and having it run periodically. Every time the liveliness check runs it sends a notification (a Twitter message) so we know the test itself works. If it fails it sends more notifications (Twitter, email, SMS etc.) to an administrator.
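A liveliness check along these lines can be sketched roughly as follows. This is only an illustration, not the code we run: the class name LivelinessCheck, the scenario supplier and the two notification callbacks are all hypothetical stand-ins for the real acceptance test and the real Twitter/email/SMS senders.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical liveliness check: runs a known scenario and compares the outcome. */
class LivelinessCheck {
    private final Supplier<String> scenario;   // end-to-end test scenario (assumed)
    private final String expectedResult;       // the result is known in advance
    private final Duration maxDuration;        // how long the scenario may take
    private final Consumer<String> heartbeat;  // "the test itself ran" channel
    private final Consumer<String> alert;      // administrator channel

    LivelinessCheck(Supplier<String> scenario, String expectedResult, Duration maxDuration,
                    Consumer<String> heartbeat, Consumer<String> alert) {
        this.scenario = scenario;
        this.expectedResult = expectedResult;
        this.maxDuration = maxDuration;
        this.heartbeat = heartbeat;
        this.alert = alert;
    }

    /** Runs the scenario once; returns true when the system behaved as expected. */
    boolean runOnce() {
        long start = System.nanoTime();
        String failure = null;
        try {
            String actual = scenario.get();
            Duration elapsed = Duration.ofNanos(System.nanoTime() - start);
            if (!expectedResult.equals(actual)) {
                failure = "unexpected result: " + actual;
            } else if (elapsed.compareTo(maxDuration) > 0) {
                failure = "too slow: " + elapsed.toMillis() + " ms";
            }
        } catch (RuntimeException ex) {
            failure = "scenario threw " + ex;
        }
        heartbeat.accept("liveliness check ran");   // always sent, so we know the test works
        if (failure != null) {
            alert.accept("liveliness check FAILED: " + failure);
        }
        return failure == null;
    }

    /** Schedules the check to run periodically, even when the system is idle. */
    ScheduledExecutorService schedule(Duration period) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::runOnce, 0, period.toMillis(), TimeUnit.MILLISECONDS);
        return timer;
    }
}
```

Passing the notification channels in as callbacks keeps the check itself free of any messaging detail, which also makes it easy to exercise in a unit test.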

This liveliness check, or CBIT, serves as an early warning system. Since the end result is known in advance, we can have a pretty good idea whether something went wrong. E.g. we know how long a test identification should take, we know what the result for the test image should be etc. The fact that it works even when the system is at low utilization means we can find out about problems and deal with them before they happen to end-users. That’s a big plus for us.

The advantage over regular monitoring solutions (this is not an either/or; monitoring is also needed) is that you know that specific business scenarios are working properly, which gives you higher confidence that things are OK than knowing that a specific server or service is running.

On the flip side, the downside of adding a periodic liveliness check is that it adds complexity to the system. In our case, we have to add a process to clean up the traffic data added by the test messages. Also, while we try to make the system behave as usual as much as possible, certain parts of the system have to know about the test messages and handle them differently. Again, in our case the reporting has to know to disregard test messages and not count them. This is even more problematic in other types of systems; for instance, if you simulate an order you don’t want the purchase order to actually go out to a supplier.
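To make the “reporting has to disregard test messages” point concrete, here is a minimal sketch. It assumes test traffic carries an explicit marker; the TrafficRecord class and its fromLivelinessCheck flag are hypothetical names made up for this example, not our actual schema.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical traffic record; the test flag is an assumption for illustration. */
class TrafficRecord {
    final String sagaId;
    final boolean fromLivelinessCheck;

    TrafficRecord(String sagaId, boolean fromLivelinessCheck) {
        this.sagaId = sagaId;
        this.fromLivelinessCheck = fromLivelinessCheck;
    }
}

class TrafficReport {
    /** Counts real traffic only; records produced by the liveliness check are skipped. */
    static long countRealTraffic(List<TrafficRecord> records) {
        return records.stream()
                .filter(r -> !r.fromLivelinessCheck)
                .count();
    }

    public static void main(String[] args) {
        List<TrafficRecord> records = Arrays.asList(
                new TrafficRecord("saga-1", false),
                new TrafficRecord("liveliness-7", true),
                new TrafficRecord("saga-2", false));
        System.out.println(countRealTraffic(records));
    }
}
```

The same flag could drive the cleanup process as well, since it identifies exactly the rows the test run added.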

To sum this up, adding a liveliness check as part of the system to create a continuous built-in test can increase your confidence that things are working as they should. It can also help you identify problems earlier. Like everything in life, it doesn’t come without tradeoffs, and you should weigh the benefits against the costs before using it in your systems.


 
Tags: SOA | SOA Patterns | Software Architecture

November 13, 2009
@ 05:42 PM

There’s no Architecture in Business Service Orientation! There, I’ve said it: there are no two types of SOA. I am not trying to say that business-level service orientation doesn’t exist or isn’t valuable. However, I am trying to say that labeling it SOA harms both Service Orientation at the business level and SOA (a.k.a. “technical SOA”).

For the record, here are my definitions for both Business Service Orientation and Service Oriented Architecture (SOA):

Business Service Orientation is an IT paradigm at the enterprise level that aims to componentize and partition the business’s software and to get composability and flexibility (and thus achieve better business and IT alignment etc.). Service Orientation can be implemented using various enterprise architecture practices (around governance, portfolio management etc.) as well as various software architectures including (but not limited to) SOA, EDA, BPM, REST and combinations of them.

Service Oriented Architecture is an architectural style for building systems based on interacting coarse-grained autonomous components called services. Each service exposes processes and behavior through contracts, which are composed of messages at discoverable addresses called endpoints. Services’ behavior is governed by policies which are set externally to the service itself. SOA is derived from four earlier architectural styles, namely Client/Server, Layered System, Pipes and Filters, and Distributed Agents.


To reiterate: calling Business Service Orientation “SOA” serves only to muddy the water and make both terms nebulous. If you just have to have a TLA, call them BSO and SOA.

PS

I know that in “What is SOA anyway” I also refer to two SOAs. I didn’t have enough confidence to say it bluntly when I wrote that paper, but I did emphasize Service Orientation for BSO and Architecture for SOA. Also, regarding a more formal definition of SOA that shows how it is derived from the four other styles: I started explaining that quite some time ago (Intro, Client-Server, Layered, Pipes and Filters). I still need to explain Distributed Agents and summarize (about time I did that).


 
Tags: SOA | Software Architecture

Yesterday I gave a talk on SOA patterns at the European Virtual Alt.Net user group. You can find the recording of that talk here, as well as download a PDF of the slides.

Before I talk a little about the substance, I want to say a few words about Office Live Meeting (the platform used for the presentation). To sum it up in one word, the experience was horrid. It took me more than 35 minutes just to upload my presentation. Then I had to switch to Windows XP (a VM in Parallels) to speak, since it has a problem with Windows 7 (low sound volume). However, the worst thing is that throughout the presentation I constantly lost control of the slide progress (i.e. couldn’t move the slides forward), which was very distracting.

Anyway, ignoring all that, I think the presentation is still beneficial overall and addresses a few interesting and challenging issues like flexibility, reporting and management of SOA. If I am to sum up the presentation, I’d say that when you build a system on SOA you get a system built of (relatively) a lot of components of questionable reliability. You can reap a lot of benefits in the flexibility department, but you have to address several challenges in the performance, availability, management (etc.) departments. Additionally, you need to look at the overall solution from a holistic viewpoint, since different parts of the solution can push in different directions or cover only part of the picture.

Lastly, thanks to Jan and Colin for organizing the event and to all the attendees for giving me an hour and a half of their time.


 
Tags: .NET | SOA | SOA Patterns | Software Architecture

I began writing SOA Patterns a long time ago. I was making nice progress when suddenly xsights happened and my free time evaporated. Now, 2 years or so later, we’re finally in production. Since the progress on the book has, hmm, how shall I put it, been hampered by xsights, I thought it would at least be appropriate to share some details on how the ideas presented in the book (written, half-written and yet-to-be-written) are being put to use. As it happens, this also coincided with Jan Van Ryswyck & Colin Jack asking me if I’d be interested in presenting something to the European Virtual Alt.Net group (E-VAN).

So here we are. Next week, I am going to be talking about SOA patterns. I am going to present a few common SOA challenges (availability, flexibility, reporting, multi-tenancy) and discuss the patterns and implementation we are using to meet them. I am still finalizing the presentation, so if you have any questions you’d like me to answer, feel free to send them to me (either by email or as a comment here) and I’ll do my best to address as many of them as possible.

Hope to see you there.

Start Time: Monday, October 05, 2009 07:00 PM GMT*

End Time: Monday, October 05, 2009 08:30 PM GMT

Attendee URL: http://snipr.com/virtualaltnet (Live Meeting)

VAN Calendar: http://www.virtualaltnet.com/Home/Calendar

(*) 08:00 PM UK, 09:00 PM Brussels/Israel, 02:00 PM EST and 11:00 AM PST


 
Tags: .NET | SOA | SOA Patterns | xsights

I noticed that the images and code samples are a little off on the blog (I have to admit I just pasted it from Word, and we all know the great HTML that produces…). To help remedy this I am also making this pattern available in PDF form.

The next pattern I am going to publish is actually an anti-pattern called “NanoServices”, which as the name implies is about making services too small. I hope to have that ready early next week. Next after that would be the “Aggregated Reporting” pattern. Aggregated Reporting is aimed at solving the dispersed data problem that autonomy and a lot of services create.

Any thoughts (on the pattern or otherwise) are welcome.


 
Tags: .NET | Java | SOA | SOA Patterns | Software Architecture

September 8, 2009
@ 06:53 PM

1.1 Reservation

When you use transactions in “traditional” n-tier systems life is relatively simple. For instance, when you run a transaction and an error or fault occurs, you abort the transaction and easily roll back any changes, getting back your system-wide consistency and peace of mind. The reason this is possible is that a transaction isolates changes made within it from the rest of the world. One of the base assumptions behind transactions is that the time that elapses from the beginning of the transaction until it ends is short. Under that assumption we can afford the luxury of letting the transaction hold locks on our resources (such as databases) and mask changes from others while the transaction is in progress. Transactions provide four basic guarantees: Atomicity, Consistency, Isolation and Durability, usually remembered by their acronym, ACID.

Unfortunately, in a distributed world, SOA or otherwise, it is rarely a good idea to use atomic short-lived transactions (see the Cross-Service Transactions anti-pattern in chapter 10 for more details). Indeed, the fact that cross-service transactions are discouraged is one of the main reasons we would consider using the Saga pattern in the first place.

One of the obvious shortcomings of Sagas is that you cannot perform rollbacks. The two conditions mentioned above, locking and isolation, do not hold anymore, so you cannot provide the needed guarantee. Still, since interactions, and especially long-running ones, can fail or be canceled, Sagas offer the notion of Compensations. Compensations are cool; we can’t have rollbacks, so instead we reverse the interaction’s operation and get a pseudo rollback. If we added one hundred (dollars/units/whatnot) during the original activity, we’ll just subtract the same 100 in the compensation. Easy, right?

1.1.1 The Problem

Wrong. As you probably know, it isn’t easy. Unfortunately, there are a number of problems with compensations. These problems come from the fact that, unlike ACID transactions, the changes made by the Saga activities are not isolated. The lack of isolation means that other interactions with the service may operate on data that was modified by an activity of another saga, and render the compensation impossible. To give an extreme example, if a request to one service changes the readiness status of the space shuttle to “all-set” and another service causes the shuttle to launch based on that status, it would be a little too late for the first service to try to reverse the “all-set” status now that the “bird has left the coop”. A more down-to-earth (pardon the pun) business scenario is any interaction where you work with limited resources, e.g. ordering from a (usually limited) stock.

Consider, for instance, the scenario in figure 6.1 below. A customer orders an item. The ordering service requests the item from the warehouse as it wants to ship the item to the customer (probably by notifying another service). Meanwhile on the warehouse service the item ordered causes a restocking threshold to be hit which triggers a restocking order from a supplier. Then the customer decides to cancel the order – now what?


Figure 6.1 Chapter 6 focus is about connecting Services with Service consumers in the levels and layers beyond the basic message exchange patterns.

Should the restocking order be cancelled as well? Can it be cancelled under the ordering terms of the supplier? Also, a customer requesting the item between the ordering and the cancellation might get an out-of-stock notice, which will cause him to go to our competitors. This can be especially problematic for orders which are prone to cancellation, like hotel bookings, vacations etc.

Another limitation of compensations and the Saga pattern itself, for that matter, is that it requires a coordinator. A coordinator means placing trust in an external entity, i.e., outside (most) of the services involved in the saga, to set things straight. This is a challenge for some of the SOA goals as it compromises autonomy and introduces unwanted coupling to the external coordinator.

The question then is:

How can we efficiently provide a level of guarantee in a loosely coupled manner while maintaining services’ autonomy and consistency?

We already discussed the limitations of compensations, which of course are one of the options to solve this challenge. Again, one problem is that we can’t afford to make mini changes, since we will then be dependent on an external party to set the record straight. The other problem with compensations is that we expose these “semi-states”, which are essentially the internal details of the services, to the outside world. Increasing the footprint of the services’ contract, especially with internal detail, makes the services less flexible and more coupled to their environment (see also the White Box Services anti-pattern in chapter 10).

We’ve also mentioned that distributed transactions are not the answer, since they both lock internal resources for too long (a Saga might go on for days) and put excess trust in external services, which may even be external to the organization.

This seems like a quagmire of sorts. Fortunately, real life already found a way to deal with a similar need for fuzzy, half guarantees: reservations!

1.1.2 The Solution

Implement the Reservation pattern and have the services provide a level of guarantee on internal resources for a limited time.


Figure 6.2 The Reservation pattern. A service that implements reservation considers some messages as “reserving”; on these it tries to secure an internal resource and sends a confirmation if it succeeds. When a message considered “confirming” arrives, the service validates that the reservation still holds. In between, the service can choose to expire the reservation based on internal criteria.

The Reservation pattern means there will be an internal component in the service that will handle the reservations. Its responsibilities include:

§ Reservation – making the reservation when a message that is deemed “reserving” arrives. For instance, when an order arrives, in addition to updating some durable storage (e.g. a database) with the order, the component needs to set a timer or an expiration time for the order confirmation. Alternatively, it can set some marker that the order is not final.

§ Validation – making sure that a reservation is still valid before finalizing the process. In the ordering scenario mentioned before, that would be making sure the items designated for the order were not given to someone else.

§ Expiration – marking reservations invalid when conditions change. E.g. if a VIP customer wants the item I reserved, the system can provision it for her. It should also invalidate my reservation, so when I finally try to claim it the system will know it’s gone. Expiration can also be timed, as in “we’re keeping the book for you until noon tomorrow”.
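The three responsibilities above can be sketched as one small component. This is an illustrative, in-memory sketch only: the ReservationBook class, its method names and the choice to pass the current time in as a parameter are assumptions made for the example (the book’s own listings are in the Technology Mapping section).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory reservation component for one kind of resource. */
class ReservationBook {
    private final int capacity;               // how many units can be reserved at once
    private final Duration timeToLive;        // how long a reservation is kept
    private final Map<String, Instant> deadlineByHolder = new HashMap<>();

    ReservationBook(int capacity, Duration timeToLive) {
        this.capacity = capacity;
        this.timeToLive = timeToLive;
    }

    /** Reservation: try to secure a unit for the holder until now + timeToLive. */
    synchronized boolean reserve(String holder, Instant now) {
        expire(now);
        if (deadlineByHolder.containsKey(holder)) {
            return true;                      // already reserved; the deadline is not extended
        }
        if (deadlineByHolder.size() >= capacity) {
            return false;                     // nothing left to reserve
        }
        deadlineByHolder.put(holder, now.plus(timeToLive));
        return true;
    }

    /** Validation: does the holder's reservation still stand? */
    synchronized boolean isValid(String holder, Instant now) {
        expire(now);
        return deadlineByHolder.containsKey(holder);
    }

    /** Expiration by business rule, e.g. a VIP customer takes the item. */
    synchronized void revoke(String holder) {
        deadlineByHolder.remove(holder);
    }

    /** Timed expiration: drop every reservation whose deadline has passed. */
    private void expire(Instant now) {
        deadlineByHolder.values().removeIf(deadline -> !deadline.isAfter(now));
    }
}
```

Not extending the deadline on a repeated reserve call is a deliberate choice in this sketch; it keeps a consumer from holding a resource indefinitely by re-reserving.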

Reservations can be explicit, i.e. the contract would have a ReserveBook action, or implicit. In the case of an implicit reservation, the service decides internally what will be considered a reserving message and what will be considered a confirming message. E.g. an action like Order will trigger the internal reservation, and an action like closing the saga will serve as the confirming message. When the reservation is implicit, the service consumer implementation will probably be simpler, as the consumer designers are likely to treat reservation expiration as a “simple” failure, whereas when it is explicit they are likely to have to track the reservation state themselves.

Reservations happen in business transactions worldwide every day. The most obvious example is booking a hotel room. You send in a request for a room (initiate a saga) saying you’d arrive on a certain date, say for a conference, and check out on another (complete the saga). The hotel says OK, we have a room for you (reservation), provided you confirm your arrival by a set date (limited time). Even if everything went well, you may still arrive at the hotel only to find out your room has been given to another person (limited guarantee). The idea of the Reservation pattern is to copy this behavior to the interaction of services, so that services that support reservations offer a sort of “limited lock” for a limited time and with a limited level of guarantee. A limited level of guarantee means that, like in real life, services can overbook and then resolve that overbooking by various strategies such as first come, first served; VIP first served etc.

It is easy to see Reservation applied to services that handle “real-life” reservations as part of their business logic, such as an ordering service for hotels (used in the example above) or an airline etc. However, reservations are suitable for a lot of other scenarios where services are called to provide guarantees on internal resources. For instance, in one system I built we used reservations as part of the saga initiation process. The system uses the Service Instance pattern (see chapter 3) where some services are stateful (the reasons are beyond the scope of this discussion). Naturally, services have limited capacity to handle consumers (i.e. an instance can handle n concurrent sagas/events).

This means that when a saga is initialized, all the participants of the saga need to know the instances that are part of the saga. As long as a single service instance initiates sagas everything is fine. However, as illustrated in figure 6.3 below, when two or more services (or instances) initiate sagas concurrently they may (and given enough load/time they will) both try to allocate the same service instance to their respective sagas. In the illustration we see that both Initiator A and Initiator B want to use Participant A and Participant B. Participant A has a capacity of 2, so everything is fine for both Initiators. Participant B, however, has limited capacity, so at least one of the Sagas will have to fail the allocation, i.e. not start.


Figure 6.3 A sample situation that can benefit from the Reservation pattern

The Reservation pattern enabled us to manage this resource allocation process in an orderly manner by implementing a two-pass protocol (somewhat similar to a two-phase commit). The initiator asks each potential participant to reserve itself for the saga. Each participant tries to reserve itself and notifies the initiator whether it succeeded, so in the above scenario A would say yes to both and B would say yes to one of them. If the initiator gets an OK from all the involved services (within a timeout), it will tell all the participants the specific instances within the saga (i.e. initiate it).
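A stripped-down sketch of that two-pass initiation might look like the following. The Participant and SagaInitiator classes are hypothetical, calls are local method calls rather than messages, and timeouts are left out. The early release on failure is an addition for the sketch; in the system described here participants drop expired reservations on their own.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical saga participant with a fixed number of saga slots. */
class Participant {
    private final int capacity;
    private final Set<String> reservedSagas = new HashSet<>();

    Participant(int capacity) {
        this.capacity = capacity;
    }

    /** Pass 1: try to reserve a slot for the given saga. */
    synchronized boolean tryReserve(String sagaId) {
        if (reservedSagas.contains(sagaId)) {
            return true;
        }
        if (reservedSagas.size() >= capacity) {
            return false;
        }
        return reservedSagas.add(sagaId);
    }

    /** Gives the slot back; a real participant would also do this on its own timeout. */
    synchronized void release(String sagaId) {
        reservedSagas.remove(sagaId);
    }

    synchronized int reservedCount() {
        return reservedSagas.size();
    }
}

class SagaInitiator {
    /** Returns true only if every participant reserved itself for the saga. */
    static boolean initiate(String sagaId, List<Participant> participants) {
        List<Participant> reserved = new ArrayList<>();
        for (Participant p : participants) {
            if (!p.tryReserve(sagaId)) {
                // One refusal fails the allocation; free what we already hold.
                for (Participant r : reserved) {
                    r.release(sagaId);
                }
                return false;
            }
            reserved.add(p);
        }
        // Pass 2 would go here: tell every participant which instances take part in the saga.
        return true;
    }
}
```

With a participant A of capacity 2 and a participant B of capacity 1, the first saga to call initiate gets both, and the second is refused by B and gives its slot on A back.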

The participants only reserve themselves for a short period of time. Once an internally set timeout elapses, the participants remove the commitment independently. As a side note, I’ll just say that the initiator and other saga members can’t assume that a participant will be there just because it is “officially” part of the saga, and the system still needs to handle the various failure scenarios. The Reservation pattern is used here only to help prevent over-allocation; it does not provide any transactional guarantees.

A reservation is somewhat like a lock, and thus it introduces some of the risks that distributed locks present. These risks aren’t inherent in the pattern, but they can easily surface if you don’t pay attention during implementation (e.g. using database locks for the implementation).

The first risk worth discussing is deadlock. Whenever you start reserving anything, especially in a distributed environment, you introduce the potential for deadlocks. For instance, if both participants had capacity for a single saga, and initiator A contacted participant A first and participant B next while initiator B used the reverse order, we would have a potential deadlock. There are several mechanisms that prevent that deadlock. The first is inherent to the Reservation pattern: the participants release the “lock” themselves. However, if, for example, there is a retry mechanism to initiate the sagas (as both would fail after the timeout) and the same resources are allocated over and over, there may be a deadlock after all.

Another risk to watch out for when implementing Reservations is Denial of Service (whether malicious or a byproduct of misuse). DoS can happen for reasons similar to those discussed for deadlocks (i.e. if you incur a deadlock you also have a DoS). Another way is exploiting the reservations by constantly re-reserving. Depending on the reservation timeout, regular firewalls might fail to detect the DoS, so you may want to consider using a Service Firewall (chapter 4) to help mitigate this threat.

Besides the risks discussed above, another thing to pay attention to is that when you introduce Reservation you are likely to add network calls. In the system discussed above, for instance, we introduced another call to tell the Saga members which instances are involved in the saga.

In addition to the Service Firewall pattern mentioned above, another pattern related to Reservation is the Active Service pattern (see chapter 2). The Active Service pattern can be used to handle reservation expiration when it is implemented with timers. Note, however, that it is sometimes better, resource-wise, to handle expiration passively and not actively, as we’ll see when looking at the implementation options in the next section.

1.1.3 Technology Mapping

Unlike a lot of the patterns in this book, the Reservation pattern is more a business pattern than a technological one. This means there isn’t a straight one-to-one technology mapping to make it happen. On the other hand, code-wise, the pattern is relatively easy to implement.

One thing you have to do is keep a live thread at the service, to make sure that when the lease or reservation expires someone will be there to clean up. One option is the Active Service pattern mentioned above. You can also use technologies that support timed events to provide the “wakeup service” for you. For instance, if you are running in an EJB 3.0 server you can use single-action timers, i.e. timers that only raise their event once, to accomplish this. Code listing 6.1 below shows a simple code excerpt that sets a timer to go off based on the time received in the message. Other technologies provide similar mechanisms to accomplish the same effect.

Code Listing 6.1 Setting a single-action timer based on the details received in a message (using JBoss)

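import javax.annotation.Resource;
import javax.ejb.MessageDrivenContext;
import javax.ejb.Timer;
import javax.ejb.TimerService;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.ObjectMessage;

public class TimerMessage implements MessageListener {

    @Resource
    private MessageDrivenContext mdc;

    // ...

    public void onMessage(Message message) {
        ObjectMessage msg = null;
        try { // #1
            if (message instanceof ObjectMessage) {
                msg = (ObjectMessage) message;
                TimerDetailsEntity e = (TimerDetailsEntity) msg.getObject();
                // The timer service is obtained from the injected context
                TimerService timerService = mdc.getTimerService();
                // Timer createTimer(Date expiration, Serializable info) #2
                Timer timer = timerService.createTimer(e.Date, e);
            }
        } catch (JMSException e) {
            e.printStackTrace();
            // Roll back so the message can be redelivered
            mdc.setRollbackOnly();
        } catch (Throwable te) {
            te.printStackTrace();
        }
    }

    // ...
}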

(Annotation) <#1 some vanilla code to process a message and get the interesting entity out of it >

(Annotation) <#2 Here is where we set the single action timer based on the info in the message we’ve just got>

Timer-based cancellation, as described above, might be overkill if the reservation implementation is simple. For instance, the Reservation in listing 6.2 below (implemented in C#) is used by the participants in the Saga and Reservation sample discussed in the previous section.

Code Listing 6.2 Simple in-memory, non-persistent reservation

public Guid Reserve(Guid sagaId)
{
    try
    {
        Rwl.TryWLock();
        var isReserved = Allocator.TryPinResource(localUri, sagaId);
        if (!isReserved) // #1
            return Guid.Empty;

        // Some code to set the expiration #2
        return sagaId; // #3
    }
    finally
    {
        Rwl.ExitWLock();
    }
}

(Annotation) <#1 The allocator is a resource allocation control, which manages, among other things, the capacity of the service. If we didn’t succeed in marking the service as belonging to the Saga, we can’t allocate the service to the specific Saga>

(Annotation) <#2 Here is where we need to add code to mark when the reservation will expire. The previous example (6.1) used timers; we’ll try to do something different here>

(Annotation) <#3 successful reservation returns the SagaId this assures the caller that the reply it got is related to the request it sent – a simple Boolean might be confusing >

Since the Reservation in listing 6.2 does not involve heavy service resources (like, say, a database etc.), we can implement passive handling of reservation expiration, which will be more efficient than a timer-based one. Listing 6.3 below shows both a revised Reserve method and the helper it uses to remove timed-out reservations before making a new one. Note that an expired reservation can still be committed if no other reservation request occurred in between or the capacity of the service is not exceeded.

Code Listing 6.3 passive reservation expiration handling (added on top of the code from listing 6.2)

public Guid Reserve(Guid sagaId)
{
    try
    {
        Rwl.TryWLock();
        RemoveExpiredReservations(); // #1
        var isReserved = Allocator.TryPinResource(localUri, sagaId);
        if (!isReserved)
            return Guid.Empty;

        OpenReservations[sagaId] = DateTimeOffset.Now + MAX_RESERVERVATION; // #2
        return sagaId;
    }
    finally
    {
        Rwl.ExitWLock();
    }
}

private void RemoveExpiredReservations()
{
    var reftime = DateTimeOffset.Now;
    // Materialize the keys first so the dictionary is not modified while it is enumerated
    var expiredIds = (from item in OpenReservations
                      where item.Value < reftime
                      select item.Key).ToArray();
    foreach (var id in expiredIds)
    {
        OpenReservations.Remove(id);
        Allocator.FreePinnedResources(id);
    }
}

(Annotation) <#1 Added a small method (RemoveExpiredReservations, which also appears in the listing) to clean up expired reservations. This method is run every time the service needs to handle a new reservation request. Note that there is no timer involved; reservations are only cleaned up if there is a new reservation to process>

(Annotation) <#2 Instead of a timer the reservation is done by marking down when the reservation will expire>

The code samples above show that implementing Reservation can be simple. This doesn’t mean that other implementations can’t be more complex, for example if you want or need to persist the reservation or distribute a reservation between multiple service instances. At its core, though, it shouldn’t be a heavy or complex process.

Another implementation aspect is whether reservations are explicit or implicit. Explicit reservation means there will be a distinct “Reserve” message. This usually means there will also be a “Commit” type message, and that the service or workflow engine that requests the Reservation might find itself implementing a two-phase-commit type of protocol, which isn’t very pleasant, to say the least.

The other alternative is implicit reservation, where the service decides internally when to reserve, under what conditions to commit the reservation and when to reject it. As usual, the tradeoff is between a simple implementation for the service and a simple implementation for the service consumer.

1.1.4 Quality Attributes

As usual, we wrap up the pattern by taking a brief look at some business drivers (or scenarios) that can lead us to use the Reservation pattern.

In essence, the main driver for Reservation is the need for commitment from resources, and since it is a complementary pattern to Sagas it also has similar quality attributes. As mentioned above, Reservation helps provide partial guarantees in long-running interactions, thus the quality attribute that points us toward it is Integrity.

Quality Attribute (level 1) | Quality Attribute (level 2) | Sample Scenario
Integrity | Correctness | Under all conditions, failure to receive payment within 5 business days will cancel the order and shipping
Integrity | Predictability | Under normal conditions, the chances of a customer getting billed for a cancelled order shall be less than 5%

Table 6.2 Reservation pattern quality attribute scenarios. These are the architectural scenarios that can make us think about using the Reservation pattern.

Reservation is a protocol-level pattern that involves an exchange of messages between service consumers and services. The next pattern is one of the enablers of such message exchange. It is also one of the more confusing patterns, since a lot of the commercial offerings which include it also include a gazillion other capabilities. Yes, I am talking about the ServiceBus.


 
Tags: .NET | Java | SOA | SOA Patterns | Software Architecture

A lot has been written and said about multiple use (or reuse, depending on your definition) of services. I want to touch on one aspect of this in this post.

As a general rule, the more generic or small something is, the easier it is to use it in different contexts. For example, hash tables are used all over the place in a lot of programs. The hash table is a generic container and carries very little in terms of business context, so it is very easy to use. A corollary to this rule is that the more specific something is, the harder it is to use it in different contexts. Unfortunately (from the “use” point of view) specific domain logic is exactly what we strive to have with SOA. The value of services is derived from the business value they can generate. To add insult to injury, there’s also a limit on how small we’d want a service to be. Communicating with a service means communicating over a network, so if we make it too small, the overhead of getting to it (serialization, network traffic, security etc.) can outweigh its utility (an anti-pattern I call nano-services).

Well, one thing you can try to do is remove the business context from the services. Before you flame me about how this contradicts my previous statement that services’ value comes from the business value or domain know-how they provide, you should note that I said “business context” and not “business logic”.

Let me try to clarify this with a concrete example from my current system. At xsights we provide image identification services for mobile devices. For instance, when you see a movie ad, you can take a picture with your mobile, send it to us (via MMS, a specific client or a video call) and we provide related information such as the trailer, where to buy tickets etc. Our initial offering supported only video calls (for business reasons irrelevant for this post). In a video call you have a constant stream of incoming video from the handset (10-15 frames per second), so we (try to) identify frames as they come. We mostly use event driven architecture over SOA, so (a partial) flow looks something like the following (the events occur in the context of a saga). An extractor service listens on an RTP stream, extracts and preprocesses images and raises a FrameArrived event on each new frame. An Identification GW decides how to handle an incoming frame and directs it to one or more algorithmic workers (this isn’t event driven). After a successful identification the Identification GW raises a LinkFound event, and a Call Flow service takes it from there:

[image: the event flow described above]

If we didn’t get an identification within a timeout, we can ask the user to better aim the camera or whatnot (behavior controlled by the CallFlow service).

When we first added support for MMS we wanted to use the same identification logic. There’s a slight difference though: in a video call you have a constant stream of low-quality images, whereas in an MMS you get a single high(er) quality shot. To add support for MMS we needed to add some logic to the identifier so that it would know whether the origin of the image is an MMS message or a video call. If it is the former, the Identifier needs to raise a “failed to identify” event when it finishes processing the image (the video call can use a timeout instead).

But that’s the wrong way to do it, since now we need to know which sagas are MMS sagas and which are video call ones. Not to mention we would probably need some other “special” logic to handle clients (which indeed we needed). If we go down this lane and add more and more business context to the identifier we make it less autonomous. Even though we are using events, they are no longer events about the business of the service (like “FrameArrived” from the extractor); they are system context events (“MMSIdentificationFailed”). Our identifier is gaining more and more “reasons to change” and is becoming tightly coupled to specific contexts. So yes, we are using it over and over again, but the costs are getting higher with each such reuse.

What’s a better way? Remove the business context from the service and focus on keeping the business logic and rules. In this case that would be a NoMatchForFrame event for each failed frame. In an MMS related saga there would be a service that listens to this event; in a video call related saga no service will listen on the event*. Once the business context is removed, our Identification GW focuses only on its core business activity (routing images to algorithmic workers and notifying the world on success/failure). Adding support for client behavior becomes much easier in this case; in fact the Identification GW doesn’t need any changes to support this scenario.
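Here is a rough sketch of what that looks like in code. The event names (LinkFound, NoMatchForFrame) come from the description above; everything else, including the EventRouter class standing in for our communications framework and the identify delegate standing in for the algorithmic workers, is an assumption made for the sake of the example.

```csharp
using System;
using System.Collections.Generic;

// The events describe what the identifier did; they say nothing about MMS or video calls.
public class LinkFound { public Guid SagaId; public string Link; }
public class NoMatchForFrame { public Guid SagaId; }

// Hypothetical, minimal stand-in for the communications framework.
public class EventRouter
{
    private readonly Dictionary<Type, List<Action<object>>> handlers =
        new Dictionary<Type, List<Action<object>>>();

    public void Subscribe<T>(Action<T> handler)
    {
        List<Action<object>> list;
        if (!handlers.TryGetValue(typeof(T), out list))
        {
            list = new List<Action<object>>();
            handlers[typeof(T)] = list;
        }
        list.Add(e => handler((T)e));
    }

    public void Publish<T>(T evt)
    {
        List<Action<object>> list;
        if (handlers.TryGetValue(typeof(T), out list))
        {
            foreach (var handler in list) handler(evt);
        }
        // No subscribers: nothing happens, and the publisher neither knows nor cares.
    }
}

public class IdentificationGateway
{
    private readonly EventRouter router;
    private readonly Func<byte[], string> identify; // returns a link, or null when nothing matched

    public IdentificationGateway(EventRouter router, Func<byte[], string> identify)
    {
        this.router = router;
        this.identify = identify;
    }

    // Same code path for every origin of the frame; no "is this an MMS?" checks.
    public void OnFrameArrived(Guid sagaId, byte[] frame)
    {
        var link = identify(frame);
        if (link != null)
            router.Publish(new LinkFound { SagaId = sagaId, Link = link });
        else
            router.Publish(new NoMatchForFrame { SagaId = sagaId });
    }
}
```

In this sketch an MMS flow would call Subscribe for NoMatchForFrame and tell the user the image wasn’t recognized, while a video call flow simply wouldn’t subscribe and would keep relying on its timeout.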

To sum up this post: if you want to increase the chance of using services in different contexts, you should strive to move the context-specific bits outside of the services. This will simplify the services themselves as well as increase their autonomy.


* Our communications framework allows for different event wiring (or routes) depending on the saga “type”, so actually the event won’t even fire in a video call, as our communications framework will identify that there aren’t any subscribers. This is very good from the service point of view, as it allows the service to fire events and let the communications framework worry about the context. The saga initiator is the only place where the context has to be specified (I’ll expand on this in another post).
 
Tags: SOA | SOA Patterns | Software Architecture

June 23, 2009
@ 10:03 PM

In one of my previous posts (Rest: good, bad and ugly), I made a passing comment about how I think using CRUD in a RESTful service is a bad practice. I received a few comments and questions asking why I say that. So what’s wrong with CRUD and REST?

On the surface, it seems like a very good fit (both technically and architecturally). However, scratch that surface and you’ll see that it isn’t a good fit for either.

REST over HTTP is the most common (almost only) implementation of the REST architectural style - to the point REST over HTTP is synonymous with REST. I would say most of the people who think of REST in CRUD terms, think about mapping of the HTTP verbs.

CRUD, which stands for Create, Read, Update and Delete, names the four basic database operations. Some of the HTTP verbs, namely POST, GET, PUT and DELETE (there are others like OPTIONS or HEAD), seem to have a 1-1 mapping to CRUD. As I said earlier, they don’t. The table below briefly contrasts HTTP verbs and CRUD.

Verb | CRUDdy Candidate | Actually
GET | SELECT (Read) | Get a representation of a resource. While it is very similar to SELECT, it also has a few features beyond an out-of-the-box SELECT, e.g. by using If-Modified-Since (and similar modifiers) you might get an empty reply.
DELETE | Delete | Maps well
PUT | Update | PUT looks like an update but it isn’t, since: 1. You have to provide a complete replacement for the resource (again similar to update, but not quite). 2. You can use PUT to create a resource (when the URI is set by the client).
POST | Insert | It can be used to create a resource, but it should be a child/subordinate one. Furthermore, it can be used to provide a partial update to a resource (i.e. not resulting in a new URI).
OPTIONS | ? | Get the available ways to continue, considering the current state of the resource
HEAD | ? | Get the headers or metadata about the resource (which you would otherwise GET)

The way I see it, the HTTP verbs are more document oriented than database oriented (which is why document databases like CouchDB are seamlessly RESTful). In any event, what I tried to show here is that while you can update, delete and create new resources, the way you do that is not exactly CRUD in the database sense of the word, at least when it comes to using the HTTP verbs.

However, the main reason CRUD is wrong for REST is an architectural one. One of the base characteristics(*) of REST is using hypermedia to externalize the state machine of the protocol (a.k.a. HATEOAS – Hypermedia as the Engine of Application State). The URI to URI transition is what makes the protocol tick (the transaction implementation by Alexandros discussed in the previous post shows a good example of following this principle).

Tim Ewald explains this  nicely (in a post from 2007…) :

… Here's what I came to understand. Every communication protocol has a state machine. For some protocols they are very simple, for others they are more complex. When you implement a protocol via RPC, you build methods that modify the state of the communication. That state is maintained as a black box at the endpoint. Because the protocol state is hidden, it is easy to get things wrong. For instance, you might call Process before calling Init. People have been looking for ways to avoid these problems by annotating interface type information for a long time, but I'm not aware of any mainstream solutions. The fact that the state of the protocol is encapsulated behind method invocations that modify that state in non-obvious ways also makes versioning interesting.

The essence of REST is to make the states of the protocol explicit and addressable by URIs. The current state of the protocol state machine is represented by the URI you just operated on and the state representation you retrieved. You change state by operating on the URI of the state you're moving to, making that your new state. A state's representation includes the links (arcs in the graph) to the other states that you can move to from the current state. This is exactly how browser based apps work, and there is no reason that your app's protocol can't work that way too. (The ATOM Publishing protocol is the canonical example, though its easy to think that its about entities, not a state machine.)

If you are busy inserting and updating (CRUDing) resources you are not, in fact, thinking about protocols or externalizing a state machine and, in my opinion, you are missing the whole point of REST.

CRUD services lead to and promote the “database as a service” kind of thinking (e.g. ADO.NET Data Services) which, as I explained in another post last year, is a bad idea since:
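To illustrate what “externalizing the state machine” means, here is a small sketch, under my own assumptions, of a server building a representation whose links are the transitions available from the current state. The order resource, its states, the link relation names and the example.org URIs are all invented for the example; they are not from any real API.

```csharp
using System;
using System.Collections.Generic;

// A representation carries the current state plus the arcs out of it.
public class Representation
{
    public string State;
    public Dictionary<string, Uri> Links = new Dictionary<string, Uri>();
}

public static class OrderProtocol
{
    // The server decides which transitions to advertise; the client just follows links.
    public static Representation Describe(string orderId, string state)
    {
        var self = new Uri("http://example.org/orders/" + orderId);
        var representation = new Representation { State = state };
        representation.Links["self"] = self;

        if (state == "pending")
        {
            // A pending order can be paid for or cancelled.
            representation.Links["payment"] = new Uri(self + "/payment");
            representation.Links["cancel"] = new Uri(self + "/cancellation");
        }
        else if (state == "paid")
        {
            // Once paid, cancelling is no longer offered; only the receipt is.
            representation.Links["receipt"] = new Uri(self + "/receipt");
        }
        return representation;
    }
}
```

A client that only ever follows the links it is handed cannot “call Process before calling Init”: if the “cancel” link isn’t in the representation, that transition isn’t available. Compare that with a CRUD view of the same order, where the client would simply update a status column.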

  1. It circumvents the whole idea about "Services" - there's no business logic.
  2. It is exposing internal database structure or data rather than a thought-out contract.
  3. It encourages bypassing real services and going straight to their data.
  4. It creates a blob service (the data source).
  5. It encourages minuscule demi-services (the multiple "interfaces" of said blob) that disregard a few of the fallacies of distributed computing.
  6. It is just client-server in sheep's clothing.

The main theme of this and the previous post is that if we try to drag REST to the same old, same old stuff we always did, we won’t really get that many benefits. In fact, the “old” ways of doing that stuff are probably more suitable for the job anyway, since they have been in use for a while now and they are “tried and tested” (“You can’t win an argument with an idiot, he’ll just drag you down to his level and beat you with experience” …). REST is just a different paradigm than RPC, ACID transactions and CRUD.


* I know I sound like a broken record on that, but our industry has a history of diluting terms to the point where they almost stop being useful (SOA comes to mind..). The way I see it you can have 3 levels on your way to REST over HTTP:

  • You can be using HTTP and XML/JSON – this is level 1 or “Using standards”.
  • You can be using the HTTP verbs properly and/or applying document oriented communications – this is level 2 or “Rest-like” interface
  • You can conform to all REST constraints and be at level 3 or “RESTful”.

All levels can be useful and bring you merit, but only the 3rd is REST.


 
Tags: REST | SOA | Software Architecture | Trends

June 15, 2009
@ 11:10 PM

 

Yesterday I read an interesting paper called “RETRO: A RESTful Transaction Model”. On the good side, I have to say it is one of the best RESTful models I’ve seen thus far. The authors took special care to satisfy the different REST constraints, unlike many “RESTful” services (e.g. Twitter, which returns identifiers and not URIs). On the downside, I think a distributed transaction model is bad for REST; in other words, I don’t see a reason for going through this effort and jumping through all these hoops.

Why?

For the same reasons transactions are wrong for SOA and why WS-AtomicTransaction is wrong for SOAP web services:

  • Service boundary – a service boundary, RESTful or otherwise, is a trust boundary. Atomic transactions require holding locks, and holding them on behalf of a foreign service opens a security hole (it makes it much easier to mount a denial of service attack)
  • You cannot assume atomicity between two different entities or resources. Esp. when these resources belong to different businesses.
  • Transactions introduce coupling (at least in time)
  • Transactions hinder scalability – It isn’t that you can’t scale but it is much harder

For REST it is even worse. Since using hypermedia as the engine of state change means that the hypermedia actually describes the protocol, we clutter the business representations (the representations of real business entities like customer, order etc.) with transactional nitty-gritty. As the authors say:

“our model explicitly identifies locks, transactions, owners and conditional representations as explicit, linkable resources. In fact, every significant entity in our model is represented as a resource in order to comply with this constraint.”

This also means that programming the resources themselves will get much more complicated.

I think that if you want to reap the benefits of REST you should keep the protocol simple and focus on the business and technical merits you can get, not bog it all down with needless complexity. It seems to me that RETRO is a good mental exercise to show transactions can be RESTful. I think, however, that it is overkill for RESTful implementations.

RESTful architectures will be better off with BASE (Basically Available, Soft state, Eventually consistent) and/or ACID2 (Associative, Commutative, Idempotent and Distributed) models, or at least the Saga model (which the authors intend to tackle next), which is a better candidate (IMHO) for achieving distributed consensus.


 
Tags: REST | SOA | Software Architecture

This is another post (<Rant>) about WCF default behavior and how it can make the life of developers miserable (you can also check out “WCF defaults limit scalability” and “Another WCF gotcha - calling another service/resource within a call”).

Anyway, the trigger for this is a post by Ayende called “WCF works in mysterious ways”. Ayende posted some code he wrote which was throwing a serialization exception. You can see his post for the full code, but in a nutshell he was defining a large object graph (8192 objects that contain other objects) and was trying to send that over the wire. Here’s a short excerpt from the service definition:

   1: [ServiceBehavior(
   2:        InstanceContextMode = InstanceContextMode.Single,
   3:        ConcurrencyMode = ConcurrencyMode.Single,
   4:        MaxItemsInObjectGraph = Int32.MaxValue
   5:        )]
   6:    public class DistributedHashTableMaster : IDistributedHashTableMaster
   7:    {
   8:        private readonly Segment[] segments;
   9:  
  10:        public DistributedHashTableMaster(NodeEndpoint endpoint)
  11:        {
  12:            segments = Enumerable.Range(0, 8192).Select(i =>
  13:                                                        new Segment
  14:                                                        {
  15:                                                            AssignedEndpoint = endpoint,
  16:                                                            Index = i
  17:                                                        }).ToArray();
  18:        }
  19:  
  20:        public Segment[] Join()
  21:        {
  22:            return segments;
  23:        }
  24:    }
  25:  
  26:    [ServiceContract]
  27:    public interface IDistributedHashTableMaster
  28:    {
  29:        [OperationContract]
  30:        Segment[] Join();
  31:    }
  32:  
  33:    public class NodeEndpoint
  34:    {
  35:        public string Sync { get; set; }
  36:        public string Async { get; set; }
  37:    }
  38:  
  39:    public class Segment
  40:    {
  41:        public Guid Version { get; set; }
  42:  
  43:        public int Index { get; set; }
  44:        public NodeEndpoint AssignedEndpoint { get; set; }
  45:        public NodeEndpoint InProcessOfMovingToEndpoint { get; set; }
  46:  
  47:        public int WcfHatesMeAndMakeMeSad { get; set; }
  48:    }

As you can see in line 4, the service is properly decorated with an attribute to enlarge the number of objects in the graph. So, looking at the code, I initially suggested he add a few ServiceKnownType and DataContract/DataMember attributes on the data classes (as the serialization sometimes needs some guidance). After that didn’t help I actually ran the code, and then I noticed that the code was missing that same setting on the client side. So to fix the problem, the client side code below

    var channel =
        new ChannelFactory<IDistributedHashTableMaster>(binding, new EndpointAddress(uri))
            .CreateChannel();
    channel.Join();
needs to change to something like
    // Needs: using System.ServiceModel; using System.ServiceModel.Description;
    var channelFactory =
        new ChannelFactory<IDistributedHashTableMaster>(binding, new EndpointAddress(uri));

    // Raise the object graph limit on every operation of the client-side contract
    foreach (var operationDescription in channelFactory.Endpoint.Contract.Operations)
    {
        // Find<T>() returns null when the behavior is absent (the indexer would throw)
        var dataContractBehavior =
            operationDescription.Behaviors.Find<DataContractSerializerOperationBehavior>();

        if (dataContractBehavior != null)
        {
            dataContractBehavior.MaxItemsInObjectGraph = int.MaxValue;
        }
    }

    var channel = channelFactory.CreateChannel();
    channel.Join();

The main problem I find with this piece of code is the fact that it is needed at all. As the post’s title suggests, I find this behavior greatly affects the loose coupling of anything that uses WCF (services or other components).

WCF requires that any change you make to the channel on the server side would be reflected in the channel on each and every client (e.g. we have a similar setting where we enlarge message sizes for webHttpBinding and there are many other such examples).

Sure, you say, that is just like adding a new field in the contract, isn’t it? Well, no it isn’t, since unlike anything else which appears in the (verbose as it is) SOAP contract, these changes in default values, which are purely a WCF design choice, are not documented. Again, the changes in default values are not part of the contract. These are things you need to remember to pass on to your service consumers. So not only do I pay the overhead of having an explicit contract (e.g. vs. REST), it really doesn’t work. It means that two components that use the same contract may not be interchangeable if one returns more data (in this case). It means that the two sides are coupled by the need to change these defaults, and for what? WCF is smart enough to know how long the message is; WCF is smart enough to handle the message (if I encourage it by setting a behavior); why can’t it add 2 and 2 by itself?

Sometimes I just wish WCF had a TrainingWheels or DemosOnly attribute I could just set to false and make all this crap go away…

</Rant>


 
Tags: .NET | SOA | WCF

I recently read a post by Tim Bray where he states that building on web technologies lets you get away with believing some of the fallacies of distributed computing.

I personally think he is a little optimistic in that claim.

On “The network is reliable” – Tim says that the connectionless nature of HTTP helps (it does) and that the fact that GET, PUT and DELETE are idempotent helps as well. I say that GET, PUT and DELETE are idempotent only if the people implementing the server side make them so, i.e. only if they consider the fallacy. The fact that the HTTP spec says they should be idempotent doesn’t automatically make each implementation compliant.
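To show what I mean by “only if the people implementing the server side make them so”, here is a toy sketch of a server-side store. ResourceStore and its method names are hypothetical; the point is only the contrast between handlers that can safely be retried after a lost response and one that can’t.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory store sitting behind the HTTP handlers.
public class ResourceStore
{
    private readonly Dictionary<string, string> resources = new Dictionary<string, string>();

    public string Get(string uri)
    {
        string representation;
        return resources.TryGetValue(uri, out representation) ? representation : null;
    }

    // Idempotent PUT: a full replacement. Sending it twice leaves the same state as sending it once.
    public void Put(string uri, string representation)
    {
        resources[uri] = representation;
    }

    // Idempotent DELETE: deleting something that is already gone changes nothing.
    public void Delete(string uri)
    {
        resources.Remove(uri);
    }

    // A "PUT" handler written like this is NOT idempotent: every retry changes the state again.
    public void PutThatAppends(string uri, string fragment)
    {
        resources[uri] = (Get(uri) ?? "") + fragment;
    }
}
```

A client that retries after a timeout is fine with the first two handlers and silently corrupts data with the third, whatever verb the request used on the wire.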

On “Latency is zero” – Tim says the web makes it worse but, he claims, users got used to that. Even if they did, I think that users are just part of the picture, since the programmable web is also making strides. Also, as Tim says, it is actually worse. Not to mention that “Latency isn’t constant” either.

On “Bandwidth is infinite” – Again Tim agrees that it is worse but says people learn to notice it. Again, learning that it is there doesn’t mean the fallacy is gone, just that people are less likely to presume it.

On “The network is secure” – Tim says it’s probably the “least-well-addressed by the web”. No argument here.

On “Topology doesn’t change” – Tim says URIs help mitigate it. Again, Tim is assuming people make URIs permanent or will always return a temporary or permanent redirect when a URI changes. Good luck with that.

On “There is one administrator” – Tim says that yes, that’s the case, but who cares. Well, an example I usually give is the time I deployed an ASP.NET application which worked for a while, until the hosting company decided to change their policy to partial trust (the app needed full trust). When that happens to you, you care. If you mash up with someone else, you care, etc.

On “Transport cost is zero” – Tim says it is the same as for bandwidth, i.e. worse.

On “The network is homogeneous” – Tim says that this is the “web’s single greatest triumph”. I actually agree with that, as long as all of you stick to using the web’s ubiquitous standards (HTTP, XML/JSON). If you have parts of your application that can’t use them, you still need to pay attention.

One thing I am really  puzzled by is Tim’s conclusion :

“If you’re building Web technology, you have to worry about these things. But if you’re building applications on it, mostly you don’t.”

Since even according to him only 4 fallacies are covered by the web… (I think only 1)

In any event, I agree that the web standards, and REST in particular, do contain guidelines that take the fallacies into consideration. However, it is still up to developers to understand the problems they’ll create if they don’t follow these guidelines. Assuming that they do is, well, overly optimistic in my experience.

You can also read a paper I published a few years ago which explains the fallacies  and why they are still relevant today.


 
Tags: REST | SOA | Software Architecture

Michael Poulin @ ebizq doesn’t like the Active Service pattern. I suggest you read his post first, but in a nutshell Michael sees two possible ways to understand the term Active Service:

“a) service view - a service that actively looking for companions to complete its own task
b) consumer view – a service which triggers its own execution by itself”

…and he doesn’t like either…

I think that both of these definitions aren’t that far off… and I like both :)

The way I see it there are two concerns here:

1. Are services only reactive (“passive”)? I.e. does the service only “work” when it gets a request from a service consumer (a user, another service, an orchestration engine)? If the service also has at least one thread working on internal stuff (e.g. scavenging outdated data, pre-fetching data from other services etc.) then that’s what I call an Active Service (option “b” above).

2. How do services get the data they need to complete a request when they actually get one? There are many possibilities here: events, pub/sub, an orchestration engine that takes care of that, services that check for a known contract in a registry and then go to that service, even hardcoding. The options where the service looks for other services (e.g. using a registry) are option “a” above.

So basically all the options are valid: a service can be a+b, just a, just b or neither and, in my eyes, these are orthogonal concerns.

Regarding pre-fetching – I think this can be beneficial as a way to achieve caching. Note that if you control both sides and you’ve got the needed infrastructure then it is probably better to push changes (eventing or pub/sub) but that’s not always the case.

In the comment I left on Michael’s blog I talked about different strategies for services “There are several strategies for that - one is to take that knowledge out of the service (e.g. using choreography or orchestration), providing a subscription and/or wiring infrastructure i.e. something that will tell you where to find certain contracts, hard coding , registry , using uniform interfaces (e.g. REST) etc.”

Let’s take a concrete (albeit very, very simplistic) scenario to illustrate some of the approaches.

Business scenario: When a customer makes an order we want to give a 5% discount to preferred customers. A customer gets a preferred status upon a business decision (annual orders of $1M, or knowing the CEO, or whatever) and the status lasts for a year from the date it was introduced.

For the sake of this discussion say we have two services (again this is overly simplified) an Ordering service and a Customer service.

Here are a few technical options

Technical Scenario 1.

The customer places an order. The ordering service talks to “the” customer service to check if the customer deserves a discount. If she does, the ordering service updates the order with the discount and presents it to the customer to finalize the order.

Technical Scenario 2.

Same as 1, with the ordering service looking for a service that matches the customer contract it knows about

Technical Scenario 3

The ordering service asks “the” customer service twice a day for a list of discounts and caches the result. When the user sends her order, it calculates the price and presents it to her

Technical Scenario 4

Same as 3, with the ordering looking for a customer service (not using a known service)

Technical Scenario 5

The customer service sends a message to known subscribers whenever a new customer status occurs. The ordering service listens for that and updates its internal cache. When the customer places her order, the ordering service hits the cache for the discount

Technical Scenario 6

Same as 5, but publishing an event to unknown subscribers

Technical Scenario 7

The customer service publishes an event with the discounts (or changes in discounts) twice a day. The ordering service listens for that and updates its internal cache. When the customer places her order, the ordering service hits the cache for the discount

Technical Scenario 8

The customer order is passed to an orchestrating service, which hits a customer service for a discount and then passes all the data to an ordering service

There are quite a few more options and variants on the options listed but which one is best?

Yeah, you’ve guessed it: it depends. It depends since each option has its own strengths and weaknesses, which can work best in different circumstances. It also depends on the available infrastructure, on the structure of other services, on the services being internal or external etc.

For instance, scenario 1 is less flexible than most others but it is simple to implement. There is coupling in time between ordering and customer (both have to be up for the order to complete). Scenario 4 needs to solve the problem of finding other services (e.g. using some kind of registry, or other services “pushing” their existence, or whatever), but when a customer makes her request the ordering service (most likely) has all the needed info to process that request, making it more autonomous. As a side note, the fact that different approaches to achieving the same end-goal work in different situations is why I decided to write patterns in the first place.

Lastly, in case you are wondering the scenarios are:

1 – Choreography with pre-known (configured or hardcoded) companion services

2 – Choreography with “active service” type a (ordering is active)

3 – Choreography with “active service” type b (ordering is active)

4 – Choreography with “active service” type a + b (ordering is active)

5 – Pub/sub (e.g. using an ESB)

6 – Eventing

7 – Eventing with “active service” type b (customer is active)

8 – Orchestration
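For the code-minded, scenario 3 (an “active service” of type b) could look roughly like the sketch below. ICustomerService, StubCustomerService and OrderingService are names I made up for the example, discounts are modeled as a simple customer-to-rate map, and the refresh interval would be 12 hours for the “twice a day” case.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Assumed contract of "the" customer service: customer id -> discount rate (0.05 = 5%).
public interface ICustomerService
{
    IDictionary<string, decimal> GetDiscounts();
}

// Hypothetical stub so the sketch is self-contained.
public class StubCustomerService : ICustomerService
{
    public IDictionary<string, decimal> GetDiscounts()
    {
        return new Dictionary<string, decimal> { { "preferred-customer", 0.05m } };
    }
}

public class OrderingService : IDisposable
{
    private readonly ICustomerService customers;
    private readonly Timer timer;
    private volatile IDictionary<string, decimal> discounts = new Dictionary<string, decimal>();

    public OrderingService(ICustomerService customers, TimeSpan refreshEvery)
    {
        this.customers = customers;
        Refresh(); // warm the cache before taking orders
        // The "active" part: the service works on its own, not only when a request arrives.
        timer = new Timer(_ => Refresh(), null, refreshEvery, refreshEvery);
    }

    public void Refresh()
    {
        try
        {
            // Swap in a fresh copy so readers never see a half-updated cache.
            discounts = new Dictionary<string, decimal>(customers.GetDiscounts());
        }
        catch (Exception)
        {
            // Customer service is down: keep serving from the stale cache.
        }
    }

    // Handling an order needs no call to the customer service at request time.
    public decimal PriceFor(string customerId, decimal listPrice)
    {
        decimal rate;
        return discounts.TryGetValue(customerId, out rate) ? listPrice * (1 - rate) : listPrice;
    }

    public void Dispose()
    {
        timer.Dispose();
    }
}
```

The price of the extra autonomy is staleness: a customer who became preferred an hour ago may not get her discount until the next refresh, which is exactly the kind of tradeoff the “it depends” above is about.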


 
Tags: SOA | SOA Patterns | Software Architecture

May 12, 2009
@ 10:54 PM

I recently got a request from Alik for my opinion on REST. I think  this might be interesting for a wider audience and decided to blog my answer here.

Note: I also have a REST presentation I prepared awhile ago, which is downloadable from here (ppt)

The good

As you probably know REST is an architectural style defined by Roy Fielding for the web which is built on several foundations (client/server, uniform interface etc.) which gives it a lot of strength in affected areas. The top three in my opinion are:

  • (relatively) Easy to integrate – a good RESTful API is discoverable from the initial URI onward. This doesn’t mean that any application calling your service will automagically know what to do. It does mean, however, that the developer reading your API and trying to integrate with it has an easier life, especially since hypermedia provides the roadmap of what to do next.
  • Another feature for ease of integration, which has to do with REST over HTTP (THE most common implementation of REST), is the use of ubiquitous standards. Speaking HTTP, which is the protocol of the web, and emitting JSON or AtomPub means it is much easier to find a library that can connect to you in any language and on any platform.
  • Scalability – stateless communication, replicated repository make for a good scalability potential.

Do note that, as with any architecture/technology, a bad implementation can negate all the benefits.


Other REST goodness includes things like the notion of the URI, the idempotence of GET in REST over HTTP etc.

The Bad

Some of the problems of REST aren’t inherent problems of the architectural style but rather drawbacks of the REST over HTTP implementation. Most notable of these is what’s known as “lo-rest” (using just GET and POST). While technically it might still be RESTful, to me a uniform interface with 2 verbs is too small to be really helpful (which indeed makes a lot of implementations unRESTful, see “The Ugly” below).

One problem which isn’t HTTP specific is handling REST in code: programming languages are not resource oriented, so the code that handles URI mapping tends to get messy. Actually, Microsoft did a relatively good job of implementing Joe Gregorio’s idea of URI mapping, which helps alleviate some of the problem. On the other hand, it is relatively hard to make the REST API hypertext driven (which is a constraint of REST).

Lastly, and most importantly, REST is not the answer to everything (see also another post I made on using REST along with other architectural styles). E.g. most REST implementations I know do not support the notion of pub/sub (Roy did suggest a REST implementation called WAKA that enables this, but most people have never even heard of it). Be wary of the “Hammer” syndrome: REST is a good tool for your toolset but it isn’t the only one.

The Ugly

In my opinion there are 2 main ugly sides for REST. The first is Zealots. That isn’t something unique to REST any good technology/idea (Agile, TDD etc. ) gets its share of followers who think that <insert favorite idea> is the best thing since sliced bread and that everybody should do as they do or else.

The real ugliness comes from the misusers. There’s a lot of misunderstanding. The fact that REST over HTTP has become synonymous with REST leads people to think that HTTP is REST. I recently read a REST book review on Colin’s blog where “the author states that although hypermedia is important in REST it isn't covered in the book because WCF has poor support for it”, i.e. a book on REST which ignores one of the important constraints of the style.

Other misuses include building an implementation that is GETsful (i.e. does everything with HTTP GET), doing plain RPC where the URI is the command, doing CRUD with HTTP verbs etc.

The point is that REST seems simple but it isn’t: it requires a shift in thinking (e.g. identifying resources, externalizing the state transitions etc.). However, as noted above, done right it can be an important and useful tool in your toolset.


 
Tags: REST | SOA

May 8, 2009
@ 10:59 PM

Apropos the Blogjecting Watchdog pattern: in addition to blogging, I recently added to our system the ability to tweet. I am using Tweet# from DimeBrain (thanks Mark Nijhof for the tip via Twitter).

Tweet# makes using Twitter really simple (I included the code below in case you find it useful).

The Twitter sender is part of a PostOffice service (I thought that it would be problematic to present it as SpamServer, which was its original name :) ).

Update 11/05: Here it is working on our staging environment :)

A few points about our design in general that are interesting in this regard are:

  • The PostOffice is a “Server” type service. We have 3 types of services: server, which has one instance per node; channel, which has multiple instances per node; and algorithmic, which has one instance per core.
  • The PostOffice implements a pattern I call “Legacy Bridge”, which is basically an SOA version of an adapter+facade in OO terms. The PostOffice supports the events (over WCF) mechanism we have in our system on one side and connects to external systems (SMS, coupons and Twitter) on the other. The PostOffice basically contains an Edge Component which accepts the requests and funnels them to *Sender classes that interact with the external systems.
  • From a contract design perspective, the events I added into the system are StatusEvent and AdminStatusEvent (and not TwitterEvent and DirectMessageEvent). This is better, in my opinion, as it carries the intent of what I want to achieve. It also means that if I choose to change technology or use multiple destinations the events will stay meaningful. For instance, the AdminStatusEvent will be used by our monitoring system to send a notification if the system crashes. I’ll probably want that as an SMS, maybe even a phone call, as well as a tweet (so the AdminStatusEvent will have a severity to designate how it should be handled).
using System;
using System.Collections.Generic;
using Dimebrain.TweetSharp.Fluent;

namespace xsights.Apps.PostOffice.Server.Twitter
{
    class TwitterSender
    {
        // Maximum length of a single tweet / direct message.
        private const int MaxTweetLength = 140;

        private readonly string account;
        private readonly string password;
        private readonly string admin;

        public TwitterSender(string tweetAccount, string twitterPassword, string adminAccount)
        {
            account = tweetAccount;
            password = twitterPassword;
            admin = adminAccount;
        }

        // Posts a public status update, split into tweet-sized chunks.
        public void Update(string msg)
        {
            foreach (var tweet in BreakToTwitts(msg))
            {
                var update =
                    FluentTwitter.CreateRequest().AuthenticateAs(account, password).Statuses().Update(tweet).AsJson();

                update.Request();
            }
        }

        // Sends a direct message to the admin account, retrying on failure.
        public void SendAdminMessage(string msg)
        {
            foreach (var twit in BreakToTwitts(msg))
            {
                var dm =
                    FluentTwitter.CreateRequest().AuthenticateAs(account, password).DirectMessages().Send(admin, twit).AsJson();

                Retry(2, dm.Request, false);
            }
        }

        // Splits a long message into consecutive chunks of at most MaxTweetLength characters.
        private IList<string> BreakToTwitts(string originalString)
        {
            var list = new List<string>();
            for (int i = 0; i < originalString.Length; i += MaxTweetLength)
            {
                var len = Math.Min(MaxTweetLength, originalString.Length - i);
                list.Add(originalString.Substring(i, len));
            }
            return list;
        }

        // Calls the delegate once and then up to 'retries' more times if it throws.
        // When all attempts fail, rethrows the last exception only if shouldThrow is set.
        private void Retry(int retries, Func<string> call, bool shouldThrow)
        {
            for (int attempt = 0; ; attempt++)
            {
                try
                {
                    call();
                    return;
                }
                catch (Exception)
                {
                    if (attempt >= retries)
                    {
                        if (shouldThrow)
                            throw;
                        return;
                    }
                }
            }
        }
    }
}
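The contract-design point above (intent-carrying events with a severity that decides how they are handled) could be sketched roughly as follows. This is only an illustration, not the actual PostOffice code: the Severity enum, the IAdminSender interface, the AdminStatusRouter class and the RecordingSender stand-in are all hypothetical names, and the post does not say how severities are actually mapped to destinations.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical severity levels; the post only says the event "will have a severity".
public enum Severity { Info, Warning, Critical }

// Intent-carrying event: it says what happened, not which technology delivers it.
public class AdminStatusEvent
{
    public string Message { get; set; }
    public Severity Severity { get; set; }
}

// Hypothetical abstraction over the *Sender classes (Twitter, SMS, ...).
public interface IAdminSender
{
    void SendAdminMessage(string msg);
}

// Stand-in sender that only records what it was asked to send.
public class RecordingSender : IAdminSender
{
    public readonly List<string> Sent = new List<string>();
    public void SendAdminMessage(string msg) { Sent.Add(msg); }
}

// Routes an event to every sender whose minimum severity the event reaches.
public class AdminStatusRouter
{
    private readonly List<KeyValuePair<Severity, IAdminSender>> routes =
        new List<KeyValuePair<Severity, IAdminSender>>();

    public void Register(Severity minimum, IAdminSender sender)
    {
        routes.Add(new KeyValuePair<Severity, IAdminSender>(minimum, sender));
    }

    // Returns the number of senders that were notified.
    public int Handle(AdminStatusEvent evnt)
    {
        int notified = 0;
        foreach (var route in routes)
        {
            if (evnt.Severity >= route.Key)
            {
                route.Value.SendAdminMessage(evnt.Message);
                notified++;
            }
        }
        return notified;
    }
}
```

With a router like this, a Twitter sender could be registered at Severity.Info and an SMS sender at Severity.Critical, so a crash notification reaches both while routine status goes only to Twitter. Swapping a destination changes the registration, not the event contract.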

 
Tags: .NET | OO | SOA | SOA Patterns

As I mentioned in the previous post, I got a few interesting questions lately. The first was from Colin, regarding developing a customized solution for the Blogjecting Watchdog pattern vs. integrating/developing for a commercial monitoring suite (e.g. Unicenter/OpenView etc.). The second question was from Dru, on running multiple versions of services (e.g. during upgrade) with active Sagas in the background. I think these questions are interesting enough to be answered as blog posts. Also, since both these questions are related to the Blogjecting Watchdog pattern, I thought it would be better to first explain what it actually is.

So here it is :)

Blogjecting Watchdog

Achieving availability is a multi-layered effort. I’ve already talked about how services should be autonomous (see for example the Active Service pattern in chapter 2); the Blogjecting Watchdog pattern looks at another aspect of autonomy. It shows how a service can proactively try to identify faults and problems and try to heal itself when it identifies them.

1.1 The Problem

The Service Instance pattern (see section 3.4), for example, demonstrates a strategy that a service can implement to be able to cope with failure. The question is: is that enough? Is it enough for the service to try to cope with everything by itself? My answer is no, that is not enough. For one, once we have dealt with the failure within the service, the service's ability to cope with the next failure would probably be diminished. For example, if we found a failure in a server and moved to a standby server, the new server does not have another standby server to move to if another fault occurs.

Additionally, the failure might be too much for the service to overcome by itself, like a switch going down. So we would want something external that looks after the service and can help it (see the Service Monitor pattern in chapter 4).

To increase the service's autonomy and the overall availability of our SOA, we need both to try to identify and repair problems and to be able to notify the world about the service’s current status.

The question is then:

How can we identify and attend to problems and failures in the service and increase service availability?

One option is to try to infer the state of the service from the way it looks to the outside. Yes, this is as crude as it sounds. You try to call the service and it doesn't respond: you know it is down. You call the service expecting a reply in 5 seconds and get it in 10 seconds: you understand that the service is congested. This is not a very good option, as the external behavior only gives us coarse knowledge of the service's state. For example, if the service has a decent fault tolerance solution, we wouldn't know that anything happened, but the truth is that the service's ability to handle the next fault might not exist anymore.

Another way is to install agents on the service's servers. This will give you a much better picture of what happens (vs. the option above). For example, you will also be able to get trend information (e.g. you can watch how much disk space is left and alert when it is getting low). There are several problems with this solution. One is that you need to actively install software on the service's servers, which both decreases the service's autonomy and creates a management hassle in itself. Another problem is that you still only get an external view of the service's behavior (you just gain access to more information). Also, there are situations (see for example the Mashup pattern in chapter 7) where not all the services are under your control and you cannot access their hardware.

Yet another option is to actively question the service about its state. This has one big advantage over the two previous options, since you also get some inside information: what the service has to say about its own state. This enables the service to communicate trends in the problems that will actually make it fail. For example, if the service does not write any information to the local disk, low disk space is not a problem at all; if this is the disk where the database is located, it is very much a problem. The solution is not perfect, since it is the observer's responsibility to go after the information. If the rate at which the observer samples the service is not fast enough, it can miss vital information.

As I mentioned earlier, we want something that will help increase the service’s autonomy, so a better approach in this regard would be for the service to watch over itself.

1.2 The Solution

Watching over itself is also not enough, as we also said we need the “world” to know what is happening with the service. Thus a combined solution is to:

Implement the Blogjecting Watchdog pattern and have the service actively monitor its internal state, try to heal itself and continuously publish its state and other important indicators.


Figure 3.14 The Blogjecting Watchdog pattern. The blogjecting component sends the reports out and listens for requests. The watchdog component monitors the status of the business service, tries to heal stray components and logs any failure.

The pattern revolves around a single idea: increasing the service's responsibility by using two complementary concepts, reporting and self-healing. The first is the Blogjecting concept, where the service implements the Active Service pattern (see chapter 2 for more details) and a component which is in charge of monitoring the service's state. The component also publishes the service's state (see the Publish/Subscribe interaction pattern in chapter 6) on a cyclic basis or when something meaningful occurs. It is important to note that the fact that the service actively publishes its state doesn't mean it cannot also respond to inquiries regarding its health (akin to leaving a comment on a blog and getting a response from the author).

What are Blogjects

The term Blogjects was coined by Julian Bleecker back in 2005 (Bleecker, 2005) to describe "edgy designed objects that report themselves, or expose their experiences in some fashion" or in other words Blogject == Objects that blog. Julian Bleecker's vision for Blogjects is wider than the one suggested here. Julian's vision is for things that participate in the Web 2.0 sense of social-web or even further than that. To use Julian’s words: “Forget about the Internet of Things as Web 2.0, refrigerators connected to grocery stores, and networked Barcaloungers. I want to know how to make the Internet of Things into a platform for World 2.0. How can the Internet of Things become a framework for creating more habitable worlds, rather than a technical framework for a television talking to an reading lamp?”. I highly recommend taking a look at the full paper “A Manifesto for Networked Objects – Cohabiting with Pigeons, Arphids and Aibos in the Internet of Things” (Bleecker, 2006) to get the full picture.

 

The second concept that plays a part in the Blogjecting Watchdog pattern is the watchdog. The idea here is to have a component that listens in on the information gathered and published by the blogject component and then acts on that information in a meaningful way to increase the reliability and availability of the service. The possibilities for implementing self-healing are endless; two simple examples of self-healing actions are restarting failed components and cleaning temporary files.

Watchdogs

Watchdog (actually watchdog timer) is a term borrowed from the embedded systems world. A watchdog is a hardware device that counts down to zero, and when it gets there it resets the device. To prevent this reset, the application has to “kick the dog” before the timer runs out. If the application does not reset the counter, it means that the application is hung, and the idea is that the reset would fix that.
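The countdown-and-kick idea translates to software fairly directly. The following is a minimal sketch under a few assumptions: the SoftwareWatchdog class and its members are hypothetical names, the clock is injected so the behavior can be checked deterministically, and the "reset" is just a callback (in a real service it might restart a failed component).

```csharp
using System;

// A software analogue of a hardware watchdog timer.
public class SoftwareWatchdog
{
    private readonly TimeSpan timeout;
    private readonly Action onExpired;
    private readonly Func<DateTime> clock;
    private DateTime deadline;

    public SoftwareWatchdog(TimeSpan timeout, Action onExpired, Func<DateTime> clock)
    {
        this.timeout = timeout;
        this.onExpired = onExpired;
        this.clock = clock;
        deadline = clock() + timeout;
    }

    // "Kick the dog": the application proves it is alive and the countdown restarts.
    public void Kick()
    {
        deadline = clock() + timeout;
    }

    // Meant to be called periodically (e.g. from a timer).
    // Fires the reset action and returns true if the countdown ran out.
    public bool Check()
    {
        if (clock() < deadline)
            return false;

        onExpired();
        deadline = clock() + timeout; // start a fresh countdown after the reset
        return true;
    }
}
```

The monitored component calls Kick() from its main loop, while something simple and independent calls Check(). If the main loop hangs, the kicks stop and the next Check() past the deadline triggers the reset action.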

 

How is the Blogjecting Watchdog pattern better than the other options mentioned above?

Even if we just consider the blogjecting part of the pattern we can see several advantages over the other approaches. The Blogjecting Watchdog combines the benefits of an agent that actively monitors the service's health with the internal knowledge of what's important for the service's continuity and what's not. Unlike the external agents solution, with Blogjects the service retains its autonomy. The autonomy is increased even further when you add the self-healing features of the watchdog. The end result is a service which is more resilient (and thus has higher availability), and which lets the world know both its current state and future trends.

In one project I was working on we inherited a situation where there were interdependencies between executables installed on different servers (within a service). For example, when one process was down on server A the objects running on server B could not function well, and there were other such dependencies (this isn’t the brightest design, but sometimes you have to compromise; in this case there was no time and budget to redesign these applications). What we ended up with is something like the situation in figure 3.15 below:


Figure 3.15 A sample deployment of a blogjecting watchdog. The daemons on the servers monitor the running components on each server. The Watchdog edge exposes the current state both through a web-services API and as SNMP traps.

The watchdog agents on each of the server nodes monitor the components. The agents communicate amongst themselves to examine the dependencies and actions taken. The watchdog Edge component provides a WSDL based endpoint where other services can query it for the service’s health. It also publishes SNMP traps to an external SNMP monitor (e.g. HP-Openview). As an implementation hint, I can suggest keeping the watchdog components in a separate, very simple executable (preferably a daemon that runs when the OS loads). The simpler the component, the lower the risk that it will fail itself (you can of course have a backup in the form of a hardware watchdog). Let’s take a more thorough look at the technology mapping options.

1.3 Technology Mapping

Implementing Blogjecting Watchdog in an enterprise will usually pre-determine the protocols you will have to use for your “blog”. The IT team will most likely have already standardized on one of the leading monitoring suites (CA-Unicenter, HP-Openview, IBM-Tivoli or, if you are an all Microsoft shop, Microsoft Operations Manager). In these cases you can use the SDK of the monitoring software (e.g. the Unicenter Agent SDK or MOM management pack developer guides). There are even 3rd party software packages to help you build such agents (for example OC Systems have a Universal Agent that makes it easier to write agents for Unicenter).

Note that this is not always the case, though, and sometimes you do have the freedom to choose your protocols. A few projects I worked on chose to standardize on using web-services with specific messages for monitoring the health of services (so we had a specific endpoint for each service where these messages were supported). With the emergence of SOA specific tools like the ones by Amberpoint and Weblayers you will see more and more WS-* based monitoring.

Other ways of reporting your internal state are to use standards like SNMP (Simple Network Management Protocol) or simply the Windows Event logs. An interesting option, which will let your Blogjecting Watchdog literally blog, is to use a product called RSSBus, which is an ESB implementation that uses the RSS protocol for communications. At the time I am writing this, the product is still in beta, so I haven’t used it for a serious system yet. Nevertheless, it looks like an interesting direction which I’ll consider when it is released.

Regarding the self-healing part (watchdog), self-healing is still more prevalent in hardware than in software (watchdog timers, RAID, IBM , hot spare memories, hot spare drives etc.). In a sense, any solution that builds on clustering technology also has some of that built in. The virtualization trend will also help in this sense (see discussion on utility computing in this chapter’s summary). You can already read papers that talk about self-healing web services (G. Kouadri Mostéfaoui, 2006) or see some projects that try to look into this problem (e.g. WS-Diamond - DIAgnosability, Monitoring and Diagnosis). Nevertheless, all of them are still in the research phase, and if you want something now you will probably need to implement something by yourself. In my experience, it won’t take you too much time to have a basic watchdog up and running, but it will take you some time until you have it predicting and acting as an advance warning system.

1.4 Quality Attribute Scenarios

The Blogjecting Watchdog is an interesting pattern (and not just because of its odd name) as it can really help on the way to autonomous computing. The effect of this proactive approach is to increase the overall reliability of the service. A service which is self-healing can overcome (at least) minor problems, which results in better availability overall. Additionally, the monitoring aspects of the Blogjecting Watchdog also help enhance availability by notifying administrators that something is amiss (which will enable them to fix it).

Quality Attribute (level 1) | Quality Attribute (level 2) | Sample Scenario
Availability | Failure detection | Upon a failure or degraded performance, the system will alert the system admin (via SMS) within 3 minutes.
Reliability | Increased autonomy | During normal operations, the system will clear all its temporary resources (e.g. files) continuously.

Table 1.1 Blogjecting Watchdog pattern quality attribute scenarios. These are the architectural scenarios that can make us think about using the Blogjecting Watchdog pattern.

Once we introduce a monitor and start to collect data, we can start to find new uses for that data. For example, we can use the information on incoming requests to try to locate attacks on the service. Saved monitoring data can be used to analyze the service’s behavior over time and predict failures, and thus increase its maintainability.



 
Tags: Q&A | SOA | SOA Patterns | Software Architecture

The previous installment provided some context as to why I want to implement this pattern. This installment will look at some of the implementation options.

As I noted before, WCF provides quite a lot of extension points on the route a message passes from arriving at the service to the point where WCF calls the actual method on the service instance. Several of those extension points are possible candidates for the Service Firewall, for instance:

  • Contract Filter - The contract filter is responsible for routing messages to the appropriate contract. It needs to be a subclass of MessageFilter. The contract filter looks like a good option since it intercepts the call rather early, which means it would probably be the fastest option. Also its name (filter..) implies it is a good option.
  • Message Inspector - The message inspector is responsible for looking at or modifying messages when they enter a service, and looks like a natural candidate for the job. There are two kinds of message inspectors: those that look at messages on the client side (implement the IClientMessageInspector interface) and those that look at the server side (implement the IDispatchMessageInspector interface). It seems that the latter is the type of inspector we need here.
  • Service Authorization Manager - Responsible for evaluating policies, claims etc. of the client to make sure that a call is valid from the security perspective. This looks like it would be a good class to use for a real service firewall, but it seems it won't be a good fit for what we need here.

When I need to choose between several technical options that seem to be similar I usually do a POC (proof of concept): a piece of throwaway code to get a feel for the different options and better understand their strengths and weaknesses (in the context of the solution I seek).

What I did was to take a class I prepared for some of the integration tests of the EventBroker and build a few extensions that interact with it. Here is some of the setup code of the environment:

testServer = new Tester();
 service1 = new ServiceHost(testServer, new Uri(string.Format("http://localhost:{0}", TestServerPort)));
 var binding = new WebHttpBinding
 {
     ReaderQuotas = { MaxArrayLength = 600000 },
     MaxReceivedMessageSize = 800000,
     MaxBufferSize = 800000

 };

 var ep = service1.AddServiceEndpoint(typeof(TestingContract), binding, string.Format("http://localhost:{0}/S1", TestServerPort));
 ep.Behaviors.Add(new WebHttpBehavior());
 ep.Behaviors.Add(new InspectorBehavior());
 service1.Authorization.ServiceAuthorizationManager = new TestAuthorizer();
 var cp = service1.AddServiceEndpoint(typeof(ImContract), binding, string.Format("http://localhost:{0}/Control", TestServerPort));
 cp.Behaviors.Add(new WebHttpBehavior());

The two lines above that add the InspectorBehavior and set the ServiceAuthorizationManager are the ones responsible for injecting the POCs: the InspectorBehavior is responsible for inserting the ContractFilter and the MessageInspector, and the TestAuthorizer is the Authorization Manager test implementation.

We also need some code to raise an event:

public void SendMessage()
 {
     var evnt = new TestingEvent { sagaId = Guid.NewGuid() };

     moqRA.Expect(x => x.GetChannel<TestingContract>(evnt.sagaId, true)).Returns(channel1);
     moqRA.Expect(x => x.GetChannel<TestingContract2>(evnt.sagaId, true)).Returns(channel3);

     eb.BeginNewSagaEvent(evnt.sagaId, evnt);
     eb.CloseSaga(evnt.sagaId);

 }

And now we can look at the different options. The InspectorBehavior is just a helper class to wire the filter and/or inspector to the endpoint. (The Authorization Manager is set up at the service level, i.e. for all endpoints.)

public class InspectorBehavior : IEndpointBehavior
 {
     
     public void AddBindingParameters(ServiceEndpoint endpoint, BindingParameterCollection bindingParameters)
     {
     }

     public void ApplyClientBehavior(ServiceEndpoint endpoint, ClientRuntime clientRuntime)
     {
         throw new NotImplementedException();
     }

     public void ApplyDispatchBehavior(ServiceEndpoint endpoint, EndpointDispatcher endpointDispatcher)
     {
      var inspector = new TestInspector();
      endpointDispatcher.DispatchRuntime.MessageInspectors.Add(inspector);
      endpointDispatcher.ContractFilter = new TestFilter(endpointDispatcher.ContractFilter);
         
     }

     public void Validate(ServiceEndpoint endpoint)
     {
     }
 }

The first thing I tried was the "ContractFilter". It is actually very simple to use. You inherit from MessageFilter and there are two "Match" methods you need to override: one that accepts a buffer and one that accepts a (WCF) Message. WCF calls the Match method which accepts a Message.

WCF's Message class is interesting in the sense that it has a one-time touch feature, i.e. only one piece of code can read/copy it, and the next piece of code which tries to do the same will fail.

So in the Match method you can do something like the following:

public override bool Match(Message message)
 {
     var buffer = message.CreateBufferedCopy(Int32.MaxValue);
     message = buffer.CreateMessage();
     var r = buffer.CreateMessage().GetReaderAtBodyContents();
     .
     .
     .
 }

Which basically means: get a buffer of the message, create one copy to preserve, then get another copy for internal use and work with that to parse and verify the actual message. Unfortunately, this doesn't really work. The message parameter is not passed as ref, so the original message is consumed on the first line of the method and the copy we create never reaches the rest of the pipeline. Note that you can access the header part of the message without a problem; however, that's not a good fit for what I am trying to do.

The next thing I looked at was the MessageInspector. Again, implementing it is rather simple: you just need to implement the IDispatchMessageInspector interface. This interface has two methods, BeforeSendReply and AfterReceiveRequest. We'll look at the AfterReceiveRequest method. Again we try the message copy trick:

public object AfterReceiveRequest(ref Message request, IClientChannel channel, InstanceContext instanceContext)
 {
     var buffer = request.CreateBufferedCopy(Int32.MaxValue);
     request = buffer.CreateMessage();
     var temp = buffer.CreateMessage().GetReaderAtBodyContents();
     .
     .
     .
 }

This time it works since we get the request parameter as ref. At first it seemed to me that while you can inspect and alter the message as your heart wishes there is no way to say that the message is bad. One option is to alter the message to a faulty message and let the application handle it - but that means too much coupling between infrastructure and application. Another, better, option is to throw an exception.

So using the MessageInspector is a usable option. It is very good if you want to alter the incoming message, but throwing an exception when the message is bad is not very clean.

Which brings us to our third option, the Authorization Manager, which, surprisingly, turned out to be the best option:

public class TestAuthorizer :ServiceAuthorizationManager
 {
     public override bool CheckAccess(OperationContext operationContext, ref Message message)
     {
         var authorized = base.CheckAccess(operationContext, ref message);
         var buffer = message.CreateBufferedCopy(Int32.MaxValue);
         message = buffer.CreateMessage();
         var testMessage = buffer.CreateMessage();
         .
         .
         .
         return authorized;
     }
     
 }

Like the message inspector it receives the message as ref and like the filter it allows a single yes/no answer to decide if a message should continue or be discarded. Additionally it notifies the client that the message was rejected if that is what you choose to do (in the WebHttpBinding I used that means a 400 bad request return code)

Ok, so we've seen some of the options for implementing the Service Firewall and briefly went over their different behaviors. The next part in this series will take a look at some of the actual implementation I did.



 
Tags: .NET | SOA | SOA Patterns | WCF

One of the SOA patterns I already described is the Service Firewall. The idea behind the service firewall is to have an intermediary between the actual service and its callers that inspects incoming and outgoing messages at the application level.


Anyway, while I documented the pattern as a security one, I am actually going to implement it for another purpose: a saga filter.
In our implementation of EventBroker I made the design decision to have services expose regular WCF contracts, i.e. services can communicate with each other directly and not just via eventing. This design decision is there to allow both interaction with non-WCF services and flexibility for multiple message exchange patterns (where events are not the best choice).
Another design decision is that we have two types of services: Servers and Channels. Servers handle multiple sessions and are (relatively) heavy to write. Channels, on the other hand, are light-weight services that are stateful and dedicated to a specific session. Naturally there are a lot of instances of channels to support multiple sessions (and there are infrastructure bits to handle allocations, propagate liveliness etc., but that's another story). Channels have several benefits, like increasing the system's capability to cope with failure (if a channel is down, only the session it supported fails). One of the benefits of Channels is a simple coding model. The Channel is dedicated to a session (typically a saga) and thus it doesn't have to handle all the routing of messages to sagas etc. that Servers have to cope with. This is where the Service Firewall comes into play.
In order to keep the channels' code simple, "someone" has to make sure the channel doesn't get messages that are not related to the saga it is part of. Otherwise the Channel will have to know about its current active saga and filter messages by itself, which kind of misses the point.
Making sure other services will not send messages while not in a saga etc. will only take us so far (you know, latencies and stuff). A service firewall will let us intercept the messages before they reach the service and only allow the messages related to an active saga to pass through (while maintaining the benefits of direct contracts).
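Stripped of any WCF plumbing, the decision such a saga filter has to make could look roughly like the sketch below. The SagaFilter class and its method names are hypothetical, and the sketch assumes each message carries the id of the saga it belongs to (or none at all); how that id is extracted from the message is left out.

```csharp
using System;
using System.Collections.Generic;

// Keeps track of the sagas a channel currently takes part in and decides
// whether an incoming message may reach the channel's code.
public class SagaFilter
{
    private readonly HashSet<Guid> activeSagas = new HashSet<Guid>();
    private readonly object sync = new object();

    public void SagaStarted(Guid sagaId)
    {
        lock (sync) { activeSagas.Add(sagaId); }
    }

    public void SagaEnded(Guid sagaId)
    {
        lock (sync) { activeSagas.Remove(sagaId); }
    }

    // The yes/no answer the firewall needs for every incoming message:
    // only messages that belong to an active saga may pass.
    public bool ShouldPass(Guid? sagaId)
    {
        if (!sagaId.HasValue)
            return false;

        lock (sync) { return activeSagas.Contains(sagaId.Value); }
    }
}
```

The point of keeping this logic outside the channel is that a late message for a saga that has already ended is dropped before the channel's code ever sees it.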

WCF has a rich extensibility model (see figure from MSDN below). This series will show how you can use some of these extension points to implement a service firewall and achieve the goal described above. I hope you'll find it interesting.




 
Tags: .NET | SOA | SOA Patterns | WCF

January 25, 2009
@ 11:42 PM
If you read this blog regularly you've probably heard/read about the 8 fallacies of distributed computing once or twice ... you know, the assumptions architects and designers tend to make when designing distributed systems which prove to be wrong down the road, causing pain and havoc in the project (indeed, my paper explaining them is the second most popular download on my site with just about 50K downloads).
The fallacies were originally drafted in 1994 by Peter Deutsch (with one more added by James Gosling in 1997), and they still hold true today. I still see designers make these same old mistakes in modern SOAs, RESTful designs and whatnot, but that's not the reason for this post.
What I want to talk about is the second fallacy, "Latency is zero".

The more I think about it the more I think this fallacy should be updated to "Latency is zero or constant" (or add another fallacy for "latency is constant" on its own).

What's the difference?

Well, the "latency is zero" fallacy means treating remote "things" as if they are the same as local "things". We can't do that: we need to build the API of remote things to take into account the fact that information takes time to get there (e.g. chatty interfaces vs. chunky interfaces). You can see more on that in a post called "Why arbitrary tier-splitting is bad" I wrote about a year ago.

The "latency is constant" fallacy means thinking that if we send several batches of "stuff" to a remote "thing", they may arrive late but at least they'll arrive in order. Or, to move from "things" and "stuff" to more concrete terms: if you send messages over a network from one service to another, they won't necessarily arrive in order.

But wait, isn't that only true for asynchronous messages? If we make synchronous calls we don't really care about this, now do we? That's only true if you and the service you are consuming are alone in the world. In all other cases (i.e. most of the time), even if you make all your calls synchronous, you can't know what other messages (from other senders) will arrive in between your messages and how they will affect the service's state.

Unreliable latency can also mean we'll retry a message because we think it is lost and later find out that the receiver got it multiple times.

These are things you really have to take that into account when you make multiple related calls - like,say, in a saga. One thing you can do to help is make messages idempotent (which also helps with the "network is reliable" fallacy). You can also increase latency even more and order the messages something that happens, for example, when  streaming video or audio.
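
To make the two countermeasures a bit more concrete, here is a minimal sketch in Python (not code from any of my systems). It assumes the sender stamps every message with a unique id, which it reuses on retries, and a sequence number starting at 0; the names Message, Receiver, msg_id and seq are made up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    msg_id: str   # unique id, reused when the sender retries (assumption)
    seq: int      # position assigned by the sender, starting at 0 (assumption)
    payload: str


@dataclass
class Receiver:
    """Drops duplicates and releases messages in sequence order."""
    next_seq: int = 0
    seen_ids: set = field(default_factory=set)
    pending: dict = field(default_factory=dict)    # seq -> Message held back
    delivered: list = field(default_factory=list)

    def receive(self, msg: Message) -> None:
        if msg.msg_id in self.seen_ids:
            return  # a retry of something we already have: ignore it
        self.seen_ids.add(msg.msg_id)
        self.pending[msg.seq] = msg
        # Release every message that is now contiguous with what was delivered.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq).payload)
            self.next_seq += 1


receiver = Receiver()
arrivals = [
    Message("m2", 1, "reserve stock"),
    Message("m1", 0, "create order"),
    Message("m2", 1, "reserve stock"),   # duplicate caused by a retry
    Message("m3", 2, "ship order"),
]
for message in arrivals:
    receiver.receive(message)
print(receiver.delivered)
```

The price is exactly the one mentioned above: "reserve stock" sits in the buffer until "create order" shows up, so ordering is bought with extra latency (and the memory to hold the late-comers).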

What you really need to think about is ACID 2.0. No, I am not talking about the ACID of database transactions but rather about another term I first saw in "Building on Quicksand" (paper (pdf)/ppt) by Pat Helland. In this paper Pat talks about some of the implications of unreliable conditions (such as inconstant latency, failure etc.) on fault tolerance. ACID 2.0 (which apparently was coined by Shel Finkelstein) stands for Associative, Commutative, Idempotent and Distributed, i.e. messages can be processed at least once, anywhere (same machine or across several machines), in any order.
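
Here is a small sketch in Python of what these properties buy you. It is my own toy example, not something from the paper: the state is a set of order ids and a message is folded in with set union, an operation that happens to be associative, commutative and idempotent. The names merge, messages and expected are made up for the illustration.

```python
from functools import reduce
from itertools import permutations


def merge(state: frozenset, message: frozenset) -> frozenset:
    """Fold a message into the state using set union."""
    return state | message


# Each message carries the order ids that one node has seen so far.
messages = [
    frozenset({"order-1"}),
    frozenset({"order-2"}),
    frozenset({"order-1", "order-3"}),
]
expected = frozenset({"order-1", "order-2", "order-3"})

# Every arrival order, with every message delivered twice, ends in the same state.
results = {
    reduce(merge, list(ordering) * 2, frozenset())
    for ordering in permutations(messages)
}
print(results == {expected})
```

Not every business operation can be bent into this shape, of course, but when one can, retries and reordering stop being a correctness problem.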

That's harsh, but I think that if you are building distributed systems today (SOA or otherwise) you can't ignore it.






 
Tags: REST | SOA | Software Architecture

January 16, 2009
@ 07:08 PM
In a post called "Rhino Service Bus: Saga and State" Ayende said
"In a messaging system, a saga orchestrate a set of messages. The main benefit of using a saga is that it allows us to manage the interaction in a stateful manner (easy to think and reason about) while actually working in a distributed and asynchronous environment."

I really don't agree with this definition of a saga. The saga provides a context for a set of messages, which allows managing an effort to reach distributed consensus. It does not "orchestrate" messages (that's what workflows are for). You can read more on sagas in an excerpt from my SOA patterns book:  Saga pattern.

Here's the comment I left on Ayende's site:
"What you describe is nice except it isn't a Saga it is more of a workflow. The notion of Saga which is originated from databases relates to the overall coordination of state between the different services - or the context for the whole business process.
In the coffee shop example you use that would be the whole "transaction" from the point the customer orders her coffee until she either gets it or the transaction is canceled (e.g. it took too long and the customer leaves or the coffee shop is out of milk etc.)
Unlike database (or distributed) transaction when/if a saga is aborted the different component of the system might not return to their previous state e.g. if the customer complains that the coffee is not good and gets her money back. the milk is not separated back from the coffee beans and returned to the bottle - rather the coffee cup goes to the trash.

Workflow is one strategy a service can take to handle the long running interaction within a saga. In your case the BristaSaga class (which I think should be BristaWF) orchestrate the internal state transitions depending on the different messages that arrive within the saga. In your case you have a hardcoded workflow - but it is also possible to use a workflow engine for the job.

By the way, in the above example you could also use a statemachine instead of a WF to manage the process "
In another comment Kristofer asked me:

Arnon: I'm not 100% sure of how you distinguish a Saga from a Workflow, could you elaborate some more on this?

A Saga involves a number of underlying workflows?
A Saga might as well contain a number of underlying Sagas?

Isn't it just a question of at what level it is initiated?

If a Saga should represent the whole transaction / business process, then who should handle it? Couldn't it be implemented as a Saga, exactly as Ayende describes it, by the initiating service (in this case the ordering)?, which then also is given the responsibility to handle restoring the total state etc of underlying/involved services if the transaction is aborted? The possibility to restore state does of course depend on what the specific Saga is handling, some processes might not be able to "rollback" completely, it's rather a question of rolling back all involved parties to a known/acceptable state."

The answer is that, again, a saga is similar to a transaction in the sense that it provides a shared context for an attempt to reach distributed consensus. Unlike a transaction, which ensures the ACID properties, a saga does not.
The concept of distributing that shared context, and having each party (service) affect whether the saga should be aborted or completed successfully, is what I call a saga.
When a saga is aborted, the only thing the coordinator can do is pass the status to the participants. Each of the services is responsible for making its best effort to handle the abort (by rolling back, compensating or whatever).
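
A rough sketch of that division of responsibilities, in Python and using the coffee shop example: the coordinator only passes the "aborted" status on, and each participant decides for itself what best effort means. All the names here (SagaCoordinator, StockService, BaristaService, on_saga_aborted) are invented for the illustration and are not the API of our event broker.

```python
from typing import Dict, List


class StockService:
    """Can roll back: it simply releases what it reserved for the saga."""

    def __init__(self) -> None:
        self.reserved: Dict[str, List[str]] = {}

    def reserve(self, saga_id: str, item: str) -> None:
        self.reserved.setdefault(saga_id, []).append(item)

    def on_saga_aborted(self, saga_id: str) -> None:
        self.reserved.pop(saga_id, None)


class BaristaService:
    """Cannot roll back a brewed coffee, so it compensates instead."""

    def __init__(self) -> None:
        self.compensations: List[str] = []

    def on_saga_aborted(self, saga_id: str) -> None:
        self.compensations.append(f"{saga_id}: discard cup, refund customer")


class SagaCoordinator:
    """Only tracks who takes part in a saga and passes the final status on."""

    def __init__(self) -> None:
        self._participants: Dict[str, list] = {}

    def join(self, saga_id: str, participant) -> None:
        self._participants.setdefault(saga_id, []).append(participant)

    def abort(self, saga_id: str) -> int:
        participants = self._participants.pop(saga_id, [])
        for participant in participants:
            participant.on_saga_aborted(saga_id)  # how to react is the service's call
        return len(participants)


stock, barista, coordinator = StockService(), BaristaService(), SagaCoordinator()
stock.reserve("saga-42", "milk")
coordinator.join("saga-42", stock)
coordinator.join("saga-42", barista)
notified = coordinator.abort("saga-42")
print(notified, stock.reserved, barista.compensations)
```

Note that nothing here restores a global "previous state": the stock service rolls back, the barista compensates, and the coordinator neither knows nor cares which is which.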

Workflow is another thing altogether. A workflow keeps a context between calls and means externalizing the decisions on the logic flow from the business logic (usually with a workflow engine). You can use workflows within a service (a pattern I call Workflodize) or you can use them externally (a pattern I call Orchestrated Choreography, e.g. BPM).
You can use either form of workflow to support the implementation of a saga, but you can also implement sagas without workflows.
In our system we use an "event broker" (see www.rgoarchitects.com/.../EventingInWCF.aspx). The event broker infrastructure distributes the saga context when you raise a saga event. A service that initialized a saga (by sending the first event) can choose to close the saga (commit) or abort it, etc. We don't currently have any workflow-driven services (but some of them use a state machine as an alternative).

(I think the term Saga does not describe Ayende's class, since the "barista" is just one of the participants in the saga; there are other participants.)



 
Tags: SOA | SOA Patterns | Software Architecture

January 12, 2009
@ 08:42 PM
When describing the "known exceptions" to the Knot anti-pattern, I wrote the following:
Starting out on a large project, such as moving an enterprise to SOA, is difficult enough as it is. You can’t figure everything in advance; you need to deliver something – so as Nike says “just do it”. Get something done. You do need to be prepared to let go and redesign further down the road

In a comment to that post, Derrick Gibson wrote:
I have concerns about a "just do it" approach; it belies an assumption that at some point in the future the opportunity will be there to do things a "right way", whereas today time does not permit adherence to this mythical "right way".

One cannot put off til tomorrow that which should be done today. There is no guarantee of any future work to do "enhancements" or "architecture" and there is certainly no guarantee that even if there is a future project, you will be around to work on it. The next team will be starting from scratch and they will be left literally scratching their heads asking, "why did Team Alpha make this decision?"

So, if you make the first assumption that your team has to implement the best architecture it can with the time it has allotted, then will that not lead to other discussions along the way that prevent laying the seeds for this anti-pattern?

For instance, would not the use of a service bus and an approach that says each application makes calls to and receives responses from a service bus, free you from having services that call each other? Now, your services are no longer dependent upon other services or even other back-end data stores, so as new processes are defined and/or new systems are implemented (or others retired), your services remain agnostic to those changes.

This requires your service bus to have the logic which says, "this message needs to be routed here, while that message needs to be routed there." Wouldn't this approach resolve the knot anti-pattern before it ever originates?
The concrete answer to this comment is that a service bus is one of the candidate solutions to solve/circumvent the Knot anti-pattern (as I also mentioned when I described the anti-pattern). The question it raises, however, is how do you know at the outset that the service bus is the right architectural decision for the project?! This question has much wider implications.

In "Who needs an Architect?" (worthwhile reading in itself) Martin Fowler mentions that we can look at architecture as "things that people perceive as hard to change". The conclusion from that is that an architect can do her work better if she doesn't impose these "hard to change things", or does so as late as possible.
My experience is that when you start a "new grounds" project (such as moving an enterprise to SOA) there are a lot of moving parts. What I mean by that is that the uncertainty levels are very high, e.g. the requirements are not set, the understanding of the technology and/or domain is partial, the team is new and whatnot. Making a definitive architectural decision, which is "hard to change", has a lot of effect on how you design your system and/or has substantial costs (in licensing, training, adoption etc.), is not necessarily the right decision. In fact, chances are your initial architectural decision will be flawed.

A phrase I once heard from Ivar Jacobson is "plan to throw one away, you will anyway" (the advice goes back to Fred Brooks' The Mythical Man-Month). This is something I try to take with me, and I defer costly decisions if possible, especially considering that initial releases usually suffer from "time-to-market" constraints. To use a cliche: sometimes you need to go slow to go fast. By the way, this is one place where I don't agree with Uncle Bob, who recently said "When is redesign the right strategy? ... Here's the answer. Never."

Like all guidance, this isn't always true. For instance, if this is your n-th similar project and you already know enough about it to say that architectural pattern X (say service bus) or technology Y (say Hibernate) is good then, yeah, go ahead and use that. You still want to consider the "cost to change" though, since you can still be wrong.



 
Tags: SOA | Software Architecture

January 5, 2009
@ 08:02 PM
We are going to use some of our test code in production. Yes, you read that right: test code in production. Here are the details.
In our system, among other things, we support visual search in video calls, i.e. an end user calls the system, points the camera at something she is interested in, and (hopefully :) ) gets relevant information. Basically the system is made of several resources (image extraction, identification etc.) that collaborate via an event broker. We have a blogjecting watchdog that makes sure everything is up and running, and we have an applicative recovery service to handle failures.
The watchdog makes sure resources/services are up. Resources report their liveliness and wellness, so we know more about the resources than the fact that they are up. However, we still need a way to make sure that resource instances can collaborate to provide the service.

Enter our automated acceptance tests. Part of our development effort included building a test runner for automated test scenarios, e.g. load tests, verifying the correctness of algorithms etc. One of these tests is the smoke test (run after each successful build), which includes a sunny-day scenario of a video call, as described above. What we're going to do now is build, on top of the test runner and the sunny-day scenario, a "keep-alive" tester that will periodically make test calls to the system (depending on the current load etc.) and make sure that everything is still working correctly.
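
The idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not our actual test runner: run_scenario stands for the sunny-day test call, current_load for whatever load figure the system exposes, and report_failure for the hook into the recovery service. All of these names, as well as the 0.8 load threshold, are hypothetical.

```python
import time
from typing import Callable


def keep_alive(
    run_scenario: Callable[[], bool],
    current_load: Callable[[], float],
    report_failure: Callable[[str], None],
    interval_seconds: float,
    max_load: float = 0.8,
    rounds: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run the sunny-day scenario periodically; return how many test calls were made."""
    calls = 0
    for _ in range(rounds):
        if current_load() < max_load:      # do not add test traffic to a busy system
            calls += 1
            try:
                ok = run_scenario()
            except Exception as error:     # a crash in the scenario is also a failure
                ok = False
                report_failure(f"scenario raised {error!r}")
            else:
                if not ok:
                    report_failure("scenario returned an unexpected result")
        sleep(interval_seconds)
    return calls


# Usage with stubs standing in for the real system.
failures = []
loads = iter([0.2, 0.95, 0.3])
outcomes = iter([True, False])
calls = keep_alive(
    run_scenario=lambda: next(outcomes),
    current_load=lambda: next(loads),
    report_failure=failures.append,
    interval_seconds=0,
    sleep=lambda seconds: None,
)
print(calls, failures)
```

Passing the scenario, the load probe and the sleep function in as parameters is what lets the same tester be exercised with stubs here and pointed at the real smoke test in production.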


So there you have it, an unexpected benefit of automated acceptance tests, who would have thunk it :)



 
Tags: .NET | SOA | Software Architecture | TDD | WCF

The year is almost done so I thought it would be a good time for a short retrospective on what I blogged here. The 13 posts below are the ones I liked best this year. Turns out these posts touch on a lot of different subjects: requirements, software management, agile development, architecture, SOA and programming.



 
Tags: Agile | Project Management | SOA | SOA Patterns | Software Architecture | TDD

December 16, 2008
@ 10:36 AM
An initial draft of the Knot anti-pattern. As usual, any comments are welcome. You can also download it in PDF form.

Everything starts oh so well. Embarking on a new SOA initiative, the whole team feels as if it is pure green-field development. We venture on. The first service is designed. Hey look, it’s got all these bells and whistles; we are even using XML so it must be good. Then we design the second service, and it turns out the first service has to talk to the second, and vice versa. Then comes a third; it has to talk to the other two. The fourth service only talks to a couple of the previous ones. The twelfth talks to nine of the others and the fourteenth has to contact them all. Yep, our services are tangling up together into an inflexible, rigid knot.

 

The above might sound to you like a wacky and improbable scenario: why would anyone in their right mind do something like that? Let’s take another look, with a concrete example this time, and see how the road to hell is paved with good intentions. In Figure 10.1 below we see a vanilla ordering scenario. An ordering service sends the order details to a stock service, where the items are identified in the stock, marked for delivery and then sent to a delivery service, which talks to external shipping companies such as DHL, FedEx etc.




Figure 10.1 a vanilla ordering scenario. An ordering service sends the order to a stock service, which provisions the goods to a delivery service which is responsible to send the products to the customer

 

If we think about it more, we’ll see that when an item is missing from the stock we probably have to talk to external suppliers, order the missing items and wait for their arrival, so the whole process is not immediate. Furthermore, since the process takes time, it should be possible to stop it if an order is cancelled. It seems we have two options (see Figure 10.2): either the ordering service asks the two other services to cancel processing related to the order, or the two services call the ordering service before they decide what to do next. Naturally the system wouldn’t stop here; we would want to introduce more services and more connections, e.g. an Accounts Payable service that interacts with the external suppliers, the stock service and the delivery service (since we also need to pay shipping companies) etc.



Figure 10.2 a little more realistic version of the Ordering scenario from figure 10.1. Now we also need to handle missing items in the stock, cancelled orders and paying external suppliers. In this scenario the services get to be more coupled. For instance the Ordering service is now aware of the delivery service and not just the stock service.

 

With each new service we draw more lines going from service to service, and with each new service we update the services’ business logic with the new business rules as well as knowledge of the other services’ contracts.

 

1.1.1 Consequences

Well, so we get more lines going from service to service. That’s normal, isn’t it? After all, if the services don’t talk to each other they won’t be very useful. Isn’t that the whole point of SOA?

 

Well, yes and no. Yes, it is normal for services to connect to each other. After all, creating a system in an SOA is connecting services together. As for the “no” part, the problem lies with the way we develop these integrations: if you are not careful, it is easy to get all the integration lines into a big, ugly mess, a knot.

 

A knot is an anti-pattern where the services are tightly coupled by hardcoded point-to-point integration and context-specific interfaces.

 

For instance, what happens when we want to reuse the ordering service mentioned above? No problem, we just call it from the new context. Alas, the knot prevents us from reusing it without hauling in the rest of the baggage: all the other services we defined above (stock, delivery etc.). If the ordering process in the new context is not identical to what we already have, we can’t use the service. Or rather, we can’t use it without adding one-off interfaces, where we add specific messages for the new context and all sorts of “if” statements to distinguish between the old and the new behavior. Another option is to make this distinction in the original messages, which is either not possible or forces us to make sure the other services still function. In any event it is a big mess.

 

Let’s recap. We moved to SOA to get flexibility, increase reuse/use within our systems and prevent spaghetti point-to-point integration. What we see here is inflexible and hard to maintain; basically it seems like we are back at square one, and we invested gazillions of dollars to get there.

 

 

1.1.2 Causes

How did that happen?  How can a wonderful, open standards, distributed, flexible SOA deteriorate to an unmanageable knot?

 

It is tempting to dismiss the knot as the result of a lack of adequate planning: if only we had planned everything in advance we wouldn’t be in this mess now. Well, besides the point that trying to plan everything ahead of time is an anti-pattern in itself (an organizational anti-pattern, which isn’t in the scope of this book), there’s still a good chance you’d get to a Knot anyway, since the problems are inherent in the way businesses work.

 

If we take a look back at the Integration Spaghetti scenario discussed in chapter 1 (depicted as figure 10.3 below), we can see that the phenomenon was there as well: when our business processes evolve, we find we need to interact with information from other parts of the system. The flow of a business process expands to supply that needed information or service, and thus the Knot grows.



Figure 10.3 the Knot anti-pattern is similar in both effect and origin to the spaghetti integration in non-SOA environments

 

From the technical perspective, we have two forces working here. One is the granularity of the services. On the one hand, services are sized so that a business process requires several of them to work together. On the other hand, they aren’t small enough to be end-nodes in the process (i.e. services that are only called by other services and just return a result). Note that this isn’t a bad thing in itself. After all, if each process were implemented by a single service we’d have silos not unlike the ones we try to escape by using SOA, and if we made the services too small we’d fall into another trap (see the Nanoservices anti-pattern later in this chapter). The bottom line is that while granularity is a force that drives us toward the Knot, there’s not a lot we can do about it without getting ourselves into worse problems.

 

The second, stronger, force that pushes a system into a Knot is the business process itself. Since, as we mentioned above, the process flows through the services, the services need to be aware of the flow and call other services to complete it. In order for a service to call another service it has to know that service’s contract and its endpoint. When another business flow goes through that service, we not only add the new contracts and endpoints but also the contextual knowledge of which other services to call depending on the process. And that, my friends, is exactly the thing that gets us into trouble: the services tie themselves to each other more and more as we implement more business processes and more flows.

 

Hey, you say, but SOA should have solved all that, surely there is something we can do about it – or is there?

 

1.1.3 Refactoring

 

The previous section explains that most of the problem is caused by having the services’ code determine where to go next and what to do with the results of the services’ processing. If there was only a way to somehow pry these decisions away from the services’ greedy hands… As you’ve probably guessed, there is such a way. In fact there are several, and this book lists three of them: the Workflodize pattern (Chapter 2), Orchestrated Choreography (Chapter 7) and Inversion of Communications (Chapter 5). Let’s take a brief look at each of these patterns and see how they help.

 

The Workflodize pattern suggests adding a workflow engine inside the service to handle both Sagas (i.e. long running operations, see chapter 5) and added flexibility. The “added flexibility” is the card we want to play here. When we express the connections as steps in the workflow, they are not part of our services’ business logic. They are also easier to change, in a configuration-like manner. Both of these points are big pluses.

Still, a better way to solve the service-to-service integration problem is to use an external orchestration engine. The idea of using the Orchestrated Choreography pattern is to enable Business Process Management, i.e. a way for the organization to control and verify that its processes are carried out as intended (you don’t need an orchestration engine for that, but it helps…). In the context of solving or avoiding the Knot anti-pattern, Orchestrated Choreography is better than Workflodize since it centralizes and externalizes all the interactions between services and thus effectively removes all the problematic code from the services themselves. Note that there’s a fine line between externalizing flow and externalizing the logic itself (see the discussion in the Orchestrated Choreography pattern, in chapter 7).

 

The third pattern we can use to refactor the Knot is Inversion of Communications. Inversion of Communications means modeling the interactions between services as events rather than calls. It is, in my opinion, the strongest countermeasure to the knot. The two patterns mentioned above bring a lot of flexibility in routing the messages between the services. The Inversion of Communications pattern also helps the message designers remove specific contexts from the messages, since when the service’s status is raised as an event it isn’t addressed to any other service in particular. Note that using Inversion of Communications doesn’t rule out using either of the two other patterns mentioned above: once the event is raised we still need to route it to other services, and using a workflow engine is a good option for that. Another implementation option is to use an infrastructure that supports publish/subscribe (see the pattern’s description in chapter 5 for more details).

 

Let’s go back to the ordering scenario we mentioned above. As I mentioned, the services grow with needless knowledge of specific business processes. So, for instance, the ordering service had to know about both the stock service and the delivery one. Refactored with the Inversion of Communications pattern, the same Ordering service doesn’t have to know about any of the other services. In Figure 10.4 we can now see that the Ordering service sends two business events (new order, cancelled order) and the routing of these messages is no longer the responsibility of the service.



Figure 10.4 the Ordering service using the Inversion of Communications pattern. Now the service doesn’t know/depend on other services directly. It is only aware of the business events of new order and cancelled order, which are relevant to the business function that the service handles.
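
In code, the refactored Ordering service could look something like the following Python sketch. It assumes a tiny in-process EventBus as a stand-in for real publish/subscribe infrastructure; the class names, the event names ("NewOrder", "CancelledOrder") and the lambda subscribers are all made up for the illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class EventBus:
    """Tiny in-process publish/subscribe stand-in for real messaging infrastructure."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers[event_type].append(handler)

    def raise_event(self, event_type: str, event: Event) -> None:
        for handler in list(self._subscribers[event_type]):
            handler(event)


class OrderingService:
    """Knows only the bus and its own business events, not who consumes them."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus

    def place_order(self, order_id: str) -> None:
        self._bus.raise_event("NewOrder", {"order_id": order_id})

    def cancel_order(self, order_id: str) -> None:
        self._bus.raise_event("CancelledOrder", {"order_id": order_id})


bus = EventBus()
reserved, shipments = set(), []
# Routing lives outside the services: the subscriptions are the wiring.
bus.subscribe("NewOrder", lambda e: reserved.add(e["order_id"]))       # stock
bus.subscribe("NewOrder", lambda e: shipments.append(e["order_id"]))   # delivery
bus.subscribe("CancelledOrder", lambda e: reserved.discard(e["order_id"]))

ordering = OrderingService(bus)
ordering.place_order("o-1")
ordering.place_order("o-2")
ordering.cancel_order("o-1")
print(sorted(reserved), shipments)
```

Adding an Accounts Payable subscriber later means one more subscribe call in the wiring; the OrderingService class does not change.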

 

Refactorings aside, one question we still need to think about is whether there are any circumstances where having a Knot is acceptable.

 

1.1.4 Known Exceptions

 

In a sense the Knot is a distributed version of an anti-pattern described by Brian Foote and Joseph Yoder as “Big Ball of Mud”: spaghetti code where different parts of the system are tied to each other in unmanageable ways. I mention the connection because the reason that “Big Ball of Mud” might be considered a pattern rather than an anti-pattern also applies here:

 

“[when] you need to deliver quality software on time on budget… focus first on features and functionality, then focus on architecture and performance”

 

Starting out on a large project, such as moving an enterprise to SOA, is difficult enough as it is. You can’t figure everything out in advance; you need to deliver something, so as Nike says, “just do it”. Get something done. You do need to be prepared to let go and redesign further down the road. In the current system I’m working on, a visual recognition/search engine for mobile, we went with a “knot” approach for the first release. The simplicity of the implementation (less investment in infrastructure, ad hoc integration etc.) enabled us to deliver a first working version in less than 6 months. These 6 months also helped us understand the domain we are operating in much better and, more importantly, get to market with the features the business needed on the schedule the business wanted. We spent the next 6 months rewriting the system in a proper way, including applying the Inversion of Communications pattern mentioned above.

 

To sum this up, coding the integration code into services is likely to end up as a Knot. It is acceptable to go down this path for a prototype or first version, i.e. to show quick results. However, you do need to plan/make the time to refactor the solution so you will not get stuck down the road.





 
Tags: SOA | SOA Patterns | Software Architecture

December 8, 2008
@ 10:56 PM
I am (finally) writing some new stuff for my SOA book, working on a few anti-patterns:
  • The Knot - the distributed version of "big ball of mud"; basically point-to-point integration
  • NanoServices - designing/building fine-grained services (methods != services)
  • 3-tiered SOA - dressing up a 3-tier architecture in SOA clothing (e.g. database as a service)
  • Whitebox Services - exposing internal structure; comes in two flavors: exposing technology, and allowing access that doesn't go through contracts
  • Transactional Integration - inter-service transactions (use Sagas instead)
  • RESToid - combining SOA and REST without understanding the full implications of either
I am going to publish one of them (probably the "knot") in a few days but I thought I might be able to get a little feedback before that. I chose to describe anti-patterns in the following format:

  •  Context - Presenting the problem (probably through an example)
  •  Consequences - Explaining what the problem is. i.e. what happens when the anti-pattern is prevalent
  •  Causes - discussion on the forces that lead to the anti-pattern
  •  Refactoring - The patterns (and/or other tips) that can be used to fix the design
  •  Known Exceptions - Are there any contexts where using the anti-pattern is acceptable
I'd be happy to hear any comments you have on the anti-patterns listed above, as well as comments on the structure for describing them.

Thanks
Arnon


 
Tags: REST | SOA | SOA Patterns | Software Architecture

I got a question from Dru asking for my opinion of two messaging subscription modes: subscription by message (type) and subscription by topic.
The way I see it, there are two different usages for topics.

The first use for topics is for grouping or marking related messages. In this scenario you can actually break the subscription into three different levels of  generalization:
  1. Per message- interested parties subscribe to a specific type
  2. Topics - interested parties subscribe to a set of related types
  3. Topics hierarchy - interested parties subscribe to a set of sets
Here, when it comes to topics, on the pro side you get to easily subscribe to a lot of message types and on the con side you get to easily subscribe to a lot of message types...
The less specific the subscription, the harder it is to ensure it will work in open environments (i.e. when different organizations or groups get to integrate with your services). The problem lies in the number of different messages you need to be able to handle/understand/parse, and in the control over new types of messages. Getting versioning right with messages is hard enough; when you have a hierarchy, well, that's just much harder.
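
The three levels can be sketched with dotted names, where "sales.orders.created" is a message type, "sales.orders" a topic and "sales" the root of a hierarchy. This is a toy Python illustration with made-up names (Subscriptions, the dotted naming scheme), not the API of any particular messaging product.

```python
from typing import Callable, Dict, List

Handler = Callable[[str, dict], None]


class Subscriptions:
    """Delivers a message to subscribers of its type and of every enclosing topic."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def subscribe(self, path: str, handler: Handler) -> None:
        self._handlers.setdefault(path, []).append(handler)

    def publish(self, message_type: str, body: dict) -> int:
        delivered = 0
        parts = message_type.split(".")
        # "sales.orders.created" matches "sales", "sales.orders" and itself.
        for depth in range(1, len(parts) + 1):
            for handler in self._handlers.get(".".join(parts[:depth]), []):
                handler(message_type, body)
                delivered += 1
        return delivered


subs = Subscriptions()
received = {"per_message": [], "topic": [], "hierarchy": []}
subs.subscribe("sales.orders.created", lambda t, b: received["per_message"].append(t))
subs.subscribe("sales.orders", lambda t, b: received["topic"].append(t))
subs.subscribe("sales", lambda t, b: received["hierarchy"].append(t))

for message_type in ("sales.orders.created", "sales.orders.cancelled", "sales.invoices.issued"):
    subs.publish(message_type, {})
print(received)
```

The hierarchy subscriber ends up with three message types it has to understand after a single subscribe call, and it will silently get a fourth the day someone adds one under "sales". That is the pro and the con in one picture.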

The second use for topics is routing. In this scenario a specific message type can be sent using different topics, and the topics basically become part of the meta-data of the message. The supporting infrastructure can then use that meta-data to get messages to different subscribers. For example, in a defense system project I participated in, we used TIBCO Rendezvous' support for topics to define interest regions on a closed set of messages, e.g. say you want only the messages related to the Middle East, or the ones related to the US, etc.
In the current infrastructure I am building I am going to implement something similar to topics (albeit without hierarchies) to allow different routings based on different saga types (so services that stay the same don't have to change their behaviours).
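
A sketch of this second use, again in Python and again with invented names (Envelope, Router, the "region.*" topics): the message type stays the same and only the topic, carried as meta-data next to the payload, decides who gets it. It makes no claim about how TIBCO Rendezvous or my own infrastructure implement this.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Envelope:
    message_type: str   # what the message is
    topic: str          # routing meta-data, independent of the type
    body: str


class Router:
    """Routes on the topic only; it never looks at the message type."""

    def __init__(self) -> None:
        self._routes: Dict[str, List[Callable[[Envelope], None]]] = {}

    def subscribe(self, topic: str, handler: Callable[[Envelope], None]) -> None:
        self._routes.setdefault(topic, []).append(handler)

    def send(self, envelope: Envelope) -> int:
        handlers = self._routes.get(envelope.topic, [])
        for handler in handlers:
            handler(envelope)
        return len(handlers)


router = Router()
middle_east_desk, us_desk = [], []
router.subscribe("region.middle-east", middle_east_desk.append)
router.subscribe("region.us", us_desk.append)

# The same message type travels under two different topics.
router.send(Envelope("TrackUpdate", "region.middle-east", "track 17 moved"))
router.send(Envelope("TrackUpdate", "region.us", "track 99 moved"))
print([e.body for e in middle_east_desk], [e.body for e in us_desk])
```

Because the set of message types is closed, every subscriber already knows how to parse what it receives; the topic only narrows down which instances it sees.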

To sum up, I would say that in my opinion the latter use for topics (routing) is more useful for general-purpose use and the first use (grouping) is more useful in closed systems.

P.S.
If you have an interesting question on SOA or architecture you can send it in, and if I think it would interest a wider crowd I'll blog it here.
 

Tags: ESB | SOA | Software Architecture | Q&A

October 18, 2008
@ 10:58 PM
As I mentioned in a couple of previous posts (like "Using REST along with other architectural ), I've been spending the last few weeks writing an Event system over WCF (probably also explains posts on  WCF gotchas like this;) ). Being a communication infrastructure it is still a long way from being completed, but it seems to be stabilizing and I think it turned out nicely so I thought I'd share a few details.

Let's start with the simple part - the usage.
The eventing is built on the idea of a bus (i.e. no centralized components), and the resources/services that want to use eventing have to use a library which I call EventBroker. There are two modes for using the EventBroker. One is "regular" events, which are contextless. This means that consecutive events can reach different services, and there is no context that flows from event to event:

bool raisedEvent = eb.RaiseEvent<SampleEvent>(new SampleEvent());
The second type of events are sagas, which represent long-running interactions. Sagas do have a "best effort" guarantee to reach the same recipients over consecutive calls. You can also End a saga (successful termination), Force End a saga (successful termination by a service that didn't initiate the saga) and Abort a saga (unsuccessful termination). Here is how you raise a saga event:
var evnt = new SampleEvent { data = somevalue};
var SagaId = Guid.NewGuid();
eb.RaiseSagaEvent<SampleEvent>(SagaId, evnt);
If you use the same Saga Id, the events are handled as part of the same saga; if you use a Saga Id that wasn't previously defined, it will initialize a new saga.
The EventBroker translates events to the relevant contract and dispatches them to the different subscribers. Which brings us to the next part, which I guess is also a little more interesting: how subscriptions are defined.

The first thing to do is to define the event itself.
    public class SampleClassEvent : ImEvent
    {
        // Plain data carried by the event
        public string DataMember1 { get; set; }
        public int DataMember2 { get; set; }
    }
There aren't any real constraints on the event, except that it has to "implement" the ImEvent interface, which is really an empty interface that marks the event as one for the event broker.
Then you have to define an interface for handling the event. The event broker builds on the idea of convention over configuration (an idea popularized by the Rails framework), so it is easy to generate the interface (something I do with a ReSharper template):
    [ServiceContract]
    public interface IHandleSampleClass
    {
        [OperationContract]
        int SampleClass(SampleClassEvent eventOccured);
    }
The convention is that the interface has an IHandle prefix followed by the name of the event. It holds a single operation named like the event (without the Event suffix), which receives a single parameter: the event data. Currently events return a value (int), but I am thinking about changing it to void and marking everything as OneWay for added performance.
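To make the naming convention concrete, here is a minimal sketch that derives the expected interface and operation names from an event type and checks a handler interface against them. `HandlerConvention` and its methods are hypothetical helpers written for this illustration; the WCF attributes are left out so the snippet compiles without a System.ServiceModel reference.

```csharp
using System;
using System.Reflection;

public interface ImEvent { }

public class SampleClassEvent : ImEvent
{
    public string DataMember1 { get; set; }
    public int DataMember2 { get; set; }
}

public interface IHandleSampleClass
{
    int SampleClass(SampleClassEvent eventOccured);
}

// Hypothetical helper that encodes the convention described above.
public static class HandlerConvention
{
    private const string EventSuffix = "Event";

    // "SampleClassEvent" -> "SampleClass"
    public static string OperationName(Type eventType)
    {
        string name = eventType.Name;
        return name.EndsWith(EventSuffix, StringComparison.Ordinal)
            ? name.Substring(0, name.Length - EventSuffix.Length)
            : name;
    }

    // "SampleClassEvent" -> "IHandleSampleClass"
    public static string InterfaceName(Type eventType)
    {
        return "IHandle" + OperationName(eventType);
    }

    // True when the interface has the expected name and exactly one
    // operation, named after the event, taking the event as its only parameter.
    public static bool Follows(Type handlerInterface, Type eventType)
    {
        if (!handlerInterface.IsInterface) return false;
        if (handlerInterface.Name != InterfaceName(eventType)) return false;

        MethodInfo[] methods = handlerInterface.GetMethods();
        if (methods.Length != 1 || methods[0].Name != OperationName(eventType)) return false;

        ParameterInfo[] parameters = methods[0].GetParameters();
        return parameters.Length == 1 && parameters[0].ParameterType == eventType;
    }
}
```

A check like this could run at service start-up to catch handlers that drift from the convention, which is the usual price of preferring convention to configuration.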

Now, when we create a service which needs to handle events, it does that by specifying which events it handles. E.g.
[ServiceContract]
public interface ImSampelResource : ImContract, IHandleSampleClass, IHandleSomeOtherThing
{
}
So each contract declares all its subscriptions (as a list of IHandleXXX interfaces). It should also include the ImContract interface, which holds all the service operations used by the event broker (e.g. ending sagas etc.).
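Since a contract lists its subscriptions as inherited IHandleXXX interfaces, the subscribed event types can be recovered with reflection. The sketch below shows one way this might work; `SubscriptionScanner` is a hypothetical name, `SomeOtherThingEvent` is an assumed event behind IHandleSomeOtherThing, and the WCF attributes are omitted to keep the snippet self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface ImEvent { }
public interface ImContract { }   // broker operations (ending sagas etc.) omitted

public class SampleClassEvent : ImEvent { }
public class SomeOtherThingEvent : ImEvent { }   // assumed for illustration

public interface IHandleSampleClass { int SampleClass(SampleClassEvent eventOccured); }
public interface IHandleSomeOtherThing { int SomeOtherThing(SomeOtherThingEvent eventOccured); }

public interface ImSampelResource : ImContract, IHandleSampleClass, IHandleSomeOtherThing { }

// Hypothetical helper: finds the events a contract subscribes to.
public static class SubscriptionScanner
{
    public static IList<Type> SubscribedEvents(Type contract)
    {
        return contract.GetInterfaces()
            // keep only the IHandleXXX interfaces the contract inherits
            .Where(i => i.Name.StartsWith("IHandle", StringComparison.Ordinal))
            .SelectMany(i => i.GetMethods())
            .Select(m => m.GetParameters())
            // by convention each operation takes exactly one ImEvent parameter
            .Where(p => p.Length == 1 && typeof(ImEvent).IsAssignableFrom(p[0].ParameterType))
            .Select(p => p[0].ParameterType)
            .ToList();
    }
}
```

The order of the returned types is not guaranteed, so callers should treat the result as a set.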
Services that want to raise events should inherit from a ControlEdge class (base class Edge component that delegates control events to the event broker)

There's still the question of how the event broker knows where to find other services. There are several ways this can be done (e.g. a service repository), but since we have blogjecting watchdogs in place anyway, we use them to propagate the liveliness (and location) of services.

This sums up this post. It is basically just a little context for several planned posts where I hope to talk about some of the challenges, alternatives and design decisions that led me to the current design. Meanwhile, I'd also be happy to hear any comments, ideas or reactions you may have.
 
Tags: .NET | Design | OO | SOA | SOA Patterns | Software Architecture | WCF | xsights

Following my latest post on evolving the architecture Dru asked me for more details on our RESTful control channels.
For one you can take a look at slide 25 of my presentation on REST which talks about the Sessions resource. The session resource returns an AtomPub feed of the current active sessions and then if you follow a link to a session you get the current status, the URIs of the participating resources etc.
I guess the more interesting questions (especially in light of the ongoing REST debate we now see) are:
  1. Why rely on REST for the control channel
  2. Why not use REST for the whole system
So, why is REST a good option for the control channel?

  • The REST architectural style in general, and REST implemented using web standards (HTTP, AtomPub etc.) in particular, brings a lot of benefits in integration (what is easy for humans to understand is easier to implement).
  • Another reason for REST (over HTTP) is standardization across languages and platforms. Every language and platform I've used has an implementation that allows sending and receiving HTTP messages. We have a few components running on Linux and components running on Windows, and we're planning even more heterogeneity down the road.
  • Lastly, REST allows for easy debugging and run-time interaction. This proved invaluable during system integration tests, where we could easily understand the current state of each of the components in the system as well as the general picture.
Ok, if everything is so good, why not use REST for the whole system? Well, because like any architecture or architectural style (especially, when incarnated in a technology), REST has things that it does well and things that it doesn't (personally, I don't buy the Only Good Thing(tm) for anything or as Brooks puts it there's no silver bullet).
Let's look at message exchange patterns, for instance. REST over HTTP supports the request/reply pattern.
This works extremely well in many business situations. For instance, if we have an Order service (or resource for that matter) and we need to calculate the discount for a specific customer, we can go to the Customer service, get her current status and check if she is a VIP customer, a senior citizen etc.
There are, however, places where it doesn't work as smoothly. Returning to our Order, let's consider what happens once the order is finalized and we need to both start handling it (notify the warehouse?) and invoice it.
The Order service does not care about these notifications; they aren't its business.
My favorite way to solve this is to introduce business events (incorporate Event Driven Architecture) so that the interested parties get notified. Another common way to solve this is to introduce some external entity to choreograph or orchestrate it (BPM etc.). Both options have different constraints and needs compared with REST. In my organization we have a lot of processes that lend themselves to event processing much better than they do to REST over HTTP (though the implementation might end up aligned with the REST architectural style - I am not sure yet).
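The business-event option can be sketched with a tiny in-process publish/subscribe bus: the Order side raises an event and does not know who listens. This is a simplified, single-process illustration only; `InProcessEventBus` and `OrderFinalized` are hypothetical names, and a real deployment would dispatch across service boundaries.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical business event raised when an order is finalized.
public sealed class OrderFinalized
{
    public string OrderId { get; set; }
}

// Minimal in-process bus, for illustration only.
public sealed class InProcessEventBus
{
    private readonly Dictionary<Type, List<Action<object>>> _subscribers =
        new Dictionary<Type, List<Action<object>>>();

    public void Subscribe<TEvent>(Action<TEvent> handler)
    {
        List<Action<object>> handlers;
        if (!_subscribers.TryGetValue(typeof(TEvent), out handlers))
        {
            handlers = new List<Action<object>>();
            _subscribers[typeof(TEvent)] = handlers;
        }
        handlers.Add(e => handler((TEvent)e));
    }

    // Returns the number of subscribers that were notified.
    public int Publish<TEvent>(TEvent evnt)
    {
        List<Action<object>> handlers;
        if (!_subscribers.TryGetValue(typeof(TEvent), out handlers)) return 0;
        foreach (Action<object> handler in handlers) handler(evnt);
        return handlers.Count;
    }
}
```

With this shape, the warehouse and invoicing parties each call `Subscribe<OrderFinalized>(...)`, and the Order side only calls `Publish(new OrderFinalized { OrderId = "42" })`. Adding a third interested party requires no change to the publisher.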

Another reason not to use REST is when you have to integrate with stuff that isn't RESTful. For instance, we need to integrate with systems that use RTP and other such protocols, so we are bound to that - and we are a startup with "green field" development. In an established enterprise the situation is much more complicated.

To sum up, in my opinion, when you take a holistic view of a complete business you are bound to see places where different architectural principles are a good fit. Architectural styles (and architectural patterns) are tools you can use to solve the challenges. There are places where a hammer is a great fit, but it is also wise to make sure the toolset has more than just a hammer.

PS

It isn't that you can't do events with REST over HTTP. E.g. you can implement the events as an Atom feed and have the "subscribers" check this feed every once in a while (the way this blog works). A subscriber can even check the HTTP headers before getting the whole feed. Still, push is a more natural implementation for this, for various reasons: you don't have to know where to find the event source, you can more easily improve latency (when needed) etc.
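One way a polling subscriber can avoid downloading an unchanged feed is an HTTP conditional GET, where the server answers 304 Not Modified when nothing is new. The sketch below assumes the feed server supports ETag and/or Last-Modified validators; `FeedPoller` and its method names are hypothetical.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Hypothetical polling helper for an event feed.
public static class FeedPoller
{
    // Builds a GET that asks the server to answer 304 if the feed is unchanged.
    // lastETag must be a quoted entity tag, e.g. "\"abc123\"".
    public static HttpRequestMessage BuildConditionalGet(
        Uri feedUri, string lastETag, DateTimeOffset? lastModified)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, feedUri);
        if (!string.IsNullOrEmpty(lastETag))
        {
            request.Headers.IfNoneMatch.Add(EntityTagHeaderValue.Parse(lastETag));
        }
        if (lastModified.HasValue)
        {
            request.Headers.IfModifiedSince = lastModified;
        }
        return request;
    }

    // Returns the feed body, or null when the server reports no change.
    public static async Task<string> FetchIfChangedAsync(
        HttpClient client, HttpRequestMessage request)
    {
        using (HttpResponseMessage response = await client.SendAsync(request))
        {
            if (response.StatusCode == HttpStatusCode.NotModified) return null;
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

Even with this optimization the subscriber still has to know the feed's URI and pick a polling interval, which is the latency trade-off mentioned above.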

 
Tags: REST | SOA | Software Architecture

Microsoft recently released SP1 for .NET. While the SP brings some nice stuff, it seems it also has some bugs and a few less than inspiring components.
Another example of a less than stellar idea is the "ADO.NET data services" component. Before I go on to explain why I think that, I should probably mention that this isn't just a Microsoft thing, as IBM also mentions similar ideas as part of their (broader and sometimes even worse) view of "Information as a Service".

So why is exposing the database through a web service (RESTful or otherwise) wrong? Let me count the ways:
  1. It circumvents the whole idea about "Services" - there's no business logic
  2. It makes for CRUD resources/services
  3. It is exposing internal database structure or data rather than a thought-out contract
  4. It encourages bypassing real services and going straight to their data
  5. It creates a blob service (the data source)
  6. It encourages minuscule demi-services (the multiple "interfaces" of said blob) that disregard a few of the fallacies of distributed computing
  7. It is just client-server in sheep's clothing
When it comes to ADO.NET data services you can add a few other problems, like:
  1. It isn't really RESTful - you can also "enhance" the services with operations, like example 18 in "Using ADO.NET data services": http://host/vdir/northwind.svc/CustomersByCity?city=London (though it does support caching and hypermedia)
  2. It doesn't really externalize a state machine; it externalizes a relational model
  3. It is built on Entity Services.

 
Tags: .NET | data | REST | SOA

Retrospectives: every "agile" team does retrospectives. What are retrospectives anyway?

A retrospective is a meeting where the team inspects the past in order to adapt and improve the future.

Agile or not, our team does a retrospective at the end of each iteration (every two weeks in our case). We try to look at what worked, what didn't, how we are meeting our goals, how the product is going etc. These meetings provide a lot of value in steering us in the right direction.
Ongoing retrospectives that look at the near past allow for suppleness and adaptation to change, and they are very powerful at that. However, it is sometimes worthwhile to reflect over longer periods of time.

One area where longer perspective is important is the architecture of the project. Evolving an architecture you run the risk of accepting wrong decisions - mostly because architectural decisions have long term implications, while YAGNI, time constraints and life in general drive you toward short term gains.

Again, taking an example from my current project, working towards the first release, we took a few major decisions during the development e.g.
  • federated resource management - Taking into consideration the fallacies of distributed computing we decided that we'd have local resource managers that will take care of resource utilization and allocation. The resource managers will have a hierarchy where they'd communicate with each other to gain the "bigger picture"
  • Introduce Parallel Pipelines - handle image understanding by dividing the work between specialized components.
  • RESTful control channel - to use a "lingua franca" between all component types so that we can easily integrate across platforms and languages
  • local failure handling - resources and components handle failure by themselves
  • Communication technology (WCF in our case) is isolated from the business logic by an Edge Component
  • etc.
Once we finished delivering the first release, we took a few "days off" to consider what we'd done thus far. We updated our quality attribute list based on what we learned working with the system and looking at some customer scenarios, studied the things we liked and didn't like in the design and architecture of the working system, and revised a few of our decisions. For instance:
  • We found that, rushing to a working system, we introduced some excess coupling to a specific technological solution (for video rendering). We initiated a few proofs of concept and found out how to both isolate the technology from the rest of the system and allow more technology choices.
  • We found that some of the data flows were not as clean as we thought they'd be - adding new features caused more resource interactions than we expected when we partitioned the resources. We redefined some of the resource roles to get less message clutter (and higher cohesion).
  • The federated resource management works well, but introduces needless latency in session initiation. We have now opted to introduce "Active Services", which are more autonomous.
  • Add a blogjecting Watchdog in addition to local failure handling, to both increase the chances of failure identification and recovery and get a better picture in a centralized Service Monitor.
  • The RESTful control channel worked well and will continue in later releases.
  • Some of the scale issues will be handled by introducing "Virtual Endpoints", while others will continue to use autonomous endpoint creation and liveliness dissemination (hopefully learning from the mistakes of others).
  • etc.
The result of these and the other decisions we've made is a rework plan that will (hopefully anyway) make our overall solution better.
What we see is that we evolved our architecture as we went forward. While all the decisions we made seemed right at the time we took them, only by reviewing them from a wider perspective (an architecture retrospective) did we identify the decisions that we need to change and the ones that we have to enhance. The insights you gain after working on a project for a while are much better than the initial thoughts you have or the understanding you muster in the initial iterations.
I think it is essential to review the architecture once you've gained more experience with the realities of the system you write (vs. the perceived realities you have at the get-go).

By the way, if you work with a waterfall approach your situation is worse, since in this case you take your decisions before you write any code, so you don't even have the benefit of POCs and working code to enhance your insights.


PS
If you have the MEAP version of SOA Patterns you can read more on the patterns I've mentioned here: Active Service in chapter 2, blogjecting Watchdog in chapter 3, Service Monitor in chapter 4, Parallel Pipelines in chapter 3, Edge Component in chapter 2.


 
Tags: Agile | Project Management | REST | SOA | SOA Patterns | Software Architecture

DZone recently published an interview with me on my SOA Patterns book. Along with the interview you can also download chapter 2 of the book (I think you need to be a DZone member to actually download it).

Chapter 2 includes  the Edge Component , Service Host , Active Service , Transactional Service and the Workflodize patterns. Additional downloads related to the book include
Lastly, you can download the first version of chapter 1, which I mention in the interview, and the slides of a presentation on a few of the patterns from Dr. Dobb's Architecture and Design World last year.


 
Tags: SOA | SOA Patterns | Software Architecture

July 24, 2008
@ 09:49 AM
Every Thursday we have this "happy hour", you know, beers, snacks etc. Every other week or so we also try to make it educational and, after socializing for a while, hear a presentation or a webcast.

I used this week's slot to present the REST architectural style. I think the presentation turned out pretty well, so I thought I'd share it online (note it is a 6M ppt).


 
Tags: REST | SOA | Software Architecture

July 12, 2008
@ 10:30 PM
My friend Gunnar Peterson asked about my opinion on SOA and security concerns. Here's what I wrote him:

In a paper I wrote a couple of years ago I examined the relevance of the “fallacies of distributed computing” defined by Peter Deutsch almost 20 years ago. Writing about the “Network is Secure” fallacy, I wrote that after all these years you would think that the fact you cannot assume the network is secure would be a no-brainer. Alas, it still happens all the time - and that's for "regular" distributed systems.

In my opinion, assuming the network is secure for an SOA is not only naïve but negligent, pure and simple. The whole premise of moving an organization to SOA is connectedness and integration. So, unless your SOA fails, it will be connected to other systems. Whether you are building RESTful systems, WS-* SOAs, EDAs or any combination of these architectural styles, if you don't treat the service boundary as a border and secure it – you will be sorry…

Security in SOA should be considered at the "grand-scheme" level, with issues like authentication and authorization, but also at the single service level, looking at issues like DDoS, SQL injection, elevation of privilege and whatnot. A trivial thing like exposing a transaction beyond service boundaries can translate to an attacker denying services in your system simply by locking out your database. Again, this is just a simple example.

The other thing about security is that you have to consider it early. Patching security in "later on" can have devastating effects on a system's capabilities, especially in areas related to performance. I have seen even military systems that had to go through serious rework just because security was added as an afterthought instead of being handled early on.


 
Tags: SOA | Software Architecture

I guess the designers of WCF really want to discourage some of the uses of the framework - I can't really understand some of their choices, if that was not the case.

For instance, when you create a stateful service (InstanceContextMode = InstanceContextMode.Single) the default concurrency behavior is single threaded. In this mode, WCF will serialize all the calls to the service, and messages will wait or time out. While it is easier to program, this has no real-life use except maybe for demo applications in TechEd presentations.
Luckily you can override that by setting ConcurrencyMode = ConcurrencyMode.Multiple and get a multithreaded service, but the default is useless at best. By the way, beware of ConcurrencyMode.Reentrant: in this setting you still have a single-threaded service, but WCF can accept other calls while you call out to other services, so you need to take care of multithreading issues without getting the benefits.
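As a sketch of the override described above, the following singleton service opts into ConcurrencyMode.Multiple through the ServiceBehavior attribute. `ICounter` and `CounterService` are made-up names for illustration; the attribute and enum values are the standard System.ServiceModel ones (the snippet needs a reference to that assembly).

```csharp
using System.ServiceModel;
using System.Threading;

[ServiceContract]
public interface ICounter
{
    [OperationContract]
    int Increment();
}

// One shared instance with calls dispatched concurrently. WCF no longer
// serializes the calls, so the service must protect its own state.
[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single,
                 ConcurrencyMode = ConcurrencyMode.Multiple)]
public class CounterService : ICounter
{
    private int _count;

    public int Increment()
    {
        // Interlocked keeps the shared counter consistent across threads.
        return Interlocked.Increment(ref _count);
    }
}
```

The trade-off is the usual one: once WCF stops serializing calls, every piece of shared state in the singleton needs its own synchronization.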

Another example, which is even worse, is the default for the maximum number of connections for self-hosted services. This is limited to 10, yes, 10 concurrent connections. We found that out when we set up a service that had, lo and behold, 11 different services interacting with it. These services would call the service something like 10 times a second, and occasionally we got timeout exceptions. At first we figured we got something wrong with the multithreading implementation, so we spent a couple of days going over the locks and releases and whatnot. Then we thought the problem was with the transport (net.tcp), so we changed that to http and still saw the same problems. Only then did we figure out that, as I mentioned above, the default is 10 concurrent sessions.
To solve this problem you need to configure the throttling behavior of the service by using ServiceThrottlingBehavior. This class has three useful settings:

The MaxConcurrentCalls property limits the number of messages that currently process across a ServiceHost.

The MaxConcurrentInstances property limits the number of InstanceContext objects that execute at one time across a ServiceHost.

The MaxConcurrentSessions property limits the number of sessions a ServiceHost object can accept.


The default for MaxConcurrentCalls is 16, for MaxConcurrentInstances it is Int32.MaxValue and for MaxConcurrentSessions it is 10.
If you're using a self hosted service bump these up or you might DOS yourself like we did :)
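For a self-hosted service, the three limits can be raised in code before the host is opened. The sketch below uses the standard ServiceThrottlingBehavior class; `ThrottleSetup`, `IPing`, `PingService`, the address and the numbers in the usage comment are assumptions for illustration, and the right limits depend on your load.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Description;

[ServiceContract]
public interface IPing
{
    [OperationContract]
    string Ping();
}

public class PingService : IPing
{
    public string Ping() { return "pong"; }
}

// Hypothetical helper; must be called before host.Open().
public static class ThrottleSetup
{
    public static ServiceThrottlingBehavior Apply(
        ServiceHostBase host, int maxCalls, int maxSessions, int maxInstances)
    {
        // Reuse a throttle that came from configuration, if any.
        ServiceThrottlingBehavior throttle =
            host.Description.Behaviors.Find<ServiceThrottlingBehavior>();
        if (throttle == null)
        {
            throttle = new ServiceThrottlingBehavior();
            host.Description.Behaviors.Add(throttle);
        }
        throttle.MaxConcurrentCalls = maxCalls;
        throttle.MaxConcurrentSessions = maxSessions;
        throttle.MaxConcurrentInstances = maxInstances;
        return throttle;
    }
}

// Usage (illustrative values and address):
//   var host = new ServiceHost(typeof(PingService),
//                              new Uri("net.tcp://localhost:9000/ping"));
//   ThrottleSetup.Apply(host, 64, 200, 200);
//   host.Open();
```

The same settings can also be supplied through the serviceThrottling element of a service behavior in the configuration file, which avoids recompiling when the numbers need tuning.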

Anyway, these defaults are a real barrier to scale and performance. Sure, you can change them easily, but you first have to know about them, and that's the problem. Hopefully, my wasted time will help you avoid these problems :)


 
Tags: .NET | SOA

[It has been a little rough last week between a looming milestone @ work and my son fracturing his elbow @ home but hopefully I'll be back to the regular schedule this week]

Stateless services are da bomb, right? They are easy to scale (since they have no state you can deploy as many as you like), they are easy to reuse (no state - no baggage) and whatnot.
The only problem with that is that the state doesn't really go away. Stateless services just suffer from NIMBYism ("Not in my back yard") when it comes to state. A stateless service needs to be stateful when it performs its action, and since the state is not there, it has to get it from somewhere.

There are basically two approaches to getting the state into the stateless service
The common way is to make the state someone else's problem (usually that would spell a database). With this approach the stateless service performs queries (database or otherwise) to get the state from the 3rd party. This is problematic in many ways, e.g.
  • You need to pay network tax for getting the state (remember the fallacies of distributed computing...)
  • If that someone else is a single source (such as a database) it can easily become a barrier to scalability (I wrote about the RDBMS problem in "The RDBMS is dead"). If it isn't a single source you need to go to multiple sources, so the network problem multiplies
  • You need to pay network tax for putting the state back in the state repository
The other way to get the state is to put the state on the message - the "document" approach. This approach is superior to the previous one, as you get to piggyback the data on the request. This is a good example of stateless communications*, which, as a side effect, can save the stateless service the problems mentioned above.
The "state on the message" approach works when the handling of messages is serialized, i.e. only one "station" in the flow can make changes to the state at any one time. Unfortunately this only works for a subset of the interactions you can have. In most cases multiple consumers need to get to the same data or coordinate.
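The "state on the message" approach can be sketched as a document that carries everything the stations need, handed from one stateless station to the next. All the names here (`OrderDocument`, `Stations`, `DocumentFlow`) and the 10% VIP discount are assumptions made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

// The message carries the state, so the stations can stay stateless.
public sealed class OrderDocument
{
    public string OrderId;
    public bool CustomerIsVip;
    public decimal Subtotal;
    public decimal Discount;
    public decimal Total;
    public readonly List<string> Visited = new List<string>();
}

// Hypothetical stateless stations: everything they need is on the document.
public static class Stations
{
    public static OrderDocument ApplyDiscount(OrderDocument doc)
    {
        // Illustrative rule: 10% off for VIP customers.
        doc.Discount = doc.CustomerIsVip ? doc.Subtotal * 0.10m : 0m;
        doc.Visited.Add("discount");
        return doc;
    }

    public static OrderDocument ComputeTotal(OrderDocument doc)
    {
        doc.Total = doc.Subtotal - doc.Discount;
        doc.Visited.Add("total");
        return doc;
    }
}

public static class DocumentFlow
{
    // Serialized handling: only one station touches the document at a time.
    public static OrderDocument Run(
        OrderDocument doc, params Func<OrderDocument, OrderDocument>[] stations)
    {
        foreach (Func<OrderDocument, OrderDocument> station in stations)
        {
            doc = station(doc);
        }
        return doc;
    }
}
```

Calling `DocumentFlow.Run(doc, Stations.ApplyDiscount, Stations.ComputeTotal)` needs no lookups against a state repository. The sketch also shows the limitation: two stations working on the same document at once would need coordination that the message alone cannot provide.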

You can also combine the two approaches and sometimes get good results.
Another way altogether is to look at stateful services, which I'll talk about in the next post.



* Many times people fail to make the distinction between stateless services and stateless communications - I'll expand on that in another post.


 
Tags: scalability | SOA | Software Architecture

IT Business Edge published a short Q&A with me on SOA patterns - you may want to check it out :)


 
Tags: SOA | SOA Patterns

Someone calling himself r r left the following comment on part IV of my series of posts on SOA definition:

"I keep trying to read this series on SOA unfortunately suffers from the same disease as the rest of literature on the subject. stays general to a comfortable level so it can't really be applied anywhere, tends to complicate things where is not clear if it's needed, and encourages philosophical debate on what ultimately is a business (and so concrete) requirement. Meanwhile the serious (IMO) issues stay untouched - how does one actually approach an integration project with functionality, performance and security in mind. Which should be the standards used (considering the tens of standards on WS out there). How granular should the WS be (I'm done with answers like "not too much, but enough", or "well, depends on your project"). "
Before I talk a little about the "serious issues" mentioned above, I want to point out that the point of this series of posts, as stated in the first post, is to take a formal / semi-academic look at SOA. I started these posts as a reaction to a comment that Pete Lacey left on my blog stating that my view of SOA (as published in "What is SOA anyway?") does not demonstrate that SOA is an architectural style. I don't pretend that this is some fully thought-out academic dissertation or anything, but I do try to look at the architectural roots of SOA.

That said let's take a look at the more interesting parts of this comment. First, the thing that bothers me about this reaction is (what seems to me as) the quest for final and concrete recipes. For instance consider the comment on service granularity
"How granular should the WS be (I'm done with answers like "not too much, but enough", or "well, depends on your project"")
The problem is - it does depend! And, if you forgive me for taking another philosophical detour, if you try to provide a hard definition for service granularity you get something like the heap paradox: when you remove individual grains from a heap of sand, is it still a heap when one grain remains? So while it is obvious that hiding a complete system behind a single service is wrong, and that exposing every little object as a service is wrong (even though for some inexplicable reason Juval Lowy seems to think that the latter is good practice), it isn't really obvious when you get too granular.

Nevertheless, it is not a pure guess either. You can use some guidelines and measure them against your specific project/system/enterprise needs. Personally, the set of guidelines I use is based on the fallacies of distributed computing:
  1.  The network is reliable
  2.  Latency is zero
  3.  Bandwidth is infinite
  4.  The network is secure
  5.  Topology doesn't change
  6.  There is one administrator
  7.  Transport cost is zero
  8.  The network is homogeneous
Since a service edge is a boundary which may be (and usually is) accessed remotely, you need to think about the incoming and outgoing interactions of the service in light of the fallacies stated above. If the proper behavior of the service depends on one of the above being true, there's probably something wrong.

Regarding the other questions (how do you approach a real system), well, if you pardon me for banging my own drum, that's exactly why I started to write up my experience on these matters as patterns. For instance, if you look at the saga pattern (one of the patterns I published online), you'd see that it talks about achieving distributed consensus in a transaction-like manner. I talk about the problems of using distributed transactions etc., offer an architectural solution (the saga) and then discuss relevant technology issues (e.g. WS-BusinessActivity) as well as its implications from a quality attributes perspective (integrity and reliability). Nevertheless, even these patterns aren't an end-all solution. Different circumstances require different solutions.
Both my previous job and my current one involve building a scalable solution on top of algorithmic engines. In my previous job I managed the construction of a biometric solution that allows using multiple biometrics. In my current job I manage the development of a mobile visual search solution. On the surface both need to get some data, run a few algorithms and produce an answer, yet these systems have very different quality attributes. In the first system we had to handle very large databases and hundreds of queries, with an emphasis on modifiability and security. The current one needs millions of queries, almost no database and low latencies, with an emphasis on usability. These differences result in radically different solutions, with different services, different interactions, use of different patterns etc. There's no "one right answer" (tm).


 
Tags: PaperLnx | SOA | SOA Patterns | Software Architecture

This post is part of a series of posts trying to define SOA as an architectural style. In the previous post I talked about SOA and the Layered architecture style (which generated a couple of follow-ups - one on layered architecture in general, one on its importance for SOA and one on layers in enterprise architecture vs. solution architecture).

The next architectural style SOA builds on is Pipes and Filters. Unlike Layers and Client/Server, which I described in previous installments, Pipes and Filters is not also a base style for REST. Basically, this style is where SOA and REST begin to diverge.
The pipes and filters architectural style defines two types of components - yep, you've guessed it, Pipes and Filters.
Filters - are independent processing steps. They are constrained to be autonomous of each other and not to share state, control thread etc.
Pipes - are interconnecting channels.


Each filter exposes a relatively simple interface where it can receive messages on an inbound pipe, process them and produce messages on outbound pipes. The idea behind this is to allow easy composability, thus allowing greater usage (also known as "reuse" - I'll discuss the difference in another post). Systems are composed of several filters working together, filters can be replaced with newer versions (provided they keep the same interface) etc.
On the downside, the overall latency is increased, since to accomplish a task you have to move from filter to filter.
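A minimal in-memory sketch of the style: each filter implements the same small interface and knows nothing about its neighbors, while a separate `Pipeline` object plays the role of the pipes by doing the wiring. The interface, the three filters and the `Pipeline` class are hypothetical names for this illustration, not taken from any framework.

```csharp
using System;
using System.Collections.Generic;

// Every filter exposes the same simple interface: message in, message out.
public interface IFilter
{
    string Process(string message);
}

public sealed class TrimFilter : IFilter
{
    public string Process(string message) { return message.Trim(); }
}

public sealed class UpperCaseFilter : IFilter
{
    public string Process(string message) { return message.ToUpperInvariant(); }
}

public sealed class TagFilter : IFilter
{
    private readonly string _tag;
    public TagFilter(string tag) { _tag = tag; }
    public string Process(string message) { return "[" + _tag + "] " + message; }
}

// The "pipes": wiring lives outside the filters, which share no state.
public sealed class Pipeline
{
    private readonly List<IFilter> _filters = new List<IFilter>();

    public Pipeline Then(IFilter filter)
    {
        _filters.Add(filter);
        return this;
    }

    // Each hop adds its own processing time, hence the latency cost.
    public string Send(string message)
    {
        foreach (IFilter filter in _filters)
        {
            message = filter.Process(message);
        }
        return message;
    }
}
```

Because the filters only depend on `IFilter`, any of them can be swapped for a newer version or reordered by whoever configures the `Pipeline`, without touching the other filters.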

The pipes and filters style brings to SOA things like the autonomy of services, the sense of explicit boundaries. For instance, this is the basis for why you wouldn't want to do distributed transactions across service boundaries, which I blogged about several times before.

The pipes part of "pipes and filters" also means that the wiring can be taken care of outside of the services themselves and that you can control it externally. This works well with the use of middleware (a service bus). Additionally, Fielding (you know, the REST guy) also mentions that
"One aspect of PF styles that is rarely mentioned is that there is an implied "invisible hand" that arranges the configuration of filters in order to establish the overall application. A network of filters is typically arranged just prior to each activation, allowing the application to specify the configuration of filter components based on the task at hand and the nature of the data streams (configurability). This controller function is considered a separate operational phase of the system, and hence a separate architecture, even though one cannot exist without the other."
Which is the harbinger of the orchestration/choreography aspects of SOA.

So, as you see, pipes and filters is one of the important pillars of SOA. In the next part (unless I have to clarify things about this post) I'll talk about the last architectural style SOA builds upon: "Distributed Agents".


 
Tags: SOA | SOA Patterns | Software Architecture

March 6, 2008
@ 09:14 PM
Jack Van Hoof has a different view than I have on the difference between Tiers and Layers. I am not sure I agree with his view, but it still provides an interesting read. I think the main difference between our respective views is that Jack looks from the enterprise-architecture angle, which gives him layers like
  • Technical infrastructure - OS, directory Services etc.
  • Application infrastructure - Apps, Portals, DBMSs
  • Application Landscape - SOA, EDA
  • Business Processes - BPM
Jack uses the term tier for layers within the same level of abstraction. For instance, he gives the following examples:
"E.g. the layer of business services may be arranged in the tiers: front-office, mid-office and back-office. At the next lower layer, the application layer, services may be arranged in the tiers: UI, business logic and data persistency. The interaction of services between two tiers may be bidirectional (but may also be constrained to unidirectional). "
The perspective I have (or at least try to maintain in this blog) is that of solution/product line architecture - basically living within Jack's application layers. So in my view I want to differentiate between having a UI and business logic live on the same machine vs. having them distributed in the world. I guess in the end both perspectives need to have their place, and the problem is, like many other times, overloaded terms.


 
Tags: SOA | Software Architecture

February 23, 2008
@ 08:38 PM
In the previous post I mentioned a couple of questions on SOA and layers Udi left on an older post I made:

1. How does this [layers - ARGO] play with two services talking with each other? One pubs to the other's subs?The other requests to the first's response?
2. How valuable is the layered abstraction?


1. As I explained in the previous post, layers do not necessarily mean a unidirectional relation from a top layer to a lower level one - they do mean that a layer can only know a layer that is directly above or below it. In other words, the bidirectional interaction between two services, i.e. the requests, reactions, events etc. flowing between them, does not violate the layered style constraints.

2. So, how valuable is the layered abstraction to SOA? The short answer - very :). Again, as I mentioned in the previous post, the main reason layers don't seem that valuable is because they've been misrepresented and misused. Layers bring added flexibility to SOA. The fact that a service or any other SOA component cannot see beyond the next layer enables things like the Service Bus, Edge Component, Service Firewall etc. Without layers it would be harder to have autonomous services, as other services could (potentially) have access to the innards of the service, adding more coupling and preventing independence.
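The adjacency constraint from the first answer can be written down as a small rule check. This is only a sketch: `LayerRules`, `MayCall` and the layer names in the usage note are invented for the illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rule check for the layered style.
public static class LayerRules
{
    // stack lists the layers from top to bottom. A layer may only call its
    // direct neighbor: always the one below, and the one above only when
    // the variant in use allows bidirectional communication.
    public static bool MayCall(IList<string> stack, string from, string to, bool bidirectional)
    {
        int fromIndex = stack.IndexOf(from);
        int toIndex = stack.IndexOf(to);
        if (fromIndex < 0 || toIndex < 0) return false;   // unknown layer

        if (toIndex == fromIndex + 1) return true;        // directly below
        return bidirectional && toIndex == fromIndex - 1; // directly above
    }
}
```

With a stack such as `{ "Consumer", "Edge", "Service" }`, a call from Consumer to Edge passes while a call from Consumer straight to Service is rejected. That rejection is what leaves room to slot an Edge Component or a Service Firewall in between later on.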



 
Tags: SOA | Software Architecture

February 5, 2008
@ 11:29 PM
Following the third part of my "Defining SOA" posts Udi Dahan left the following two questions:
How does this [layers - ARGO] play with two services talking with each other? One pubs
to the other's subs?The other requests to the first's response?
How valuable is the layered abstraction?

Considering that Udi and I usually see eye to eye, I guess that if I managed to get him confused, more clarification is warranted :) I'll do that in two posts: this one, which will explain the concept of layers, and the next, which will explain why it is paramount to SOA (and answer the two questions).

Usually when I review an architecture one of the first (sometimes the only) architectural artifact I am shown is a "layered diagram" of the architecture e.g. something like the following:


These sorts of half-layer/half-block diagrams, with or without the common 3 layers (which also appear in the diagram above) of "UI", "Business" and "Data", give the whole idea of layers a bad name.

The key differentiator between layers and just a bunch of blocks is the limitations on the allowed communication paths between the components (layers vs. blocks). In the previous post I quoted an old (2005) definition I had for layers where I said the following
"Typically a layered is allowed to call only the layer below it and be called only by the layer above it (but there are variants e.g. a layer can call to any layer below it; vertical layers that can call multiple layers; etc. -- all is fine as long as the layers communication paths are limited by some rules)."
Alas, I was too quick on the copy/paste, and everything in the brackets is wrong. It should actually say: "but there's another variant where layers are allowed to call the layer above it or below it". The other variants (like a layer that can call any layer below it) just muddy the water: they make it hard to distinguish between layers and regular components and thus make layers seem unimportant. Consider the following diagram:


So, in the above diagram, Component D and Component C know each other, and Component D is made of two layers (A and B). Note that a more proper representation should also state which relations are allowed between the layers, i.e. whether they are unidirectional or bidirectional (unless there's a common convention in the project).
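To make the "limited communication paths" idea concrete, here is a minimal sketch of the rule written as a check. The layer names and the `may_call` helper are hypothetical and only illustrate the strict and bidirectional variants described above; this is not taken from any real framework.

```python
# Hypothetical layer stack, ordered top to bottom (names are illustrative only).
LAYERS = ["UI", "Business", "Data"]


def may_call(caller: str, callee: str, bidirectional: bool = False) -> bool:
    """Return True if `caller` may call `callee` under the layering rule.

    Strict variant: a layer may call only the layer directly below it.
    Bidirectional variant: a layer may call the layer directly above or below it.
    """
    distance = LAYERS.index(callee) - LAYERS.index(caller)
    if bidirectional:
        return abs(distance) == 1
    return distance == 1
```

With this rule `may_call("UI", "Data")` is rejected in both variants, which is exactly the kind of shortcut that turns layers into "just a bunch of blocks".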

Why is the distinction between layers and other types of components important? Because layering gives you some benefits which "just components" don't:
1. Layers allow information hiding, since we don't know the inner workings of what's beyond the layer.
2. Layers allow separation - things beyond the immediate layer are hidden from each other. This means that things on either side of a layer are loosely coupled in a way that allows for flexibility and the addition of capabilities, for instance adding a firewall between your computer and the internet.
3. Layers allow changing the abstraction level - since layers are hierarchical in nature, moving through the layer "stack" you can increase or decrease the level of abstraction you use. This allows expressing complex ideas with simple building blocks. The best known example for this is the TCP/IP stack, moving from an abstraction level close to the hardware at the network interface layer to application-level protocols such as HTTP.


On the downside, layers hurt performance by adding latency. Also, too many layers add complexity to the overall solution (e.g. it is harder to debug).

It is interesting to note that interfaces are in fact leaky layer abstractions (vs. for example SOA contracts, which are not leaky), since when you use an interface you still need to instantiate the object which is otherwise hidden behind the "layer" (interface). This is basically the reason we want something like dependency injection (DI): to help make the abstraction complete. It is also why in languages like Ruby, where the contract abstraction is complete, you don't need DI (I discussed Ruby and DI in another post).
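The leak is easiest to see side by side. The following is only a sketch with made-up class names (`Storage`, `InMemoryStorage`, `LeakyOrderService`, `InjectedOrderService`), written in Python purely for illustration: the first service has to name the concrete class behind the interface, the second receives it from outside.

```python
from typing import Protocol


class Storage(Protocol):
    """The 'layer' boundary: callers should only need to know this contract."""

    def save(self, item: str) -> None: ...


class InMemoryStorage:
    """One concrete implementation hidden behind the Storage contract."""

    def __init__(self) -> None:
        self.items = []

    def save(self, item: str) -> None:
        self.items.append(item)


class LeakyOrderService:
    def __init__(self) -> None:
        # The leak: to use the interface we still have to know and
        # instantiate the concrete class behind it.
        self.storage: Storage = InMemoryStorage()

    def place(self, order: str) -> None:
        self.storage.save(order)


class InjectedOrderService:
    def __init__(self, storage: Storage) -> None:
        # With injection the service only ever sees the contract.
        self.storage = storage

    def place(self, order: str) -> None:
        self.storage.save(order)
```

In the injected version someone else (a container, a host, a test) decides what sits behind `Storage`, so the abstraction is complete from the service's point of view.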

Another issue which I mentioned here is the difference between logical layers and physical (or potentially physical) layers. I usually call the first kind layers and the latter tiers. Logical layers are local and can assume a lot about their neighboring layers. Tiers, or physical layers, can be distributed, which carries a lot of implications (something I discussed here recently in relation to MS Volta).





 
Tags: Design | SOA | Software Architecture

This post is part of a series of posts trying to define SOA as an architectural style. In the previous post I talked about how SOA builds on the Client/Server architectural style. In this post I'll talk about how SOA builds on the architectural style of Layered System.

Layered System or Layered architectural style is one of the most basic and widely used architectural styles. Here is a definition of Layered architecture I posted in the past
The layered style is composed of layers (the components) which provides facilities and has a specific roles. The layers have communication paths / dependencies (the connectors).

In a layered style a layer has some limitations on how it can communicate with other layers (the constraints). Typically a layered is allowed to call only the layer below it and be called only by the layer above it (but there are variants e.g. a layer can call to any layer below it;  etc. - all is fine as long as the layers communication paths are limited and restricted by some rules)
SOA takes the strict layers definition and restricts the knowledge of one service only to the service interface/contract of the other services. This means the services cannot be aware of or care about the internal structure of other services. This helps with introducing the "boundaries are explicit" tenet (although it builds on more than just layering).

The layered nature of SOA means you can also add additional layers between the services. One very common example is adding a service bus (e.g. using an ESB or tools like NServiceBus); other examples include load balancers, firewalls (see the Service Firewall pattern) etc. Naturally, when you add intermediary layers, services don't talk to each other directly; rather, they accept the services (such as routing, message persistence etc.) of the intermediary layer.
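As a rough illustration of stacking intermediaries, here is a sketch in which a consumer only ever sees a `send` operation, so a bus and a firewall can be layered in without the consumer or the service changing. The `ServiceBus` and `Firewall` classes are hypothetical toys, not the API of any real ESB or of NServiceBus.

```python
class ServiceBus:
    """Toy intermediary: routes messages by address and keeps a log of them."""

    def __init__(self):
        self._handlers = {}
        self.log = []  # stand-in for message persistence

    def register(self, address, handler):
        self._handlers[address] = handler

    def send(self, address, message):
        self.log.append((address, message))
        return self._handlers[address](message)


class Firewall:
    """Another intermediary layered on top of the bus; it exposes the same `send`."""

    def __init__(self, inner, blocked_addresses):
        self._inner = inner
        self._blocked = set(blocked_addresses)

    def send(self, address, message):
        if address in self._blocked:
            raise PermissionError(f"address {address!r} is blocked")
        return self._inner.send(address, message)
```

Because the consumer only knows the layer next to it, swapping `ServiceBus` for `Firewall(ServiceBus(...))` is invisible to both ends.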

It should be noted that in the context of SOA the layers are, in most cases, actually tiers. The difference is that tiers provide (potential) physical separation, whereas layers provide logical separation. When a layer is actually a tier, it has extensive implications for the level of trust between the tiers (see my post "Tier is a natural boundary" for more details).

The next post in the series will talk about the "Pipe and Filters" style  and SOA. This is the first place where the REST architectural style and SOA diverge.


 
Tags: REST | SOA | SOA Patterns | Software Architecture

Sam Gentile and I exchanged a few blog posts on the definition of SOA. In the latest installment Sam disagrees with me that SOA should first be looked at in the pure architectural sense, without bundling in the business and enterprise aspects.
In a nutshell, I have two main reasons to prefer looking at SOA, at its core, as a pure architectural style.
The first is that when you bundle in the enterprise-wide aspects of implementing SOA you lose the option (or the audience) of using it to solve more local problems (i.e. at the product/solution level) with the same principles that bring the benefits at the enterprise scale.
The other reason I have for separating the concepts is that the business-encompassing definitions tend to be fluid, hand-waving ones that cannot be measured for compliance.
Consider the definitions Sam quotes from  Thomas Erl's books:
"SOA establishes an architectural model that aims to embrace the efficiency, agility, and productivity of an enterprise by positioning services as the primary means through which solution logic is represented in support of the realization of strategic goals associated with service-oriented computing." (emphasis by Sam)

SOA represents a model in which functionality is decomposed into small, distinct units (services), which can be distributed over a network and can be combined together and reused to create business applications. [3]
Now what the hell is that? These are all noble goals, but shouldn't they be the goals of any enterprise architecture? What makes SOA unique in this sense?
Also, how do these definitions help us build services? What makes a service a service? Why is (or isn't) any web-enabled component a service?
Definitions that distance themselves from the architectural roots seem to me like smoke and mirrors and contribute to the general confusion around SOA - to the point where even people like Harry Pierson wonder why we should even bother defining it.

Personally, I still think it is worthwhile defining *** (the architectural style formerly known as SOA) since, as I mentioned earlier, it is (in my opinion) a useful architectural style for building distributed systems - whether the distributed system is a solution, a product, a product line or a complete enterprise.





 
Tags: SOA | SOA Patterns | Software Architecture

December 29, 2007
@ 10:49 PM

Sam Gentile comments about my attempts to define SOA (Part I, Part II, more to come..) and says that

"That's all well and true, but any definition of SOA must encompass the business drivers and business reasons, as SOA is not really about technology. It is about a better alignment of business and IT through business processes and services. The goal is to create a dynamic, more Agile and Dynamic IT that can respond quickly to new business opportunities and threats by quickly assembling new capabilities from putting together composite applications (and even Mash-ups) from reusable business services..."

I am sorry Sam, but I beg to differ, not about the importance of the business drivers behind implementing SOA, but about what SOA is. The culprit, in my opinion, is terminology overloading.

SOA, as I said in the above-mentioned post and numerous other times, is first and foremost an architectural style - as an architectural style it offers several architectural benefits and poses several architectural constraints. This has nothing to do with business drivers. It has to do with defining components, relations, attributes on relations and components, as well as constraints. Now you can take that set of rules and use (or misuse) it as you like, in the context of a subsystem, a single project, a product line or an enterprise - this is your choice.

Applying SOA, on the other hand, has everything to do with the business. I'll take Sam's post word for word, but instead of using the word SOA I would prefer using the term SOA initiative. An SOA initiative is the effort of applying SOA in a wide context for an enterprise, aiming to increase the alignment of IT and the business etc. I would have to say, though, that in my experience such an effort would rarely use SOA alone. It would also include other distributed architectural styles that help with decoupling and loose coupling, like EDA and REST, to name a couple.


By the way, SOA has nothing to do with technology either. You can implement SOA using WS-*, Atompub, MSMQ or CORBA, just as you can implement REST with quite a few technologies. It so happens that WS-* is a common implementation technology for SOA and that HTTP is a common implementation technology for REST, but both styles live independently of the technologies.


 
Tags: SOA | Software Architecture | Trends

In the previous post on defining SOA I claimed that SOA is an architectural style building on 4 other architectural styles. The first one of these is Client/Server.
Describing client/server is easy - not because I am such a genius (far from it) but because it has already been done numerous times before. Let's take a look at the definition from Roy Fielding in his famous dissertation (the link is to chapter 3; REST is defined in chapter 5 if you are interested):

The client-server style is the most frequently encountered of the architectural styles for network-based applications. A server component, offering a set of services, listens for requests upon those services. A client component, desiring that a service be performed, sends a request to the server via a connector. The server either rejects or performs the request and sends a response back to the client. A variety of client-server systems are surveyed by Sinha [123] and Umar [131].

Andrews [6] describes client-server components as follows: A client is a triggering process; a server is a reactive process. Clients make requests that trigger reactions from servers. Thus, a client initiates activity at times of its choosing; it often then delays until its request has been serviced. On the other hand, a server waits for requests to be made and then reacts to them. A server is usually a non-terminating process and often provides service to more than one client.

Separation of concerns is the principle behind the client-server constraints. A proper separation of functionality should simplify the server component in order to improve scalability. This simplification usually takes the form of moving all of the user interface functionality into the client component. The separation also allows the two types of components to evolve independently, provided that the interface doesn't change.

The basic form of client-server does not constrain how application state is partitioned between client and server components. It is often referred to by the mechanisms used for the connector implementation, such as remote procedure call [23] or message-oriented middleware [131].

SOA takes from the Client/Server style the two roles, i.e. in each interaction one party is the client (what I call the service consumer) and the other is the server (the service), which handles the request coming from the client*. Unlike traditional client/server, the roles are held only for a particular set of interactions - a given interface that the service exposes. In another set of interactions the roles can be reversed, and a component that once was a server can now act as a client, even working with the very same component that was previously its client.
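A small sketch may help show that the roles belong to the interaction and not to the component. The `Service` class and the operation names below are invented for illustration only.

```python
class Service:
    """Toy service that records the role it plays in each interaction."""

    def __init__(self, name):
        self.name = name
        self.trace = []  # list of (role, operation) tuples

    def handle(self, operation):
        # Acting as the server (service) for this interaction.
        self.trace.append(("server", operation))
        return f"{self.name} handled {operation}"

    def request(self, other, operation):
        # Acting as the client (service consumer) for this interaction.
        self.trace.append(("client", operation))
        return other.handle(operation)


orders = Service("orders")
billing = Service("billing")

orders.request(billing, "charge")        # orders is the consumer here
billing.request(orders, "order-status")  # and the service here
```

After these two calls `orders.trace` holds one `"client"` entry and one `"server"` entry, even though both interactions involve the same pair of components.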

Like REST, SOA takes the constraint of separation of concerns, which allows the service and its service consumers to evolve independently (as long as the interface is kept).
In order to support this, a service should take care of all its internal state without exposing that state or its internal structures outside of the service. This also allows the service to scale behind the interface, but for that we also need constraints and capabilities from the next architectural style, layered system, which I'll discuss in the next installment on this subject.


* You can compose SOA with other architectural styles to get different behaviors, e.g. compose SOA and EDA and you can have the service also push data. This isn't, however, something SOA manifests in its basic form.


 
Tags: REST | SOA | SOA Patterns | Software Architecture

December 11, 2007
@ 11:27 PM
I got a couple of emails with questions regarding my previous post on Volta. So here's another go at explaining why dynamic tiering is not a good move - this time in technicolor.

Let's start with a simple illustration. The diagram below represents a typical local component (A) in its environment. As a component that works locally, it has access to other local components which it interacts with. These can be objects it created by itself or objects that were injected into it. The likely design for local components is to have a chatty interaction - after all, objects can talk to instances of other objects quite easily.



Now enters Volta (or any other such framework - and I've seen a few; I am ashamed to say I even wrote one about 15 years ago) and says we'll just mark the things we want to execute on a different server and everything will be fine. What you get is something like the illustration below:



We have the same number of interactions - only now all the interactions between A and its (used to be) near environment require serialization, network interaction, possibly encryption, authentication, authorization and what not. You can imagine that this type of interaction can take a heavy toll on performance and scalability if it wasn't designed for up front.
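A back-of-the-envelope calculation shows why the same number of interactions hurts so much more once a tier boundary sits in the middle. The per-call costs and the call count below are assumed, illustrative numbers, not measurements from Volta or any real system.

```python
# Assumed, illustrative costs in microseconds (not measurements).
LOCAL_CALL_US = 1
REMOTE_CALL_US = 1_000  # serialization plus a network round trip, roughly


def total_cost_us(calls: int, cost_per_call_us: int) -> int:
    """Total time spent on calls, ignoring the work done inside each call."""
    return calls * cost_per_call_us


CHATTY_CALLS = 50  # hypothetical number of fine-grained calls A makes per operation

local_total = total_cost_us(CHATTY_CALLS, LOCAL_CALL_US)
remote_chatty_total = total_cost_us(CHATTY_CALLS, REMOTE_CALL_US)
remote_coarse_total = total_cost_us(1, REMOTE_CALL_US)  # one coarse-grained message
```

Under these assumptions the chatty design goes from 50 microseconds locally to 50,000 once it is moved across the wire unchanged, while a single coarse-grained message designed for distribution costs 1,000.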

This is a bit of hand-waving, so let me also give you an example from a real project. About 3 years ago I was invited to consult on a project. This was the kind of project that interacts with real things like sensors etc. I'll use an automated irrigation system to illustrate its architectural components. One type of component is "Things"; these represent real devices you can interact with, like sprinklers, soil sensors etc. Things represent the logical state of the real devices and cannot talk to each other. When two Things need to interact, e.g. we want to turn on the sprinkler if the soil is dry, we introduce another architectural component, which we'll call "Interaction", that looks at the state of the Things and can then act upon it. The last major type is "Services" (not services in the SOA sense), e.g. we can have a Service that reads the weather. Services can't interact with Things directly, but they can interact with each other and with "Interactions". This particular system had dozens of Things and hundreds of "Interactions" and "Services", and the tiers/process boundaries were as follows:


Interactions have to know about changes both in Things and Services, so messages keep flying around this system to keep the Interactions in sync as well as to propagate decisions made by Interactions. The outcome of this "smart" design is that every status change in a Thing results in an order of magnitude more messages to react to the change in status. I was brought in to find a way to get in-order, reliable message flow fast enough between the different tiers. I did my best and left. What they didn't want to listen to, and what is the better solution, is to give a lot of thought to related Things, Interactions and Services and bundle them together into "tierable" components. The interactions within these "chunks" would be local and would then inflict a whole lot fewer messages on the system. In our example it makes sense to bundle the four components (sprinkler etc.) into a single tier and possibly the same process, and increase the overall performance significantly while also giving us more cohesive boundaries.
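The effect of the two partitionings can be sketched by counting how many message flows cross a tier boundary. The component names, the flows and the tier assignments below are hypothetical stand-ins for the irrigation illustration, not the real project's design.

```python
# Hypothetical message flows for the irrigation illustration: (sender, receiver).
MESSAGE_FLOWS = [
    ("soil_sensor", "watering_interaction"),
    ("weather_service", "watering_interaction"),
    ("watering_interaction", "sprinkler"),
]


def cross_tier_messages(tier_of, flows=MESSAGE_FLOWS):
    """Count the flows whose sender and receiver live in different tiers."""
    return sum(1 for src, dst in flows if tier_of[src] != tier_of[dst])


# Partitioning by component type: every flow has to cross a tier boundary.
by_type = {
    "soil_sensor": "things",
    "sprinkler": "things",
    "watering_interaction": "interactions",
    "weather_service": "services",
}

# Partitioning by cohesion: related components share one tier.
bundled = {name: "irrigation" for name in by_type}
```

With these assignments `cross_tier_messages(by_type)` counts all three flows as remote, while `cross_tier_messages(bundled)` counts none.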




(As a side note, I'll just mention that I ran into someone who is part of this particular project a few days ago - they are still struggling with performance and stability problems...)

Anyway, one could argue that frameworks like Volta would allow you to move from the bad partitioning to the good one more easily, but this is not really so, since when you rearrange the components you also have to remodel the messages that flow between the new partitions. Also

This is not to say that having the ability to run a system in local and in distributed modes does not have value. As I said in the previous post, it is the assumption that you can easily move this boundary and still get a viable solution that is wrong. Also, if you are going to allow running in local and distributed modes, that doesn't have to spell the "dark magic" of MSIL rewrites and compilations.
In another (SOA) project we designed services so that in a small-scale installation you would be able to instantiate services in the same process. Services were constructed as Active Services (i.e. they have at least one thread of control). If you wanted to let two services run in the same process you just had to write a new ServiceHost and a new ServiceBus. The new ServiceHost has to provide each service with its own thread or thread pool, and the ServiceBus has to work in memory by passing message objects around rather than serializing/deserializing and sending them over the network. On a small installation this works better than multiple processes (but not as well as a system designed specifically to run on a single tier). Note that this is the opposite of what Volta does, as it takes a distributed solution and allows it to run locally rather than the other way around.
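Here is a minimal sketch of that idea, using Python's standard `queue` and `threading` modules. The `InMemoryServiceBus` and `ServiceHost` classes are a loose, hypothetical reconstruction for illustration; they are not the code from that project.

```python
import queue
import threading


class InMemoryServiceBus:
    """Passes message objects between services without serializing them."""

    def __init__(self):
        self._inboxes = {}

    def register(self, service_name):
        inbox = queue.Queue()
        self._inboxes[service_name] = inbox
        return inbox

    def send(self, service_name, message):
        # The very same object is handed over; nothing touches the network.
        self._inboxes[service_name].put(message)


class ServiceHost:
    """Gives every hosted service its own thread of control."""

    def __init__(self, bus):
        self._bus = bus
        self._threads = []

    def start(self, service_name, handler):
        inbox = self._bus.register(service_name)

        def run():
            while True:
                message = inbox.get()
                if message is None:  # sentinel used by stop()
                    break
                handler(message)

        thread = threading.Thread(target=run, name=service_name, daemon=True)
        thread.start()
        self._threads.append((service_name, thread))

    def stop(self):
        # Queue a sentinel behind any pending messages, then wait for the threads.
        for service_name, _ in self._threads:
            self._bus.send(service_name, None)
        for _, thread in self._threads:
            thread.join()
```

A service only sees its handler being called with messages, so the same service code could sit behind a networked bus in a larger installation; that swap is the part this sketch leaves out.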

The other part of Volta is the C# to JavaScript cross compiler. This may have a future, but it really depends on the attention Microsoft will put into this direction. Google does something similar on its Android mobile platform, where it takes Java bytecode and translates it for the Dalvik VM. But for Google that's a strategic platform. With MS's investments in Silverlight (which I personally prefer), I would guess the effort would always lag behind (though I hope they'd get it to be better than it is today).

 
Tags: .NET | Design | Everything | SOA | Software Architecture

November 24, 2007
@ 06:34 PM
A few weeks ago I posted a reaction to a post by Pete Lacey that asked what is SOA. In a comment to my post Pete said that my definition isn't good since
"...even according to your definition, an architectural style contains constraints, and to date neither SOA nor web services have been shown to exhibit any constraints"
The idea behind this series of posts is to try to take a slightly more formal look at what I think SOA is. It is based on my thinking over the past few weeks, but it is also still a work in progress (so any comments are welcome).

The way I see it SOA is an architectural style which is derived from the following architectural styles:
  1. Client/Server
  2. Layered System
  3. Pipe and Filters
  4. Distributed Agents
Note that if you add to the above statelessness, uniform pipe and filters and a cache, you can get a RESTful SOA. This is not REST, as REST itself does not require distributed agents or even pipes and filters (but it does build on client/server and layered system). In other words, not all RESTful systems are SOA; you can build SOAs which are not RESTful and you can build RESTful SOAs.

The main components of SOA are Services, Messages, Contracts and Consumers. Policies also exist, but now I tend to think they are optional. The four architectural styles mentioned above affect the definitions of the different components and the way they interact.

In the following posts on this subject I'll first take a look at each of the contributing architectural styles and how they affect SOA and later try to provide a definition that builds on them


 
Tags: REST | SOA | SOA Patterns | Software Architecture

October 30, 2007
@ 11:41 PM
Earlier today Microsoft announced its newest SOA initiative, codenamed Oslo. Here are a few observations I have on this announcement.

Let's start with "what is it?" Well it isn't an "it" per se, since Oslo is a bunch of initiatives within the Microsoft offerings.

For one, it is some of the libraries within .NET 4.0 -- specifically the next versions of WCF and WF.

Secondly, it is a bunch of designers and tools that will be part of Visual Studio (beyond VS 2008).
The most interesting component of Oslo will be a new repository to allow version management of models and services. I guess it is safe to say it will be built upon Team Foundation Server (or a subset of it, which will be used by both products).

The last part of the puzzle is of course V.Next of Biztalk and something currently branded as "Biztalk Services '1'". As far as I know Biztalk sells pretty well, but I think it is both too bloated (e.g., think about the hardware needed to run it in a high-performance solution) and built on the wrong architecture (hub vs. bus). I hope Microsoft makes major updates this time (Biztalk 2004 to Biztalk 2006 mostly innovated around business activity monitoring. While that's important, I think more work on the engine was/is due).

Biztalk services would offer an implementation of some of the SOA patterns I talk about -- service host, workflodize, etc. -- to provide an infrastructure for building services. The relation between "Biztalk 6" and "Biztalk services 1" is not clear from the information provided by Microsoft; hopefully this is just a branding issue and not a tight relation between the products.

On the upside, one of the key persons working on this is Don Ferguson who, before joining Microsoft, was chief architect for IBM's software group. About a year ago I had the chance to hear him talk about SOA and all I can say is that Don is someone who really knows his stuff.

PS: It's amusing to see the press release talks about "model-driven" approach rather than software factories, but I guess that's just nitpicking.
 
Tags: .NET | Everything | SOA | Software Architecture

October 5, 2007
@ 10:46 PM
Pete Lacey has a post called "What is SOA?" where he defines SOA as follows:
"
  • Network Oriented Computing (NOC): An approach to computing that makes business logic available over the network in a standardized and interoperable manner.
    • Service Oriented Architecture (SOA): A technical approach to NOC that has a non-uniform service interface as its principle abstraction. Today, SOAP/WS-* is the chief implementation approach.
    • Resource Oriented Architecture (ROA): A technical approach to NOC that has an addressable, stateful resource as its principle abstraction. Today, REST/HTTP is the chief implementation approach.
  • Business Service Architecture (BSA): An unnecessary term (also not an architecture) that tries to make the obvious something special. Aka, business analysis. Aka, requirements gathering"
I am sorry but I beg to differ.

The first thing to note (again) is the architecture vs. architectural style differentiation I mentioned in a previous post (you can see a similar definition by Stuart Charlton). Here is a quick reminder:
Software architecture is the collection of the fundamental decisions about a software product/solution designed to meet the project's quality attribute requirements. The architecture includes the main components, their main attributes, and their collaboration (i.e. interactions and behavior) to meet the quality attributes. Architecture can and usually should be expressed in several levels of abstraction (depending on the project's size).
An architectural style is a blueprint that can be used when you design an architecture. An architectural style defines some of the components and their attributes, as well as placing constraints on how they can interact.
My claim is that SOA is an architectural style for distributed computing which puts extra emphasis on the interface (and hence gets the easier interoperability). Ok, if SOA is indeed an architectural style, we should be able to define it as a set of components, interactions and attributes. Well, I already did that a while ago (in a paper called "What is SOA anyway?"). And while it may not be perfect, I think it is a reasonable definition all the same:

"SOA is an architectural style for building systems based on interacting coarse grained autonomous components called services. Each service expose processes and behavior through contracts, which are composed of messages at discoverable addresses called endpoints. Services’ behavior is governed by policies which can be set externally to the service itself. "



You can see the above mentioned paper for a little more detail on each of the components.

ROA, in my opinion, is just a re-branding of REST so that it would be easier to discuss it as an architectural style and not connect it to the HTTP implementation - which is what a lot of REST proponents are doing.

By the way, as I pointed out before, there are a few other important architectural styles that are related to distributed systems, like event-driven architecture, space-based architecture, peer-to-peer etc.

As for "Business Service Architecture" - I personally like to think about that as an "SOA initiative", as in the strategic decision to try to implement an SOA in an organization while trying to achieve the more nebulous traits like business and IT alignment etc. (which is why it is neither an architecture nor an architectural style).


 
Tags: Everything | Papers | REST | SOA | Software Architecture

In a recent post Steve Vinoski said:

"Frankly, if I were an enterprise architect today, and I were genuinely concerned about development costs, agility, and extensibility, I’d be looking to solve everything I possibly could with dynamic languages and REST, and specifically the HTTP variety of REST. I’d avoid ESBs and the typical enterprise middleware frameworks unless I had a problem that really required them (see below). I’d also try to totally avoid SOAP and WS-*."

It is easy to dismiss this as just another yahoo who goes against conventional wisdom until you remember that Steve spent more than a decade working at Iona in leading roles like Chief Engineer of product innovations and helped develop some of the middleware standards for OMG and W3C.

Well, I guess that's becoming an epidemic now :) Just recently we had Michael Stonebraker talking about the RDBMS demise, Pat Helland talking about life beyond distributed transactions, and now Steve on ESBs.

That trend aside, I think Steve is throwing the baby out with the bath water. The dream of a single infrastructure for an enterprise is ludicrous enough (remember Peter Deutsch and the "The network is homogeneous" fallacy), but if you drop the "E" from the ESB moniker you get a valuable piece of middleware which is very usable in many situations and not just legacy system integration. For instance, one thing that is missing from the "HTTP variety of REST" implementation is reliable messaging; location transparency is also harder to solve with HTTP, etc.

Another problem I have with Steve's current approach is that he is replacing one dogma (ESBs are good) with another (ESBs are bad, use Ruby and REST) - this is not a healthy approach. The solution should match the problem; that's probably the primary reason why we need architects after all.

 
Tags: ESB | Everything | REST | SOA | Software Architecture

It seems that even the smartest people can get the difference between architecture, architectural styles and technology wrong.
For instance, Anne Thomas Manes points out that Roy Fielding makes this mistake in his REST and Relaxation presentation by mixing an architectural style with technology:
 "Roy is equating SOA with web services. Although a lot of folks use web services to implement services, that's simply an implementation decision"
But she then proceeds to make the exact same mistake:
"So when watching Roy's presentation, replace the term "SOA" with "WS-*", and the discussion will make a lot more sense."
REST is an architectural style; you can implement it with WS-*, which is a technology. It is not the most natural way to use the WS-* standards, but it is doable.

Looking at the same context (i.e. Roy Fielding's presentation), Steve Jones makes a similar mistake, confusing architecture and architectural style.

My definition for software architecture is
Software architecture is the collection of the fundamental decisions about a software product/solution designed to meet the project's quality attribute requirements. The architecture includes the main components, their main attributes, and their collaboration (i.e. interactions and behavior) to meet the quality attributes. Architecture can and usually should be expressed in several levels of abstraction (depending on the project's size).
An architectural style is a blueprint that can be used when you design an architecture. An architectural style defines some of the components and their attributes, as well as placing constraints on how they can interact.

For instance, the REST constraints (taken from Anne's post mentioned above) are:
"Uniform Interface:
  • Resources are identified by only one resource identifier mechanism
  • Access methods (actions) mean the same for all resources (universal semantics)
  • Manipulation of resources occurs through the exchange of representations
  • Actions and representations are exchanged in self-describing messages

Hypertext as the engine of state:

  • Each response contains a partial representation of server-side state
  • Some representations contain directions on how to transition to the next state
  • Each steady-state (page) embodies the current application state"
Architecutre Styles can be combined to create new architectural styles. Roy Fielding demonstrates this in his famous dissertation  where he demonstrate how REST is a composition of several styles such as  Client/Server, Layered system, Stateless etc. As another example (which a lesser degree of precision) I take about enhacing SOA with EDA in "bridging the gap between BI and SOA"

The last piece of the puzzle is technology. Technology (in the software context) is a set of tools provided by a vendor to enable and support building software solutions. As I've said here numerous times, technologies have their own internal architectures (as they are software solutions themselves), which is why different technologies support different architectural styles and why aligning the technology with the architecture chosen for your solution is important.

Yes, this post is all about semantics - but clear meanings are important to prevent confusion, at least in my opinion.


 
Tags: Everything | REST | SOA | Software Architecture

There were a few threads about whether SOA is about the technology or not.

In my opinion SOA and Architecture in general  are never about the technology - technology is important but it is just one variable in the equation. What we are looking for is a way to satisfy as many of the business needs as we can under all the constraints we face.

For instance, a few days ago I got a question in my email box from someone calling himself coldplay. While I don't think the band has somehow got itself interested in Event Driven Architecture, the question itself looked interesting enough. Here is the situation:
Current Setup
Lets take a e.g of a Inventory Stock Reorder point exception with in heterogeneous apps environment(No-SOA and integrations)...
The exception definition was built into the source apps and when the stock dropped below reorder....event registered and led to a exception. Exception was further handled by a rules based engine and a workflow notification raised ..

Planned Setup
Same e,g as above .. Post SOA implementation.. Inventory management is composite service built by orchestrating collaborative services from SAP and Oracle...which have different data model supporting them...
The exception definition requires to be defined outside the native Oracle apps and might have to get some event related information from SAP web service also .. to arrive to a conclusion as to whether this really is a Exception or not ..


Possible Technical approaches:

• Data Persists somewhere in the processing the Exception
• Data doesnt persist
• In mem database used..


My Question now :

1. What do you advise to be used in EDA?  which would reduce network round trips, decrease apps server loads from the above 3 technical outlooks.
2. What is diff between in-mem db and usual processing of apps logic by a apps server

I feel :

• Data persistence would lead to larger commit times and reduce operational efficiency
• If the data doesnt persist... and all validations are executed on the fly... dont you think the current apps servers would die processing ... or if its processing capacity is increased .. is it going to be economically viable alternative.

To be quite honest I can't really answer the question asked because it lacks the business context:
What are the implications if events are missed or lost? What's the acceptable latency that would allow the business to operate properly? If the Oracle bits and SAP bits need a lot of data from each other, then maybe the whole service partitioning is wrong and the services are not cohesive enough? How many business events are expected anyway? How often? And the list goes on.

Once you answer the business questions you can look at the available technology portfolio and ask whether you want an in-memory database or whether a datagrid would be a better option. Even then the decision is not just technology driven, since a cost/benefit analysis needs to take into account purchasing costs, operational costs, the skillset of the dev team, time to implementation etc.

This is not to say that if you choose a technology that isn't aligned with your architecture you shouldn't reconsider the architecture (or the technology). Also, since each technology product brings to the table its own architecture (with its own constraints and decisions), you probably need to verify the architecture once you make technology choices. Still, at the end of the day it is the business needs that sit in the driver's seat; the rest is just tagging along for the ride.


 
Tags: Everything | SOA | Software Architecture

September 20, 2007
@ 12:25 AM
Another REST related post - this time I want to share a couple of observations I had after reading and listening to Roy T. Fielding's presentations (Roy's presentation from RailsConf 2 days ago, via Pete Lacey).


The first point has to do with a question that is sometimes raised: can you do REST without HTTP? I.e. can you have a RESTful architecture if you don't use the HTTP protocol and, furthermore, don't use the HTTP verbs (GET/PUT/HEAD etc.) as the uniform interface? I talked about it a while ago and I think you can. Listening to Roy's talk, it seems that, at least in the HTTP architect's opinion, the answer is yes as well.

Another point occurred to me while watching Roy's talk, and it is related to the "REST magic" post I wrote a little over a week ago. The use of a uniform interface is touted by REST proponents (and Roy himself) as a coupling-reducing formula. After all, if you use a uniform interface you are not coupled to the particular semantics of any resource/service; you already know the capabilities (actually the maximal capabilities) they offer. What ensues is that instead of using a lot of verbs (ReserveRoom, UpdateOrder etc.) you use a lot of nouns (/rooms/, /orders/order1 etc.).

This works extremely well on the "human" web where my browser can navigate to any-ol'-site without any prior knowledge of what the site is about. When I navigate to Amazon I can buy stuff, when I navigate to New York Times I can read stuff etc. The point here is that the browser is really dumb about what's going on. I, as a human using the browser, understand the context from the content (well, most of the time anyway ;) ) so the browser can remain decoupled. However, when you translate this to the "programmable" web you usually don't have some mighty AI engine examining the response to understand the context. Instead you trade the verb coupling, which with WS-* web services would be defined in a contract, for coupling to the nouns (this is not to say the nouns aren't discoverable - they are, thanks to the hypertext or document-oriented communication REST encourages). The end result is pretty similar to what you get when you use verb-based contracts: your software still needs to understand (where "understand" means some level of coupling) what it is doing with the "other" services. Not to mention that you still need to understand the content of the message (sorry - response) to do anything useful with it.
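
To illustrate the trade, here is a minimal Python sketch of the same booking done against a verb-style contract and against a uniform interface. Everything in it is hypothetical (the class names, the URI layout and the "guest" field are my own inventions, not any real API), and the client hard-codes the URI for brevity where a hypertext-driven client would discover it.

```python
# Hypothetical names throughout; neither interface comes from a real library.

class HotelContract:
    """Verb-style contract: each operation is its own named method."""

    def __init__(self):
        self.rooms = {}

    def reserve_room(self, room_id, guest):
        self.rooms[room_id] = {"guest": guest}
        return self.rooms[room_id]


class ResourceStore:
    """Uniform interface: the same few verbs for every resource."""

    def __init__(self):
        self.resources = {}

    def put(self, uri, representation):
        self.resources[uri] = representation
        return representation

    def get(self, uri):
        return self.resources.get(uri)


def book_with_verbs(service, room_id, guest):
    # Coupled to the operation name and its parameters.
    return service.reserve_room(room_id, guest)


def book_with_nouns(store, room_id, guest):
    # Coupled to the URI layout and to the "guest" field instead.
    return store.put(f"/rooms/{room_id}/reservation", {"guest": guest})
```

Either way the client has to know something about the other side: rename reserve_room or move the reservation resource and the corresponding client breaks.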

In any event, while loose coupling is very desirable, we also need to remember that the only way to truly achieve complete decoupling is to not connect components. So some coupling is always needed if we want to produce meaningful systems.

What do you think?


 
Tags: Everything | REST | SOA | Software Architecture

If there's one reason to go to ApacheCon 07 in Atlanta, then it's probably Roy T. Fielding's "a little REST and Relaxation"

Here is the abstract:
"Representational State Transfer (REST) is an architectural style that I developed while improving the core Web protocols (URI, HTTP, and HTML) and leading them through the IETF standardization process. I later described REST as the primary example in my dissertation. Since then, REST has been used (and sometimes abused) by many people throughout the world as a source of guidance for Web application design. But is the REST that we hear about today the same as what I defined in my dissertation, or has it taken on the baggage that comes with an industry buzzword? This talk will provide a real introduction to REST and the design goals behind its evolution as the Web's arhitectural style. This is not about XML-over-HTTP as an alternative to SOAP, nor about "resource-oriented" frameworks that help simplify CRUD operations, but rather about the design goals and trade-offs that influence the development of network-based applications. I will also describe what happens when we relax some of the REST constraints, and how such relaxation is impacting the design of the waka protocol as a replacement for HTTP."
Now all I have to do is find an excuse for my boss... :)

There isn't a whole lot of information available on WAKA (that replacement for HTTP Roy mentions at the end of the abstract). Below are a few links I managed to find
And there are a few others, but not as interesting (to me anyway). Well, as we see, this WAKA thing has been in the works for a long time now. Also, replacing something as ubiquitous as HTTP is no small feat. But I guess if anyone can pull this off it would be Roy... As always, only time will tell.

Edited (18/9): it seems that a recent version of Roy Fielding’s presentation  is available online on parleys.com (via Stefan Tilkov)



 
Tags: Everything | REST | SOA | Software Architecture

From time to time I read about the magic that is RESTful services and how they solve everything and anything like scalability, idempotency, simplicity etc. For instance, in "RESTful Web Services" Sam Ruby and Leonard Richardson say
 "PUT and DELETE operations are idempotent. if you DELETE a resource, it's gone. If you DELETE it again, it's still gone..." (p.103)
or
"the safe methods, GET and HEAD, are automatically idempotent as well" (p.219)

Another example comes from Anne Thomas Manes who said

"The REST architectural style defines a number of basic rules (constraints), and if you adhere to these rules, your applications will exhibit a number of desirable characteristics, such as simplicity, scalability, performance, evolvability, visibility, portability, and reliability.

The basic rules are:
  • Everything that's interesting is named via a URI and becomes an addressable resource
  • Every resource exposes a uniform interface (e.g., GET, PUT, POST, DELETE)
  • You interact with the resource by exchanging representations of the resource's state using the standard methods in the uniform interface
"

I think such claims  are plainly wrong and misleading.
 
Don't get me wrong, I like the REST approach, since it encourages better service design - e.g. document-oriented message exchange vs. the RPC-like message exchange which the so called "WS-death-*s" (or actually the tools that support them) encourage.

It also encourages the above mentioned traits - however that's exactly the point: REST encourages this thinking; it does not solve scalability or other problems out of the box - you still need to design your services properly.

For instance, if you follow Anne's rules you can still end up with a service which is stateful and performs heavy distributed transactions against multiple databases and systems - i.e. a service that is neither simple, scalable nor performant.

DELETE will only be idempotent if the resource is idempotent (e.g. a specific version of a resource) or the message is idempotent (e.g. requesting a deletion of a specific version). If you are deleting the "recent version", it might have been recreated between your calls, so you are now deleting something completely different. Heck, even a GET (read) message with a single reader can be made non-idempotent if you decide to code something that alters the state of a resource significantly whenever it is read. When you have multiple readers and writers, GET will not be idempotent "automatically", as two consecutive reads can give you two different representations because the resource might have changed (again, unless the resources are idempotent).
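
Here is a small Python sketch of the difference, using a made-up in-memory VersionedStore (the class and its methods are assumptions for illustration only, not any framework's API). Deleting a specific version can be retried safely; deleting "the latest" cannot, because another writer may have created a new latest version between the two calls.

```python
import itertools


class VersionedStore:
    """Hypothetical in-memory store that keeps every version of a resource."""

    def __init__(self):
        self.versions = {}               # name -> {version_number: content}
        self._ids = itertools.count(1)   # version numbers are never reused

    def put(self, name, content):
        number = next(self._ids)
        self.versions.setdefault(name, {})[number] = content
        return number

    def delete_version(self, name, number):
        # Idempotent: repeating the call leaves the same state.
        self.versions.get(name, {}).pop(number, None)

    def delete_latest(self, name):
        # Not idempotent: "latest" may be a different version on each call.
        versions = self.versions.get(name, {})
        if versions:
            del versions[max(versions)]
```

If a client retries delete_latest after a timeout and someone wrote a new version in between, the retry removes the new content rather than the one the client meant.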

REST is not different from other styles in this respect - for instance you can do object orientation in C, but working in an OO language encourages object orientation (the opposite is also true - using an object-oriented language does not guarantee that you get an object-oriented design).

At the end of the day, architects should still think about the design if they want to ensure the result matches the quality attributes they want to achieve - some environments/styles/tools will make some quality attributes easier to achieve but nothing will solve the problems for you.



 
Tags: Everything | OO | scalability | SOA | Software Architecture | REST

September 4, 2007
@ 11:03 PM
When I began writing SOA Patterns, the first version of chapter 1 was a general introduction to Service Oriented Architecture from the perspective of software architecture. When the editors saw the patterns chapters they felt the chapter wasn't focused enough on patterns, so I rewrote it until it finally molded into the current version.

Nevertheless, I think that the first version has value on its own providing some guidance on the influences on architecture and putting SOA in an architectural context. I will probably edit it a little over the next few days so that it would be standalone (i.e. disconnected from the book). Meanwhile you can download the original version from here



 
Tags: Everything | SOA | SOA Patterns | Software Architecture | Papers

One of the most interesting presentations in Architecture & Design World was the eBay Architecture presentation by Randy Shoup and Dan Pritchett. The presentation was only one hour long, so Randy and Dan didn't cover all the topics in the slides. Here are some of the insights I took from this presentation.

Architecture evolution  -  eBay actually went through several architecture revolutions. Their initial architecture cannot even begin to scale to their current loads. It was, however, a very good fit for their initial quality attributes - specifically, the emphasis on time to market and costs.  This shows the importance of balancing quality attributes. Sure an architectural change is painful but if they'd future proofed too much I doubt they would ever get something working.

V2 demonstrated that a traditional 3-tier architecture would only scale so far. It was nice to see how it evolved though. Also, with the move from version 2.4 to 2.5 and later to 3, we see eBay learning about CAP the hard way. In its final (current) incarnation eBay's data architecture prefers partitioning and availability over consistency. This doesn't mean they forgo consistency altogether - just that they trade the comfort zone of ACID transactions for the BASE approach, where BASE stands for Basically Available, Scalable/Soft state & Eventually Consistent.
eBay partitions their data in two levels: the first is a SOA-like division by business areas (users, items etc.) and the second is a horizontal partitioning based on access paths. This BASE approach to data was dubbed sharding by Dathan Pattishall (from Flickr and Friendster) (via HighScalability). This approach means things like high partitioning, no distributed transactions (also see below), denormalization etc. (you might also want to read the item I wrote on denormalization in InfoQ yesterday).
The more major implication here is that when it comes to internet scale, the database loses its importance - or as Bill de Hora nicely puts it:
The use of RDBMSes as data backbones have to be rethought under these volumes; as a result system designs and programming toolchains will be altered. When the likes of Adam Bosworth, Mike Stonebraker, Pat Helland and Werner Vogels are saying as much, it behooves us to listen.

As I said, the data architecture of eBay is SOAish - they partitioned their components and data along business lines, and they apply many SOA principles. They don't, however, unite data and components to create a service, and they don't (seem to) have the same contract boundaries that SOA promotes (Randy told me that they are currently contemplating SOA).

Returning to the "no transactions" point: eBay does not use transactions, which seems very controversial - but if we just consider some of the points I made on transactions between services in previous posts, it is the only logical way to ensure scaling. By the way, as can be expected, they do use transactions when they are local (e.g. if the users table is spread over a couple of tables, both will be updated together).

The application layers also follow the segmentation by business areas. eBay caches metadata/immutable data as much as possible and keeps the application stateless (i.e. state comes from the client/db), e.g. they don't use sessions. The DAL virtualizes the horizontal partitioning mentioned above for the rest of the code.
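
As a rough illustration of what a DAL that hides two-level partitioning might look like, here is a short Python sketch. It is not eBay's actual scheme (the talk did not go into that detail): the host names, the shard counts and the choice of a CRC32 hash on the access key are all assumptions of mine.

```python
import zlib

# Hypothetical layout: shard counts and host names are illustrative only.
SHARDS = {
    "users": ["users-db-0", "users-db-1"],
    "items": ["items-db-0", "items-db-1", "items-db-2"],
}


def route(area, key):
    """Pick the database host for a record.

    Level 1: functional split by business area.
    Level 2: horizontal split by a stable hash of the access key.
    """
    hosts = SHARDS[area]
    index = zlib.crc32(key.encode("utf-8")) % len(hosts)
    return hosts[index]
```

The rest of the code would only ever call something like route("users", user_id) and never see which host it landed on. A plain modulo like this makes adding shards painful, which is one reason real systems tend to use lookup tables or consistent hashing instead.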

It was also interesting to hear that eBay developed its own messaging infrastructure - though Randy and Dan did not provide a lot of details on that.

Development process - It seems that eBay is using some hybrid of feature driven development with waterfall (i.e. the development is feature by feature - but the development of a feature is waterfallish). They do have a constant delivery rate, which they synchronize using the concept of a train: if you have a feature, it will be added to the train which is scheduled to arrive around the time your feature will be ready. Several features are delivered as a package, which gives a predictable (weekly) delivery. I guess it also gives them some nice metaphors to use, such as a feature that doesn't make it misses the train, or the train leaves on time etc.

The slides of the presentation can be downloaded from Dan Pritchett's site (they are not from the same event but they are pretty much the same slides). You can also read Elliotte Rusty Harold's account of the presentation.
 
Tags: A&D2007 | Everything | SOA | Software Architecture

August 1, 2007
@ 09:52 PM
I won't say anything about my presentations (that's for others to say :) ). The point of this post is just to let you download them. So here they are:
  • SOA Patterns (2.14mb) - Takes a look at different strategies (patterns) to solve common SOA pitfalls
  • Getting SPAMMED for architecture (4.56mb) - Takes a look at the activities architects can/should do when they think about software architectures. The presentation also covers architecture in agile projects.


 
Tags: .NET | A&D2007 | Agile | Everything | SOA | SOA Patterns | Software Architecture | SPAMMED Process

While I am getting ready to fly to A&D world 2007 where I'll present both SOA patterns and the SPAMMED architecture framework, I thought I'd throw in a little update on the book as well.

I've made a small change to the way chapters 5-7 are organized. They are now grouped under a separate part called "Service Interaction Patterns" (and chapters 2-4 are grouped under "Structural Patterns").
  • Chapter 5 is focused on Message Exchange Patterns (MEP): synchronous, asynchronous, events and transactional - The patterns there are not new for SOA; instead the focus is on the meaning of implementing the usual MEPs under SOA constraints. I sent it to Manning early last week so hopefully it will be available on MEAP soon.
  • Chapter 6 is called "Consumer Interaction patterns" and includes the UI interaction patterns as well as interaction patterns with other types of consumers. This is the chapter I am currently working on.
  • Chapter 7 is unchanged for now
Lastly, as you may remember, I publish online one pattern from each chapter, so I'd be happy to get comments on which of the following three patterns (from chapter 6) you would like to see on-line: Reservation pattern (making partial commitments), Client/Server/Service (integrating legacy or thin clients with SOA), Client/Service (integrating rich clients with SOA) - if you want to vote just send me an email or leave a comment.


 
Tags: Everything | SOA | SOA Patterns

Following the previous post I had a chance to exchange a few emails with Mark Little (the director of engineering in the JBoss division of Red Hat). Mark thinks the topic of transactions and SOA has been beaten to death already and wonders why it needs to resurface (see his post "Is anyone out there?"). I don't see a problem with discussions resurfacing when new people are faced with situations others already solved (but that's a matter for another post).

Anyway, the reason we're here is that I think Mark made a few interesting observations during this conversation, and I think the end result is pretty interesting. I decided (with his permission) to post it here (it is only minimally edited: no deletions, a few additions (in []) and a few time shifts to make it more coherent as a single conversation).

Mark From what I can see it's [the arguments on transactions and services are] the same old arguments that have gone round and round, ignoring the important fundamental issues and not doing enough background research.
Sagas are transactional - it's just an Extended Transaction model and not an ACID transaction model. Don't get hung up on the word "transaction", which is way too overloaded in our industry to actually mean anything by itself. Plus, 2PC is a consensus protocol too; it does not impose any other aspect of ACID than the A. Even the D is optional until/unless you want to tolerate failures.

Arnon I know this is an old argument - but that doesn't mean it isn't worthwhile

Mark It isn't worthwhile if people aren't going to listen ;-) I've been involved in these debates so many times over the past 7 years (for Web Services transactions) and longer for extended transactions, that it gets a bit old after a while. Maybe we should create a wiki page and point people at that ;-)?

Arnon I guess, but you should keep in mind that people who are solely in the .NET camp only got WS-AT recently with Windows Communication Foundation, so you can expect the issues to resurface. By the way, a wiki might not be a bad idea.

[regarding 2PC.] 2PC is a distributed consensus protocol and in principle doesn’t have to be related to ACID transactions. But I think the common view and use of it is for ensuring distributed ACID behavior. Looking back at my experience with XA and COM+ transactions, it seems it does a good job at achieving this ACIDness.


Mark This is an education issue. The literature is clear on this. People who know and understand transactional protocols don't make the mistake of equating 2PC to ACID properties.

Arnon Yes, it is an educational issue. But I am not sure that it is that common knowledge. It is expected that middleware vendors who build the tools to support these protocols understand it better - I don’t think it is that widely known outside these circles. Most of the architects I’ve met don’t (maybe it’s time to look for new friends ;-))

By the way, as 2PC is not resilient to failures of the coordinator, in a highly distributed environment like SOA it might have been a better idea to go with Paxos commit, if you go down that path at all.

Mark The reason WS-AT and WS-ACID chose 2PC is: interoperability. All TP monitors support it. Try getting IBM, MSFT, Oracle, BEA etc. to change to Paxos, 3PC, flat-commit, or anything else and you'll be waiting for the heat death of the universe.

Arnon Can’t argue with that

Mark [also] 2PC is resilient to failures if the coordinator eventually recovers. Paxos has its own failure assumptions too: Jim never disputed this. Same as 3PC and other consensus protocols. As with *any* fault tolerance approach (transactions, recovery blocks, replication, etc) it's always probabilistic. All we're doing is making it highly unlikely that the system cannot complete, but we can never make it entirely safe. Even in the airline industry they can "only" go to a probability that failures happen .000000001 ;-)

Arnon You are probably right that in SOA situations the chances of not getting an ACID transaction are worse than in a controlled environment - which actually makes the situation even worse, since people using WS-AT perceive it as allowing them ACID interaction (e.g. Juval's podcast).


Mark WS-AT is *all* about ACID, in the same way WS-ACID is about ACID transactions. It is *nothing* to do with SOA though. Web Services are not purely the domain of SOA implementations!

Arnon I totally agree that Web-services and SOA are not directly related and can each exist independently of each other. Again this is an educational issue but,  SOA==Web-services is a very common misconception (I guess the word “service” in web-service doesn’t help ;-) )

[in any event] I think distributed transactions in general should be used carefully, period.


Mark Absolutely. They are not a global panacea and people who push them as such do more harm than good.

Arnon WS-AT is more problematic than regular distributed transactions, as by definition in an SOA you do not know which, or how many, other services will participate in your transactions, so you are much more likely to run into problems.

Sagas, which embrace the temporal shift, don't give an illusion of ACIDness and allow you to focus on achieving distributed consensus while keeping all parties involved consistent. I think that it is a much better option if you need transaction-like behavior.

Mark For SOA, yes. Although Sagas are only good for a certain type of use case. That's why we've always tried to develop "live documents" that allow people to add new models when/if needed. With a couple of exceptions during the BTP days, there has always been consensus that one size does not fit all (http://www.webservices.org/weblog/mark_little/blackadder_and_the_micro_kernel_approach_to_web_services_transactions).

Web Services *anything*, whether it's WS-AT, WS-Sec, or WS-Addressing all have their non-SOA aspects because Web Services aren't developed purely with SOA in mind. If that were to happen then Web Services as a technology would lose some of their important benefits immediately.

Arnon This whole discussion is in the context of SOA (at least from my side ) – naturally there’s a place for ACID transactions for other uses.

Regarding Sagas - calling them "Extended transactions that are not ACID" is just semantics - my point was that they are not ACID transactions. I think most people equate transactions with ACID transactions as well (but I may be wrong).

Mark Many people do and that again is an education problem. The term Extended Transactions (don't need to say "that are not ACID") has a well defined meaning in the R&D community. There have been many good models and implementations around Extended Transactions. They really took off in the vendor community through the Additional Structuring Mechanisms for the OTS, back in the 1990s. If you check that out you'll see that it formed the basis of WS-TX and WS-CAF. Even in Jim's original technical report he discussed relaxing all of the ACID properties in a controlled manner to get more flexibility. That was the first extended transaction. In fact, ACID transactions are just one type of extended transaction. There are many many others, including nested transactions, coloured actions, epsilon transactions, sagas etc.

Unless I qualify it beforehand, I try to never use the term "transaction" in isolation because it has different meanings to different people. For example, when talking to developers working in trading infrastructures, a "transaction" isn't an ACID transaction at all. In telcos it's different again.
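
For readers who haven't met sagas before, here is a minimal Python sketch of the idea discussed in the conversation above: each step commits locally, and a failure triggers compensating actions instead of a rollback. The run_saga function and the (action, compensation) pairing are my own simplification for illustration, not WS-BA, WS-CAF or any product's API.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    Each action commits locally on its own. If one fails, the compensations
    of the already completed steps run in reverse order.
    Returns True when every step completed, False when the saga compensated.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True
```

Note that between a step and its compensation the intermediate state is visible to everyone, so there is no isolation and no illusion of ACIDness. A real implementation would also have to persist its progress and cope with compensations that fail, which this sketch ignores.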


 
Tags: Everything | SOA | Software Architecture

Evan H asked a question about distributed transactions and services in the MSDN architecture forum:

Are distributed transactions (ie.. WS-Transaction) a violation of the "Autonomous" tenant of service orientation?   Yes or No and Why?  Kudos if you can address concurrency and scalability (in an enterprise with multiple interacting services).

I answered this question back in April when I wrote a couple of posts that explained why cross-service transactions are a bad idea: cross service transactions and some more thoughts on cross service transactions.
Roger Sessions also agrees with this view (well, actually, it seems he wrote about it well before I did :) ):
When the WS-Transaction specification was first proposed, back in 2002, I wrote an article explaining why I thought the idea of allowing true transactions to span services was a bad idea. I published the article in The ObjectWatch Newsletter, #41: http://www.objectwatch.com/newsletters/issue_41.htm. Nothing since then has changed my mind. Atomic transactions require holding locks, and spanning transactions across services requires allowing a foreign, untrusted service to determine how long you will hold your very precious database locks. Bad idea. Just because IBM and Microsoft agreed on something doesn't make it good!

The reason I am bringing this issue back is that Juval Lowy (who wrote the article that triggered my first post on the subject) has recorded an ARCast with Ron Jacobs, where he reiterated the idea that "Transactions is categorically the only viable programming model" and that you should strive to use it whenever you can. It seems Juval admits you sometimes need to use Sagas (which he called "long running transactions" - you can see in my link why I think that's a wrong name). He also agrees that you can use a transactional transport and then only do internal transactions from each service to the transport (a pattern I call "Transactional Service"). However, at the end of the day, he still thinks you should use WS-AtomicTransaction whenever you can.

I agree that transactional programming is important. I think it is the simplest programming model (from the developer's side). I would probably never write an interaction with a database that is not transactional, and I look very favorably at initiatives for in-memory ACI (no Durability) transactions such as the one Ralf talks about. That is, until we get to distributed transactions...

First, we should note that transactions are not "the only viable" option. As Martin Fowler notes, eBay seems to be doing fine without distributed transactions. Not only that, they abandoned distributed transactions and went "transactionless" because they needed one simple thing... scalable performance.

In most COM+ scenarios you have a single server or a few internal servers where the distributed transactions happen - and even there you should plan your transactions carefully if you want to get any kind of decent performance. In SOA scenarios the situation is more complicated, as the distribution level is expected to be higher (even if you don't involve services from other companies). More distribution means longer times to complete transactions (especially if a participant can flow the transaction and extend it). It also means increasing the chances of failure (see Steve Jones's series of posts on five nines for SOA). In my opinion, the more distributed components you have, the more you want their interaction to be decoupled in time - i.e. the opposite of transactions.
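
A back-of-the-envelope Python sketch of the "more participants, more failures" point. It assumes failures are independent and that every participant has the same reliability, both of which are simplifications, and the 0.99 figure is only illustrative.

```python
def chance_all_succeed(per_participant_reliability, participants):
    """Probability that every participant is up, assuming independent failures."""
    return per_participant_reliability ** participants


# Illustrative run: the same 0.99 reliability spread over more participants.
for n in (2, 5, 10, 20):
    print(n, round(chance_all_succeed(0.99, n), 3))
```

With a synchronous distributed transaction every participant has to be available at the same moment, so the product above is the best case. Decoupling the interaction in time lets each service retry on its own schedule.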

Juval also said he doesn't buy the denial of service problem I mentioned (supporting a transaction means you allow locks - if an external party doesn't commit you retain the lock...). Juval said he assumes that a solution has both authentication and authorization so this shouldn't be an issue. For one, I have seen too many projects where security was something that was neglected or quickly patched in at the last moment - so I would hardly assume security. Even with security on, you increase your attack surface.
But that's just the half of it. Even if all your service consumers have good intentions, you still don't know anything about their code. SOA is not like the "good old days" where you owned the whole application - this means you cannot trust their security to be ample. Also you don't know anything about their code quality. Services are likely (in the general case) to be deployed on different machines, even if they start co-located. I think that a service boundary should be treated as a trust boundary, just like a tier boundary. I strongly believe you should have reduced assumptions on what's on the other side of the service's boundary - transactions are not reduced assumptions.
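
To make the locking concern concrete, here is a minimal Python sketch of one two-phase commit round. The Participant class and the two_phase_commit function are made up for illustration (this is not WS-AT or any real transaction manager's API), and the "lock" is just a flag.

```python
class Participant:
    """Hypothetical resource manager that votes, then commits or rolls back."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.locked = False
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote. A "yes" vote keeps the lock until the outcome arrives.
        self.locked = self.can_commit
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state, self.locked = "committed", False

    def rollback(self):
        self.state, self.locked = "aborted", False


def two_phase_commit(participants):
    """Return True if every participant committed, False if all rolled back."""
    votes = [p.prepare() for p in participants]   # phase 1: collect all votes
    # If the coordinator stalled here, prepared participants would keep
    # their locks until it came back and announced the outcome.
    if all(votes):
        for p in participants:                    # phase 2: commit everywhere
            p.commit()
        return True
    for p in participants:                        # phase 2: abort everywhere
        p.rollback()
    return False
```

The window between prepare and the final outcome is the problem: how long your service holds its locks is decided by whoever runs the coordinator, and that party sits on the other side of your trust boundary.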

SOA and distributed transactions do not go hand in hand - it isn't just autonomy at stake here. It is a problem for performance, scalability and even security, period.

To finish this post - I would also highly recommend looking at Pat Helland's paper "Life Beyond Distributed Transactions: an Apostate's Opinion" and a post he recently made called  "SOA and Newton's Universe", where he explains more eloquently than I ever could why SOA is not a good fit for distributed transactions.



 
Tags: .NET | Everything | SOA | SOA Patterns | Software Architecture

July 9, 2007
@ 10:55 PM
Steve Jones has (yet another) great post called "Le Tour SOA - why support services are critical, but not important".
You should go read the article - but in a nutshell, Steve explains that important services are the ones that bring business value and critical services are the supporting ones that help keep the lights on so the important services can function properly.

While the post has SOA in the title, I think it is more general and also applies to applications or any other IT-generated components. In fact, it can also apply to IT itself, as Nicholas Carr noted in 2003 when he published his paper "IT doesn't matter". Nicholas argues that IT will become akin to electricity and as such be critical for the business to continue operating, but not important. As a side note, I think this might be true for traditional businesses but not for businesses where IT is the business (such as banks, insurance companies, etc.).

Back to critical vs. important - I think it is important for architects to make this distinction so they can prioritize work and not confuse business value with a semblance of business value that comes from criticality for operations. This doesn't mean you can neglect critical tasks (after all they are critical...), but it is the important stuff that will bring your business the competitive edge.


 
Tags: Everything | SOA | Software Architecture

June 28, 2007
@ 03:23 PM
You raise an event when something interesting happens to you. You think it is important, but you don't care enough to know who is interested, and you are even less interested in personally going to each and every interested party and letting them know. So instead you raise an event and let the poor buggers take care of any implications by themselves. We raise the event "now", when the change happened - it is only important now anyway...


Looking from the point of view of the "poor buggers" - the event consumers - things are more complicated. There are events which are cyclic in nature, like stock price updates, the blips from a sonar etc. If you missed one, it isn't really important, since you'd get the right information in the next update (actually, that isn't entirely true - see later in this post). Then there are the events which only occur once. Sometimes it isn't important for you to listen to them if you are not up and running at the same time. Other times you can't afford to lose an event. For instance, if your ordering service (or component, for that matter) communicates with the invoicing one using events, you don't want to miss the event of a new order, or else you would lose money.

This basically means that the event producer and the event consumer are coupled in time. One way to solve this is to make sure both of these services are available at the same time, i.e. if the invoicing service crashed, then processing orders should be suspended (note that this doesn't mean that you don't accept orders, just that you don't process them).
Ok - maybe we can just raise the event "transactionally". This would probably work, but we need to remember that the event producer doesn't really care about the event consumers, so why would it want to fail because of them?!
Maybe a better way would be to "raise" the event over some reliable transport. This has a few problems. One is that we've just moved the problem to the connection between the event producer and the transport. It might be acceptable to have a transaction between the event producer and the transport. However, as I've already said, the producer doesn't care much about the consumers...
We can have persistent subscriptions for existing consumers to prevent events from getting lost. This creates a minor problem, in that new consumers can't see past events, and it also carries the risk that existing subscribers disappear and their queues then grow endlessly (or until an administrator removes the subscription).

Ok, let's try to look at the problem from a different angle. Looking at the events, what we can really see is that an event has a time-to-live (TTL) as far as the event consumer is concerned. For instance, in the case of the cyclic events the TTL is the interval until the next event. Actually, even with cyclic events the TTL might be larger if we are also interested in analyzing trends or abnormal occurrences (which is why I said it isn't entirely true that we don't care about old events). In the case of one-time events the TTL might be indefinite, or even then it might be some definite value (one day, week, year etc.). Since we can't know the TTL each consumer needs, it can be a good idea to make past events available somehow.

Thus, when you design an event-centric architecture like EDA (whether on top of SOA or not), it is important to think about event consumers. We don't want to think about specific consumers, since that negates the benefits of thinking in events, but I would say that you want to think about event consumers in general. After all, your component is also an event consumer (do unto others as you would have them do unto you).

One option, which I already talked about, is to make past events available as a feed. Event consumers can then come at their own leisure and consume past events (this can be in addition to raising the events in real time). This provides a partial solution, as the maximal TTL is determined by the event producer (after which the event is deleted from the feed). This may be acceptable, but you must be aware of it.
The other option is to log all the events and provide an API to retrieve past events. In a sense the max TTL is still in the hands of the event producer, only if you use a database it would probably be much longer than with a feed. Alternatively, the events can be logged by a central "always present" event aggregator (in a manner similar to the aggregated reporting pattern I described for SOA).
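
To make the two options a little more concrete, here is a minimal Python sketch of a producer-side event log with a maximum TTL and a small retrieval API. Everything in it (`Event`, `EventLog`, `publish`, `events_since`, the sequence numbers, and passing `now` explicitly so the example is deterministic) is an assumption made for illustration, not a reference to any particular product.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class Event:
    sequence: int      # position in the log, assigned by the producer
    timestamp: float   # when the event was raised, in seconds
    payload: Any


class EventLog:
    """Producer-side log of past events; the producer decides the maximum TTL."""

    def __init__(self, max_ttl: float) -> None:
        self.max_ttl = max_ttl
        self._events: List[Event] = []
        self._next_sequence = 1

    def publish(self, payload: Any, now: float) -> Event:
        event = Event(self._next_sequence, now, payload)
        self._next_sequence += 1
        self._events.append(event)
        return event

    def expire(self, now: float) -> None:
        # Events older than max_ttl are gone for every consumer.
        self._events = [e for e in self._events if now - e.timestamp <= self.max_ttl]

    def events_since(self, last_seen: int, now: float) -> List[Event]:
        """Return the still-available events after the consumer's last seen sequence."""
        self.expire(now)
        return [e for e in self._events if e.sequence > last_seen]


log = EventLog(max_ttl=60.0)
log.publish({"order": 1}, now=0.0)
log.publish({"order": 2}, now=30.0)
log.publish({"order": 3}, now=70.0)

# A consumer that was down from the start only gets what the producer still keeps.
late_consumer = log.events_since(last_seen=0, now=80.0)
# A consumer that already saw event 2 only needs the newest one.
caught_up_consumer = log.events_since(last_seen=2, now=80.0)
```

Note that the late consumer never sees the first order: the producer's TTL, not the consumer's, decided that. That is exactly the limitation you have to be aware of.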

To sum all this up - events seem to matter only at the instant in time they are created. We are used to that thinking from building OO systems, where all the components are co-located in the same address space and time (even there I can think of scenarios where we would want past events). In a distributed world events need to have a TTL, the TTLs can vary, and they are determined by the event consumers. Lastly, as I demonstrated in the paragraphs above, there are several strategies we can use to help solve the event TTL dilemma (and there are probably a few others).


 
Tags: Everything | SOA | Software Architecture

June 24, 2007
@ 09:20 AM
A few months ago I wrote here about solving the mismatch between Service Oriented Architecture (SOA) and Business Intelligence (BI) (see papers and articles section). Recently I got the following question from Ben:
One major question I have is around large data sets. As an experienced BI/DW architect and developer I have worked on a number of large scale data warehouses. Retrieving large data sets (i.e. millions of records) doesn't seem to fit well into SOA. As you state in your article, we could have another point-to-point interface, where the service which houses data we need gets a request and writes out a batch file (xml or plain ascii text). Then using typical ETL, we grab the file and load it. The underlying source system (service) can use optimization in generating a large data set (vs. record by record) and the data warehouse can correspondingly load in bulk.
Like most architectural questions - the answer is "it depends".
For instance, if you do a run-of-the-mill ETL as a one-time setup, then it is just that - a one-time setup - and I, personally, don't see any contradiction between that and SOA goals or tenets.

I do think that it is better to enhance SOA with EDA interactions to provide a long-term solution to the BI problem. You can also have a dedicated component that aggregates the information that flows in with these events and builds batch files suited for the ETL you used during the setup phase (mentioned above).
It is true that moving an SOA which is already in place to EDA is no small feat, but adding EDA layers does not have to mean that the old interfaces go away - especially not immediately (remember to treat services as products).

If you have a business that generates millions of records on a daily basis, then the situation is more complicated. Now you have to think about the trade-offs between "compromising" SOA by adding a dedicated interface (or a backdoor to the database) for the ETL vs. the implications of performance, bandwidth, transition costs, ROI etc. of pushing that information with EDA.
I personally believe in pragmatism and the "no-silver-bullet" approach, so I can't say that EDA is always the best solution (as an aside, this is part of the reason I write my book as patterns and not as "best-practices guidance"). You may find that ETL is the best trade-off in your situation. Yes, I know that it isn't a definitive answer - but real life is (usually) a little more complicated than black and white solutions. As architects we need to find the best trade-off for the situation at hand.


 
Tags: Everything | SOA | Software Architecture | BI

I thought I had this RESTful web services thing figured out, but following one of the threads on the Yahoo group on Service-Oriented-Architecture I came to the conclusion that maybe I don't.

Steve Jones tried to see if he understands REST by giving an example and that example was corrected by Anne Thomas Manes (who is a research director with the Burton Group which recently stated that the future of SOA is REST).
Here are the examples from the above mentioned thread:
POST http://example.org/customer
HTTP message body contains a representation of "anne"
server creates a subordinate resource called http://example.org/customer/anne

GET http://example.org/customer/anne
returns a representation of "anne"

GET http://example.org/customer/personByName?name=anne
returns a representation of "anne"
or perhaps returns the URI of the "anne" resource
or perhaps returns a list of URIs of all people named "anne"
might also be specified more simply as
GET http://example.org/customer?name=anne

GET http://example.org/customer/personByAge?age=27
returns a list of URIs of people whose age is 27
or perhaps returns a collection of representations of all people aged 27
might also be specified more simply as
GET http://example.org/customer?age=27

PUT http://example.org/customer/anne
HTTP message body contains a representation of "anne"
either creates a new resource called "anne" (if none exists)
or replaces the existing "anne" resource

PUT http://example.org/company/newco
HTTP message body contains a representation of "newco"
either creates a new resource called "newco" (if none exists)
or replaces the existing "newco" resource

If you prefer the server to assign the URI you would instead say

POST http://example.org/company
HTTP message body contains a representation of "newco"
server creates a subordinate resource called http://example.org/company/newco

POST http://example.org/customer/anne?addCompany=http://example.org/company/newco
this would append the newco company reference to the "anne" resource

You can see another example of what I am talking about on Jon Udell's blog, which gives an example from RESTful Web Services, by Leonard Richardson and Sam Ruby, covering how to do a transaction in RESTful style.

If all these are indeed "legal" or "correct" RESTful interactions, I have two observations to make.
First, I guess Pat Helland is right when he said "Every noun can be verbed", since I don't see the real difference between having a contract with a PersonsByAge request which returns a document* of Persons and a REST request like "GET http://example.org/customer/personByAge?age=27" or even "GET http://example.org/customer?age=27".

The second observation has to do with the so-called "uniform interface". I would argue that the resources and their attributes (age=27, name="anne") are the interface. The POST, GET etc. uniform interface does not mean much more than the "uniform" SEND, BROADCAST interface of messaging.
Furthermore, if resources and their attributes are indeed "the interface", then not only does REST not have a uniform contract - it actually has a dynamic one which changes at run-time as new resources are created, such as the "POST http://example.org/company" which creates a new resource "http://example.org/company/newco" in the example above.







* I think it is very important for SOA to have document-oriented messages and not RPC ones. I'll blog in a separate post about the differences. For now, suffice it to say that the REST hypermedia notion of returning the URIs of all the relevant persons should also be present (one way or another) in a good document-oriented message, even if you are using WS-* or plain messaging as transport.
 
Tags: Everything | SOA | Software Architecture

In addition to the drafts of selected patterns I publish on my site, you can now purchase my book via the Manning Early Access Program (MEAP).
MEAP means you can get chapter drafts as I write them and the complete book when it's done (ebook or printed). Here is Manning's explanation:
"Buy now through MEAP (Manning Early Access Program) and get early access to the book, chapter by chapter, as soon as they become available. You choose the format - PDF or ThoutReader - or both. By subscribing to MEAP chapters, you get an opportunity to participate in the most sensitive, final piece of the publishing cycle by offering feedback to the author. Reader feedback to the author is welcome in the Author Online forum. As new chapters are released, announcements are made in the MEAP Announcement Forum. After all chapters are released, you will be able to download the complete edited ebook. If you order the print edition, we will ship it to you upon release, direct from the bindery, weeks before it is widely available elsewhere."
By the way, this is probably also a good time to mention that I'll be speaking about quite a few of the patterns at Architecture & Design World 2007, which will take place this July.

There is still a lot of work, but I'd already like to thank all the people at Manning who helped me get this far, especially Cynthia Kane, my editor (hey, maybe now she'll give me more slack :) )
Ok, 'nuff blubbering, back to completing chapter 5...


 
Tags: .NET | Everything | SOA | SOA Patterns | Software Architecture

June 6, 2007
@ 10:02 AM
While I am on the topic of REST, it is probably a good time to comment on my (first) post on InfoQ "Debate: Does REST need a Description Language"

Personally, I think there's merit in Services publishing their message structures in a machine-readable format. When a Service has a machine-readable contract, generated stubs allow you to make the interaction with fewer bugs than hand-crafted interactions. It also makes it easier to test the service itself.

I do agree with Stefan's views on runtime interface dependency, where he said that if a service consumer needs just 20% of the information in a service it shouldn't be forced to deserialize (i.e. know or care about) the whole message. However, I think this is a weakness of tooling, not of the concept. What if you had a tool that reads the machine-readable contract, allows you to pick the 20% you need, and generates a stub for you that "hand picks" that 20% and ignores the other 80%? This is what you would do yourself anyway, and since the code is generated from the Service's definition it would be more resilient and less error-prone. This is effectively deriving a personalized mini-contract from the published general one. It does mean that when that 20% changes you will be affected, but this is something you'd have anyway.
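
A rough Python sketch of what such a stub could boil down to is shown below. It is not any real tool's output: `make_partial_reader`, the dotted paths and the JSON message shape are all assumptions made for the example, with the list of paths standing in for "the 20% you picked" from the published contract.

```python
import json
from typing import Any, Callable, Dict, Iterable


def make_partial_reader(paths: Iterable[str]) -> Callable[[str], Dict[str, Any]]:
    """Build a reader that extracts only the given dotted paths from a message."""
    wanted = [path.split(".") for path in paths]

    def read(raw_message: str) -> Dict[str, Any]:
        document = json.loads(raw_message)
        result: Dict[str, Any] = {}
        for parts in wanted:
            node: Any = document
            for part in parts:
                # A KeyError here means the part of the contract we depend on changed.
                node = node[part]
            result[".".join(parts)] = node
        return result

    return read


# The consumer only cares about two fields of the (hypothetical) order message.
read_order = make_partial_reader(["order.id", "order.total"])

message_v1 = json.dumps({"order": {"id": 17, "total": 99.5, "notes": "gift wrap"}})
# A later version adds fields the consumer never asked for.
message_v2 = json.dumps(
    {"order": {"id": 18, "total": 10.0, "notes": "", "channel": "web"}, "audit": {"by": "svc"}}
)

order_v1 = read_order(message_v1)
order_v2 = read_order(message_v2)
```

The consumer keeps working when the other 80% of the message grows or changes, and breaks only when one of the fields it actually picked goes away.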

I also agree that the WS-* standards and the resulting contracts are complicated (and getting more so). Much of this can probably be attributed to the "design by committee" effect. However, there are also some real challenges that the SOA and ROA architectural styles do not address, and we still need to solve those. Trying to solve these challenges is, by the way, what prompted me to write my SOA patterns book...


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

Today DevHawk (Harry Pierson) raised a question I have been toying with myself for a while now: if REST is an architectural style, can it exist without the specific technologies that define it today? Or as Harry put it:
  1. REST is an "architectural style for distributed hypermedia systems".
  2. REST "has been used to guide the design and development" of HTTP and URI.
  3. Therefore REST as an architectural style is independent of HTTP and URI.
  4. Yet, I get the feeling that the REST community would consider a solution that uses the REST architectural style but not HTTP and/or URI as "not RESTful".
What I had in mind, for example, is to use messaging where the equivalent of the URI would be a topic hierarchy.
A topic hierarchy allows you to have a unique "URI" for each resource.
The next thing we need to take care of is the PUT, GET, POST and DELETE verbs - we can do that by making the verbs part of the message headers.
As an aside, I'll also say that if we try to think about it as an architectural constraint then we don't necessarily have to use these verbs. A more general rule would say that the verbs are uniform and well known, rather than these specific ones.
The rest (no pun intended) of the concerns, like specifying related states etc., can be dealt with by establishing conventions on the message formats.
Is that still REST?! I wonder...
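
To make the thought experiment tangible, here is a small in-memory Python sketch. It assumes a message is just a dictionary with a `topic`, a `headers` map carrying the verb, and a `body`. None of this is a real messaging API; the `ResourceEndpoint` class and the status values are invented for the illustration.

```python
from typing import Any, Dict

UNIFORM_VERBS = {"GET", "PUT", "POST", "DELETE"}


class ResourceEndpoint:
    """Handles messages whose topic names a resource and whose header carries the verb."""

    def __init__(self) -> None:
        self._resources: Dict[str, Any] = {}

    def handle(self, message: Dict[str, Any]) -> Dict[str, Any]:
        topic = message["topic"]            # plays the role of the URI
        verb = message["headers"]["verb"]   # uniform, well-known verb
        if verb not in UNIFORM_VERBS:
            return {"status": "unsupported-verb"}
        if verb == "PUT":
            self._resources[topic] = message["body"]
            return {"status": "ok", "topic": topic}
        if verb == "POST":
            # Create a subordinate resource; the endpoint assigns the topic.
            child = f"{topic}/{message['body']['name']}"
            self._resources[child] = message["body"]
            return {"status": "created", "topic": child}
        if topic not in self._resources:
            return {"status": "not-found"}
        if verb == "GET":
            return {"status": "ok", "topic": topic, "body": self._resources[topic]}
        del self._resources[topic]          # DELETE
        return {"status": "ok", "topic": topic}


endpoint = ResourceEndpoint()
created = endpoint.handle(
    {"topic": "customer", "headers": {"verb": "POST"}, "body": {"name": "anne", "age": 27}}
)
fetched = endpoint.handle({"topic": "customer/anne", "headers": {"verb": "GET"}, "body": None})
deleted = endpoint.handle({"topic": "customer/anne", "headers": {"verb": "DELETE"}, "body": None})
missing = endpoint.handle({"topic": "customer/anne", "headers": {"verb": "GET"}, "body": None})
```

The sketch covers only resource identification and the uniform verbs. Hypermedia and related states would still have to come from message-format conventions.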

In any event, what worries me the most in regard to REST is the religious manner in which some people seem to treat it. By the way, that is the same phenomenon we see with some of the Agile folks. As for me? Well, I don't really care if I fit this label or the other. I am just paid to deliver working and viable software :), but hey, that's another discussion.


 
Tags: Everything | SOA | Software Architecture

Yesterday I attended an SOA governance presentation by Brent Carlson. The presentation was basically an updated version of an article he authored in 2006, "SOA Governance Best Practices - Architectural, Organizational and SDLC implications". As a tool vendor, Brent has a lot of focus on the governance processes, which I don't completely agree with (I prefer Jim Coplien's organizational patterns approach - see my post from last week). I also think the reuse figures he cites (registration required) are a little optimistic for what I consider the right granularity for services.
He also made a few points I strongly agree with:
  • Brent talked about the difference between the needs of a run-time service repository (e.g. UDDI or an ESB) and a development-time one. You need to address the services and their interactions during development, and you need to do that in a way that is easy for the development teams. For example, one thing you want to log is usage: knowing who is using the services lets you perform impact analysis when you have to make a change.
  • Building an SOA for an organization is an iterative process, not a "big-bang" effort. This means you can't do just top-down design. You need to be pragmatic and also roll out working services.
The reason for this post, however, is the insight Brent gave regarding treating services as products rather than applications.

Treating services as products is important because, even if you don't believe that the SOA initiative should be an iterative process, once the move is finished you will have quite a few services deployed in your organization. These services will integrate and interact with other services - some of which are outside of your organization. You will also want to capitalize on the flexibility claim that SOA makes and adapt your services to the changing business needs.
The challenges you face regarding updating and upgrading functionality, anticipating consumers' needs, allowing consumers to get used to changes etc. are exactly the challenges that product management techniques and principles come to answer.

Treating services as products means a lot of things. Let's look at a few examples. For one, it means predictable release cycles: services, like products, get updated over time, and you want service users to be able to cope with these changes. Predictable release cycles mean they can get organized in advance. Another aspect is the emphasis on backward compatibility, e.g. orderly deprecation of features and version management. One other thing is introducing a "product manager": someone whose responsibility is to interact with customers and potential customers, understand their needs and build a release road map for the services.

You might be used to doing some of that with applications, but thinking about services as products makes all this more explicit, and that in itself is also important.


 
Tags: Everything | General | SOA

Udi Dahan writes that ".NET/Java Interop is not a reason for SOA". Udi writes that companies that  need to integrate two technologies turn to web-services and that
"The only problem is that in order for things to work right, they really must have a chatty interface, and flow transaction context between these “services”, and all the other things I describe as anti-patterns"

Udi is right that if you don't rethink and remodel your systems you will (probably) not have an SOA, as you are likely to find yourself implementing anti-patterns such as the ones he mentions.

However, using web-services does not automatically mean that you are doing SOA. If you aren't thinking about moving to SOA, you can still opt to use web-services as a remoting or RPC technology to connect two systems. The advantage over the proprietary products Udi mentions is that web-services are a standard technology. Whether this works well or fails is orthogonal to the technology choice. It depends on the architectures of the systems you integrate. If you need to flow transactions between the systems, you'd need that even if you cross-compiled one of the applications into the other environment.

Another thing I don't agree with is the word "must" that Udi uses. First, while it is likely that older systems have chatty interfaces, it is not a must. The designers of the legacy system may have thought about the consequences of distribution without regard to SOA. Also, you can still wrap an existing system with a service contract (using web-services or any other technology) and not end up with chatty interfaces etc. However, that means that the wrapper should have some substance or business logic inside it to mask the old system's behavior. This is especially important if you are thinking about moving to SOA and you take into consideration that the business will not just halt and wait until you are done. You have to think about interim solutions. Such interim solutions can include wrapping a legacy system with an Edge Component and an SOA facade (a pattern I call Legacy Bridge) while you move toward a full-blown SOA.
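
As a minimal Python sketch of "a wrapper with some substance", consider the following. It is only an illustration of the general idea of a coarse-grained facade over a chatty system, not the Legacy Bridge pattern itself. `LegacyCustomerSystem`, its field-at-a-time calls, `CustomerServiceFacade` and the validation rule are all hypothetical.

```python
from typing import Any, Dict


class LegacyCustomerSystem:
    """Hypothetical stand-in for an old system with a chatty, field-at-a-time interface."""

    def __init__(self) -> None:
        self.records: Dict[str, Dict[str, Any]] = {}
        self.calls = 0  # counts every round trip into the legacy system

    def create_record(self, customer_id: str) -> None:
        self.calls += 1
        self.records[customer_id] = {}

    def set_field(self, customer_id: str, field: str, value: Any) -> None:
        self.calls += 1
        self.records[customer_id][field] = value


class CustomerServiceFacade:
    """Coarse-grained contract: consumers send one complete document."""

    REQUIRED = ("name", "address")

    def __init__(self, legacy: LegacyCustomerSystem) -> None:
        self._legacy = legacy

    def register_customer(self, document: Dict[str, Any]) -> Dict[str, Any]:
        # Business logic lives in the wrapper: reject incomplete documents
        # before any chatty call reaches the legacy system.
        missing = [field for field in self.REQUIRED if field not in document]
        if missing:
            return {"accepted": False, "missing": missing}
        customer_id = document["name"].lower().replace(" ", "-")
        # The chattiness stays behind the facade, close to the legacy system.
        self._legacy.create_record(customer_id)
        for field, value in document.items():
            self._legacy.set_field(customer_id, field, value)
        return {"accepted": True, "customer_id": customer_id}


legacy = LegacyCustomerSystem()
facade = CustomerServiceFacade(legacy)
accepted = facade.register_customer({"name": "Anne Smith", "address": "1 Main St"})
rejected = facade.register_customer({"name": "Bob"})
```

The consumer makes one call per business task. The many small calls still happen, but only between the wrapper and the legacy system, where they can be co-located.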


 
Tags: .NET | Everything | SOA | SOA Patterns | Software Architecture

May 23, 2007
@ 11:06 PM

I just read Shy Cohen's (via Nick Malik) article in Microsoft's Architecture Journal entitled "Ontology and Taxonomy of Services in a Service-Oriented Architecture". Shy provides a list of what he calls service types. He identifies two major types: bus services and application services. He then goes on to sub-divide them: the bus services are divided into communication and utility services, and the application services are divided into entity, capability, activity and process services. I have to say it was quite alarming to see this coming from someone who had deep involvement in defining Windows Communication Foundation (Indigo).

Where do I start?

Well, for one, it seems to completely fail to make the distinction between services as in Service Oriented Architecture and services as in capabilities or features an infrastructure provides. The "communication services" are for the most part capabilities that a service infrastructure (such as an ESB) provides, not services you would define in an SOA initiative. And then there's the matter of service granularity and the difference between remote objects and SOA. For instance, take the example Shy gives for a "method" (his word) on a Customer service (entity service):

"An example of a domain-specific operation is a customers service that exposes a method called FindCustomerByLocation that can locate a customer's ID given the customer's address"
Why would a service return a customer ID? This is the kind of call you would make on an object you hold a reference to, not on some remote "something" that also wants to authorize your call and may reside in a different company. This kind of thinking is what made remote objects fail. Gregor Hohpe explained that nicely in a paper called "Developing software in a Service Oriented World":
The Transparency Illusion.  Distributed components promised to hide remote communication from the developer by making the remoteness "transparent". While the basic syntactic interaction between remote components can be wrapped inside a proxy object, it turned out that dealing with partial failures, latency, and remote exceptions could not be hidden from the developer. It turned out that 90% transparency was actually worse than no transparency because it gave developers a false sense of comfort.
As a side note, Gregor recently gave a presentation that covers this paper at JavaZone which you can watch online at InfoQ

Returning to Shy's article let's take a look at another quote:
Capability Service may flow an atomic transaction in which it is included to the Entity Services that it uses. Capability Services are also used to implement a Reservation Pattern over Entity Services that do not support that pattern, and to a much lesser extent over other Capability Services that do not support that pattern.
I already explained why cross-service transactions, and especially flowing transactions, are not a good idea in SOA, so I won't do it again here - but you can read about it both here ("Transactions between Services? No, No, No!") and here ("Cross-Service Transactions"). Also, I truly hope Shy didn't mean .NET data sets when he said "In some cases, typically for convenience reasons, Entity Service implementers choose to expose the underlying data as data sets rather than strongly-schematized XML data. Even though data sets are not entities in the strict sense, those services are still considered Entity Services for classification purposes." In any event, the whole decomposition of services into fine-grained "capability", "activity" and "process" services takes no account of the fact that SOA is a distributed architecture... maybe Microsoft is not affected by the fallacies of distributed computing?
*ad nauseam (latin)- to the point of disgust


 
Tags: .NET | Everything | SOA | Software Architecture

May 15, 2007
@ 08:16 AM
Pat Helland is back at Microsoft (after a two-year vacation at Amazon) and, more importantly, he has also restarted blogging. I have only met him in person a few times, but he is definitely one of the few people really worth listening to - especially when it comes to distributed computing. Not only does he make interesting observations, he is also capable of explaining them in a crisp and interesting manner. Indeed, it didn't take too long (his second post) before he blogged some valuable content. The post is called Memories, guesses and apologies (go read it).

Pat talks about how the notion of time in a distributed environment is subjective, how you can't really know what happened before what, and what we can do about it (I really think you should just go read it :) ).
Another related aspect of the phenomenon Pat mentioned is that, taking a snapshot in time, the chances of having a single unified truth in a distributed system degrade in proportion to the system's load. I had a chance to work on a few systems where some of the sites were either occasionally connected or connected over low-bandwidth networks. This situation makes the whole notion of guessing the state and compensating and/or apologizing for wrong conclusions much more explicit than in always-connected, high-bandwidth systems. Nevertheless, latency still exists even in connected systems, and you should really be wary of assuming a universal truth - unless you can stop the business long enough to allow complete synchronization.
As I mentioned a few days ago, we can't afford to have cross-service transactions (I also think we can't afford too many distributed transactions in non-SOA architectures, but this is especially true for SOA), which makes things even worse in this sense. One thing we can do in an SOA to achieve distributed consensus is to run a Saga. A Saga, which is a long-running conversation between services, is probably one of the most important interaction patterns for SOA.
You know what? Instead of trying to explain it here in haste, I'll just publish the pattern draft - I'll try to do that before the end of the week.
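
As a rough illustration only (this is not the pattern draft), here is a Python sketch of one common reading of a saga: run the steps of the conversation one by one and, if one fails, compensate the completed steps in reverse order instead of relying on a cross-service transaction. The `run_saga` function, the step tuples and the order/payment example are all assumptions made for the sketch.

```python
from typing import Callable, Dict, List, Tuple

# (step name, action, compensation) - all supplied by the caller
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    journal: List[str] = []
    completed: List[Tuple[str, Callable[[], None]]] = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            journal.append(f"{name}: failed")
            for done_name, undo in reversed(completed):
                undo()
                journal.append(f"{done_name}: compensated")
            return False, journal
        completed.append((name, compensate))
        journal.append(f"{name}: done")
    return True, journal


# Hypothetical services, reduced to local state for the example.
state: Dict[str, int] = {"stock": 5}


def reserve_stock() -> None:
    state["stock"] -= 1


def release_stock() -> None:
    state["stock"] += 1


def charge_card() -> None:
    raise RuntimeError("payment service unavailable")


def refund_card() -> None:
    pass  # nothing to refund in this example


succeeded, journal = run_saga(
    [
        ("reserve stock", reserve_stock, release_stock),
        ("charge card", charge_card, refund_card),
    ]
)
```

Between the reservation and its compensation other services can observe the intermediate state, which is exactly the "guess and apologize" world Pat describes.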






 
Tags: Everything | SOA | SOA Patterns | Software Architecture

May 7, 2007
@ 06:15 PM
In the article mentioned in the previous post, I talk about adding EDA to SOA and how you can use Complex Event Processing (CEP) to process the event streams, infer trends and enhance the understanding of what happens inside your business. All the tools I knew about were Java tools, but now I've found out (via Nauman Leghari's blog) that there's also a .NET CEP engine, and it is even open source. It is called NESper and, like many other tools, it is a port of a Java tool.
I am not sure how good it is - but at least I'll have an interesting evening today :)


 
Tags: .NET | Everything | SOA

April 30, 2007
@ 07:34 PM
An article I wrote on Business Intelligence (BI) and Service Oriented Architecture (SOA) has just been published on MSDN.
You can find it here http://msdn2.microsoft.com/en-us/library/bb419307.aspx.

The article explains the SOA & BI mismatch and how to bridge it by adding EDA to SOA. (I blogged about it here before, but the article is more ordered and complete.)


 
Tags: .NET | Everything | SOA | SOA Patterns | Software Architecture

April 20, 2007
@ 11:15 PM
I have seen the following question on one of the forums I follow:
"I have studied up on the SOA approach and it all sounds good.  But most articles stop at the theory.

Lets say I sell things.   I have a CustomerProfileService.   The application does CRUD through this service to a back end database.  Its autonomous and isolated.
I have anther service, InventoryItemProfileService.  Again, the application does CRUD through this service to a back end database. It is autonomous from the CustomerProfileService.  Not only may it live on a different DB from the CustomerProfileService, it might exist on a different platform.
Now lets get to the InvoiceService.  Lets say from the client side, I would guess that i would have a CreateInvoice(custID,itemID[] ) method.  The InvoiceService would then call out to the CustomerProfileService for profile that meets the needs of the invoice, then another call out to the InventoryItemProfileService for the item descriptions and such.

Here is the question.  It would seem like in the back end (the db) of the InvoiceService there would be tables to support the customer info and the item info from the invoice.  Where prior to SOA, when everything was in the same db, these requirements would be largely satisfied by joins.  Now a logical join across services just seems radically expensive (everytime you touch the invoice).  hence the need for the customer and item tables local to the invoice service.

Does this sound right?  Just how often does the InvoiceService have to go back to these other supporting services?"
I also got a comment with a similar theme on my Cross Service Transactions post.

I see a few problems with the way the services in the question are modeled (like the CRUDy interfaces), but in the end it all boils down to the root cause, and the real problem: the granularity of the services.

Sure, when "a service" is too small it doesn't make sense to separate its tables from those of other services. It doesn't make sense to have transactions that span only what's internal to the service. It doesn't make sense to pay the price of making a service autonomous (like caching reference data from other services). When the granularity is too small you will often find that you need a lot of interactions with other so-called services, and you are more likely to end up with CRUDy interfaces.
You are also more likely to have a slow-performing solution and to suffer from low availability.
Using services at the granularity mentioned above is, in my opinion, a nightmare. You would probably have to work very hard to keep the SOA principles in place or, the more likely option, you would circumvent the principles so that you can get something maintainable, usable and performing (and flip the bozo bit on this whole SOA thing).

So what is the right granularity? Well, it is not a one-size-fits-all kind of thing, but as a rule of thumb I would say anything just shy of a sub-system and up. A service has to have enough meat so that it would make sense to have it autonomous; that the transactions would fit nicely inside its boundaries; that it would be worthwhile making it highly available; that you can pass a complete task/document to it and it won't have to talk to a gazillion other services to complete processing it; etc.

If your application's idea of invoices is two tables, one with a header and one with invoice details, then don't make that a service. If invoicing is a sub-system with complex business rules, a lot of options and whatnot, then it can be a good candidate.
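To make the "pass a complete task/document" idea a bit more concrete, here is a minimal Python sketch. Everything in it (InvoicingService, CreateInvoiceRequest, the field names) is made up for illustration and is not taken from the forum question or from any real system. The only point is that the request carries everything invoicing needs, so the service can finish the job without calling out to other services.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class InvoiceLine:
    item_id: str
    description: str      # carried in the document, not looked up remotely later
    unit_price: Decimal
    quantity: int


@dataclass(frozen=True)
class CreateInvoiceRequest:
    """Hypothetical 'complete document' handed to the invoicing service."""
    customer_id: str
    billing_name: str     # the part of the customer profile invoicing cares about
    lines: tuple[InvoiceLine, ...]


class InvoicingService:
    """Hypothetical coarse-grained service: one message in, one invoice out."""

    def __init__(self) -> None:
        self._invoices: dict[int, dict] = {}

    def create_invoice(self, request: CreateInvoiceRequest) -> int:
        # No calls to a customer or inventory service are needed here,
        # because the request already contains the data to process it.
        total = sum(
            (line.unit_price * line.quantity for line in request.lines),
            Decimal("0"),
        )
        invoice_id = len(self._invoices) + 1
        self._invoices[invoice_id] = {"request": request, "total": total}
        return invoice_id

    def total_of(self, invoice_id: int) -> Decimal:
        return self._invoices[invoice_id]["total"]
```

Whether the caller or the invoicing service itself assembles that document is a separate design decision; the sketch only shows what "enough meat in one message" can look like.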

Think about it next time you design a service :)



 


 
Tags: Everything | SOA | Software Architecture

April 17, 2007
@ 01:58 PM

After seeing Juval Lowy's article on WCF transaction propagation in the May issue of MSDN magazine, I posted " Transactions Between Services? No, No, No! " in my DDJ blog. I got a few comments which I thought warranted a post in their own right.

The previous post was triggered by an article that promoted flowing transactions (i.e. you perform a transaction against one or two services and then one of the services calls an additional service and it joins the transaction). It is important to say that I think transactions between services should be discouraged regardless of whether extending them is automated. Transaction propagation just makes matters worse.


There might still be some edge case where you have to have an atomic transaction from a service consumer to the service. I think that in the vast majority of SOA implementations you shouldn't do that, and I would think real hard about the other options before allowing it in my architecture. In general I think cross-service transactions are an antipattern (and that's the way you'd find them documented in my SOA patterns book :) )

One of the comments I received began with:

"Cross service transactions are a sure way to introduce coupling and performance problems into your SOA." I'm not sure I agree with that thought. Logically speaking, cross service transactions are a must. The question is how to implement them. There are two mechanisms we can use for implementing TXs: (1) ACID TXs; (2) Long-running TXs. The latter is preferable for the cases Arnon is talking about (large geographical distances, multiple trust authorities, and distinct execution environments). ACID TXs are more suitable for what Guy has mentioned (DeleteCustomer service invokes the DeleteCustomerOrder service internally). I agree with Arnon the a-synchronicity is preferable, but we all have encountered use-cases where ACID-ness is required from a business requirement level... [snipped]


One minor point in regard to this comment is that I don't like the term "long running transaction". These are long running interactions between services and I think the term saga describes them better. Sagas are made of a series of business activities that flow back and forth between services to realize a larger business process. Note that these interactions don't necessarily have transaction-like behavior.


Which brings me to the more important point: the statement "Logically speaking, cross service transactions are a must". I don't think so. For instance, consider a service that manages the inventory in a warehouse, which receives a request for some items and later a cancellation of that request. The first request can trigger the inventory service to order some more items from a supplier. Whether or not the cancellation would cause a cancellation of the order from the supplier depends on the inventory service's business rules for inventory levels of the items ordered. It might also depend on whether or not the items have already been received, etc. The cancellation (the "abort") of the original request does not have to translate to an abort (or compensation) on the request receiver. Furthermore, if the service communication model is based on the push model (e.g. using EDA with SOA), the cancellation notice would just be propagated without regard to the inventory service. It is the inventory service's responsibility to understand the ramifications of this event and act accordingly.

Even the example given in the comment, "DeleteCustomer service invokes the DeleteCustomerOrder service internally", is not a good candidate for ACID transactions (there's also a problem of service granularity here - I'll talk about it later). When the customer service decides to delete a customer and requests the Orders service to delete the orders, there's a reasonable chance that some of the orders are already paid for but not delivered. In this case the customer service cannot really delete the customer until all the paid orders are resolved. Or maybe the order service is a facade to a nightly batch that does the actual deletion. I know I am just fantasizing with these examples, but the point is that the customer service has no knowledge of the order service or the inventory service above except the messages supported in their contracts.
To assume something about the internal behavior is problematic. Even if you know the internal structure at the outset, the whole idea of SOA is that the services can evolve independently from each other...
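The inventory example can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class, the event handler names and the reorder rule are all invented here. What it shows is that the cancellation arrives as just another message, and the service decides by its own rules what (if anything) to undo.

```python
class InventoryService:
    """Hypothetical inventory service; names and rules are illustrative only."""

    def __init__(self, stock: int, reorder_level: int) -> None:
        self.stock = stock
        self.reorder_level = reorder_level
        self.reserved: dict[str, int] = {}     # request id -> reserved quantity
        self.supplier_orders: list[int] = []   # quantities ordered from the supplier

    def _available(self) -> int:
        return self.stock - sum(self.reserved.values())

    def on_items_requested(self, request_id: str, quantity: int) -> None:
        self.reserved[request_id] = quantity
        shortfall = self.reorder_level - self._available()
        if shortfall > 0:
            # Internal business decision; the requester knows nothing about it.
            self.supplier_orders.append(shortfall)

    def on_request_cancelled(self, request_id: str) -> None:
        # Release the reservation. The supplier order placed earlier is not
        # rolled back automatically: that is up to the service's own rules.
        self.reserved.pop(request_id, None)
```

With a stock of 10 and a reorder level of 5, a request for 8 items triggers a supplier order of 3, and a later cancellation frees the 8 items while the supplier order stays in place. A different inventory service could just as legitimately decide to cancel it.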


Another thought, triggered by the granularity of the services in the comment's example (a DeleteCustomer service vs. a Customer service that also supports deleting customers), is that we should be really conscious of the difference between SOA and other architectures like 3-tier client/server. SOA is actually more distributed than 3-tier: we cross a distribution boundary every time we pass a message from a service to a service, and not just when we move a message from a client tier to an application server. We add this distribution to gain advantages in flexibility and agility. However, this is also a weakness of SOA (consider, for example, that Martin Fowler's first law of distributed object design is "Don't distribute your objects!"), which means we should really pay attention to the way services interact with each other:

  • The granularity of services - having a lot of fine-grained services means there will be a lot of interactions over the wire (even if you don't go out to the network you still have to serialize/deserialize, follow the security policy etc.) rather than internal interactions that are much faster.
  • The granularity of messages - the same considerations should also guide us to try to create larger and fewer messages. For the example above, instead of a DeleteCustomerOrder message maybe use something like an UpdateCustomersOrders message that can hold a list of customers and orders and their status changes. By the way, this would also support off-line clients better since they can accumulate changes.
  • The assumptions we can make on the other service's availability, performance, internal structure, the trust we have for it etc. - We should try to minimize the assumptions we make and concentrate on what can be inferred from the contract. Remember that policies can change externally, so the business logic within a service cannot count on them being constant. This brings us back to the issue of transactions. Every cross-wire interaction increases the chances of failure, and in a transaction one failure invalidates the whole transaction. Every cross-wire interaction within a transaction also increases the length of time we lock internal resources (even if we do trust all the involved parties), especially if that transaction can extend itself automatically. Also, as I mentioned in the previous post, transactions open the door to denial of service attacks.
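As a rough illustration of the "larger and fewer messages" point, here is a small Python sketch of a batched UpdateCustomersOrders message. Only the message name comes from the text above; the fields, the OrderStatusChange type and the OfflineClient helper are assumptions I made up for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderStatusChange:
    customer_id: str
    order_id: str
    new_status: str


@dataclass
class UpdateCustomersOrders:
    """One coarse message carrying many changes (the field layout is assumed)."""
    changes: list[OrderStatusChange] = field(default_factory=list)


class OfflineClient:
    """Hypothetical client that accumulates changes while disconnected."""

    def __init__(self) -> None:
        self._pending: list[OrderStatusChange] = []

    def record(self, change: OrderStatusChange) -> None:
        self._pending.append(change)

    def flush(self) -> UpdateCustomersOrders:
        # One message crosses the wire instead of one message per change.
        message = UpdateCustomersOrders(changes=self._pending)
        self._pending = []
        return message
```

Three recorded changes turn into a single message on flush, and a second flush right after that yields an empty one.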

If we want to reap the benefits that are sold under the SOA moniker, like flexibility and agility, we really have to pay attention to this extra distribution and design our services differently than we would components in a 3-tier architecture - but hey, that's why they pay us the big bucks, right ? :)

I should probably also add that building SOAs is not a goal in itself. We can build perfectly good solutions using other architectures - but if we find that we do need SOA (or any other architecture for that matter) we have to pay attention to the way we implement it, both to keep its benefits and to avoid harming other quality attributes like performance, security etc.



 
Tags: .NET | SOA | SOA Patterns | Software Architecture

I've updated the draft for the Edge Component Pattern to a more legible version (thanks to Cynthia Cane, my editor @ Manning).

The Edge component pattern solves the following dilemma:

How do we allow the business aspects of the service, technological concerns and other cross-cutting concerns like security, logging etc. to evolve at their own pace and independently of each other?

 


 
Tags: Everything | SOA | SOA Patterns

I was going to try to explain why it took me so long to post a new pattern draft on-line when I saw that a couple of my fellow Manning authors had already done that. See Roy Osherove's "Writing a book is like developing Software" and Fabrice Marguerie's "My Writing Experience". I have a similar experience here - there are a few commonalities between writing software and writing a book, and the countermeasures of shorter iterations, refactorings (which I guess writers know as rephrasing) and increased inspections seem to work here as well.

Finally, I am back to writing new stuff and I am completing Chapter 4 now. Chapter 4 deals with SOA security patterns, and I've decided to release the "Service Firewall" pattern as a free draft. Note that it is a draft and it can change by the time it gets to publication. For example, the Edge Component, which I published a few months ago, has already gone through an extensive rewrite (maybe I'll post the updated draft...).

The Service Firewall helps deal with malicious "service consumers" and protects services from several types of attack, for example XDoS (XML Denial of Service) and malicious content. It also helps prevent private information from leaking out of the service.

You can download the draft for the Service Firewall pattern from here.


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

March 4, 2007
@ 08:14 PM

Following my post on SOA definition, Alex left the following comment:

“One question - how can an organization achieve "agility" through an SOA, if not through "re-use"? Isn't re-use really the ROI for implementing a Service?”

The way I see it, agility means the ability to change rapidly, and it doesn’t have to mean reuse. For instance, it can come from the ability to replace a component without disturbing other dependent components – though you can say that this is reuse, as you are reusing the interface (contract).

When you replace or update a service you may reuse some or maybe even all of the previous version of the service, as long as the context for that service didn’t change significantly. If it did, the granularity of the reusable components will be much smaller than a “service”.

I would also note that I think there’s a difference between reuse and use. If you take the same ordering capabilities and you include them in two business processes, that's just using them. I’ve seen reuse of services in product companies where services were reused with few modifications between two or more solutions, but this isn’t very common.

Regarding the ROI of SOA: it doesn’t have to be reuse, or just reuse. It is also things like easier connectivity, so that you can integrate faster with partners or with newly developed components. Another way to measure ROI is to measure the gains in replaceability and adaptability, so you can respond faster to changing business requirements (e.g., changing what counts as a VIP customer without any of the service’s consumers having to know that something changed).
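The VIP example can be shown with a toy Python sketch. The CustomerService class, its methods and the threshold rule are all hypothetical; they only illustrate that the consumer depends on the contract (is_vip) and keeps working unchanged when the rule behind it is replaced.

```python
class CustomerService:
    """Hypothetical service; consumers only see record_spend() and is_vip()."""

    def __init__(self, vip_threshold: float) -> None:
        self._vip_threshold = vip_threshold
        self._yearly_spend: dict[str, float] = {}

    def record_spend(self, customer_id: str, amount: float) -> None:
        current = self._yearly_spend.get(customer_id, 0.0)
        self._yearly_spend[customer_id] = current + amount

    def is_vip(self, customer_id: str) -> bool:
        # The rule lives behind the contract and can change at any time.
        return self._yearly_spend.get(customer_id, 0.0) >= self._vip_threshold

    def change_vip_policy(self, new_threshold: float) -> None:
        # Internal change; no consumer needs to be modified or redeployed.
        self._vip_threshold = new_threshold


def shipping_cost(customers: CustomerService, customer_id: str) -> float:
    """A consumer that knows nothing about how VIP status is decided."""
    return 0.0 if customers.is_vip(customer_id) else 4.99
```

After a call to change_vip_policy the same shipping_cost function simply starts returning different results for the affected customers.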


 
Tags: Everything | SOA | Software Architecture

February 11, 2007
@ 08:25 PM

Udi has some comments on my SOA definition. Udi says that the definition I provided does not support  the notion of publish/subscribe using topics for services. My answer to this is yes and no :)


First thing first, I never said (or at least I never meant to say) that contracts are limited to only incoming messages. Contracts contain incoming and outgoing messages.   I probably should have stated it more clearly though.
Udi says “Contract: Who owns the message type being published? The publisher or the subscriber? Common SOA knowledge would say that the message belongs to the contract of the service that receives it”


I don’t know who “Common SOA knowledge” is. In my opinion, this thinking is wrong even for request/reply: the reply message belongs to the service that sends the reply.


Regarding Endpoints – if the subscribers go to a topic as in “ServiceName\TopicName” then yes, I would call that an Endpoint, since this is a well known address consumers (subscribers) go to in order to find messages published by a service.


Regarding consumers Udi says “ Is the publishing service “using” the subscriber when it publishes a message? I don’t think so, and the subscriber definitely isn’t using the publisher at that point either. So, we’ve got some inter-service message-based communication going on and it isn’t clear if we even have a service consumer. In fact, if all a service ever did was subscribe to some topics, and publish messages on other topics, it looks like we’d have very loose-coupling but be straying from the common SOA wisdom.”


Maybe that’s just semantics, but I don’t see why the subscriber isn’t using the publisher. The publisher publishes a message on a topic; this is part of its offering. The subscriber chooses to consume that information and maybe do some stuff with it – possibly publishing some other messages. That’s a “using” relationship to me.


Nevertheless, SOA is not a synonym for "distributed system", so there are cases where distributed components that communicate through messages aren’t SOA. For example, publish/subscribe using topics, where the topics are common and shared between components so that multiple services can publish on the same topic, does not, in my opinion, fall under the definition of SOA. This doesn’t say that this is a bad architecture in any way – but it isn’t SOA either.
As I said in the “What is SOA” posts, for an architecture to be SOA you need autonomous components that publish and accept messages defined in contracts, delivered at an endpoint and governed by policies, to service consumers – no more, but no less either.


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

February 9, 2007
@ 06:50 AM

I've been talking about SOA for a while now, so it's finally time to (try to) properly define it.

I've published this as 5 posts on my DDJ blog and I thought it was good enough to be published as a single whitepaper:

"Service Oriented Architecture or SOA for short has been with us for quite a  while.  Yefim V. Natiz, a Gartner’s analyst, first talked about SOA back  in 1996. However it seems that only in the recent year or so SOA has matured enough for real systems based on the SOA concepts to start to appear – or has it?  There is so much hype and misconceptions surrounding SOA that we first have to clear them all up before we can explain what SOA is – let alone identify who really uses it...." (Download full PDF (670K))

You can see additional presentations and papers I wrote here


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

January 30, 2007
@ 06:30 PM

[originally published in my DDJ blog]

You may have read my BI and SOA post where I suggested using EDA within SOA to solve the BI/SOA impedance mismatch. Jack Van Hoof made the following comment on that post:

Many people think of SOA as synchronous RPC (mostly over Web Services). Others say EDA is SOA. And there are many people who say that the best of EDA and SOA is combined in SOA 2.0. Everybody will agree that there is a request-and-reply pattern and a publish-and-subscribe pattern. It is easy to see that both patterns are an inverse of each other….

I think that "Synchronous RPC" is not a very good (or useful) definition for SOA (see my series on what is SOA anyway). Nevertheless, I also think that even if all you have is synchronous request/reply you can still implement both asynchronous messaging and EDA.

How can we implement asynchronous messaging?

Option 1: Duplex Channel
Let’s say you are a service consumer. You send me your request. Instead of a reply I just acknowledge that I got the message. I put the message into a queue and process it in my "spare" time. I then call you with the answer.

Option 2: One way Channel
Again you send the request. Instead of a reply, I give you a token or a ticket for the answer. When you think it is time, for example when the time promised in the SLA elapses (or whenever), you call me again, give me the ticket, and I look up the answer and give you your reply. If we hide all this protocol inside some software infrastructure, the applications can see asynchronous messaging even though we have synchronous request/reply on the lower levels.
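Option 2 is easy to sketch in code. The following is a minimal in-process Python illustration, not a real protocol implementation: TicketService and its three methods are names I invented, the "work" is just upper-casing a string, and a real service would of course sit behind a network endpoint and persist its queue.

```python
import uuid
from collections import deque
from typing import Optional


class TicketService:
    """Hypothetical service: every call is a plain synchronous request/reply."""

    def __init__(self) -> None:
        self._queue: deque[tuple[str, str]] = deque()
        self._results: dict[str, str] = {}

    def submit(self, payload: str) -> str:
        # The synchronous reply is only a ticket, not the answer.
        ticket = uuid.uuid4().hex
        self._queue.append((ticket, payload))
        return ticket

    def process_pending(self) -> None:
        # Runs in the service's "spare" time (a worker or scheduler in real life).
        while self._queue:
            ticket, payload = self._queue.popleft()
            self._results[ticket] = payload.upper()  # stand-in for the real work

    def fetch(self, ticket: str) -> Optional[str]:
        # The consumer calls back with the ticket; None means "not ready yet".
        return self._results.get(ticket)
```

The consumer calls submit, goes about its business, and later calls fetch with the ticket until it gets something other than None. Wrapped in a small client library, this looks like asynchronous messaging to the application.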

Okay, so what about events? How can we publish events using just request/reply? The previous example would not work as is, since we could miss out on important events.

If you are reading this blog -- chances are you already have the answer working on your computer -- yes, it is RSS. Think about it: using an RSS reader that polls the server that publishes this blog, you reach out using synchronous request/reply and get all the posts (events) that were added since the last time you asked.

There are a few additional architectural benefits to working this way. For one, the service does not have to manage subscribers. Secondly, the consumer doesn’t have to be there the moment the event occurs to be able to consume it -- and the management and setup are easier and simpler than with queuing engines.
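Here is a minimal Python sketch of the RSS-like idea, assuming a simple in-memory feed with sequence numbers as the "since" cursor. The EventFeed class and its methods are hypothetical; real feeds would use timestamps, entry ids or HTTP caching headers instead.

```python
class EventFeed:
    """Hypothetical event feed that consumers poll with plain request/reply."""

    def __init__(self) -> None:
        self._events: list[str] = []

    def publish(self, event: str) -> int:
        self._events.append(event)
        return len(self._events)  # sequence number of the event just added

    def since(self, last_seen: int) -> tuple[list[str], int]:
        # Returns the events the caller has not seen yet, plus the new cursor.
        # The feed keeps no per-subscriber state; the consumer owns its cursor.
        return self._events[last_seen:], len(self._events)
```

A consumer that was offline while events were published just asks with its old cursor and gets everything it missed, which is exactly the property mentioned above.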


 
Tags: Everything | SOA | Software Architecture

January 25, 2007
@ 11:57 PM

Jack Van Hoof left the following comment on my post on BI & SOA:

"Many people think of SOA as synchronous RPC (mostly over Web Services). Others say EDA is SOA. And there are many people who say that the best of EDA and SOA is combined in SOA 2.0. 

Everybody will agree that there is a request-and-reply pattern and a publish-and-subscribe pattern. It is easy to see that both patterns are an inverse of each other. In my article 'How EDA extends SOA and why it is important' I explained the differences between the two patterns and when to use the one or the other.

 

Because of the completely different nature and use of the two patterns, it is necessary to be able to distinguish between the both and to name them. You might say making such a distinction is a universal architectural principle. Combining both of the patterns into an increment of the version number of one of them is - IMHO - not a very clever act. I believe it is appropriate and desirable to use the acronyms SOA and EDA to make this distinction, because SOA and EDA are both positioned in the same architectural domain; SOA focusing on (the decomposition of) business functions and EDA focusing on business events."

I agree with some of the things Jack says but not all of them. The way I see it, EDA and SOA are two different architectural styles - but I guess that I see it a little differently than Jack does.

EDA is an evolution of the publish-subscribe style and can exist independently of SOA, i.e. you can implement it with other architectural styles. SOA is an evolution of the component-based development style, one which puts an emphasis on interoperability and adaptability.

However, I don't agree that SOA is "Synchronous RPC". That's just the initial "wave" of SOA implementations, since synchronous interactions are easier to grasp and implement. I think that while adhering to SOA principles you can also implement additional interaction patterns, including asynchronous messages, publish/subscribe and EDA (and combining SOA with EDA is what I suggested for solving BI in an SOA).

 

I don't like the SOA 2.0 term either - but that's just because I don’t see a need for defining a new term :)

 

I'll post more about this once I finish the "What is SOA anyway" series on DDJ, where I explain the way I see SOA.

 


 
Tags: Everything | SOA | Software Architecture

January 7, 2007
@ 11:02 PM

[based on a few posts from my DDJ blog]

Implementing a Business Intelligence (BI) solution on top of a Service Oriented Architecture (SOA) is not a simple feat. A recent survey by Ventana Research shows that "...only one-third of respondents reported they believe their internal IT personnel have the knowledge and skills to implement BI services.". There's a good reason for that, since there is an inherent impedance mismatch between BI and SOA which takes some effort to overcome. The purpose of this paper is to explain the problem as well as look at the possible solutions.

Service-Oriented Architecture is about autonomous, loosely coupled components. These traits give you lots of benefits such as greater flexibility and agility, but they also mean that services have private data: data that you don't want to expose to the outside, as exposing it will decrease autonomy and increase coupling. This is why services only expose data and processes via contracts rather than exposing their internal structure.

That is all fine until you start to think about business intelligence. The cornerstone of any business intelligence initiative is gathering, collecting and consolidating data from all over the place. Once you have the data, you can use tools to analyze it, data mine it, slice, splice, aggregate, and whatnot. Traditionally BI builds on ETL (Extract, Transform, Load), which goes directly to the databases of the involved sources.

And here lies the problem: On the one hand we have services that want to keep their data private, and on the other we have a datamart or warehouse that wants that data badly.

What are our options?

  • If you go with traditional ETL, you introduce coupling into your service.
  • If you only rely on contracts that were constructed for business processes you may be missing out on important data.
  • If you build a specific contract that exposes "all" the data you are back at point-to-point integration -- and solving point-to-point integration is one of the reasons we want SOA in the first place.

The second option seems to be the most reasonable choice of the three -- but it also has several problems. One problem is that the BI needs to know about all the contracts. The second was already mentioned -- important data might be missing. The third problem is that the BI system needs to fetch data from the services, which means it may miss out on data in the intervals between requests. On the other hand, with too frequent requests you can easily congest your network as well as cause a DoS on your own services.

Clearly we need a fourth option.

In my opinion, the best way to tackle BI in SOA is to add publication messages into the contract. By "publication messages", I mean that the service will publish its state, either in a periodic manner or per event, to anyone who is listening. This is a service communication pattern which I call "Inversion of Communications" since it reverses the request/reply communication style which is common for SOA.

To make the solution complete, you can add additional request/reply or request/reaction messages to allow consumers to retrieve initial snapshots. Following this approach, you get an event stream of the changes within the service in a manner that is not specific to the BI. In fact, having other services react to the event stream can increase the overall loose coupling in the system - for instance by caching results of other services.
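A small Python sketch may help picture the combination of publication messages and an initial snapshot. It is an in-process toy under my own assumptions: OrderService, BiCollector, the event shape and the callback-style subscription are all invented for illustration, and a real implementation would publish over a bus or a notification standard rather than call handlers directly.

```python
from typing import Callable

Event = dict[str, str]


class OrderService:
    """Hypothetical service whose contract includes publication messages."""

    def __init__(self) -> None:
        self._orders: dict[str, str] = {}                 # private state
        self._subscribers: list[Callable[[Event], None]] = []

    def subscribe(self, handler: Callable[[Event], None]) -> None:
        self._subscribers.append(handler)

    def snapshot(self) -> dict[str, str]:
        # Request/reply message that lets a new consumer get its starting point.
        return dict(self._orders)

    def set_status(self, order_id: str, status: str) -> None:
        self._orders[order_id] = status
        event = {"type": "OrderStatusChanged", "order_id": order_id, "status": status}
        for handler in self._subscribers:
            handler(event)  # publish to anyone who is listening


class BiCollector:
    """Hypothetical BI-side consumer: keeps history as well as current state."""

    def __init__(self, initial: dict[str, str]) -> None:
        self.current = dict(initial)
        self.history: list[Event] = []

    def on_event(self, event: Event) -> None:
        self.history.append(event)
        self.current[event["order_id"]] = event["status"]
```

The collector is built from service.snapshot() and then registered with service.subscribe(collector.on_event). From then on it sees every change without ever touching the service's internal data store. In a concurrent system you would also have to handle events that arrive between taking the snapshot and subscribing, which this sketch ignores.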

Why is this better than the other three approaches? For one, you can get a good picture of what happens within the service. At the same time, the contract is not specific to the BI and can be used by other services to cache the service state (thus increasing their own autonomy), for reporting (you can see an early draft of the aggregated reporting pattern), and for BI purposes. By working against a steady stream of events, the BI platforms can analyze trends, keep history and get the complete picture they need.

The approach above is sometimes referred to as "Event Driven Architecture" (EDA) and while I (and others) see EDA as another facet of SOA, not everyone agrees. Gartner, for instance, sees EDA as another paradigm and SOA just for request/reply, or client/server. Recently, however, they published a paper that calls the approach described here as "Advanced SOA". I tend to agree more with the "advanced SOA" definition and don't see a contradiction with EDA and the SOA definitions. We are still using the same components and the same relations only adding an additional message exchange pattern into our toolbox.

A note on implementation: if you are implementing SOA over an ESB, this is rather easy to implement as most ESBs support publishing events out of the box. Using the WS-* stack of protocols, you have the WS-BaseNotification, WS-BrokeredNotification and WS-Topics set of standards. If you are in the REST camp, then I guess you will need to implement publish/subscribe by yourself.

Once you have event streams on the network, the BI components grab that data, scrub it as much as they like and push it to their datamarts and data warehouses. However, event streams can also enable much more complex and interesting analysis of real time events and real time trend data, using complex event processing (CEP) tools to get real-time business activity monitoring (BAM).

You can also get this post as a presentation, downloadable from the papers section on my site or directly from here. (The download is about 3MB.)



 
Tags: Everything | General | SOA | SOA Patterns | Software Architecture

December 15, 2006
@ 10:50 AM

One unique aspect of SOA vs. other architecture styles like Object Orientation, Client/Server or even 3-Tier architecture is that it is built for highly distributed systems. Each and every service is a sub-system in itself: it can run on its own machine and be located anywhere in the world. Many times, the service itself needs to be distributed in its own right. One reason to use distributed computing inside the service is computationally intensive tasks.

 

One of my recent projects was the development of a biometric platform. The platform can be used for many usage scenarios. A simple scenario is an access control system - e.g. authorizing entrance into a secure building or area. This is a relatively simple scenario, as you usually only have to deal with a few thousand people, and as a person requests entry she also declares who she is (e.g. using an RFID card with her ID). In these cases you can go to the database, look up the appropriate record, run the biometric algorithm or algorithms and verify the person is who she says she is. However, the same platform also has to work for other, much more demanding and computing intensive scenarios. For example, consider a forensics scenario where you have a fingerprint collected at a crime scene. In this case you don’t know who the person you are looking for is, and you have to run your search on basically the whole database, which can contain millions of records. Keep in mind that when you match a biometric template[1] you calculate the probability of a match (based on the internal structure of the template) and that each template weighs about one kilobyte, and you quickly realize that this can be quite a CPU intensive task.
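To get a feel for the difference between the two scenarios, here is a toy Python sketch. The match_score function is a deliberately naive stand-in I made up (fraction of equal bytes), nothing like a real biometric matcher, and the function names and threshold are assumptions. What it does show is the shape of the work: verification touches one record, identification has to score the probe against every record. As a back-of-the-envelope figure, a million templates at about one kilobyte each is on the order of a gigabyte of template data to run through the matcher per search.

```python
def match_score(template: bytes, probe: bytes) -> float:
    """Toy stand-in for a biometric matcher: fraction of equal bytes."""
    same = sum(a == b for a, b in zip(template, probe))
    return same / max(len(template), len(probe), 1)


def verify(db: dict[str, bytes], claimed_id: str, probe: bytes,
           threshold: float = 0.9) -> bool:
    # Access control: the person says who she is, so one comparison is enough.
    return match_score(db[claimed_id], probe) >= threshold


def identify(db: dict[str, bytes], probe: bytes,
             threshold: float = 0.9) -> list[str]:
    # Forensics: identity unknown, so every record in the database is scored.
    return [person_id for person_id, template in db.items()
            if match_score(template, probe) >= threshold]
```

The cost of identify grows linearly with the size of the database, which is what makes it a natural candidate for splitting across several machines.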

Sometimes when you develop your SOAs you will have algorithmic tasks or other computationally heavy tasks such as the one mentioned above, and the question is:

 

How can a Service handle computationally heavy tasks in a scalable manner?

 

You can get the full pattern from here

[This is an early draft of one of the Performance, Scalability and availability Patterns from my SOA Patterns book]



[1] You can think of a biometric template as a signature or a hash that represents the biometric sample. The template is smaller than the sample but contains enough information to identify the original.

 


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

December 5, 2006
@ 08:16 AM

I've added a section called SOA Patterns on the site, which holds the current draft of the table of contents of the SOA Patterns book I am writing. The section lists the problem each pattern addresses as well as links to published patterns. Also, you can use this to monitor my progress (patterns that have their problem written down already have drafts; the others are in progress or not started).

I am currently working on chapter 4: Security & Manageability patterns (not counting delays mentioned in the previous post).

Also, as I think I've already mentioned, I'll make public at least one pattern per month. If you are interested in a specific pattern (from those which are ready - now chapters 2&3), drop me a note and I'll publish the one that gets the most votes.

 


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

December 1, 2006
@ 09:29 PM

My editors at Manning think that chapter 1 of my SOA patterns book is not good enough.

They basically say that the chapter talks about too much theory vs. the other chapters which contain much more down-to-earth stuff (e.g. Edge Pattern, Aggregated Reporting Pattern, Decoupled Invocation Pattern). Also they’ve said that I spend too many pages explaining what architecture is or talking about distributed systems before I get to SOA – which is the topic of the book.

The way I see it, understanding architecture and distributed systems is essential to understanding SOA (from the development side i.e. when you want to design and build services). For example the discussion on quality attributes explains how you can use scenarios to find architectural requirements (and each pattern then has a section on relevant scenarios to help you find if the pattern is applicable to your needs)

I would be very interested in hearing what you have to say (either as comments here or emails to me) about the chapter’s structure and content (considering most of the book will be patterns like the Edge pattern).

Thanks in advance


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

November 15, 2006
@ 09:33 PM

The business rationale behind going on the SOA road is increasing the alignment of the business and IT, so we divide the business into a bunch of business services and everything is just fine. However the minute we start diving into the SOA implementation details we are swamped by a horde of technologies, cross-cutting concerns (auditing, security, etc.) and whatnot.

For example, in one project I was involved with, we implemented an SOA over a messaging middleware (Tibco's Rendezvous). Just when everything was fine and dandy, along came another project which could potentially use a few of the services. Well, almost: it needed a slightly different contract and it also used a completely different wire protocol, WSE 3.0 (Microsoft's interim solution for the WS-* stack before Windows Communication Foundation). And that's just one simple example; cross-cutting concerns and implementation details are everywhere. The question is then:


How can you handle cross-cutting concerns like multiple technologies, protocols, changing policies etc. while keeping the service focused on its core concern, i.e. the business logic?


You can get the full pattern from here

[This is an early draft of one of the Service Structural Patterns from my SOA Patterns book]


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

November 5, 2006
@ 12:44 PM

I am going to present SOA in one of our internal forums next week - so I thought it would be a good opportunity to dust-off my SOA presentation and give it a little face lift. You can download a copy from the papers and articles section (or get it directly from here).

As always, any comments are welcome


 
Tags: Everything | SOA | Software Architecture

The draft for the first chapter of my SOA Patterns book is available on-line from Manning Publications Co.

The first chapter talks about software architecture and the inputs the architect can/should use to design one (emphasizing Quality Attributes), explains the challenges of distributed systems and takes a look at SOA from an architectural perspective.

You can download the chapter from here

Any comments are welcome (you can also leave your comments at soa@rgoarchitects.com)


 
Tags: Everything | SOA | SOA Patterns | Software Architecture

[Will also be cross-posted on my DDJ blog]

Working on my SOA patterns book, I thought of this rule for contract versioning which my shameless ego wanted to dub Arnon's Contract Versioning Principle. I was happy playing with this thought, until I realized that there isn't some profound new understanding here, this is just an application of LSP for service contracts.

The Liskov Substitution Principle (LSP), which I recently blogged about here as part of a series of blogs on Object Oriented Principles, basically states that a subclass should be usable instead of its parent class. To put this in other words, you could say that a subclass should meet the expectations that users of the parent class have formed from the parent class's observable behavior.

So LSP applied to SOA would state that:

When changing the internal behavior of a service, you don't need to create a new version of the contract if, for each operation defined in the contract, the preconditions are the same or weaker and the postconditions (i.e. the outcome of the request) are the same or stronger. In other words, to retain the same contract version, the new version of the service should meet the expectations that consumers of the service have formed from the old version's observable behavior.

For example, let's say you have a customer service and the contract lets you get a customer's VIP status. If you changed the way the VIP status is calculated (e.g. in the old version the customer had to have 1 million dollars in her account, but now she must have 10 million dollars), there's no need to create a new contract version. However, if you introduced a new level of VIP status (e.g. 1 Million = Gold, 10 Million = Platinum), you do need a new version of the contract.
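The rule can be sketched as a small check. This is only an illustration under simplifying assumptions: preconditions are modelled as the set of inputs an operation accepts and postconditions as the set of replies a consumer may see. The names OperationSpec and needs_new_contract_version are hypothetical and do not come from any framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationSpec:
    """Hypothetical, simplified description of one contract operation."""
    accepted_inputs: frozenset   # precondition: what the service accepts
    possible_outputs: frozenset  # postcondition: what a consumer may get back


def needs_new_contract_version(old: OperationSpec, new: OperationSpec) -> bool:
    # Same or weaker preconditions: everything accepted before is still accepted.
    weaker_pre = old.accepted_inputs <= new.accepted_inputs
    # Same or stronger postconditions: no reply the old contract never promised.
    stronger_post = new.possible_outputs <= old.possible_outputs
    return not (weaker_pre and stronger_post)


# The VIP example: changing the threshold leaves the observable replies alone,
# while adding Gold/Platinum levels introduces replies old consumers never saw.
old_vip = OperationSpec(frozenset({"customer_id"}), frozenset({"NONE", "VIP"}))
new_threshold = OperationSpec(frozenset({"customer_id"}), frozenset({"NONE", "VIP"}))
new_levels = OperationSpec(frozenset({"customer_id"}),
                           frozenset({"NONE", "GOLD", "PLATINUM"}))
```

Comparing old_vip with new_threshold reports that the contract version can stay, while comparing it with new_levels reports that a new version is needed.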


 
Tags: Everything | SOA | Software Architecture

October 1, 2006
@ 11:16 PM

I've added a new section on the site www.rgoarchitects.com/Papers to allow easy access to all the papers, presentations and articles I published (and will be publishing e.g. I'll add a paper on architect soft skills in a month or so etc.)

 


 
Tags: Everything | General | SOA | Software Architecture | SPAMMED Process

July 25, 2006
@ 11:15 PM

Udi Dahan, who is one of the best architects I know, has recently created an excellent course on SOA - you can find the details of the syllabus on Udi's site.

I had a chance to work with Udi in the past and the solution we implemented utilized many of the patterns and techniques Udi covers in his course - so these are not just nice theories but rather real stuff that works

 


 
Tags: Everything | SOA | Software Architecture

[crossposted from DDJ]

Yesterday I attended an interesting presentation on SOA by Dr. Donald F. Ferguson, chief architect for IBM's software group.

I was happy to hear him validate some of my thoughts on SOA (e.g., workflows are better kept inside services rather than outside, transaction boundaries should be inside a service, and so on). He also introduced a couple of things I didn't know much about (for example, OSGi, a SOA platform for networked devices that's not based on web services) and presented some nice insights (for instance, looking at the middleware as an infrastructure service and thus nicely unifying SOA and EDA).

One of the insights Donald presented was the use of heuristics as an aid to modeling and validating architectures. Some of the heuristics he mentioned include:

  • Occam's Razor -- avoid needless repetition
  • Don't create something new if you can compose existing stuff to get the same result
  • externalize volatility -- don't put in the code things that are likely to change
  • Focus on "name,value" programming not "offset programming" -- make things easy to understand
  • Different is hard


If you look at heuristics as an abstraction of experience, they can serve as a good tool for keeping yourself on the right track. Some heuristics are universal (maybe the ones mentioned above and a few others like "simplify, simplify, simplify" or "the original statement of a problem is probably not the best one or it may even not be the right one"), but the problem is, as always, deciding (in advance) which heuristics to apply to a problem.

If you are interested in using heuristics as an architect's tool you may want to look at "The Art of Systems Architecting", by Mark Maier and Eberhardt Rechtin. The book discusses the architectures of different system types (collaborative, IT, manufacturing, etc.) and provides heuristics for each of these systems.

Heuristics are a good tool to use when you design an architecture, and in a way the different design principles (e.g., the single responsibility principle) can also be considered heuristics. Nevertheless, it is very important to verify designs by additional methods like code and formal evaluation and not rely on heuristics as the only tool.


 
Tags: Everything | SOA | Software Architecture

June 4, 2006
@ 10:56 PM

 

I have amassed more than 30 patterns related to SOA (e.g. SOA Patterns - Decoupled Invocation and SOA Patterns - Aggregated Reporting, which I previously published here). I have patterns around security, availability, scalability, composition, adding a UI etc. Some of the patterns are original (I think) and some are based on other people's work.

 

I am trying to decide whether or not it would be worthwhile putting all these patterns into a book. Writing a book is a very time-consuming task (or so I am told), so I thought I'd run a quick poll among the readers of this blog to see how many of you would be interested in reading (and buying) this book if it gets published.

I know this is not a representative crowd - but it can give me a (very) rough idea on the interest in such a book.

 

Please  send any comments (comments like "forget it, no one would ever want to read anything you write" are also ok) to soa@rgoarchitects.com (or leave a comment here)

 

Thanks in advance - Arnon.


 
Tags: Everything | General | SOA | Software Architecture

May 29, 2006
@ 07:30 PM

[Edited version of post in DDJ]

I have been blogging for about a year now on this blog, and now also on the DDJ blog, so I think it is time to try something with more two-way communication.

Consequently, I am going to run a little experiment for a few weeks and see how it goes.

The idea is as follows: If you have an interesting architectural or design dilemma, drop me an email at ask@rgoarchitects.com. I'll pick one issue per week and post the dilemma (anonymously) on the DDJ blog, plus voice my opinion (and/or suggested solution)--and then everyone else can chime in with their comments and insight, which hopefully will shed some light on the subject.

I'd be interested to hear both your opinions on this initiative and, of course, interesting dilemmas you are facing. Again, send your dilemmas to ask@rgoarchitects.com.


 
Tags: .NET | Everything | SOA | Software Architecture | SPAMMED Process

Here is another SOA pattern from the list of patterns I am publishing.


One of the core goals of going with SOA is to enable loose coupling. The request-reply communication pattern, which is very prevalent, inhibits this decoupling. The problem is on the side of the caller or consumer of the service: the consumer is dependent on the timely response of the called service for its normal operation. To help alleviate the consequences of this dependency, the service that is consumed should maintain QOS (Quality of Service) as part of its contract (it doesn't have to be part of the machine-readable contract but it needs to be defined and adhered to). Consider for example an on-line music store. On a normal business day it can have a few thousand purchases nicely distributed around the clock. Then, when a new <name your favorite band here> album debuts, it can see peaks much higher than its usual request load. The store still needs to be able to handle all incoming requests or the (potential) buyers will take their business elsewhere.

 

How can I maintain a level of QOS, handle peaks and high-loads without my service failing?

 

One option is to estimate the peak loads and get enough computation power to ensure you can handle them, but this causes problems. One is a problem of waste: you can have machines just sitting there twiddling their thumbs, so to speak, yet the idle computers still have purchase, maintenance and operational costs. The other problem is unexpected loads (e.g. a Harry Potter craze for an Amazon-like site): the estimated load might not be enough.

 

Ensuring QOS gets even more problematic when some of the actions performed in the service access resources/services that are not under the service's control (e.g. talking to a credit card clearing service in the e-commerce example mentioned earlier).

 

Another issue that needs to be taken care of is prioritizing requests. A service most likely handles several types of requests, and not all of them need the same level of QOS. You can set the QOS according to the most demanding request type, but then you may need more resources.

 

Decouple the invocation: separate the reply from the request. Acknowledge receipt in the Edge, pass the incoming request to a queue, and load-balance and prioritize behind the queue.

 

 

Making the Edge acknowledge the receipt of the request (for our e-commerce example this can translate to "Your order has been received and is being processed, you will get a confirmation email when the transaction completes") allows hiding operations that take a long time from the service consumers (be they other services or end-users).

 

Writing requests to the Queue is a relatively low-cost operation that can be performed fast, thus allowing the service to absorb request peaks. The actual handling of the incoming requests can be performed more slowly, according to the available resources of the service. The load balancing can be done by varying the number of readers working against the queue.

 

Making the Queue a Priority Queue (or having several queues according to priority) allows for maintaining different levels of QOS for different message types.
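The Edge, queue and reader roles described above can be sketched roughly as follows. This is a minimal single-process illustration built on Python's standard queue.PriorityQueue; the class name DecoupledService and its methods are made up for the example, and a real service would use durable queues and separate reader processes.

```python
import itertools
import queue


class DecoupledService:
    """Hypothetical sketch: the Edge acknowledges, readers work behind a queue."""

    def __init__(self):
        self._queue = queue.PriorityQueue()
        # The counter breaks ties so equal-priority requests keep arrival order.
        self._sequence = itertools.count()

    def edge_receive(self, payload, priority):
        """Edge: a cheap enqueue followed by an immediate acknowledgment."""
        self._queue.put((priority, next(self._sequence), payload))
        return "Your request has been received and is being processed"

    def drain(self, handle):
        """Reader: process queued requests, lowest priority number first."""
        results = []
        while True:
            try:
                _priority, _seq, payload = self._queue.get_nowait()
            except queue.Empty:
                return results
            results.append(handle(payload))
```

In this sketch a lower number means a more urgent request, so a purchase queued with priority 1 is handled before a catalog browse queued earlier with priority 5. Adding more readers that call drain would be the load-balancing knob.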

Decoupled Invocation can be combined with the Gateway pattern to allow scaling out the service.

 

Decoupled Invocation is enhanced by the use of the Correlated Messages pattern, which helps relate the request and the reactions.

 

Acknowledge in the Service

Sometimes the initial response needs to involve some business logic and is not just an acknowledgment. In this case the Edge doesn't respond; it just passes the request to the service, and the service sends both the initial reaction and the later reaction.

 

 


 
Tags: Everything | SOA | Software Architecture

May 8, 2006
@ 10:55 PM

Michael Platt talks about SOA vs. Web 2.0. He provides several links to blogs and articles that basically claim that SOA is dead and long live the new king, Web 2.0.

 

One thing I have to say is that if indeed the hype around SOA is starting to calm - this is a very good sign. Finally we can go about adding SOA to our toolset and use it when it is appropriate (not just because management has got to have an SOA). Also it can be a good sign that SOA is maturing.

Another point I'd like to make is that SOA and Web 2.0 are not really related - there is no reason why one should compete with the other. Why would using an AJAX front-end make it impossible to have Services in the backend? It may be appropriate to have a Client/Server/Service scenario, where the front-ends don't hit the services directly (the other option is Peer/Service); I may talk about these 2 mini-patterns in my SOA pattern series. Another example where SOA and Web 2.0 can work together is RSS. A service can expose its list of recent changes as an RSS feed (as well as providing the more "traditional" web-services API). Exposing an RSS feed can be one way to implement the Inversion of Communication pattern I mentioned in a previous post.

 

To sum things up - Web 2.0 may be more hyped today than SOA. Web 2.0 and SOA can co-exist and actually complement each other.

In any event I think we (as an industry) should focus more on delivering great applications and solutions rather than fight about whose trend-du-jour is fancier or sexier.

 

[Edited]

After writing about the example of using RSS for Service communication I stumbled today on RSSBus which is an effort to create an ESB on top of RSS protocol ...


 
Tags: Everything | SOA | Software Architecture

As promised, here is the first pattern. If you like this pattern but you think there is something missing to gain better understanding please drop me an email: arnon at rgoarchitects.com . Naturally any other comments are also welcome :)




Getting an SOA right is very hard, not so much because of the technical problems (we know how to deal with those, don't we?), but rather because it is very hard to figure out where to put the borders and keep the right business alignment. Assuming you somehow managed that, the real fun begins: you now have to produce reports, dozens and dozens of reports. Many reports will fall within the boundaries of single services (if you have a good partition), but many will also require combining data from several services. For example, in a Telco scenario, you may have Customer, Billing and Provisioning services (a real-life example would have dozens of additional services). Now a customer calls customer care and you want the CRM to show everything about the customer: what outstanding invoices she has, what equipment and services (GPRS, UMTS, friends and family etc.) she has, what her status as a customer is (loyal, VIP, senior citizen …), open service requests etc. Things get much more complicated when you need to summarize or group data from multiple services.

 

How do you get a decent cross business entities report with the data scattered about in all those services?

 

One possible solution would be to create the report at the consuming end (e.g. the UI): visit all of the services involved, then do all the grouping, cross-cuts etc. This solution is not very good from the performance perspective (you need to get more data than needed and you have to post-process it). It is also problematic from the flexibility perspective: each service involved has to expose interfaces to get the data for the specific query (otherwise you move even more data around).

 

Another option is to go straight to the data. You may still need to hit multiple database servers to get to the data, but the performance will be better. The problem is that this throws your service boundaries down the drain and introduces a lot of dependencies.

 

A third option is to create interim Services ("Entity Aggregation"). This works fine as long as you have real business reasons to do the aggregations (there is an overhead in adding business logic to handle the aggregated data) and as long as you only have a few of those (or you might end up with a single "service" with all the business).

 

Create an Aggregated Reporting Service by building  an Operational Data Store (ODS) to enable creating sophisticated reports on otherwise dispersed data 

 

AggreagatedReporting.PNG


 

 

The ODS is similar in concept to a data mart e.g. data is subject based, integrated, scrubbed etc. However,  the main differences are that the data is up-to-date and that there is little or no history.

 

For incoming data the Aggregated Reporting Edge performs the data transformations from contract data into reporting data. The service updates the ODS by scrubbing the data (this can be limited unless the data has to go to a data mart / data warehouse), then integrating it and denormalizing it into subjects. Incoming report requests fill in parameters for the pre-prepared reports.
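As a rough sketch of this flow, the snippet below uses an in-memory SQLite database to stand in for the ODS. The event types, the customer_report table and the function names are all assumptions made up for this illustration; they show the idea of transforming contract data into one denormalized, subject-based row per customer, not an actual schema from the pattern.

```python
import sqlite3

# In-memory stand-in for the ODS: one denormalized row per customer (subject).
ods = sqlite3.connect(":memory:")
ods.execute(
    "CREATE TABLE customer_report ("
    " customer_id TEXT PRIMARY KEY,"
    " outstanding_invoices INTEGER NOT NULL DEFAULT 0,"
    " open_service_requests INTEGER NOT NULL DEFAULT 0)"
)

# Hypothetical contract events mapped to the reporting column they affect.
COLUMN_FOR_EVENT = {
    "invoice_issued": ("outstanding_invoices", 1),
    "invoice_paid": ("outstanding_invoices", -1),
    "service_request_opened": ("open_service_requests", 1),
}


def on_service_event(event):
    """Edge: transform contract data into reporting data and update the ODS."""
    column, delta = COLUMN_FOR_EVENT[event["type"]]  # fixed names, safe to inline
    ods.execute("INSERT OR IGNORE INTO customer_report (customer_id) VALUES (?)",
                (event["customer_id"],))
    ods.execute(
        f"UPDATE customer_report SET {column} = {column} + ? WHERE customer_id = ?",
        (delta, event["customer_id"]),
    )


def customer_report(customer_id):
    """Pre-prepared report: the request only fills in the customer parameter."""
    return ods.execute(
        "SELECT outstanding_invoices, open_service_requests"
        " FROM customer_report WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
```

The report is answered from a single table, so the Billing and Provisioning services are never queried at report time.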

 

One problem with Aggregated Reporting is that it is not a Business Service (i.e. it is a technical solution rather than a business oriented one) - however since unlike Entity Aggregation the data in Aggregated Reporting is Read-Only this doesn't affect the business.

 

Aggregated Reporting is easier to implement when combined with  Inversion of Communication

 

Aggregated reporting with Data Mart/Data Warehouse

 

Instead of just storing recent operational data, this version enhances the depth and complexity of queries that can be executed against the service. The downside is the increased complexity in setting up the data mart - both from the operational costs perspective (e.g. additional storage) and from the design and development perspective (you need to think about long term aspects, indexing etc.) as you also need to scrub data and consider the structure of your schemas much more carefully.

 

 


 

Sidebar: Operational Data Store (ODS)

The ODS is probably the best kept secret of data warehousing technology. It has been around almost as long but it isn't as famous.

The data in the ODS is operational: live data and not static data. The ODS can be thought of as the cache memory of the data mart / data warehouse.

It is important to note that while it doesn't need the same amount of planning and set-up as a data mart, an ODS still requires careful planning in order to bring real business value.

 

The figure below shows the classical usage of an ODS in an OLTP/Data Mart environment.

 

ODS.PNG

Originally it was thought there would be 4 types of ODS

 

Class I - Near Real-Time synchronization of the ODS with operational data from the OLTP databases. An implementation of Class I is the preferred type for the Aggregated Reporting pattern

Class II - Update the ODS every four hours or so

Class III - Overnight updates of the ODS

Class IV - the ODS is updated from  the data mart / data warehouse

 

In reality there are more variants - for example a powerful (and complex to build) option is to merge a Class IV ODS with one of the other Classes and get.

 

 


 
Tags: Everything | SOA | Software Architecture

April 13, 2006
@ 10:29 PM

I decided to write a short series of blog posts on SOA patterns. These are not patterns that are only usable for SOA; however, I have found them particularly useful in implementing SOAs.

 

This isn't an exhaustive list of patterns - on the contrary, I'll try not to repeat patterns which are well known (like Entity Aggregation http://patternshare.org/default.aspx/Home.PP.EntityAggregation )

 

I am a little busy these days (e.g. I have to complete an architecture document for one of my projects), so this post will only introduce the (first batch of) patterns. The following posts (in the series) will expand on each one (i.e. explain what to do, usage context, consequences etc.). Then, if I get good feedback, maybe I'll publish some more.

 

So, what patterns are we talking about here?

Well:

 

  • Gateway - How do you scale a service without exposing too many endpoints?

 

  • Inversion of Communication - How do I get the data from other services without too much coupling?

 

  • Biztalkize - How do I control volatile behavior inside the  service ?

 

  • Aggregated Reporting - How do you get a decent cross business entities report with the data scattered about in all those services?

 

  • Transparent Emergence - How do I know where to find a service?

 

  • Decoupled invocation - How can I handle peaks and high-loads without my service failing?

 

  • Orchestrated Choreography  - How do I expand the behavior of hard-to-change service (e.g. legacy systems exposed as services) ?

 

Well, I hope this sparks enough interest to make you follow the rest of the posts on this subject :)


 
Tags: Everything | SOA | Software Architecture

I just read an excellent post by Gregor Hohpe talking about the motivation for Event Driven semantics for service communication.

Gregor gives an example of a shipping service listening on order events and address change events to produce shipments.

It is nice to see how architectural approaches transcend business domains so well. In the Naval C4I project Udi Dahan and I are working on, we basically try to take the same approach. For example: a Sensors service publishes its status at a predefined interval. The sensor knows if something is wrong with its state; it doesn't, however, know whether the problem is important or not. We designed an Alerts service that listens in on status messages; based on (changing) business rules, a certain status may trigger an alert event (which a UI can then choose to display), and a severe alert may result in an SMS alerting a technician to come and have a look.
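A listener of this kind could look something like the sketch below. It is not the project's code: the message fields, the temperature thresholds and the names AlertsService and overheating_rule are invented for illustration. The point is that the rules live in the Alerts service and can change without touching the sensors.

```python
from typing import Callable, Dict, List, Optional

# A rule inspects a status message and returns a severity, or None if it is fine.
Rule = Callable[[Dict], Optional[str]]


def overheating_rule(status: Dict) -> Optional[str]:
    """Hypothetical business rule; the thresholds are made up."""
    temperature = status.get("temperature", 0)
    if temperature > 90:
        return "severe"
    if temperature > 70:
        return "warning"
    return None


class AlertsService:
    def __init__(self, rules: List[Rule], send_sms: Callable[[Dict], None]):
        self.rules = rules        # replaceable at runtime as business rules change
        self.send_sms = send_sms  # injected, so no real SMS gateway is assumed
        self.alerts: List[Dict] = []  # alert events a UI may choose to display

    def on_status(self, status: Dict) -> None:
        """Called for every status message the Sensors service publishes."""
        for rule in self.rules:
            severity = rule(status)
            if severity is None:
                continue
            alert = {"sensor": status["sensor"], "severity": severity}
            self.alerts.append(alert)
            if severity == "severe":
                self.send_sms(alert)
            return  # first matching rule wins in this sketch
```

The sensor only reports facts about itself; deciding what counts as important stays with the listener.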

 

However, while this approach is very good for inter-service communication, things aren't as rosy when it comes to interacting with UIs. The point is that UIs are based on interaction, so the request-reply idiom (which should actually be implemented as request-reaction) is much more prevalent. Users really want to know their request is being taken care of.

 

Another lesson we learnt is that since services may go on-line and off-line independently of each other, it is not enough just to support event listening for event aggregators to be up-to-date. One option is to rely on reliable messaging so that any event posted will eventually get to the listener. There are several problems with this approach, for example:

  •  For one, you need a reliable message transport which might be a problem e.g. you may not be able to use JMS/MSMQ between enterprises and/or the protocols you use don't support it (e.g. WS-RM is not durable see here  and here )
  • Even if you have reliable communication,  if one service has been offline for a long period of time (where long is defined by the communication load) - it may be a waste of time (or plainly wrong)  to process old events that are no longer valid

Another option to handle this situation is to include in the contract a request for the current state (the current state can be published using the same message structure used by the matching event). The advantage here is that a server coming on-line can quickly and efficiently get up-to-speed on the current situation.
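Here is a small sketch of that idea, with hypothetical names (SensorService, Aggregator, current_state) and an in-process callback standing in for the real transport. The state snapshot reuses the event message structure, so the aggregator needs only one handler for both catching up and live events.

```python
class SensorService:
    """Hypothetical publisher that also answers 'what is the current state?'."""

    def __init__(self):
        self._state = {}
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def update(self, sensor, status):
        message = {"sensor": sensor, "status": status}
        self._state[sensor] = message
        for listener in self._listeners:
            listener(message)

    def current_state(self):
        # Same message structure as the events, one message per sensor.
        return list(self._state.values())


class Aggregator:
    def __init__(self):
        self.view = {}

    def on_event(self, message):
        self.view[message["sensor"]] = message["status"]

    def come_online(self, service):
        # Catch up from the snapshot instead of replaying missed events,
        # then keep listening for new ones.
        for message in service.current_state():
            self.on_event(message)
        service.subscribe(self.on_event)
```

An aggregator that was offline while several updates happened ends up with only the latest status per sensor, which sidesteps processing old events that are no longer valid.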

 

The event thinking is relatively on par with Take-it-or-leave-it approach for contracts construction, but as I said in the previous post on contracts, I think it is more beneficial to know about your consumer and take their input into account

 

Apropos EDA, I also learnt today that the Micro-Services strategy Udi and I came up with had already been "invented" several years ago. It is called SEDA (Staged Event Driven Architecture); there's a nice presentation explaining it here


 
Tags: Everything | SOA | Software Architecture

 

Udi Dahan writes about "Contract First, Discussion Second?" saying that "a service's contract is a more "take-it, or leave-it" kind of deal." 

There are situations when this is true - for example when Amazon decided to expose some of its functionality as Services they probably didn't negotiate it with most (if not all) of us. Similarly, whenever you want to consume a deployed version of a service you can either use it as is or move on.


However, services are rarely developed in a void. This means that when you set out to design the next iteration of a service (first or otherwise) there are usually several potential consumers out there (other internal systems, partners etc.) and, like it or not, you will be negotiating the service contract with them. After all, the whole idea of the service is to add some business value. If you disregard your consumers, you will make it harder for them to actually make use of the (hopefully) wonderful functionality you will be providing.

 

This also means that it is better to negotiate the contract first (i.e. as one of the first steps of developing the next service version). Again, deciding on the contract upfront allows the other parties to get organized to make better use of the functionality that will be exposed through the service once it is deployed.

 

I suggest you be pragmatic when you set up to develop a service, meet with the potential consumers and try to agree on something that will be useful for them - or as the Beatles once said "let it be, let it be, speaking words of WSDL, let it be"…




 
Tags: Everything | SOA

I've just found out (via Gianpaolo's blog) that Roger Wolter (former PM of Service Broker) started blogging. He is going to focus on Data in a Service Oriented world. I had a chance to work with Roger for a short time, which was enough to notice that if anyone knows about data, it is him. I guess there is no surprise there, considering his past at Microsoft working on: SQL Server Service Broker, SQL Express, SQL XML Datatype, SOAP Toolkit, SQLXML and COM Plus.

His first post (after the obligatory "hello world" post) is about Service Broker positioning (vs. MSMQ, Biztalk and WCF) - Subscribed


 
Tags: .NET | Everything | SOA | Software Architecture

February 8, 2006
@ 11:26 PM

In the previous post I said "don't bubble exceptions out of your service" -  Ebenezer Ikonne asks  "Well I wonder what the verbiage of the exception should be?  If a null pointer occurred in the service, what message should I return back to the consumer of the service?"


First off, let's consider the meaning of bubbling the exception: what would a remote consumer, sitting on some other company's server, do with a "null pointer" exception?! The consumer doesn't have any control over the resources or life cycle (or anything else for that matter) of the service it is trying to consume. Also, if it depends on the internal problems of the service it consumes, that makes it (the consumer) much less autonomous.

 

So, what's the other option? Well, as I mentioned in my previous post, it is best if the service can "pretend" nothing really happened, e.g. log the incoming message before doing anything and then, if there's an exception, respond (if the contract requires a response by a deadline) with a "got your message, working on it, you'd get a confirmation message soon" sort of reaction. If the exception occurs before the incoming message is saved, then the service probably needs to respond with "out of service, try again soon". If the edge is not up, then you (as a consumer) should (finally) get an exception (the protocol failed: the message you've sent did not arrive).
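The order of operations matters here, so a minimal sketch may help. The function make_edge and its store and process parameters are hypothetical names invented for this example; store is anything with an append method that persists the message.

```python
def make_edge(store, process):
    """Build a message handler that never leaks internal exceptions.

    store   -- hypothetical durable log with an append(message) method
    process -- the service's business logic, which may raise
    """
    def handle(message):
        try:
            store.append(message)  # save first, before doing anything else
        except Exception:
            # Nothing was saved, so the consumer has to resend later.
            return "out of service, try again soon"
        try:
            process(message)
            return "done"
        except Exception:
            # The message is safely stored and can be retried internally;
            # the consumer never sees the null pointer (or whatever it was).
            return "got your message, working on it"
    return handle
```

Catching a broad Exception is deliberate in this sketch, since the point is that no internal failure should reach the consumer. A real edge would also log the failure and schedule the retry.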

 

By the way, I think that a somewhat similar principle is true for bubbling exceptions across layers in a layered architecture.


 
Tags: Everything | Software Architecture | SOA