July 12, 2008
@ 10:30 PM
My friend Gunnar Peterson asked for my opinion on SOA and security concerns. Here's what I wrote him:

In a paper I wrote a couple of years ago I examined the relevancy of the “fallacies of distributed computing” defined by Peter Deutsch almost 20 years ago. Writing about the “Network is Secure” fallacy, I noted that after all these years you would think it a no-brainer that you cannot assume the network is secure. Alas, it still happens all the time - and that's for "regular" distributed systems.

In my opinion, assuming the network is secure for an SOA is not only naïve but negligence, pure and simple. The whole premise of moving an organization to SOA is connectedness and integration. So, unless your SOA fails, it will be connected to other systems. Whether you are building RESTful systems, WS-* SOAs, EDAs or any combination of these architectural styles, if you don't treat the service boundary as a border and secure it – you will be sorry…

Security in SOA should be considered at the "grand-scheme" level, with issues like authentication and authorization, but also at the single-service level, looking at issues like DDoS, SQL injection, elevation of privilege and what not. A seemingly trivial thing like exposing a transaction beyond service boundaries can let an attacker deny service in your system simply by locking out your database. Again, this is just a simple example.

The other thing about security is that you have to consider it early. Patching security in "later on" can have devastating effects on a system's capabilities, especially in areas related to performance. I have seen even military systems that had to go through serious rework just because security was added as an afterthought instead of being handled early on.


 
The .NET community is often criticized (and rightfully so in many cases) for lacking innovation, with a lot of developers relying either on Microsoft or on ports of Java stuff* (JUnit -> NUnit, log4j -> log4net, Hibernate -> NHibernate, Spring -> Spring.NET and the list goes on).
Amidst all this, it is nice to see that a lot of original .NET ideas are generated here in Israel**.

Few examples that come to mind are
The reason for this post is a new open source project coming from here. This one is from Sasha Goldshtein and Alon Fliess and it is a Non Paged CLR host which offers
"This custom CLR host ensures that all memory given out to the CLR is locked into physical memory if possible, thus eliminating paging completely."
Which can be very useful in performance intensive applications. Cool :)


* This is not to say that the port isn't sometimes better than the original (NUnit 2.0 is a prime example of that in my opinion)
** Naturally there's also original stuff happening outside of Israel :) e.g. Ninject by Nate Kohari or StoryTeller by Jeremy D. Miller
 
Tags: .NET | new

July 2, 2008
@ 01:07 PM
As I was going through my RSS list (which I publish as a linkblog) I noticed Reginald Braithwaite's post on the metaobject protocol. Reginald says

The bottom line: Languages like Java give you an object protocol: There is one way to do things with objects and classes and interfaces, period. Anything else, like adding generics or annotations, must be done outside of the language, it must be done by the creators of the language.

Languages like Common Lisp and Smalltalk give you a meta object protocol: You can decide for yourself how to do things with objects, classes, interfaces, generic functions, whatever you want. You don’t need to wait for a committee to try something different.

What's nice is that a couple of posts later I noticed a link by Harry Pierson to an attempt by Jeff Moser to port OMeta to the .NET world and to enable the very "meta object protocol" Reginald mentions. Here are the opening bits from Jeff's post (go read the rest):

What if programming language implementations were object-oriented? What if you wanted to change one itty-bitty part of your favorite language? Say you wanted to add an exponentiation operator (^^). How hard would it be?

Wouldn't it be nice if you could "subclass" a language like C# and then add 5 lines of code to make it work and then use it? What if you could add Ruby-like method_missing support in 20 lines?

What if you could conceive of a new language and start experimenting with it in production code in an hour? What if it leveraged your knowledge of existing frameworks like .NET?

Also check out the code Jeff posted so far (on CodePlex)


 
Tags: .NET | Design | OO | Trends

Technical documentation of the code/project seems like a worthy goal. After all, if we aren't living in a vacuum, someone will need to understand our code/design and maintain it, use it to develop new stuff etc.
On the other hand writing documentation can get tedious, boring, hard to maintain and whatnot - so you get all sorts of ways to deal with it. Consider, for example, the following:
  • In a recent article on IBM's developerWorks, Paul Duvall suggests you use automation to generate various documentation artifacts like UML diagrams based on the source code (using Ant tasks and UMLGraph), ERDs (with SchemaSpy) etc.
  • Another example is a .NET tool called GhostDoc. This tool automagically generates C# XML documentation comments ("///"), e.g. (all comments below are GhostDoc's work):
/// <summary>
/// Appends the HTML text.
/// </summary>
/// <param name="htmlProvider">The HTML provider.</param>
public void AppendHtmlText( IHtmlProvider htmlProvider )
/// <summary>
/// Adds the specified item.
/// </summary>
/// <param name="item">The item.</param>
public void Add( string item )
/// <summary>
/// Determines the size of the page buffer.
/// </summary>
/// <param name="initialPageBufferSize">Initial size of the page buffer.</param>
/// <returns></returns>
public int DeterminePageBufferSize( int initialPageBufferSize )

[from Introduction to GhostDoc]
What I think is that while both of these efforts can help satisfy a customer-specific requirement for "comprehensive documentation"*, they have very little value in making anyone understand anything about your code. UML diagrams can only help if they are created at a higher level of abstraction than the code (which means they'd be hand-crafted), and if GhostDoc can understand your code well enough to create anything useful, it means that your method and parameter names are self-descriptive anyway.

In a previous post I mentioned that I prefer to rely on tests, short methods and meaningful names for readability. I'll talk about tests in another post; for this installment let's look at the other two. I think this is best demonstrated by an example.
Consider the following horror of a method:
       public void HandleWithFrame(FrameProp Frame)
        {

         int FreeProcessNum = 0;
         int FreeProcessId = 0;

         if (Frame != null)
         {
             rwl.EnterWriteLock();

             if (m_WaitingFrame.ContainsKey(Frame.m_SessionId))
                 m_WaitingFrame[Frame.m_SessionId].m_Frame = Frame;
             else
                 m_WaitingFrame.Add(Frame.m_SessionId, new WaitingFrame(Frame));

             IncrementPriority();

             rwl.ExitWriteLock();
         }

           rwl.EnterUpgradeableReadLock();
           foreach (var keyValuePair in m_ProcessMap)
          {
            if (keyValuePair.Value.m_Busy == false)
            {
               FreeProcessId = keyValuePair.Key;
               ++FreeProcessNum;

               if (FreeProcessNum > 1)
                     break;
            }

          }

           if (FreeProcessNum == 0)
           {
               rwl.ExitUpgradeableReadLock();
               return;
           }
       

           if (FreeProcessNum >= 2  )
           {

               rwl.EnterWriteLock();
            
                m_ProcessMap[FreeProcessId].SendFrame2CV(Frame);
              
                m_WaitingFrame[Frame.m_SessionId].m_NumProcess += 1;
                m_WaitingFrame[Frame.m_SessionId].m_Priority = 0;
                m_WaitingFrame[Frame.m_SessionId].m_Frame = null;

                m_ProcessMap[FreeProcessId].m_Busy = true;
              
 
              rwl.ExitWriteLock();
              rwl.ExitUpgradeableReadLock();
              return ;
          }


        


          WaitingFrame MaxPriority = new WaitingFrame();
          MaxPriority.m_NumProcess = 1000;
          MaxPriority.m_Priority = -1;

       
            foreach (var Item in m_WaitingFrame)
            {
                 if (Item.Value.m_NumProcess < MaxPriority.m_NumProcess)
                      if (Item.Value.m_Priority > MaxPriority.m_Priority)
                          MaxPriority.m_Frame = Item.Value.m_Frame;
            }
       
         rwl.EnterWriteLock();

       //  Console.WriteLine("ProcessId={0} Assign", FreeProcessId);
          m_ProcessMap[FreeProcessId].SendFrame2CV(MaxPriority.m_Frame);
          m_ProcessMap[FreeProcessId].m_Busy = true;

        
          m_WaitingFrame[MaxPriority.m_Frame.m_SessionId].m_NumProcess += 1;
          m_WaitingFrame[MaxPriority.m_Frame.m_SessionId].m_Priority = 0;
          m_WaitingFrame[MaxPriority.m_Frame.m_SessionId].m_Frame = null;
        
        
        rwl.ExitWriteLock();
        rwl.ExitUpgradeableReadLock();
        return ;

       }
This method needs a lot of explanation if you want to understand what exactly is going on here. So you can set yourself up to write a lot of comments while trying to figure things out, or, assuming this class was fully tested (which it wasn't, but that's another story), try to refactor it until you get something meaningful (I will omit the added tests for brevity).

Step 1 - First If

It seems that when we have a first frame we want to keep it aside, so let's extract method all the if code to EnqueueFrame

           if (Frame != null)
               EnqueueFrame(Frame);

OK, so now we look at EnqueueFrame. The code we see here talks to the m_WaitingFrame private member (which is a Dictionary<Guid, WaitingFrame>). The first thing we'll do is rename it to FramesQueue. The more interesting thing is that the code here is about managing this FramesQueue and isn't directly related to the containing class.
We can either subclass the Dictionary class or add an extension method to Dictionary<Guid, WaitingFrame> to handle this for us.
So we'll do that and then refactor the if again:

           if (Frame != null)
           {
               rwl.EnterWriteLock();
               FramesQueue.Enqueue(Frame);
               rwl.ExitWriteLock();
           }

and Enqueue looks like this (in a separate internal static class):

        public static void Enqueue(this Dictionary<Guid, WaitingFrame> queue, FrameProp Frame)
        {

            if (queue.ContainsKey(Frame.m_SessionId))
                queue[Frame.m_SessionId].m_Frame = Frame;
            else
                queue.Add(Frame.m_SessionId, new WaitingFrame(Frame));

            queue.Prioritize();

        }
The advantage of what we've achieved thus far is both better OO design (separation of concerns) and enhanced readability by using intention revealing names and notation.

Step 2 - The foreach loop
Again we'll start with Extract Method. We can now remove the definition of FreeProcessNum from the beginning, and we get
var FreeProcessNum = GetFreeProcessNum(ref FreeProcessId);

but this code is not really clear. For one, we have to rename FreeProcessNum to FreeProcessesCount to make it more legible, and we have an ugly and hard-to-follow ref variable. It is probably better to apply the Single Responsibility Principle and separate this into two distinct methods, so we'd get
var FreeProcessesCount = ProcessesList.CountFree();
var FreeProcessId = ProcessesList.GetNextFreeId(); // we don't really need/want the ID but we'll fix that later

(as in the previous example, we add extension methods to ProcessesList to make the code more intention-revealing and for better separation of concerns)
All we want to do in CountFree is count how many processes are not marked as busy, so we can rewrite

var FreeProcessNum = 0;
foreach (var keyValuePair in ProcessesList)
{
        if (keyValuePair.Value.m_Busy == false)
        {
               FreeProcessId = keyValuePair.Key;
               ++FreeProcessNum;

               if (FreeProcessNum > 1)
                       break;
         }

 }
  return FreeProcessNum;
into
        public static int CountFree(this Dictionary<int, ProcessStatus> processesList)
        {
            return processesList.Count(item => item.Value.m_Busy == false);
        }

Thank you MS for adding LINQ and lambda expressions :). The same can be done for GetNextFreeId.
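
To make that last remark concrete, here is a minimal sketch of what GetNextFreeId might look like next to CountFree. It is an illustration under assumptions, not the project's actual code: ProcessStatus is reduced to a stand-in with only the m_Busy flag, and returning a nullable int when nothing is free is my choice here, not something the original method did.

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for the real ProcessStatus; only the m_Busy flag is assumed.
public class ProcessStatus
{
    public bool m_Busy;
}

internal static class ProcessesListExtensions
{
    // Counts the processes that are not marked as busy.
    public static int CountFree(this Dictionary<int, ProcessStatus> processesList)
    {
        return processesList.Count(item => !item.Value.m_Busy);
    }

    // Returns the id of some free process, or null when every process is busy.
    // Dictionary enumeration order is not guaranteed, so "next" only means
    // "any free one".
    public static int? GetNextFreeId(this Dictionary<int, ProcessStatus> processesList)
    {
        return processesList
            .Where(item => !item.Value.m_Busy)
            .Select(item => (int?)item.Key)
            .FirstOrDefault();
    }
}
```

The nullable return value forces the caller to deal with the "nobody is free" case explicitly instead of relying on a separate count check.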
 

Step 3 - The two Ifs and the rest
Taking a deeper look at the code, we can see that the rest of the method tries to find a free processor: if more than one is free it sends the incoming frame right away, otherwise it sends the top-prioritized waiting frame. We can also spot a bug here: two different threads can get the same processor and then try to send a message to it one after the other. Another potential bug comes from the way the maximal priority is found. The search is seeded with a hard-coded value of 1000, on the assumption that no real value will ever reach it. While that isn't likely to happen, it is still a hard-coded assumption.
Anyway, if we continue and apply the same principles that got us here (Single Responsibility Principle, Don't Repeat Yourself, intention-revealing methods, coherence and opening classes to add specific functionality) we get the original method to look like the following:

       public void ProcessFrame(FrameProp nextFrame) //was HandleWithFrame
        {
           rwl.EnterWriteLock();
           try
           {
               if (nextFrame != null)
                   FramesQueue.Enqueue(nextFrame);

               TryDispatchTopFrame();
           }
           finally
           {
               rwl.ExitWriteLock();
           }

       }
Compare this with the original method....
Also note that it isn't that the functionality disappeared - it is just neatly distributed and grouped in short related methods in related classes e.g.

    internal static class FramesQueueExtensions
    {
        public static void Enqueue(this Dictionary<Guid, WaitingFrame> queue, FrameProp Frame)
        {

            if (queue.ContainsKey(Frame.m_SessionId))
                queue[Frame.m_SessionId].m_Frame = Frame;
            else
                queue.Add(Frame.m_SessionId, new WaitingFrame(Frame));

            queue.UpdatePriorities();

        }

        public static void ResetSlot(this Dictionary<Guid, WaitingFrame> queue,Guid slotId)
        {
            queue[slotId].m_NumProcess += 1;
            queue[slotId].m_Priority = 0;
            queue[slotId].m_Frame = null;
        }


        public static void UpdatePriorities(this Dictionary<Guid, WaitingFrame> queue)
        {

            foreach (var Item in queue)
            {
                if (Item.Value.m_Frame != null)
                    Item.Value.m_Priority += 1;
            }


        }
        public static FrameProp FindTopPrioritized(this Dictionary<Guid,WaitingFrame> queue)
        {
            var maxPriority=  queue.Max(item => item.Value.m_Priority);
            return queue.First(item => item.Value.m_Priority == maxPriority).Value.m_Frame;
        }
    }
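
One thing FindTopPrioritized as shown leaves open is what happens when nothing is waiting: Max throws on an empty dictionary, and a slot whose frame was already reset to null can still be picked. The following is a hedged sketch of a guarded variant, not part of the original refactoring. FrameProp and WaitingFrame are cut-down stand-ins, and FindTopPrioritizedOrNull is a name I made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-ins for the types used in the post.
public class FrameProp
{
    public Guid m_SessionId;
}

public class WaitingFrame
{
    public FrameProp m_Frame;
    public int m_Priority;
    public int m_NumProcess;
}

internal static class FramesQueueSketch
{
    // Returns the pending frame with the highest priority,
    // or null when no slot currently holds a frame.
    public static FrameProp FindTopPrioritizedOrNull(this Dictionary<Guid, WaitingFrame> queue)
    {
        return queue.Values
            .Where(slot => slot.m_Frame != null)        // skip slots that were already dispatched
            .OrderByDescending(slot => slot.m_Priority) // highest priority first
            .Select(slot => slot.m_Frame)
            .FirstOrDefault();
    }
}
```

A caller can then simply skip dispatching when the result is null.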

You should note that this is not the end of the refactoring (e.g. we should still handle the WaitingFrame, the FramesQueue and the ProcessesList which are used here); we just took a look at a single method.

While there might still be a need for an occasional explanatory remark, I think this little exercise demonstrates that we can gain a lot in the way of clarity by refactoring code and keeping to a few simple principles. Oh yeah, and what we got at the end of the process is not just readable code, but also more maintainable, better designed code that can move forward and evolve further as the system changes.



* I don't underestimate the value of generating full documentation when there's such a requirement from a customer. I would prefer to convince a customer that having such a Write-Only document is a complete waste of time  and trees but sometimes you can't help it. Generating documents in these situations can be a life-saver.


 
Tags: .NET | Agile | refactoring

June 16, 2008
@ 06:23 PM

This is what I've been working on for the past year or so :)


 
Tags: PaperLnx | xsights

I guess the designers of WCF really want to discourage some uses of the framework - otherwise I can't really understand some of their choices.

For instance, when you create a stateful service (InstanceContextMode = InstanceContextMode.Single) the default concurrency behavior is single-threaded. In this mode, WCF will serialize all the calls to the service, and messages will wait or time out. While it is easier to program, this has no real-life use except maybe for demo applications in TechEd presentations.
Luckily you can override that and set ConcurrencyMode = ConcurrencyMode.Multiple to get a multithreaded service, but the default is useless at best. By the way, beware of ConcurrencyMode.Reentrant: in this setting you still have a single-threaded service, but WCF can accept calls while you call out to other services, so you need to take care of multithreading issues without getting the benefits.
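
As a rough illustration of the override (assuming the .NET Framework's System.ServiceModel assembly is referenced), a singleton service that opts into multithreading might be declared like this. IEchoService and EchoService are made-up names for the sketch; only the ServiceBehavior settings come from the discussion above.

```csharp
using System.ServiceModel;

// Hypothetical contract, here only to make the sketch complete.
[ServiceContract]
public interface IEchoService
{
    [OperationContract]
    string Echo(string message);
}

// One shared instance, with calls dispatched concurrently instead of serialized.
[ServiceBehavior(
    InstanceContextMode = InstanceContextMode.Single,
    ConcurrencyMode = ConcurrencyMode.Multiple)]
public class EchoService : IEchoService
{
    private readonly object sync = new object();
    private int callCount;

    public string Echo(string message)
    {
        // With ConcurrencyMode.Multiple, protecting shared state is our job.
        lock (sync)
        {
            callCount++;
        }
        return message;
    }
}
```

The price of ConcurrencyMode.Multiple is visible right in the sketch: any state on the shared instance has to be synchronized by hand.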

Another example, which is even worse, is the default for the maximum number of connections for self-hosted services. This is limited to 10. Yes, 10 concurrent connections. We found that out when we set up a service that had, lo and behold, 11 different services interacting with it. These services would call the service something like 10 times a second, and occasionally we got timeout exceptions. At first we figured we got something wrong with the multithreading implementation, so we spent a couple of days going over the locks and releases and what-not. Then we thought the problem was with the transport (net.tcp), so we changed that to http and still saw the same problems. Only then did we figure out that, as I mentioned above, the default is 10 concurrent sessions.
To solve this problem you need to configure the throttling behavior of the service by using ServiceThrottlingBehavior. This class has three useful settings:

The MaxConcurrentCalls property limits the number of messages that currently process across a ServiceHost.

The MaxConcurrentInstances property limits the number of InstanceContext objects that execute at one time across a ServiceHost.

The MaxConcurrentSessions property limits the number of sessions a ServiceHost object can accept.


The default for MaxConcurrentCalls is 16, for MaxConcurrentInstances it is Int32.MaxValue and for MaxConcurrentSessions it is 10.
If you're using a self hosted service bump these up or you might DOS yourself like we did :)
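
For a self-hosted service, raising the limits in code could look roughly like the sketch below. It assumes a reference to System.ServiceModel; the helper name RaiseThrottles and the numbers are placeholders to be tuned for your own load, not recommendations.

```csharp
using System.ServiceModel;
using System.ServiceModel.Description;

public static class ThrottlingSetup
{
    // Call this before host.Open(); behaviors cannot be changed once the host is open.
    public static void RaiseThrottles(ServiceHost host)
    {
        // Reuse a throttling behavior that came from configuration, if there is one.
        var throttle = host.Description.Behaviors.Find<ServiceThrottlingBehavior>();
        if (throttle == null)
        {
            throttle = new ServiceThrottlingBehavior();
            host.Description.Behaviors.Add(throttle);
        }

        // Illustrative values only.
        throttle.MaxConcurrentCalls = 64;
        throttle.MaxConcurrentSessions = 100;
        throttle.MaxConcurrentInstances = 100;
    }
}
```

The same three settings can also be set declaratively through the serviceThrottling element of a service behavior in the configuration file, which keeps the numbers out of the code.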

Anyway, these defaults are a real barrier to scale and performance. Sure, you can change them easily, but you first have to know about them, and that's the problem. Hopefully, my wasted time will help you avoid these problems :)


 
Tags: .NET | SOA

While I am on the subject of project management, one nice feature of Mingle (I am not sure whether it is new in v2 or was already there in v1) is requirements traceability.
Requirements traceability is a requirements management discipline which makes it possible to attribute artifacts (design, code etc.) to the requirements that originated them. It also provides a way to manage changes (if you know which artifacts were created to fulfill a requirement, you also know the artifacts that are likely to change when the requirement changes).

However, above all that, requirements traceability is a pain in the ass*! Which means it is hardly ever done, especially in agile projects (working code over extensive documentation, remember?).

Anyway, as I said, a nice feature of Mingle is this: if you set it up to work with your Subversion repository (I created a user for Mingle) and add the card number to the check-in comment whenever you check in code (e.g. #123 - fixed this and that bug), Mingle will let you navigate that relation, i.e. in History -> revisions you can see the card number and click through to the card itself. If you also do the same from tasks to stories, you get good traceability with very little effort.



* And I can tell you that from personal experience, having once volunteered(!) to do the whole traceability matrix for a 1000+ man-year project, from the analysis to the customer requirements (it took more than 2 weeks, in case you are wondering).
 
June 2, 2008
@ 09:56 PM
When you are working towards a specific goal, especially if it is a looming milestone or something to that effect, you tend to leave stuff behind, cut some corners, allow just a wee bit of slack... This "stuff" has a nice metaphor to describe it. It is called "Technical Debt".

If you don't mind your technical debt it can grow so much you'd end up with a big-ball-of-mud which is hard to maintain, change and, well, do anything useful with.

The best way I've found to deal with this is to "add a card" every time I feel the urge to add a //TODO comment. In other words, add the technical debt as a task to the product/task backlog.
Having the technical debt on the backlog has several benefits such as
  • It will not be forgotten - it will be documented...
  • It will not be hidden - The true state of the product will be in the open for management/product owner to see. As a manager I want to know the true state of the product. If I know what I can and can't have, I can get ready for that. If I think everything is rosy and then the system blows up in my face, that's not so good.
  • It will be managed - The importance/relevance of the "debt" will be reevaluated every time the product backlog gets prioritized.
Technical debt will occur in your project, whether it is agile, "water-falled", incremental or what not. Don't ignore it.



 
Dare Obasanjo complains about ReSharper 4.0's recommendation to use implicitly typed locals (i.e. var someVariable = SomeMethod(); rather than SomeType someVariable = SomeMethod();)
It is a small issue and I probably wouldn't comment on it, except that I hear some of the same complaints from members of our team. The main grievance Dare has is that using var impedes the readability of the code. He also says that people using var will be more inclined to use long "Hungarian style" variable names.
Dare also mentions the MS recommendation on the use of var, which recommends

As for me, I am just happy with anything that get me nearer to true duck typing and the extreme-late-binding it offers.
Dr. Alan Kay (inventor of Smalltalk and one of the fathers of OOP) even says that extreme late-binding is one of the essential attributes of an object-oriented language:
"(I'm not against types, but I don't know of any type systems that aren't a complete pain, so I still like dynamic typing.)
OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them."
Yes, the var keyword is still a far cry from duck typing, as it doesn't provide the true late binding to the needed interface that you get in languages like Python or Ruby. It does, however, take one worry away from you and helps reduce the overall accidental complexity. var, extension methods and lambda expressions all help make C# more "dynamic" and easier to work with.
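
A small sketch of that distinction, using made-up names (VarDemo, TotalLength): var only asks the compiler to infer the type, so everything below is still statically typed and checked at compile time.

```csharp
using System.Collections.Generic;

public static class VarDemo
{
    public static int TotalLength(IEnumerable<string> names)
    {
        // Inferred as Dictionary<string, int>; no type name repeated on the left.
        var lengths = new Dictionary<string, int>();
        foreach (var name in names)        // name is inferred as string
            lengths[name] = name.Length;

        var total = 0;                     // inferred as int
        foreach (var pair in lengths)      // pair is KeyValuePair<string, int>
            total += pair.Value;

        // total = "five";                 // would not compile: total is an int
        return total;
    }
}
```

The ceremony goes away, but the binding is still early: assigning a string to total is rejected by the compiler rather than discovered at run time.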

I think Ola Bini summed the issue best:
"A statically typed language with type inference will give you some of the same benefits as a good dynamic language, but definitely not all of them. In particular, you get different benefits and a larger degree of flexibility from a dynamic language that can't be achieved in a static language. Neal Ford and others have been talking about the distinction between dynamic and static typing as being incorrect. The real question is between essence and ceremony. Java is a ceremonious language because it needs you to do several dances to the rain gods to declare even the simplest form of method. In an essential language you will say what you need to say, but nothing else. This is one of the reasons dynamic languages and type-inferenced static languages sometimes look quite alike - it's the absence of ceremony that people react to."
Edit: As for the code readability claim, I prefer to focus on stronger methods like keeping methods short, using meaningful method and variable names, and supporting tests (which can actually help you understand how the code behaves...). Not to mention that, if you really, really need it, ReSharper will tell you the type if you put the mouse over the "var" keyword ;)
 
Tags: .NET | OO