Posted on 11/17/2008 5:38:05 PM by Justin Etheredge
This weekend, after the Raleigh Code Camp, James Avery put on an Open Spaces conference called Shadow Camp. It was a small group of people who came together on Sunday morning to discuss the topic of complexity in software. While people were discussing topics for the different slots, one of the topics which was suggested by Corey Haines was "Emergent Design". The idea of Emergent Design is not a new one in the agile world, but discussing this in relation to software complexity led to an interesting discussion on entropy in software.
For many people the idea that complexity is emergent may sound like an obvious statement. We all know from the second law of thermodynamics that systems trend toward disorder, but we think of those systems as uncontrolled natural systems. We don't think of our software as a natural system with forces that are out of our control, one which changes and evolves on its own. But our software is constantly changing and evolving in ways that are more than the sum of the changes we put into it. Every time a developer touches a piece of code, the design of the system diverges a little further from the original design.
Without someone to constantly guide the design at a high level, it will slowly descend into chaos. Small changes in different places in the application will combine to form larger changes which will affect larger swaths of the application. So what do we do? Do we constantly keep an eye on the architecture of our systems? Well, the short answer is yes, but the long answer is that we can make decisions in our architectures that will allow us to minimize the impact of entropy.
Complexity in software is all about interactions. Now obviously interactions must happen, if they didn't then the software couldn't do anything. But complexity isn't just simply about the surface of your software and the number of methods on objects, it is about the combination of the number of methods along with the number of other methods which can call that method. Let's say that we have two classes and each class has five public methods. Then we have another two classes, each with 1 public method, and 4 private methods.
Judging from what we said above, you might think that we could just say 5 * 5 and figure that we have 25 possible interactions in the first example, and one possible interaction in the second example. But the reality is much worse. In the first example single methods could call multiple other methods, and methods within one class could call other methods from the same class. Now you may be saying to yourself, what does it matter if methods in a class call methods in the same class? If they are in the same class, can't we just change them? No, we can't. If we have exposed them publicly, we have created a contract on each of those methods. If we decide to change one of them, then we have to create a new method to support the new contract.
What all of this means is that as you expose more and more methods from your classes the potential for complexity increases exponentially. As you add more and more classes, the numbers just start increasing at a startling rate. So, let's just assume that our simple 5 * 5 numbers above are accurate. If we had 3 classes, then this turns from 25 into 125. If we have 5 classes then we are now at a staggering 3125 possible interactions. If we stick with the 5 method number and go with 25 classes, which is still a fairly small application, then the numbers become almost incomprehensible at 2.98023224 × 10^17. This potential for interaction is what allows your so beautifully architected application to slowly descend into a ball of chaos if these interactions aren't constantly managed.
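If you want to check those numbers yourself, here is a minimal sketch. It assumes the same simplified model as above (methods per class raised to the number of classes), and the `InteractionEstimate` class and `Estimate` method are names I made up for illustration.

```csharp
using System;

public static class InteractionEstimate
{
    // Hypothetical helper: methodsPerClass raised to the power classCount,
    // mirroring the simplified 5 * 5 model from the post.
    public static long Estimate(int methodsPerClass, int classCount)
    {
        long total = 1;
        for (int i = 0; i < classCount; i++)
        {
            total = checked(total * methodsPerClass); // throws on overflow
        }
        return total;
    }

    public static void Main()
    {
        foreach (int classes in new[] { 2, 3, 5, 25 })
        {
            // Prints 25, 125, 3125 and 298023223876953125 respectively.
            Console.WriteLine("{0} classes: {1}", classes, Estimate(5, classes));
        }
    }
}
```

The last value is the 2.98023224 × 10^17 figure quoted above, written out in full.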
One of the tools that you can use to manage this complexity is partitioning. Divide up your application into chunks, and then manage the interactions between the chunks through strictly defined interfaces. In DDD this is referred to as a bounded context and without them, not only do you have ever increasing complexity, but it gets harder and harder to manage the complexity. The reason for this is that as you add more and more classes to your application you have to consider them when designing new classes.
Even though we have five public methods on each class, we have 25 interactions in each context with only minimal interactions between contexts. In extremely large applications, this can be one of the only ways in which to greatly reduce the possible number of interactions.
At a lower level another approach you can take is to try and make methods private or protected. But be careful with this approach! Going overboard can cause your application to be overly rigid, but you also have to remember that every method you expose is a contract that you have tied yourself to in the future, especially if your class is exposed outside of your module.
Yet another approach you can take is to implement the Principle of Least Knowledge (also referred to as the Law of Demeter). This basically says that an object should only interact with methods on objects that it is directly holding, and should not call through an object to another object. For example:
public void Method(SomeObject obj)
{
    obj.OtherObject.MethodOnOtherObject(); // don't do this
}
By making these kinds of calls you are instantly increasing the number of other classes that your class interacts with directly. Instead, the call to "MethodOnOtherObject" should be wrapped in a method on "SomeObject" that does the interaction on behalf of this method. So, something like this:
public void Method(SomeObject obj)
{
    obj.PerformAction();
}
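To make the wrapping concrete, here is one possible sketch of what "SomeObject" might look like. The class bodies are my own assumptions for illustration (the post only names the types and methods), and the `Caller` class exists just to make the snippet runnable.

```csharp
using System;

public class OtherObject
{
    public void MethodOnOtherObject()
    {
        Console.WriteLine("OtherObject did the work");
    }
}

public class SomeObject
{
    // Assumption: the collaborator is kept private, so callers
    // cannot reach through SomeObject to get at it.
    private readonly OtherObject otherObject = new OtherObject();

    // The wrapper: callers only ever talk to SomeObject.
    public void PerformAction()
    {
        otherObject.MethodOnOtherObject();
    }
}

public static class Caller
{
    public static void Method(SomeObject obj)
    {
        obj.PerformAction();
    }

    public static void Main()
    {
        Method(new SomeObject());
    }
}
```

With this shape, "Method" depends on one class instead of two, and "SomeObject" is free to swap out how it performs the action later.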
Any way you can find to reduce the coupling between your objects will help you refactor later to reduce the complexity that is always going to bleed into your application. Managing complexity, and therefore keeping our applications agile, is our primary job as architects and developers. Next time you are designing an application, a class, or just a method, ask yourself if you are doing everything you can to manage the complexity.
Posted on 11/13/2008 3:15:08 PM by Justin Etheredge
I am giving a talk on IronRuby at the Raleigh Code Camp this weekend. It is titled "Microsoft and Ruby Sittin' in a Tree" and is basically a 100-200 level overview of Ruby along with a few IronRuby specifics such as .net integration.
If you are at the Raleigh Code Camp, and you don't suck, then you should come check out my talk!
Posted on 11/11/2008 9:16:26 PM by Justin Etheredge
Here are the previous parts to this series:
Part 1 - Dynamic Keyword
Part 1.1 - Dynamic Keyword Second Look
Part 2 - Default And Named Parameters
Part 2.1 - Default Parameter Intrigue
Part 3 - Generic Covariance
I went on a serious blogging streak, and then all of a sudden I just dropped off with my C# 4.0 new features series. I guess that just goes to show you that I need to spread this stuff out! Anyways, I am back today with part 4 of this series, on generic contravariance. This post is actually probably going to be pretty short, because contravariance is a topic that is very similar to covariance, and it also isn't really all that interesting. It can be quite useful though, and that is why I am bringing it to you here.
In the last post we created an interface that looks like this:
public interface IContainer<out T>
{
    T GetItem();
}
With this interface we can now say that IContainer is covariant on T, which as explained previously, means that this type can now return objects of type T and anything more specific than T (subclasses). But what does it mean for something to be contravariant? Well, it is probably easier to show it than it is to explain it. First we can take a look at our class which implements the interface above:
public class Container<T> : IContainer<T>
{
    private T item;

    public Container(T item)
    {
        this.item = item;
    }

    public T GetItem()
    {
        return item;
    }
}
We have our container class above, and we can pass in our item in the constructor and use it like this:
Circle circle = new Circle();
IContainer<Shape> container = new Container<Circle>(circle);
So you see that we are creating an instance of type Circle and then passing that into our Container class, which is being assigned to a variable of type IContainer<Shape>. Here we are seeing covariance in action. But what happens if we want to perform some action on the item that our container is holding? We could add a method like this to our interface:
public interface IContainer<out T>
{
    T GetItem();
    void Do(Action<T> action);
}
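This won't work though. Well, why not? The issue is that Action is not contravariant on T (it will be in the .net 4.0 release, but it is not yet). Because T is covariant in IContainer, a T may only flow out of the interface. A delegate parameter that takes a T is only safe if the delegate itself is declared contravariant on T. Otherwise a Container<Circle> viewed as an IContainer<Shape> would be handed an Action<Shape> where its Do method expects an Action<Circle>, and the compiler has no way of knowing that this conversion is safe.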
In order to do this, we first need to declare a new Action delegate type:
public delegate void ContraAction<in T>(T a);
So you see that our keyword for contravariance is "in", since these types can only be passed in to a method. Now we can define our interface like this:
public interface IContainer<out T>
{
    T GetItem();
    void Do(ContraAction<T> action);
}
And so if we declare our container class like this:
public class Container<T> : IContainer<T>
{
    private T item;

    public Container(T item)
    {
        this.item = item;
    }

    public T GetItem()
    {
        return item;
    }

    public void Do(ContraAction<T> action)
    {
        action(item);
    }
}
We can then use it like this:
Circle circle = new Circle();
IContainer<Shape> container = new Container<Circle>(circle);
container.Do(s => s.MethodCallOnShape());
Pretty sweet, huh? So we have a container that can hold a Circle, Square, Triangle, etc... which can then take a delegate into a method that will perform an action on the base "Shape" type, but will accept any shape as a parameter.
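If you want to try this end to end, here is a self-contained sketch that pulls the pieces above into one file. It needs a compiler that supports the C# 4.0 variance keywords. The "Describe" method is my own stand-in for "MethodCallOnShape", and the "VarianceDemo" class is only there to give the code an entry point.

```csharp
using System;

public class Shape
{
    // Hypothetical stand-in for MethodCallOnShape.
    public virtual string Describe() { return "Shape"; }
}

public class Circle : Shape
{
    public override string Describe() { return "Circle"; }
}

// Contravariant delegate: T is only ever passed in.
public delegate void ContraAction<in T>(T a);

// Covariant interface: T is only ever handed out.
public interface IContainer<out T>
{
    T GetItem();
    void Do(ContraAction<T> action);
}

public class Container<T> : IContainer<T>
{
    private readonly T item;

    public Container(T item) { this.item = item; }

    public T GetItem() { return item; }

    public void Do(ContraAction<T> action) { action(item); }
}

public static class VarianceDemo
{
    public static void Main()
    {
        // Covariance: a Container<Circle> seen as an IContainer<Shape>.
        IContainer<Shape> container = new Container<Circle>(new Circle());

        // Contravariance: a delegate written against Shape runs on the Circle.
        container.Do(s => Console.WriteLine(s.Describe())); // prints "Circle"
    }
}
```

The lambda only knows about Shape, but because the item really is a Circle, the overridden method is the one that runs.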
Posted on 10/31/2008 7:35:37 AM by Justin Etheredge
Here are the previous parts to this series:
Part 1 - Dynamic Keyword
Part 1.1 - Dynamic Keyword Second Look
Part 2 - Default And Named Parameters
Part 2.1 - Default Parameter Intrigue
When generics were introduced in C# 2.0 they were one of the best features that ever came to C#. Anyone who had to create strongly typed collection classes in C# 1.0 knows exactly how much code generics saved us from having to write. The problem though is that generics don't seem to follow the same rules of inheritance that all of the other classes follow. Let's start off by defining two quick classes that we are going to use for the rest of this post:
public class Shape
{
}

public class Circle : Shape
{
}
Here we have our stereotypical class hierarchy, which is not doing anything currently. But the behavior of these classes is not important. Now, let's define a dummy container class that can hold an instance of any class:
public interface IContainer<T>
{
    T GetItem();
}

public class Container<T> : IContainer<T>
{
    private T item;

    public Container(T item)
    {
        this.item = item;
    }

    public T GetItem()
    {
        return item;
    }
}
Now that we have our hierarchy and our container class, let's look at something that we can't currently do in C# 3.0:
static void Main(string[] args)
{
    IContainer<Shape> list = GetList();
}

public static IContainer<Shape> GetList()
{
    return new Container<Circle>(new Circle());
}
We have a method called "GetList" which has a return type of "IContainer<Shape>" and then returns a "Container<Circle>" instance. Since Circle descends from Shape and Container implements IContainer, you would think that this would just work. But in C# 3.0, it doesn't.
In C# 4.0, we have a way to make this work, we can simply add the word "out" in front of the type parameter on our interface declaration (note that variance in C# 4.0 is limited to interfaces and delegate types):
public interface IContainer<out T>
{
    T GetItem();
}
This is telling the C# compiler that T is covariant, which means that an IContainer of some type can be used wherever an IContainer of a less specific type is expected. As we saw above, IContainer<Shape> was the return type, but with the out modifier on our interface's type parameter we have no problem returning a Container<Circle>.
So why did they decide to use the word "out"? Well, it is because whenever you define a type parameter as covariant, you can only return that type out of the interface. For example, this is invalid:
public interface IContainer<out T>
{
    void SetItem(T item);
    T GetItem();
}
But why won't that work? Because if that doesn't work, then that means that the IList<T> interface can't be covariant! Noooo! Well, the reason is actually pretty simple, type safety. Let's look at the implications of what we have done above:
static void Main(string[] args)
{
    IContainer<Shape> container = new Container<Circle>(new Circle());
    SetItem(container);
}

public static void SetItem(IContainer<Shape> container)
{
    container.SetItem(new Square()); // BOOM!!!
}
You see that since T is covariant, we can assign a "Container<Circle>" to our variable of type "IContainer<Shape>". We then pass it into our method "SetItem", which accepts a parameter of type "IContainer<Shape>", and that method tries to put a new "Square" into it. Well, it looks like this is valid. The parameter type is "IContainer<Shape>" and so we should be able to add a Square, right? Well, wrong. The line above would explode because we are actually trying to add a square to a container that holds circles. This is why they limited covariance to only a single direction.
Are you wondering how all of this is implemented in the clr? Well, there is no need to. The clr has supported variance on generic interfaces and delegates ever since generics were worked into it in .net 2.0. C# simply had no syntax for it until now. Since C# tries its best to maintain type safety, the compiler doesn't allow what we just did above. As an interesting side note, arrays in C# actually are covariant and do allow this behavior, at the cost of a type check at run time, so go try it out! I hope that you enjoyed this post, and the next in the series will be here soon!
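If you do want to try the array case, here is a small sketch. The "Square" class and the "ArrayCovarianceDemo" wrapper are just names I picked for the example. The assignment compiles, and the bad store is rejected at run time with an ArrayTypeMismatchException.

```csharp
using System;

public class Shape { }
public class Circle : Shape { }
public class Square : Shape { }

public static class ArrayCovarianceDemo
{
    // Returns true if the store succeeded, false if the runtime rejected it.
    public static bool StoreSquareInCircleArray()
    {
        Shape[] shapes = new Circle[1]; // compiles: arrays are covariant
        try
        {
            shapes[0] = new Square(); // the runtime checks the element type here
            return true;
        }
        catch (ArrayTypeMismatchException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(StoreSquareInCircleArray()
            ? "stored"
            : "rejected at run time"); // prints "rejected at run time"
    }
}
```

This is exactly the hole that the "out" restriction closes for generic interfaces: the mistake is caught by the compiler instead of by an exception.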
Posted on 10/30/2008 2:04:10 PM by Justin Etheredge
With all of the new C# 4.0 stuff coming out, I feel like a kid in a candy store. Sorry for post overload, but I just can't help myself! I also think I need to stop putting numbers on these posts, because obviously I have just completely thrown them to the curb. I hate to put a "Part 3" on this post though, since it is just an extension of a previous one.
Jonathan Pryor pointed out on my last post that the default parameter feature in C# 4.0 was implemented in the same way that the default parameter feature in VB.net has been implemented. He also points out a seemingly obvious way that they could have made it better, but then he points out why it wouldn't work when combined with the named parameter feature.
So, since Jonathan is a freakin' smart guy, and most of us (including me) aren't that smart, I am going to elaborate on his comment and explain in detail what he is talking about.
So, to start off let's look at the implementation of default parameters in C# 4.0. It all starts with two attributes called OptionalAttribute and DefaultParameterValueAttribute. If you go look these attributes up, you will see that they have been around since .net 1.1 and .net 2.0 respectively. The reason for this is that other languages besides C# have supported these features going back to .net 1.1. In fact, you could add a DefaultParameterValueAttribute to one of your parameters in a method in C# and it would work perfectly fine. It is just that you cannot consume it from C#, since C# does not support this feature (until C# 4.0).
In my previous post I created a class that looked like this:
public class TestClass
{
    public void PerformOperation(string val1 = "val", int val2 = 10, double val3 = 12.2)
    {
        Console.WriteLine("{0},{1},{2}", val1, val2, val3);
    }
}
So, you see that this class has a method which has three default parameters. This means that we can call this method without passing any arguments to it and the default values would be "filled in" for us. So, how does the C# compiler implement this behavior?
Your first idea may be that C# just generates overloads, something that looks like this:
public void PerformOperation()
{
    PerformOperation("val", 10, 12.2);
}

public void PerformOperation(string val1)
{
    PerformOperation(val1, 10, 12.2);
}

public void PerformOperation(string val1, int val2)
{
    PerformOperation(val1, val2, 12.2);
}

public void PerformOperation(string val1, int val2, double val3)
{
    Console.WriteLine("{0},{1},{2}", val1, val2, val3);
}
Well, in reality, it looks like this (this is reflected code):
public void PerformOperation(
    [Optional, DefaultParameterValue("val")] string val1,
    [Optional, DefaultParameterValue(10)] int val2,
    [Optional, DefaultParameterValue(12.2)] double val3)
{
    Console.WriteLine("{0},{1},{2}", val1, val2, val3);
}
Hmmmmm. So, instead of just generating overloads for each method with the values filled in, it just applies some attributes to the parameters that declare them as optional and then specifies their default values. But, how does that work?
If you are familiar with attributes in .net then you will know that you have to use reflection to read out the properties of these attributes, and you have to have code running somewhere to process these attributes. All they are is meta-data assigned to the method, not code that executes at runtime. So, is C# doing reflection every time I call a method with default parameters? Fortunately the answer to that question is "no".
The answer to how this works may be a little bit surprising though. If we want to call the above method with no parameters:
var testClass = new TestClass();
testClass.PerformOperation();
What does this compile to? Interestingly it looks like this:
var testClass = new TestClass();
testClass.PerformOperation("val", 10, 12.2);
You'll notice that the default parameter values for this method have just been compiled right into the calling code. The C# compiler reads those attributes off the method at compile time and inserts the values directly into the call. So, what happens if I change the default values and don't recompile my entire system? Well, any calling code that wasn't recompiled will still have the old values. That is definitely something that you will have to look out for.
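You can see that the defaults really do live in the method's metadata by reading them back with reflection. This is only a sketch for poking at the metadata yourself (the "DefaultsInspector" class is a name I made up), not a description of how the compiler does its lookup internally.

```csharp
using System;
using System.Reflection;

public class TestClass
{
    public void PerformOperation(string val1 = "val", int val2 = 10, double val3 = 12.2)
    {
        Console.WriteLine("{0},{1},{2}", val1, val2, val3);
    }
}

public static class DefaultsInspector
{
    public static void Main()
    {
        MethodInfo method = typeof(TestClass).GetMethod("PerformOperation");

        // Each parameter carries its "optional" flag and default value as metadata.
        foreach (ParameterInfo p in method.GetParameters())
        {
            Console.WriteLine("{0}: optional={1}, default={2}",
                p.Name, p.IsOptional, p.DefaultValue);
        }
    }
}
```

Nothing in the method body refers to these values, which is why a caller compiled against an older version of the method keeps using the defaults it was compiled with.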
So, why did they choose to implement it this way? Well, as Jonathan pointed out, if they dynamically generated overloads one thing that wouldn't work is the new named parameters feature. Why wouldn't it work? Well, I'm glad you asked.
Let's say we had the overloaded methods that I put in above, and I wanted to call my method like this:
var testClass = new TestClass();
testClass.PerformOperation(val3: 15.1);
Hmmm. What overload would I call? I don't have an overload to call. Even though we generated overloads, we still can't leave out parameters and we would be stuck with inserting values into our IL again. Then we would have a mixed system where sometimes it would bake in values, and other times it wouldn't. No good.
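For comparison, here is a small sketch of how the same call behaves with the real C# 4.0 implementation. I have swapped "PerformOperation" for a hypothetical "Describe" method that returns the formatted string (using the invariant culture) so the two calls can be compared directly.

```csharp
using System;
using System.Globalization;

public class TestClass
{
    // Hypothetical variant of PerformOperation that returns its output.
    public string Describe(string val1 = "val", int val2 = 10, double val3 = 12.2)
    {
        return string.Format(CultureInfo.InvariantCulture, "{0},{1},{2}", val1, val2, val3);
    }
}

public static class NamedArgumentDemo
{
    public static void Main()
    {
        var testClass = new TestClass();

        // Only val3 is named; the compiler fills in val1 and val2 at the call site.
        string named = testClass.Describe(val3: 15.1);

        // The call above is equivalent to spelling out every argument.
        string spelledOut = testClass.Describe("val", 10, 15.1);

        Console.WriteLine(named);               // val,10,15.1
        Console.WriteLine(named == spelledOut); // True
    }
}
```

Because every call ends up as a full argument list on a single method, it doesn't matter which parameters you skip, and no overloads are needed at all.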
Now, you might say, what about just generating overloads for all parameters in all orders? Well, since we have three parameters of different types, that would work for our particular instance. It would not work for all instances though. What if we had three string parameters? You can't have three overloads of a method that each take three strings, method resolution would be impossible.
It appears for now that these two features just won't interact, and I'm sure that if there was a way in the current .net runtime to make it work without baking in the values, they would have. But for now we just have to accept the way it works and move on. Maybe in the future the runtime will have a way to tag parameters with default values that can stay with the method and then use those values when parameters aren't provided. Who knows. Hopefully you found this little adventure into the default parameter to be interesting, and hopefully you'll come back for part 3 which will be coming along shortly.