NHibernate and default constructors

One of the first things you learn about NHibernate is that in order for it to be able to construct your instances and take advantage of lazy loading, every persistent class must have a default, parameterless constructor. This leads to entities that look like this (the standard blog with posts and comments example):

using System;
using System.Collections.Generic;
using System.ComponentModel;

public class Post
{
    private readonly IList<Comment> comments = new List<Comment>();
    private Blog blog;

    [Obsolete("For NHibernate")]
    protected Post()
    {

    }

    public Post(string title, string body, Blog blog)
    {
        Blog = blog;
        Title = title;
        Body = body;
        Published = DateTimeOffset.Now;
    }

    public virtual int Id { get; private set; }
    public virtual DateTimeOffset Published { get; private set; }

    public virtual string Title { get; set; }
    public virtual string Body { get; set; }

    public virtual IEnumerable<Comment> Comments
    {
        get { return comments; }
    }

    public virtual Blog Blog
    {
        get { return blog; }
        set
        {
            if (blog != null)
            {
                throw new InvalidOperationException("already set");
            }
            blog = value;
            if (blog != null)
            {
                blog.AddPost(this);
            }
        }
    }

    [EditorBrowsable(EditorBrowsableState.Never)]
    protected internal virtual void AddComment(Comment comment)
    {
        comments.Add(comment);
    }
}

Notice the first constructor. It doesn’t do anything. As the obsolete message points out (it is there to produce a compilation warning in case some developer accidentally calls this constructor in their code), this constructor exists only so that NHibernate can do its magic. Some people put initialization logic for collections there (especially when using automatic properties), but I use fields and initialize them inline. In that case the constructor is pretty much useless. I started wondering why it is even there, and whether it really belongs to the class at all. But let’s start at the beginning.

What is a constructor

As basic as the question may seem, it is useful to remind ourselves why we need constructors at all. The best book about C# and .NET defines them as follows:

Constructors are special methods that allow an instance of a type to be initialized to a good state.

Notice two important things about this definition. First, it doesn’t say that a constructor creates the instance, or that constructors are the only way to create an instance. Second, constructors initialize a newly created object to its initial state, so that anything that uses the object afterwards deals with a fully constructed, valid object.
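To make the first point concrete, here is a minimal sketch showing that the runtime can hand out an instance without running any constructor. The Greeting class is a hypothetical stand-in invented for this illustration; FormatterServices.GetUninitializedObject is the real .NET call (newer runtimes mark it obsolete and offer RuntimeHelpers.GetUninitializedObject instead).

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical class, used only for this illustration.
public class Greeting
{
    // Inline field initializers are compiled into the constructor,
    // so they are skipped as well when no constructor runs.
    private readonly string prefix = "Hello";

    public Greeting()
    {
        ConstructorRan = true;
    }

    public bool ConstructorRan { get; private set; }

    public string Prefix
    {
        get { return prefix; }
    }
}

public static class Program
{
    public static Greeting CreateWithoutConstructor()
    {
        // Allocates a zeroed instance of the type; no constructor is invoked.
        return (Greeting)FormatterServices.GetUninitializedObject(typeof(Greeting));
    }

    public static void Main()
    {
        var constructed = new Greeting();
        var raw = CreateWithoutConstructor();

        Console.WriteLine(constructed.ConstructorRan); // True
        Console.WriteLine(raw.ConstructorRan);         // False
        Console.WriteLine(raw.Prefix == null);         // True
    }
}
```

The instance exists in both cases; only the one that went through the constructor is in a good state. Note that the inline field initializer did not run for the second instance either.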

Constructors and persistence

The above definition applies very well to the other constructor on the Post class. That constructor initializes the Post to a valid state. In this case valid means the following:

  • A Post is part of a blog: we can’t have a post that lives on its own. Our posts need to be part of a blog, and we make this requirement explicit by requiring a Blog instance to be provided when constructing a Post.
  • A Post requires a title and a body, which is why we also require those two properties to be provided when constructing a post.
  • Posts are usually displayed in reverse chronological order, hence we set the Published timestamp.

We do none of the above in the other, “NHibernate” constructor. That means that, according to the definition of a constructor, it is not really doing what a constructor is supposed to do. It is never used to construct a new object.
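The difference between the two constructors can be sketched with a stripped-down class. SimplePost is a hypothetical simplification of the Post class above (no Blog, no comments), and calling the protected constructor through Activator.CreateInstance is just one reflection-based way to reach it; it is not meant to describe exactly what NHibernate does internally.

```csharp
using System;

// Hypothetical, simplified version of Post, for illustration only.
public class SimplePost
{
    // The "NHibernate" constructor: establishes none of the invariants.
    protected SimplePost()
    {
    }

    public SimplePost(string title, string body)
    {
        Title = title;
        Body = body;
        Published = DateTimeOffset.Now;
    }

    public string Title { get; private set; }
    public string Body { get; private set; }
    public DateTimeOffset Published { get; private set; }
}

public static class Program
{
    public static SimplePost CreateViaEmptyConstructor()
    {
        // nonPublic: true lets reflection call the protected parameterless constructor.
        return (SimplePost)Activator.CreateInstance(typeof(SimplePost), true);
    }

    public static void Main()
    {
        var valid = new SimplePost("Title", "Body");
        var hollow = CreateViaEmptyConstructor();

        Console.WriteLine(valid.Title);                                   // Title
        Console.WriteLine(hollow.Title == null);                          // True
        Console.WriteLine(hollow.Published == default(DateTimeOffset));   // True
    }
}
```

The object produced through the empty constructor has no title, no body and a default timestamp until something else fills them in.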

Hydration

Let’s take a step back now. What NHibernate does with objects is, in a nutshell, serialization. You create an object in your code and initialize it using a constructor, do some stuff with it, and then save the object away so that it can be retrieved later, after your app has been closed, or perhaps on another server instance. You save the state of the object so that its state representation can outlive the volatile, in-memory representation. If you follow this train of thought, the next obvious conclusion is that if you have a load-balanced system and two server instances work with Post#123, they are both dealing with the same object, even though they are two separate machines.

The conclusion is that when NHibernate retrieves an object from the persistent store, it is not constructing it. It is recreating an in-memory representation of an object that was created and persisted previously. We are merely taking an object that already has a well-known state and has already been initialized, and providing a different representation for it. This process is called hydration.

Persistent and Volatile state

The full picture is a bit more complicated than what I have painted so far. The database and the in-memory object are two representations of the same entity, but they don’t have to map fully one to one. Specifically, it is possible for the in-memory representation to have state beyond the persistent state. In other words, the in-memory object may have some properties that are specific to it and not relevant to the in-database representation. A convenient example that most people will be able to relate to is a logger. Please don’t quote me as advocating logging in your entities, but a logger is one of the things you may want to have on your in-memory object and use while executing code in your application, then let go once you no longer need the object, without persisting it. If we had one in the Post class, the empty constructor would change to the following:

[Obsolete("For NHibernate")]
protected Post()
{
    logger = LoggerProvider.LoggerFor(typeof(Post));
}

If we don’t use a constructor to recreate the object, how can we get the logger in? How do we make NHibernate honor the constructor semantics and give us a fully initialized object? Remember I said that one way of looking at NHibernate from the object’s perspective is that it’s just a fancy serializer/deserializer. It turns out the serialization mechanism in .NET offers us four (that I know of, possibly more) ways of tackling this issue:

  • You can use a serialization surrogate that knows how to recreate the full state of the object.
  • You can use the deserialization callback interface to be notified when the object has been fully deserialized, and then initialize the volatile state.
  • You can use serialization callback attributes to be notified when various steps in the serialization/deserialization process happen, and initialize the volatile state.
  • You can use the ISerializable interface and implement a serialization constructor, which is used to deserialize the object.
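As an illustration of the callback-attribute approach, here is a minimal sketch. It assumes DataContractSerializer as the serializer, which recreates [DataContract] types without calling a constructor. The Note class, its RestoreVolatileState method and the Action<string> used as a logger are hypothetical stand-ins invented for this example, not part of the Post class or of NHibernate.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical entity, for illustration only.
[DataContract]
public class Note
{
    // Volatile state: not marked [DataMember], so it is never written out.
    private Action<string> logger;

    public Note(string title)
    {
        Title = title;
        InitializeVolatileState();
    }

    // Persistent state.
    [DataMember]
    public string Title { get; private set; }

    public bool HasLogger
    {
        get { return logger != null; }
    }

    // Invoked by the serializer once the persistent state has been restored.
    [OnDeserialized]
    private void RestoreVolatileState(StreamingContext context)
    {
        InitializeVolatileState();
    }

    private void InitializeVolatileState()
    {
        logger = message => Console.WriteLine("[Note] " + message);
    }
}

public static class Program
{
    // Serializes the note to memory and reads it back as a new instance.
    public static Note RoundTrip(Note original)
    {
        var serializer = new DataContractSerializer(typeof(Note));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, original);
            stream.Position = 0;
            return (Note)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var copy = RoundTrip(new Note("Hello"));

        Console.WriteLine(copy.Title);     // Hello
        Console.WriteLine(copy.HasLogger); // True
    }
}
```

Note has no parameterless constructor at all, yet the deserialized copy comes back with both its persistent state (Title) and its volatile state (the logger) in place.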

Notice that only one of those approaches uses a special constructor. Since, as we discussed, NHibernate doesn’t really need the default constructor (in theory, that is), can we really get rid of it? It turns out we can (in most cases), and we’ll look at how to do it in the next post.