Unexceptional Exceptions

Dropping our bad habits and preconceptions toward a better codebase

Ciprian

Software Engineer Team Lead at Softvision
Ciprian is a passionate software developer whose main concerns in a codebase are readability and maintainability, no matter the programming language in which it is written. As such, he has devoted his time to studying the universal language of design patterns that have proved themselves or continually emerge in the programming world, be they object-oriented or otherwise. With this knowledge of universal abstractions, his aim is to reconcile the existence and collaboration of functional, procedural and object-oriented programming inside the same codebases.

Instead of explaining some well-known and well-documented concepts that are required in this article, I will provide useful links to such resources or otherwise interesting reads.

Although this article uses C# and F# to illustrate some ideas, the principles I am trying to advocate apply to any object-oriented language.

I have recently been introduced to the functional programming school of thought (which is to say that I am nowhere close to being a master; a neophyte at best) and have since tried to adapt the way I write my C# code to be as transparent as possible, so as to resemble the functional paradigm. In other words, I try to write code in such a way that everything you need to know in order to interact with an interface can be found in its methods’ headers.

Why do this, you ask? If we were to go on a journey where we randomly dive into C# codebases and see how they treat data validation, an overwhelmingly large number of them would use the throw-try-catch trio to do so, with varying degrees of elegance and success.

Take this interface for example:

public interface ICustomerRepository
{
    Customer GetCustomer(string username);
    void CreateCustomer(string username, string email);
}

What does it tell you about what it does? The GetCustomer method will always return a Customer, given any string instance. The CreateCustomer method will always create a new customer, regardless of the input values and the state of the persistence. Sounds dubious? Well, this is the only explicit knowledge you truly have about it. Everything else you know is actually implementation detail or documentation that may or may not be up to date. The fact that it returns null or throws an exception if the customer does not exist? Or that it throws an exception if the email is invalid? Implementation details. In other words, you have a leaky abstraction to deal with: in order to use it properly, you need to know how its implementations work.

If we were to look at the most common exceptions, null reference exceptions would dominate the scene. How does this affect us? We end up with countless exceptions, be they thrown exceptions or nulls, that are not actually exceptional: they are expected. The documentation, the flow diagrams, or plain common sense clearly identify them as possible scenarios.
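To make the leak concrete, here is a minimal sketch of what might hide behind that interface. InMemoryCustomerRepository, the Customer class and the specific exception types are my own assumptions for illustration, not code from a real project. Notice that none of the three failure outcomes is visible in the method headers.

```csharp
using System;
using System.Collections.Generic;

public sealed class Customer
{
    public Customer(string username, string email)
    {
        Username = username;
        Email = email;
    }

    public string Username { get; }
    public string Email { get; }
}

public interface ICustomerRepository
{
    Customer GetCustomer(string username);
    void CreateCustomer(string username, string email);
}

// Hypothetical implementation, for illustration only.
public sealed class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly Dictionary<string, Customer> customers =
        new Dictionary<string, Customer>();

    public Customer GetCustomer(string username)
    {
        // Hidden outcome: "not found" silently comes back as null.
        return customers.TryGetValue(username, out var customer) ? customer : null;
    }

    public void CreateCustomer(string username, string email)
    {
        // Hidden outcomes: three expected business failures leave via throw.
        if (string.IsNullOrWhiteSpace(username))
            throw new ArgumentException("Username is required.", nameof(username));
        if (email == null || !email.Contains("@"))
            throw new ArgumentException("Email is invalid.", nameof(email));
        if (customers.ContainsKey(username))
            throw new InvalidOperationException("Username is already in use.");

        customers.Add(username, new Customer(username, email));
    }
}
```

A caller who only sees ICustomerRepository has no way of knowing that a null check and at least two catch blocks are needed.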

Suppose you are creating a new customer. If the username is null or empty, is it such an exceptional case that it warrants throwing an ArgumentNullException or a ValidationException? Some might argue that such a situation is an exception to the happy flow, so it can warrant throwing an exception. I will not argue against that, as a very professional team can use this approach with success. However, can we all take pride in our teams being the best and most professional around? What I believe (and I will add reasons for this further into the article) is that we should use exceptions for what really are exceptional situations: situations that have not been defined in our domain’s state machine, situations nobody thought of and that were therefore left without proper handling.

With this idea in mind, did it really never occur to anyone that the user might provide invalid data when creating a new customer? Is it something so out of the ordinary for a user to type invalid input into a text box? Did this scenario never end up in a flow diagram?

Unfortunately, this practice of returning from a method call by throwing an exception is so common that even Martin Fowler, the author of Refactoring, had to address it in his writings. To quote him, “if a failure is expected behavior, then you shouldn’t be using exceptions.”

If his reasoning in the linked article did not convince you, there is also the fact that an eight-year-old can have a better understanding of this issue. Remember, we want to write code that other people can understand at a glance. If a person who does not think like a programmer says that our logical flow is bad and counter-intuitive, we should take a step back and really ask ourselves whether this is the case. It might very well be.

But this fallacy of appealing to Martin’s authority is not the main reason for which I disagree with this method. Mentioning his work here is merely aimed at making you think: if a thoroughly qualified person such as him makes such claims, together with some solid arguments, it may be worth thinking about whether we are doing something we are not supposed to. Instead, my main thoughts against this approach and an alternative to it are explained in the rest of this article.

Consequences of this approach

Oftentimes, when you have to work with a new class in your codebase, things look pretty straightforward judging from its interface, so you plug it in and run the unit tests. But then, oh horror, business exceptions upon business exceptions. The fun bit is that, in order to find out all the business flow errors the class can throw, you have to do one of three things: delve deep into its source code and manually identify every single thing that can go wrong; simply use it and discover the exceptions it throws through testing; or check the documentation (we all know how useful Confluence/Word/XML documentation is after a while: so useful that it is the last thing many programmers check, only after everything else has failed).

This is both awkward and unreliable. Why put so much work into discovering what expected flow exceptions a method contains, again and again, each time some new developer trips into that class, since someone already did that when originally implementing the class? Why not have it documented somehow and someplace the compiler can actually enforce and keep fresh? Remember, self-documenting code is the best documentation.

Based on my experience, the designs built around this throw-everything approach have mostly been less than ideal from the developer’s point of view (after all, the basic premise is fragile, due to the leaky abstraction issue above). Yes, they all do the job that the end user expects them to. But when it comes to helping developers do their job, the picture is mixed: such a design may very well help them, provided the codebase is well designed and the developers disciplined, but it may also tempt them into turning the codebase into a bowl of spaghetti.

Throw is similar in nature to goto

More precisely, this approach of breaking execution through exceptions seems oddly reminiscent of the old days, when control flow was done with the goto statement. Because there were few or no guidelines on how to structure code, it easily became an unintelligible mess of jumps from one label to another. Our predecessors, starting with Edsger W. Dijkstra’s “Go To Statement Considered Harmful,” correctly identified this way of working as prone to needless human error and introduced the notion of structured programming.

The reason goto was considered bad is that, in addition to some well-known control structures such as ifs and loops (which did not necessarily have specific syntax but were composed of gotos and labels), developers were able to create goto-label pairs that followed no pattern, law or consideration, leading to code that was impossible to maintain. So what structured programming sought to do was create a linear flow of logic inside all blocks of code: input comes through <here>, output comes out through <there>, the flow goes from <here> to <there>. No more reasoning about where the logic flow might end up and what state it may alter in the process, in other words less spaghetti code. The programmer now has a guarantee that once the execution ends up <here>, it will, sometime in the future, also reach <there>.

Are we not repeating the same mistakes they made, only this time through exceptions rather than gotos? By happily throwing exceptions at every opportunity we get, we are discarding the guarantee our predecessors struggled to achieve. I believe that, although on a lesser scale, the indiscriminate throwing of exceptions can be a cause of artificial complexity in our code, similar to the goto problem of old. Would it not be better if we narrowed the range of scenarios in which we use exceptions to cases that are, as their name says, exceptional? That way, we would limit the complexity of methods to a degree that our feeble human minds can more easily manage. No exceptions thrown by a method, one less thing to worry about. It is that simple.

A couple of examples of why poorly self-documenting interfaces are painful

Let us go back to the interface presented in the introduction:

public interface ICustomerRepository
{
   Customer GetCustomer(string username);
   void CreateCustomer(string username, string email);
}

Regarding the first method, the natural assumption is that it will always return a customer, since that is what the signature says. It takes an extra thought to realize that this may not always be the case. Sometimes the assumption holds because the implementation returns a Null Object, or the authors change the return type to use C#’s nullable reference types (introduced in C# 8) for queries that have no corresponding result. For simple scenarios these two approaches work, but when there can be several reasons for the lack of a proper result, it is useful to know precisely why it happened.

Think of the following situation: you receive a URL to a product on an online store (someShop.com/product?id=3463) from a friend in another country, encouraging you to buy it due to its incredible price. In reality, that product is available only in the country your friend resides in, so when you open the page, somewhere in the application layer of the backend, a null is returned from a GetProduct method. What does it mean? Does that Id not exist in the database? Is the product not available for the country of the request? Lacking this information, the UI has to show some obscure error which tells you nothing and leaves you dissatisfied with that shop. The problem with this simplistic approach to return values is that it restricts you to two general return states: success, in which you get your desired value, or failure, in which you gain no information on why the object of your query was not found.

However, in most cases I have seen so far, what people do is return a null reference, thus causing unsuspecting developers to trip over the much-dreaded null reference exception. Although this example may be a bit naive, I myself have fallen into this trap of not null-checking a result coming from a method of this kind. You can surely imagine a more complex scenario where this problem can occur.

As for the second method, it causes the same problem as stated in the introduction. How does one know what kind of behavior the implementation of this interface exhibits when it encounters an invalid username or email?

As the saying goes, “A user interface is like a joke. If you have to explain it, it’s not that good.” But I would say that if you remove the “user,” it remains equally valid in the programming world. Plus, having to do all that digging in order to find out how a class works is definitely a bad joke and an insult to good abstractions. After all, a good abstraction is defined by how easy it is to use without knowing anything about its implementation.

How can this interface be refactored into something that precisely conveys how a method behaves?

Let the user of the interface know what he should be expecting through a method’s return type and have the compiler help (or force) him into providing a resolution for all expected negative flows.

  1. Modify each and every method to adhere to CQS (command-query separation). Having a method clearly state whether it changes the state of the system or merely queries it is a grand first step into the beautiful world of readable and maintainable code.
  2. Methods should have a single well-specified exit point with a clear, all-encompassing return type. In imperative code where business flow errors are thrown around, a method has two exit points: the return statement and the throw statement.
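The two points above can be sketched on a toy example. The Inbox class below is hypothetical and exists only for illustration. TakeNext both answers a question and changes state, and it has a second, invisible exit because Queue<T>.Dequeue throws InvalidOperationException on an empty queue. The query/command pair underneath it separates the two concerns and puts the “nothing there” outcome into the signature.

```csharp
using System.Collections.Generic;

// Hypothetical class, for illustration only.
public sealed class Inbox
{
    private readonly Queue<string> messages = new Queue<string>();

    // Command: changes state, returns nothing.
    public void Add(string message) => messages.Enqueue(message);

    // Before: a query and a command rolled into one, with a hidden second
    // exit point (Dequeue throws InvalidOperationException when empty).
    public string TakeNext() => messages.Dequeue();

    // After (query): no state change, and the empty case is visible
    // in the method header instead of leaving through a throw.
    public bool TryPeekNext(out string message)
    {
        if (messages.Count == 0)
        {
            message = null;
            return false;
        }

        message = messages.Peek();
        return true;
    }

    // After (command): changes state only; a no-op on an empty inbox.
    public void DiscardNext()
    {
        if (messages.Count > 0)
        {
            messages.Dequeue();
        }
    }
}
```

The Try pattern only distinguishes success from failure, so it is a small step rather than the full answer.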

Now, for the big question: how can these two exit points be reconciled into one datatype?

In F#, the functional cousin of C#, there are the notions of discriminated unions and pattern matching. In layman’s terms, discriminated unions are a language feature that allows programmers to easily create composite data types and then have the compiler help them correctly implement flows for each component type contained in that container type through pattern matching. So, if we have a function that is expected to possibly return a business exception, we can create a type that contains both the possible success and failure result.

Back to our customer creation scenario, there are four possible things that can go wrong:

  1. Username is already in use
  2. Username is null or whitespace
  3. Email address is empty
  4. Email address is invalid

Now suppose we want to create a class that will only be able to hold these values and be able to respond to the UI accordingly. I assure you, the F# version is readable and simple enough to understand even if you have had no previous experience with it.

F# flow:

type CustomerCreationFlowException =
   | UsernameAlreadyInUse
   | NameNullOrEmpty
   | EmailAddressNullOrEmpty
   | InvalidEmail

type Result =
   | Ok
   | CustomerCreationFlowException of CustomerCreationFlowException

[<EntryPoint>]
let main argv =
   // Match on a hard-coded Ok result; every case of both unions is handled.
   match Ok with
   | Ok -> printfn "Ok"
   | CustomerCreationFlowException e ->
       match e with
       | UsernameAlreadyInUse -> printfn "Username already in use"
       | NameNullOrEmpty -> printfn "Name null or empty"
       | EmailAddressNullOrEmpty -> printfn "Email address null or empty"
       | InvalidEmail -> printfn "Invalid email"
   printfn "Hello World from F#!"
   0 // return an integer exit code

You can think of the CustomerCreationFlowException and Result types as a sort of nested C# enums that can do much more. The other element that might cause confusion is the match expression. It is nothing more than a case of pattern matching, a feature that C# 7 also possesses.

As you can see, this code covers all possible scenarios in our flow and is pretty readable at that. The extremely nice thing about this is that if we add a new expected exception, the compiler will tell us that this code does not cover it.

For example, the database the application is using may be going through some rough times and suffer severe downtimes. In such situations, the business wants to inform the users that everything is under control and they should retry doing their tasks later. Below is the updated version of the “enum” type.

type CustomerCreationFlowException=
   | UsernameAlreadyInUse
   | NameNullOrEmpty
   | EmailAddressNullOrEmpty
   | InvalidEmail
   | DatabaseDown

We have now added this new expected exception scenario and all seems well. But what if we forget to update all the places where this DU is used? There may be multiple places that do a switch based on this type. Kindly enough, the F# compiler will warn us of all such occurrences and if we set the option to treat warnings as errors, we will get a failing build to stop us altogether from leaving a path untreated.
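For comparison, here is roughly what the same flow could look like with C# 7 pattern matching. The class names are my own stand-ins for the F# union cases and are not taken from the solution on GitHub. Pay attention to the default branch: a C# switch statement over a class hierarchy compiles without complaint when a case is missing, so the safety net has to be written by hand and only fires at runtime.

```csharp
using System;

// Hypothetical stand-ins for the cases of the F# discriminated union.
public abstract class CreationResult { }
public sealed class Created : CreationResult { }
public sealed class UsernameAlreadyInUse : CreationResult { }
public sealed class NameNullOrEmpty : CreationResult { }
public sealed class EmailAddressNullOrEmpty : CreationResult { }
public sealed class InvalidEmail : CreationResult { }

public static class CreationMessages
{
    public static string Describe(CreationResult result)
    {
        switch (result)
        {
            case Created _:
                return "Ok";
            case UsernameAlreadyInUse _:
                return "Username already in use";
            case NameNullOrEmpty _:
                return "Name null or empty";
            case EmailAddressNullOrEmpty _:
                return "Email address null or empty";
            case InvalidEmail _:
                return "Invalid email";
            default:
                // Reached for null or for any subclass added later without
                // a matching case; the compiler does not warn about either.
                throw new ArgumentOutOfRangeException(nameof(result));
        }
    }
}
```

Adding a DatabaseDown subclass here would build cleanly and fail only when the default branch is hit.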

Back to C# – The Result classes

As we have seen, F# has this very useful feature of letting you know when the range of possible known exceptions has been extended and helps you modify your code accordingly.

In C# there is no built-in way of doing this, so we will have to do it by hand. And not a paragraph too soon. This article has been loaded with coding philosophy, so here is some code to refresh your eyes.

namespace SomeNamespace
{
   using System;
   using System.Collections.Generic;

   /// <summary>
   /// Defines what a reason for a business flow exception might look like.
   /// </summary>
   public interface IReason
   {
       string MessageText { get; }
   }

   /// <summary>
   /// Interface that defines the result of a command that may have ended successfully.
   /// </summary>
   /// <typeparam name="TCaseOneFailure">The type of failure that can stop it from ending successfully.</typeparam>
   public interface ICommandResult<out TCaseOneFailure>
       where TCaseOneFailure : IReason
   {
       bool IsSuccessful { get; }

       IReadOnlyList<IReason> ReasonsForFailure { get; }

       /// <summary>
       /// Executes the given actions based on whether the requested action was successful or has failed in one of
       /// the specified cases.
       /// </summary>
       /// <param name="actionForSuccess">The action to be called in case of success.</param>
       /// <param name="actionForCaseOneFailure">The action to be called if the command encountered a case one failure.</param>
       TResult Evaluate<TResult>(
           Func<TResult> actionForSuccess,
           Func<TCaseOneFailure, TResult> actionForCaseOneFailure);
   }

   /// <summary>
   /// Interface that defines the result of a query that may have ended successfully, thus containing a TValue, or
   /// with some business exception, thus containing a TCaseOneFailure.
   /// </summary>
   /// <typeparam name="TValue">The type of result the operation should return.</typeparam>
   /// <typeparam name="TCaseOneFailure">The type of failure that can stop it from ending successfully.</typeparam>
   public interface IQueryResult<out TValue, out TCaseOneFailure>
       where TCaseOneFailure : IReason
   {
       bool IsSuccessful { get; }

       IReadOnlyList<IReason> ReasonsForFailure { get; }

       /// <summary>
       /// Extracts a result from this IResult implementation based on the given transformation functions provided
       /// for success and failure, respectively.
       /// </summary>
       /// <typeparam name="TResult">The type of result the operation should return.</typeparam>
       /// <param name="successFunction">The function to be called in case the operation was successful.</param>
       /// <param name="caseOneFailureFunction">The function to be called if a case one failure was encountered.</param>
       /// <returns>The "unpacked/transformed" result.</returns>
       TResult Evaluate<TResult>(
           Func<TValue, TResult> successFunction,
           Func<TCaseOneFailure, TResult> caseOneFailureFunction);
   }

   /// <summary>
   /// Extension of the above interface.
   /// </summary>
   public interface IQueryResult<out TValue, out TCaseOneFailure, out TCaseTwoFailure>
       where TCaseOneFailure : IReason
       where TCaseTwoFailure : IReason
   {
       bool IsSuccessful { get; }

       IReadOnlyList<IReason> ReasonsForFailure { get; }

       TResult Evaluate<TResult>(
           Func<TValue, TResult> successFunction,
           Func<TCaseOneFailure, TResult> caseOneFailureFunction,
           Func<TCaseTwoFailure, TResult> caseTwoFailureFunction);

       /// <summary>
       /// In some scenarios, it might be totally unexpected or impossible for an evaluation to fail, in which case
       /// this method comes in handy. The difference between this and a simple access to resultInstance.Value is
       /// that this method clearly states that it will fail if the performed operation did not end successfully.
       /// </summary>
       TResult EvaluateOrThrow<TResult>(Func<TValue, TResult> successFunction);
   }

   // And so on…
}


The ways in which these interfaces can be used are twofold:

  1. They can be used on a case-by-case basis, the same way it is done in F#. This is done using the Evaluate method that will call the success function or the first function that corresponds to a failing business condition.
  2. They can be used as a whole, by going through the ReasonsForFailure read-only property that allows the developer to act upon the entire list of everything that went wrong. This can, for example, be used in a user registration context, where multiple fields need to be validated and you do not want to do a request-response for each and every field. You want to tell the user “Field X is invalid due to…, field Y is invalid due to… and so on.”
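To give an idea of what sits behind such an interface, below is a minimal sketch of an implementation of the one-failure-case query result, together with both styles of use. It repeats the two interfaces so that it stands on its own. QueryResult, its Success/Failure factories, the class constraint on the failure type and the message text are my assumptions for this sketch; the actual implementation on GitHub may look different.

```csharp
using System;
using System.Collections.Generic;

public interface IReason
{
    string MessageText { get; }
}

public interface IQueryResult<out TValue, out TCaseOneFailure>
    where TCaseOneFailure : IReason
{
    bool IsSuccessful { get; }
    IReadOnlyList<IReason> ReasonsForFailure { get; }
    TResult Evaluate<TResult>(
        Func<TValue, TResult> successFunction,
        Func<TCaseOneFailure, TResult> caseOneFailureFunction);
}

// Assumed implementation: holds either a value or a failure, never both.
public sealed class QueryResult<TValue, TCaseOneFailure> : IQueryResult<TValue, TCaseOneFailure>
    where TCaseOneFailure : class, IReason
{
    private readonly TValue value;
    private readonly TCaseOneFailure failure;

    private QueryResult(TValue value, TCaseOneFailure failure)
    {
        this.value = value;
        this.failure = failure;
    }

    public static QueryResult<TValue, TCaseOneFailure> Success(TValue value) =>
        new QueryResult<TValue, TCaseOneFailure>(value, null);

    public static QueryResult<TValue, TCaseOneFailure> Failure(TCaseOneFailure failure) =>
        new QueryResult<TValue, TCaseOneFailure>(
            default(TValue),
            failure ?? throw new ArgumentNullException(nameof(failure)));

    public bool IsSuccessful => this.failure == null;

    public IReadOnlyList<IReason> ReasonsForFailure =>
        this.IsSuccessful ? new IReason[0] : new IReason[] { this.failure };

    // Case-by-case use: exactly one of the two functions is called.
    public TResult Evaluate<TResult>(
        Func<TValue, TResult> successFunction,
        Func<TCaseOneFailure, TResult> caseOneFailureFunction) =>
        this.IsSuccessful ? successFunction(this.value) : caseOneFailureFunction(this.failure);
}

public sealed class CustomerDoesNotExist : IReason
{
    public string MessageText => "The customer does not exist.";
}

public static class QueryResultDemo
{
    // Usage 1: the caller must supply a function for every declared outcome.
    public static string Describe(IQueryResult<string, CustomerDoesNotExist> result) =>
        result.Evaluate(
            username => "Found " + username,
            missing => missing.MessageText);

    // Usage 2: treat everything that went wrong as one list.
    public static int CountFailures(IQueryResult<string, CustomerDoesNotExist> result) =>
        result.ReasonsForFailure.Count;
}
```

For brevity the sketch uses a plain string in place of a Customer; the shape stays the same for any TValue.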

These interfaces allow any developer to clearly see any expected, but invalid behavior up front and also use the compiler to force them into treating all these cases (for the sake of brevity of this article, I will only include the interfaces for up to two cases of business exceptions, but you can take a look at the entire solution on GitHub).

As a screen of code is worth a thousand words, here is a sample of how these interfaces are used, both on a case-by-case basis and as a whole:

public interface ICustomerService
{
   IQueryResult<
       Customer,
       CustomerDoesNotExist> GetCustomer(string username);

   ICommandResult<
       UserNameIsInUse,
       UsernameIsNullOrWhitespace,
       EmailAddressIsEmpty,
       EmailAddressIsInvalid> CreateCustomer(string username, string email);
}

A typical N-Tier, data-centric architecture service, with a little touch of transparency.

using System;
using System.Linq;
using System.Web.Http;

public class SomeController : ApiController
{
   private readonly ICustomerService customerRepo;

   public SomeController(ICustomerService customerRepo)
   {
       this.customerRepo = customerRepo;
   }

   // Case-by-case usage: one function per declared outcome.
   public IHttpActionResult GetCustomer(string username)
   {
       var result = this.customerRepo.GetCustomer(username).Evaluate<IHttpActionResult>(
           this.Ok,
           customerDoesNotExist => this.NotFound());

       return result;
   }

   // Usage as a whole: aggregate every reason for failure into one response.
   public IHttpActionResult GetCustomerMvcStyle(string username)
   {
       var result = this.customerRepo.GetCustomer(username);

       if (!result.IsSuccessful)
       {
           var messages = result.ReasonsForFailure.Select(r => r.MessageText);
           var aggregatedMessage = string.Join(Environment.NewLine, messages);
           return this.BadRequest(aggregatedMessage);
       }
       else
       {
           return this.Ok();
       }
   }

   public IHttpActionResult CreateCustomer(string username, string email)
   {
       var result = this.customerRepo.CreateCustomer(username, email);

       var response = result.Evaluate<IHttpActionResult>(
           this.Ok,
           usernameIsInUse => this.Conflict(),
           usernameIsEmpty => this.BadRequest(usernameIsEmpty.MessageText),
           emailIsEmpty => this.BadRequest(emailIsEmpty.MessageText),
           emailIsInvalid => this.BadRequest(emailIsInvalid.MessageText));

       return response;
   }
}

(the application layer)

Back to our simulated scenario where the database is frequently down: the client requests that our C# code also handle the database being down as an expected scenario that should be treated explicitly. How can this be done? Simply modify the ICustomerService interface by adding PersistenceDown (an implementation of IReason) to the type parameter list of both methods. Once this is done, ALL code blocks that invoke these two methods will no longer compile and will force the developer to handle this newly introduced scenario.

public interface ICustomerService
{
   IQueryResult<
       Customer,
       CustomerDoesNotExist,
       PersistenceDown> GetCustomer(string username);

   ICommandResult<
       UserNameIsInUse,
       UsernameIsNullOrWhitespace,
       EmailAddressIsEmpty,
       EmailAddressIsInvalid,
       PersistenceDown> CreateCustomer(string username, string email);

}

No more forgetting to handle a situation in all 10 places in the codebase where it occurs.

Having all the expected behavior clearly specified in a method’s header becomes an extremely useful tool for figuring out whether your interfaces obey the SRP. If a method header becomes very long, there is a strong chance your method is doing too much and it is time to break it down into multiple methods.

The downsides and their solutions

Due to the verbose nature of C#, working with generics that have many type parameters can be cumbersome. For example, in the ICustomerService implementation, at one point we will have to do some sort of instantiation of the result:

var result = new CommandResult<
               UserNameIsInUse,
               UsernameIsNullOrWhitespace,
               EmailAddressIsEmpty,
               EmailAddressIsInvalid,
               PersistenceDown>(some parameters)

If we were to also add a bunch of parameters and extra type parameters, you can easily see how this style would get out of hand.

Fortunately, although they were not specifically designed for this purpose, aliases are a very good fit for this scenario:

using CreateCustomerShortHand = Results.Results.CommandResult<
   Domain.Repositories.UserNameIsInUse,
   Domain.Repositories.UsernameIsNullOrWhitespace,
   Domain.Repositories.EmailAddressIsEmpty,
   Domain.Repositories.EmailAddressIsInvalid,
   Domain.Repositories.PersistenceDown>;

Using this construct, you can now instantiate in your code a new instance of the result by simply doing new CreateCustomerShortHand(some parameters).

As for the next issue, since a request can fail for any number of reasons between 1 and n, a Command/QueryResult constructor must be able to take a number of parameters that is only known at runtime. Luckily, in C# we have optional constructor parameters, which allow us to have a constructor that permits developers to provide an arbitrary number of reasons for failure. But using them poses one major problem: we cannot force the developer to supply at least one reason for which the operation is failing. The solution would be to create a builder that only allows the instantiation of a failure result when at least one reason is provided. I will not add the implementation here, as it is not very relevant to my point, but it can be found in the GitHub repository.

As a side note regarding aliases, in a real-world scenario I would use these two approaches for building queries and commands. Besides the numerous benefits of adhering to SRP and having those two very useful abstractions for doing AOP (IQueryHandler and ICommandHandler), this approach also gives you the opportunity to use meaningful aliases without having to resort to excessively long compound names. For instance, CreateCustomerResultBuilderShortHand could easily be renamed to ResultBuilderShortHand or simply ResultBuilder, and CreateCustomerShortHand to ResultShortHand, if these aliases reside in a CreateCustomerCommandHandler. The same goes for queries.
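Just to show that the “at least one reason” guarantee can be pushed onto the compiler, here is one possible shape for such a factory. This is not the builder from the GitHub repository. FailureReasons, its From method and FieldIsInvalid are names I made up for this sketch, and it assumes the reasons are collected as a plain list of IReason.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IReason
{
    string MessageText { get; }
}

// Hypothetical reason type, for illustration only.
public sealed class FieldIsInvalid : IReason
{
    private readonly string fieldName;

    public FieldIsInvalid(string fieldName)
    {
        this.fieldName = fieldName;
    }

    public string MessageText => "Field " + this.fieldName + " is invalid.";
}

// Hypothetical container that cannot be created without a reason.
public sealed class FailureReasons
{
    private FailureReasons(IReadOnlyList<IReason> reasons)
    {
        Reasons = reasons;
    }

    public IReadOnlyList<IReason> Reasons { get; }

    // The first reason is a mandatory parameter, so a call without any
    // argument does not compile; further reasons are optional.
    public static FailureReasons From(IReason first, params IReason[] others)
    {
        if (first == null)
        {
            // A null reason is a programming error, not an expected flow.
            throw new ArgumentNullException(nameof(first));
        }

        var all = new[] { first }.Concat(others ?? new IReason[0]).ToList();
        return new FailureReasons(all);
    }
}
```

With this shape, FailureReasons.From() is rejected at compile time, while FailureReasons.From(new FieldIsInvalid("email")) and longer argument lists are accepted.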

I have used the above service implementation for the sake of you, the reader, having some familiar ground beneath your feet from which to learn these new concepts I have tried introducing to you. I certainly hope it worked.

As a conclusion

In this not-so-brief article, I have done my best to convince you to program in a more transparent manner and, in the process, help your fellow developers better understand and use your code. I hope that by sharing this knowledge of what clean code can look like (I am not claiming this is the only solution to this transparency problem), the programming world will become a cleaner place and my chances of ending up on a project where quality standards are low will decrease.

I also hope I have managed to convince you, at least a bit, of the importance of self-explanatory abstractions. If this is all that you keep after having finished this article, I will be happy. If you also decide to stick to this practice, even partially, I will be even happier. If you decide to go all the way with it and have your interfaces be honest about what they really do, I will be most happy.

Remember, every piece of clean code you write might make someone’s day a bit brighter!

 

1 Comment
  • Supriya Rajgopal
    Posted at 11:12h, 04 June

    Firstly, thank you Ciprian for throwing light on the importance of code readability & transparency!

    As a developer, we often have to make a trade-off between code readability and code consistency. This is especially true when a project enters the maintenance phase.
    While code readability surely makes the code easier to maintain, code consistency helps in keeping the code structured (although imperfect).
