Build To Manage: Proper Exception Handling Makes Your Applications Easier to Build and Manage

Gerry Kovan
Jul 9, 2020

Exception handling continues to be one of the more challenging aspects when designing and writing code. Poor exception handling leads to many problems such as:

· poorly designed code that is difficult to understand, test and refactor

· errors whose root cause is difficult to diagnose

· poor logging

· poor monitoring

These problems not only make the code difficult to refactor and extend with new features, but they also make managing the application in production environments extremely difficult.

The antidote to these problems involves applying the following best practices:

· apply the single responsibility design principles to exception handling

· proper exception handling and logging of exceptions

· provide meaningful error codes and error messages

· enable distributed tracing

· apply best practice API design for returning HTTP status codes

In this blog, we will explain how each of these best practices can avoid the pitfalls of poor exception handling and allow you to build apps that are easier to manage. A sample mortgage calculator microservice application written using the Java Spring Boot framework is used to demonstrate the concepts; however, the techniques can be applied to any language.

Description of Mortgage Calculator

There are two microservices in the mortgage calculator sample application: 1) mortgage-calculator 2) interest-rate.

The mortgage-calculator microservice has the business logic to calculate the monthly mortgage payments (either fixed-rate or interest only). If the request payload does not include an interestRate field in it, then the interest-rate microservice is invoked to get the latest real-time rate.

Topology architecture of a mortgage-calculator application

Apply Single Responsibility Principle for Exception Handling

The single responsibility principle states that each class in your codebase should have only a single responsibility. This applies to exception handling as well. Typically, in a software system that implements a REST API, there are several layers in the code base such as:

· controllers — defines the REST API

· tasks — business logic where the logic is bounded to the process

· services — makes calls to external systems

Exceptions can occur at any of these layers in the code. For example, you may have some logic in the controller to validate the request and throw an exception when the request is not valid. The task layer can have exception handling logic for business logic scenarios. The service layer typically throws exceptions when a problem occurs while calling an external service, whether a client-side problem such as bad credentials or a server error on the dependent service.

A best practice approach to handle these errors is to introduce a single class to handle these exceptions. In the case of our sample mortgage application, we introduced the ‘MortgageCalculatorExceptionHandler’.

This class is annotated with the @ControllerAdvice Spring annotation, which declares that the class is a Spring bean whose exception handling methods apply across controllers.

This class can be thought of as an extension to the controller. Essentially, the controller contains the behavior for the happy path of the API, while the ControllerAdvice class contains the API behavior for the exception cases. You will notice that the logic for handling the exceptions also includes the HTTP status codes that are returned by the API. Separating exception handling into its own class makes the design cleaner, which results in code that is easier to understand and refactor.
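The separation described above can be sketched without any framework. The following is a minimal illustration, not the sample application's actual code: the class names (BadRequestException, DependentServiceException, ApiError, ExceptionToResponseMapper, CalculatorController) and the interest-only formula are assumptions made for this example. In Spring, the role of ExceptionToResponseMapper would be played by a @ControllerAdvice class with @ExceptionHandler methods.

```java
// Hypothetical application exceptions, one per failure category.
class BadRequestException extends RuntimeException {
    BadRequestException(String message) { super(message); }
}

class DependentServiceException extends RuntimeException {
    DependentServiceException(String message, Throwable cause) { super(message, cause); }
}

// Minimal stand-in for an HTTP error response: a status code plus a message.
class ApiError {
    final int status;
    final String message;
    ApiError(int status, String message) { this.status = status; this.message = message; }
}

// The single place that decides how each exception type maps to an API response.
class ExceptionToResponseMapper {
    static ApiError handle(RuntimeException e) {
        if (e instanceof BadRequestException) {
            return new ApiError(400, e.getMessage());
        }
        if (e instanceof DependentServiceException) {
            return new ApiError(500, e.getMessage());
        }
        return new ApiError(500, "Unexpected server error");
    }
}

// The "controller" holds only the happy path and lets exceptions propagate.
class CalculatorController {
    // Assumed formula for illustration: monthly interest-only payment.
    static double interestOnlyPayment(Double principal, Double annualRatePercent) {
        if (principal == null || annualRatePercent == null) {
            throw new BadRequestException("principal and interestRate are required");
        }
        return principal * (annualRatePercent / 100.0) / 12.0;
    }
}

class HandlerDemo {
    public static void main(String[] args) {
        try {
            CalculatorController.interestOnlyPayment(null, 5.0);
        } catch (RuntimeException e) {
            ApiError error = ExceptionToResponseMapper.handle(e);
            System.out.println(error.status + " " + error.message);
        }
    }
}
```

Because the mapping lives in one class, an API behavior test only needs to trigger each exception type and check the resulting status code.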

The controller code that handles the happy path is kept very clean and simple.

Testing the code becomes simpler as well. Below is the code to test the API behavior. Notice that we test for all the different types of HTTP status codes that can be returned from the /calculate API.

Note that other layers of the microservice need appropriate tests as well. In this blog, we focus only on the tests that capture API behavior.

Proper Exception Handling and Logging of Exceptions

Every application will encounter exceptions. Exceptions let the development team know when something went wrong. It is important to make it as easy as possible for the development team to diagnose the root cause of the problem.

Either log an exception or throw an exception, but not both.

The code below shows how we invoke our Interest Rate service via a REST API call, along with our exception handling. We wrap the original exception in an application-specific exception class that we define, called InterestRateServiceException, so that we can pass it custom properties such as an error status code and an appropriate error message. It is also very important to pass in the original exception, which we call origException. When we eventually log the exception, the original exception (the cause) gets logged too, which is critical for diagnosing the problem. Notice that we do not log the exception yet, because we only want to log it one time.
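As a rough sketch of the wrap-and-rethrow pattern just described: the exception name InterestRateServiceException, the origException parameter, and the MC0003 message come from this article, but the constructor shape, the InterestRateClient class, and the simulated failure in fetchRateFromRemote are assumptions for illustration only.

```java
// Application-specific exception carrying an error code, a status code, and the original cause.
// The constructor signature is an assumption made for this sketch.
class InterestRateServiceException extends RuntimeException {
    private final String errorCode;
    private final int statusCode;

    InterestRateServiceException(String errorCode, String message, int statusCode, Throwable origException) {
        // Passing the cause to super keeps the "Caused by:" section in the logged stack trace.
        super(errorCode + ":" + message, origException);
        this.errorCode = errorCode;
        this.statusCode = statusCode;
    }

    String getErrorCode() { return errorCode; }
    int getStatusCode() { return statusCode; }
}

class InterestRateClient {
    // Hypothetical stand-in for the REST call; it always fails the way an unavailable service would.
    static double fetchRateFromRemote() {
        throw new IllegalStateException("503 Service Unavailable");
    }

    static double getRate() {
        try {
            return fetchRateFromRemote();
        } catch (RuntimeException origException) {
            // Wrap and rethrow. Deliberately no logging here, so the error is logged only once upstream.
            throw new InterestRateServiceException(
                    "MC0003", "Error invoking the interest rate service.", 500, origException);
        }
    }
}
```

The caller sees a single exception type with a stable error code, while getCause() still exposes the low-level failure for whoever logs it.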

Log an exception only once

In our application, the exceptions eventually bubble up to the ControllerAdvice layer (the extension to the Controller), and that is where they are handled and logged. The log.error statements (lines 8 and 19) are where we log the error. Each exception is logged only once, which makes it easier for developers to navigate the application logs and identify the root cause of the problem. The exception e is passed as a parameter to the log.error statement so that the exception stack trace gets printed out.
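The log-once rule can be illustrated with a small framework-free sketch. It uses java.util.logging in place of the application's logger, and every class name here (RateLookupException, RecordingHandler, LogOnceExample) is hypothetical. The in-memory handler exists only so the example can show how many records were written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

class RateLookupException extends RuntimeException {
    RateLookupException(String message, Throwable cause) { super(message, cause); }
}

// Collects log records in memory so the number of log entries can be inspected.
class RecordingHandler extends Handler {
    final List<LogRecord> records = new ArrayList<>();
    @Override public void publish(LogRecord record) { records.add(record); }
    @Override public void flush() { }
    @Override public void close() { }
}

class LogOnceExample {
    static final Logger LOG = Logger.getLogger("LogOnceExample");

    // Service layer: wraps and rethrows, does not log.
    static double getRate() {
        try {
            throw new IllegalStateException("503 Service Unavailable");
        } catch (IllegalStateException orig) {
            throw new RateLookupException("MC0003:Error invoking the interest rate service.", orig);
        }
    }

    // Task layer: lets the exception propagate untouched.
    static double calculate() {
        return 100000 * getRate() / 100 / 12;
    }

    // Top-level handler: the only place that logs. Returns an HTTP-like status code.
    static int handleRequest() {
        try {
            calculate();
            return 200;
        } catch (RateLookupException e) {
            // Passing e as the last argument attaches the stack trace and its cause to the log record.
            LOG.log(Level.SEVERE, "Error invoking interest rate service", e);
            return 500;
        }
    }
}
```

Attaching a RecordingHandler to LOG and calling handleRequest() yields exactly one record, whose thrown exception still carries the original cause.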

As an example, when we call the mortgage calculator /calculate endpoint with the following request body…

{"principal": "100000","term": "30","type": "interest"}

We expect the mortgage calculator to invoke the interest-rate service to get the latest interest rate. In the case when the interest rate service is down or not working, the following exception would be logged by the mortgage calculator…

2020-07-05 23:29:32.828 ERROR [-,5f029a9c2397110f1f4f33672cd54079,1f4f33672cd54079,false] 3391 --- [nio-8110-exec-9] m.c.c.MortgageCalculatorExceptionHandler : Error invoking interest rate service :
com.gk.mortgage.calculator.exceptions.InterestRateServiceException: MC0003:Error invoking the interest rate service.
    at com.gk.mortgage.calculator.service.InterestRateServiceImpl.getRates(InterestRateServiceImpl.java:64) ~[classes/:na]
    at com.gk.mortgage.calculator.task.MortgageProcessorTaskImpl.process(MortgageProcessorTaskImpl.java:31) ~[classes/:na]
    at com.gk.mortgage.calculator.controller.MortgageCalculatorController.calculateMonthlyPayment(MortgageCalculatorController.java:34) ~[classes/:na]
    ......
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_144]
Caused by: org.springframework.web.client.HttpServerErrorException: 503 Service Unavailable
    at org.springframework.web.client.DefaultResponseErrorHandler.handleError(DefaultResponseErrorHandler.java:111) ~[spring-web-4.3.25.RELEASE.jar:4.3.25.RELEASE]
    at org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:709) ~[spring-web-4.3.25.RELEASE.jar:4.3.25.RELEASE]
    at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:662) ~[spring-web-4.3.25.RELEASE.jar:4.3.25.RELEASE]
    at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:622) ~[spring-web-4.3.25.RELEASE.jar:4.3.25.RELEASE]
    at org.springframework.web.client.RestTemplate.exchange(RestTemplate.java:540) ~[spring-web-4.3.25.RELEASE.jar:4.3.25.RELEASE]
    at com.gk.mortgage.calculator.service.InterestRateServiceImpl.getRates(InterestRateServiceImpl.java:58) ~[classes/:na]
    ... 118 common frames omitted

Notice that a single log entry contains the error code (MC0003), the error message, and the original exception, which is the cause and appears after Caused by in the log output above.

This technique makes it easier for developers to diagnose problems by looking at the logs.

Provide Meaningful Error Codes and Error Messages

In the mortgage calculator microservice application, we defined the error codes and error messages in a file called messages.yml.

This file is read by the application as configuration. When an exception occurs, we populate the exception with the appropriate error code and error message. The source code below shows the validation code, which checks that the input request is a valid request object containing a principal amount and a term amount. If the request is not valid, we throw a BadRequestInputExcpeption and pass as parameters the error code and the error message that we read from the messages.yml file.
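A minimal sketch of this idea follows, with an in-memory map standing in for the entries loaded from messages.yml. Only MC0003 and its message appear in this article's log output; the codes MC0001 and MC0002, their messages, and the class names CodedException, ErrorCatalog, and RequestValidator are hypothetical and chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical exception that always carries a catalog error code.
class CodedException extends RuntimeException {
    private final String errorCode;

    CodedException(String errorCode, String message) {
        super(errorCode + ":" + message);
        this.errorCode = errorCode;
    }

    String getErrorCode() { return errorCode; }
}

// Stand-in for the code-to-message entries that would be loaded from a configuration file.
class ErrorCatalog {
    private static final Map<String, String> MESSAGES = new HashMap<>();
    static {
        MESSAGES.put("MC0001", "Request is missing the principal amount."); // hypothetical
        MESSAGES.put("MC0002", "Request is missing the term.");             // hypothetical
        MESSAGES.put("MC0003", "Error invoking the interest rate service.");
    }

    static String message(String code) {
        return MESSAGES.getOrDefault(code, "Unknown error.");
    }
}

class RequestValidator {
    // Throws a coded exception when a required field is absent; returns normally otherwise.
    static void validate(Map<String, String> request) {
        if (request == null || !request.containsKey("principal")) {
            throw new CodedException("MC0001", ErrorCatalog.message("MC0001"));
        }
        if (!request.containsKey("term")) {
            throw new CodedException("MC0002", ErrorCatalog.message("MC0002"));
        }
    }
}
```

Keeping codes and messages in one catalog means the same stable code shows up in the exception, the log line, and any alert rule built on top of the logs.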

Doing this makes the application easier to manage as it enables:

  • Searching of logs by error code
  • Automated alerts by error code
  • Analytics to identify the number of occurrences of specific error codes

Note that log aggregation systems such as LogDNA, Splunk, and others provide powerful tools for searching and alerting. When your application code emits meaningful, well-defined error codes, you can build sophisticated log analytics with those tools.

Enable Distributed Tracing

A well-designed software application typically consists of several software modules/components that interact with each other to perform a business function. In modern-day microservice architectures, the number of microservice interactions has increased substantially. There needs to be a way to observe the runtime behavior of the application. Distributed tracing makes this possible by attaching a trace id to every request and having the trace id propagate through the entire interaction within a running microservice process as well as across distinct microservice processes.

In our mortgage calculator example, when we submit the following request, which does not contain an interest rate value, the application invokes the interest-rate service to get the latest interest rate:

{"principal": "100000","term": "30","type": "interest"}

The logs that get generated for the mortgage calculator microservice are as follows:

2020-07-06 11:05:49.161 INFO [-,5f033dcd88a9c8a95399b8e4793a0ea6,5399b8e4793a0ea6,false] 7626 --- [nio-8110-exec-8] c.g.m.c.c.MortgageCalculatorController : Calculating morthly mortgage payment.
2020-07-06 11:05:49.173 INFO [-,5f033dcd88a9c8a95399b8e4793a0ea6,5399b8e4793a0ea6,false] 7626 --- [nio-8110-exec-8] t.InterestOnlyMortgageCalculatorTaskImpl : In calculate method of InterestOnly

The logs that get generated for interest rate service are:

2020-07-06 11:05:49.170 INFO [-,5f033dcd88a9c8a95399b8e4793a0ea6,b7f80d68be7fb243,false] 7625 --- [nio-8111-exec-7] c.g.i.r.c.InterestRateController : In controller of /interest-rates
2020-07-06 11:05:49.170 INFO [-,5f033dcd88a9c8a95399b8e4793a0ea6,b7f80d68be7fb243,false] 7625 --- [nio-8111-exec-7] c.g.i.rate.task.InterestRateTaskImpl : Getting latest interest rates
2020-07-06 11:05:49.170 INFO [-,5f033dcd88a9c8a95399b8e4793a0ea6,b7f80d68be7fb243,false] 7625 --- [nio-8111-exec-7] c.g.i.rate.task.InterestRateTaskImpl : The interest rates are: InterestRatesResponse(interestRates=[InterestRate(type=30, rate=5.0), InterestRate(type=15, rate=3.5)])

The bracketed field in each log line above contains the trace id followed by the span id (separated by a comma). Notice that the trace id (5f033dcd88a9c8a95399b8e4793a0ea6) is identical across the two microservices, while each microservice records its own span id (5399b8e4793a0ea6 in the mortgage calculator, b7f80d68be7fb243 in the interest rate service). This enables observability of the runtime behavior of the application across all microservice interactions. Searching the logs by trace id shows all the log entries across all components for that particular request, and searching by span id narrows the results to one service's part of the request, which is very powerful for understanding system behavior. This example demonstrated distributed tracing for the happy path; however, the same principle applies when exceptions occur.
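The propagation mechanism can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how any particular tracing library is implemented: the header name X-Trace-Id, the id format, and the classes TraceContext and TracingDemo are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Holds the ids attached to every log line written while handling one request.
class TraceContext {
    final String traceId;
    final String spanId;

    TraceContext(String traceId, String spanId) {
        this.traceId = traceId;
        this.spanId = spanId;
    }

    // Generates a random 16-character hex id (format chosen for this sketch).
    static String newId() {
        return UUID.randomUUID().toString().replace("-", "").substring(0, 16);
    }

    // Starts a brand-new trace for a request that arrived without trace information.
    static TraceContext startTrace() {
        return new TraceContext(newId(), newId());
    }

    // Prefix to put in front of each log message.
    String logPrefix() {
        return "[" + traceId + "," + spanId + "]";
    }
}

class TracingDemo {
    // Hypothetical header used to carry the trace id between services.
    static final String TRACE_HEADER = "X-Trace-Id";

    // Caller side: copy the current trace id into the outgoing request headers.
    static Map<String, String> outgoingHeaders(TraceContext current) {
        Map<String, String> headers = new HashMap<>();
        headers.put(TRACE_HEADER, current.traceId);
        return headers;
    }

    // Callee side: reuse the incoming trace id but create a span id of its own.
    static TraceContext fromIncoming(Map<String, String> headers) {
        String traceId = headers.get(TRACE_HEADER);
        if (traceId == null) {
            return TraceContext.startTrace();
        }
        return new TraceContext(traceId, TraceContext.newId());
    }

    public static void main(String[] args) {
        TraceContext calculator = TraceContext.startTrace();
        TraceContext interestRate = fromIncoming(outgoingHeaders(calculator));
        System.out.println(calculator.logPrefix() + " Calculating monthly mortgage payment");
        System.out.println(interestRate.logPrefix() + " Getting latest interest rates");
    }
}
```

Both printed lines share the same trace id and differ in span id, which is the same pattern visible in the two services' logs.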

Apply Best Practice API design for Returning HTTP Status Codes

The mortgage application microservice provides the /calculate REST API endpoint. The API returns the following HTTP status codes:

200 request executed successfully

400 request is missing or request does not contain all required values

401 unauthorized when incorrect security credentials are provided

500 server error when a call to a dependent service (interest rate service) fails

These HTTP status codes follow best practice usage and are clear for a developer who wants to understand how your API behaves.

The following blog can help you choose the appropriate HTTP status codes for your APIs: https://medium.com/@gkovan/http-status-code-resources-for-building-rest-apis-c01df798b446.

Well-defined APIs that use status codes appropriately make the application easier to manage. For example, you can use APM (application performance monitoring) tools such as New Relic or Dynatrace for monitoring and alerts based on rules applied to status codes. As an example, you can set up an alert that notifies the development team whenever an HTTP 500 status code occurs more than 3 times in a time window of 300 seconds. This may indicate a problem with the interest rate service. You can set up another alert whenever more than five HTTP 401 status codes occur within a time window of 60 seconds. This may indicate that someone is trying to break into your APIs by guessing security credentials. These are just two examples of alerts that can help the development team manage the application. Many other scenarios can be monitored as well.
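APM tools let you configure such rules without writing code, but the logic behind a rule like "more than 3 occurrences in 300 seconds" is simple enough to sketch. The class StatusCodeAlertRule below is hypothetical and is not the API of any monitoring product. It assumes occurrences are recorded in non-decreasing time order.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Fires when more than `threshold` occurrences fall inside a sliding window of `windowSeconds`.
class StatusCodeAlertRule {
    private final int threshold;
    private final long windowSeconds;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    StatusCodeAlertRule(int threshold, long windowSeconds) {
        this.threshold = threshold;
        this.windowSeconds = windowSeconds;
    }

    // Records one occurrence at the given time (epoch seconds) and reports whether the alert should fire.
    boolean record(long epochSeconds) {
        timestamps.addLast(epochSeconds);
        // Drop occurrences that have aged out of the window.
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= epochSeconds - windowSeconds) {
            timestamps.removeFirst();
        }
        return timestamps.size() > threshold;
    }
}

class AlertDemo {
    public static void main(String[] args) {
        // Rule from the text: more than 3 HTTP 500 responses within 300 seconds.
        StatusCodeAlertRule serverErrors = new StatusCodeAlertRule(3, 300);
        long[] errorTimes = {0, 10, 20, 30};
        for (long t : errorTimes) {
            System.out.println("t=" + t + " alert=" + serverErrors.record(t));
        }
    }
}
```

A second instance created as new StatusCodeAlertRule(5, 60) would express the HTTP 401 rule in the same way.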

Summary

This blog discusses how to make your application easier to manage. We first discussed the problems with poor exception handling such as:

· poorly designed code that is difficult to understand and test

· errors whose root cause is difficult to diagnose

· poor logging

· poor monitoring

The blog then discussed in detail the following techniques to make your application easier to manage:

· apply the single responsibility design principles to exception handling

· proper exception handling and logging of exceptions

· provide meaningful error codes and error messages

· enable distributed tracing

· apply best practice API design for returning HTTP status codes

By following these best practices, you will enable your development team to be more productive. Refactoring the code and adding new features becomes easier because your code has a clean design with appropriate tests that capture exception cases. Equally important, the development team can manage the application more efficiently by quickly finding and fixing the root cause of problems.

The source code for this example can be found in the git repos https://github.com/gkovan/mortgage-calculator and https://github.com/gkovan/interest-rate

https://www.ibm.com/garage/method/practices/code/build-to-manage

IBM Garage is built for moving faster, working smarter, and innovating in a way that lets you disrupt disruption.

Learn more at www.ibm.com/garage
