SOA component design: thinking about error handling

When designing components for a SOA landscape (or any multiprocess system), the primary concern is with the communication behavior of the component: how messages are passed to and from the component and in what order, what those messages are and what constitutes a valid message and what doesn’t. When the time comes to implement the component, related concerns come into play: how are messages projected from the communication language into the domain model and into the implementation language, how are communication patterns met and ensured, et cetera. In addition the project technical architect has to consider how to implement the component’s domain without hardlinking it to any other components whose domains are known or to the communication medium du jour (unless the component’s purpose is linked to that medium).

Now here’s the strange thing: with all the concerns that go into the design of components at all levels (from the enterprise architect down to the developers of the different components), one of the most overlooked aspects of building SOA components is cross-component error handling.

Error handling concerns

The fact that error handling is often the lowest-priority concern is doubly strange when you consider that cross-component error handling raises the same concerns as core functionality messaging. In both cases the same sets of concerns apply, both with regard to communication with external components and with regard to the interaction between the component’s internal implementation and the communication layer. Some typical concerns that development teams have to deal with are:

Error message definition: Just as SOA components require clear definitions of the messages that will be exchanged with client components for mainline communication, clear definitions must also be given of messages that carry error information.
Error communication behavior definition: Just as mainline communication behavior between SOA components must be clearly and formally defined, so must similar definitions be given for when a component can send an error message in response to a request for operation.
Projecting exception types from the domain language onto error types from the communication language: In the case of error handling, this is one half of a problem that also exists for mainline communication. In mainline communication, request messages must be projected onto domain model types, and domain model types must be projected onto communication language types when the component returns a response. In error handling there is of course no concept of request projection, since nobody requests an error; however, the analogy with projecting a domain model type onto a response message remains.
Maintaining component independence: This concern actually affects a component more as a consumer of other services than as a publisher of services. Maintaining component independence is related to avoiding domain models leaking over into foreign components (as Eric Evans puts it). In the case of messaging it means not building your domain model so that it is a mere copy of a foreign component’s communication model. In the more specific case of error handling it relates to not linking error handling too closely to the definition of errors used by foreign components. Instead, as with mainline messages, foreign component-generated errors should be projected onto the local domain model’s concept of errors or exceptions.
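
As a rough illustration of the last concern, the sketch below projects errors reported by a foreign component onto the local domain model's own exceptions. Everything in it is an assumption made for the example: the ForeignError payload, the "CRM-404" and "CRM-503" codes and the OrderException hierarchy are hypothetical names, not part of any real service.

```java
// Hypothetical error payload as received from a foreign component.
final class ForeignError {
    final String code;
    final String detail;

    ForeignError(String code, String detail) {
        this.code = code;
        this.detail = detail;
    }
}

// Local domain exceptions, named in the local domain's own terms.
class OrderException extends Exception {
    OrderException(String message) {
        super(message);
    }
}

final class UnknownCustomerException extends OrderException {
    UnknownCustomerException(String message) {
        super(message);
    }
}

final class CustomerServiceUnavailableException extends OrderException {
    CustomerServiceUnavailableException(String message) {
        super(message);
    }
}

// The only class that knows the foreign component's (hypothetical) error codes.
final class ForeignErrorAdapter {
    private ForeignErrorAdapter() {
    }

    static OrderException toDomain(ForeignError error) {
        String code = (error.code == null) ? "" : error.code;
        switch (code) {
            case "CRM-404":
                return new UnknownCustomerException(error.detail);
            case "CRM-503":
                return new CustomerServiceUnavailableException(error.detail);
            default:
                // Unmapped codes still surface as a local exception type.
                return new OrderException("Unmapped foreign error: " + code);
        }
    }
}
```

With this shape, the rest of the component only ever catches OrderException and its subclasses, so a change in the foreign component's error definitions should touch the adapter and nothing else.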

Balancing error handling strategies

Another concern in error handling (in addition to the ones mentioned above, which occur in all message processing systems) arises when communication layers introduce their own concept of error handling. A typical example of this is the WSDL/SOAP combination with its SOAPFault concept of error messaging. These built-in mechanisms can be very convenient in that they dictate an error handling strategy and leave you nothing to think about in your design. Standardization is also very valuable in these systems: it introduces an error handling strategy that everybody can agree to. However, there is also an inherent risk that these standardized error handling mechanisms will hardlink your component to a particular communication layer or framework (especially if the mechanism is allowed to leak through into the component’s inner domain model).

It is clear that each project must find a way to balance concerns in cross-component error handling strategy. Several strategies exist and must be considered:

Ignoring the communication layer’s mechanism

Undoubtedly the simplest strategy is to ignore the communication layer’s error handling mechanism. One of several alternatives must then be chosen instead. A simple alternative is never to communicate errors across components (this might, for example, be an enterprise policy) and to allow “empty” responses instead. Alternatively, a project might define its own error reporting as part of regular messaging, creating either/or responses (i.e. responses that contain either a regular response or an error).
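
An either/or response could look roughly like the following sketch. The Response class and its ok/error factory methods are hypothetical names chosen for this example; a real project would define the equivalent structure in its own message schema.

```java
import java.util.Objects;

// Either/or response: carries exactly one of a regular payload or an error.
final class Response<T> {
    private final T payload;        // set only for a regular response
    private final String errorCode; // set only for an error response
    private final String errorText; // optional human-readable description

    private Response(T payload, String errorCode, String errorText) {
        this.payload = payload;
        this.errorCode = errorCode;
        this.errorText = errorText;
    }

    static <T> Response<T> ok(T payload) {
        return new Response<>(Objects.requireNonNull(payload), null, null);
    }

    static <T> Response<T> error(String errorCode, String errorText) {
        return new Response<>(null, Objects.requireNonNull(errorCode), errorText);
    }

    boolean isError() {
        return errorCode != null;
    }

    T payload() {
        if (isError()) {
            throw new IllegalStateException("Error response: " + errorCode);
        }
        return payload;
    }

    String errorCode() {
        return errorCode;
    }

    String errorText() {
        return errorText;
    }
}
```

A client would check isError() before reading the payload, which is exactly the burden this strategy places on every consumer of the component.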

The upside of this strategy is its simplicity, plus the fact that it is guaranteed to keep the component implementation separate from the communication layer. However, client components (especially third-party ones) are not likely to appreciate the divergence from standardized mechanisms — making this a strategy that is only suitable for components that are for internal use only.

Using the communication layer’s mechanism

This strategy is more or less the other extreme of ignoring the communication layer’s mechanism. The choice here is to take the mechanism (and its reflection in the implementation language) and incorporate that throughout the component’s implementation. For example, a Java project might elect to use AxisFaults throughout the project instead of project-specific exception types.

Again, the upside of this strategy is its simplicity, plus its guaranteed interoperability with the communication layer. However, the component is now hardlinked to the communication layer. This is not necessarily a bad thing: there are projects and components whose purpose in life is linked to a single communication layer. A custom integration layer for different SOAP web services for instance might want to use SOAP Faults and related types for error handling (since it is technically part of the domain). In general though, this strategy will limit the component implementation’s future flexibility.

Mapping internal error handling to communication layer error handling

The final option seeks to combine the best of both worlds, separating the internal domain model from the specifics of the communication layer and allowing the two to touch only through a translation layer. This strategy also allows the internal component implementation to be linked to several different communication layers. For example, a component implementation might report errors using subclasses of java.lang.Exception. Different translation layers might then turn these exceptions into SOAP Faults for a SOAP web service, a JMS exception for JMS and a specific error page for a RESTful service.
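
A minimal sketch of such a translation layer is shown below. It deliberately avoids real SOAP or HTTP libraries: SoapFaultData and HttpErrorData are hypothetical stand-ins for whatever each communication layer would actually send, and InvalidOrderException is an assumed domain exception. The "Client"/"Server" codes follow the SOAP 1.1 fault code convention; the HTTP status choices are illustrative only.

```java
// Domain exception: knows nothing about SOAP, JMS or HTTP.
class InvalidOrderException extends Exception {
    InvalidOrderException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for the data carried by a SOAP Fault.
final class SoapFaultData {
    final String faultCode;
    final String faultString;

    SoapFaultData(String faultCode, String faultString) {
        this.faultCode = faultCode;
        this.faultString = faultString;
    }
}

// Hypothetical stand-in for an HTTP error response of a RESTful service.
final class HttpErrorData {
    final int status;
    final String body;

    HttpErrorData(int status, String body) {
        this.status = status;
        this.body = body;
    }
}

// One translator per communication layer; the domain code sees none of them.
interface ErrorTranslator<T> {
    T translate(Exception domainError);
}

final class SoapErrorTranslator implements ErrorTranslator<SoapFaultData> {
    @Override
    public SoapFaultData translate(Exception domainError) {
        // Caller-caused domain errors map to "Client", everything else to "Server".
        String code = (domainError instanceof InvalidOrderException) ? "Client" : "Server";
        return new SoapFaultData(code, domainError.getMessage());
    }
}

final class HttpErrorTranslator implements ErrorTranslator<HttpErrorData> {
    @Override
    public HttpErrorData translate(Exception domainError) {
        // Caller-caused domain errors map to 400, everything else to 500.
        int status = (domainError instanceof InvalidOrderException) ? 400 : 500;
        return new HttpErrorData(status, domainError.getMessage());
    }
}
```

Adding a further communication layer then means adding one more ErrorTranslator implementation, while the domain exceptions stay untouched.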

Architecturally this probably sounds like the go-to strategy for all situations. In practice it is also the option that most projects will want to use, since relatively few projects want to be hardlinked to the communication layer. However, it is more work to implement, especially if the projection of domain errors onto the communication layer is not straightforward.

Conclusion

Error handling is one of the most overlooked topics in the design and implementation of SOA components. However, it is at least as important as the design and implementation of mainline messaging and functionality, for the same reasons. And like mainline functionality, error handling requires careful forethought and weighing of options. Different strategies are available, each with its own characteristics with regard to implementation difficulty and ease of use. Projects must consider very carefully which strategy fits project goals and expected future usage patterns.

3 thoughts on “SOA component design: thinking about error handling”

sharma
February 21, 2011 at 1:58 pm

What is the best practice and approach to follow for exception handling in BPEL?

1) Do we need to implement exception handling in BPEL as we do in Java, meaning
method 3 throws error to method 2 (if any) and
method 2 throws error to method 1 (if any) and
finally method 1 throws error to the main Class.

If we replicate the above scenario in BPEL:

In BPEL main Scope have Custom Fault, Catch ALL

Each Invoke is surrounded by a Scope activity with Remote Fault, Binding Fault & Custom Fault

[or]

2) In BPEL main Scope have all exceptions defined like
Remote Fault,
Binding Fault,
anyOther System Fault,
Custom Fault (if required) and
CatchALL

and also
each Invoke is surrounded by a Scope activity with Custom Fault (business fault) exception handling

Eric Elzinga
August 2, 2008 at 9:37 pm

Hi Ben,

I’m doing a lot of integration projects lately, and we noticed that implementing good error handling functionality takes a lot of time. Mostly these parts are also not well planned in the projects. For integration components it’s hard to have good error handling functionality because it reaches across a lot of components and environments. Every possible error can also have its own handling functionality. Integration components like BPEL processes could make use of compensation handlers, but in Java you would try to handle it with exceptions, and other components could have their own strategies for those situations. So… it’s hard to describe one generic error handling strategy that applies to all. But for sure it’s important to ‘think more’ about error handling 🙂

jettro
July 27, 2008 at 7:59 am

Nice one again Ben. I think a good error handling strategy for projects and frameworks is overlooked a lot. It is often done as one of the last things, when it is too late, and no real strategy is used at all. A good way of noticing this is a lot of catch RuntimeException/Exception in the code.

I am running into the same problem with Flex: there is also a fault mechanism implemented with events. But you do not have the capabilities of Java available to handle the events in a standard way. You can create a mechanism yourself, but for bigger applications that is not what you want.

Error handling keeps being one of four attention points for projects: configuration management, application layering, standardized package, class and method names, and error handling.

thanks for the article.
