
Diameter Protocol Failover

The Failover state machine is taken from a Diameter project. As described in RFC3588, the Diameter Base Protocol is intended to provide an Authentication, Authorization and Accounting (AAA) framework for applications such as network access or IP mobility.

The requirements for AAA protocols include, among others, the failover mechanism described in RFC3539.

The state machine in the RFC3539 document defines the required functionality in the form of C code and a state transition table. Instead of specifying the behavior completely as a pure state machine, it relies on a Pending flag and a NumDWA counter to compensate for the missing states. The counter, which counts tests in a sequence 1, 2, 3, also has a special value, -1, which serves as an extra flag. Altogether this presents a rather confusing picture that takes a lot of time to implement. It is a typical way of using a state machine: the model is treated only as an informal tool that describes, rather than defines, software which is then refined and completed in the code.
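To make the criticism concrete, here is a minimal sketch of the flag-and-counter style. It is loosely modelled on the RFC3539 algorithm but is not the RFC's code: the class and method names (`FlagBasedWatchdog`, `on_timer`, `on_dwa`, `on_connection_up`) are hypothetical, timer handling is omitted, and actions are only recorded as strings. Consult the RFC for the authoritative version.

```python
from dataclasses import dataclass, field


@dataclass
class FlagBasedWatchdog:
    """Behavior depends on status plus two extra variables (illustrative only)."""
    status: str = "OKAY"
    pending: bool = False  # a watchdog request is outstanding
    num_dwa: int = 0       # answers counted in REOPEN; -1 doubles as a flag
    actions: list = field(default_factory=list)

    def on_timer(self):
        if self.status in ("OKAY", "REOPEN") and not self.pending:
            self.actions.append("SendWatchdog")
            self.pending = True
        elif self.status == "OKAY":
            self.status = "SUSPECT"
            self.actions.append("Failover")
        elif self.status == "SUSPECT":
            self.status = "DOWN"
            self.actions.append("CloseConnection")
        elif self.status == "REOPEN":
            if self.num_dwa < 0:
                # Second missed answer in a row: give up.
                self.status = "DOWN"
                self.actions.append("CloseConnection")
            else:
                # First missed answer: remember it in the counter.
                self.num_dwa = -1
        elif self.status == "DOWN":
            self.actions.append("AttemptOpen")

    def on_dwa(self):
        self.pending = False
        if self.status == "SUSPECT":
            self.status = "OKAY"
            self.actions.append("Failback")
        elif self.status == "REOPEN":
            if self.num_dwa == 2:
                self.status = "OKAY"
                self.actions.append("Failback")
            self.num_dwa += 1

    def on_connection_up(self):
        if self.status == "DOWN":
            self.num_dwa = 0
            self.actions.append("SendWatchdog")
            self.pending = True
            self.status = "REOPEN"
```

Note that `status == "REOPEN"` alone says little here: what happens next depends on `pending` and `num_dwa`, which is exactly the hidden state the text objects to.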

In contrast, with StateWORKS we define every state machine completely and formally, so that it meets the requirements in full. The solution presented here uses neither the Pending flag nor the NumDWA counter. Replacing them with states gives a state machine that is much easier to understand. The state OKAY_Pending corresponds to OKAY with a DWA answer pending. REOPEN states with the extension _NoPending “count” the DWA answers, and REOPEN states with the extension _Pending represent pending DWA answers. We added the state PAUSE to delay the transition from REOPEN to DOWN. The state transition diagram below shows the Failover state machine specified in our project. It is a complete specification and can be implemented as is.
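The idea of folding the flag and the counter into explicit states can be sketched as a plain transition table. This is an illustration only, not the StateWORKS specification: the numbered state names (`REOPEN_0_Pending` and so on), the event names and the individual transitions are assumptions derived from the description above, and actions are left out.

```python
# (state, event) -> next state.
# Hypothetical naming: the index in a REOPEN state is the number of DWA
# answers already received since the connection came back up.
TRANSITIONS = {
    ("OKAY", "timer"): "OKAY_Pending",
    ("OKAY_Pending", "dwa"): "OKAY",
    ("OKAY_Pending", "timer"): "SUSPECT",
    ("SUSPECT", "dwa"): "OKAY",
    ("SUSPECT", "timer"): "DOWN",
    ("DOWN", "connection_up"): "REOPEN_0_Pending",
    ("REOPEN_0_NoPending", "timer"): "REOPEN_0_Pending",
    ("REOPEN_0_Pending", "dwa"): "REOPEN_1_NoPending",
    ("REOPEN_1_NoPending", "timer"): "REOPEN_1_Pending",
    ("REOPEN_1_Pending", "dwa"): "REOPEN_2_NoPending",
    ("REOPEN_2_NoPending", "timer"): "REOPEN_2_Pending",
    ("REOPEN_2_Pending", "dwa"): "OKAY",
    # A missed answer while reopening goes to PAUSE, not straight to DOWN.
    ("REOPEN_0_Pending", "timer"): "PAUSE",
    ("REOPEN_1_Pending", "timer"): "PAUSE",
    ("REOPEN_2_Pending", "timer"): "PAUSE",
    ("PAUSE", "dwa"): "REOPEN_0_NoPending",
    ("PAUSE", "timer"): "DOWN",
}


def step(state, event):
    """Return the next state; an event with no table entry is ignored."""
    return TRANSITIONS.get((state, event), state)


def run(state, events):
    """Apply the events in order and return the list of visited states."""
    trace = [state]
    for event in events:
        state = step(state, event)
        trace.append(state)
    return trace
```

For example, `run("OKAY", ["timer", "timer", "timer"])` walks through `OKAY_Pending` and `SUSPECT` to `DOWN`. Because every situation has its own state, the current state name alone determines the reaction to the next event.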

[Figure: Failover state machine]

The complete documentation of the Failover state machine, as generated by StateWORKS Studio (in XML), is available under the Failover project. The Trace example shows the state changes during some typical disturbances in the peer connection: a temporary loss of DWA followed by recovery, and a more prolonged loss of the connection followed by recovery. Our design has undergone a great deal of testing and works very well.

To ease comparison with the RFC3539 standard we have used names from that standard in our state machine specification.