Auto Recovery feature in BPEL

Most of us are well versed in the Fault Management Framework in 11g, where one of the generic features we implement is the retry mechanism. Recently I heard about the ‘Auto Recovery’ feature in BPEL and took part in a discussion on when we should (or should not) rely on it during BPEL process execution. This was a new feature for me: I had heard of manual recovery, but had never explored or considered auto recovery during development. This made me realize that I am still a novice :).

So the purpose of this post is to explore ‘Auto Recovery’ in BPEL, covering the topics listed below. It does not discuss the configuration required in a clustered environment, startup-related configuration, or Callback Recovery.

  • Configuration
  • BPEL Recovery Console
  • Auto Recovery Behavior
  • Auto Recovery in Action
    • Invoke
    • Activity

Configuration:

‘Auto Recovery’ is configured by setting a few MBean properties in the EM console. To do so, navigate to soa-infra -> SOA Administration -> BPEL Properties -> More BPEL Configuration Properties -> RecoveryConfig. This brings up the following screen showing the default parameters. BPEL auto recovery is enabled by default.

RecoveryConfig

The properties startWindowTime and stopWindowTime specify the period during which Auto Recovery is active. By default the auto recovery feature is active from 12 AM to 4 AM every day (remember that this is SOA server time), as shown in the above screenshot. We can change these settings by updating the time values in 24-hour format and clicking Apply.

The property maxMessageRaiseSize specifies the number of messages to be submitted in each recovery attempt; in effect it is the batch size.

The property subsequentTriggerDelay specifies the interval between consecutive auto recovery attempts; the default value is 300 seconds.

The property threshHoldTimeInMinutes is used by the BPEL engine to mark an instance as eligible for auto recovery once a recoverable fault has occurred; the default value is 10 minutes.

If we look closely, none of these properties specifies the number of recovery attempts to make; that is a separate MBean property altogether. To set it, navigate to soa-infra -> SOA Administration -> BPEL Properties -> More BPEL Configuration Properties -> MaxRecoverAttempt. The default value is 2.
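To make the window idea concrete, here is a minimal Python sketch of how such a time-window check could work. It is only an illustration, not the engine's code: the function name is made up, and the handling of a window that wraps past midnight and of the exact boundary times is my assumption.

```python
from datetime import time


def in_recovery_window(now: time, start: time, stop: time) -> bool:
    """Return True if `now` lies in the window from `start` up to `stop`.

    Hypothetical helper: treating `stop` as exclusive and supporting a
    window that wraps past midnight are assumptions made for this sketch.
    """
    if start <= stop:
        return start <= now < stop
    # Window wraps past midnight, for example 22:00 to 02:00.
    return now >= start or now < stop


# Default window described in this post: 12 AM to 4 AM, SOA server time.
DEFAULT_START = time(0, 0)
DEFAULT_STOP = time(4, 0)

print(in_recovery_window(time(1, 30), DEFAULT_START, DEFAULT_STOP))
print(in_recovery_window(time(9, 0), DEFAULT_START, DEFAULT_STOP))
```

With the default window, a fault that becomes eligible at 9 AM server time would, under this model, wait until the next window opens.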

RecoveryAttempt

To disable ‘Auto Recovery’, set the maxMessageRaiseSize property value to 0.
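The batch-size role of maxMessageRaiseSize, and why a value of 0 switches the feature off, can be pictured with a small Python sketch. This is a simplified model under my own assumptions; the function name and the message ids are hypothetical and nothing here is taken from the BPEL engine itself.

```python
from typing import List


def next_recovery_batch(eligible_ids: List[str], max_message_raise_size: int) -> List[str]:
    """Pick the messages to submit in one recovery attempt.

    Hypothetical model: at most `max_message_raise_size` messages are
    taken per attempt, so a size of 0 means nothing is ever submitted.
    """
    if max_message_raise_size <= 0:
        return []
    return eligible_ids[:max_message_raise_size]


# Made-up message ids, for illustration only.
pending = ["msg-1", "msg-2", "msg-3"]

print(next_recovery_batch(pending, 2))
print(next_recovery_batch(pending, 0))
```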

BPEL Recovery Console:

Navigate to soa-infra -> Service Engines -> BPEL -> Recovery to view the recoverable instances. Note that the console shows all recoverable instances, irrespective of whether ‘Auto Recovery’ is enabled. We can manually recover the faulted instances from this console when auto recovery is not enabled.

recoveryconsole

Auto Recovery Behavior:

Whenever a recoverable fault occurs during BPEL processing (the term is rather abstract; I verified this behavior with Remote, Binding and User Defined faults), the instance becomes visible in the Recovery console. If Auto Recovery is enabled, the BPEL runtime tries to auto recover the instance after threshHoldTimeInMinutes. If that attempt is not successful, further recovery attempts are made, up to the number given by MaxRecoverAttempt, at the interval given by subsequentTriggerDelay. If the instance still fails after the maximum number of recovery attempts, it is marked as exhausted (such instances can be queried in the Recovery console using the message state ‘exhausted’). We can use the ‘Reset’ button to make these instances eligible for Auto Recovery again.

Note that we observe this behavior only when the fault is thrown back to the BPEL runtime or is not caught within the BPEL process.
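The life cycle described above can be summarized as a tiny state model. The Python sketch below is my own simplification, not the engine's implementation: the class and method names are hypothetical, ‘undelivered’ and ‘exhausted’ are the message states seen in the Recovery console, and ‘recovered’ is just a label I use for a successful attempt.

```python
from dataclasses import dataclass


@dataclass
class RecoverableMessage:
    """Hypothetical model of one entry in the Recovery console."""

    max_recover_attempt: int
    attempts: int = 0
    recovered: bool = False

    @property
    def state(self) -> str:
        if self.recovered:
            return "recovered"
        if self.attempts >= self.max_recover_attempt:
            return "exhausted"
        return "undelivered"

    def auto_recover(self, succeeds: bool) -> None:
        """One auto recovery attempt; ignored unless still undelivered."""
        if self.state != "undelivered":
            return
        self.attempts += 1
        self.recovered = succeeds

    def reset(self) -> None:
        """Model of the Reset button: make an exhausted message eligible again."""
        if self.state == "exhausted":
            self.attempts = 0


msg = RecoverableMessage(max_recover_attempt=2)
msg.auto_recover(succeeds=False)
msg.auto_recover(succeeds=False)
print(msg.state)  # exhausted
msg.reset()
print(msg.state)  # undelivered
```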

Auto Recovery in Action:

I developed a simple one-way BPEL process for demonstration. It has an invoke activity that results in a RemoteFault, followed by a dehydrate activity.

Scenarios Verified:

  • No Catch -> Got Remote Fault -> Auto Recovery happened.
  • Catch All -> Got Remote Fault -> Auto Recovery did not happen.
  • Catch All (Scope level) -> Got Remote Fault -> Re-throw Remote Fault -> Auto Recovery happened.
  • Catch All (Scope level) -> Got Remote Fault -> Re-throw User Defined Fault -> Auto Recovery happened.
  • Catch All (Scope level) -> Got Binding Fault -> Re-throw User Defined Fault -> Auto Recovery happened.
  • Catch All (Scope level) -> Got User Defined Fault -> Re-throw User Defined Fault -> Auto Recovery happened.
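One way to read the six scenarios above is as a single rule: auto recovery kicked in whenever the fault reached the BPEL runtime, either because it was never caught or because it was re-thrown. The Python sketch below encodes that reading and checks it against the observations; the function name and the tuple layout are mine, and the rule is an interpretation of these tests rather than documented engine behavior.

```python
def reaches_runtime(caught: bool, rethrown: bool) -> bool:
    """Interpretation of the tests: a fault reaches the BPEL runtime if it
    is not caught, or if it is caught and then re-thrown."""
    return (not caught) or rethrown


# (scenario, caught, rethrown, auto recovery observed)
SCENARIOS = [
    ("No Catch, Remote Fault", False, False, True),
    ("Catch All, Remote Fault", True, False, False),
    ("Catch All (scope), Remote Fault, re-throw Remote Fault", True, True, True),
    ("Catch All (scope), Remote Fault, re-throw User Defined Fault", True, True, True),
    ("Catch All (scope), Binding Fault, re-throw User Defined Fault", True, True, True),
    ("Catch All (scope), User Defined Fault, re-throw User Defined Fault", True, True, True),
]

for description, caught, rethrown, observed in SCENARIOS:
    predicted = reaches_runtime(caught, rethrown)
    print(f"{description}: predicted={predicted}, observed={observed}")
```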

Configuration Used:

startWindowTime – 0.00

stopWindowTime – 7.00

maxMessageRaiseSize – 50

subsequentTriggerDelay – 300 (sec)

threshHoldTimeInMinutes – 5 (min)

MaxRecoverAttempt – 4
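With this configuration, a rough timeline of the recovery attempts can be worked out by hand. The Python sketch below does the arithmetic under a simplified model that I am assuming: the first attempt comes threshHoldTimeInMinutes after the fault, and each later attempt follows subsequentTriggerDelay after the previous one. It ignores the recovery window and any scheduler granularity, and the fault time is a made-up example.

```python
from datetime import datetime, timedelta
from typing import List


def expected_attempt_times(
    fault_time: datetime,
    threshold_minutes: int,
    trigger_delay_seconds: int,
    max_recover_attempt: int,
) -> List[datetime]:
    """Earliest attempt times under the simplified model described above."""
    first = fault_time + timedelta(minutes=threshold_minutes)
    delay = timedelta(seconds=trigger_delay_seconds)
    return [first + i * delay for i in range(max_recover_attempt)]


# Hypothetical fault time; the other values are the configuration used here.
fault = datetime(2014, 1, 1, 1, 0, 0)
for attempt in expected_attempt_times(fault, 5, 300, 4):
    print(attempt.strftime("%H:%M:%S"))
```

For a fault at 01:00:00 this prints 01:05:00, 01:10:00, 01:15:00 and 01:20:00, all well inside the 0.00 to 7.00 window.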

Invoke Auto Recovery in Action:

The instance is faulted with remote fault.

invoke1

The BPEL process instance is visible in Recovery console as ‘Undelivered’.

invoke2

Observe that the ‘BPEL Message Recovery Required’ notification becomes visible once the time given by the property threshHoldTimeInMinutes has elapsed.

invoke3

The flow trace after the first auto recovery attempt made by the BPEL engine is shown below. Observe that the retry re-initiated the process from the start, as there is no dehydration point before the faulted invoke.

invoke4

After the 2nd recovery attempt. Observe the time difference between the successive recovery attempts.

invoke5

After the 4th and final recovery attempt.

invoke6

This BPEL process can now be seen in the recovery console with the message state ‘Exhausted’ (shown below), as all 4 recovery attempts have been made. We can either recover the process manually by clicking the ‘Recover’ button, or click the ‘Reset’ button to make it eligible for auto recovery again.

invoke7

Clicking the Reset button makes this process eligible for auto recovery again, and the BPEL engine restarts the recovery attempts (shown below).

invoke8

Activity:

To demonstrate Activity auto recovery, I modified the BPEL process to add a Dehydrate and an Assign activity before the faulted invoke. This case also demonstrates that auto recovery resumes from the last break point. The highlighted part shown below marks the difference from the previous scenario: a Dehydrate activity ahead of the remote fault at the invoke activity.

act1

In the BPEL recovery console, we can search for the activities that are marked for recovery. Assign3 is the first activity after the dehydrate activity, so the recovery should happen from this activity.

act2
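The "resume from the last break point" idea can be sketched in a few lines of Python. This is only a model of what I observed, under the assumption that recovery restarts at the activity right after the most recent dehydration point before the fault, or at the beginning if there is none. Apart from Assign3, the activity names are made up.

```python
from typing import List, Set


def recovery_start_index(activities: List[str], failed_index: int, dehydration_points: Set[str]) -> int:
    """Index of the activity from which a recovery attempt would resume.

    Hypothetical model: scan backwards from the failed activity to the
    nearest dehydration point and resume right after it; with no
    dehydration point before the fault, start from the first activity.
    """
    for i in range(failed_index - 1, -1, -1):
        if activities[i] in dehydration_points:
            return i + 1
    return 0


# First demo: faulted invoke with the dehydrate only after it.
invoke_case = ["receiveInput", "Invoke", "Dehydrate"]
# Second demo: Dehydrate and Assign3 added before the faulted invoke.
activity_case = ["receiveInput", "Dehydrate", "Assign3", "Invoke"]

print(invoke_case[recovery_start_index(invoke_case, 1, {"Dehydrate"})])      # receiveInput
print(activity_case[recovery_start_index(activity_case, 3, {"Dehydrate"})])  # Assign3
```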

The following screenshots show the flow trace after the first auto recovery attempt made by the BPEL engine. Observe the difference from the previous run: the entire BPEL process is not restarted; instead it resumes from the Assign3 activity, as expected.

act3

act4

After the 4th recovery attempt.

act5

act6

act7

This BPEL process can now be seen in the recovery console with the message state ‘Exhausted’ (shown below), as all 4 recovery attempts have been made. We can recover it manually by clicking ‘Recover’. Observe that the Reset button is not available here, so a manual recovery is needed.

act8

Other Observations:

  • The behavior described above is observed only for async BPEL processes; for sync BPEL processes (transient sync) no auto recovery is performed. I have not yet verified the behavior for durable sync BPEL processes.

 

Sample code can be downloaded from here.

References:

http://docs.oracle.com/cd/E17904_01/integration.1111/e10226/bp_config.htm

4 Responses to “Auto Recovery feature in BPEL”


  1. 1 Anonymous December 14, 2022 at 1:01 AM

    Hi Siva,

    how we can recover the instance from backend ?

    updating the soainfa tables ..

    regards,
    Hussain

  2. 2 faruk August 24, 2014 at 5:34 PM

    thank, good explanation

  3. 4 Bijendra February 28, 2014 at 5:59 PM

    Very nicely demostrated…thanks.

