Blogs

Expert Advice: Making Scripts Robust: Managing Retries

By Erdem posted 08-11-2015 14:51

  

Part 5 of 5 of Making Scripts Robust

 

Previous Article: Making Scripts Robust: Recovering from Power Loss or Routing Engine Restart

Next Article: Best Practices Series: Considerations for On-Box Data Storage and Access

 

NOTE: This applies to SLAX version 1.0 and higher.

Managing Retries


“If we do not succeed, we run the risk of failure.” -- Vice President Dan Quayle


Not every incidence of random chance or malfunction needs to be fatal to your script. Many error conditions might be transient. For example, your operation may take longer than the default time-out interval because the system is temporarily very busy, or someone else may be updating the configuration and you’re locked out. Instances like those present an opportunity to add some time-out/retry logic to your scripts.

 

The example below is a function that checks for a semaphore value in the Utility MIB. In the case of a time-out, the function will recursively call itself until it exceeds a maximum value.

 

<func:function name="util:check-semaphore-with-reties"> {
param $con; /* handle to mgd */
param $name; /* object name (will be translated to OID */
param $type; /* data type for OID */
param $count; /* current attempt counter */
var $max-tries = 20; /* Max attempts before throwing an error */
var $sleeptime = 10; /* Seconds to sleep between retries */
/* translate name to oid */
var $name-oid = util:build-oid-name($name, $type);
var $get-rpc = <get-snmp-object> {
<snmp-object-name> $name-oid;
}
/* get the data */
var $val = {
if ($count < $max-tries ) {
var $index = $count _ "/" _ $max-tries;
var $res = jcs:execute($con, $get-rpc);
if ( ($res//xnm:error) || ($res/snmp-object/error) || ( not($res/snmp-object/object-value) ) ) {
if ( contains($res/snmp-object/error,"No such instance")) {
expr util:emit-msg('info', concat("No semaphore found for ", $name, " Proceeding."));
expr (-2); /* -2 = "wheels fell off somewhere */
} else {
expr jcs:sleep($sleeptime);
expr util:emit-msg('warning', concat("Problem getting ", $name , ": ",
$res//xnm:error/message, $res/snmp-object/error,
". Attempt ", $index, ". Retry in ",$sleeptime," sec.") );
expr util:check-semaphore-with-retries($con, $name, $type, $count+1);
}
} else {
expr number($res/snmp-object/object-value);
}
} else {
/* If retries are exhausted, return semaphore value = -1 as a flag (PIDs can't be < 1) */
expr util:emit-msg('error', concat("Retries exceeded getting ", $name, " value from Utility MIB."));
expr number(-1);
}
}
<func:result select = "$val">;
}
/*******************************************************************/

 

The example code in the above function illustrates the retry. In this particular example, we end up in the else clause (ca. lines 22-28) typically when the snmp query times out. In this case, we simply sleep a bit, and call ourselves again, but increment the counter for the number of tries (count).

 

Since we check that counter every time the function is called, we will either eventually succeed and continue normally – or we’ll exceed our $max-tries value and return a failed status to our caller.

 

Written by Douglas McPherson
Solutions Consultant at Juniper


#Slax
#ExpertAdvice