SRX

Ask questions and share experiences about the SRX Series, vSRX, and cSRX.
  • 1.  SRX cluster routing engine has GR Error gres-not-ready

    Posted 03-04-2019 03:51

    Hi all,

     

    I have a cluster problem and no clue what is causing it.

    After some years of running I had to stop one firewall node (SRX550) - this was node1. After the reboot its interfaces were down (both in fpc0 and in fpc3), so I took it offline until the hardware could be replaced.

    Later I tried to start the firewall node without any cables and the interfaces came up normally, so I tried to put it back into the cluster.

    When it started it immediately became the active node on RG0, but the reth interfaces remained down (with all the ge interfaces up), so I turned it off again. No preemption is configured, so the interfaces remained active on the other node, node1.

    After that I discovered that RG0 on node1 shows an error (GR) - probably this is the reason why node0 took mastership when I plugged it back in.

    Now node0 is turned off, I have this GR (GRES monitoring) error, and the firewall is working.

    I would like to put node0 back in service, but first I want to clear this GR error.

    When I check show chassis cluster information detail I can see gres-not-ready:

     

    {primary:node1}
    user@firewall-node1> show chassis cluster status
    Monitor Failure codes:
        CS  Cold Sync monitoring        FL  Fabric Connection monitoring
        GR  GRES monitoring             HW  Hardware monitoring
        IF  Interface monitoring        IP  IP monitoring
        LB  Loopback monitoring         MB  Mbuf monitoring
        NH  Nexthop monitoring          NP  NPC monitoring
        SP  SPU monitoring              SM  Schedule monitoring
        CF  Config Sync monitoring      RE  Relinquish monitoring

    Cluster ID: 1
    Node   Priority Status         Preempt Manual   Monitor-failures

    Redundancy group: 0 , Failover count: 0
    node0  0        lost           n/a     n/a      n/a
    node1  255      primary        no      yes      GR

    Redundancy group: 1 , Failover count: 0
    node0  0        lost           n/a     n/a      n/a
    node1  0        primary        no      no       CS

     

     

    {primary:node1}
    user@firewall-node1> show chassis cluster information detail
    node1:
    --------------------------------------------------------------------------
    Redundancy mode:
        Configured mode: active-active
        Operational mode: active-active
    Cluster configuration:
        Heartbeat interval: 1000 ms
        Heartbeat threshold: 3
        Control link recovery: Disabled
        Fabric link down timeout: 66 sec
    Node health information:
        Local node health: Not healthy
        Remote node health: Healthy

    Redundancy group: 0, Threshold: 255, Monitoring failures: gres-not-ready

     

    Please help me clear this GR error.

     

    Thanks,

    Balázs



  • 2.  RE: SRX cluster routing engine has GR Error gres-not-ready
    Best Answer

    Posted 03-04-2019 18:50

    node1 is not in a healthy state. I think this is because a split-brain scenario occurred. Also, the RG0 priority on node1 is 255, which means there was a manual failover. Reset it with "request chassis cluster failover reset redundancy-group 0".
    You have to reboot node1 to recover from the unhealthy state.

    Reboot node1 and power on node0 at the same time, so that downtime is reduced and the kernel state is not synced from node1 to node0.
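
    The way the status table is read above (priority 255 together with Manual = yes pointing to a manual failover, plus the two-letter monitor-failure codes) can be sketched as a small parser. This is only an illustration: the function names parse_cluster_status and findings are hypothetical, and the column layout is assumed to match the "show chassis cluster status" output pasted in this thread, which may differ between Junos releases.

```python
import re

# Failure codes copied from the legend printed by `show chassis cluster status`
# in the first post of this thread.
FAILURE_CODES = {
    "CS": "Cold Sync monitoring", "FL": "Fabric Connection monitoring",
    "GR": "GRES monitoring", "HW": "Hardware monitoring",
    "IF": "Interface monitoring", "IP": "IP monitoring",
    "LB": "Loopback monitoring", "MB": "Mbuf monitoring",
    "NH": "Nexthop monitoring", "NP": "NPC monitoring",
    "SP": "SPU monitoring", "SM": "Schedule monitoring",
    "CF": "Config Sync monitoring", "RE": "Relinquish monitoring",
}

# Assumed layout: a "Redundancy group: N" header followed by one row per node.
RG_RE = re.compile(r"^Redundancy group:\s*(\d+)")
NODE_RE = re.compile(r"^(node\d+)\s+(\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.+)$")


def parse_cluster_status(text):
    """Return one dict per node row, tagged with its redundancy group."""
    rows, group = [], None
    for raw in text.splitlines():
        line = raw.strip()
        m = RG_RE.match(line)
        if m:
            group = int(m.group(1))
            continue
        m = NODE_RE.match(line)
        if m and group is not None:
            node, prio, status, _preempt, manual, failures = m.groups()
            # Keep only known codes; "n/a" and similar placeholders are dropped.
            codes = [c for c in failures.split() if c in FAILURE_CODES]
            rows.append({
                "group": group, "node": node, "priority": int(prio),
                "status": status, "manual": manual == "yes",
                "failures": codes,
            })
    return rows


def findings(rows):
    """Turn parsed rows into human-readable warnings."""
    out = []
    for r in rows:
        if r["priority"] == 255 and r["manual"]:
            out.append(f"RG{r['group']} {r['node']}: manual failover active (priority 255)")
        for code in r["failures"]:
            out.append(f"RG{r['group']} {r['node']}: {code} = {FAILURE_CODES[code]}")
    return out


# Sample taken from the output posted above.
SAMPLE = """\
Redundancy group: 0 , Failover count: 0
node0  0        lost           n/a     n/a      n/a
node1  255      primary        no      yes      GR

Redundancy group: 1 , Failover count: 0
node0  0        lost           n/a     n/a      n/a
node1  0        primary        no      no       CS
"""

if __name__ == "__main__":
    for line in findings(parse_cluster_status(SAMPLE)):
        print(line)
```

    Run against the sample, this flags the manual failover on RG0 and the GR and CS monitor failures on node1. It only reads the output; the reset and reboot still have to be done on the device.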

     



  • 3.  RE: SRX cluster routing engine has GR Error gres-not-ready

    Posted 03-04-2019 23:35

    Hello,

     

    Thanks for the reply. Yesterday I restarted the node and that cleared the error.

    I left the other node turned off.

     

    In the end I found that jumbo frames were not enabled on the switch that carries the HA link between the nodes; that caused the original error.

     

    Thanks,

     

    Balázs