SRX cluster routing engine has GR Error gres-not-ready
I have a cluster problem and no clue what is causing it.
After some years of running, I had to stop one firewall node (SRX550); this was node1. After the reboot its interfaces were down (both in fpc0 and in fpc3), so I took it offline until the hardware could be replaced.
Later I tried to start the firewall node without any cables attached and the interfaces came up normally, so I tried to put it back into the cluster.
When it started, it immediately became the active node on RG0, but the reth interfaces remained down (with all the ge interfaces up), so I turned it off again. No preemption is configured, so the interfaces remained active on the other node, node1.
Afterwards I discovered that RG0 on node1 shows an error (GR). This is probably why node0 took mastership when I plugged it back in.
Now node0 is turned off, I have this GR (GRES monitoring) error and the firewall is working.
I would like to take node0 back in charge, but first I want to clear this GR error.
When I check "show chassis cluster information detail" I can see gres-not-ready ....
Re: SRX cluster routing engine has GR Error gres-not-ready
node1 is not in a healthy state. I think this is because a split-brain scenario occurred. Also, the node1 RG0 priority is 255, which means there was a manual failover. Reset the value with "request chassis cluster failover reset redundancy-group 0". You have to reboot node1 to recover from the unhealthy state.
Reboot node1 first and power on node0 at the same time, so that downtime is reduced and the kernel state is not synced from node1 to node0.
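As a rough sketch, the steps above could look like this on the node1 console. The "user@srx>" prompt is a placeholder, the "#" lines are annotations rather than commands, and the exact output will depend on your Junos release, so treat this as an outline and not a verified procedure. Expect an outage while node1 reboots, since it is the only node currently forwarding traffic.

```
# Check the current state; priority 255 on RG0 indicates a manual failover
user@srx> show chassis cluster status
user@srx> show chassis cluster information detail

# Clear the manual failover flag on RG0
user@srx> request chassis cluster failover reset redundancy-group 0

# Reboot node1, and power on node0 at the same time
user@srx> request system reboot

# Once both nodes are back, confirm priorities, roles and reth state
user@srx> show chassis cluster status
user@srx> show chassis cluster interfaces
```

If RG0 still reports the GR error after both nodes are up, run "show chassis cluster information detail" again and compare it with the output from before the reboot.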
Thanks, Nellikka JNCIE x3 (SEC #321; SP #2839; ENT #790)