Ex4300 Virtual Chassis member issue

  • 1.  Ex4300 Virtual Chassis member issue

    Posted 01-29-2016 13:46

    Hi everyone! It's my first time configuring Juniper devices and I would like to ask for some help, if I may.

     

    Scenario: We have an SRX240 HA cluster connected to two EX4300s in a Virtual Chassis. I attached the topology.

     

    Issue: When the member1 switch is the Master, I can ping from router to switch, router to server, and vice versa, so basically no problem at all. But when the member0 switch becomes the Master, I cannot ping from router to switch, router to server, or vice versa. Pings also time out (RTO) when member0 is down.

     

    Here is the virtual chassis status:

     

    root@EX-4300# run show virtual-chassis

    Virtual Chassis ID: c14c.9326.a5d7
    Virtual Chassis Mode: Enabled
                                                    Mstr           Mixed Route Neighbor List
    Member ID  Status   Serial No    Model          prio  Role      Mode  Mode ID  Interface
    0 (FPC 0)  Prsnt    PE3715020154 ex4300-48t     255   Backup       N  VC   1  vcp-255/1/3
    1 (FPC 1)  Prsnt    PE3713320098 ex4300-48t     255   Master*      N  VC   0  vcp-255/1/0

    Member ID for next new member: 2 (FPC 2)

     

    root@EX-4300# run show virtual-chassis vc-port
    fpc0:
    --------------------------------------------------------------------------
    Interface   Type              Trunk  Status       Speed        Neighbor
    or                             ID                 (mbps)       ID  Interface
    PIC / Port
    1/3         Configured         -1    Up           40000        1   vcp-255/1/0

    fpc1:
    --------------------------------------------------------------------------
    Interface   Type              Trunk  Status       Speed        Neighbor
    or                             ID                 (mbps)       ID  Interface
    PIC / Port
    1/0         Configured         -1    Up           40000        0   vcp-255/1/3

     

     

    Here are the interfaces between SRX and EX:

     

    SRX-A ge-0/0/14 ---> EX1 ge-0/0/46

    SRX-A ge-0/0/15 ---> EX2 ge-1/0/46

    SRX-B ge-5/0/14 ---> EX1 ge-0/0/47

    SRX-B ge-5/0/15 ---> EX2 ge-1/0/47

     

    EX1 ge-0/0/10 ---> Server

    EX2 ge-1/0/10 ---> Server

     

    ISP ---> SRX-A ge-0/0/7

     

    root@EX-4300# run show interfaces terse | match ae*
    Interface               Admin Link Proto    Local                 Remote
    ge-0/0/10.0             up    up   aenet    --> ae3.0
    ge-0/0/46.0             up    up   aenet    --> ae0.0
    ge-0/0/47.0             up    up   aenet    --> ae0.0
    ge-1/0/10.0             up    up   aenet    --> ae3.0
    ge-1/0/46.0             up    up   aenet    --> ae1.0
    ge-1/0/47.0             up    up   aenet    --> ae1.0
    ae0                     up    up
    ae0.0                   up    up   eth-switch
    ae1                     up    up
    ae1.0                   up    up   eth-switch
    ae3                     up    up
    ae3.0                   up    up   eth-switch

     

    root@SRX-B> show interfaces terse | match reth*
    ge-0/0/7.0              up    up   aenet    --> reth2.0
    ge-0/0/14.0             up    up   aenet    --> reth0.0
    ge-0/0/15.0             up    up   aenet    --> reth1.0
    ge-5/0/7.0              up    down aenet    --> reth2.0
    ge-5/0/14.0             up    up   aenet    --> reth0.0
    ge-5/0/15.0             up    up   aenet    --> reth1.0
    reth0                   up    up
    reth0.0                 up    up   inet     10.10.0.1/24
    reth1                   up    up
    reth1.0                 up    up   inet     10.10.0.2/24
    reth2                   up    up
    reth2.0                 up    up   inet     210.4.118.114/29

     

    If you have any idea what could be wrong, or have any suggestions, I would be happy to try them out.



  • 2.  RE: Ex4300 Virtual Chassis member issue

    Posted 01-30-2016 07:11
    Hi,

    Have you tried enabling the no-split-detection statement?

    {master:0}[edit]
    root@switch# set virtual-chassis ?
    Possible completions:
    + apply-groups Groups from which to inherit configuration data
    + apply-groups-except Don't inherit configuration data from these groups
    > fast-failover Fast failover mechanism
    id Virtual chassis identifier, of type ISO system-id
    > mac-persistence-timer How long to retain MAC address when member leaves virtual chassis
    > member Member of virtual chassis configuration
    no-split-detection Disable split detection. This command is recommended to only be enabled in a 2 member setup
    preprovisioned Only accept preprovisioned members
    > traceoptions Global tracing options for virtual chassis

    {master:0}[edit]
    root@switch# set virtual-chassis no-split-detection
    root@switch# commit

    If it is already enabled, or if enabling it doesn't help, can you share the following with us?
    1- the related configuration
    2- the software versions


  • 3.  RE: Ex4300 Virtual Chassis member issue

    Posted 01-30-2016 10:22

    Hi Abed AL-R,

     

    Thank you for your reply.

     

    Yes, no-split-detection is already enabled. I have attached the full configuration.

     

    root@EX-4300# show virtual-chassis

    no-split-detection;
    member 0 {
        mastership-priority 255;
    }
    member 1 {
        mastership-priority 255;
    }

     

    Here are the versions:

    --------------------------------------------------

    root@EX-4300-GFL# run show system software

    fpc0:

    --------------------------------------------------------------------------

    Information for fips-mode-powerpc:

    Comment: JUNOS FIPS mode utilities [13.2X51-D26.2]

    Information for jdocs-ex:

    Comment: JUNOS Online Documentation [13.2X51-D26.2]

    Information for junos:

    Comment: JUNOS EX Software Suite [13.2X51-D26.2]

    Information for junos-ex-4300:

    Comment: JUNOS EX 4300 Software Suite [13.2X51-D26.2]

    Information for jweb-ex:

    Comment: JUNOS Web Management [13.2X51-D26.2]

    Information for py-base-powerpc:

    Comment: JUNOS py-base-powerpc [13.2X51-D26.2]

     

    fpc1:

    --------------------------------------------------------------------------

    Information for fips-mode-powerpc:

    Comment: JUNOS FIPS mode utilities [13.2X51-D26.2]

    Information for jdocs-ex:

    Comment: JUNOS Online Documentation [13.2X51-D26.2]

    Information for junos:

    Comment: JUNOS EX Software Suite [13.2X51-D26.2]

    Information for junos-ex-4300:

    Comment: JUNOS EX 4300 Software Suite [13.2X51-D26.2]

    Information for jweb-ex:

    Comment: JUNOS Web Management [13.2X51-D26.2]

    Information for py-base-powerpc:

    Comment: JUNOS py-base-powerpc [13.2X51-D26.2]

    Attachment(s)

    txt
    JuniperEx4300.txt   19 KB 1 version


  • 4.  RE: Ex4300 Virtual Chassis member issue

    Posted 01-30-2016 10:38
    Hi eaguilar,

    Thank you.

    Please reproduce the issue once more.
    Reboot the master device to force a failover to the backup device, then check /var/log/messages for any log entry of this kind:

    Nexthop index allocation failed


    Alternatively, please provide the output of show log messages after the test.


  • 5.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-01-2016 09:13

    Hi ,

     

    I can see some lines that contain

    (DELETE NEXTHOP) failed, err 7

     

    Although I'm not sure if that is related to what you mentioned.

     

    I have attached the logs from the failover point. Thank you for your help.

    Attachment(s)

    txt
    ex4300-log.txt   61 KB 1 version


  • 6.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-02-2016 01:24

    Hi,

     

    Sorry for the late answer.

     

    I'll check the output you attached and then update you.



  • 7.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-02-2016 06:03
    Hi,

    Please try this command, and let me know if the issue still exists:

    set system processes ethernet-connectivity-fault-management disable


  • 8.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-02-2016 09:35

    Thank you

     

    Unfortunately, the same issue exists after entering

    "set system processes ethernet-connectivity-fault-management disable"

     

    I also did the same process of failover as before.



  • 9.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-02-2016 11:31
    Hi,

    Does the server receive an IP address after the failover?
    Apparently this issue isn't what I expected. There is a small chance that it is version related, but I'm not sure.

    I recognized some of the error logs, such as:

    CHASSISD_VCHASSIS_MEMBER_OP_NOTICE: Member change: vc delete of member 1
    which is the removal of member 1 from the VC.

    And:
    pfe_listener_disconnect: conn dropped

    As a precaution, you should contact JTAC and get some information on the error. Otherwise, an expert on this forum may be able to answer this.
    If you manage to open a JTAC ticket, I think they will ask you for further logs from the FPC that is recording the VC disconnection.

    Please update me about the solution.


  • 10.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-02-2016 12:03

    Hi ,

     

    The server is on static IP. The server is pinging the router and when I reboot FPC1, the ping is continuous. But when I reboot FPC0, the ping connection fails and only comes back after the switch has completed its boot.

     

    I will check if there is a new version and try to update it.

     

    Thank you for your help!



  • 11.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-08-2016 13:11

    Hi again,

     

    The issue still exists even after the software update from 13.2 to 14.1.

     

    It seems frustrating as I think I've configured it right. Anyway thanks for the help.



  • 12.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-09-2016 17:05

    hi eaguilar,

     

    Have you checked the supported LACP settings between the SRX and EX, as per http://kb.juniper.net/InfoCenter/index?page=content&id=KB22474&actp=search ?

     

    Let me know how you go.

     

    cheers,

    Alex



  • 13.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-10-2016 08:13

    Hi Alex,

     

    Yes, I have tried both with LACP and without LACP before, but the result is the same.

     

    Here are the interfaces between SRX and EX:

     

    SRX-A ge-0/0/14 ---> EX1 ge-0/0/46

    SRX-A ge-0/0/15 ---> EX2 ge-1/0/46

    SRX-B ge-5/0/14 ---> EX1 ge-0/0/47

    SRX-B ge-5/0/15 ---> EX2 ge-1/0/47

     

    Since the EX switches are in a Virtual Chassis, I assume that's how they should be connected with regard to the link you provided.



  • 14.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-10-2016 13:50

    hi eaguilar,

    The way I see it, based on your setup:

    - You should not have LACP configured on either the EX chassis or the SRX cluster.

    - When EX member 1 is DOWN, reth1 is DOWN, so you won't be able to ping 10.10.0.2, but you can still ping 10.10.0.1 because reth0 remains UP.

    - Your reth0 should contain (ge-0/0/14 & ge-5/0/15), because you want reth0 to remain UP when EX member 1 is DOWN.

    - The SRX cluster is UP, so I assume you are using direct links between the SRXs?

    If you believe you have exhausted all config setups, I would troubleshoot L2 on the switch by baselining before and after EX member 1 is DOWN.
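
    One way to take that before/after baseline, as a rough sketch: run the same handful of operational commands on the EX Virtual Chassis while both members are up, run them again once member 1 is down, and compare the two captures. These are standard Junos show commands; the choice of which ones matter most for this setup is an assumption.

    ```
    # Run once with both members up, then again with member 1 down, and diff the output
    show virtual-chassis
    show interfaces terse | match "ae|ge-"
    show lacp interfaces
    show ethernet-switching table
    show arp no-resolve
    ```

    The main thing to look for is which port the SRX reth MAC addresses are learned on in each state, and whether that matches the link the SRX is actually forwarding on.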



  • 15.  RE: Ex4300 Virtual Chassis member issue

     
    Posted 02-10-2016 18:35

    eaguilar, I've been following this thread.  I believe your config, at least on the EX side, is improper.  On the EX4300 none of the interfaces should be members of an AE; they should all be single links.  The Reth config on the SRXs takes care of making sure traffic goes across the proper links.  This does assume no AE config on the SRX side, which from the config snippet you sent appears to be the case.  I would start there, and then see where you are at.

     

    Refer to this doc: http://kb.juniper.net/InfoCenter/index?page=content&id=KB22474&actp=search&smlogin=true

     

    Forget about the LACP config part; it is not needed.  You have configured it like the top portion, which is incorrect.  Take out the EX ae1 portion of the example and it would then be correct.  And since there is no AE, there is no LACP.  Forget about the rest of the article, as that discusses how to properly configure things if you actually want multiple links active in an AE between EX and SRX.  A diagram with this article may have helped.  Just FYI, if you want AEs on the SRX to be part of your Reth 0 and Reth 1, you would actually need 4 x AEs on the EX side.
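
    As a rough sketch of what taking the AE out on the EX side could look like, using the SRX-facing ports listed earlier in the thread (ge-0/0/46, ge-0/0/47, ge-1/0/46, ge-1/0/47).  The VLAN name SRX-LAN is a placeholder and is not taken from the attached config, and the access-mode lines assume the ELS-style syntax used on the EX4300; adjust them to whatever ae0 and ae1 were carrying before.

    ```
    # Remove the four SRX-facing ports from ae0/ae1 so each one is a plain single link
    delete interfaces ge-0/0/46 ether-options 802.3ad
    delete interfaces ge-0/0/47 ether-options 802.3ad
    delete interfaces ge-1/0/46 ether-options 802.3ad
    delete interfaces ge-1/0/47 ether-options 802.3ad
    delete interfaces ae0
    delete interfaces ae1

    # Put each port back into the VLAN directly (SRX-LAN is a hypothetical name)
    set interfaces ge-0/0/46 unit 0 family ethernet-switching interface-mode access
    set interfaces ge-0/0/46 unit 0 family ethernet-switching vlan members SRX-LAN
    set interfaces ge-0/0/47 unit 0 family ethernet-switching interface-mode access
    set interfaces ge-0/0/47 unit 0 family ethernet-switching vlan members SRX-LAN
    set interfaces ge-1/0/46 unit 0 family ethernet-switching interface-mode access
    set interfaces ge-1/0/46 unit 0 family ethernet-switching vlan members SRX-LAN
    set interfaces ge-1/0/47 unit 0 family ethernet-switching interface-mode access
    set interfaces ge-1/0/47 unit 0 family ethernet-switching vlan members SRX-LAN
    ```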

     

    Questions, just respond to this thread.  I have not yet found good documentation, IMHO, regarding how to configure this fairly common connectivity between SRX cluster and EX VC, at least not one with a diagram that explains things well.

     

    Good luck.



  • 16.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-11-2016 10:25

    Hi 

     

     

     



  • 17.  RE: Ex4300 Virtual Chassis member issue

     
    Posted 02-11-2016 10:42

    Something is wrong, obviously!  The Reth works like active/backup.  The secondary node keeps the interface up, but never sends and drops everything it receives.  So you cannot run an AE from the EX to the SRX Reth (think of a single Reth), as the EX AE will always think both links are active, while in fact one link drops everything.

     

    The EX will learn MAC and IP/ARP entries on the active link only, so it knows to use only the active link in the Reth.  This is somewhat similar to how RTG works on EX.  Maybe send your complete EX config without the AEs configured.  A Reth of 2 physical links, 1 active on the primary node and 1 in standby on the secondary node, would have no LACP, as it is NOT an AE to start with.

     

    What does show chassis cluster status look like on the SRX Cluster?  SRX Cluster config is right, yes?

     

    You should really troubleshoot this with Juniper TAC, I'd say call in under SRX not EX.



  • 18.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-11-2016 15:21

    Hi 

     

    Thank you very much for your help. I am trying to understand the situation based on what you have all said in order to resolve this issue.

     

    So here is the chassis status of our SRX. The reth2/RG2 is used for testing and will be for our ISP. So reth0 and reth1 are the connections from SRX to EX. LACP is also not configured on any devices. As far as I understand your reply, I think it didn't work with non-AE EX interfaces because the routing engines of our SRX cluster are running active/active.

     

    Routing Engine 0 status:
    State Online Master
    Temperature 38 degrees C / 100 degrees F
    CPU Temperature 34 degrees C / 93 degrees F

    node1:
    --------------------------------------------------------------------------
    Routing Engine 0 status:
    State Online Master
    Temperature 38 degrees C / 100 degrees F
    CPU Temperature 37 degrees C / 98 degrees F

    --------------------------------------------------------------------------------------

    root@SRX-B# run show chassis cluster status

    Cluster ID: 1
    Node     Priority  Status     Preempt  Manual  Monitor-failures

    Redundancy group: 0 , Failover count: 0
    node0    100       primary    no       no      None
    node1    50        secondary  no       no      None

    Redundancy group: 1 , Failover count: 5
    node0    50        secondary  yes      no      None
    node1    100       primary    yes      no      None

    Redundancy group: 2 , Failover count: 1
    node0    50        secondary  yes      no      None
    node1    100       primary    yes      no      None

    -------------------------------------------------------------------------

    root@SRX-B# run show chassis cluster interfaces
    Control link status: Up

    Control interfaces:
    Index Interface Status Internal-SA
    0 fxp1 Up Disabled

    Fabric link status: Up

    Fabric interfaces:
    Name Child-interface Status
    (Physical/Monitored)
    fab0 ge-0/0/2 Up / Up
    fab0
    fab1 ge-5/0/2 Up / Up
    fab1

    Redundant-ethernet Information:
    Name Status Redundancy-group
    reth0 Up 1
    reth1 Up 1
    reth2 Up 2
    reth3 Down Not configured

    Redundant-pseudo-interface Information:
    Name Status Redundancy-group
    lo0 Up 0

    Interface Monitoring:
    Interface Weight Status Redundancy-group
    ge-5/0/15 255 Up 1
    ge-0/0/15 255 Up 1
    ge-5/0/14 255 Up 1
    ge-0/0/14 255 Up 1
    ge-5/0/7 255 Up 2
    ge-0/0/7 255 Up 2

    -----------------------------------------------------------------

    Here is how we setup the reths:

     

    set interfaces ge-0/0/7 gigether-options redundant-parent reth2
    set interfaces ge-0/0/14 gigether-options redundant-parent reth0
    set interfaces ge-0/0/15 gigether-options redundant-parent reth1
    set interfaces ge-5/0/7 gigether-options redundant-parent reth2
    set interfaces ge-5/0/14 gigether-options redundant-parent reth0
    set interfaces ge-5/0/15 gigether-options redundant-parent reth1

    set interfaces reth0 redundant-ether-options redundancy-group 1
    set interfaces reth0 redundant-ether-options minimum-links 1
    set interfaces reth0 redundant-ether-options lacp active
    set interfaces reth0 unit 0 family inet address 10.10.0.1/24
    set interfaces reth1 redundant-ether-options redundancy-group 1
    set interfaces reth1 redundant-ether-options minimum-links 1
    set interfaces reth1 redundant-ether-options lacp active
    set interfaces reth1 unit 0 family inet address 10.10.0.2/24
    set interfaces reth2 vlan-tagging
    set interfaces reth2 redundant-ether-options redundancy-group 2
    set interfaces reth2 unit 0 family inet address 210.4.118.114/29
    deactivate interfaces reth2 unit 0
    set interfaces reth2 unit 60 vlan-id 60
    set interfaces reth2 unit 60 family inet address 210.4.118.114/29



  • 19.  RE: Ex4300 Virtual Chassis member issue
    Best Answer

     
    Posted 02-11-2016 16:20

    You are running your SRX cluster as Active-Active?  This is probably why nothing makes sense.  If you run the SRX A/A, then because the SRX is a stateful FW your transmit and receive paths MUST be the same or else the SRX drops the packet.  A/A is a nice idea, but hardly anyone uses it in the real world due to exactly what you are seeing.  Trying to troubleshoot A/A is extremely difficult.  Can you set your SRX cluster as A/P (standby), also take out the EX AE config, and then see what happens?
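
    For reference, a minimal sketch of moving the cluster toward A/P, based on the cluster status posted above, where RG0 is primary on node0 but RG1 and RG2 are primary on node1.  The priority values are illustrative only; the point is simply that every redundancy group prefers the same node.  The two delete lines remove the lacp statements shown on reth0 and reth1, on the assumption that no AE is left on the EX side.

    ```
    # Make node0 the preferred node for RG1 and RG2, matching RG0 (values are illustrative)
    set chassis cluster redundancy-group 1 node 0 priority 100
    set chassis cluster redundancy-group 1 node 1 priority 50
    set chassis cluster redundancy-group 2 node 0 priority 100
    set chassis cluster redundancy-group 2 node 1 priority 50

    # No AE on the EX side means no LACP on the reths
    delete interfaces reth0 redundant-ether-options lacp
    delete interfaces reth1 redundant-ether-options lacp
    ```

    Since preempt shows as enabled on RG1 and RG2, expect those groups to move to node0 after the commit; show chassis cluster status should then list node0 as primary for all three groups.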

     

    Did you decide on this design yourself or did you work with whomever sold the Juniper equipment?



  • 20.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-15-2016 16:31

     

     

     



  • 21.  RE: Ex4300 Virtual Chassis member issue

     
    Posted 02-15-2016 16:39

    I think you'll have MUCH better results, as this design is in use at hundreds of sites worldwide.  I'll wait to hear back with your results.

     

    Good luck.



  • 22.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-16-2016 15:08

    Hey, it worked! The first problem is solved. The devices can now ping each other in both directions, whichever switch is the master. Now I know the issue was with the reth and AE interfaces. I have also converted the SRX cluster to active/passive.

     

    Only one issue remains now: there is still downtime when I unplug the member0 switch, and connectivity only comes back after it has restarted. However, there is no issue with connectivity when I unplug the member1 switch. So I was thinking maybe it is still related to the reths, since reth0 is connected to EX member0 and reth1 is connected to EX member1.

     

    Anyway, there is progress now with our setup. I'll try to troubleshoot further on this downtime. Thank you very much!



  • 23.  RE: Ex4300 Virtual Chassis member issue

     
    Posted 02-16-2016 17:23

    Not sure of your exact setup now, but if you are still having issues, I suggest you send a diagram.  You do realize that, based upon your original diagram, SRX A (the one on the left with the ISP connection) is a single point of failure, no matter how redundant the bottom section is made?



  • 24.  RE: Ex4300 Virtual Chassis member issue

    Posted 02-17-2016 08:06

    Actually, I got it working now! I've reconfigured the reth interfaces and the failover issue is gone. I can now unplug either of the switches without downtime. I can say that the issue is resolved. Regarding the ISP, yes, it is currently on 1 router only, though I plan to add another switch on top of the routers to make it redundant.

     

    Thank you very much for all your help!