Switching

  • 1.  EX2200 LACP issue

    Posted 04-21-2015 05:57

    Hi Guys,

     

    I've got a setup with two EX2200s in a Virtual Chassis. Above that is an HA stack of two Fortinet Fortigate 100Ds, each with a 4x1G LACP trunk to the switches.

     

    AE1 has the following ports to Firewall1:

    ge-0/0/0

    ge-0/0/1

    ge-1/0/0

    ge-1/0/1

     

    AE2 has the following ports to Firewall 2:

    ge-0/0/2

    ge-0/0/3

    ge-1/0/2

    ge-1/0/3

     

    Ports 20 through 23 on each switch are for the Virtual chassis.

     

    My customers have been complaining that the services behind the switches are not stable. After some investigation, I see the following error messages repeatedly in the messages log:

     

    Apr 21 14:53:32  dc-ede-2c6.9-sw1 lacpd[3534]: LACPD_TIMEOUT: ge-1/0/2: lacp current while timer expired current Receive State: CURRENT
    Apr 21 14:53:35  dc-ede-2c6.9-sw1 /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd ge-1/0/0 - ATTACHED state - acting as standby link
    Apr 21 14:53:35  dc-ede-2c6.9-sw1 /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd ge-1/0/2 - DETACHED state - will not carry traffic
    Apr 21 14:53:36  dc-ede-2c6.9-sw1 lacpd[3534]: LACPD_TIMEOUT: ge-1/0/0: lacp current while timer expired current Receive State: CURRENT
    Apr 21 14:53:39  dc-ede-2c6.9-sw1 /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd ge-1/0/0 - DETACHED state - will not carry traffic
    Apr 21 14:53:46  dc-ede-2c6.9-sw1 /kernel: KERN_ARP_DUPLICATE_ADDR: duplicate IP address 10.0.2.11! sent from address: 64:64:9b:10:43:7f (error count = 4579)
    Apr 21 14:54:01  dc-ede-2c6.9-sw1 /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd ge-1/0/2 - ATTACHED state - acting as standby link
    Apr 21 14:54:02  dc-ede-2c6.9-sw1 lacpd[3534]: LACPD_TIMEOUT: ge-1/0/2: lacp current while timer expired current Receive State: CURRENT
    Apr 21 14:54:05  dc-ede-2c6.9-sw1 /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd ge-1/0/0 - ATTACHED state - acting as standby link
    Apr 21 14:54:05  dc-ede-2c6.9-sw1 /kernel: KERN_LACP_INTF_STATE_CHANGE: lacp_update_state_userspace: cifd ge-1/0/2 - DETACHED state - will not carry traffic
    Apr 21 14:54:06  dc-ede-2c6.9-sw1 lacpd[3534]: LACPD_TIMEOUT: ge-1/0/0: lacp current while timer expired current Receive State: CURRENT

     

    At the same time, I see the following messages in the chassisd log:

    Apr 18 14:21:13  op 2 for ifd ge-1/0/0
    Apr 18 14:21:33  op 2 for ifd ge-1/0/2
    Apr 18 14:21:43  op 2 for ifd ge-1/0/0
    Apr 18 14:22:03  op 2 for ifd ge-1/0/2
    Apr 18 14:22:13  op 2 for ifd ge-1/0/0
    Apr 18 14:22:33  op 2 for ifd ge-1/0/2
    Apr 18 14:22:43  op 2 for ifd ge-1/0/0
    Apr 18 14:23:03  op 2 for ifd ge-1/0/2
    Apr 18 14:23:13  op 2 for ifd ge-1/0/0
    Apr 18 14:23:33  op 2 for ifd ge-1/0/2
    Apr 18 14:23:43  op 2 for ifd ge-1/0/0

     

    Can anybody explain to me what is going on here? Is the Virtual Chassis using ports 1/0/0 and 1/0/2 for something? My virtual chassis configuration is as follows:

     

    virtual-chassis {
        preprovisioned;
        no-split-detection;
        member 0 {
            role routing-engine;
            serial-number <removed>;
        }
        member 1 {
            role routing-engine;
            serial-number <removed>;
        }
    }

     

     

     

     

     



  • 2.  RE: EX2200 LACP issue

    Posted 04-21-2015 07:10

    It sounds like LACP is timing out on some of the links for some reason, causing them to be dropped from the AE.  I would check the CPU utilization on both boxes to make sure they aren't too "busy" to respond to the LACP frames, and possibly also try switching to LACP slow (if you are using fast).
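
    A minimal sketch of those two checks on the Juniper side, assuming the bundles are named ae1 and ae2 as in the first post (the Fortigate side needs its own equivalent checks):

    ```
    # Operational mode: CPU load per Virtual Chassis member and LACP state of one bundle
    show chassis routing-engine
    show lacp interfaces ae1

    # Configuration mode: use the slow LACP periodic rate (30 s) instead of fast (1 s)
    set interfaces ae1 aggregated-ether-options lacp periodic slow
    set interfaces ae2 aggregated-ether-options lacp periodic slow
    commit
    ```

    Both ends should end up using the same rate, so it is worth comparing the result with what the firewall reports for its own LACP speed.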

     

    Ron



  • 3.  RE: EX2200 LACP issue

    Posted 04-21-2015 07:10

    And when I say "both boxes", I mean both the firewalls, as well as the switches.

     

    Ron



  • 4.  RE: EX2200 LACP issue

    Posted 04-21-2015 23:10

    Hi Ronf,

     

    Thank you for your reply. I've checked the CPU on both ends and neither box is very busy. The CPU usage on the Fortigate is 1%, and the Juniper is 91% idle on slot 0 and 79% idle on slot 1.

     

    I've found another strange thing. I am connecting to my switch stack using the IP address on me0, and this connection drops quite often (every minute or so). When I log back in, I see the following message:

     

    warning: This chassis is operating in a non-master role as part of a virtual-chassis (VC) system.
    warning: Use of interactive commands should be limited to debugging and VC Port operations.
    warning: Full CLI access is provided by the Virtual Chassis Master (VC-M) chassis.
    warning: The VC-M can be identified through the show virtual-chassis status command executed at this console.
    warning: Please logout and log into the VC-M to use CLI.

    And when I look at the virtual-chassis status, the member I'm logged in to has switched from master to backup:

    Preprovisioned Virtual Chassis
    Virtual Chassis ID: b0fc.cc5b.afb9
    Virtual Chassis Mode: Enabled
                                               Mstr           Mixed Neighbor List
    Member ID  Status   Serial No    Model     prio  Role      Mode ID  Interface
    0 (FPC 0)  Prsnt    CW0214220283 ex2200-24t-4g 129 Backup*   NA  1  vcp-255/0/20
                                                                     1  vcp-255/0/21
                                                                     1  vcp-255/0/22
                                                                     1  vcp-255/0/23
    1 (FPC 1)  Prsnt    CW0214220365 ex2200-24t-4g 129 Master    NA  0  vcp-255/0/20
                                                                     0  vcp-255/0/21
                                                                     0  vcp-255/0/22
                                                                     0  vcp-255/0/23
    
    

     

    Something is going wrong here, but I can't find it.

     

     

    I also read that I have to set the link speed of the ae interfaces, but that didn't make the error messages go away.
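
    A sketch of what such a setting looks like in Junos set syntax, assuming the bundles are ae1 and ae2 with 1G member links as described in the first post (the exact statements used here were not posted):

    ```
    set interfaces ae1 aggregated-ether-options link-speed 1g
    set interfaces ae2 aggregated-ether-options link-speed 1g
    ```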

     

    When I look at the LACP configuration on my Fortigate router, I get the output below. It says that the LACP speed is slow, but you can see that there is a difference in the actor state of port1 compared to the rest of the members. Port 1 has frame collection and frame distribution disabled.

     

    LACP flags: (A|P)(S|F)(A|I)(I|O)(E|D)(E|D)
    (A|P) - LACP mode is Active or Passive
    (S|F) - LACP speed is Slow or Fast
    (A|I) - Aggregatable or Individual
    (I|O) - Port In sync or Out of sync
    (E|D) - Frame collection is Enabled or Disabled
    (E|D) - Frame distribution is Enabled or Disabled
    
    status: up
    ports: 4
    link-up-delay: 50ms
    min-links: 0
    ha: master
    distribution algorithm: L4
    LACP mode: active
    LACP speed: slow
    LACP HA: enable
    aggregator ID: 1
    actor key: 17
    actor MAC address: 08:5b:0e:83:b7:f4
    partner key: 3
    partner MAC address: 64:64:9b:10:57:c0
    
    slave: port1
      link status: up
      link failure count: 0
      permanent MAC addr: 08:5b:0e:83:b7:f4
      LACP state: negotiating
      actor state: ASAIDD
      actor port number/key/priority: 1 17 255
      partner state: ASIODD
      partner port number/key/priority: 0 1 255
      partner system: 65535 00:00:00:00:00:00
      aggregator ID: 2
      speed/duplex: 1000 1
      RX state: DEFAULTED 5
      MUX state: ATTACHED 3
    
    slave: port2
      link status: up
      link failure count: 0
      permanent MAC addr: 08:5b:0e:83:b7:f5
      LACP state: established
      actor state: ASAIEE
      actor port number/key/priority: 2 17 255
      partner state: AFAIEE
      partner port number/key/priority: 3 3 127
      partner system: 127 64:64:9b:10:57:c0
      aggregator ID: 1
      speed/duplex: 1000 1
      RX state: CURRENT 6
      MUX state: COLLECTING_DISTRIBUTING 4
    
    slave: port3
      link status: up
      link failure count: 0
      permanent MAC addr: 08:5b:0e:83:b7:f6
      LACP state: established
      actor state: ASAIEE
      actor port number/key/priority: 3 17 255
      partner state: AFAIEE
      partner port number/key/priority: 9 3 127
      partner system: 127 64:64:9b:10:57:c0
      aggregator ID: 1
      speed/duplex: 1000 1
      RX state: CURRENT 6
      MUX state: COLLECTING_DISTRIBUTING 4
    
    slave: port4
      link status: up
      link failure count: 0
      permanent MAC addr: 08:5b:0e:83:b7:f7
      LACP state: established
      actor state: ASAIEE
      actor port number/key/priority: 4 17 255
      partner state: AFAIEE
      partner port number/key/priority: 4 3 127
      partner system: 127 64:64:9b:10:57:c0
      aggregator ID: 1
      speed/duplex: 1000 1
      RX state: CURRENT 6
      MUX state: COLLECTING_DISTRIBUTING 4

     

    All ports in the LACP trunks are configured exactly the same on both the Fortigate and the Juniper hardware. Could this be a faulty cable?

     



  • 5.  RE: EX2200 LACP issue

    Posted 04-24-2015 05:39

    When in a virtual-chassis, you should assign the management IP address to vme0 rather than me0; vme0 always redirects to the current master routing-engine.  There is definitely something amiss with port 1 in the capture you posted.  I would try swapping interfaces as well as cables to rule that out for sure.
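
    A sketch of moving the management address, using 192.0.2.10/24 as a placeholder (it is not an address from this thread):

    ```
    # Configuration mode on the Virtual Chassis master
    delete interfaces me0 unit 0 family inet
    set interfaces vme unit 0 family inet address 192.0.2.10/24
    commit
    ```

    If you are connected through the me0 address while doing this, expect that session to drop at commit, so the console port is the safer place to make the change.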

     

    Ron



  • 6.  RE: EX2200 LACP issue

     
    Posted 04-24-2015 14:09
    Can you share the config of the Virtual Chassis ports? I'm guessing there is something wrong there.
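
    As far as I know, on the EX2200 the VC ports are set with operational commands and do not appear in the configuration file, so output from commands like these is probably the most useful thing to post (a sketch using standard Junos operational commands):

    ```
    show virtual-chassis vc-port
    show virtual-chassis status
    ```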