Routing


Seeing lots of ospfTxRetransmit SNMP Traps

  • 1.  Seeing lots of ospfTxRetransmit SNMP Traps

     
    Posted 01-06-2013 15:42

    Hi,

    I'm getting lots of OSPF-MIB:ospfTxRetransmit traps sent to my SNMP server by one of my EX4200 switches.

     


    From what I understand, this trap means that an LSA was sent but no ACK was received, so it had to be retransmitted.  That sounds like a problem, but I can't see any issues in my network.

     

    The topology is:

    EX4200---------Ae0.91----------EX4200

     

    Both EX switches are on Junos 11.4r2.14

     

    OSPF neighbours have been up and established for 4 weeks.

    show ospf neighbor instance Internet interface ae0.91 extensive    
    Address          Interface              State     ID               Pri  Dead
    10.255.255.46    ae0.91                 Full      10.255.0.2       128    36
      Area 0.0.0.0, opt 0x52, DR 10.255.255.46, BDR 10.255.255.45
      Up 4w3d 17:42:50, adjacent 4w3d 17:42:50
        Link state retransmission list:
    
          Type      LSA ID           Adv rtr          Seq
    
         Network   10.255.255.46    10.255.0.12      0x800834f1
    
       Topology default (ID 0) -> Bidirectional

     Any thoughts? 



  • 2.  RE: Seeing lots of ospfTxRetransmit SNMP Traps

    Posted 01-07-2013 00:52

    Hi,

    1. Are you getting ospfTxRetransmit traps continuously?

    2. Are any OSPF routes flapping?

    You can get a better understanding of the OSPF traps from this link:

    http://kb.juniper.net/InfoCenter/index?page=content&id=KB23854&cat=M320&actp=LIST

    If you are not seeing either of the above problems, I think you should raise a case with JTAC to have it checked.

     

     

     



  • 3.  RE: Seeing lots of ospfTxRetransmit SNMP Traps

     
    Posted 01-07-2013 03:16

    Hi Luca,

     

    It would be good to look at the show ospf statistics counters. Additionally, please verify whether there are any errors on the AE interface or its member links.

    jtb 



  • 4.  RE: Seeing lots of ospfTxRetransmit SNMP Traps

     
    Posted 01-08-2013 20:55

    Here is the OSPF statistics.

    Not seeing any errors in the OSPF DB, and the route table is stable - so no flapping routes.

     

     show ospf statistics instance Internet    
    
    Packet type             Total                  Last 5 seconds
                       Sent      Received        Sent      Received
       Hello            212           109           0             0
         DbD             18             8           0             0
       LSReq              3             2           0             0
    LSUpdate        2857270       2505278           6             5
       LSAck        2019265        978918           5             2
    
    DBDs retransmitted     :                    6, last 5 seconds :          0
    LSAs flooded           :               256899, last 5 seconds :          1
    LSAs flooded high-prio :              2194123, last 5 seconds :          4
    LSAs retransmitted     :               396888, last 5 seconds :          1
    LSAs transmitted to nbr:                13096, last 5 seconds :          0
    LSAs requested         :                   13, last 5 seconds :          0
    LSAs acknowledged      :              2295510, last 5 seconds :          5
    
    Flood queue depth      :               0
    Total rexmit entries   :               2
    db summaries           :               0
    lsreq entries          :               0
    
    Receive errors:
      None
    

     



  • 5.  RE: Seeing lots of ospfTxRetransmit SNMP Traps

     
    Posted 01-09-2013 00:58

    Hi Luca,

     

    It would be good to compare the state from both sides. I assume the statistics come from the EX that sends the traps. Please notice this line:

     

    [...]
    LSAs retransmitted     :               396888, last 5 seconds :          1

    That is a big number. Because the LSAs are simply retransmitted, you may not see any routing problems. I suspect a lower-layer problem (check the interface statistics for the AE and its members).

    I just noticed you are using routing instances. Is that on both sides? Are any other instances on the EX running OSPF?

    How big is the OSPF domain (nodes/routes)? There appear to be many LSUpdates/LSAs (high-priority, i.e. changed LSAs).

    If the OSPF network is small, your OSPF is not stable.

     

    Collect two samples of show ospf statistics from both devices. Wait ~30s before taking the second sample to see the rate of change. You may need to enable OSPF traceoptions to see the updates.

    jtb



  • 6.  RE: Seeing lots of ospfTxRetransmit SNMP Traps

     
    Posted 01-09-2013 13:41

    It seems this could be a bug related to 11.4r2.14

    Or perhaps it was just the reboot...

     

    I have upgraded one of the switches to 11.4r5.5 and there haven't been any retransmitted LSAs since.

     



  • 7.  RE: Seeing lots of ospfTxRetransmit SNMP Traps

     
    Posted 01-09-2013 14:25

    Hmm, it appears I may have spoken too soon.  I upgraded the other switch to 11.4r5 and the retransmits have come back, although not as frequently as last time.

     

    So the topology is two EX4200 switches (not VC) connected via an AE port which is in a routing instance on both devices.

    Config looks like this:

    SWITCH 1:
    show configuration routing-instances Internet protocols ospf 
    inactive: traceoptions {
        file ospf-internet-log;
        flag all;
    }
    external-preference 200;
    export mgmt-public-to-ospf;
    area 0.0.0.0 {
        interface ge-0/0/12.0;
        interface ae0.91;
        interface lo0.1 {
            passive;
        }
        interface vlan.96 {
            passive;
        }
        interface vlan.69 {
            passive;
    
    
    SWITCH2:
    show configuration routing-instances Internet protocols ospf 
    external-preference 200;
    area 0.0.0.0 {
        interface ge-0/0/12.0;
        interface ae0.91;
        interface lo0.1 {
            passive;
        }
        interface vlan.96 {
            passive;
        }
        interface vlan.69 {
            passive;
        }

     

    So ae0.91 is the interface used for OSPF in this routing instance.

     

    Here is the config from the interfaces:

    SWITCH1:
    show configuration interfaces ae0 
    vlan-tagging;
    aggregated-ether-options {
        link-speed 1g;
        lacp {
            active;
        }
    }
    unit 91 {
        vlan-id 91;
        family inet {
            address 10.255.255.45/30;
    
    
    
    SWITCH2:
    show configuration interfaces ae0     
    vlan-tagging;
    aggregated-ether-options {
        link-speed 1g;
        lacp {
            passive;
        }
    }
    unit 91 {
        vlan-id 91;
        family inet {
            address 10.255.255.46/30;

     

     

    Here are the stats

    SWITCH1:
    show ospf statistics instance Internet 
    
    Packet type             Total                  Last 5 seconds
                       Sent      Received        Sent      Received
       Hello              0             0           0             0
         DbD              0             0           0             0
       LSReq              0             0           0             0
    LSUpdate            914           877           3             3
       LSAck            709           340           2             1
    
    DBDs retransmitted     :                    0, last 5 seconds :          0
    LSAs flooded           :                  160, last 5 seconds :          0
    LSAs flooded high-prio :                  703, last 5 seconds :          2
    LSAs retransmitted     :                   70, last 5 seconds :          1
    LSAs transmitted to nbr:                    7, last 5 seconds :          0
    LSAs requested         :                    0, last 5 seconds :          0
    LSAs acknowledged      :                  830, last 5 seconds :          2
    
    Flood queue depth      :               0
    Total rexmit entries   :               2
    db summaries           :               0
    lsreq entries          :               0
    
    Receive errors:
      None
    
    
    
    
    SWITCH2:
    show ospf statistics instance Internet 
    
    Packet type             Total                  Last 5 seconds
                       Sent      Received        Sent      Received
       Hello              0             0           0             0
         DbD              0             0           0             0
       LSReq              0             0           0             0
    LSUpdate            835          1067           3             4
       LSAck             16           343           0             2
    
    DBDs retransmitted     :                    0, last 5 seconds :          0
    LSAs flooded           :                    4, last 5 seconds :          0
    LSAs flooded high-prio :                  656, last 5 seconds :          2
    LSAs retransmitted     :                    9, last 5 seconds :          0
    LSAs transmitted to nbr:                  166, last 5 seconds :          1
    LSAs requested         :                    0, last 5 seconds :          0
    LSAs acknowledged      :                   16, last 5 seconds :          0
    
    Flood queue depth      :               0
    Total rexmit entries   :               1
    db summaries           :               0
    lsreq entries          :               0
    
    Receive errors:
      None

    And again, 1 minute later:

    SWITCH1:
    show ospf statistics instance Internet    
    
    Packet type             Total                  Last 5 seconds
                       Sent      Received        Sent      Received
       Hello              0             0           0             0
         DbD              0             0           0             0
       LSReq              0             0           0             0
    LSUpdate           1017           971           6             5
       LSAck            788           381           4             2
    
    DBDs retransmitted     :                    0, last 5 seconds :          0
    LSAs flooded           :                  172, last 5 seconds :          1
    LSAs flooded high-prio :                  783, last 5 seconds :          4
    LSAs retransmitted     :                   79, last 5 seconds :          1
    LSAs transmitted to nbr:                    9, last 5 seconds :          0
    LSAs requested         :                    0, last 5 seconds :          0
    LSAs acknowledged      :                  916, last 5 seconds :          4
    
    Flood queue depth      :               0
    Total rexmit entries   :               1
    db summaries           :               0
    lsreq entries          :               0
    
    Receive errors:
      None
    
    
    
    SWITCH2:
    show ospf statistics instance Internet    
    
    Packet type             Total                  Last 5 seconds
                       Sent      Received        Sent      Received
       Hello              0             0           0             0
         DbD              0             0           0             0
       LSReq              0             0           0             0
    LSUpdate            906          1163           3             4
       LSAck             16           376           0             1
    
    DBDs retransmitted     :                    0, last 5 seconds :          0
    LSAs flooded           :                    4, last 5 seconds :          0
    LSAs flooded high-prio :                  710, last 5 seconds :          2
    LSAs retransmitted     :                    9, last 5 seconds :          0
    LSAs transmitted to nbr:                  183, last 5 seconds :          1
    LSAs requested         :                    0, last 5 seconds :          0
    LSAs acknowledged      :                   16, last 5 seconds :          0
    
    Flood queue depth      :               0
    Total rexmit entries   :               1
    db summaries           :               0
    lsreq entries          :               0
    
    Receive errors:
      None

     

    So it seems that only one switch (SWITCH1) has a much higher retransmission rate.

    It has become less frequent since I upgraded Junos, but it is still happening.
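The comparison between the two switches can be put into numbers. The following is a minimal Python sketch, not a Juniper tool: it pulls the cumulative "LSAs retransmitted" counter out of two pasted samples and divides the difference by the sampling interval. The function names are made up for illustration, and the 60-second interval is an assumption based on "1 minute later".

```python
import re


def lsa_retransmit_total(stats_text):
    """Return the cumulative 'LSAs retransmitted' counter from pasted CLI text."""
    match = re.search(r"LSAs retransmitted\s*:\s*(\d+)", stats_text)
    if match is None:
        raise ValueError("no 'LSAs retransmitted' line found")
    return int(match.group(1))


def retransmit_rate(first, second, interval_s):
    """Retransmitted LSAs per second between two samples taken interval_s apart."""
    delta = lsa_retransmit_total(second) - lsa_retransmit_total(first)
    return delta / interval_s


# Counter lines copied from the two samples posted in this thread.
samples = {
    "SWITCH1": (
        "LSAs retransmitted     :                   70, last 5 seconds :          1",
        "LSAs retransmitted     :                   79, last 5 seconds :          1",
    ),
    "SWITCH2": (
        "LSAs retransmitted     :                    9, last 5 seconds :          0",
        "LSAs retransmitted     :                    9, last 5 seconds :          0",
    ),
}

INTERVAL_S = 60.0  # assumption: "1 minute later" taken as exactly 60 seconds

rates = {name: retransmit_rate(a, b, INTERVAL_S) for name, (a, b) in samples.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} retransmitted LSAs per second")
```

With the numbers above, SWITCH1 works out to roughly 0.15 retransmitted LSAs per second, while SWITCH2 shows no new retransmissions over the same interval.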

     

    However, I'm not seeing any network instability and my route table is stable. I have no errors on the physical ports.

    I might log this with JTAC, as I can't see any issues.

    I also have OSPF running on both of these switches in the global route table, and I don't see any LSA retransmits there.



  • 8.  RE: Seeing lots of ospfTxRetransmit SNMP Traps

     
    Posted 01-09-2013 14:38

    Hi Luca,

     

    Could you show two samples of show ospf database so we can spot any unstable LSAs?

    Wait ~60 seconds between samples.

    jtb



  • 9.  RE: Seeing lots of ospfTxRetransmit SNMP Traps

     
    Posted 01-09-2013 19:24

    Here it is...

    Seems there are a couple of offending routes:

     

        OSPF database, Area 0.0.0.0
     Type       ID               Adv Rtr           Seq      Age  Opt  Cksum  Len 
    Router  *10.255.0.1       10.255.0.1       0x80000428  2007  0x22 0x3c0   96
    Router   10.255.0.2       10.255.0.2       0x80000406  1249  0x22 0xd575  96
    Router   10.255.0.11      10.255.0.11      0x800003fc  2239  0x22 0x7c17  60
    Router   10.255.0.12      10.255.0.12      0x8000056c  1290  0x22 0x7a74  60
    Router   10.255.0.13      10.255.0.13      0x800003fa  2425  0x22 0x7853  60
    Router   10.255.0.14      10.255.0.14      0x800003f0  2792  0x22 0x18aa  60
    Router   10.255.1.2       10.255.1.2       0x80000408   100  0x22 0xd124  84
    Network  10.255.254.9     10.255.1.2       0x800003f1   850  0x22 0xeb1a  32
    Network  10.255.254.13    10.255.1.2       0x800002b2  1601  0x22 0x53ed  32
    Network  10.255.255.42    10.255.0.11      0x80000008  1239  0x22 0xebe0  32
    Network  10.255.255.46    10.255.0.2       0x800010ef     4  0x22 0x9f43  32
    Network  10.255.255.46    10.255.0.12      0x800934b4     7  0x22 0x3a98  32
    Network  10.255.255.50    10.255.0.14      0x800003d8  1866  0x22 0xa539  32
    Network  10.255.255.53    10.255.0.11      0x800003ef   239  0x22 0x4d7d  32
    Network  10.255.255.57    10.255.0.12      0x80000518  2040  0x22 0xe0b8  32
    Network  10.255.255.62    10.255.0.12      0x80000024   540  0x22 0xfc9c  32
        OSPF AS SCOPE link state database
     Type       ID               Adv Rtr           Seq      Age  Opt  Cksum  Len 
    Extern   0.0.0.0          10.255.1.2       0x800003ee  2351  0x22 0x2b0a  36

    A few minutes later:

        OSPF database, Area 0.0.0.0
     Type       ID               Adv Rtr           Seq      Age  Opt  Cksum  Len 
    Router  *10.255.0.1       10.255.0.1       0x80000428  2007  0x22 0x3c0   96
    Router   10.255.0.2       10.255.0.2       0x80000406  1249  0x22 0xd575  96
    Router   10.255.0.11      10.255.0.11      0x800003fc  2239  0x22 0x7c17  60
    Router   10.255.0.12      10.255.0.12      0x8000056c  1290  0x22 0x7a74  60
    Router   10.255.0.13      10.255.0.13      0x800003fa  2425  0x22 0x7853  60
    Router   10.255.0.14      10.255.0.14      0x800003f0  2792  0x22 0x18aa  60
    Router   10.255.1.2       10.255.1.2       0x80000408   100  0x22 0xd124  84
    Network  10.255.254.9     10.255.1.2       0x800003f1   850  0x22 0xeb1a  32
    Network  10.255.254.13    10.255.1.2       0x800002b2  1601  0x22 0x53ed  32
    Network  10.255.255.42    10.255.0.11      0x80000008  1239  0x22 0xebe0  32
    Network  10.255.255.46    10.255.0.2       0x800010ef     4  0x22 0x9f43  32
    Network  10.255.255.46    10.255.0.12      0x800934b4     7  0x22 0x3a98  32
    Network  10.255.255.50    10.255.0.14      0x800003d8  1866  0x22 0xa539  32
    Network  10.255.255.53    10.255.0.11      0x800003ef   239  0x22 0x4d7d  32
    Network  10.255.255.57    10.255.0.12      0x80000518  2040  0x22 0xe0b8  32
    Network  10.255.255.62    10.255.0.12      0x80000024   540  0x22 0xfc9c  32
        OSPF AS SCOPE link state database
     Type       ID               Adv Rtr           Seq      Age  Opt  Cksum  Len 
    Extern   0.0.0.0          10.255.1.2       0x800003ee  2351  0x22 0x2b0a  36

     



  • 10.  RE: Seeing lots of ospfTxRetransmit SNMP Traps
    Best Answer

     
    Posted 01-10-2013 02:02

    Hi Luca,

     

    It looks like you have pasted the same LSDB info twice, so we can't see any changes.  Anyway, there are some interesting entries:

     

     Type       ID               Adv Rtr           Seq      Age  Opt  Cksum  Len
    [...]
    Network  10.255.255.46    10.255.0.2       0x800010ef     4  0x22 0x9f43  32
    Network  10.255.255.46    10.255.0.12      0x800934b4     7  0x22 0x3a98  32

     

    • There should be a single Network LSA in the LSDB per LAN segment, originated by the DR.
    • Here, two routers announce the same LSA ID (DR address = 10.255.255.46).
    • I guess 10.255.0.2 (OSPF router ID) is switch 2.
    • 10.255.0.12 (OSPF router ID) is an unknown router. It looks like it has an interface addressed 10.255.255.46 and re-floods its LSA frequently (0x934b4 = 603316 updates) each time it receives the LSA from 10.255.0.2.

    Find 10.255.0.12 and verify/fix the issue. If you like, we can look at the LSA details to verify what's being advertised.

    We would need show ospf database detail for the full LSDB, or at least show ospf database detail lsa-id XXX for the selected LSA IDs (XXX = 10.255.0.2, 10.255.0.12, 10.255.255.46).
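The check described in this post (one Network LSA ID advertised by two different routers) can be automated against pasted show ospf database output. This is a rough Python sketch under the assumption that the columns are laid out as in this thread; the function name is hypothetical and it is not part of any Juniper tooling. It also reproduces the sequence-number arithmetic, treating the offset from 0x80000000 as an approximate update count.

```python
from collections import defaultdict


def duplicate_network_lsas(lsdb_text):
    """Map each Network LSA ID to its advertising routers, keeping only
    the IDs that are advertised by more than one router."""
    originators = defaultdict(list)
    for line in lsdb_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == "Network":
            lsa_id = fields[1].lstrip("*")  # '*' marks a locally originated LSA
            originators[lsa_id].append(fields[2])
    return {lsa_id: rtrs for lsa_id, rtrs in originators.items() if len(rtrs) > 1}


# A few rows copied from the LSDB posted earlier in the thread.
lsdb = """\
 Type       ID               Adv Rtr           Seq      Age  Opt  Cksum  Len
Network  10.255.255.42    10.255.0.11      0x80000008  1239  0x22 0xebe0  32
Network  10.255.255.46    10.255.0.2       0x800010ef     4  0x22 0x9f43  32
Network  10.255.255.46    10.255.0.12      0x800934b4     7  0x22 0x3a98  32
Network  10.255.255.50    10.255.0.14      0x800003d8  1866  0x22 0xa539  32
"""

dups = duplicate_network_lsas(lsdb)
for lsa_id, routers in dups.items():
    print(f"Network LSA {lsa_id} is advertised by: {', '.join(routers)}")

# Approximate number of re-originations, as estimated in the post above.
updates = int("0x800934b4", 16) - 0x80000000
print(f"10.255.0.12 has re-originated its LSA about {updates} times")
```

Only the 10.255.255.46 entry is reported, with 10.255.0.2 and 10.255.0.12 as its two originators.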

    jtb



  • 11.  RE: Seeing lots of ospfTxRetransmit SNMP Traps

     
    Posted 01-10-2013 13:37

    Ah well spotted!

    Yep, I had the 10.255.255.44/30 subnet configured on two separate network segments.

    I've fixed that up now... Will monitor for a few hours and see how it goes.
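A misconfiguration like this can be caught before it reaches OSPF by grouping interface addresses by network. The sketch below uses Python's standard ipaddress module. The two ae0.91 addresses come from the configs posted earlier, but the segment labels and the other two entries are hypothetical placeholders, since the thread does not say where the duplicate subnet was configured.

```python
import ipaddress
from collections import defaultdict


def segments_per_subnet(interfaces):
    """Group (segment, address/prefix) pairs by IP network and return the
    networks that appear on more than one segment."""
    seen = defaultdict(set)
    for segment, cidr in interfaces:
        network = ipaddress.ip_interface(cidr).network
        seen[network].add(segment)
    return {str(net): sorted(segs) for net, segs in seen.items() if len(segs) > 1}


inventory = [
    ("ae0.91 link", "10.255.255.45/30"),                    # SWITCH1, from the thread
    ("ae0.91 link", "10.255.255.46/30"),                    # SWITCH2, from the thread
    ("other segment (hypothetical)", "10.255.255.46/30"),   # illustrative duplicate
    ("another link (hypothetical)", "10.255.255.49/30"),    # illustrative, no clash
]

clashes = segments_per_subnet(inventory)
for network, segments in clashes.items():
    print(f"{network} is configured on {len(segments)} segments: {segments}")
```

Both /30 addresses on ae0.91 fall into 10.255.255.44/30, so a second segment using the same network is flagged, while 10.255.255.48/30 is not.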



  • 12.  RE: Seeing lots of ospfTxRetransmit SNMP Traps

     
    Posted 01-10-2013 21:49

    Looks much nicer now:

    show ospf statistics instance Internet 
    
    Packet type             Total                  Last 5 seconds
                       Sent      Received        Sent      Received
       Hello              4             2           0             0
         DbD              0             0           0             0
       LSReq              0             0           0             0
    LSUpdate            575           819           0             0
       LSAck            410           227           0             0
    
    DBDs retransmitted     :                    0, last 5 seconds :          0
    LSAs flooded           :                  181, last 5 seconds :          0
    LSAs flooded high-prio :                  778, last 5 seconds :          0
    LSAs retransmitted     :                    0, last 5 seconds :          0
    LSAs transmitted to nbr:                    0, last 5 seconds :          0
    LSAs requested         :                    0, last 5 seconds :          0
    LSAs acknowledged      :                  598, last 5 seconds :          0
    
    Flood queue depth      :               0
    Total rexmit entries   :               0
    db summaries           :               0
    lsreq entries          :               0
    
    Receive errors:

     



  • 13.  RE: Seeing lots of ospfTxRetransmit SNMP Traps

     
    Posted 01-11-2013 05:38

    hi Luca,

     

    I've reproduced the problem in a small setup. We can see how each router tries to flush (delete) the other peer's LSA and does not ACK it:

     

    admin@srx240a# show ospf database instance R3
    
        OSPF database, Area 0.0.0.0
     Type       ID               Adv Rtr           Seq      Age  Opt  Cksum  Len
    Router   1.1.1.1          1.1.1.1          0x80000004    74  0x22 0x2e6a  48
    Router  *2.2.2.2          2.2.2.2          0x80000004    73  0x22 0xef9f  48
    Router   20.20.20.20      20.20.20.20      0x80000005    75  0x22 0x3c18  36
    Router   30.30.30.30      30.30.30.30      0x80000005    74  0x22 0x4fb3  36
    Network  8.8.8.8          1.1.1.1          0x8000000c  3600  0x22 0x8ae   32  !!!
    Network *8.8.8.8          2.2.2.2          0x8000000c     4  0x22 0x284   32  <<<
    Network *10.10.10.2       2.2.2.2          0x80000002    73  0x22 0x48bf  32
    

     

    Notice the 8.8.8.8 LSA with an age of 3600s. It was an interesting case; for future reference, please mark it as solved (if that's true).
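An age of 3600 seconds is MaxAge in OSPFv2 (RFC 2328), which is how an LSA is flushed from the domain. As a small illustration, this Python sketch flags MaxAge entries in pasted show ospf database output. The function name is made up, and the parsing assumes the column layout shown in this thread.

```python
MAX_AGE_S = 3600  # MaxAge as defined in RFC 2328


def max_age_lsas(lsdb_text):
    """Return (type, lsa_id, adv_rtr) for every LSA whose age has reached MaxAge."""
    flagged = []
    for line in lsdb_text.splitlines():
        fields = line.split()
        # Data rows look like: Type ID AdvRtr Seq Age Opt Cksum Len [notes]
        if len(fields) >= 5 and fields[3].startswith("0x") and fields[4].isdigit():
            if int(fields[4]) >= MAX_AGE_S:
                flagged.append((fields[0], fields[1].lstrip("*"), fields[2]))
    return flagged


# Rows copied from the lab output above.
lab_lsdb = """\
 Type       ID               Adv Rtr           Seq      Age  Opt  Cksum  Len
Router   1.1.1.1          1.1.1.1          0x80000004    74  0x22 0x2e6a  48
Network  8.8.8.8          1.1.1.1          0x8000000c  3600  0x22 0x8ae   32  !!!
Network *8.8.8.8          2.2.2.2          0x8000000c     4  0x22 0x284   32  <<<
"""

flushed = max_age_lsas(lab_lsdb)
for lsa_type, lsa_id, adv_rtr in flushed:
    print(f"{lsa_type} LSA {lsa_id} from {adv_rtr} is at MaxAge (being flushed)")
```

A Network LSA that keeps reappearing at MaxAge next to a fresh copy with the same ID from another router is the pattern reproduced in the lab output.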

    jtb