
QFX5100 does not route traffic where the forwarding table says it will route traffic

‎02-24-2020 08:48 PM

QFX5100 does not route traffic where the forwarding table says it will route traffic.
 
Use case
=========
- /19 is a local static discard route (you know, so you don't loop traffic with your peers)
- /23s in the aggregate /19 received from IBGP peers 
- Junos: 17.3R3-S1.5

 

I have tried over 5 versions of JunOS. The issue is the same.

 

Configuration
==============
set routing-options static route 1.3.224.0/19 discard
set protocols bgp group bgp_routes type internal
set protocols bgp group bgp_routes import bgp-routes-import-policy
set protocols bgp group bgp_routes peer-as 1234
set protocols bgp group bgp_routes neighbor 1.2.175.119
set policy-options policy-statement bgp-routes-import-policy term 1 then origin egp
set policy-options policy-statement bgp-routes-import-policy then local-preference 110
set policy-options policy-statement bgp-routes-import-policy then validation-state valid
set policy-options policy-statement bgp-routes-import-policy then community add origin-validation-state-valid
set policy-options policy-statement bgp-routes-import-policy then accept

Note: The horrible BGP hack is just to try and force the routes to be preferred out of desperation.

Information
===========

> show route 1.3.224.0/19                                                           
inet.0: 15955 destinations, 37852 routes (15955 active, 0 holddown, 0 hidden)
@ = Routing Use Only, # = Forwarding Use Only
+ = Active Route, - = Last Active, * = Both
1.3.224.0/19     *[Static/5] 02:28:18
                      Discard
1.3.224.0/23     *[BGP/170] 03:09:17, localpref 110
                      AS path: E, validation-state: valid
                    > to 1.2.175.119 via irb.6

….

> show route forwarding-table family inet table default destination 1.3.224.1/32    
Routing table: default.inet
Internet:
Enabled protocols: Bridging, 
Destination        Type RtRef Next hop           Type Index    NhRef Netif
1.3.228.0/23     user     0                    indr   131071    13
                              1.2.175.119     ucst     1832     4 et-0/0/53.0

 

That all looks good, right? Yep, it does.

 

The problem is, when the QFX receives a packet destined to 1.3.224.1, it does not forward it to 1.2.175.119 and instead discards it. The data plane is broken.

 

If I ping/traceroute or do anything from JunOS it routes correctly (thanks control plane).
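For readers following along, here is a minimal Python sketch of the longest-prefix-match lookup the forwarding plane is expected to perform, using only the two routes from the "show route" output above. The function name and the next-hop labels are illustrative, not anything Junos exposes. It also shows, as an assumption rather than a diagnosis, that a lookup lacking the /23 would fall through to the /19 discard, which would look like the symptom described.

```python
import ipaddress

# Routes as shown in the "show route" output; next hops are plain labels.
ROUTES = {
    "1.3.224.0/19": "discard",
    "1.3.224.0/23": "1.2.175.119 via irb.6",
}

def longest_prefix_match(destination, routes):
    """Return (prefix, next_hop) for the most specific route covering destination."""
    addr = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in routes.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda item: item[0].prefixlen)
    return str(best[0]), best[1]

# Expected behaviour: the /23 learned via BGP wins over the /19 discard.
expected = longest_prefix_match("1.3.224.1", ROUTES)

# Hypothetical case: the /23 is absent from the lookup table, so the /19 matches.
without_23 = {p: nh for p, nh in ROUTES.items() if p != "1.3.224.0/23"}
fallback = longest_prefix_match("1.3.224.1", without_23)

print(expected)
print(fallback)
```

If the hardware really were missing the more specific entry, the second result is what the data plane would do, even though the software tables still list the /23.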

8 REPLIES

Re: QFX5100 does not route traffic where the forwarding table says it will route traffic

‎02-24-2020 09:21 PM
Hello lbw,

For a one-to-one comparison of the control plane and the data plane, could you please check the following?
show route 1.3.224.1/32 exact
show route forwarding-table destination 1.3.224.1/32

Hope this helps.

Regards,
-r.

--------------------------------------------------

If this solves your problem, please mark this post as "Accepted Solution."
Kudos are always appreciated :).

Re: QFX5100 does not route traffic where the forwarding table says it will route traffic

‎02-24-2020 10:34 PM

Hello,

 


@lbw wrote:

 

> show route 1.3.224.0/19                                                           
inet.0: 15955 destinations, 37852 routes (15955 active, 0 holddown, 0 hidden)
@ = Routing Use Only, # = Forwarding Use Only
+ = Active Route, - = Last Active, * = Both
1.3.224.0/19     *[Static/5] 02:28:18
                      Discard
1.3.224.0/23     *[BGP/170] 03:09:17, localpref 110 
                      AS path: E, validation-state: valid
                    > to 1.2.175.119 via irb.6

….

> show route forwarding-table family inet table default destination 1.3.224.1/32    
Routing table: default.inet
Internet:
Enabled protocols: Bridging, 
Destination        Type RtRef Next hop           Type Index    NhRef Netif
1.3.228.0/23     user     0                    indr   131071    13
                              1.2.175.119     ucst     1832     4 et-0/0/53.0

 

 

 

The digits in the third octet do not match: the routing table shows 1.3.224.0/23, but the forwarding-table printout shows 1.3.228.0/23. I'm not sure whether this is a copy-paste error or your attempt to sanitize the printouts.
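A quick way to confirm that the two printouts cannot both be right is to test the destination against each prefix. This is a small Python sketch using the standard ipaddress module; the variable names are illustrative.

```python
import ipaddress

destination = ipaddress.ip_address("1.3.224.1")
rib_prefix = ipaddress.ip_network("1.3.224.0/23")  # from "show route"
fib_prefix = ipaddress.ip_network("1.3.228.0/23")  # from the forwarding-table printout

# A forwarding-table lookup can only return a prefix that covers the destination.
in_rib_prefix = destination in rib_prefix
in_fib_prefix = destination in fib_prefix

print(in_rib_prefix, in_fib_prefix)
```

Since 1.3.224.1 is not inside 1.3.228.0/23, that line in the forwarding-table printout cannot be the genuine result for this lookup.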

Please supply the following printouts:

 

show route receive-protocol bgp 1.2.175.119 extensive
show route forwarding-table destination 1.3.224.1  extensive

 

HTH

Thx

Alex

 

 

_____________________________________________________________________

Please ask Your Juniper account team about Juniper Professional Services offerings.
Juniper PS can design, test & build the network/part of the network as per Your requirements

+++++++++++++++++++++++++++++++++++++++++++++

Accept as Solution = cool !
Accept as Solution+Kudo = You are a Star !

Re: QFX5100 does not route traffic where the forwarding table says it will route traffic

[ Edited ]
‎02-24-2020 11:45 PM

Thanks for the replies.

 

> show route 1.3.224.1/32 exact

{master: 0}

> show route forwarding-table destination 1.3.224.1/32 extensive

Destination: 1.3.224.0/23
  Route type: user
  Route reference: 0                   Route interface-index: 0
  Multicast RPF nh index: 0
  P2mpidx: 0
  Flags: sent to PFE, prefix load balance
  Next-hop type: indirect              Index: 131071   Reference: 17
  Nexthop: 1.2.175.118
  Next-hop type: unicast               Index: 1832     Reference: 4
  Next-hop interface: et-0/0/53.0

Routing table: __juniper_services__.inet [Index 3]
Internet:
Enabled protocols: Bridging,

Destination:  128.0.0.0/2
  Route type: interface
  Route reference: 0                   Route interface-index: 545
  Multicast RPF nh index: 0
  P2mpidx: 0
  Flags: sent to PFE
  Next-hop type: resolve               Index: 1676     Reference: 1
  Next-hop interface: jsrv.1

Routing table: __master.anon__.inet [Index 4]
Internet:
Enabled protocols: Bridging, Dual VLAN,

Destination:  default
  Route type: permanent
  Route reference: 0                   Route interface-index: 0
  Multicast RPF nh index: 0
  P2mpidx: 0
  Flags: sent to PFE
  Next-hop type: reject                Index: 1679     Reference: 1

Routing table: default-switch.inet [Index 6]
Internet:
Enabled protocols: No VLAN,

Destination:  default
  Route type: permanent
  Route reference: 0                   Route interface-index: 0
  Multicast RPF nh index: 0
  P2mpidx: 0
  Flags: none
  Next-hop type: reject                Index: 1703     Reference: 1

> show route receive-protocol bgp 1.2.175.118 extensive

inet.0: 15971 destinations, 37872 routes (15971 active, 0 holddown, 0 hidden)
....
* 1.3.224.0/23 (1 entry, 1 announced)
     Accepted
     Nexthop: 1.2.175.118
     Localpref: 100
     AS path: I

 

However, the QFX does not route the packet to 1.2.175.118; it just discards it. I can see this from a traceroute and from a packet trace on the device with the IP 1.2.175.118, which has the Ethernet link to the QFX.

 


Re: QFX5100 does not route traffic where the forwarding table says it will route traffic

‎02-25-2020 12:29 AM
Hi lbw,

Thank you, I have a bunch of questions in a few steps as follows:

a) What’s the Junos version on the QFX?
b) Have you confirmed in which direction the traffic is dropped? I mean, is it traffic destined to 1.3.224.1 that is dropped at the QFX, or the return traffic from 1.3.224.1? I'm asking because you said ping from the QFX itself to 1.3.224.1 works fine. If it is the traffic TO 1.3.224.1, go to the next step.
c) Is et-0/0/53 the right next hop toward the BGP neighbor? If so, we are looking at some other issue here. Are there any other prefixes in play here that are working all right?

d) Apologies, the right CLI command to check the route on the RE should be “show route 1.3.224.1”, along with the BGP receive-protocol extensive command provided by Alex.

Also, please check the following for any filters/config glitches that render et-0/0/53 unusable for this traffic:
show vlans
show configuration interfaces et-0/0/53 | display inheritance | display set
show configuration interfaces irb unit 6 | display inheritance | display set
show interfaces et-0/0/53 extensive

e) As a rough troubleshooting guess, please check “show log messages | grep parity”. If there are any parity errors on the box, they will spoil traffic forwarding from FPC 0; a reboot of the box should alleviate that for now. If they keep recurring, get a replacement. Again, this one is just a wild guess, sorry 😊.


Hope this helps.

Regards,
-r.


Re: QFX5100 does not route traffic where the forwarding table says it will route traffic

‎02-25-2020 01:00 AM

Hi

 

(a) As stated in my original post, the JunOS version is 17.3R3-S1.5. I've tried 5 different versions including 19R.

(b) Yes, I have performed a packet trace and see the packet going in and no packet going out.

(c) No, the correct next-hop from a layer-2 perspective is ae0 which is accessible via irb.6. It should not touch et-0/0/53. That is a bundle connected to the host with the IP address 1.2.175.119.

(d) See:

> show route 1.3.224.1

inet.0: 15961 destinations, 37840 routes (15961 active, 0 holddown, 0 hidden)
@ = Routing Use Only, # = Forwarding Use Only
+ = Active Route, - = Last Active, * = Both

1.3.224.0/23     *[BGP/170] 08:19:00, localpref 110
                      AS path: E, validation-state: valid
                    > to 1.2.175.119 via irb.6

set interfaces irb unit 6 bandwidth 10g
set interfaces irb unit 6 family inet no-redirects
set interfaces irb unit 6 family inet address 1.2.175.114/28 vrrp-group 6 virtual-address 1.2.175.113
set interfaces irb unit 6 family inet address 1.2.175.114/28 vrrp-group 6 priority 150
set interfaces irb unit 6 family inet address 1.2.175.114/28 vrrp-group 6 accept-data
set interfaces et-0/0/53 unit 0 family ethernet-switching interface-mode trunk
set interfaces et-0/0/53 unit 0 family ethernet-switching vlan members all
set interfaces et-0/0/53 unit 0 family ethernet-switching storm-control default

 

There's nothing interesting in show interfaces et-0/0/53 extensive. There is no output from show log messages | grep parity. My firm belief is that there is a JunOS bug that causes the forwarding table to become corrupted/broken/not updated/stale in the TCAM. 

 

 

 


Re: QFX5100 does not route traffic where the forwarding table says it will route traffic

‎02-25-2020 01:47 AM
Hello lbw,

If the BGP neighbor 1.2.175.119 is advertising the prefix, it sounds like it's being treated as the next-hop based on the BGP advertisement. Did we check what the BGP neighbor is sending to the QFX?

show route receive-protocol bgp 1.2.175.119 extensive

Regarding your comment --- "(c) No, the correct next-hop from a layer-2 perspective is ae0 which is accessible via irb.6. It should not touch et-0/0/53. That is a bundle connected to the host with the IP address 1.2.175.119."

If that host is the BGP neighbor advertising the prefix, we need to look at what it's sending because that will govern how the QFX will use that route. This doesn't sound like a bug from what we see so far, else we wouldn't see it after reboots/re-images, unless you're seeing this work differently at different times.

Hope this helps.

Regards,
-r.


Re: QFX5100 does not route traffic where the forwarding table says it will route traffic

[ Edited ]
‎02-25-2020 06:19 PM

I got to the bottom of the issue. It's the same one I've previously experienced in another scenario, but this time without the immediate error messages to tip me off. After toggling the forwarding profile (out of desperation), which I sadly often have to do to clean up the TCAM, I see this in the log:

 

Feb 26 01:30:07 qfxhostname fpc0 brcm_rt_ip_uc_lpm_install:1362(LPM route add failed) Reason : Table full unit 0
Feb 26 01:30:07  qfxhostname fpc0 brcm_rt_ip_uc_entry_install:1220brcm_rt_ip_uc_entry_install Error: lpm ip route install failed vrf 1 ip 1.2.3/24 nh-swidx 1895 nh-hwidx 100062
...
Feb 26 01:30:50  qfxhostname fpc0 brcm_rt_ip_uc_lpm_install:1362(LPM route change failed) Reason : Table full unit 0
Feb 26 01:30:50  qfxhostname fpc0 brcm_rt_ip_uc_entry_install:1220brcm_rt_ip_uc_entry_install Error: lpm ip route install failed vrf 1 ip 123.45.1/24 nh-swidx 2006 nh-hwidx 100170
Feb 26 01:30:50  qfxhostname fpc0 RT-HAL,rt_entry_topo_handler,5033: rt_halp_vectors->rt_rebake failed
Feb 26 01:30:50  qfxhostname fpc0 RT-HAL,rt_entry_topo_handler,5039: nh_id = 0x7d6, nh_type = 2, rpf_nh_index = 0x0, prefix = 123.45.67/24
Feb 26 01:30:50  qfxhostname fpc0 RT-HAL,rt_entry_topo_handler,5042: rt = 0x0x29dccca8, rtt = 0x0x29678208, nh = 0x0x29dcce98, dc = 0x0x0, sc = 0x0x0, rpf_nh = 0x0x0, cos = 0x0x0
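For anyone who wants to watch for this condition, here is a rough Python sketch that scans syslog text for the two message shapes above and collects the prefixes that failed to install. The regular expressions are written against these sample lines only; the exact wording may differ between Junos releases, so treat the patterns and the summarize helper as assumptions.

```python
import re

# Sample lines copied from the log excerpt in this post.
LOG_LINES = [
    "Feb 26 01:30:07 qfxhostname fpc0 brcm_rt_ip_uc_lpm_install:1362(LPM route add failed) Reason : Table full unit 0",
    "Feb 26 01:30:07  qfxhostname fpc0 brcm_rt_ip_uc_entry_install:1220brcm_rt_ip_uc_entry_install Error: lpm ip route install failed vrf 1 ip 1.2.3/24 nh-swidx 1895 nh-hwidx 100062",
    "Feb 26 01:30:50  qfxhostname fpc0 brcm_rt_ip_uc_lpm_install:1362(LPM route change failed) Reason : Table full unit 0",
    "Feb 26 01:30:50  qfxhostname fpc0 brcm_rt_ip_uc_entry_install:1220brcm_rt_ip_uc_entry_install Error: lpm ip route install failed vrf 1 ip 123.45.1/24 nh-swidx 2006 nh-hwidx 100170",
]

# Patterns derived from the sample lines above (not from any official format spec).
TABLE_FULL = re.compile(r"\(LPM route (add|change) failed\) Reason : Table full")
INSTALL_FAILED = re.compile(r"lpm ip route install failed vrf (\d+) ip (\S+)")

def summarize(lines):
    """Return (number of table-full events, list of prefixes that failed to install)."""
    table_full = 0
    failed_prefixes = []
    for line in lines:
        if TABLE_FULL.search(line):
            table_full += 1
        match = INSTALL_FAILED.search(line)
        if match:
            failed_prefixes.append(match.group(2))
    return table_full, failed_prefixes

result = summarize(LOG_LINES)
print(result)
```

Any prefix that shows up in the second list is present in the software tables but was rejected by the hardware, so its traffic will follow whatever less specific entry did get installed.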

 

> show route forwarding-table | grep user | count

Count: 14621 lines

 

The cause of the issue is pretty simple. The QFX platform is fundamentally broken.

 

There are about 15,000 entries in the TCAM and it's full. It fills and fails with both l2-profile-three and l3-profile. The vast majority of those forwarding table entries are IP unicast routes.

 

I dropped a BGP peer, which reduced the number of routes in the forwarding table to 1049, and packets now route where they are supposed to. I then put an export filter between the RIB and the FIB and the issue also went away.

 

I've previously brought this to JTAC's attention but they don't understand/care. The platform is broken and the specification published by Juniper that says it can fit 144K or 208K unicast L3 routes is plainly wrong, misleading and deceptive. See their specs page:

 

https://www.juniper.net/documentation/en_US/junos/topics/topic-map/unified-forwarding-table-D15-qfx-...

 

I cannot be clearer to anyone who is coming across this post: do not, under any circumstances, attempt to do anything but the most basic next-hop data centre layer-3 routing with a QFX5100. The specifications are not to be believed. Do not buy this product if you want anything other than an expensive layer-2 switch.

 


Re: QFX5100 does not route traffic where the forwarding table says it will route traffic

‎02-25-2020 09:00 PM

Hello lbw,

 

Regarding your comments about the QFX5100: please note that "host" routes are not the same as the "LPM" routes whose table was full in your setup. To be clear, host routes are /32 for IPv4 and /128 for IPv6; everything else is "LPM". So the TCAM can fill up in different ways depending on the routes/prefixes used in your setup.

 

Hopefully you can now read and make better sense of what's meant in this Juniper documentation link:

https://www.juniper.net/documentation/en_US/junos/topics/topic-map/unified-forwarding-table-D15-qfx-...

 

For example:

 

root@QFX5100# run show chassis forwarding-options
fpc0:
--------------------------------------------------------------------------
UFT Configuration:
l2-profile-three. (MAC: 160K L3-host: 144K LPM: 16K) (default)
num-65-127-prefix = 1K
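To make the host/LPM split concrete, here is a small Python sketch that sorts a list of prefixes into the two buckets using the definition given in this reply (/32 and /128 are host routes, everything else is LPM). The function names and the sample prefix list are made up for illustration, and the real hardware may apply further distinctions that this simplification ignores.

```python
import ipaddress
from collections import Counter

def table_for(prefix):
    """Classify a prefix as 'host' (/32 IPv4, /128 IPv6) or 'lpm' (everything else)."""
    network = ipaddress.ip_network(prefix, strict=False)
    return "host" if network.prefixlen == network.max_prefixlen else "lpm"

def count_by_table(prefixes):
    """Count how many prefixes would land in each bucket."""
    return Counter(table_for(p) for p in prefixes)

# Hypothetical sample; in practice this list would come from the route table.
sample = [
    "1.3.224.0/19",
    "1.3.224.0/23",
    "1.2.175.119/32",
    "2001:db8::1/128",
    "2001:db8::/48",
]

counts = count_by_table(sample)
print(counts["host"], counts["lpm"])
```

Comparing the "lpm" count against the LPM figure of the active profile (16K for l2-profile-three in the output above), rather than against the L3-host figure, is the relevant check for a table of BGP-learned /23s and /24s.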

 

Hope this helps.

 

Regards,
-r.
