We run a multivendor MPLS network spread across roughly 8 DCs. I'm currently redesigning our "internet breakout" and I'm puzzled by some odd behavior I'm seeing (or I'm misunderstanding the output).
I can simulate our problem in a very simple topology with only Juniper equipment:
Customer CPE/Router <-> QFX5100 <-> MX204
- Our IGP is OSPF; we run MPLS with LDP and BGP on top
- MPLS traffic engineering is done via BGP; we have configured explicit-null
- Routing options are configured with indirect-next-hop, and we use per-prefix-labels
We put the "connected" interco interfaces on the MX and QFX into BGP to satisfy the route lookups with the right next hop (OSPF is only responsible for backbone interfaces and loopbacks).
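To make this concrete, the relevant knobs on the QFX look roughly like this (a trimmed sketch; interface names are placeholders, not our actual config):

```
protocols {
    ospf {
        area 0.0.0.0 {
            interface et-0/0/10.0;   /* backbone link towards the MX */
            interface lo0.0 { passive; }
        }
    }
    ldp {
        interface et-0/0/10.0;
        explicit-null;               /* advertise label 0 instead of implicit-null */
    }
    mpls {
        interface et-0/0/10.0;
    }
}
routing-options {
    forwarding-table {
        indirect-next-hop;
    }
}
```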
On the MX204 we also announce an aggregate by redistributing a static 0/0 towards the network. This problem happens on the default route as well as on upstream interface ranges on the MX.
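The default is originated roughly like this on the MX (again a sketch; the policy and group names are placeholders):

```
routing-options {
    static {
        route 0.0.0.0/0 discard;     /* anchor route for the announcement */
    }
}
policy-options {
    policy-statement EXPORT-DEFAULT {
        term static-default {
            from {
                protocol static;
                route-filter 0.0.0.0/0 exact;
            }
            then accept;
        }
    }
}
protocols {
    bgp {
        group IBGP {
            export EXPORT-DEFAULT;   /* towards the QFX / rest of the network */
        }
    }
}
```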
Problem:
When sending traffic from the customer CPE towards something behind the MX, we get a loop:
X.X.X.2 = interco between the CPE and the QFX
Y.Y.Y.1 = IP behind an interface on the MX (something connected, but not the MX itself)
Z.Z.Z.1 = IP of the MX on the link between the MX/QFX
thomas@503-r51b09-2> traceroute source X.X.X.2 Y.Y.Y.1 ttl 5
traceroute to Y.Y.Y.1 (Y.Y.Y.1) from X.X.X.2, 5 hops max, 40 byte packets
1 X.X.X.1 (X.X.X.1) 23.820 ms 13.688 ms 23.873 ms
2 Z.Z.Z.1 (Z.Z.Z.1) 10.475 ms 4.051 ms 1.992 ms
3 Z.Z.Z.1 (Z.Z.Z.1) 2.674 ms 4.898 ms 3.866 ms
4 Z.Z.Z.1 (Z.Z.Z.1) 3.412 ms 4.424 ms 1.500 ms
5 Z.Z.Z.1 (Z.Z.Z.1) 2.679 ms 2.483 ms 2.566 ms
If we do a traceroute towards Y.Y.Y.2, the IP of the MX on that interco, it works fine.
Outputs:
1) First the QFX does a lookup on 0/0 in inet.0, and after that inet.3 should be checked to reach our MX loopback:
root@qfx> show route table inet.0 0.0.0.0
inet.0: 95 destinations, 167 routes (83 active, 0 holddown, 78 hidden)
+ = Active Route, - = Last Active, * = Both
0.0.0.0/0 *[BGP/170] 1d 11:10:16, localpref 100, from MX-LOOPBACK
AS path: I, validation-state: unverified
> to Z.Z.Z.1 via et-0/0/10.0, Push 0
root@qfx> show route table inet.3 MX-LOOPBACK
inet.3: 9 destinations, 12 routes (9 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both
MX-LOOPBACK/32 *[LDP/9] 1w0d 11:32:50, metric 1
> to Z.Z.Z.1 via et-0/0/10.0, Push 0
I guess pushing label 0 onto the packet isn't optimal, as this router is also the last hop before our MX (they are back to back).
2) The packet arrives at the MX, and a lookup should be done in the mpls.0 table:
root@mx> show route table mpls.0 label 0
mpls.0: 28 destinations, 28 routes (28 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both
0 *[MPLS/0] 6w2d 19:46:38, metric 1
to table inet.0
0(S=0) *[MPLS/0] 6w2d 19:46:38, metric 1
to table mpls.0
So label 0 makes the MX look into inet.0, and that table has all the right entries.
So in theory, if you look at the outputs, this should work, but it doesn't. Looking at the traceroute it seems we are stuck on the MX itself; traffic is not being forwarded back out towards other interfaces.
It also seems to work for the MX's IPs on the connected interfaces, but not for the peers'.
a) I do get that the ingress PE is also the PHP; is this the cause of the problem? Shouldn't the forwarding logic be smart enough to handle this and just send an IP packet towards the MX?
b) I guess moving towards RSVP and doing tunnel-services and UHP on the MX would solve this, but I do not fully understand why the current config isn't working. How are you all solving these kinds of problems?
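For context, the RSVP/UHP alternative I have in mind would look roughly like this (a sketch only; interface names, LSP names and FPC/PIC numbers are placeholders, and I haven't verified that this fixes the issue):

```
/* On the QFX, as ingress of the LSP towards the MX */
protocols {
    rsvp {
        interface et-0/0/10.0;
    }
    mpls {
        label-switched-path TO-MX {
            to MX-LOOPBACK;
            ultimate-hop-popping;    /* request UHP instead of PHP */
        }
        interface et-0/0/10.0;
    }
}

/* On the MX, tunnel-services so the egress PE can do the second lookup */
chassis {
    fpc 0 {
        pic 0 {
            tunnel-services {
                bandwidth 10g;
            }
        }
    }
}
```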
Thanks,
Thomas