SRX


Ask questions and share experiences about the SRX Series, vSRX, and cSRX.
  • 1.  OSPF issue over VPLS links

    Posted 06-26-2012 08:09

    Hi,

     

    I have the following setup:

     

    Juniper SRX240 port ge-0/0/0.0 ip 172.31.254.65/27 -> Juniper SRX240 port ge-0/0/0.0 ip 172.31.254.66/27

     

    The routers then run OSPF between them to advertise the networks they host.

     

    I have had this running on the bench perfectly fine for 48 hours. However, now that I have shipped one unit to the US (the other end is in the UK) and hooked it up to the VPLS network, the OSPF adjacency keeps failing and then coming back up.

     

    Pinging the interfaces from each box works 100% and has a response time of 140ms - 150ms.

     

    Running traces I'm seeing the following:

    US router

    Jun 26 22:43:29.153130 RPD_OSPF_NBRDOWN: OSPF neighbor 172.31.254.65 (realm ospf-v2 ge-0/0/0.0 area 0.0.0.0) state changed from Full to Down due to InActiveTimer (event reason: neighbor was inactive and declared dead)

     

    UK router

    Jun 26 15:22:02.150983 RPD_OSPF_NBRDOWN: OSPF neighbor 172.31.254.68 (realm ospf-v2 ge-0/0/1.0 area 0.0.0.0) state changed from Full to Init due to 1WayRcvd (event reason: neighbor is in one-way mode)

     

    Does anyone have any suggestions? I can post full traces if that would help?

     

    Thanks

     

    Andrew.

     



  • 2.  RE: OSPF issue over VPLS links

    Posted 06-26-2012 09:28

    Possibly adjust your OSPF timers to compensate for latency?



  • 3.  RE: OSPF issue over VPLS links

    Posted 06-26-2012 09:39

    Which timers would you recommend?

     

    If I do "show ospf neighbor" from the UK router then I see "Dead" oscillating between 31 and 39 seconds. On the US router I'm seeing it start at 39 and then drop to 0.
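
    A minimal sketch of how those two "Dead" readings can arise, assuming the 10 s hello interval and 40 s dead interval that appear in the traces later in this thread. The function name and the sample arrival times are hypothetical; the model only resets the countdown when a Hello arrives.

```python
def dead_timer_remaining(hello_arrivals, now, dead_interval=40):
    """Seconds left before a neighbor is declared dead at time `now`.

    hello_arrivals: times (in seconds) at which a Hello from the
    neighbor was received. Returns 0 once the dead interval has
    fully elapsed since the last Hello.
    """
    received = [t for t in hello_arrivals if t <= now]
    if not received:
        raise ValueError("no Hello received yet")
    elapsed = now - max(received)
    return max(0, dead_interval - elapsed)


# Direction that works: a Hello arrives every 10 s (hypothetical times).
healthy = [0, 10, 20, 30, 40]
print([dead_timer_remaining(healthy, t) for t in (1, 9, 11, 19, 41)])
# [39, 31, 39, 31, 39]

# Direction that fails: one Hello, then silence (hypothetical times).
broken = [0]
print([dead_timer_remaining(broken, t) for t in (1, 20, 39, 40, 45)])
# [39, 20, 1, 0, 0]
```

    A counter that bounces between 31 and 39 means Hellos keep arriving roughly every 10 s; a counter that runs from 39 down to 0 means no Hello arrived in that direction for the whole dead interval.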



  • 4.  RE: OSPF issue over VPLS links

    Posted 06-26-2012 11:27

    Hi Adrewa,

     

    If you can attach the topology, the config, and the traceoptions output along with any DBD change at the time the OSPF neighbour goes down, it will help me understand the problem better.

    If you can send me the topology, I can also try it in my local setup.

     

     

    Regards,

    Deepak



  • 5.  RE: OSPF issue over VPLS links

    Posted 06-27-2012 04:04

    Hi,

     

    I have some additional information to add now. I have attached a rough topology.

     

    We have a link from our building down to a datacentre where we have a cross connect into a VPLS network. In the diagram all three routers are in the same OSPF area 0 and each port indicated is connected as a layer 3 inet port with 172.31.254.64/26 IP addresses (172.31.254.65,66 and 67).

     

    Router A and Router B are able to communicate through OSPF correctly and it is stable.

     

    Router C has a link to the other two routers for 20 seconds and then it drops.

     

    Log from Router C:

    Jun 26 15:22:02.150004 OSPF rcvd Hello 172.31.254.68 -> 224.0.0.5 (ge-0/0/1.0 IFL 71 area 0.0.0.0)
    Jun 26 15:22:02.150428   Version 2, length 44, ID 172.31.255.151, area 0.0.0.0
    Jun 26 15:22:02.150481   checksum 0x0, authtype 0
    Jun 26 15:22:02.150549   mask 255.255.255.192, hello_ivl 10, opts 0x12, prio 128
    Jun 26 15:22:02.150603   dead_ivl 40, DR 172.31.254.68, BDR 0.0.0.0
    Jun 26 15:22:02.150983 RPD_OSPF_NBRDOWN: OSPF neighbor 172.31.254.68 (realm ospf-v2 ge-0/0/1.0 area 0.0.0.0) state changed from Full to Init due to 1WayRcvd (event reason: neighbor is in one-way mode)
    Jun 26 15:22:02.151563 OSPF sent Hello 172.31.254.65 -> 224.0.0.5 (ge-0/0/1.0 IFL 71 area 0.0.0.0)
    Jun 26 15:22:02.151632   Version 2, length 48, ID 172.31.255.104, area 0.0.0.0
    Jun 26 15:22:02.151702   mask 255.255.255.192, hello_ivl 10, opts 0x12, prio 128
    Jun 26 15:22:02.151755   dead_ivl 40, DR 172.31.254.68, BDR 172.31.254.65
    Jun 26 15:22:02.152355 OSPF sent Hello 172.31.254.65 -> 224.0.0.5 (ge-0/0/1.0 IFL 71 area 0.0.0.0)
    Jun 26 15:22:02.152434   Version 2, length 48, ID 172.31.255.104, area 0.0.0.0

     

    Log from router A:

    Jun 26 22:42:50.778578 OSPF rcvd LSAck 172.31.254.65 -> 224.0.0.5 (ge-0/0/0.0 IFL 72 area 0.0.0.0)
    Jun 26 22:42:50.778683   Version 2, length 64, ID 172.31.255.104, area 0.0.0.0
    Jun 26 22:42:50.778734   checksum 0x0, authtype 0
    Jun 26 22:42:53.356144 OSPF periodic xmit from 172.28.12.33 to 224.0.0.5 (IFL 71 area 0.0.0.0)
    Jun 26 22:42:55.040184 OSPF periodic xmit from 172.28.12.1 to 224.0.0.5 (IFL 69 area 0.0.0.0)
    Jun 26 22:42:57.622364 OSPF periodic xmit from 172.31.254.68 to 224.0.0.5 (IFL 72 area 0.0.0.0)
    Jun 26 22:43:01.774604 OSPF periodic xmit from 172.28.12.33 to 224.0.0.5 (IFL 71 area 0.0.0.0)
    Jun 26 22:43:05.026102 OSPF periodic xmit from 172.28.12.1 to 224.0.0.5 (IFL 69 area 0.0.0.0)
    Jun 26 22:43:05.695990 OSPF periodic xmit from 172.31.254.68 to 224.0.0.5 (IFL 72 area 0.0.0.0)
    Jun 26 22:43:11.071100 OSPF periodic xmit from 172.28.12.33 to 224.0.0.5 (IFL 71 area 0.0.0.0)
    Jun 26 22:43:13.170410 OSPF periodic xmit from 172.28.12.1 to 224.0.0.5 (IFL 69 area 0.0.0.0)
    Jun 26 22:43:15.473150 OSPF periodic xmit from 172.31.254.68 to 224.0.0.5 (IFL 72 area 0.0.0.0)
    Jun 26 22:43:20.777683 OSPF periodic xmit from 172.28.12.1 to 224.0.0.5 (IFL 69 area 0.0.0.0)
    Jun 26 22:43:20.871361 OSPF periodic xmit from 172.28.12.33 to 224.0.0.5 (IFL 71 area 0.0.0.0)
    Jun 26 22:43:23.998428 OSPF periodic xmit from 172.31.254.68 to 224.0.0.5 (IFL 72 area 0.0.0.0)
    Jun 26 22:43:29.153130 RPD_OSPF_NBRDOWN: OSPF neighbor 172.31.254.65 (realm ospf-v2 ge-0/0/0.0 area 0.0.0.0) state changed from Full to Down due to InActiveTimer (event reason: neighbor was inactive and declared dead)
    Jun 26 22:43:29.154273 OSPF sent Hello 172.31.254.68 -> 224.0.0.5 (ge-0/0/0.0 IFL 72 area 0.0.0.0)
    Jun 26 22:43:29.154356   Version 2, length 44, ID 172.31.255.151, area 0.0.0.0
    Jun 26 22:43:29.154411   mask 255.255.255.192, hello_ivl 10, opts 0x12, prio 128
    Jun 26 22:43:29.154486   dead_ivl 40, DR 172.31.254.68, BDR 0.0.0.0

     

    I can obviously add more logs if you require.

     

    I've indicated on the diagram that the link passes through two Cisco routers before getting to the VPLS network; these ports are Layer 2 untagged ports.

     

    Do you have any suggestions?


    Thanks

     

    Andrew.



  • 6.  RE: OSPF issue over VPLS links

     
    Posted 06-27-2012 06:39

    Hi Andrew,

     

    It would be better to collect the logs at the same time on both devices (NTP synced) and compare the evidence; a few minutes of logging would do. Anyway, the logs say:

     

    Router A  (172.31.254.68 - not in 172.31.254.65,66,67 range you listed)  doesn't receive Hello from 172.31.254.65 (Router C)  for at least 40 seconds:

     

    Jun 26 22:42:50.778578 OSPF rcvd LSAck 172.31.254.65 -> 224.0.0.5 (ge-0/0/0.0 IFL 72 area 0.0.0.0)
    Jun 26 22:42:50.778683   Version 2, length 64, ID 172.31.255.104, area 0.0.0.0
    Jun 26 22:42:50.778734   checksum 0x0, authtype 0
    [...]
    Jun 26 22:43:29.153130 RPD_OSPF_NBRDOWN: OSPF neighbor 172.31.254.65 (realm ospf-v2 ge-0/0/0.0 area 0.0.0.0) state changed from Full to Down due to InActiveTimer (event reason: neighbor was inactive and declared dead)

     

    We had an LSAck from 172.31.254.65 at 22:42:50 and, after ~40s of silence, Router A assumes C is dead.

    When Router A sends its next Hello, C will notice that A does not list it as a neighbor:

     

    Jun 26 15:22:02.150983 RPD_OSPF_NBRDOWN: OSPF neighbor 172.31.254.68 (realm ospf-v2 ge-0/0/1.0 area 0.0.0.0) state changed from Full to Init due to 1WayRcvd (event reason: neighbor is in one-way mode)
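
    The packet lengths in the traces are consistent with this. A minimal sketch, assuming the standard OSPFv2 layout (24-byte common header, 20 bytes of fixed Hello fields, 4 bytes per listed neighbor) and no extra trailing data; the function name is hypothetical.

```python
OSPF_HEADER_LEN = 24   # OSPFv2 common header, bytes
HELLO_FIXED_LEN = 20   # mask, hello interval, options, priority, dead interval, DR, BDR
NEIGHBOR_ID_LEN = 4    # one router ID per neighbor seen


def neighbors_in_hello(packet_length):
    """Number of neighbor router IDs carried in a Hello of this length."""
    extra = packet_length - OSPF_HEADER_LEN - HELLO_FIXED_LEN
    if extra < 0 or extra % NEIGHBOR_ID_LEN:
        raise ValueError(f"{packet_length} is not a valid OSPFv2 Hello length")
    return extra // NEIGHBOR_ID_LEN


print(neighbors_in_hello(44))  # Hello from 172.31.254.68 in the trace -> 0
print(neighbors_in_hello(48))  # Hello from 172.31.254.65 in the trace -> 1
```

    A length of 44 means the Hello from 172.31.254.68 lists no neighbors at all, which is what triggers the 1WayRcvd event on the receiving side.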

     

    Do you have the same OSPF traceoptions flags on both sides? Please make them match.


    To get a better picture we need longer logs covering the same time frame from both devices. Please also note whether the clocks are in sync and the timezone difference between the two sides.
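
    For comparing such logs, a rough sketch of measuring the silence before a neighbor-down event. The regular expressions are assumptions based only on the trace lines quoted in this thread, the year is assumed because the traces omit it, and the function names are hypothetical. Note that it counts any received OSPF packet, whereas the dead timer is only reset by Hellos, so the result is a lower bound on the Hello silence.

```python
import re
from datetime import datetime

TS = r"(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}\.\d{6})"
RCVD_RE = re.compile(TS + r" OSPF rcvd (?P<type>\w+) (?P<src>\S+) -> ")
DOWN_RE = re.compile(TS + r" RPD_OSPF_NBRDOWN: OSPF neighbor (?P<nbr>\S+) ")


def parse_ts(text, year):
    """Parse a trace timestamp; the year is supplied by the caller."""
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S.%f")


def silence_before_down(lines, neighbor, year=2012):
    """Seconds between the last packet received from `neighbor` and the
    first NBRDOWN event logged for it, or None if either is missing."""
    last_rcvd = None
    for line in lines:
        line = line.strip()
        m = RCVD_RE.match(line)
        if m and m["src"] == neighbor:
            last_rcvd = parse_ts(m["ts"], year)
            continue
        m = DOWN_RE.match(line)
        if m and m["nbr"] == neighbor and last_rcvd is not None:
            return (parse_ts(m["ts"], year) - last_rcvd).total_seconds()
    return None


# Lines copied from the Router A trace quoted in this thread.
ROUTER_A_LOG = [
    "Jun 26 22:42:50.778578 OSPF rcvd LSAck 172.31.254.65 -> 224.0.0.5 (ge-0/0/0.0 IFL 72 area 0.0.0.0)",
    "Jun 26 22:43:23.998428 OSPF periodic xmit from 172.31.254.68 to 224.0.0.5 (IFL 72 area 0.0.0.0)",
    "Jun 26 22:43:29.153130 RPD_OSPF_NBRDOWN: OSPF neighbor 172.31.254.65 (realm ospf-v2 ge-0/0/0.0 area 0.0.0.0) state changed from Full to Down due to InActiveTimer (event reason: neighbor was inactive and declared dead)",
]
print(silence_before_down(ROUTER_A_LOG, "172.31.254.65"))  # 38.374552
```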

     

    Anyway, it looks like a VPLS service problem.

    jtb



  • 7.  RE: OSPF issue over VPLS links

    Posted 06-27-2012 09:23

    You're right, sorry I've messed up the IP in the diagram. I have a third router on the VPLS but I have a fibre issue with that so I can't talk to it and that has the other address.

     

    I have attached a new diagram with IP details intact.

     

    I have also sorted out the clock issue and the timezones on the devices.

     

    I have also attached the log files, but they're quite large.

     

    The first "down" occurs on the AR at Jun 27 16:39:38.235392 and SDF at Jun 27 16:39:38.235392

     

    What can I post here to make it easier for you to see the logs?

    Attachment(s):
    ospf-log.txt (531 KB)
    ospf-log.0.txt (1.00 MB)


  • 8.  RE: OSPF issue over VPLS links

    Posted 06-27-2012 19:18

    Is it possible that there is an MTU mismatch somewhere in there?

     



  • 9.  RE: OSPF issue over VPLS links
    Best Answer

     
    Posted 06-28-2012 11:30

    hi,

     

    I'm very busy right now, so only short comments. First, it doesn't look like an MTU issue; the logs (very verbose; is 'traceoptions flag all' enabled?) are clear:

     

    • Router A (172.31.254.68) doesn't get many OSPF Hello packets from Router C (172.31.254.65)
    • based on logs from C and A we can say the same problem is seen on Router B (172.31.254.67) - missing some hellos from C
    • Router C sends the OSPF Hello packets
    • Router C gets OSPF Hellos from A and B 
    • no problem with A<>B communication

    So, you have a problem with OSPF Hellos in the C to A & B direction: 172.31.254.65 -> 224.0.0.5. Is it related to the multicast address 224.0.0.5? Probably. There may be a problem somewhere on the path: the Cisco devices? The VPLS?

     

    Not sure if it will give any useful results, but please test 'ping 224.0.0.5 interface ge-x/x/x bypass-routing' from C and from the A & B sides. You will see duplicate answers (DUP), but you may also notice problems in the test started from C.

    jtb

     



  • 10.  RE: OSPF issue over VPLS links

    Posted 06-29-2012 05:59

    Hi All,

     

    Thank you all for your help. After a lot of "oh no, it can't possibly be an issue with our equipment" from our VPLS provider, it turns out that they had storm-control on their (Cisco) router on the UK link limited to 100pps, which is basically nothing. The US (Juniper) routers didn't have it enabled, so it wasn't an issue there.

     

    I blame Cisco 🙂

     

    As soon as they turned this off everything sprang into life and works perfectly.
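
    A deliberately simplified sketch of why a 100pps storm-control limit can swallow OSPF Hellos. It assumes the limit applies to the multicast frames that carry the Hellos and that each whole-second window forwards at most the first 100 packets; real storm-control implementations behave differently in detail, and the background traffic rates below are hypothetical.

```python
def storm_control_filter(arrival_times, limit_pps=100):
    """Return one boolean per packet: True if forwarded, False if dropped.

    Simplified model: each whole-second window forwards at most
    `limit_pps` packets and drops everything after that.
    """
    counts = {}
    verdicts = []
    for t in arrival_times:
        window = int(t)
        counts[window] = counts.get(window, 0) + 1
        verdicts.append(counts[window] <= limit_pps)
    return verdicts


def hello_survives(background_pps, hello_offset=0.95, limit_pps=100):
    """Does a single Hello sent `hello_offset` seconds into a window get
    through, given evenly spaced background traffic sharing the limit?"""
    background = [(i / background_pps, "bg") for i in range(background_pps)]
    packets = sorted(background + [(hello_offset, "hello")])
    verdicts = storm_control_filter([t for t, _ in packets], limit_pps)
    return next(ok for (_, kind), ok in zip(packets, verdicts) if kind == "hello")


print(hello_survives(background_pps=50))   # True: the budget is not exhausted
print(hello_survives(background_pps=150))  # False: the Hello arrives after packet 100
```

    Under this model unicast pings can still look healthy while the multicast Hellos from the rate-limited side are dropped, which would match what was seen from Router C.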

     

    Thank you all for your help on this. It really helped being able to eliminate MTU issues etc.


    Thanks

     

    Andrew.



  • 11.  RE: OSPF issue over VPLS links

     
    Posted 06-29-2012 08:40

    Hi Andrew,

     

    Great that you were able to convince the provider to re-check the network (and so fast).

     

    Regarding MTU, there was no MTU mismatch problem that would leave the OSPF neighbors stuck in Exstart/Exchange, but it's worth verifying the VPLS MTU by sending max-sized pings with do-not-fragment set from each side. Just to be sure.

    jtb
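
    A small sketch of the arithmetic behind a "max sized" ping, assuming plain IPv4 with no header options: the ICMP payload that just fits is the IP MTU minus the 20-byte IPv4 header and the 8-byte ICMP header. The MTU values below are examples only, not values taken from this network.

```python
IPV4_HEADER_LEN = 20  # bytes, assuming no IP options
ICMP_HEADER_LEN = 8   # bytes, ICMP echo header


def max_ping_payload(ip_mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = ip_mtu - IPV4_HEADER_LEN - ICMP_HEADER_LEN
    if payload < 0:
        raise ValueError(f"MTU {ip_mtu} is too small for an ICMP echo")
    return payload


for mtu in (1500, 1400):  # example MTUs
    print(mtu, max_ping_payload(mtu))
# 1500 1472
# 1400 1372
```

    On Junos the result would typically be used as the size in something like 'ping 172.31.254.68 size 1472 do-not-fragment', run from each side; if that size fails while a slightly smaller one succeeds, the path MTU is below what the interfaces assume.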