
FS#340 — FS#4421 — ams-1-6k

Attached to Project— Network
Incident
Whole Network
CLOSED
100%
After "wrr-queue" was configured on all 10G network interfaces of the router ams-1-6k, card 2 started reporting errors.

Jul 29 08:52:14 GMT: %PM_SCP-SP-2-LCP_FW_ERR_INFORM: Module 2 is experiencing the following error: RO[2] (166004 noncritical int in the last 10s, they are now disabled). ROINTMSK[2]:
2E9=0xC,00F=0x728,024=0x1FFF,0E8=0x4,052=0x0,04C=0x1E,049=0x0,09D=0x2FFF,009=0x0,00C=0x0,

Traffic passing through this card was impacted. We shut down the port and traffic recovered. We are now restarting the card.

Date:  Friday, 30 July 2010, 00:43AM
Reason for closing:  Done
Comment by OVH - Thursday, 29 July 2010, 10:21AM

Jul 29 10:00:31 40G.ams-1-6k.routers.ovh.net 6687: Jul 29 09:00:10 GMT: %SYS-3-CPUHOG: Task is running for (2000)msecs, more than (2000)msecs (33/3),process = CEF Reloader.
Jul 29 10:00:31 40G.ams-1-6k.routers.ovh.net 6688: -Traceback= 41D7B360 41042F5C 413C3E60 413C487C 413C4F48 41044C40 41044C2C
Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6689: Jul 29 09:00:12 GMT: %SYS-2-MALLOCFAIL: Memory allocation of 65536 bytes failed from 0x410433D8, alignment 8
Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6690: Pool: Processor Free: 7057952 Cause: Memory fragmentation
Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6691: Alternate Pool: None Free: 0 Cause: No Alternate pool
Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6692: -Process= "CEF Reloader", ipl= 0, pid= 146
Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6693: -Traceback= 4102AD28 41030958 410433E0 413C26A0 413C3E04 413C487C 413C4F48 41044C40 41044C2C
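The MALLOCFAIL entries above give both the requested size and the free bytes left in the pool. The following is a minimal Python sketch, assuming the syslog layout shown in this ticket; the helper name parse_mallocfail and the regular expressions are illustrative and not part of any OVH or Cisco tool. It extracts both numbers to show that the allocation failed even though the free total was far larger than the request, which matches the "Memory fragmentation" cause in the log.

```python
import re

# Patterns assume the message layout seen in this ticket's logs.
REQUEST_RE = re.compile(r"%SYS-2-MALLOCFAIL: Memory allocation of (\d+) bytes failed")
POOL_RE = re.compile(r"Pool: (\w+) Free: (\d+) Cause: (.+)$")


def parse_mallocfail(lines):
    """Return one dict per MALLOCFAIL event (hypothetical helper)."""
    events = []
    current = None
    for line in lines:
        m = REQUEST_RE.search(line)
        if m:
            current = {"requested": int(m.group(1))}
            events.append(current)
            continue
        # Only the primary pool line is used; the "Alternate Pool" line is skipped.
        if current is not None and "Alternate Pool" not in line:
            m = POOL_RE.search(line)
            if m:
                current["pool"] = m.group(1)
                current["free"] = int(m.group(2))
                current["cause"] = m.group(3).strip()
                current = None
    return events


# Three lines copied from the log above.
SAMPLE = [
    "Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6689: Jul 29 09:00:12 GMT: "
    "%SYS-2-MALLOCFAIL: Memory allocation of 65536 bytes failed from 0x410433D8, alignment 8",
    "Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6690: Pool: Processor Free: 7057952 "
    "Cause: Memory fragmentation",
    "Jul 29 10:00:33 40G.ams-1-6k.routers.ovh.net 6691: Alternate Pool: None Free: 0 "
    "Cause: No Alternate pool",
]

events = parse_mallocfail(SAMPLE)
for ev in events:
    # Free bytes exceed the request many times over, yet no contiguous block was found.
    print(ev["pool"], ev["requested"], ev["free"], ev["cause"])
```
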


Comment by OVH - Thursday, 29 July 2010, 10:22AM

ams-1-6k#sh mem stat
Head Total(b) Used(b) Free(b) Lowest(b) Largest(b)
Processor 44B1D6B0 927836496 891342240 36494256 0 4132240
I/O 8000000 67108864 11948344 55160520 53479168 55056824
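To read the "sh mem stat" table above more easily, here is a small Python sketch that turns it into per-pool figures. It assumes the seven-column layout pasted in this ticket (pool name, head address, then five byte counters); the function name parse_mem_stat is hypothetical. It reports the share of memory still free and the largest contiguous free block for each pool.

```python
def parse_mem_stat(text):
    """Parse the 'sh mem stat' rows shown in this ticket (illustrative helper)."""
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows: name, head (hex), then five decimal byte counters.
        if len(parts) == 7 and all(p.isdigit() for p in parts[2:]):
            total, used, free, lowest, largest = map(int, parts[2:])
            pools[parts[0]] = {
                "total": total,
                "used": used,
                "free": free,
                "lowest": lowest,
                "largest": largest,
                "free_pct": 100.0 * free / total,
            }
    return pools


# Output copied from the comment above.
MEM_STAT = """\
Head Total(b) Used(b) Free(b) Lowest(b) Largest(b)
Processor 44B1D6B0 927836496 891342240 36494256 0 4132240
I/O 8000000 67108864 11948344 55160520 53479168 55056824
"""

pools = parse_mem_stat(MEM_STAT)
for name, p in pools.items():
    print(f"{name}: {p['free_pct']:.1f}% free, largest free block {p['largest']} bytes")
```

On these numbers the Processor pool has just under 4% of its memory free, and the "Lowest(b)" column shows it had already reached 0 free bytes at some point since boot.
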


Comment by OVH - Thursday, 29 July 2010, 10:22AM

Jul 29 10:02:54 40G.ams-1-6k.routers.ovh.net 6703: Jul 29 09:02:34 GMT: %QM-2-TCAM_BAD_LOU: Bad TCAM LOU operation in ACL


Comment by OVH - Thursday, 29 July 2010, 10:22AM

The router has crashed.

Jul 29 10:07:09 40G.ams-1-6k.routers.ovh.net 6774: Jul 29 09:06:51 GMT: %C6KFIB-4-DISABLED: Hardware FIB forwarding disabled, reverting to only software forwarding.
Jul 29 10:07:13 40G.ams-1-6k.routers.ovh.net 6775: Jul 29 09:06:53 GMT: %FIB-2-FIBDOWN: CEF has been disabled due to a low memory condition.
Jul 29 10:07:13 40G.ams-1-6k.routers.ovh.net 6776: It can be re-enabled by configuring "ip cef [distributed]"

We are isolating it from the network.


Comment by OVH - Thursday, 29 July 2010, 10:25AM

Isolating the router caused a service interruption.

We have regained access to the router. We are saving the configuration and restarting it.


Comment by OVH - Thursday, 29 July 2010, 14:48PM

We are taking this opportunity to upgrade IOS to a newer version, 17a.


Comment by OVH - Thursday, 29 July 2010, 14:49PM

The router is back up. We will put it back into the backbone.


Comment by OVH - Thursday, 29 July 2010, 14:50PM

ams-1-6k#sh mem stat
Head Total(b) Used(b) Free(b) Lowest(b) Largest(b)
Processor 44B219D0 927819312 715646936 212172376 0 1836520
I/O 8000000 67108864 12821888 54286976 54219792 54104760
ams-1-6k#reload

System configuration has been modified. Save? [yes/no]:
% Please answer 'yes' or 'no'.

System configuration has been modified. Save? [yes/no]:
% Please answer 'yes' or 'no'.

System configuration has been modified. Save? [yes/no]:
% Please answer 'yes' or 'no'.

System configuration has been modified. Save? [yes/no]: no
Proceed with reload? [confirm]
Connection closed by foreign host.

The router crashed again. We managed to reload it.



Comment by OVH - Thursday, 29 July 2010, 14:51PM

Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 920: Jul 29 10:40:10 GMT: %SYS-2-MALLOCFAIL: Memory allocation of 65536 bytes failed from 0x41044EEC, alignment 8
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 921: Pool: Processor Free: 1395584 Cause: Memory fragmentation
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 922: Alternate Pool: None Free: 0 Cause: No Alternate pool
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 923: -Process= "IP RIB Update", ipl= 0, pid= 164
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 924: -Traceback= 4102C83C 4103246C 41044EF4 413C2334 413C2578 4228B548 40641B40 42307BD0 409D3998 4098445C 4098457C
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 925: Jul 29 10:40:13 GMT: %FIB-3-NORPXDRQELEMS: Exhausted XDR queuing elements while preparing message for slot/cpu 6/0
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 926: -Process= "IP RIB Update", ipl= 0, pid= 164
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 927: -Traceback= 413C273C 4228B548 40641B40 42307BD0 409D3998 4098445C 4098457C
Jul 29 11:40:34 40G.ams-1-6k.routers.ovh.net 928: Jul 29 10:40:13 GMT: %FIB-3-UPDATEFAIL: Update of prefix 124.138.241.0/-256 failed, resulting in it being deleted.
Jul 29 11:40:48 40G.ams-1-6k.routers.ovh.net 929: Jul 29 10:40:17 GMT: %FIB-3-NOMEM: Malloc Failure, disabling DCEF
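One detail in the UPDATEFAIL line above is the prefix 124.138.241.0/-256: an IPv4 prefix length must lie between 0 and 32, so -256 cannot be a real mask length. A short Python sketch using the standard ipaddress module makes that check explicit; the helper name is_valid_prefix is an assumption for illustration only.

```python
import ipaddress


def is_valid_prefix(prefix):
    """Return True if the string is a well-formed IP network (illustrative helper)."""
    try:
        # strict=False accepts host bits set; only the syntax and mask length matter here.
        ipaddress.ip_network(prefix, strict=False)
        return True
    except ValueError:
        return False


logged_prefix_ok = is_valid_prefix("124.138.241.0/-256")  # prefix as logged above
reference_ok = is_valid_prefix("124.138.241.0/24")        # a well-formed prefix for comparison
print(logged_prefix_ok, reference_ok)
```
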


Comment by OVH - Thursday, 29 July 2010, 14:54PM

A hardware fault is almost certainly the cause of these problems.
We will go on site to replace the hardware: either the 10G card,
the sup, or both. It is a 3-hour drive
from Roubaix. In the meantime, traffic will go through London and Frankfurt.




Comment by OVH - Thursday, 29 July 2010, 18:15PM

We will replace the 10G card, then restart the router
and bring the traffic back. We will see whether the router crashes again. If
it does, which is the more likely outcome, we will replace the sup
card.



Comment by OVH - Thursday, 29 July 2010, 18:22PM

We will reload the router and put it back into production.


Comment by OVH - Friday, 30 July 2010, 00:43AM

Done.

However, we have had to disable MPLS. With MPLS enabled the router does not have enough RAM and crashes. It comes down to about 10 MB ... so we have deactivated MPLS on ldn-1 as well.

The router is stable.

What a day ...