There is a problem on the chassis. The card seems defective and is blocking the chassis.
Update(s):
Date: 2011-08-04 08:52:31 UTC The intervention did not go smoothly because of a version problem on the new card. We were forced to restart the chassis with another cold reboot. The chassis is in a stable condition, but we suspect another card to be the source of the problem. So, as a precaution, we are replacing card 3.
Date: 2011-08-04 08:47:34 UTC We are inserting a new card in slot #2.
Date: 2011-08-03 13:35:09 UTC We are currently running on one sup in slot 1. Apparently at least one of the spare cards inserted yesterday was defective. We are retesting all the cards in the lab and are planning an intervention tonight to insert a new sup card in slot 2. We may also replace cards 3 and 4 as a preventive measure.
Date: 2011-08-03 13:19:56 UTC We are restarting on a new card in slot 1, with a single sup. We are pushing the configuration down again from the backup.
Date: 2011-08-03 13:15:36 UTC Card #1 is not restarting:
Local Test Mode encounters Minor hardware problem in Module # 1
Supervisor module 1 encontered CRITICAL failure: 0x1e - EARL_FAILURE L3_FAILURE RWENGINE_FAILURE L2_FAILURE
Failed Module Bringup Process
Use 'show test 1' to see results of tests.
Use 'reset 1' to reset the module.
We are trying to restart the chassis without cards 3 and 4, which are the last elements in common with the previous setup.
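The failure code 0x1e in the console output above has four bits set, which matches the four failure names printed after it. Here is a minimal sketch of that arithmetic. It assumes, without confirmation from the output itself, that each set bit stands for one named subsystem. The bit-to-name mapping is not given, so the code only lists the bit positions and compares the counts. The helper name `set_bits` is hypothetical.

```python
def set_bits(value):
    """Return the positions (0 = least significant) of the bits set in value."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


# Values copied from the console output; which bit means which failure is unknown.
code = 0x1E
names = ["EARL_FAILURE", "L3_FAILURE", "RWENGINE_FAILURE", "L2_FAILURE"]

bits = set_bits(code)
print(bits)                     # [1, 2, 3, 4]
print(len(bits) == len(names))  # True
```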
Date: 2011-08-03 13:14:05 UTC card1:
*** Bus Timeout NMI ***
PC = 0x80b808c8, SP = 0x87fff110 frame = 0xa0005ea8
*** Unknown External Interrupt ***
Stacked Cause = 0x800, Stacked Status Reg = 0x2441fc03
Current Cause IP[7..0] = 0x8, Current SREG IP[7..0] = 0xfc
Date: 2011-08-03 13:13:53 UTC card1:
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
PC = 0xbfc0a6f4, Cause = 0x4c00, Status Reg = 0x2441fc03
Date: 2011-08-03 13:13:35 UTC Another crash. We are cold-restarting the chassis. Cards 3 and 4 have still not been replaced.
Date: 2011-08-03 11:54:21 UTC Card #2 took over while #1 was rebooting. A card other than the sups is probably the source of the problems encountered since last night. We are going to replace card #5.
Date: 2011-08-03 11:52:31 UTC Aug 3 13:15:38 rbx-31-c1.routers.ovh.net 2011 Aug 03 11:15:17 %SYS-4-SUPERVISOR_ERR:Forwarding engine IP checksum error counter = 6
Aug 3 13:15:35 rbx-31-c1.routers.ovh.net 2011 Aug 03 11:15:14 %SYS-5-MOD_OK:Module 16(WS-F6K-MSFC,SAD040604MY) is online
Aug 3 13:15:34 rbx-31-c1.routers.ovh.net 2011 Aug 03 11:15:13 %SYS-3-MOD_PORTINTFINSYNC:Port Interface in sync for Module 2
Aug 3 13:15:34 rbx-31-c1.routers.ovh.net 2011 Aug 03 11:15:12 %SYS-5-MOD_OK:Module 5(WS-X6408A-GBIC,SAD05030JDD) is online
Aug 3 13:15:32 rbx-31-m2.routers.ovh.net 58: Aug 3 13:15:13 GMT: %SCP-5-ONLINE: Module online (supervisor switchover)
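The messages above carry Cisco-style `%FACILITY-SEVERITY-MNEMONIC` tags, where the severity runs from 0 (most severe) to 7. Here is a minimal sketch for pulling those fields out of lines like these when going through a long log. The regular expression and the `parse_tag` helper are illustrative assumptions, not a tool used during the incident.

```python
import re

# %FACILITY-SEVERITY-MNEMONIC: text  (severity is a single digit, 0 to 7)
TAG = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+):\s*(?P<text>.*)"
)


def parse_tag(line):
    """Return the tag fields of a log line as a dict, or None if it has no tag."""
    match = TAG.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    return fields


# Message parts copied from the log lines above.
SAMPLE = [
    "2011 Aug 03 11:15:17 %SYS-4-SUPERVISOR_ERR:Forwarding engine IP checksum error counter = 6",
    "2011 Aug 03 11:15:13 %SYS-3-MOD_PORTINTFINSYNC:Port Interface in sync for Module 2",
    "58: Aug 3 13:15:13 GMT: %SCP-5-ONLINE: Module online (supervisor switchover)",
]

parsed = [parse_tag(line) for line in SAMPLE]
# Lowest number = most severe message in the sample.
worst = min(parsed, key=lambda fields: fields["severity"])
print(worst["mnemonic"], worst["severity"])
```

Filtering on `severity` this way can help separate informational messages, such as modules coming online, from the error counters that matter while diagnosing.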
Date: 2011-08-03 11:52:24 UTC Card #1 has just crashed again.
Date: 2011-08-03 11:51:44 UTC It is going to be alright tonight :)
Date: 2011-08-03 11:51:11 UTC We need to find the problem's origin. We are testing different cards in the chassis.
http://yfrog.com/kl9ambvj
We put a new card in #2, but the card did not switch on. We changed the power for that slot in the chassis, and it is switching on:
All right, it's the chassis. We are going to replace it: we take the cards out of the chassis, remove the chassis from the rack from the back,
then reinsert the new chassis from the back and reinsert the cards.
http://yfrog.com/kepqdsyj
Date: 2011-08-02 20:57:46 UTC The chassis and both sups are fried. We have replaced them all and had to reset the router entirely. The service is up on #1. We are finishing with #2.
Quite a workout ...
Date: 2011-08-02 20:55:42 UTC We have taken a spare chassis and replaced the faulty one with it.
Date: 2011-08-02 20:54:47 UTC EOBC channel failure on #2
Date: 2011-08-02 20:54:33 UTC #1 is continuing to boot.
#2 is booting too.
Date: 2011-08-02 20:53:43 UTC Removing #4 blocked the boot, so we suspect it is the source of the problem.
Date: 2011-08-02 20:52:39 UTC #2 is dead.
We are putting #1 back in and removing the other cards. We are trying to boot #1 first and check whether it works.
Date: 2011-08-02 20:49:41 UTC The boot is in progress.
We are taking out #1 and putting in #2.
Date: 2011-08-02 20:48:48 UTC Meanwhile, we are preparing a spare for card #2.
At least one card is defective.
Date: 2011-08-02 20:48:04 UTC Uptime is 1051 days, 15 hours, 22 minutes
Date: 2011-08-02 20:47:56 UTC Card #2 is out. The other cards are no longer being detected.
We are doing a hardware reboot.