OVHcloud Network Status

FS#11823 — 192.95.32.0/24 and 192.95.33.0/24
Scheduled Maintenance Report for Network & Infrastructure
Completed
The pair of N5 switches that manages the two networks has crashed.

%SYSMGR-2-HAP_FAILURE_SUP_RESET: System reset due to service "eth_port_sec" in vdc 1 has had a hap failure

It has just rebooted and the fex is back up.

We are investigating.
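
For reading messages like the crash line above, here is a minimal Python sketch that splits a Cisco-style syslog tag into its parts. It assumes the layout `%FACILITY[-SUBFACILITY]-SEVERITY-MNEMONIC: text`, with the severity as a single digit from 0 to 7; the function name `parse_syslog` and the regular expression are illustrative and not part of any Cisco tooling.

```python
import re

# Assumed layout: %FACILITY[-SUBFACILITY]-SEVERITY-MNEMONIC: message
SYSLOG_RE = re.compile(
    r"%(?P<facility>[A-Z0-9_]+(?:-[A-Z0-9_]+)*?)"
    r"-(?P<severity>[0-7])"
    r"-(?P<mnemonic>[A-Z0-9_]+):\s*(?P<message>.*)"
)

# Standard syslog severity names, indexed by level.
SEVERITY_NAMES = [
    "emergency", "alert", "critical", "error",
    "warning", "notification", "informational", "debugging",
]


def parse_syslog(line):
    """Return the tag fields of a syslog line as a dict, or None if no tag is found."""
    match = SYSLOG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    fields["severity_name"] = SEVERITY_NAMES[fields["severity"]]
    return fields


if __name__ == "__main__":
    crash = ('%SYSMGR-2-HAP_FAILURE_SUP_RESET: System reset due to service '
             '"eth_port_sec" in vdc 1 has had a hap failure')
    print(parse_syslog(crash))
```

Under these assumptions, the crash message is a severity 2 (critical) event from the SYSMGR facility, and the service named in the text is eth_port_sec.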

Update(s):

Date: 2014-10-15 12:53:56 UTC
The port configuration is good; a shut / no shut resolved the problem for the
last servers.

Everything is operational again for 192.95.32.0/24 and 192.95.33.0/24.

Date: 2014-10-15 11:41:29 UTC
All FEX are UP again.

8 servers remain partially unreachable; I am looking into this.

Date: 2014-10-15 11:32:53 UTC
The FEX are returning little by little.

sh fex
FEX     FEX          FEX        FEX                Fex
Number  Description  State      Model              Serial
------------------------------------------------------------------------
100     FEX0100      Online     N2K-C2248TP-E-1GE  SSI16370ACF
101     FEX0101      Online     N2K-C2248TP-E-1GE  SSI16370ABZ
102     FEX0102      Connected  N2K-C2248TP-1GE    SSI1603063C
105     FEX0105      Connected  N2K-C2248TP-E-1GE  SSI16370AG7
109     FEX0109      Online     N2K-C2248TP-E-1GE  SSI16370EDR
111     FEX111       Online     N2K-C2248TP-E-1GE  SSI16370ED7
---     --------     Connected  N2K-C2248TP-E-1GE  SSI16370ECS
---     --------     Connected  N2K-C2248TP-E-1GE  SSI16370EDT
---     --------     Connected  N2K-C2248TP-1GE    SSI16080AR8
---     --------     Connected  N2K-C2248TP-E-1GE  SSI16370EDX
---     --------     Connected  N2K-C2248TP-E-1GE  SSI16370E7W

Date: 2014-10-15 11:24:07 UTC
The situation:

It is in a degraded but stable state; only the servers on FEX 102 (Bay T01C52) are impacted.
The servers on the other FEX are still accessible.

We hit a bug (in addition to the port-security one) during the ISSU upgrade, which blocked FEX 102 and the vPC.


sh fex
FEX     FEX          FEX            FEX                Fex
Number  Description  State          Model              Serial
------------------------------------------------------------------------
100     FEX0100      Online         N2K-C2248TP-E-1GE  SSI16370ACF
101     FEX0101      Online         N2K-C2248TP-E-1GE  SSI16370ABZ
102     FEX0102      Check Upg Seq  N2K-C2248TP-1GE    SSI1603063C
103     FEX0103      Online         N2K-C2248TP-1GE    SSI16080AR8
104     FEX0104      Online         N2K-C2248TP-E-1GE  SSI16370E7W
105     FEX0105      Online         N2K-C2248TP-E-1GE  SSI16370AG7
106     FEX0106      Online         N2K-C2248TP-E-1GE  SSI16370ECS
107     FEX0107      Online         N2K-C2248TP-E-1GE  SSI16370EDT
108     FEX0108      Online         N2K-C2248TP-E-1GE  SSI16370EDX
109     FEX0109      Online         N2K-C2248TP-E-1GE  SSI16370EDR
111     FEX111       Online         N2K-C2248TP-E-1GE  SSI16370ED7
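
To spot the impacted FEX without reading the table by eye, a pasted `sh fex` listing can be filtered with a short script. This is only a sketch: it assumes every model name starts with "N2K-" (true for the rows shown here), which lets a multi-word state such as "Check Upg Seq" be captured as one field. The names `parse_show_fex` and `not_online` are hypothetical helpers, not NX-OS commands.

```python
import re

# Assumption: the Model column always begins with "N2K-", so everything between
# the description and the model is the state (which may contain spaces).
ROW_RE = re.compile(r"^(\d+)\s+(\S+)\s+(.+?)\s+(N2K-\S+)\s+(\S+)$")


def parse_show_fex(text):
    """Parse pasted 'sh fex' output into a list of row dicts; non-data lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            number, description, state, model, serial = match.groups()
            rows.append({
                "number": int(number),
                "description": description,
                "state": state,
                "model": model,
                "serial": serial,
            })
    return rows


def not_online(rows):
    """Return the numbers of all FEX whose state is anything other than Online."""
    return [row["number"] for row in rows if row["state"] != "Online"]


if __name__ == "__main__":
    sample = """\
Number Description State Model Serial
101 FEX0101 Online N2K-C2248TP-E-1GE SSI16370ABZ
102 FEX0102 Check Upg Seq N2K-C2248TP-1GE SSI1603063C
103 FEX0103 Online N2K-C2248TP-1GE SSI16080AR8
"""
    print(not_online(parse_show_fex(sample)))
```

Run against the listing above, only FEX 102 would be reported, which matches the situation described in this update.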

Eth1/31 vPC nodUpgrad trunk full 10G SFP-H10GB-C
Eth1/32 vPC nodUpgrad trunk full 10G SFP-H10GB-C

=> Ethernet1/32 is down (LC upgrade in progress)

Ongoing actions:
We are going to update the 2nd N5, then reload the first N5.


FEX 100 and 101 will automatically switch to the 2nd N5, as it has already been updated.

FEX 103 to 111 will be reloaded, which will make their servers unavailable while the FEX reboot.

Date: 2014-10-15 10:45:31 UTC
0% -- FAIL. Return code -1.

Remaining action::
"Module(s) 103, 104, 105, 106, 107, 108, 109, 111 still need to be
upgraded".

Install has failed. Return code 0x40930020 (Non-disruptive upgrade of
a module failed).
Please identify the cause of the failure, and try 'install all' again.

The ISSU on FEX 102 is stuck,
and the 2nd N5 has just crashed with the same error.

Date: 2014-10-15 10:34:45 UTC
2014 Oct 15 11:54:43 sw %$ VDC-1 %$ %SATCTRL-FEX108-2-SATCTRL_IMAGE: FEX108 Image update complete. Install pending
2014 Oct 15 11:54:56 sw %$ VDC-1 %$ %SATCTRL-FEX107-2-SATCTRL_IMAGE: FEX107 Image update complete. Install pending
2014 Oct 15 11:55:37 sw %$ VDC-1 %$ %SATCTRL-FEX109-2-SATCTRL_IMAGE: FEX109 Image update c[####################] 100% -- SUCCESS

Module 100: Non-disruptive upgrading.
[# ] 0%

Date: 2014-10-15 10:33:28 UTC
show install all status
There is an on-going installation...
Enter Ctrl-C to go back to the prompt.

Continuing with installation process, please wait.
The login will be disabled until the installation is completed.

Performing supervisor state verification.
SUCCESS

Supervisor non-disruptive upgrade successful.

Pre-loading modules.

The FEX are being updated

Date: 2014-10-15 09:44:29 UTC
Compatibility check is done:
Module  bootable  Impact          Install-type  Reason
------  --------  --------------  ------------  ------
1       yes       non-disruptive  reset
3       yes       non-disruptive  rolling
100     yes       non-disruptive  rolling
101     yes       non-disruptive  rolling
102     yes       non-disruptive  rolling
103     yes       non-disruptive  rolling
104     yes       non-disruptive  rolling
105     yes       non-disruptive  rolling
106     yes       non-disruptive  rolling
107     yes       non-disruptive  rolling
108     yes       non-disruptive  rolling
109     yes       non-disruptive  rolling
111     yes       non-disruptive  rolling



Images will be upgraded according to following table:
Module  Image            Running-Version     New-Version         Upg-Required
------  ---------------  ------------------  ------------------  ------------
1       system           6.0(2)N2(2)         6.0(2)N2(5)         yes
1       kickstart        6.0(2)N2(2)         6.0(2)N2(5)         yes
1       bios             v3.6.0(05/09/2012)  v3.6.0(05/09/2012)  no
1       power-seq        v1.0                v3.0                yes
1       SFP-uC           v1.0.0.0            v1.0.0.0            no
3       power-seq        v2.0                v2.0                no
100     fexth            6.0(2)N2(2)         6.0(2)N2(5)         yes
101     fexth            6.0(2)N2(2)         6.0(2)N2(5)         yes
102     fexth            6.0(2)N2(2)         6.0(2)N2(5)         yes
103     fexth            6.0(2)N2(2)         6.0(2)N2(5)         yes
104     fexth            6.0(2)N2(2)         6.0(2)N2(5)         yes
105     fexth            6.0(2)N2(2)         6.0(2)N2(5)         yes
106     fexth            6.0(2)N2(2)         6.0(2)N2(5)         yes
107     fexth            6.0(2)N2(2)         6.0(2)N2(5)         yes
108     fexth            6.0(2)N2(2)         6.0(2)N2(5)         yes
109     fexth            6.0(2)N2(2)         6.0(2)N2(5)         yes
111     fexth            6.0(2)N2(2)         6.0(2)N2(5)         yes
1       microcontroller  v1.2.0.1            v1.2.0.1            no
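
The system, kickstart and fexth versions in this table can be compared programmatically. The sketch below is in Python and assumes the version string always has the shape digits.digits(digits)LETTERSdigits(digits), as in 6.0(2)N2(5); the helper names `parse_nxos_version` and `is_newer` are made up for illustration, and other formats in the table (bios, power-seq, and so on) are deliberately not handled.

```python
import re

# Assumed shape: <major>.<minor>(<maint>)<train letters><train number>(<rebuild>)
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)\)([A-Z]+)(\d+)\((\d+)\)$")


def parse_nxos_version(version):
    """Split a string such as '6.0(2)N2(5)' into a tuple that sorts naturally."""
    match = VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognised version format: {version!r}")
    major, minor, maint, train, train_no, rebuild = match.groups()
    return (int(major), int(minor), int(maint), train, int(train_no), int(rebuild))


def is_newer(candidate, running):
    """True if candidate sorts strictly after running under the assumed format."""
    return parse_nxos_version(candidate) > parse_nxos_version(running)


if __name__ == "__main__":
    print(parse_nxos_version("6.0(2)N2(5)"))
    print(is_newer("6.0(2)N2(5)", "6.0(2)N2(2)"))
```

Comparing tuples rather than raw strings avoids mistakes once any numeric field reaches two digits.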

Date: 2014-10-15 09:32:39 UTC
The images have been downloaded to the switch; I will begin the update.

Date: 2014-10-15 08:51:13 UTC
Note: Thanks to Cisco ISSU, the update can be performed live (hot),
and there should not be any outage.

Date: 2014-10-15 08:48:36 UTC
The switches are on version 6.0(2)N2(2).

It is still early in North America, so we will update the
pair to 6.0(2)N2(5), which includes port-security bug fixes.
Posted Oct 15, 2014 - 08:42 UTC