=====Updated June 26, 5:30 AM PST=====
We have completed the installation of new hardware and have brought all of the Curse Network websites back online. Thank you all for your patience and understanding during this trying process. Please know that we are going to continue to expand our infrastructure and work with some of the brightest minds in the world to guarantee that we can offer the best experience available on all Curse Network websites.
=====Updated June 25, 5:30 PM PST=====
We are putting our websites into maintenance mode briefly while we install vital hardware in our effort to bring all of the Curse Network websites online. We expect this maintenance period to last roughly one hour. Thank you for your continued patience and understanding.
As many of you know, the Curse family of sites was hit with an outage over the past 48 hours. We are extremely sorry for the downtime and have been working tirelessly to get everything back online and running smoothly.
An important piece of hardware and its backup system failed, causing all of our websites to go offline.
Around 7:30 AM PST on Wednesday, June 22nd, one of the storage area network (SAN) controller nodes in our Atlanta datacenter failed, causing all the sites on the Curse network to go offline. This is a highly redundant system with a backup controller that should have taken over automatically. However, it did not, despite reporting as healthy. After we replaced the failed controller, the new unit booted and began copying its configuration from its peer. Unfortunately, as soon as the configuration was copied, the secondary controller also died.
After replacing the second failed controller, we began powering on the servers that relied on the SAN for their data – all of the database servers and the network-attached storage (NAS) file server storing media, static content, and most web files. When the NAS server booted and reconnected to its volume on the SAN, it began to run a checkdisk command to make sure there were no errors on the drive. This proved to be a drawn-out process, and was the primary reason for the length of the downtime.
In addition, we hit yet another roadblock with our Linux servers. Both of the new controller nodes for the SAN had a newer firmware version, which prevented the databases on these servers from reading their storage. The vendor acknowledged this as a known issue and recommended a firmware upgrade to fix it. To ensure our data integrity, we are conducting a full backup of this storage before implementing the firmware upgrade. Once complete, the upgrade will bring the Linux-based databases online.
After the fix is applied, we will be able to restore the database to its pre-crash state and restore full functionality to all of our sites.
Your Personal Information
We can assure you that at no time during the hardware failure was any of your personal information compromised. We take the sacred trust you put in us with your information VERY seriously.
We are currently working hard to get all of our sites restored and functioning normally. We hope to have everything up and working again by tomorrow.
We realize that you depend on Curse for the information and add-ons that enhance your gameplay experience. As a way of thanking you for your patience and loyalty through this downtime, we are giving premium access to the Curse Client to all users starting on July 1st and running through July 5th. We will also be announcing something special for existing premium users shortly. In addition, all guilds on Wowstead.com will receive free premium access – stay tuned on WowStead for more information.
Once again, we sincerely apologize for the downtime and hope you’ll continue to enjoy all the great services Curse has to offer.
The Curse Team