Thursday, 2016-06-23

Scheduled Maintenance: Security Upgrades and Patches

UPDATE: New time.

Date & Time: Thursday June 23, 3am CST (everytimezone.com)

Duration: 1 hour.

Downtime: No.

Purpose: We'll be upgrading our production environment with the latest security patches, as well as updating a webserver configuration to prevent occasional errors during code deploys.

Additional Information: As part of this upgrade, we will be upgrading OpenSSL to the latest version. Though we consider it extremely unlikely, if you're using a very old SSL client on your server, you may have difficulty connecting to our systems via the API. (There should be no impact at all to browsers.)

--

Resolution: We will add a note here when the maintenance window has closed.

  • 1:45am CDT: We have begun the system upgrades.
  • 1:52am CDT: All upgrades have been performed successfully.
  • 2:19am CDT: We have identified a problem that we believe to be related to locales. This is causing errors with certain stores.
  • 2:33am CDT: We believe we have resolved the issue related to locales and currency. We are continuing to monitor, and will post more details as they become available.
  • 4:21am CDT: We have confirmed the fix. We will continue to diagnose the root cause.

Root Cause Analysis: The scheduled security upgrades included a patch to part of the Linux kernel. Unbeknownst to us, updating this particular component requires locales to be reinstalled.

Although our deploy automation does reinstall locales automatically, the automation and the upgrade were timed such that the following happened:

  1. The Linux kernel patch was applied, resetting locales.
  2. At some point within the next 20 seconds, our automation restarted our webservers' PHP processes, which picked up the reduced number of locales.
  3. 20 seconds after step #1, locales were reinstalled automatically.

Unfortunately, PHP wasn't restarted after the locales were reinstalled, which resulted in the 50 minutes of locale issues.
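The sequence above can be illustrated with a toy model. This is only a sketch: it assumes a worker process reads the set of available locales once, at startup, and never re-reads it. The class names and locale names are hypothetical and do not describe the actual production setup.

```python
# Toy model of the race: a worker snapshots the locale list when it starts.
# All names here are hypothetical and for illustration only.

class System:
    """Stands in for the server; tracks which locales are installed."""
    def __init__(self, locales):
        self.locales = set(locales)


class PhpWorker:
    """Stands in for a PHP process; it sees only the locales present at start."""
    def __init__(self, system):
        self.visible_locales = frozenset(system.locales)


full_set = {"C", "en_US.utf8", "fr_FR.utf8"}
system = System(full_set)

system.locales = {"C"}          # Step 1: the patch resets locales.
worker = PhpWorker(system)      # Step 2: PHP restarts and sees the reduced set.
system.locales = set(full_set)  # Step 3: locales are reinstalled.

# The worker was not restarted after step 3, so it still lacks these locales.
stale = system.locales - worker.visible_locales
print(sorted(stale))
```

In this model the only way to clear `stale` is to create a new `PhpWorker` after step 3, which corresponds to restarting PHP once the locales are back.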

We are updating our automation to ensure PHP is not restarted if the required locales aren't present (step #2 above), and to ensure that PHP is restarted in the event additional locales are installed during the automation.
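A minimal sketch of the two checks described above, assuming the required locales are known ahead of time and that installed locales can be listed with the standard `locale -a` command. `REQUIRED_LOCALES`, `safe_to_restart`, and `restart_needed` are hypothetical names chosen for illustration, not part of the real automation.

```python
import subprocess

# Hypothetical list of locales the application depends on.
REQUIRED_LOCALES = {"en_US.utf8", "fr_FR.utf8", "de_DE.utf8"}


def normalize(name):
    """Make 'en_US.UTF-8' and 'en_US.utf8' compare equal."""
    base, _, codeset = name.strip().partition(".")
    if not codeset:
        return base
    return base + "." + codeset.lower().replace("-", "")


def installed_locales():
    """Return the locales currently installed, as reported by `locale -a`."""
    result = subprocess.run(
        ["locale", "-a"], capture_output=True, text=True, check=True
    )
    return {normalize(line) for line in result.stdout.splitlines() if line.strip()}


def safe_to_restart(installed, required=REQUIRED_LOCALES):
    """Check 1: only restart PHP when every required locale is present."""
    have = {normalize(name) for name in installed}
    return {normalize(name) for name in required} <= have


def restart_needed(before, after):
    """Check 2: restart PHP if locales were added during the automation run."""
    added = {normalize(n) for n in after} - {normalize(n) for n in before}
    return bool(added)
```

A deploy script could capture `installed_locales()` at the start and end of a run, skip the PHP restart while `safe_to_restart` is false, and force one when `restart_needed` is true.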