On Monday, October 20th, at 03:11 AM ET, AWS, ARC’s infrastructure provider, reported a global outage that affected several ARC API functions and impaired availability for customers in the us-east-1 region.
AWS reported elevated error rates in DNS resolution of the DynamoDB API endpoint, network connectivity issues, launch failures in virtual servers, and intermittent function errors while making network requests to other services.
All times are in ET and use a 24-hour clock.
| Time | Event |
|---|---|
| 03:11 | Increased error rates and latencies for multiple services |
| 03:47 | Confirmation of a global outage at the main infrastructure provider |
| 05:04 | Outage continues with the provider identifying the potential root cause |
| 05:55 | ARC Status Page update with systems performance |
| 07:17 | Systems are briefly functional |
| 09:44 | The systems were partially recovered |
| 10:28 | The platform reports elevated errors again, confirmed by the provider’s system status. |
| 11:40 | ARC Status Page update advising outage is still affecting our APIs. |
| 13:17 | Availability continues to be spotty |
| 14:39 | All deployments and code changes are frozen to avoid further issues for ARC’s customers. ARC’s infrastructure provider advises 2+ hours for full recovery. |
| 16:22 | Systems are still performing under stress and availability is limited. |
| 16:40 | Services start to recover |
| 18:15 | Confirmation that all systems have been fully recovered. |