On Sunday, January 13th, from 17:54 to 23:03 ET, some clients experienced degraded performance with Content API Search and slowness when loading content in Composer. Infrastructure was scaled up and performance was restored.
On January 13th, a workload with poor performance characteristics overloaded part of the Arc XP infrastructure, causing slow content loading in Composer and degraded performance for search requests. Infrastructure was scaled up, and additional rate limits were put in place to speed up recovery. Once the scaling concluded and performance was restored, the limits were lifted and the system returned to its nominal state.
To prevent recurrence, additional restrictions were put in place on the offending workload.
All times are in ET and use the 24-hour clock.
Time | Event |
---|---|
16:30 | Cluster write load starts to show degraded performance |
17:20 | Automated alerts notify engineers of instability in Content API. The team starts investigating |
19:31 | Cluster resources are upscaled |
20:20 | First customer opens a ticket about slowness in Composer |
21:38 | Rate limits added |
22:45 | Infrastructure scaling complete |
22:50 | All metrics returned to safe levels |
23:00 | All customer traffic restored |
23:03 | System fully restored |