DATE(S):
2022/08/04 09:35 AM PDT - 2022/08/04 11:10 AM PDT
2022/08/04 12:00 PM PDT - 2022/08/04 03:30 PM PDT
SUMMARY AND IMPACT:
A subset of Zuora Billing customers in the NA2 production environment experienced intermittent performance degradation. The degradation manifested as slower-than-normal response times and/or 504 timeouts on Billing API, UI, and integration calls. Timeouts occurred predominantly on Billing SOAP API calls. Billing batch operations such as bill runs, journal runs, and payment runs were not impacted.
ROOT CAUSE:
Zuora detected that an underlying caching data store used by certain Billing services had reached its maximum resource usage. This caused slower-than-normal response times, which customers experienced as intermittent performance degradation.
The root cause of the resource exhaustion was a recent change introduced to the Billing application.
During the incident, neither our normal auto-scaling methods nor rolling restarts remediated the issue. The issue was fixed by optimizing the lookup calls to the cache.
RESOLUTION:
The impact was mitigated through the following actions:
Additional system level checks were completed to ensure that performance returned to optimal baseline levels.
FUTURE PREVENTATIVE MEASURES: