Performance degradation in a subset of EU Production tenants
Incident Report for Zuora
Postmortem

TIMEFRAME:
15/Apr/2021 3:40PM PDT to 15/Apr/2021 5:55PM PDT

SUMMARY AND IMPACT:
During the above timeframe, a subset of customers in our EU Cloud data center experienced delays in selected asynchronous batch operations, such as Data Query, Export API, and AQuA API.

ROOT CAUSE:
A single customer workflow triggered many concurrent Data Query jobs whose queries were not optimized for performance. This degraded the response times (latency) of our read-only database and caused longer-than-expected processing times for the affected services.

RESOLUTION:
Once the source of the performance issue was identified, the queries causing the impact were stopped. Following a thorough review and optimization, they were rerun successfully.

FUTURE PREVENTATIVE MEASURES:

  • Work with the customer directly to better optimize their workflow and queries
  • Improve concurrency / rate-limiting systems for this service
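
One way the concurrency measure above could be approached is a per-tenant cap on simultaneously running jobs. The following is a minimal Python sketch for illustration only, not a description of Zuora's actual systems: the class name `PerTenantConcurrencyLimiter`, its methods, the tenant IDs, and the limit of 2 are all hypothetical.

```python
import threading
from collections import defaultdict


class PerTenantConcurrencyLimiter:
    """Caps how many jobs each tenant may run at once (hypothetical helper)."""

    def __init__(self, max_concurrent_per_tenant: int):
        if max_concurrent_per_tenant < 1:
            raise ValueError("max_concurrent_per_tenant must be at least 1")
        self._limit = max_concurrent_per_tenant
        self._running = defaultdict(int)  # tenant_id -> jobs currently running
        self._lock = threading.Lock()     # guards _running across threads

    def try_acquire(self, tenant_id: str) -> bool:
        """Reserve a slot for the tenant; return False if it is at its cap."""
        with self._lock:
            if self._running[tenant_id] >= self._limit:
                return False
            self._running[tenant_id] += 1
            return True

    def release(self, tenant_id: str) -> None:
        """Free a slot once one of the tenant's jobs has finished."""
        with self._lock:
            if self._running[tenant_id] <= 0:
                raise RuntimeError(f"no running jobs recorded for {tenant_id!r}")
            self._running[tenant_id] -= 1

    def running(self, tenant_id: str) -> int:
        """Return how many jobs the tenant currently has in flight."""
        with self._lock:
            return self._running.get(tenant_id, 0)


# Assumed scenario: one tenant submits three jobs while the cap is two.
limiter = PerTenantConcurrencyLimiter(max_concurrent_per_tenant=2)
accepted = [limiter.try_acquire("tenant-a") for _ in range(3)]
# A different tenant is counted separately from tenant-a.
other_tenant_accepted = limiter.try_acquire("tenant-b")
```

Rejecting or queueing jobs above the cap keeps one tenant's burst from consuming shared read-only database capacity. A production system would likely also need time-based rate limits and state shared across workers, which this sketch omits.
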
Posted Apr 20, 2021 - 08:56 PDT

Resolved
This incident has been resolved.
Posted Apr 15, 2021 - 18:10 PDT

Investigating
We are experiencing performance degradation in a small subset of EU production tenants, impacting export jobs, batch query, and reporting functions, starting at 3:47 PM PDT.
Our engineering teams are currently investigating the issue.
Posted Apr 15, 2021 - 16:39 PDT
This incident affected: EUROPE - CLOUD 1 (EU1) - *.eu.zuora.com (Production Batch Operations).