Intermittent issues impacting Lever Hire

Incident Report for Lever

Postmortem

In the early morning of Feb 18, 2025 (Pacific Time), internal monitoring tools alerted Lever product engineers that a database instance was unavailable. Around 50% of customers may have initially experienced higher latency on Lever Hire and the Lever API.

A few hours later, compounding issues in the database replicas made Lever Hire inaccessible for those ~50% of customers for a few hours. The impacted customer accounts were unable to access Lever Hire and candidate-related Data API endpoints at all during these windows: Feb 18 from 5:55-10:05 PST (the longest outage, ~4 hours), 14:08-14:18, 14:48-14:52, and 21:52-22:09; and Feb 19 from 00:06-00:14. Lever-hosted job sites continued to work for all customers.
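As a rough tally of the outage windows listed above, the following sketch adds up their lengths. It assumes every time is PST, as stated for the first window, and that each window is a simple start-to-end interval; the variable and function names are illustrative only.

```python
from datetime import datetime

# Outage windows copied from the report. Assumption: all times are PST.
WINDOWS = [
    ("2025-02-18 05:55", "2025-02-18 10:05"),
    ("2025-02-18 14:08", "2025-02-18 14:18"),
    ("2025-02-18 14:48", "2025-02-18 14:52"),
    ("2025-02-18 21:52", "2025-02-18 22:09"),
    ("2025-02-19 00:06", "2025-02-19 00:14"),
]


def minutes_between(start: str, end: str) -> int:
    """Return the whole minutes between two 'YYYY-MM-DD HH:MM' timestamps."""
    fmt = "%Y-%m-%d %H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


durations = [minutes_between(start, end) for start, end in WINDOWS]
total = sum(durations)

print(durations)                             # [250, 10, 4, 17, 8]
print(f"{total // 60} h {total % 60} min")   # 4 h 49 min
```

By this count the longest window lasted 4 hours 10 minutes, and the five windows together add up to a little under 5 hours.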

Lever product engineers were engaged to investigate, and troubleshooting pointed to database issues caused by unusual external Lever API load. The issue was resolved by:

Rebuilding the affected database replicas

Spreading Lever API load across additional database replicas

As part of these database recovery and mitigation measures, the Lever API also became inaccessible for all customers for a few hours. The database rebuild also caused some Lever Hire pipeline numbers and search results to be temporarily out of sync.

To prevent a recurrence and reduce the risk of future impact, the following measures have been put in place:

Per above, spreading Lever API load across additional database replicas

Limiting database time for individual Lever API requests, to prevent a few individual requests from having a wider impact

Optimizing database query performance
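The first two measures above can be pictured with a minimal sketch. It is not Lever's implementation: the report does not describe the mechanism, so `ReplicaPool`, `DbTimeBudget`, and the replica names are hypothetical, and round-robin is just one assumed way to spread load. The budget here is only checked between queries; a production system would more likely also enforce a server-side statement timeout.

```python
import itertools
import time


class DbTimeBudgetExceeded(Exception):
    """Raised when one API request has used up its database time allowance."""


class ReplicaPool:
    """Hypothetical pool that hands out read replicas in round-robin order."""

    def __init__(self, replica_names):
        if not replica_names:
            raise ValueError("at least one replica is required")
        self._cycle = itertools.cycle(list(replica_names))

    def next_replica(self):
        return next(self._cycle)


class DbTimeBudget:
    """Hypothetical per-request cap on cumulative time spent in queries."""

    def __init__(self, limit_seconds, clock=time.monotonic):
        self.limit_seconds = limit_seconds
        self.used_seconds = 0.0
        self._clock = clock  # injectable so the budget can be tested

    def run(self, query_fn, *args):
        # Refuse new queries once the request has exhausted its budget.
        if self.used_seconds >= self.limit_seconds:
            raise DbTimeBudgetExceeded(
                f"used {self.used_seconds:.3f}s of {self.limit_seconds:.3f}s"
            )
        start = self._clock()
        try:
            return query_fn(*args)
        finally:
            # Charge the elapsed time even if the query raised.
            self.used_seconds += self._clock() - start


if __name__ == "__main__":
    pool = ReplicaPool(["replica-a", "replica-b", "replica-c"])
    budget = DbTimeBudget(limit_seconds=2.0)

    def fake_query(replica):
        # Stand-in for a real database call against the chosen replica.
        return f"rows from {replica}"

    for _ in range(4):
        print(budget.run(fake_query, pool.next_replica()))
```

Keeping the budget per request means a single expensive request is cut off on its own rather than slowing every customer that shares the replica.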

Posted Mar 19, 2025 - 13:27 PDT

Resolved

The issue causing 500 errors when attempting to access Lever has been resolved. There should be no further impact at this time, but please reach out to us at Support if any additional assistance is needed: https://help.lever.co/hc/en-us/requests/new
Posted Feb 21, 2025 - 17:16 PST

Monitoring

Our team was able to stabilize the hire.lever.co platform. Customers should now be able to access the Lever tool without encountering the 500 error messages. We’re currently monitoring to ensure that there are no further issues, and we’ll send a final update to confirm that there have been no recurrences.
Posted Feb 19, 2025 - 09:23 PST

Identified

Our team has identified the issue and is working to implement a resolution. Users may intermittently experience 500 error messages when attempting to access hire.lever.co. Our Engineering teams are actively working to resolve this as soon as possible.
Posted Feb 19, 2025 - 05:43 PST

Monitoring

Our team was able to stabilize the hire.lever.co platform. Customers should now be able to access the Lever tool without encountering the 500 error messages. We’re currently monitoring to ensure that there are no further issues, and we’ll send a final update to confirm that there have been no recurrences.
Posted Feb 18, 2025 - 14:45 PST

Investigating

We are continuing our investigation into an issue that is intermittently affecting the hire.lever.co platform. Users may experience 500 error messages when attempting to access hire.lever.co. Our Engineering teams are actively working to resolve this as soon as possible.
Posted Feb 18, 2025 - 14:20 PST
This incident affected: Global Data Center - LeverTRM (Hire).