Spike in Pipedream Internal Errors for workflows

Incident Report for Pipedream

Resolved

All systems are operational. Incoming events should be processed successfully. We've also processed the backlog of events that arrived during the first part of the incident, from ~1:00 AM UTC to 3:51 AM UTC.

From 3:51 AM UTC to 6:04 AM UTC, we had to disable incoming events because of the load issues we were experiencing. Events sent during this time may be retried by the source services. At 6:04 AM UTC, we re-enabled the collection of incoming events.

We'll follow up with a detailed retrospective of this incident as soon as possible.

Posted Feb 14, 2023 - 07:06 UTC

Monitoring

AWS has shipped a fix for the issue. We're restarting our services and bringing workflows back online. We'll send another update as soon as that's done.

Posted Feb 14, 2023 - 06:02 UTC

Update

AWS is still working on a patch for the issue, and we remain in active communication with them.

Posted Feb 14, 2023 - 05:56 UTC

Update

The AWS Lambda team confirmed the issue was due to the scale at which Pipedream runs on Lambda. They're working on a fix now. We'll update this incident again soon.

Posted Feb 14, 2023 - 04:49 UTC

Update

We are continuing to work on a fix for this issue.

Posted Feb 14, 2023 - 04:09 UTC

Update

AWS has escalated to more teams internally. The issue is still ongoing.

Posted Feb 14, 2023 - 03:33 UTC

Update

We're continuing to discuss this with the AWS team. The issue is still ongoing.

Posted Feb 14, 2023 - 03:01 UTC

Identified

AWS Lambda — part of the service we use to run workflows — has identified a service issue. They're working on it, and we'll communicate updates here.

Posted Feb 14, 2023 - 02:27 UTC

Investigating

We're seeing a spike in Pipedream Internal Errors across workflows, and we're investigating.

Posted Feb 14, 2023 - 02:04 UTC

This incident affected: Frontend (Pipedream) and Backend (Workflows - Timer, Workflows - HTTP, Workflows - Email).