Resolved -
This incident has been resolved.
Feb 13, 13:04 UTC
Monitoring -
All workloads from the faulty node are now running on a new node.
Feb 13, 12:46 UTC
Identified -
The node has been cordoned and marked as unhealthy. Once a new node has started up, we will begin draining all workloads from the faulty node. The root cause is the kubelet having problems managing the Pod lifecycle, in particular termination. If draining makes the problem worse, all workloads on that node will be forcefully terminated. Downtime can be expected.
Feb 13, 12:37 UTC
Investigating -
We are currently investigating this issue.
Feb 13, 12:35 UTC