Network Issue
Resolved
Dec 11 at 10:02pm CET
We have not seen any issues since our last update. We consider this incident to be resolved.
Incident Trigger
The incident was triggered by an internal network component that did not fully recover after a restart. Such a restart is normally a routine maintenance task with no impact on customers, but it failed this time.
We are investigating and testing further to determine the exact root cause, so that we can adjust our procedures, scripts and health checks to prevent a recurrence.
Impact
A subset of load balancers, NAT gateways and VMs was impacted by this incident because they were running on the affected physical machines.
Actions taken
We have moved all workloads off the impacted machines. In a few cases, autohealing was triggered for Kubernetes nodes and automatically replaced an unhealthy node.
If you have any questions, please reach out to our helpdesk (support@thalassa.cloud).
Affected services
Updated
Dec 11 at 09:28pm CET
Our monitoring indicates that all systems have recovered.
For some customers, Kubernetes node autohealing kicked in during the incident and automatically replaced the affected Kubernetes machines with new ones.
We are continuing to monitor and are investigating the root cause.
Updated
Dec 11 at 09:20pm CET
The issues appear to be isolated to a small subset of machines. We are currently investigating and moving workloads to healthy machines.
Created
Dec 11 at 09:16pm CET
We are currently experiencing a network issue and are investigating.