Scaleway
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Jul 22, 2024 - 09:00 CEST
Scheduled - Maintenance will be performed on mutualized Kapsule/Kosmos control planes, to enhance reliability.

It is scheduled from July 22 to July 25. During this period, you may experience brief control plane unavailability (less than 2 minutes).

Below is the detailed schedule (a short sketch for checking which day applies to your cluster follows the list):
July 22, 9 AM - 6 PM: Clusters in fr-par with IDs starting with 0, 1, 2, or 3
July 23, 9 AM - 6 PM: Clusters in fr-par with IDs starting with 4, 5, 6, or 7
July 24, 9 AM - 6 PM: Clusters in fr-par with IDs starting with 8, 9, a, or b
July 25, 9 AM - 6 PM: Clusters in fr-par with IDs starting with c, d, e, or f
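
If you want to check which day applies to one of your clusters, the batch is determined by the first hexadecimal character of the cluster ID. Below is a minimal Python sketch; the cluster ID used in the example is a made-up placeholder.

# Map the first hex character of a Kapsule/Kosmos cluster ID to its maintenance day.
MAINTENANCE_BATCHES = {
    "0123": "July 22",
    "4567": "July 23",
    "89ab": "July 24",
    "cdef": "July 25",
}

def maintenance_day(cluster_id: str) -> str:
    first_char = cluster_id.lower()[0]
    for chars, day in MAINTENANCE_BATCHES.items():
        if first_char in chars:
            return day
    raise ValueError(f"unexpected cluster ID: {cluster_id}")

# Placeholder cluster ID, for illustration only.
print(maintenance_day("a1b2c3d4-0000-0000-0000-000000000000"))  # -> July 24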

Jul 22, 2024 09:00 - Jul 25, 2024 18:00 CEST
Monitoring - A fix has been implemented and we are monitoring the results.
Jul 19, 2024 - 16:40 CEST
Investigating - Routed IP migration of Kapsule clusters may restart more than one node at a time.

When nodes are not up to date, they need to upgrade upon reboot.

This additional time triggers a timeout in the migration process, leading to more than one node being rebooted at the same time.

To avoid this issue, we strongly encourage you to manually restart/replace your nodes one by one prior to the migration, so that they are already up to date when it runs.
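
For illustration only, here is a minimal Python sketch of the "one node at a time" approach: replace a node, wait for all nodes to be ready again, then move on. The endpoint paths, field names and the "ready" status value are assumptions to verify against the current Kapsule API documentation; the region, cluster ID and token are placeholders.

# Sketch: replace Kapsule nodes one by one, waiting between each replacement.
# Endpoint paths and field names are assumptions based on the public Kapsule API docs.
import os
import time
import requests

REGION = "fr-par"
CLUSTER_ID = "00000000-0000-0000-0000-000000000000"  # placeholder
BASE = f"https://api.scaleway.com/k8s/v1/regions/{REGION}"
HEADERS = {"X-Auth-Token": os.environ["SCW_SECRET_KEY"]}

def list_nodes():
    resp = requests.get(f"{BASE}/nodes", headers=HEADERS, params={"cluster_id": CLUSTER_ID})
    resp.raise_for_status()
    return resp.json()["nodes"]

def wait_until_all_ready(timeout=1800, interval=30):
    deadline = time.time() + timeout
    while time.time() < deadline:
        if all(node["status"] == "ready" for node in list_nodes()):
            return
        time.sleep(interval)
    raise TimeoutError("nodes did not all become ready in time")

for node in list_nodes():
    print(f"Replacing node {node['id']} ({node['name']})")
    requests.post(f"{BASE}/nodes/{node['id']}/replace", headers=HEADERS).raise_for_status()
    wait_until_all_ready()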

Jul 19, 2024 - 16:29 CEST
Investigating - Log queries for Scaleway products on fr-par may time out and prevent log-based dashboard components from displaying ("Fail to call resource" or "Client.Timeout exceeded while awaiting headers").
Jul 16, 2024 - 11:25 CEST
Investigating - Suchard chassis s101-g33-13-llm.dc2 is powered off.

Impact: 18 servers on it are powered down (12 active servers).

List of active servers affected: 101551 / 101552 / 101553 / 101554 / 101557 / 101558 / 101559 / 101560 / 101562 / 101564 / 101565 / 101567

Jul 11, 2024 - 19:12 CEST
Monitoring - A fix has been implemented and we are monitoring the results.
Jul 11, 2024 - 11:30 CEST
Investigating - Log queries have high latency, in some cases resulting in timeouts.
We are currently investigating this issue.

Jul 11, 2024 - 11:28 CEST
Investigating - Sadly, the configuration change we applied yesterday (https://status.scaleway.com/incidents/6vrd790qhzym) did not solve the issues. We are still investigating. Sorry for any inconvenience.
Jul 11, 2024 - 09:32 CEST
Monitoring - We have changed our Cilium configuration (see the last message and the scheduled maintenance). We are now monitoring.
Jul 10, 2024 - 08:55 CEST
Update - Instabilities might be linked to our Cilium setup, so we are going to update its configuration. We have created a scheduled maintenance for tomorrow 2024/07/10: https://status.scaleway.com/incidents/6vrd790qhzym.
Jul 09, 2024 - 11:31 CEST
Update - We have thoroughly extended our monitoring/logging to help us find out what is causing these errors. So far, we have noticed that the number of 503s is not constant throughout the day; they appear sporadically, in batches. For example, since yesterday (2024-07-04 12:00 PM), some requests during the following time ranges (in UTC) have been affected:

- 2024-07-04 15:49 => 2024-07-04 17:00
- 2024-07-04 19:10 => 2024-07-04 20:10
- 2024-07-05 01:30 => 2024-07-05 03:30

The pattern always seems to be the same: a peak of 503s (~2 to 4% of total requests) at the beginning of the time range, then a decrease of the 503 rate until it eventually reaches 0% (no errors).

With our extended monitoring, we are still trying to correlate these issues with other metrics or logs. Sorry about the inconvenience.

Jul 05, 2024 - 10:36 CEST
Investigating - This morning's hotfix has not permanently solved the issue; we have been seeing 503s again since 12:21 PM (UTC). We are still investigating.
Jun 26, 2024 - 17:38 CEST
Monitoring - We have applied a hotfix at 08:52 UTC. So far we no longer see 503s on our side, but we are still actively monitoring.

We still need to apply the fix permanently though. We will keep you updated.

Jun 26, 2024 - 13:01 CEST
Update - Quick update to inform our users that we are still working on it, and are actively trying to find a solution. Sorry for any inconvenience.
Jun 25, 2024 - 19:07 CEST
Update - The issue has been escalated to the networking team to investigate possible connectivity issues between hosts.
Jun 19, 2024 - 17:30 CEST
Identified - We have identified the issue. Inside our infrastructure, some TCP connections are terminated unexpectedly, leading to 503 errors for clients making HTTP calls over these connections. This only affects custom domains because their traffic is routed differently from the default endpoints. On the user side, retrying in case of 503 should help to mitigate the issue, as we have observed that the TCP connections used by two consecutive HTTP requests are unlikely to both break.

Our monitoring has shown this affects around 100 custom domains, for 0.19% of total requests. For most affected clients, the 503 rate can go up to 2%, but we have seen it fluctuate over time.

We are still not sure about the root cause, but are working on it. Sorry for any inconvenience.

Jun 18, 2024 - 09:38 CEST
Update - We are still investigating.

Our tests have confirmed that only calls to custom domains, over both HTTP and HTTPS, may periodically end up in 503 errors. These 503 errors have the following body: "upstream connect error or disconnect/reset before headers. reset reason: connection termination".

The 503 errors are sporadic, but are likely to happen in batches as the global load on our infrastructure (number of requests/number of connections) increases. We have some hypotheses to test before communicating further.

As a reminder, if you are affected: if possible, you can use the default provided endpoint (*.functions.fnc.fr-par.scw.cloud) instead of your custom domains. If not possible, retrying in case of 503 is unfortunately the only way to mitigate the issue while we are investigating.
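
For reference, here is a minimal Python sketch of the retry mitigation, using requests with urllib3's Retry (the allowed_methods parameter assumes urllib3 >= 1.26); the URL and retry budget are placeholders to adapt to your setup.

# Retry HTTP calls that hit a transient 503, with a short backoff.
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

retries = Retry(
    total=3,                 # up to 3 retries per request
    backoff_factor=0.5,      # 0.5s, 1s, 2s between attempts
    status_forcelist=[503],  # only retry on 503 responses
    allowed_methods=["GET", "POST"],
)

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retries))

# Placeholder URL: point it at your custom domain, or at the default
# *.functions.fnc.fr-par.scw.cloud endpoint if you can switch.
resp = session.get("https://my-function.example.com/")
resp.raise_for_status()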

Sorry for any inconvenience.

Jun 13, 2024 - 17:44 CEST
Update - We are still investigating. There are still a few 503s returned when calling custom domains. From what we have seen, calls over HTTPS are more likely to end up in 503 errors. Sorry for any inconvenience.
Jun 12, 2024 - 14:12 CEST
Investigating - Some fr-par clients (1/10th of all clients) calling their functions/containers through a custom domain might encounter an abnormal number of HTTP 503 errors. It seems to only affect HTTP calls to the custom domains, and not calls made directly to the default endpoint, but we are still investigating.

From what we have seen so far, those clients should see less than 4% of requests returning 503 errors, though this number can evolve over time (sometimes it is less than 0.1%).

If possible, clients experiencing these 503 errors can try to use the default provided endpoint instead of their custom domains (*.functions.fnc.fr-par.scw.cloud). If not possible, retrying in case of 503 is the only way to mitigate the issues while we are investigating.

Sorry for any inconvenience.

Jun 07, 2024 - 19:01 CEST
Identified - We have isolated the faulty path, and network connectivity in WAW should be back to normal. We are still monitoring this issue:

- We have put Lumen back in production to help with the saturation.
- Scaleway-to-Scaleway traffic (e.g. waw to dc5) is forced through Cogent.
- We still see saturation through Level3 toward some destinations.
- We are still waiting for an update from the provider on an ETR.

Jul 09, 2024 - 10:48 CEST
Investigating - We are observing packet loss and high latency in the WAW region: high latency and low bandwidth toward some external destinations, including when reaching PAR/AMS.

Jul 09, 2024 - 09:55 CEST
Investigating - Switch h2-ee22-2.ams1 has been down since 2024 Jul 8 20:58:27 UTC.
18 servers in the rack are without network connectivity (public and RPN included).
Impacted rack: AMS1 - Room Hall 2 - Zone E-E - Rack E22 Bloc B

Jul 09, 2024 - 09:41 CEST
Monitoring - A fix has been implemented and we are monitoring the results.
Jun 28, 2024 - 14:36 CEST
Investigating - Clients that had a Cockpit premium plan on a project that has since been deleted may still be billed. The team is investigating.
Jun 27, 2024 - 19:58 CEST
Update - We are continuing to investigate this issue.
Jun 20, 2024 - 19:52 CEST
Investigating - Our service is currently experiencing disruption due to blacklisting by Microsoft.
We are actively working with Microsoft to resolve this issue as soon as possible.

Apr 19, 2024 - 00:16 CEST
Investigating - We have noticed that connection problems to the Dedibackup service can occur.
We will get back to you as soon as we have more information on the situation.

Apr 06, 2023 - 12:23 CEST
Elements - AZ: Operational (96.93% uptime over the past 90 days)
fr-par-1: Operational (96.66% uptime)
fr-par-2: Operational (93.61% uptime)
fr-par-3: Operational (93.84% uptime)
nl-ams-1: Operational (96.25% uptime)
pl-waw-1: Operational (99.59% uptime)
nl-ams-2: Operational (96.25% uptime)
pl-waw-2: Operational (99.98% uptime)
nl-ams-3: Operational (96.25% uptime)
pl-waw-3: Operational (99.98% uptime)

Elements - Products: Partial Outage (97.03% uptime over the past 90 days)
Instances: Operational (99.6% uptime)
BMaaS: Operational (100.0% uptime)
Object Storage: Operational (99.99% uptime)
C14 Cold Storage: Operational (100.0% uptime)
Kapsule: Under Maintenance (96.7% uptime)
DBaaS: Operational (93.26% uptime)
LBaaS: Operational (99.99% uptime)
Container Registry: Operational (99.96% uptime)
Domains: Partial Outage (89.18% uptime)
Elements Console: Operational (94.89% uptime)
IoT Hub: Operational (99.99% uptime)
Account API: Operational (99.98% uptime)
Billing API: Operational (94.9% uptime)
Functions and Containers: Operational (95.75% uptime)
Block Storage: Operational (99.96% uptime)
Elastic Metal: Operational (100.0% uptime)
Apple Silicon M1: Operational (100.0% uptime)
Private Network: Operational (98.89% uptime)
Hosting: Operational (100.0% uptime)
Observability: Degraded Performance (81.05% uptime)
Transactional Email: Operational (100.0% uptime)
Jobs: Operational (88.46% uptime)
Network: Degraded Performance (100.0% uptime)

Dedibox - Datacenters: Degraded Performance (99.96% uptime over the past 90 days)
DC2: Degraded Performance