Update - Here is a checkpoint on the issues that have affected VPC networks in the FR-PAR region over the last few days.
Starting Tuesday 25th, a few cases involving connectivity issues in VPC networks were raised with our customer support.
Some cases were resolved by restarting impacted nodes and/or updating IP/MAC information.
At the time, these did not appear to be out of the ordinary.
On Wednesday 26th, more cases were raised and escalated, raising awareness of a potentially more widespread bug that could disrupt customers in some edge cases. Engineers were brought in to collect data and investigate.
At 12:45 CET, one of the VPC Edge routers crashed. A priority 2 incident was opened to engage more resources and analysis.
Huge levels of BUM (Broadcast/Unknown-unicast/Multicast) traffic, cross-AZ and cross-product, were also identified.
In an attempt to solve these issues, the impacted VPC Edge router was fixed and re-added to the production pool to ensure enough computing resources were available to handle the BUM traffic.
This reduced the overall level of BUM traffic in all impacted VPCs, leading us to believe that the situation was under control.
On Wednesday afternoon, more cases of latency and connectivity issues were raised to customer support.
Investigations continued and pinpointed desynchronization issues between VPC Edge routers and information systems that could explain the higher-than-usual BUM levels.
The decision was made to restart VPC Edge router services in fr-par-2 in an attempt to clear inconsistencies and refresh data on the routers.
After a few perturbations, the situation was stable by Wednesday evening.
We kept monitoring BUM levels, and the decision was made to accelerate the deployment of new devices that was already being prepared (more CPU and memory to avoid saturation of VPC Edge routers in failover scenarios).
On Thursday 27th, new devices were added in the morning in an attempt to better handle BUM traffic.
While this proved useful (reducing impact), it did not significantly reduce BUM levels globally.
In the afternoon, a workaround was deployed to protect our VPC Edge routers from this excessive BUM traffic.
A rate limiter was applied per customer network to prevent saturation of the VPC Edge routers and to make sure legitimate BUM traffic was handled with the right priority in the routers.
This caused a significant reduction in BUM traffic, as we had hoped (stopping what we believe was a snowball effect).
On Thursday evening, deployment of the rate limiter was complete globally, and operations were stopped to monitor the situation.
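For readers unfamiliar with per-network rate limiting, the general idea can be sketched as below. This is only an illustration and not the mechanism deployed on the VPC Edge routers, whose details are not described here. It assumes a token-bucket algorithm, and every name and value in it (TokenBucket, PerNetworkLimiter, the rate and burst figures, the network identifiers) is hypothetical.

```python
class TokenBucket:
    """Token bucket for a single customer network (hypothetical sketch)."""

    def __init__(self, rate, burst, now):
        self.rate = rate        # tokens (BUM frames) refilled per second
        self.burst = burst      # maximum number of tokens the bucket can hold
        self.tokens = burst     # start full so short bursts are accepted
        self.last = now         # timestamp of the last refill, in seconds

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        # Spend one token per frame; drop the frame if the bucket is empty.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerNetworkLimiter:
    """Keeps one independent bucket per customer network ID."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.buckets = {}

    def allow(self, network_id, now):
        bucket = self.buckets.get(network_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, now)
            self.buckets[network_id] = bucket
        return bucket.allow(now)


# Assumed example values: 2 frames per second, bursts of up to 3 frames.
limiter = PerNetworkLimiter(rate=2.0, burst=3.0)
noisy = [limiter.allow("net-a", now=0.0) for _ in range(4)]
quiet = limiter.allow("net-b", now=0.0)
```

Because each network has its own bucket, a network flooding BUM frames exhausts only its own budget: in the usage lines above, the fourth frame from the assumed network net-a is dropped while net-b is unaffected.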
One conclusion of the day is that our maintenance model for VPC Edge routers seems to create ripple effects on customer networks (while it was designed to be fully transparent).
On Friday 28th, investigations are still ongoing to understand the root causes of the issues: high levels of BUM traffic are supposed to be handled transparently, and desynchronization issues are not expected behavior.
Data collection on one of the VPC Edge routers caused a crash around 11:00 CET, triggering the kind of ripple effects we had identified the day before.
Data collection and analysis continued during the day, including customer debug sessions to better understand some specific cases.
In the afternoon, a human error at 15:50 CET made another VPC Edge router unavailable, but we were able to restore it faster and with fewer ripple effects this time.
So, the current situation is that we were able to significantly reduce BUM levels globally, and we believe the elevated levels are related to desynchronization issues on our VPC Edge routers (which cause “unknown MAC and/or IP” issues and therefore broadcast behavior).
These desynchronizations are probably the cause of our maintenance model issues (traffic is not being treated equally/fairly by redundant VPC Edge routers).
Our teams are working on reducing desynchronization issues (short term) and fixing the underlying condition once identified (mid-term).
The decision was made to stop any action on VPC Edge routers for the weekend, so all investigation work will be done off the routers, using the data already collected, to avoid creating more perturbations.
We believe the situation to be stable for almost all customers, but our VPC infrastructure in the FR-PAR region cannot yet be considered fully stable, and we deeply apologize for this.
We want to assure you that all of our teams are focused on fixing the issues.
We encourage customers who still encounter issues to share all collected data with our customer support team.
Nov 28, 2025 - 22:13 CET
Update - From 10:26 CET to 11:11 CET, a piece of VPC equipment triggered errors.
You may have seen packet loss during this time.
Nov 28, 2025 - 11:56 CET
Monitoring - Our teams have deployed a fix on FR-PAR-1.
The situation is now under monitoring and stable.
Nov 27, 2025 - 21:37 CET
Update - Our teams have deployed a fix on FR-PAR-2 and are continuing to work on deploying one on FR-PAR-1 as well.
Nov 27, 2025 - 19:40 CET
Update - We are still deploying a fix. During this deployment, you may see some problems with DNS resolution inside Kapsule clusters.
Nov 27, 2025 - 17:49 CET
Update - We are deploying a fix to stabilize the broadcast traffic.
Nov 27, 2025 - 16:47 CET
Update - We are still experiencing a high level of broadcast traffic, causing some perturbations. A fix is being tested at the moment, and we will start the deployment in the next hour.
Nov 27, 2025 - 15:25 CET
Identified - In order to stabilize broadcast traffic, we are adding more computing resources to better handle the load. Some maintenance operations will take place today.
Nov 27, 2025 - 11:05 CET
Monitoring - The maintenance announced in the previous message was not necessary and has been canceled.
We are monitoring the situation.
Nov 26, 2025 - 19:20 CET
Investigating - From 18:00 CET, we are restarting a few VPC services on fr-par-1 to fix some edge cases.
Nov 26, 2025 - 18:07 CET
Monitoring - Maintenance completed successfully.
Nov 26, 2025 - 17:36 CET
Update - From 16:30 CET, we will proceed with a restart of some internal VPC services to fix some edge cases. A few perturbations may occur during the restart but should not last.
Nov 26, 2025 - 16:18 CET
Update - We are continuing to investigate this issue.
Nov 26, 2025 - 15:50 CET
Investigating - A few customers are still impacted by a few perturbations. The VPC team is working to fix these isolated cases.
Nov 26, 2025 - 15:38 CET
Monitoring - We noticed a degradation in the performance of products linked to VPC at 13:58 CET. A fix has been applied.
We are currently monitoring the situation.
Nov 26, 2025 - 15:03 CET