A widespread cellphone outage hit Denmark on November 28, when Danish telecom operator TDC Net suffered a major network failure. The outage disrupted communication for thousands and raised serious concerns about the reliability of essential telecom services. Among those affected were emergency responders, hospitals, and commuters, showing how dependent modern society is on seamless connectivity. But the question on everyone’s mind is: what exactly went wrong, and could it have been avoided? Well, we’re here to answer that question.
What caused TDC’s outage?
The outage originated from a software update rolled out by TDC Net, one of Denmark's largest telecom operators, on November 27. TDC confirmed that the disruption wasn’t caused by a cyberattack, putting to rest initial fears of malicious interference. However, the update triggered unforeseen issues, crippling the network’s ability to handle calls—including emergency numbers like 112.
Although TDC implemented a fix later in the day, customers had to take unusual steps, such as removing their SIM cards to make emergency calls, and were even advised to use alternative communication methods or switch to another provider. The company acknowledged the issue and committed to a thorough investigation to prevent similar incidents in the future.
The impact on essential services
The outage wasn’t just an inconvenience for everyday cellphone users; it had far-reaching consequences for people’s access to essential services:
- Emergency services. Calls to the emergency number 112 were blocked, leaving security services to patrol the streets actively searching for those in distress.
- Healthcare. At least one hospital was forced to reduce non-critical care, prioritizing resources for critical cases.
- Transportation. Train and bus services experienced delays, with signaling issues leading to chaos at stations and passengers stranded mid-journey.
These disruptions highlighted the vulnerability of critical infrastructure when telecom networks fail, emphasizing the urgent need for robust backup systems and thorough software quality assurance processes. They also recall the recent faulty CrowdStrike update, which likewise caused mass outages.
Could TDC’s outage have been prevented?
Telecom outages from software updates are not unprecedented, but proactive measures could mitigate their impact. Here are some of the measures TDC could have taken to avoid the outage—or at the very least reduce the severity of its impact.
- Rigorous pre-deployment testing. Thorough testing in controlled environments, including stress tests and simulations, could identify potential flaws before an update is widely deployed.
- Staged rollouts. Implementing updates gradually allows engineers to detect and address issues early, reducing the risk of a complete system failure.
- Backup systems for emergency services. Redundant communication systems for emergency services, independent of regular telecom networks, would ensure uninterrupted access to critical help lines.
- Transparent crisis management. Clear communication with the public about issues and temporary fixes—like the SIM card removal workaround—can alleviate panic during such events.
- Comprehensive software testing. Working with a QA provider that has experience in the telecommunications industry, or using QA as a Service (QAaaS), can help detect issues before they escalate.
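To make the staged rollout idea from the list above more concrete, here is a minimal Python sketch. It is only an illustration under our own assumptions: the stage fractions, the 1% error threshold, and the `faulty_update` function are hypothetical and do not describe how TDC Net actually deploys software.

```python
from typing import Callable, Sequence, Tuple


def staged_rollout(
    stages: Sequence[float],
    error_rate_at: Callable[[float], float],
    max_error_rate: float = 0.01,
) -> Tuple[bool, float]:
    """Widen exposure stage by stage and stop at the first unhealthy stage.

    Returns (completed, last_healthy_fraction).
    """
    last_healthy = 0.0
    for fraction in stages:
        observed = error_rate_at(fraction)
        if observed > max_error_rate:
            # Halt here; the caller would roll back to the last healthy fraction.
            return False, last_healthy
        last_healthy = fraction
    return True, last_healthy


def faulty_update(fraction: float) -> float:
    """Hypothetical update whose fault only appears once 5% of the network runs it."""
    return 0.001 if fraction < 0.05 else 0.30


completed, safe_fraction = staged_rollout([0.01, 0.05, 0.25, 1.0], faulty_update)
print(completed, safe_fraction)  # False 0.01
```

In this toy run the rollout stops at the 5% stage, so the fault never reaches the remaining 95% of the network.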
What would we have done to prevent the faulty software update?
Preventing an outage like the one experienced in Denmark would require rigorous software testing of several types to ensure the reliability and stability of the system. Here are the key types of software testing that we believe could have helped mitigate such risks:
1. Regression testing
A software update, as in this case, may unintentionally disrupt existing functionality. Regression testing ensures that updates do not negatively affect previously working features. Specifically, it detects potential failures in critical operations, such as network connectivity and emergency service routing. Also, it ensures that previous fixes and stable features remain intact after updates.
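As a rough illustration, a regression check can compare the behavior of an updated component against a baseline recorded from the known-good version. Everything in this Python sketch is hypothetical: the two routing functions, the sample number `33112233`, and the route names are invented for the example, and real call routing is far more involved.

```python
from typing import Callable, Dict, List, Tuple

Scenario = Tuple[str, bool]  # (dialed number, subscriber is active)
Router = Callable[[str, bool], str]


def route_call_v1(dialed: str, subscriber_active: bool) -> str:
    """Hypothetical pre-update routing: emergency calls always go through."""
    if dialed == "112":
        return "emergency_center"
    return "standard_trunk" if subscriber_active else "rejected"


def route_call_v2(dialed: str, subscriber_active: bool) -> str:
    """Hypothetical post-update routing with a bug: the account check runs first."""
    if not subscriber_active:
        return "rejected"
    return "emergency_center" if dialed == "112" else "standard_trunk"


# Baseline captured from the known-good version before the update.
SCENARIOS: List[Scenario] = [
    ("112", True), ("112", False), ("33112233", True), ("33112233", False),
]
BASELINE: Dict[Scenario, str] = {s: route_call_v1(*s) for s in SCENARIOS}


def find_regressions(router: Router, baseline: Dict[Scenario, str]) -> list:
    """Return every scenario where the router no longer matches the baseline."""
    regressions = []
    for (dialed, active), expected in baseline.items():
        actual = router(dialed, active)
        if actual != expected:
            regressions.append((dialed, active, expected, actual))
    return regressions


print(find_regressions(route_call_v2, BASELINE))
# [('112', False, 'emergency_center', 'rejected')]
```

The one reported regression is exactly the kind of failure that matters most: an emergency call that used to connect is now rejected.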
2. Load testing
Telecommunication systems need to handle high traffic volumes, especially during peak usage or emergencies. Load testing evaluates system performance under expected and peak load conditions. It identifies bottlenecks, such as server overload or database constraints, before the software is deployed.
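A very small load test can be sketched with Python's standard library alone. This is a simplified illustration, not a production tool: `handle_call` is a hypothetical stand-in that just sleeps, and the call count and concurrency level are arbitrary values chosen for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import quantiles
from typing import Dict, Tuple


def handle_call(call_id: int) -> bool:
    """Hypothetical stand-in for call setup; sleeps to imitate processing time."""
    time.sleep(0.002)
    return True


def timed_call(call_id: int) -> Tuple[bool, float]:
    """Run one simulated call and measure how long it took."""
    start = time.perf_counter()
    ok = handle_call(call_id)
    return ok, time.perf_counter() - start


def run_load_test(total_calls: int, concurrency: int) -> Dict[str, float]:
    """Fire total_calls simulated calls using `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_calls)))
    latencies = [elapsed for _, elapsed in results]
    failures = sum(1 for ok, _ in results if not ok)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = quantiles(latencies, n=20)[-1]
    return {"calls": total_calls, "failures": failures, "p95_seconds": p95}


report = run_load_test(total_calls=200, concurrency=20)
print(report)
```

Raising `concurrency` step by step and watching how `p95_seconds` and `failures` change is one simple way to see where a bottleneck starts to appear.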
3. Disaster recovery testing
Outages can occur even with rigorous planning. Failover systems and disaster recovery mechanisms ensure continuous service. Disaster recovery testing evaluates whether backup systems seamlessly take over during primary system failures. This method also ensures emergency services remain accessible under all circumstances.
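A failover drill can be reduced to a simple question: when the primary route is down, does the call still get through? The Python sketch below shows that idea under our own assumptions. The route names and the `primary_down` and `backup_ok` functions are hypothetical placeholders for real network paths.

```python
from typing import Callable, List, Tuple


class RouteUnavailable(Exception):
    """Raised by a route that cannot carry the call."""


Route = Tuple[str, Callable[[str], None]]


def place_call_with_failover(number: str, routes: List[Route]) -> str:
    """Try each route in priority order and return the name of the first that works."""
    for name, connect in routes:
        try:
            connect(number)
            return name
        except RouteUnavailable:
            continue  # Fall through to the next route in the list.
    raise RouteUnavailable(f"no route could carry the call to {number}")


def primary_down(number: str) -> None:
    """Hypothetical primary path, simulated as failed for the drill."""
    raise RouteUnavailable("primary core offline")


def backup_ok(number: str) -> None:
    """Hypothetical backup path that accepts the call."""
    return None


def run_failover_drill() -> str:
    """Simulate a primary failure and report which route carried the 112 call."""
    routes = [("primary", primary_down), ("backup", backup_ok)]
    return place_call_with_failover("112", routes)


print(run_failover_drill())  # backup
```

If the drill raises `RouteUnavailable` instead of returning a route name, the backup path failed too, which is precisely what this kind of test is meant to reveal before a real outage does.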
4. Integration testing
Telecom systems rely on the interaction between multiple components, including hardware, software, and third-party integrations. Integration testing validates that all system components work together correctly after updates. In addition, it detects issues arising from incompatibilities or unexpected interactions.
5. End-to-end (E2E) testing
Updates can impact the entire workflow of a system, from customer access to emergency communication protocols. End-to-end testing simulates real-world usage scenarios to test the complete functionality of the system. It makes sure that critical processes, such as dialing emergency numbers, work flawlessly.
6. Audio quality testing
Telecom systems need to maintain consistent audio performance under varying conditions, including simultaneous calls and high traffic spikes. Audio quality testing measures response times, call quality, and stability under simulated conditions. Our innovative developments and audio testing setups, like our acoustically treated audio laboratories, allow us to simulate real-world network conditions, such as latency, packet loss, and fluctuating bandwidth, to identify weak points in the system's ability to handle adverse conditions. This type of testing prevents degradation in service quality, such as the reduced sound quality noted during the outage.
7. Security testing
While cyberattacks were ruled out in this case, security testing ensures that vulnerabilities, including those potentially introduced by updates, are mitigated. Security testing detects security gaps that could be exploited to cause system failures and validates the integrity of the software and its updates.
Essentially, with a team of over 500 QA engineers and more than 5,000 real devices, TestDevLab would have put in place an efficient QA strategy that considers all possible scenarios, even edge cases, and run all the required tests to ensure that such an outage could not occur.
Learning from TDC’s outage
The TDC outage serves as a wake-up call, not just for Denmark but for telecom providers worldwide. It underscores the importance of having robust software quality assurance processes in place, stringent update protocols, and reliable emergency backups to ensure public safety and trust in critical infrastructure.
As we become increasingly reliant on interconnected services, telecom operators must take every precaution to safeguard the systems users depend on—because when they fail, the impact can be life-threatening.
Don’t let avoidable software bugs put your brand in the spotlight for all the wrong reasons—or drive your users straight to your competitors. Get in touch with us to learn more about our software testing services and how we can help keep your product and reputation intact.