Negative Impact of Call Completion Services During Disasters
A Case Study by Derya Kitiş & Anıl Arabacı
Istanbul, 2023
Is Your HLR Sufficiently Prepared to Withstand Disasters?
Have you considered how Call Completion Services (especially Missed Call Alert and Voice Mail) affect the Home Location Register (HLR), particularly during emergencies, when the extra signaling they generate can overload the HLR and disrupt its operation? The following real-life scenario illustrates this concern.
Following a massive earthquake, call volume surged and the operator's HLR struggled to cope with the demand, leading to service disruptions. The surge was driven primarily by subscribers repeatedly calling their loved ones to check on their safety.
The operator's HLR could not handle the increased traffic effectively during the disaster. At the same time, the Call Completion service had to send an SMS to the caller or the intended recipient for each failed call, which added further load. Each failed call had to be reported to the Call Completion Service, which then initiated SRI (Send Routing Info) operations and generated additional traffic for eligibility checks, such as roaming status, before sending the SMS. Together, these Call Completion procedures contributed significantly to the traffic on the operator's HLR.
To avoid such situations and minimize the risk of HLR overload, consider implementing the following strategy:
1- When the Home Location Register (HLR) is significantly congested, temporarily stop transmitting Report SM Delivery Status (RSMDS) requests to ease the burden, and keep the pending work queued to maintain service continuity.
2- Do not send MAP operations for eligibility checks (e.g. roaming status) for every repeated call between the same calling and called numbers; this avoids sending an SRI request to the HLR for every transaction. If these checks cannot be avoided, make sure your Call Completion Platform caches roaming information for a short period, so that it does not send an SRI request for each repeated call. Defne's Call Completion Suite and Signaling Gateway HLR caching mechanism can keep roaming status in cache for a certain time.
3- Use Defne's Call Completion Suite and Signaling Gateway HLR caching mechanism to keep IMSI and VLR addresses, and send SMS directly to the network with Forward_SM only. This avoids sending SRI_SM (Send Routing Info for SM) for each SMS and removes the reliance on SMSC resources, streamlining SMS transmission.
1- Suspending RSMDS Operations Temporarily
When network congestion is detected, the initial action is to halt the transmission of RSMDS (Report SM Delivery Status) requests to the HLR. Simultaneously, failed call processing remains queued to uphold business continuity.
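The suspend-and-queue behaviour described above can be sketched as follows. This is a minimal illustration only, not Defne's implementation: the class name RsmdsGate, the drain_limit parameter, and the idea of releasing the backlog in bounded batches are all assumptions made for the example, and congestion detection itself is left to whatever mechanism the operator already uses.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque


@dataclass
class RsmdsGate:
    """Hypothetical helper: holds RSMDS requests back while the HLR is congested."""
    send: Callable[[str], None]   # function that actually transmits one request
    congested: bool = False       # set by an external congestion detector (assumed)
    pending: Deque[str] = field(default_factory=deque)

    def submit(self, msisdn: str) -> None:
        # While congested, queue the request instead of sending it to the HLR.
        if self.congested:
            self.pending.append(msisdn)
        else:
            self.send(msisdn)

    def set_congested(self, congested: bool, drain_limit: int = 100) -> None:
        # When congestion clears, release part of the backlog immediately.
        self.congested = congested
        if not congested:
            self.drain(drain_limit)

    def drain(self, limit: int) -> int:
        # Release at most `limit` queued requests so that the backlog
        # does not itself become a new traffic spike. Returns the count sent.
        sent = 0
        while self.pending and sent < limit:
            self.send(self.pending.popleft())
            sent += 1
        return sent
```

In this sketch nothing is dropped: requests submitted during congestion stay in `pending` until `drain` releases them, which mirrors the requirement that processing remains queued for continuity.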
2- Utilizing HLR Caching for Eligibility Checks
Existing Process vs Proposed Solution
Currently, for nearly every failed call, the process flows of Call Completion services send various requests to the HLR (Home Location Register). These include, for instance, SRI (Send Routing Information) requests to obtain the subscriber's roaming status or Call Forwarding preferences for voicemail. During emergencies that produce excessive call volumes, continuing to send these requests would significantly exacerbate HLR congestion.
The proposed solution in this scenario is to cache the subscriber's current information after the initial request. This caching mechanism aims to circumvent the need for repeated queries to the HLR for subsequent requests, thereby alleviating the strain on the HLR.
Defne Signaling Gateway's Caching Mechanism
Data access needs differ between services: some require real-time information, while others can rely on historical data. To accommodate this, an in-memory database within the SGW stores responses for a specific period, and subsequent queries for the same data are answered directly from this cache. Tailoring the service for each client involves tracking cache usage status and establishing parameters for handling the maximum historical data required.
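The caching idea can be sketched as a small time-to-live (TTL) cache placed in front of the HLR query. This is an illustrative sketch under stated assumptions, not the SGW's actual design: the class name HlrCache, the query_hlr callback, the ttl_seconds parameter, and the shape of the response dictionary are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class HlrCache:
    """Hypothetical TTL cache for HLR responses, keyed by subscriber number."""

    def __init__(self, query_hlr: Callable[[str], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._query_hlr = query_hlr   # performs the real SRI toward the HLR
        self._ttl = ttl_seconds       # how long a stored response stays valid
        self._clock = clock           # injectable for testing
        self._entries: Dict[str, Tuple[float, dict]] = {}
        self.hits = 0                 # cache usage counters
        self.misses = 0

    def lookup(self, msisdn: str) -> dict:
        now = self._clock()
        entry = self._entries.get(msisdn)
        # Serve from the cache while the stored response is still fresh.
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        # Otherwise query the HLR once and remember the answer.
        self.misses += 1
        response = self._query_hlr(msisdn)
        self._entries[msisdn] = (now, response)
        return response
```

A service that needs near-real-time data would be given a short `ttl_seconds`, while one that tolerates older data could use a longer value; the `hits` and `misses` counters stand in for the cache usage tracking mentioned above.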
3- Utilizing HLR Caching for SMS Sending
Under normal circumstances, SMS messages related to Call Completion services are sent to the SMSC, which queries the HLR before delivery. During emergencies that produce excessive call volumes, continuing to send SMS this way would significantly increase HLR congestion. Moreover, using SMSC resources for these services is unnecessary even under regular conditions.
As an alternative, terminating SMS messages directly in the network rather than through the SMSC, and eliminating unnecessary SRI_SM requests, could be a viable solution.
With Defne SGW's caching mechanism, when a request is made to send another SMS to a number already present in the memory, the process can be completed with just a single transaction. SMS transmission can be accomplished by sending FW_SM through Defne’s CCS (Call Completion Suite) directly to the VLR, without the need for sending SRI_SM operation towards HLR via an SMSC. This eliminates the load on HLR for SMS transmissions. For numbers not found in the memory, the process still requires two transactions. Considering the VLR coverage areas (approximately ... km), it can be estimated that over 90% of subscribers remain connected to the same VLR until they receive another message.
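The one-transaction versus two-transaction flow can be sketched as below. This is a simplified model, not Defne's CCS code: the class DirectSmsSender and the sri_sm and forward_sm callbacks are hypothetical stand-ins for the real MAP operations, and the fallback that refreshes the cache when a cached VLR turns out to be stale is an assumption added for the example (the text above does not describe how that case is handled).

```python
from typing import Callable, Dict, Tuple

Location = Tuple[str, str]  # (IMSI, VLR address)


class DirectSmsSender:
    """Hypothetical sender that reuses cached IMSI/VLR data for Forward_SM."""

    def __init__(self, sri_sm: Callable[[str], Location],
                 forward_sm: Callable[[Location, str], bool]) -> None:
        self._sri_sm = sri_sm          # queries the HLR for (IMSI, VLR)
        self._forward_sm = forward_sm  # delivers to the VLR; True on success
        self._locations: Dict[str, Location] = {}
        self.hlr_queries = 0           # how many times the HLR was touched

    def _refresh(self, msisdn: str) -> Location:
        self.hlr_queries += 1
        location = self._sri_sm(msisdn)
        self._locations[msisdn] = location
        return location

    def send(self, msisdn: str, text: str) -> int:
        """Send one SMS and return the number of signaling transactions used."""
        transactions = 0
        cached = self._locations.get(msisdn)
        if cached is not None:
            # Number already in memory: a single Forward_SM, no HLR query.
            transactions += 1
            if self._forward_sm(cached, text):
                return transactions
            # Assumed fallback: the cached VLR was stale, so refresh below.
        location = self._refresh(msisdn)     # SRI_SM toward the HLR
        transactions += 1
        self._forward_sm(location, text)     # Forward_SM toward the VLR
        transactions += 1
        return transactions
```

In this model the first message to a number costs two transactions and one HLR query, and each later message costs one transaction and no HLR query for as long as the subscriber stays on the same VLR.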
By leveraging the caching mechanism for eligibility checks and integrating the Call Completion Suite for direct SMS transmission, businesses can enhance network efficiency and resilience. This approach not only mitigates the risk of HLR overload during emergencies but also streamlines SMS processes and reduces reliance on SMSC resources. The improved efficiency, coupled with the reduction in HLR load, helps mitigate service downtime and disruptions to P2P calls in emergency scenarios.
The process of sending a simple SMS via the SMSC typically involves a minimum of two transactions within the network. Initially, an SRI-SM request is sent to the HLR to obtain IMSI and VLR information. In emergency scenarios like earthquakes, this places a significant load on the HLR, affecting its smooth operation and disrupting P2P calls. After receiving the response from the HLR, the SMS is then sent to the relevant VLR with IMSI.
Notably, the current systems lack a caching mechanism in the STP (Signaling Transfer Point). Consequently, when queries from various services for the same number occur within similar time frames, the STP must relay separate messages to the relevant units (e.g., HLR) for each query, placing a burden on the unit and causing inefficiencies in the STP.
Were you aware that those risks can be effectively mitigated with Defne Signaling Gateway's (SGW) HLR Caching feature?
I wonder what the restart process would look like after such an HLR failure. Would it be similar to a 'black start' for power grids? (Power grids need power in order to be turned back on, so they draw power from other grids when restarting. In rare cases the whole grid is shut down, so there are no other grids to draw power from. The grid then requires a 'black start', which is much more troublesome than a normal start.) When the HLR crashes and SRI requests pile up, if the telecom operator tries to restart the HLR, it will be shut down again on launch. It won't be able to restart because the SRI requests will overload…