ORAN Proposal

ADT : Anomaly Detection Tool

Existing Problems

Problem Statement
• Detect network degradation as soon as possible.
• Generating an anomaly report across a large circle network takes several man-days.

Manual Approach Challenges
• Every cell must be analyzed manually, per hour, and dynamic thresholds must be detected by hand.
• Huge number of cells: our EMS circle has 80,000 cells, ~950 GB/day.
• Mis-tagging or incorrect tagging leads to errors in analysis.
• Summary reports and issue mitigation are delayed.
ADT : KPIs Behavior

Diamond KPI Category
• Connection: Setup Success Rate, Call Drop Rate, Session Time
• Quality: Throughput, Packet Delay, No. of Connected Users
• Data Volume: QCI 1/2/5/9
• Mobility: Handover Success Rate, Handover Time

[Figure: KPI Gaussian Behavior]

Conclusions
• A problem in a mobile communications network can be directly mapped to anomalous behaviour of the performance metrics collected from deployed NEs (Network Elements).
• Thus our goal is to automate anomaly detection using statistical / ML models, reducing the effort to less than an hour across a large circle network.
• Multiple factors affect KPIs and must be considered, such as number of users, climate conditions and geo-location.
ADT : Architecture

Initialization Module
• RAN: EMS Server
• Collect KPI / 15 min
• Big Data / SQLite Interface

Pre Processing
• Context Filtering (Week Day, Holiday)
• Derive Diamond KPI (Formula Based)
• Load KPI data / hourly
• Aggregated 500 KPI to 67 KPI
• Data Scaling: Hourly, Per Cell (24 hours * 2 scalars * 80000)

Domain Labelling
• Apply Z-Test
• Good KPI Logic Labelling (Tuneable Threshold/KPI)

Model Training
• Train XGBoost model
• Tune Hyper Parameters

Model Validation

Anomaly Report Generation
• KPI Anomaly Frequency Summary
• Highlight Each anomaly/hour

Background Polling Job

Re-Training
• Existing Cell Update: 5000*67 scalars/day
• New Cell Training (15 Days)

UI View / Web View
• Model Training / Threshold Reset
• Scalars based Cell or NE or EMS level summary

Logging Module
• Asynchronous

Legend: Sub-Module, Module, Future Scope

Highlights
• There are expected fluctuations in KPIs across hours.
• Outliers are detected using z-scores with the help of hourly scalars for NEs computed over 30 days.
• Detection thresholds are decided using feedback.
• After analysing multiple machine learning algorithms (Isolation Forest, CBLOF, KNN, OCSVM, Random Forest, XGBoost), XGBoost performed well on seen and unseen data.
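The z-score labelling step named in this slide can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: it assumes the "2 scalars" per cell and hour are a mean and a standard deviation, and the function names, the `cell-1` identifier, the sample values and the threshold of 3.0 are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def fit_hourly_scalars(history):
    """Compute two scalars (mean, std) per (cell, hour) from training samples.

    `history` is an iterable of (cell_id, hour, kpi_value) tuples, e.g. one
    value per day over the training window. Assumption: the slide's
    "2 scalars" are mean and standard deviation.
    """
    buckets = defaultdict(list)
    for cell_id, hour, value in history:
        buckets[(cell_id, hour)].append(value)
    return {key: (mean(vals), pstdev(vals)) for key, vals in buckets.items()}


def z_score(scalars, cell_id, hour, value):
    """Standardize a new KPI value against that cell's scalars for the same hour."""
    mu, sigma = scalars[(cell_id, hour)]
    if sigma == 0:
        # No spread in the training window: the value cannot be scored.
        return 0.0
    return (value - mu) / sigma


def label_anomaly(scalars, cell_id, hour, value, threshold=3.0):
    """Flag the value when its absolute z-score exceeds a tunable threshold."""
    return abs(z_score(scalars, cell_id, hour, value)) > threshold


# Hypothetical 30-day history for one cell at 15:00 (values alternate 98 and 99).
history = [("cell-1", 15, 98.0 + (day % 2)) for day in range(30)]
scalars = fit_hourly_scalars(history)

print(label_anomaly(scalars, "cell-1", 15, 90.0))  # far below the usual range
print(label_anomaly(scalars, "cell-1", 15, 98.7))  # within the usual range
```

Keeping the scalars per hour means a value that is normal at a busy hour is not compared against quiet-hour behaviour, which is one way to absorb the expected hourly fluctuations. The threshold is left as a parameter because the slide states that thresholds are tuned per KPI using feedback.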


ADT : Experiment Outcomes
Models And Results
• Distance based: KNN, LOF (Local Outlier Factor), CBLOF (Clustering-Based LOF)
• Density based: HDBSCAN (Hierarchical DBSCAN)
• Ensemble based: IF (Isolation Forest), XGBoost
• High-dimensional space based: OCSVM (One-Class Support Vector Machine)

Highlights
• The labelled data obtained after pre-processing is used for hyper-parameter tuning.
• Algorithms need to be configured with a "contamination" value for best results. Contamination defines the percentage of anomalous data within a given sample set.
• XGBoost performed best across all metrics on multiple circles in the cellular network.
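To make the "contamination" setting concrete, the sketch below shows one common way such a value is turned into a decision cutoff: the highest-scoring fraction of samples is flagged. It is an illustration under assumptions, not the method used in these experiments. The function names and the scores are hypothetical, and contamination is expressed as a fraction (0.2) rather than a percentage.

```python
import math


def threshold_from_contamination(scores, contamination):
    """Return the score cutoff that flags roughly `contamination` of the samples.

    Higher score means more anomalous. `contamination` is a fraction in (0, 1).
    """
    if not 0.0 < contamination < 1.0:
        raise ValueError("contamination must be between 0 and 1")
    ranked = sorted(scores, reverse=True)
    n_flagged = max(1, math.ceil(contamination * len(ranked)))
    return ranked[n_flagged - 1]


def flag_anomalies(scores, contamination):
    """Mark every sample whose score reaches the cutoff (ties may flag extra samples)."""
    cutoff = threshold_from_contamination(scores, contamination)
    return [score >= cutoff for score in scores]


# Hypothetical anomaly scores for ten samples; two of them stand out.
scores = [0.10, 0.20, 0.15, 0.90, 0.12, 0.11, 0.95, 0.13, 0.14, 0.16]
cutoff = threshold_from_contamination(scores, contamination=0.2)
flags = flag_anomalies(scores, contamination=0.2)
print(cutoff, flags)
```

If the contamination value is set too high, normal samples are flagged. If it is set too low, real anomalies are missed. This is why the value has to be tuned per data set.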

[Figure: bar chart comparing Isolation Forest, CBLOF, KNN, OCSVM and XGBoost on Accuracy, F1-Score, Precision and Recall; y-axis 0 to 120]

EMS Accuracy Report
Model Audit

EMS Name  Date   Hour      F1 Score  Accuracy  Precision  Recall  ROC AUC  No of New Cells  Anomaly Cells  Anomaly Cells  No of Cells
EMS1      Day-n  15:00:00  0.99      0.99      1.00       0.99    0.98     61               3485           3506           3872
EMS2      Day-n  15:00:00  0.98      0.98      0.99       0.97    0.98     81               3497           3580           5063
EMS3      Day-n  15:00:00  0.99      0.98      0.99       0.99    0.98     115              3611           3643           4182
EMS4      Day-n  15:00:00  0.99      0.98      1.00       0.99    0.98     125              3812           3847           4320
EMS5      Day-n  15:00:00  0.99      0.99      1.00       0.99    0.98     140              4547           4571           5008
EMS6      Day-n  23:00:00  0.97      0.97      0.97       0.96    0.97     202              2500           2533           5069

One time testing
Latest Live Internal Server testing for 1 EMS
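For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, recall and F1 score are conventionally derived from true and predicted anomaly labels. The labels are made up for illustration and are not taken from the report above. The function name is hypothetical, and ROC AUC is left out because it needs model scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = anomaly)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, left alone

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for ten cells: 4 true anomalies, 4 predicted anomalies.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision answers "how many flagged cells were really anomalous", while recall answers "how many anomalous cells were flagged". Reporting both alongside accuracy matters when anomalies are a small share of the data.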
ADT : Experiment Outcomes
KPI Summary Report

KPI's/Counts                    EMS1  EMS2  EMS3  EMS4  EMS5
DL Effective Throughput [Mbps]  564   740   753   962   949
UL Effective Throughput [Mbps]  241   506   440   513   264
DL Volume (GB)                  647   1151  1069  1037  698
UL Volume (GB)                  335   768   651   631   351
Avg. RRC Connected users        867   1706  1600  1344  922
UL Interference Power (dBm/RB)  613   554   651   1070  2089
IP Throughput (Mbps)            236   242   335   605   1209
Avg. Active UE-QCI1             542   1193  1046  1407  869
Avg. Active UE-QCI9             450   1124  1016  855   505
PDCP Loss Rate-QCI1 (%)         342   481   638   988   884

Executive Summary
• Data Aggregation, Data Preprocessing, Domain Labelling
• Model Training (XGBoost), Model Validation
• POC tested with 17 EMS, with 96% accuracy (one time)
• Validated tool in internal server with background logging, parallel execution, re-training and reset features on 1 EMS: 5000 cells
• Stability bug fixes for 17 EMS volume of data
• Validation with live 17 EMS (80000 cells) for 2 weeks continuously
• Large circle in centralized server (SNAP)


ADT : Advantages

• We do not need to build a separate model for each cell. A single generalized model works across multiple circles with strong results.
• This approach accounts for changes across times of the day and across days of the week.
• This approach removes the laborious manual process of labelling our data for anomalies.
• This approach combines a statistical method with a trained machine learning model, and was corroborated by the Telecom domain experts with whom we collaborated.
• The model gives high accuracy for anomaly detection (~99%).
Thank You
