Health Check Script On MME

Run the following health check script on the MME to get an up-to-date health report.

It should detect most issues on the MME.

Use it together with the call processing scripts that were sent earlier.
The script covers:
1. Current software version
2. Status of diskless and diskful (OAM) cards
3. Status of services on the MME
4. Link status
5. One-minute snapshot of call processing
6. High-level alarm summary

bash-3.2# /home/ntac1/tools/mme_healthcheck
Health Check run at Mon Nov 18 15:12:24 2013

Current loads <<< Software version on OAM cards and MAF cards.
-------------
Host Version
------ -------
m0101-s00c01h0 R28.53.06
m0101-s00c02h0 R28.53.06
HostId=000500 PoolMask=033000000 BuildNumber=28.53.06.00:1373307759
HostId=000600 PoolMask=033000001 BuildNumber=28.53.06.00:1373307759
HostId=000900 PoolMask=033001000 BuildNumber=28.53.06.00:1373307759
HostId=001000 PoolMask=033001001 BuildNumber=28.53.06.00:1373307759
HostId=001100 PoolMask=033002000 BuildNumber=28.53.06.00:1373307759
HostId=001200 PoolMask=033002001 BuildNumber=28.53.06.00:1373307759
HostId=001300 PoolMask=033003000 BuildNumber=28.53.06.00:1373307759
HostId=001400 PoolMask=033003001 BuildNumber=28.53.06.00:1373307759
HostId=010100 PoolMask=033004000 BuildNumber=28.53.06.00:1373307759
HostId=010200 PoolMask=033004001 BuildNumber=28.53.06.00:1373307759
HostId=010300 PoolMask=033005000 BuildNumber=28.53.06.00:1373307759
HostId=010400 PoolMask=033005001 BuildNumber=28.53.06.00:1373307759
HostId=010500 PoolMask=033006000 BuildNumber=28.53.06.00:1373307759
HostId=010600 PoolMask=033006001 BuildNumber=28.53.06.00:1373307759
HostId=010900 PoolMask=033007000 BuildNumber=28.53.06.00:1373307759
HostId=011000 PoolMask=033007001 BuildNumber=28.53.06.00:1373307759
HostId=011100 PoolMask=033008000 BuildNumber=28.53.06.00:1373307759
HostId=011200 PoolMask=033008001 BuildNumber=28.53.06.00:1373307759
HostId=011300 PoolMask=033009000 BuildNumber=28.53.06.00:1373307759
HostId=011400 PoolMask=033009001 BuildNumber=28.53.06.00:1373307759
HostId=000300 PoolMask=034000000 BuildNumber=28.53.06.00:1372882505
HostId=000400 PoolMask=034000001 BuildNumber=28.53.06.00:1372882505
HostId=000740 PoolMask=041000000 BuildNumber=28.53.06.00:1372880752
HostId=000840 PoolMask=041000001 BuildNumber=28.53.06.00:1372880752
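
The version check above can be automated. The following is a minimal sketch, not part of the health check tool itself: it assumes the report has been saved as text, and it compares only the part of `BuildNumber` before the colon, because the suffix after the colon differs between card groups in the sample output (treating that suffix as a per-group build stamp is an assumption). The function names are hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
# HostId=000500 PoolMask=033000000 BuildNumber=28.53.06.00:1373307759
BUILD_RE = re.compile(r"HostId=(\d+)\s+PoolMask=(\d+)\s+BuildNumber=([\d.]+):(\d+)")


def release_versions(report_text):
    """Map each HostId to the release part of its BuildNumber (text before ':')."""
    return {m.group(1): m.group(3) for m in BUILD_RE.finditer(report_text)}


def mismatched_hosts(report_text):
    """Return hosts whose release differs from the most common release."""
    versions = release_versions(report_text)
    if not versions:
        return {}
    most_common, _ = Counter(versions.values()).most_common(1)[0]
    return {host: ver for host, ver in versions.items() if ver != most_common}


sample = """\
HostId=000500 PoolMask=033000000 BuildNumber=28.53.06.00:1373307759
HostId=000300 PoolMask=034000000 BuildNumber=28.53.06.00:1372882505
HostId=000740 PoolMask=041000000 BuildNumber=28.53.06.00:1372880752
"""

print(mismatched_hosts(sample))  # prints {} because all releases match
```

An empty result means every card reports the same release; any entry in the result is a card worth checking by hand.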
Diskless Status <<< Make sure status is "InserviceActive" or "InserviceStbyHot"
---------------
HostId=000500 PoolMask=033000000 status=InserviceActive forced=No degraded=No
HostId=000600 PoolMask=033000001 status=InserviceStbyHot forced=No degraded=No
HostId=000900 PoolMask=033001000 status=InserviceActive forced=No degraded=No
HostId=001000 PoolMask=033001001 status=InserviceStbyHot forced=No degraded=No
HostId=001100 PoolMask=033002000 status=InserviceActive forced=No degraded=No
HostId=001200 PoolMask=033002001 status=InserviceStbyHot forced=No degraded=No
HostId=001300 PoolMask=033003000 status=InserviceActive forced=No degraded=No
HostId=001400 PoolMask=033003001 status=InserviceStbyHot forced=No degraded=No
HostId=010100 PoolMask=033004000 status=InserviceActive forced=No degraded=No
HostId=010200 PoolMask=033004001 status=InserviceStbyHot forced=No degraded=No
HostId=010300 PoolMask=033005000 status=InserviceActive forced=No degraded=No
HostId=010400 PoolMask=033005001 status=InserviceStbyHot forced=No degraded=No
HostId=010500 PoolMask=033006000 status=InserviceActive forced=No degraded=No
HostId=010600 PoolMask=033006001 status=InserviceStbyHot forced=No degraded=No
HostId=010900 PoolMask=033007000 status=InserviceStbyHot forced=No degraded=No
HostId=011000 PoolMask=033007001 status=InserviceActive forced=No degraded=No
HostId=011100 PoolMask=033008000 status=InserviceStbyHot forced=No degraded=No
HostId=011200 PoolMask=033008001 status=InserviceActive forced=No degraded=No
HostId=011300 PoolMask=033009000 status=InserviceActive forced=No degraded=No
HostId=011400 PoolMask=033009001 status=InserviceStbyHot forced=No degraded=No
HostId=000300 PoolMask=034000000 status=InserviceActive forced=No degraded=No
HostId=000400 PoolMask=034000001 status=InserviceStbyHot forced=No degraded=No
HostId=000740 PoolMask=041000000 status=InserviceStbyHot forced=No degraded=No
HostId=000840 PoolMask=041000001 status=InserviceActive forced=No degraded=No
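
The status rule above (every card should be InserviceActive or InserviceStbyHot) is easy to check mechanically. A minimal sketch, assuming the status lines have been captured as text; the function name and the `OutOfService` value in the sample are hypothetical and used only to show a failing case.

```python
import re

# Matches lines such as:
# HostId=000500 PoolMask=033000000 status=InserviceActive forced=No degraded=No
STATUS_RE = re.compile(
    r"HostId=(\d+)\s+PoolMask=(\d+)\s+status=(\S+)\s+forced=(\S+)\s+degraded=(\S+)"
)

# The two states the note treats as healthy.
HEALTHY = {"InserviceActive", "InserviceStbyHot"}


def unhealthy_cards(report_text):
    """Return (HostId, PoolMask, status) for every card not in a healthy state."""
    problems = []
    for m in STATUS_RE.finditer(report_text):
        host, pool, status = m.group(1), m.group(2), m.group(3)
        if status not in HEALTHY:
            problems.append((host, pool, status))
    return problems


sample = """\
HostId=000500 PoolMask=033000000 status=InserviceActive forced=No degraded=No
HostId=000600 PoolMask=033000001 status=InserviceStbyHot forced=No degraded=No
HostId=000900 PoolMask=033001000 status=OutOfService forced=No degraded=No
"""

for host, pool, status in unhealthy_cards(sample):
    print(f"CHECK HostId={host} PoolMask={pool} status={status}")
```

The sketch looks only at `status`. The `forced` and `degraded` fields are parsed but not evaluated, since the note does not state a rule for them.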
MI VC State <<< Status of MI; shows which OAM card is being used for management functions.
-----------
state of MI virtual cluster is A - Active
state of MI host primary virtual machine is A - Active
state of MI host alternate virtual machine is S - Standby
FS Config VC state
------------------
state of FS CNFG host VC is A
state of FS CNFG host primary virtual machine is A
state of FS CNFG host alternate virtual machine is S
RCC status (OAM)
----------------

RCC Cluster [clusterfile] Status


m0101-s00c01h0 2 A - ACTIVE
m0101-s00c02h0 3 L - LEAD

Aggregate Service Maintenance States <<< OperState should be Enabled at all times.
------------------------------------
Entity AdminState OperState AvailabilityStatus UnknownStatus
MIF_AGGSVC NA Enabled Null False
MAF_AGGSVC NA Enabled Null False
MPH_AGGSVC NA Enabled Null False
MME_AGGSVC Unlocked Enabled Null False

Performance Data <<< Not checked yet; to be updated later.
----------------
15 minute CPU averages for the time period 1445-1500

-----------
Link Status <<< Use the individual commands provided earlier to run it manually.
-----------
Interface Count Admin_State Oper_State
S1mme 3622 of 3624 Unlocked Enabled
S1mme 2 of 3624 Unlocked Disabled
Details on 2 links in this state:
Interface Link_Index Admin_State Oper_State Avail_Attribute Identifier
S1mme 2157 Unlocked Disabled None 10.89.102.178 LCL=10.148.16.36
S1mme 2758 Unlocked Disabled None 10.209.14.90 LCL=10.148.16.36
POTENTIAL ISSUE: some S1mme links are Disabled
S6a: All 2 links are Enabled <<< All links are up. Check and report any links that are down.
S11: All 6 links are Unlocked/Enabled

S10: All 5 links are Unlocked/Enabled


S102: All 17 links are Unlocked/Enabled
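
The per-interface count lines in the link status section (for example `S1mme 2 of 3624 Unlocked Disabled`) can be summarized with a short script. This is a sketch under the assumption that the count lines always follow the `<interface> <n> of <total> <admin> <oper>` layout seen above; the function name is hypothetical.

```python
import re

# Matches count lines such as: "S1mme 2 of 3624 Unlocked Disabled"
COUNT_RE = re.compile(
    r"^(\S+)[ \t]+(\d+) of (\d+)[ \t]+(\S+)[ \t]+(\S+)[ \t]*$", re.MULTILINE
)


def links_not_enabled(report_text):
    """Return {interface: number of links whose Oper_State is not Enabled}."""
    down = {}
    for m in COUNT_RE.finditer(report_text):
        iface, count, oper_state = m.group(1), int(m.group(2)), m.group(5)
        if oper_state != "Enabled":
            down[iface] = down.get(iface, 0) + count
    return down


sample = """\
Interface Count Admin_State Oper_State
S1mme 3622 of 3624 Unlocked Enabled
S1mme 2 of 3624 Unlocked Disabled
"""

print(links_not_enabled(sample))  # prints {'S1mme': 2}
```

Interfaces reported in the short form (`S11: All 6 links are Unlocked/Enabled`) do not match the pattern and are therefore never flagged, which is the intended behavior for this sketch.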

PCMD Information
----------------
====================================< KPI >=====================================
Name Result Data
-------------------------------------- --------- --------------------
VS_FailedInitAttachRate % 14.0059 ( 2381 / 17000)
VS_FailedInitAttachRate_SGW % 0.0000 ( 0 / 17000)
VS_FailedInitAttachRate_PGW % 0.0000 ( 0 / 17000)
FailedInitAttachRate_HSS % 0.0000 ( 0 / 17000)
VS_FailedServiceRequestRate_net % 0.2021 ( 452 / 223628)
VS_FailedUEInitServiceRequestRate % 0.4660 ( 661 / 141833)
VS_FailedNetInitServiceRequestRate % -1.0073 ( -826 / 82004)
VS_FailedNetInitServiceRequestRate_net % 0.0000 ( 0 / 82004)
VS_FailedCallAttemptRate_net % 0.2726 ( 650 / 238445)
UECtxRelReq_Rate % 0.0000 ( 0 / 237539)
VS_AbnormalUECtxRel_Ratio_Tot % 0.4814 ( 1201 / 249488)
VS_AbnormalUECtxRel_Ratio_RF % 0.4513 ( 1126 / 249488)
DetatchFailRate % 0.0000 ( 0 / 185)
UEInitDetatchFailRate % 0.0000 ( 0 / 183)
MMEInitDetatchFailRate % 0.0000 ( 0 / 1)
HSSInitDetatchFailRate % 0.0000 ( 0 / 1)
VS_TAUfailRate % 3.3803 ( 235 / 6952)
VS_FailedS1HORate % 11.2903 ( 105 / 930)
VS_FailedS1HORate_net % 11.2903 ( 105 / 930)
VS_FailedX2HORate_net % 0.0000 ( 0 / 5687)
VS_AbnormalUECtxRel_Ratio_Alt % 0.4814 ( 1201 / 249488)
VS_FailedCallAttemptRate_total % 1.2631 ( 3042 / 240837)
VS_IneffCallAttemptRate % 0.9932 ( 2392 / 240837)
VS_NoPageRspRate % 3.7557 ( 4153 / 110580)
VS_FailedX2HORate % 0.0000 ( 0 / 5687)
VS_FailedS1HOInMMERate ......... ( 0 / 0)
================================================================================
FCA % 13.5882 (2310 /(14690 + 2310))
================================================================================
totalFail % 5.7981 (31657 / 545992) <<< Should be around 5%; report if failures exceed 10%.
================================================================================
Time of First Record: 2013:11:18 20:11:0 UTC -05:00 m0101-s00c01h0
Time of Last Record: 2013:11:18 20:11:58 UTC -05:00 Ver: mme_5.0+_20121024
Start time: 2013-11-18 15:12:31 Processing time: 31 (sec) Total Records: 545992
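
The totalFail figure is the failed record count divided by the total record count, here 31657 / 545992, or about 5.80%. A minimal sketch of applying the 10% reporting rule to that line follows; the function name and the `OK`/`REPORT` labels are hypothetical, and the rate is recomputed from the two counts rather than taken from the printed percentage.

```python
import re

# Matches: "totalFail % 5.7981 (31657 / 545992)"
TOTAL_RE = re.compile(r"totalFail\s*%\s*[-\d.]+\s*\(\s*(\d+)\s*/\s*(\d+)\s*\)")


def classify_total_fail(line, report_above=10.0):
    """Return (rate_percent, verdict) for a totalFail line, or None if absent."""
    m = TOTAL_RE.search(line)
    if m is None:
        return None
    failed, total = int(m.group(1)), int(m.group(2))
    rate = 100.0 * failed / total if total else 0.0
    verdict = "REPORT" if rate > report_above else "OK"
    return rate, verdict


rate, verdict = classify_total_fail("totalFail % 5.7981 (31657 / 545992)")
print(f"{rate:.4f}% -> {verdict}")
```

The same arithmetic applies to the FCA line: 2310 / (14690 + 2310) = 2310 / 17000, about 13.59%.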

Alarms <<< Snapshot of alarms
------
Alarm Summary
Severity Alarm information
-------- -----------------
Major LSS_pathAvailability: SCTP PATH UNREACHABLE: RMT=SCTP-36422@10.89.102.178, LCL=SCTP-36412@10.148.16.36
Major LSS_osSecInfoModificationDetected: ERROR: File has an invalid owner or group
(file=/tmp/hsperfdata_ci905841).
Major LSS_osSecInfoModificationDetected: ERROR: File has been illegally altered (file = /var/spool/cron/root,
2275925634, 4214229111).
Major LSS_externalLinkDown: S1mme link 1369 (10.90.225.50) intf name S1MME_1 is Disabled
Major LSS_externalLinkDown: S1mme link 2116 (10.89.103.250) intf name S1MME_1 is Disabled
Major LSS_externalLinkDown: S1mme link 701 (96.24.102.174) intf name S1MME_1 is Disabled
POTENTIAL ISSUE: 6 Major alarms
*** 2 Health Check commands had potential issues for an in-service system ***
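
The alarm count in the summary can be cross-checked by counting the lines that begin with a severity keyword. A minimal sketch: only `Major` appears in the output above, so the other severity names in the tuple are assumptions, and the function name is hypothetical. Lines that do not start with a severity are treated as a continuation of the previous alarm and are not counted.

```python
from collections import Counter

# Only "Major" is confirmed by the sample output; the rest are assumed names.
SEVERITIES = ("Critical", "Major", "Minor", "Warning")


def count_alarms(alarm_text):
    """Count alarms per severity, based on the first word of each line."""
    counts = Counter()
    for line in alarm_text.splitlines():
        words = line.split()
        if words and words[0] in SEVERITIES:
            counts[words[0]] += 1
    return counts


sample = """\
Major LSS_osSecInfoModificationDetected: ERROR: File has an invalid owner or group
(file=/tmp/hsperfdata_ci905841).
Major LSS_externalLinkDown: S1mme link 1369 (10.90.225.50) intf name S1MME_1 is Disabled
"""

print(count_alarms(sample)["Major"])  # prints 2
```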
