HSS9860 V900R008C20 Troubleshooting
Troubleshooting
www.huawei.com
Copyright © 2013 Huawei Technologies Co., Ltd. All rights reserved. Page3
Objective
Upon completion of this course, you will be able to know:
Contents
1. Troubleshooting
2. Prevent Failures
Troubleshooting
Faults of a system are classified into common faults and emergency faults.
Common Fault
Common faults are device faults that occur unexpectedly and affect a small range of
services or devices. They do not severely affect the running and quality of service (QoS)
of a network.
Category | Description
Service failures | Service failures reported by subscribers
Emergency Fault
Emergency faults are device faults that occur unexpectedly and affect a large range of
services or devices. They severely affect the running and quality of service (QoS)
of a network.
Troubleshooting an emergency fault aims to recover the system and service provisioning
as soon as possible. To improve the efficiency of troubleshooting emergency faults
and minimize losses, you must adhere to the following principles.
Overview of Alarm Handling
Alarm Console
The alarm box provides only visible and audible alarm severity information. The alarm
console on the LMT provides the details about alarms.
Alarm Severity
The alarm severity indicates the severity level of an alarm.
In descending order of alarm severity, alarms are classified into four types:
Critical alarm: Critical alarms must be cleared immediately. Otherwise, a system breakdown
may occur.
Major alarm: This type of alarm affects the QoS of the system, so urgent action is required to
rectify the fault.
Minor alarm: This type of alarm does not affect the QoS of the system, but you need to locate and
remove the fault in time.
Warning alarm: This type of alarm should be handled based on the actual conditions.
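The descending severity order above can be sketched as a small triage routine. This is only an illustration: the numeric values, the `triage` helper, and the sample alarm names are assumptions and do not come from the HSS9860 alarm console.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Lower value = more severe, mirroring the descending order in the text.
    # The numeric values themselves are an assumption for illustration.
    CRITICAL = 1
    MAJOR = 2
    MINOR = 3
    WARNING = 4


def triage(alarms):
    """Return the alarms sorted so that the most severe come first."""
    return sorted(alarms, key=lambda alarm: alarm[1])


# Hypothetical alarm records: (name, severity).
alarms = [
    ("fan speed low", Severity.WARNING),
    ("link down", Severity.MAJOR),
    ("board fault", Severity.CRITICAL),
    ("disk usage high", Severity.MINOR),
]

ordered = triage(alarms)
for name, severity in ordered:
    print(f"{severity.name:<8} {name}")
```

Sorting by severity first gives a handling order that matches the guidance above: critical alarms are dealt with before anything else, warnings last.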
Fault Detection Mechanism
The fault detection subsystem monitors the operating status of the equipment through
hardware detection and software detection. It reports the detected faults to you so that you
can rectify them in time.
Hardware detection
The hardware detection implemented by boards is as follows:
Board state (normal/abnormal or active/standby)
Clock
Temperature
Online/Offline state
Software detection
Logical errors can be detected through software detection. The logical errors that can be detected
are as follows:
Cyclic Redundancy Check (CRC) error
Memory error
Data consistency error
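As an illustration of how a CRC error is detected in general, the sketch below appends a CRC-32 checksum to a payload and re-verifies it later. It uses Python's standard `zlib.crc32`; the source does not state which CRC variant or framing the equipment uses, so the 4-byte trailer layout and the helper names `make_frame` and `frame_is_intact` are assumptions.

```python
import zlib


def make_frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 checksum to the payload (assumed layout)."""
    checksum = zlib.crc32(payload)
    return payload + checksum.to_bytes(4, "big")


def frame_is_intact(frame: bytes) -> bool:
    """Recompute the CRC-32 over the payload and compare it with the stored value."""
    payload, stored = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored


frame = make_frame(b"subscriber record")
# Simulate a transmission or storage error by flipping one bit of the payload.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]

print(frame_is_intact(frame))      # True
print(frame_is_intact(corrupted))  # False
```

A mismatch between the stored and recomputed checksum is what would be reported as a CRC error.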
Analysis Methods Used in Troubleshooting
Analyzing Indicator Status
Indicators use different colors to reflect the status of boards and links. When a board experiences
a fault, the indicator status can be used to identify it.
Analyzing Performance Measurement Information
Performance measurement collects the running information of the system in real time. The
performance measurement information reflects the running status of the system. It can be used for
fault identification when the system experiences a fault.
Analyzing Traced Messages
Message tracing provides dynamic, real-time monitoring of the call connection process, resource
usage, and service flow over ports and signaling links. The traced messages allow you to locate a call
connection failure quickly and help you to troubleshoot the fault. In addition, the traced messages
help you to learn about the signaling exchange between NEs.
Analyzing Error Codes
Error codes are returned by the system when the operations performed on the client fail. The error
codes are used to query specific error information, which is helpful for fault identification.
Analyzing Logs
Logs record specific running information of each module of the system or data configuration
operations performed on the client. You can use logs to identify faults, but doing so is more time
consuming than the other analysis methods. Therefore, use logs to identify faults only when
the other analysis methods do not work.
Analyzing the Device Panel
The device panel provides a visual emulation pane, through which you can perform operations to
manage the hardware, software, and modules of the system. It displays the boards in different colors
according to their hardware status; it also displays the status indicators of modules in different colors
according to the process status. The colors of boards and status indicators of modules can be used for
fault identification when a board or module experiences a fault.
Analyzing the Service Panel
The service panel provides a topology view of the logical modules of the system. It depicts the
service processing of modules on different logical layers in graphs or continuous curves. It displays
the modules in different colors according to their status. You can quickly locate a faulty module
based on the colors. In the case of emergency faults in particular, the service panel effectively shortens
the time needed for fault identification.
Methods of Analyzing Common Faults
Fault | Method
Hardware faults | Analyzing Alarm Information, Analyzing the Device Panel, Analyzing Indicator Status, Analyzing Logs
Link faults | Analyzing Traced Messages, Analyzing Alarm Information, Analyzing Performance Measurement Information, Analyzing Indicator Status, Analyzing Logs
Operation failures on the OMU | Analyzing Alarm Information, Analyzing the Device Panel, Analyzing Error Codes, Analyzing Logs
Operation failures on the provisioning system | Analyzing Alarm Information, Analyzing Error Codes, Analyzing Logs
Subscriber service failures | Analyzing Alarm Information, Analyzing Performance Measurement Information, Analyzing Traced Messages, Analyzing Logs
Device performance faults | Analyzing Performance Measurement Information, Analyzing Alarm Information
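The table above can be held as a simple lookup structure, for example in a maintenance script that suggests where to start looking. The sketch below transcribes the table into a Python dictionary, with the "Analyzing" prefix dropped for brevity. The helper names `methods_for` and `faults_using` are hypothetical and are not part of any Huawei tool.

```python
# Fault type -> analysis methods, in the order given in the table above.
ANALYSIS_METHODS = {
    "Hardware faults": [
        "Alarm Information", "Device Panel", "Indicator Status", "Logs",
    ],
    "Link faults": [
        "Traced Messages", "Alarm Information",
        "Performance Measurement Information", "Indicator Status", "Logs",
    ],
    "Operation failures on the OMU": [
        "Alarm Information", "Device Panel", "Error Codes", "Logs",
    ],
    "Operation failures on the provisioning system": [
        "Alarm Information", "Error Codes", "Logs",
    ],
    "Subscriber service failures": [
        "Alarm Information", "Performance Measurement Information",
        "Traced Messages", "Logs",
    ],
    "Device performance faults": [
        "Performance Measurement Information", "Alarm Information",
    ],
}


def methods_for(fault_type):
    """Return the analysis methods listed for a fault type (empty if unknown)."""
    return ANALYSIS_METHODS.get(fault_type, [])


def faults_using(method):
    """Return the fault types whose row in the table includes the given method."""
    return [fault for fault, methods in ANALYSIS_METHODS.items() if method in methods]


print(methods_for("Link faults"))
print(faults_using("Traced Messages"))
```

The reverse lookup shows, for instance, that traced messages are listed only for link faults and subscriber service failures, while alarm information appears in every row.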
Methods of Analyzing Emergency Faults
Contents
1. Troubleshooting
2. Prevent Failures
Overview of Fault Prevention
Fault prevention is a set of preventive measures taken regularly while the system is running. Fault
prevention helps to locate and eliminate defects or helps to troubleshoot the system in time to ensure
long-term security and stability of the system.
Based on the implementation period, fault prevention is classified into daily maintenance and
periodic maintenance.
Daily Maintenance
Daily maintenance consists of simple operations performed daily by common maintenance
personnel.
Periodic Maintenance
Periodic maintenance consists of complex operations performed regularly by qualified
maintenance personnel.
Daily Maintenance
Identify alarms generated by the equipment or identify existing defects on the equipment in time, and take
preventive measures. This ensures the stability of the equipment and reduces the number of faults or failures.
Examine the operating status of the equipment and the network in real time, and predict their running
status for the coming period. This helps to improve the efficiency of
maintenance engineers in handling emergencies.
Periodic Maintenance
Ensure that the equipment is in good condition and is safe, stable, and reliable to operate.
Identify the defects in the equipment, such as natural aging, malfunction, and deterioration of performance
through periodic checks, backup measures, tests, and cleaning processes. Take proper measures to eliminate
these defects.
Categories of Fault Prevention Operations
Based on the maintenance period, the fault prevention operations are categorized as follows:
Check and troubleshoot the faults reported by the system every day.
Check the system performance and the subscriber data backed up by the third-party device every week, and identify
and rectify the potential faults. This helps to ensure the normal running of the system and data
consistency.
Check the system running status and data consistency between the active and redundancy systems every month.
This helps to eliminate the potential faults from the systems.
Check the system time and running status of the internal components of the system every quarter.
This helps to ensure a normal running environment for the equipment.
Check the system ports and system passwords every half year. This helps to
ensure a normal running environment for the equipment.
Check the switchover between the active and redundancy systems, cable connections, grounding, and power
supply in the equipment room every year. This helps to ensure that the standby modules or the redundancy system
can take over the services of the active system in case of a fault. It also helps to eliminate the potential risks caused
by the aging of the equipment.
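The maintenance periods above can be tracked with a small due-date check. The sketch below is illustrative only: the interval lengths in days, the `overdue` helper, and the sample dates are assumptions, since the text names the periods but not their exact lengths.

```python
from datetime import date, timedelta

# Interval lengths in days are assumptions; the text only names the periods.
INTERVALS = {
    "daily": 1,
    "weekly": 7,
    "monthly": 30,
    "quarterly": 91,
    "half-yearly": 182,
    "yearly": 365,
}


def overdue(last_done, today):
    """Return the periods whose check is due on or before `today`."""
    return [
        period
        for period, days in INTERVALS.items()
        if today - last_done[period] >= timedelta(days=days)
    ]


# Hypothetical record of when each check was last completed.
last_done = {
    "daily": date(2013, 6, 9),
    "weekly": date(2013, 6, 5),
    "monthly": date(2013, 5, 1),
    "quarterly": date(2013, 4, 1),
    "half-yearly": date(2013, 1, 15),
    "yearly": date(2012, 9, 1),
}

due_now = overdue(last_done, date(2013, 6, 10))
print(due_now)  # ['daily', 'monthly']
```

In practice the completion dates would come from the maintenance record sheets for each period rather than from a hard-coded dictionary.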
Prevent Failures
Daily Maintenance
Office name:________________________ Maintenance date (year-month-day):___________________
Weekly Maintenance
Office name:________________________ Maintenance date (year-month-day):__________________
Monthly Maintenance
Office name:________________________ Maintenance date (year-month-day):___________________
Quarterly Maintenance
Office name:________________________ Maintenance date (year-month-day):___________________
Semi-annual Maintenance
Office name:________________________ Maintenance date (year-month-day):___________________
Yearly Maintenance
Office name:________________________ Maintenance date (year-month-day):___________________