eSpace EMS
V200R001C02SPC200
Troubleshooting Guide
Issue 04
Date 2012-06-08
HUAWEI TECHNOLOGIES CO., LTD.

Copyright © Huawei Technologies Co., Ltd. 2012. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior
written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
The Huawei logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and
the customer. All or part of the products, services and features described in this document may not be
within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements,
information, and recommendations in this document are provided "AS IS" without warranties, guarantees or
representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute the warranty of any kind, express or implied.
Huawei Technologies Co., Ltd.

Address: Huawei Industrial Base
Bantian, Longgang
Shenzhen 518129
People's Republic of China
Website: http://www.huawei.com
Email: support@huawei.com
Huawei Proprietary and Confidential

Contents
1 Conventions
2 Overview
2.1 Fault Source
2.2 Precautions for Troubleshooting
2.3 Requirements on Maintenance Personnel
2.4 Troubleshooting Flow
2.4.1 Troubleshooting Flowchart
2.4.2 Collecting Fault Scenario Information
2.4.3 Locating and Rectifying Faults
2.4.4 Checking Fault Rectification
2.4.5 Generating a Fault Rectification Report
2.4.6 Contacting Huawei
2.5 Obtaining Huawei Technical Support
3 Methods of Locating Faults
3.1 Viewing Alarms on the eSpace EMS Client
3.2 Log Analysis
3.2.1 Changing a Log Level
3.2.2 Logs
4 Fault Analysis
4.1 Performance Fault Analysis
4.1.1 Performance Statistics
4.1.2 Performance Alarms
4.2 Software Management Fault Analysis
4.2.1 Executing an Installation or Upgrade Task
4.2.2 Checking Host Information
4.3 iTrace Analysis
4.3.1 Creating a Tracing Task
4.3.2 Displaying Tracing Messages
4.4 iCnfg Analysis
4.5 DR Fault Analysis
5 Troubleshooting
5.1 Checking the Running Status of the eSpace EMS
5.1.1 Starting the eSpace EMS Service
5.1.2 Querying the eSpace EMS Service Status
5.1.3 Stopping the eSpace EMS Service
5.2 Checking the Running Status of the DR System
5.2.1 Starting the GDR Software
5.2.2 Checking the Process Status of the GDR Software
5.2.3 Checking the States of DR Resources
5.2.4 Checking the Database Synchronization Status
5.2.5 Checking the File Synchronization Status
5.2.6 Checking the Statuses of the Switched Roles of the DR System
5.2.7 Stopping the GDR Software
6 Collecting Fault Information
6.1 OS Information
6.2 Network Device Information
6.3 DR Information
6.4 Oracle Database Information
6.5 Collecting Logs
6.6 Version Information
7 Troubleshooting Cases
7.1 Filesync Exception
7.2 DataGuard Synchronization Exception
7.3 GDR Process Exception
7.4 Modifying Information About the Master Node Corresponding to the Mediation Node After Switching
7.5 The Performance Data of Some Network Devices Cannot Be Collected on the eSpace EMS
7.6 Fault Rectification About IP PBX Performance Data Collection Status
7.7 Fault Rectification in the File System
7.8 eSpace EMS Page Is Leftward Offset in IE 8.0
7.9 File Download Dialog Box Is Displayed After a Click on the Upload Icon
7.10 Failure to Export Data
7.11 Browser Page Cannot Be Properly Displayed or Some Browser Functions Are Unavailable

1 Conventions
This topic describes the conventions used in this guide.

• The user name of the eSpace EMS is i2kuser.
• {install path} is the installation path of the eSpace EMS. The default path is /opt/oms.
• {GDRWORKDIR} is the GDR installation path. The default path is /opt/oms/gdr.
2 Overview
About This Chapter

This topic helps maintenance personnel to locate and rectify faults.
2.1 Fault Source
This topic describes the fault sources that trigger fault handling activities, and the jobs of
responsible persons before they submit faults to maintenance personnel.
2.2 Precautions for Troubleshooting
Before locating and rectifying faults, maintenance personnel must take the relevant precautions,
particularly for significant and dangerous operations, to ensure the safety of personnel,
services, and devices.
2.3 Requirements on Maintenance Personnel
This topic describes the requirements for the qualifications of maintenance personnel.
2.4 Troubleshooting Flow
This topic describes the general process of rectifying faults and the operations in each step.
2.5 Obtaining Huawei Technical Support
This topic describes how to obtain technical support from Huawei.
2.1 Fault Source

This topic describes the fault sources that trigger fault handling activities, and the jobs of
responsible persons before they submit faults to maintenance personnel.
The fault sources are as follows:
• Customer complaints
The customer service department receives customer complaints and starts the fault
rectification process. The department filters out non-defect events, collects fault scenario
information, and transfers faults to maintenance personnel.
• Routine maintenance
In routine maintenance, maintenance personnel regularly take preventive measures
during the normal running of devices to detect and eliminate hidden faults in the devices
in time.
The routine maintenance of the eSpace EMS includes but is not limited to the following:
− Check whether the services on the eSpace EMS server run normally.
− Check whether the database runs normally.
− Check whether the performance indicators of servers and services meet requirements.
For more information about routine maintenance, see Routine Maintenance.
2.2 Precautions for Troubleshooting

Before locating and rectifying faults, maintenance personnel must take the relevant precautions,
particularly for significant and dangerous operations, to ensure the safety of personnel,
services, and devices.
Before locating and rectifying faults, maintenance personnel must:
• Strictly comply with the operation and industry safety regulations to ensure the safety of
personnel and devices.
• Take antistatic measures such as wearing an ESD-preventive wrist strap when replacing
and maintaining device parts.
• Not directly connect external computers to the eSpace EMS.
• Strictly control the use of network services.
• Record all relevant raw information in detail when any problem arises during
maintenance.
• Record all significant operations, such as restarting processes. Before these operations,
check the feasibility of the operations, back up data, prepare emergency and safety
measures, and make sure that operations are performed by qualified operators.
• Be cautious when performing the following dangerous operations:
− Deleting directories and files from the eSpace EMS
− Modifying the configuration files of the database
− Modifying the attributes of the database
− Deleting the log files from the systems and database
− Stopping the systems, processes, and database
− Running the kill command
− Modifying the configurations of the network devices
2.3 Requirements on Maintenance Personnel

This topic describes the requirements for the qualifications of maintenance personnel.
To ensure effective maintenance, maintenance personnel must have basic knowledge of
networks and computers, understand the service processes of the eSpace EMS, be skilled in
locating and rectifying faults, and be familiar with the on-site environment.
Maintenance personnel must therefore meet the following requirements:
• Having basic knowledge of network devices, operating systems (OSs), and databases,
understanding the common commands, and being skilled in using them to perform
maintenance.
• Understanding the logical structure of the eSpace EMS networking, the mapping between
the eSpace EMS and on-site devices, and the physical connections between on-site
devices.
• Being familiar with the system structure of the eSpace EMS and skilled in operating the
eSpace EMS.
• Understanding the basic methods of locating and rectifying faults.
2.4 Troubleshooting Flow

This topic describes the general process of rectifying faults and the operations in each step.
The eSpace EMS is a complex system, which makes troubleshooting it complex as well. In
addition, the eSpace EMS involves multiple network elements (NEs). Therefore, you need to be
familiar with the following for troubleshooting: eSpace EMS networking, interaction between
the eSpace EMS and the superior eSpace EMS, and interaction between the eSpace EMS and the NEs.
In most cases, a fault has a single source rather than multiple sources. It is therefore
important to locate the source of a fault before rectifying the fault.
2.4.1 Troubleshooting Flowchart

This topic describes the general process of handling faults.
Figure 2-1 shows the general process of handling faults in the eSpace EMS system.
Figure 2-1 Process of handling faults
2.4.2 Collecting Fault Scenario Information

Collecting fault scenario information helps to quickly locate faults. This topic describes the
important information about fault scenarios to be collected.
When a fault occurs, the scenario information about the fault must be collected immediately.
The information includes but is not limited to the following:
• Time and place at which the fault occurred
• Detailed description of the fault symptom
• Operations performed before the fault occurred
• Measures taken after the fault occurred and the result
• Affected services and scope of the impact
• System status information possibly related to the fault (see 6 Collecting Fault
Information)
For a fault reported by a customer, the customer service personnel collect the fault scenario information.
For a fault indicated by an alarm or found during routine maintenance, the maintenance personnel
collect the fault scenario information.
2.4.3 Locating and Rectifying Faults

This topic describes the following operations:
• Locating faults
Fault locating involves two levels: component and module.
− Component level: Narrow the fault source to a device, such as a database.
− Module level: Locate the faulty module, such as the listening port of a database, after
identifying the faulty device.
For the common methods of locating faults, see 3 Methods of Locating Faults.
• Collecting fault information
After identifying the faulty device, collect the details about the device, including the
version number, logs, error codes, alarms, and memory information.
For how to collect fault information, see 6 Collecting Fault Information.
You need to collect the detailed information about a device only after identifying the faulty device.
• Handling faults
After locating the faulty module, take proper measures to rectify the faults.
2.4.4 Checking Fault Rectification

This topic aims at determining whether the faults are correctly located and handled.
After taking measures to rectify faults, check whether the faults are rectified.
2.4.5 Generating a Fault Rectification Report

Fault rectification reports record the same types of information for every fault, which helps
future maintenance and fault locating.
After confirming that a fault is rectified, record the fault rectification process and produce a
report.
It is recommended that a fault rectification report contain four topics: fault symptom, fault
locating, fault rectification, and preventive suggestions.
2.4.6 Contacting Huawei

If you fail to rectify a fault after using the methods of locating and rectifying faults described
in this document, contact Huawei technical support engineers for remote or on-site assistance
in rectifying the fault.
For how to obtain technical support from Huawei, see 2.5 Obtaining Huawei Technical
Support.
Before contacting Huawei technical support engineers, make sure that the following
information is available:
• Full name of the site where a fault occurs
• Name and phone number (mobile or fixed-line phone number) of a contact
• Fault scenario information and fault details
• Remote maintenance environment and parameters for remote access
2.5 Obtaining Huawei Technical Support

This topic describes how to obtain technical support from Huawei.
You can obtain technical support from Huawei through the Internet or by phone. See Table
2-1.
Table 2-1 Methods of obtaining technical support from Huawei

Method: Dial a hotline number of Huawei customer service center
Operation instruction: Dial any of the hotline numbers of Huawei customer service center:
• 8008302118
• 4008302118

Method: Dial the phone number of the regional office
Operation instruction: Obtain the phone numbers of regional offices at
http://www.huawei.com/cn/about/officeList.do

Method: Refer to the troubleshooting cases
Operation instruction:
1. Visit http://support.huawei.com, and then click Documentation on the left.
2. Choose Product Line > Product > Family > Product > Troubleshooting Case.
3. View cases or enter keywords for searching.

Method: Consult online
Operation instruction:
1. Visit http://support.huawei.com, and then click Community on the left.
2. Select a forum from the technical forums.
3. Check whether the methods of rectifying similar faults have been provided in the forum.
If not, submit the fault.

Access the technical support website of Huawei as an equipment user. Only equipment users or
higher-level users have the permission to access Documentation and Community on the technical
support website of Huawei. Before you access the technical support website of Huawei, register on
the website as an equipment user by using the information about the Huawei products that you
purchased.
• How to access Documentation?
Visit http://support.huawei.com, and then click Documentation. On the
Documentation page, you can download and browse Huawei product manuals, technical
guides, technical cases, precaution notices, and Huawei technical publications.
• How to access Community?
Visit http://support.huawei.com, and then click Community. The Community page
provides technical forums about Huawei products and functions as a platform for
technical consultations and exchanges.
• How to contact regional offices?
Visit http://support.huawei.com/, and then click About Huawei. On the page that appears,
click Contact us to view the contact information of regional offices.

3 Methods of Locating Faults
About This Chapter

This topic describes several methods of locating faults, including analyzing logs, analyzing
alarms, and capturing packets for analysis.
There are multiple methods of locating faults. In practice, these methods are often used
together and complement each other. Efficient fault rectification requires a good command of
these methods and the ability to apply them flexibly.
3.1 Viewing Alarms on the eSpace EMS Client
This topic describes how to view alarms on the eSpace EMS client.
3.2 Log Analysis
You can locate a fault quickly by checking the logs. This topic describes how to enable the
debug logs and view the logs.
3.1 Viewing Alarms on the eSpace EMS Client

This topic describes how to view alarms on the eSpace EMS client.
Procedure
Step 1 Log in to the eSpace EMS client.
Step 2 On the Topology Management tab page, view the alarms generated on LocalNMS, as shown
in Figure 3-1.
Figure 3-1 Viewing Alarms
Step 3 View the current fault alarms, as shown in Figure 3-2.
Figure 3-2 Filter window
Step 4 Click an alarm to view the detailed information, as shown in Figure 3-3.
Figure 3-3 Current fault alarms
Step 5 Click View details next to Proposed repair actions: to view the causes and repair
suggestions for the alarm.
----End
3.2 Log Analysis

You can locate a fault quickly by checking the logs. This topic describes how to enable the
debug logs and view the logs.
You can locate a fault by checking logs in the following cases:
• No alarm is reported when the fault occurs.
• The fault cannot be located by checking alarms alone.
3.2.1 Changing a Log Level

You can change a log level to obtain the required log information.

Context
The configuration file oms.xml under {install path}/run/config records the globally configured
log levels. This topic describes how to change a log level online by using commands. A change
made this way takes effect immediately. If you restart the system, log levels are automatically
restored to the globally configured values.
Log levels include:
• DEBUG
• INFO
• WARN
• ERROR
• FATAL
Procedure
The omscli.sh command under {install path}/run/bin is used to change a log level.
Perform the following steps to change a log level:
1. Log in to the eSpace EMS server as user i2kuser.
2. Query the current log level.
# cd {install path}/run/bin
# ./omscli.sh log all
No  Name       Level  File
1   apache     WARN   /opt/I2000SDV3/run/log/oms/core/apache.log
2   asutil     ERROR  /opt/I2000SDV3/run/log/oms/asutil/asutil.log
3   author     ERROR  /opt/I2000SDV3/run/log/oms/sm/author.log
4   base       ERROR  /opt/I2000SDV3/run/log/oms/core/base.log
5   bme        ERROR  /opt/I2000SDV3/run/log/bme/bme.log
6   cache      ERROR  /opt/I2000SDV3/run/log/oms/core/cache.log
7   cm         ERROR  /opt/I2000SDV3/run/log/oms/cm/cm.log
8   configure  ERROR  /opt/I2000SDV3/run/log/oms/core/configure.log
9   dbevtutil  ERROR  /opt/I2000SDV3/run/log/oms/eam/dbevtutil.log
10  dis_frame  ERROR  /opt/I2000SDV3/run/log/oms/autodis/dis_frame.log
11  dis_lldp   ERROR  /opt/I2000SDV3/run/log/oms/autodis/dis_lldp.log
12  dis_snmp   ERROR  /opt/I2000SDV3/run/log/oms/autodis/dis_snmp.log
Name is the log name, Level is the log level, and File is the absolute path of the log file.
3. Change a log level.
# ./omscli.sh log logname level
− logname is the log name shown in the query result of step 2.
− level is the new log level.
For example, change the log level of cm to DEBUG:
# ./omscli.sh log cm DEBUG
Change log level of cm from ERROR to DEBUG
4. (Optional) Restore the log level to the default level.
# ./omscli.sh log logname default
Example: # ./omscli.sh log cm default
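The procedure above can be wrapped in a small helper that raises one log to DEBUG while a fault is reproduced and then restores the default level. The following is a minimal sketch, not part of the eSpace EMS: the with_debug_log function name and the OMSCLI variable are illustrative assumptions. Only the "omscli.sh log logname level" and "omscli.sh log logname default" invocations come from this guide, and the default value of OMSCLI assumes the default installation path /opt/oms.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not shipped with the eSpace EMS).
# OMSCLI points at omscli.sh; the default assumes {install path} is /opt/oms.
OMSCLI="${OMSCLI:-/opt/oms/run/bin/omscli.sh}"

# Usage: with_debug_log <logname> <command> [args...]
# Sets <logname> to DEBUG, runs the command, then restores the default level.
with_debug_log() {
    logname="$1"
    shift
    # Raise the log level; give up if omscli.sh reports a failure.
    "$OMSCLI" log "$logname" DEBUG || return 1
    # Run the command that reproduces the fault and remember its exit status.
    "$@"
    status=$?
    # Restore the default level even if the command failed.
    "$OMSCLI" log "$logname" default
    return "$status"
}
```

For example, running with_debug_log cm sleep 300 as user i2kuser would keep the cm log at DEBUG for five minutes while the fault is reproduced, and then restore the default level.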

3.2.2 Logs
This topic describes how the system collects logs when faults occur in the system.
eSpace EMS Logs

Table 3-1 describes how the system collects logs when faults occur.
{install path} is the installation path of the eSpace EMS server. The default path is /opt/oms.
Table 3-1 Log description
Each entry lists the module, the log file path, and the log files in that path with their
descriptions.

Module: Security module
Log file path: {install path}/run/log/oms/sm/
− author_*.log: Security authentication logs
− nePermitGate_*.log: NE right gateway logs
− sm_*.log: Main program logs

Module: Alarm module
Log file path: {install path}/run/log/oms/fm/
− fm_*.log: Main program logs
− fmprobe_*.log: Collection layer logs
− fmui_*.log: Alarm client logs
− fmbackup_*.log: Alarm dump logs

Module: Performance module
Log file path: {install path}/run/log/oms/pm/
− pm_*.log: Logs related to performance monitoring templates, NE event processing, and view
monitoring
− pmdata_*.log: Logs collected when performance data is saved to the database
− pmds_*.log: DS layer logs
− pmmeastype_*.log: Logs related to performance indicator instance management
− pmprobe_*.log: Performance data collection logs
− pmthreshold_*.log: Performance threshold management logs
− pmui_*.log: Client operation logs in performance management

Module: NE access module
Log file path: {install path}/run/log/oms/eam/
− mimcache_*.log: Cache logs of the MIM
− mim_*.log: NE management logs
− iconmgr_*.log: NE icon processing logs
− eam_*.log: Logs related to NE access operations such as NE lifecycle and type processing
− eam_*.log: DS logs of the EAM
− eam_*.log: Client operation logs related to NE access operations such as tree table
refreshment

Module: Topology module
Log file path: {install path}/run/log/oms/topo/
− mapping_*.log: Logs related to topology object mapping processing
− topo_*.log: DS layer logs related to topology operations such as right and domain
allocation and initialization of the data to be displayed on the client
− topo_*.log: uiService logs, such as flex invocation Java errors
− topomgr_*.log: Logs related to topology object management and alarm synchronization

Module: Software management module
Log file path: {install path}/run/log/oms/swm/
− ideploy_ui*.log: Running logs related to software management
Log file path: {install path}/run/log/oms/swm/task name
− *.log: Execution logs of installation or upgrade tasks
NOTE
The task name is the name of the installation or upgrade task created in software
management.

Module: Message tracing module
Log file path: /opt/oms/run/log/omstrace/
− trace_node_*.log: Logs related to interaction between the mediation node and the UOA
− trace_app_*.log: Running logs of message tracing applications

Module: MED module
Log file path: {install path}/run/log/oms/med/
− med_*.log: MED framework logs and logs related to interaction between the MED and NEs
over SNMP or SOAP
− ftp.server_*.log: Logs related to interaction between the MED and NEs over FTP
− ftp.client_*.log: Logs related to interaction between the MED and NEs over FTP
− ftp.med_*.log: Logs related to interaction between the MED and NEs over FTP
− mml.med_*.log: Logs related to interaction between the MED and NEs over MML
− mml.client_*.log: Logs related to interaction between the MED and NEs over MML
− telnet.med_*.log: Logs related to interaction between the MED and NEs over Telnet
− telnet.client_*.log: Logs related to interaction between the MED and NEs over Telnet
− ssh.med_*.log: Logs related to interaction between the MED and NEs over SSH
− ssh.client_*.log: Logs related to interaction between the MED and NEs over SSH

Module: Northbound module
Log file path: {install path}/run/log/oms/nbi/
− nbi_*.log: Running logs related to the northbound module, for example, forwarding alarms
to the upper NMS and performing tasks delivered by the upper NMS

Module: Basic platform module
Log file path: {install path}/run/log/oms/core/
− web.portal_*.log: Portal running logs
− event_*.log: Event running logs
− log.mgmt_*.log: Running logs of the tool used for dynamically changing log severities
− task_*.log: Task running logs
− sbus_*.log: sbus running logs
− sbus.server_*.log: Running logs of the sbus server
− sbus.heartbeat_*.log: sbus heartbeat check logs
− ds.core.adapter_*.log: Running logs of the DS layer
− fsm_*.log: Running logs of the file management module
− persistence_*.log: Running logs of the persistence layer
− sbus.client_*.log: Running logs of the sbus client
− apache_*.log: Running logs of Tomcat
− base_*.log: Running logs of the base module
− cache_*.log: Running logs of the cache module

Module: UC service log
Log file path: {install path}/run/log/uc/
− snmptrap_*.log: Logs about sending and receiving trap messages between NEs and the eSpace
EMS through SNMP
− cbm/*.log: Common functional module logs (such as cache, rotation, batch importing, and
device selection functions)
− gs8/*.log: GS8 access and service logs
− iad/*.log: IAD access and service logs
− ippbx/*.log: IP PBX access and service logs
− license/*.log: License management logs
− other/*.log: NE detection, NE automatic access, and IP PBX/IAD backup and restoration logs
− remotesupport/*.log: Remote maintenance logs
− sftpclient/*.log: Log downloading logs
− tr69/*.log: IP Phone/SBC/EGW NE access and service logs
− ums/*.log: UMS NE access and service logs
− vqm/*.log: NE voice quality monitoring logs
− upgrade/*.log: NE upgrade logs

Module: Startup log
Log file path: {install path}/run/log/virgo/
− log.log: Startup logs
− stop.exception.log: Startup failure logs

Module: Garbage collection log
Log file path: {install path}/run/log/
− gc.hprof.txt: Garbage collection logs
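As a sketch of how the paths in Table 3-1 might be used while locating a fault, the helper below prints the most recently modified log file that matches a file name pattern in a module's log directory. The newest_log function and the INSTALL_PATH variable are illustrative assumptions and are not part of the eSpace EMS. The directory and pattern in the usage comment are taken from the table, and INSTALL_PATH defaults to the default installation path /opt/oms.

```shell
#!/bin/sh
# Assumption: INSTALL_PATH stands for {install path}; /opt/oms is the default.
INSTALL_PATH="${INSTALL_PATH:-/opt/oms}"

# Usage: newest_log <directory> <file name pattern>
# Prints the most recently modified matching file, or nothing if none match.
newest_log() {
    dir="$1"
    pattern="$2"
    # $pattern is left unquoted on purpose so that the shell expands the glob.
    # ls -t sorts by modification time, newest first.
    ls -t "$dir"/$pattern 2>/dev/null | head -n 1
}

# Example: newest alarm module main program log
# newest_log "$INSTALL_PATH/run/log/oms/fm" 'fm_*.log'
```

Quote the pattern when calling the function so that it is expanded inside the log directory rather than in the current working directory.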

4 Fault Analysis
About This Chapter

This topic describes the principles of faults of different categories and the fault locating
guideline, helping you to locate and rectify faults quickly.
4.1 Performance Fault Analysis
This topic describes the principles of performance statistics, performance monitoring, and
performance alarms, and the fault locating guideline, helping you to locate and rectify a
performance fault quickly.
4.2 Software Management Fault Analysis
This topic describes the principles of the software management functions and the fault
location guideline, helping you to locate faults quickly.
4.3 iTrace Analysis
This topic describes the principles of the iTrace common functions and the fault locating
guideline, helping you to locate faults quickly.
4.4 iCnfg Analysis
This topic describes the principles of the iCnfg common functions and the fault location
guideline, helping you to locate faults quickly.
4.5 DR Fault Analysis
This topic describes the principles of the disaster recovery (DR) system, which helps you to
locate and rectify a DR fault quickly.
4.1 Performance Fault Analysis

This topic describes the principles of performance statistics, performance monitoring, and
performance alarms and the fault locating guideline, helping you to locate and rectify a
performance fault quickly.
4.1.1 Performance Statistics

This topic describes the principles of the performance statistics and the fault location
guideline, helping you to locate and rectify a fault quickly.
Fault Location
To rectify a fault that occurs when you obtain the performance data of an NE connected over
SNMP, perform the following steps:
1. Check whether the NE is connected properly.
a. On the eSpace EMS client, choose Resource > Resource Management.
b. Click the Service Applications or Physical Devices tab, as shown in Figure 4-1.
Figure 4-1 Resource management
c. Check the connections between NEs and the eSpace EMS.

If the connection status is Online, go to step 2. Otherwise, rectify the fault according to
the troubleshooting suggestions.
2. Check whether the NE reports the performance data to the eSpace EMS.
You can check whether the NE reports the performance data to the eSpace EMS using
any of the following methods:
The monitoring view method takes precedence over the other two methods.

− Check using monitoring views
a. On the eSpace EMS client, choose Performance > Monitoring View.
b. Click Add Monitoring View.
c. In the Add Monitoring View dialog box, set View name, Managed Object, and
Indicator Instance, and click OK.
− Check using historical performance data
a. On the eSpace EMS client, choose Performance > Historical Data.

b. Click Select Managed Object. In the Select Managed Object dialog box, set
Object type, Subnets, and Managed Objects, and click OK.
c. On the Historical Data tab page, set Time period and click Search.
− Check using logs
In the med_*.log file in {install path}/run/log/oms/med, check whether performance data
reported by the UOA is recorded under the OIDs of the performance indicators.
If the performance indicators are cumulative, also check whether calculated performance
data is recorded under their OIDs in the pmdata_*.log file in {install
path}/run/log/oms/pm.
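The log check can be partly automated. The following Python sketch is an illustration only, not a tool shipped with the eSpace EMS: the function name find_oid_lines is hypothetical, and it assumes that an OID appears in the log line as plain dotted text. It lists every line in the matching log files that contains a given OID.

```python
import glob
import os


def find_oid_lines(log_dir, pattern, oid):
    """Return (file name, line number, text) for each log line containing the OID.

    Assumption: the OID is written in the log as plain dotted text.
    A plain substring match also hits longer OIDs that start with `oid`.
    """
    hits = []
    for path in sorted(glob.glob(os.path.join(log_dir, pattern))):
        with open(path, "r", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if oid in line:
                    hits.append((os.path.basename(path), number, line.rstrip("\n")))
    return hits
```

For example, call find_oid_lines with the med log directory, the pattern med_*.log, and the OID of the indicator you are checking. An empty result suggests that no data for that OID was logged in the files that are still present.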
4.1.2 Performance Alarms

This topic describes the principles of the performance alarms and the fault location guideline,
helping you to locate and rectify a fault quickly.
Implementation Principles
The eSpace EMS compares the performance data with the preset performance index
thresholds in real time. If the performance instant value is greater than the threshold in three
consecutive intervals, the eSpace EMS generates a corresponding performance alarm. The
period of three intervals is a default setting, which can be changed in the relevant
configuration file of the eSpace EMS server.
Through this function, the performance items of the NEs that are monitored by the eSpace
EMS can be monitored in real time.
For performance alarms, the performance data of the NEs that are connected to the eSpace
EMS through SNMP is obtained by the pm_snmpdataproc module of the eSpace EMS server.
Figure 4-2 shows the process of generating a performance alarm.
Figure 4-2 Process of generating a performance alarm
The process is described as follows:

1. The administrator creates a performance alarm on the eSpace EMS client and sets the
performance indicator thresholds.
2. The eSpace EMS client calls the performance alarm interface of the eSpace EMS server,
and transmits the performance alarm parameter information to the eSpace EMS server.
Then the eSpace EMS server saves the specified performance alarm threshold conditions
to the database.
3. After obtaining performance data based on the statistics period, the eSpace EMS server
performs calculation based on the specified thresholds for performance indicators. If the
value of a performance indicator exceeds the specified threshold, an alarm is generated.
4. After the performance indicator falls, the eSpace EMS server obtains the performance
data again based on the statistics period and then performs calculation based on the
specified thresholds. If the value of the performance indicator is less than the specified
threshold, the alarm is cleared.
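The alarm rule described above can be sketched as follows. This is a simplified Python illustration, not the eSpace EMS implementation: the class name ThresholdAlarm is hypothetical, and the handling of a value exactly equal to the threshold is an assumption because the text does not specify it.

```python
class ThresholdAlarm:
    """Raise after N consecutive samples above the threshold; clear when below it."""

    def __init__(self, threshold, required_intervals=3):
        self.threshold = threshold
        self.required_intervals = required_intervals  # default of three, as in the text
        self.breaches = 0
        self.active = False

    def update(self, value):
        """Feed one sample per statistics period; return "raised", "cleared", or None."""
        if value > self.threshold:
            self.breaches += 1
            if not self.active and self.breaches >= self.required_intervals:
                self.active = True
                return "raised"
        elif value < self.threshold:
            self.breaches = 0
            if self.active:
                self.active = False
                return "cleared"
        else:
            # Assumption: a value equal to the threshold breaks the consecutive
            # run but does not clear an active alarm.
            self.breaches = 0
        return None
```

With a threshold of 80, the samples 85, 90, 95 raise the alarm on the third sample, and a later sample of 60 clears it.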
Fault Location
If an error occurs in the performance alarm, do as follows to locate the fault:
1. Check whether the alarm thresholds are successfully created.
a. On the eSpace EMS client, choose Performance > Template Configuration.
b. Select an NE or a module.
c. Click a measurement unit and check whether the alarm thresholds of a performance
indicator are successfully set.
If yes, perform step 2; if no, contact the NE maintenance personnel.
2. Check whether a performance alarm is generated.
a. On the eSpace EMS client, choose Fault > Current Alarms.
b. In the alarm list, check whether a performance alarm is generated.
If no performance alarm is generated and the value of the performance indicator
exceeds the specified alarm threshold, contact Huawei technical support.
4.2 Software Management Fault Analysis

This topic describes the principles of the software management functions and the fault
location guideline, helping you to locate faults quickly.
4.2.1 Executing an Installation or Upgrade Task

This topic describes the process of executing an installation or upgrade task and the fault
location guide, helping you to locate faults quickly.
Implementation Principle
Figure 4-3 shows the process of executing an installation or upgrade task.

Figure 4-3 Process of executing an installation or upgrade task
(Figure participants: Software Management; target host to be installed or upgraded.)
The process of executing an installation or upgrade task is as follows:

1. The Software Management connects to a target host using Telnet or SSH.
2. The host to be installed or upgraded sends the connection result to the Software
Management.
3. The Software Management sends an installation or upgrade command to the target host.
4. The target host sends the command execution result to the Software Management.
5. The Software Management checks whether the installation or upgrade task is executed
successfully based on the command execution result.
Fault Location
If a fault occurs when an installation or upgrade task is created, locate and rectify the fault as
prompted:
1. Locate a fault based on the log information on the task execution page of the Software
Management.
2. If the following log information is displayed, contact the plug-in maintenance personnel
to locate and rectify the fault:
a. Log in to the Software Management host as the i2kuser user.
b. Access install path/run/log/oms/swm, for example,
/opt/huawei/I2000/run/log/oms/swm/.
c. Refer to the ideploy_ui_*.log file to locate the fault.
Table 4-1 shows examples of logs for executing an installation or upgrade task.

Table 4-1 Examples of logs for executing an installation or upgrade task

Log information:
2011-11-23 14:58:29,638 DEBUG [T=44973][sun.reflect.GeneratedMethodAccessor306.invoke() -1] [SSHTerminal] (connectToServer:211) Make connection to oamtest2@10.137.97.239 at port 22
Description:
The log information indicates that the software management module connects to the host to be installed or upgraded over Secure Shell (SSH).

Log information:
...4973][sun.reflect.GeneratedMethodAccessor306.invoke() -1] [UnixTerminal] (sendCommand:1811) SSHTerminal : execute command >>> [30000]:cd ; ksh
Description:
The log information indicates that the software management module runs the cd;ksh command on the host to be installed or upgraded and the timeout period is 30,000 ms.

Log information:
...4976][sun.reflect.GeneratedMethodAccessor306.invoke() -1] [ResultProcessor] (setSuccessful:846) Match message[ideploy:cmd:end] with finish word[ideploy:cmd:end]
Description:
The log information indicates that the software management module successfully executes instructions.

Log information:
...4976][sun.reflect.GeneratedMethodAccessor306.invoke() -1] [UnixTerminal] (executeForward:818) read data error for command: /home/see/breeze/ideploy/20110610170618.498/scripts/ideploy_wrap.sh modules/backup.sh
com.huawei.breeze.ideploy.task.ExecuteTimeoutException: SSHTerminal : Execute command : /home/see/breeze/ideploy/20110610170618.498/scripts/ideploy_wrap.sh modules/backup.sh timeout.[1500000 ms] on host 10.3.4.33(see)
Description:
The log information indicates that the software management module runs modules/backup.sh but there is no return value in the timeout period (such as 1,500,000 ms).
To resolve the problem, set Timeout duration for command execution on the Configure System page under software management or contact Huawei technical support.

Log information:
...4976][sun.reflect.GeneratedMethodAccessor306.invoke() -1] [ResultProcessor] (processRawMsg:397) math result met exception.
com.huawei.breeze.ideploy.task.ExecuteErrorException: -Command: "/home/lgjsee/breeze/ideploy/20110711145056.24/scripts/ideploy_wrap.sh ngin_ha/ha_start.sh" -Catched Key: "ideploy:error:" -From Message: "iDeploy:Error:FAILED"
  at com.huawei.breeze.ideploy.terminal.ResultProcessor.matchiDeployKeyWords(ResultProcessor.java:1085)
  at com.huawei.breeze.ideploy.terminal.ResultProcessor.matchResult(ResultProcessor.java:573)
  at com.huawei.breeze.ideploy.terminal.ResultProcessor.processRawMsg(ResultProcessor.java:373)
  at com.huawei.breeze.ideploy.terminal.UnixTerminal.processResult(UnixTerminal.java:738)
  at com.huawei.breeze.ideploy.terminal.UnixTerminal.readAndProcessResult(UnixTerminal.java:628)
  at com.huawei.breeze.ideploy.terminal.UnixTerminal.sendCommand(UnixTerminal.java:1817)
  at com.huawei.breeze.ideploy.terminal.UnixTerminal.sendPassword(UnixTerminal.java:1713)
  at com.huawei.breeze.ideploy.terminal.UnixTerminal.executeCmdWithSuUser(UnixTerminal.java:1199)
  at com.huawei.breeze.ideploy.terminal.UnixTerminal.executeForward(UnixTerminal.java:789)
  at com.huawei.breeze.ideploy.terminal.UnixTerminal.executeWithSuUser(UnixTerminal.java:1025)
Description:
The log information indicates that the software management module runs the ha_start.sh script in ngin_ha and the return value is not zero. You can locate the fault based on the output information of the script.
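The log examples suggest that the Software Management judges a command by looking for keywords in its output: the finish word ideploy:cmd:end and the error key ideploy:error:. The following Python sketch illustrates that idea only. The function name classify_output, the case-insensitive matching, and the rule that an error key outranks the finish word are all assumptions; the actual matching logic is not documented here.

```python
FINISH_WORD = "ideploy:cmd:end"   # finish word seen in the success log example
ERROR_KEY = "ideploy:error:"      # key seen in the ExecuteErrorException example


def classify_output(output, timed_out=False):
    """Classify command output as "error", "success", "timeout", or "pending"."""
    text = output.lower()  # assumption: the log matches "iDeploy:Error:" ignoring case
    if ERROR_KEY in text:
        return "error"
    if FINISH_WORD in text:
        return "success"
    return "timeout" if timed_out else "pending"
```

A result of "timeout" corresponds to the ExecuteTimeoutException case, where no finish word arrives within the configured timeout period.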
4.2.2 Checking Host Information

This topic describes the process of checking host information and the fault locating guideline,
helping you to locate faults quickly.
Implementation Principle
Figure 4-4 shows the process of checking host information.
Figure 4-4 Process of checking host information
The process of checking host information is as follows:

1. The Software Management connects to the target host to be installed or upgraded using
Telnet or Secure Shell (SSH).
2. The target host sends a message to the Software Management indicating that the
connection is successful.
3. The Software Management enters the user name for logging in to the target host.
4. The target host sends output to the Software Management.
5. The Software Management enters a password based on the received output.
7. The Software Management executes the command for switching to the root user.
9. The Software Management enters the password of the root user based on the received
output.

11. The Software Management executes the command for exiting the root user.
12. The Software Management executes the command for creating a file and the command
for deleting the created file.
14. The Software Management executes the FTP or SFTP command to obtain files from the
target host.
16. The Software Management determines the host information checking result based on the
received output.
Locating Guideline
If a fault occurs when host information is checked, refer to Table 4-2 to locate and rectify the
fault.
Table 4-2 Solutions to different errors
Error information: The user name or password is incorrect.
Solution: Log in to the target host manually using Telnet or SSH, and verify the user name or password.

Error information: The prompt character is incorrect.
Solution: Log in to the target host manually using Telnet or SSH, and check whether the prompt character is one of the following default prompt characters:
# $ > %
If the prompt character is not a default one, change it on the Host Management page. Click Full to expand all host parameters, and set the password prompt character to a correct one.

Error information: The password of the root user is incorrect.
Solution:
1. Log in to the target host manually using Telnet or SSH.
2. Run the ksh command to switch shell.
3. Run the su - root command.
4. Check whether the password prompt character belongs to the Software Management's password prompt set.
5. If yes, verify that the password of the root user exists. If no, add the displayed prompt to the Software Management prompt set.

Error information: Other errors
Solution: Collect files ideploy_ui_*.log and ideploy_ui_*.zip in install path/run/log/oms/swm/, and send them to Huawei technical support engineers to locate and rectify faults.
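The prompt character check in Table 4-2 can be illustrated with a short Python sketch. The function name ends_with_default_prompt is hypothetical, and the sketch assumes that the prompt character is the last non-blank character of the line shown after login.

```python
DEFAULT_PROMPT_CHARS = "#$>%"  # default prompt characters listed in Table 4-2


def ends_with_default_prompt(last_line):
    """Return True if the last non-blank character is a default prompt character."""
    stripped = last_line.rstrip()
    return bool(stripped) and stripped[-1] in DEFAULT_PROMPT_CHARS
```

If the function returns False for the prompt displayed on the target host, the prompt character probably needs to be set on the Host Management page as described in the table.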
4.3 iTrace Analysis

This topic describes the principles of the iTrace common functions and the fault locating
guideline, helping you to locate faults quickly.
4.3.1 Creating a Tracing Task

This topic describes the process of creating a tracing task and the fault locating guideline,
helping you to locate faults quickly.
Process of Creating a Tracing Task

This topic describes how to create a tracing task.
Figure 4-5 shows the process of creating a tracing task.
Figure 4-5 Process of creating a tracing task
The process of creating a tracing task is described as follows:

1. A user creates a trace task.
2. The user sets trace conditions on the eSpace EMS client and sends a request for creating
a trace task to the eSpace EMS server.
3. The eSpace EMS server constructs the trace task data based on the user settings.

4. The eSpace EMS server sends the request for creating a trace task to the eSpace EMS
Mediation node.
The Mediation node and the eSpace EMS server can be deployed on different machines. Typically,
they are deployed on the same machine.
5. The Mediation node verifies and records the trace parameters.
6. The Mediation node sends the request for creating a trace task to the UOA.
7. The UOA verifies the request and asynchronously sends the request to the NE.
8. The UOA sends the success or failure information about task creation to the Mediation
node.
9. The Mediation node updates the trace task status.
10. The Mediation node returns the task creation result to the eSpace EMS server.
11. The eSpace EMS server updates the trace task status.
12. The eSpace EMS server returns the task creation result to the eSpace EMS client.
13. Steps 13 to 20 describe how the NE asynchronously returns the task creation result.
Fault Location Guideline

This topic describes how to locate and rectify faults when creating a tracing task.
If a fault occurs when creating a tracing task, determine the step where the fault occurs based
on the symptom. Then check the matching environment and log to locate and rectify the fault.
The following describes common fault cases when creating a tracing task.
The Failed to obtain a management object. Message Is Displayed When You Select a Device.
• Cause
The selected NE or object has been deleted by another logged-in user.
• Description
When a user selects a device from the eSpace EMS client, the eSpace EMS server
obtains the dn of the device from the managed object buffer, and then obtains the
detailed trace information about the device based on the dn. If the device is deleted, the
dn is also deleted, and the preceding message is displayed when you select the device.
• Solution
Open the Resource Management tab page, add the device, and create a trace task for the
device.
The Exceeded the maximum number of tracing tasks (5). or The number
of trace tasks exceeded the maximum 40. Message Is Displayed When You
Create a Task.
• Cause
A maximum of 40 trace tasks can be created, and a maximum of five trace tasks is
allowed for a single client.
• Solution
Delete unnecessary trace tasks.
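The two limits can be illustrated with the following Python sketch. The function name check_task_limits and the order in which the two limits are checked are assumptions; the message texts are the ones quoted above.

```python
MAX_TOTAL_TASKS = 40      # system-wide limit stated above
MAX_TASKS_PER_CLIENT = 5  # per-client limit stated above


def check_task_limits(tasks_per_client, client_id):
    """Return None if the client may create another task, else the error message.

    tasks_per_client maps a client ID to its current number of trace tasks.
    Assumption: the per-client limit is checked before the total limit.
    """
    if tasks_per_client.get(client_id, 0) >= MAX_TASKS_PER_CLIENT:
        return "Exceeded the maximum number of tracing tasks (5)."
    if sum(tasks_per_client.values()) >= MAX_TOTAL_TASKS:
        return "The number of trace tasks exceeded the maximum 40."
    return None
```

In either case, deleting unnecessary trace tasks lowers the counts so that a new task can be created.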

The Failed to create the trace task. Message Is Displayed When You
Create a Trace Task.
• Cause
− No matched module is found.
− The trace agent is not connected successfully.
− The trace task fails to be created because all trace task IDs are used up.
− Exceptions occur in the Master and Mediation services.
• Solution
Click View Detail to view related information.
Table 4-3 describes the solutions based on different causes.
Table 4-3 Solutions
Cause: No matched module is found.
Description: The module is deregistered or an exception occurs in the module.
Solution: Check whether the module is registered on the UOA. Open the register_info.log file in {UOA installation directory}/log to check whether the module is deregistered. If the following information is displayed, the module is deregistered.
Nov 22 18:52:38:333686 ThreadID:1956 >>> Module UnRegister: ModuleCode=0054040110001
Contact the NE maintenance personnel to find the reason why the module is deregistered, and register the module again.

Cause: The trace agent is not connected successfully.
Description: The connection between the UOA and Mediation node is abnormal or the UOA service is not running properly.
Solution:
1. Check whether the UOA is successfully started.
Log in to the host where the UOA resides as user uoa and run the following command:
> p
If the following information is displayed, the UOA is successfully started. Otherwise, run the uoa_start.sh command to start the UOA.
uoa 28679 1 0 May23 ? 00:00:01 uoa_lma
uoa 28681 28679 0 May23 ? 00:00:01 uoa_server
uoa 28782 28679 0 May23 ? 00:00:00 uoa_log_agent
uoa 28869 28679 0 May23 ? 00:00:00 uoa_trace_agent
uoa 28943 28679 0 May23 ? 00:00:01 uoa_perf_agent
uoa 28992 28679 0 May23 ? 00:00:00 uoa_cli
2. Check information such as the IP address, port number, user name, and password in the uoa_common.ini file on the UOA, and then create NEs again on the eSpace EMS.

Cause: The trace task fails to be created because all trace task IDs are used up.
Description: The eSpace EMS server needs to allocate the trace task ID before sending a request for creating a trace task to the UOA. If trace task IDs are allocated by bit, a maximum of 24 tasks of the same type is allowed on each UOA. If you create more than 24 tasks of the same type on a UOA, the error message is displayed.
Solution: Close the trace page on the client (the task ID is released after you close the trace page).

Cause: The eSpace EMS or Mediation service does not work properly.
Description: If the eSpace EMS or Mediation service does not work properly, you sometimes cannot create trace tasks.
Solution: Check whether the communication between the eSpace EMS and the Mediation is normal. Locate and troubleshoot the fault by checking the operating logs of the Mediation and the eSpace EMS server.
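The limit of 24 tasks follows from allocating task IDs by bit. The following Python sketch illustrates bitwise ID allocation in general; it is not the eSpace EMS code. The class name TaskIdPool is hypothetical, and handing out the lowest free bit first is an assumption.

```python
class TaskIdPool:
    """Track task IDs as bits of one integer; bit i set means ID i is in use."""

    def __init__(self, size=24):
        self.size = size  # 24 IDs per task type on each UOA, as stated above
        self.bits = 0

    def allocate(self):
        """Return the lowest free ID, or None when all IDs are used up."""
        for task_id in range(self.size):
            if not self.bits & (1 << task_id):
                self.bits |= 1 << task_id
                return task_id
        return None

    def release(self, task_id):
        """Free an ID, for example when the trace page is closed."""
        self.bits &= ~(1 << task_id)
```

Once 24 IDs are in use, allocate returns None, which corresponds to the error described above. Releasing an ID makes it available again.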
A Message Is Reported After You Successfully Create a Trace Task, But the Task
Is Automatically Deleted After It Runs for Some Time.
• Cause
− All modules of the trace task are disconnected or deregistered.
− The connection between the UOA and Mediation is disconnected.
− The end time of the paused trace task is reached.
• Solution
Click View Detail to view related information.
Table 4-4 describes the solutions based on different causes.
Table 4-4 Solutions

Cause: All modules of the trace task are disconnected or deregistered.
Description: The UOA reports a task deletion message when the connection between the module and the UOA is abnormal or the device is deregistered.
Solution: Check the connection between the module and the UOA. View the UOA log file to check whether the module is deregistered.

Cause: The connection between the UOA and Mediation is disconnected.
Description: The eSpace EMS automatically deletes the trace task related to the UOA when the connection between the UOA and the Med-Node is disconnected or the UOA service does not work properly.
Solution: Check whether the connection between the UOA and the Mediation is normal.

Cause: The end time of the paused trace task is reached.
Description: The UOA sends a task deletion message to the device and reports it to the eSpace EMS server when the end time of the paused trace task is reached.
Solution: Re-create a trace task.
A Tracing Task Is Deleted After Being Created for About 10 to 15 Seconds
• Cause
None
• Principle
If an NE does not respond to a request for creating a tracing task sent by the UOA in
about 10 seconds, the UOA considers that the NE does not run properly and deletes the
tracing task.
• Solution
1. Check whether the NE runs properly.
If no, restore the NE. For details, see the NE troubleshooting guide.
2. Check the connection between the NE and the UOA.
If the connection is faulty, restore the connection. For details, see the NE troubleshooting
guide.
4.3.2 Displaying Tracing Messages

This topic describes the process of displaying tracing messages and the fault locating
guideline, helping you to locate faults quickly.
Process of Displaying a Tracing Message

This topic describes the process of displaying a tracing message reported by an NE on the
eSpace EMS client.
Figure 4-6 shows the process of displaying a tracing message.

Figure 4-6 Process of displaying a tracing message
The process is described as follows:

1. An NE reports a tracing message to the UOA.
2. The UOA verifies the tracing message and reports the message to the eSpace EMS
Mediation node.
3. The eSpace EMS Mediation node reports the message to the eSpace EMS server.
4. The eSpace EMS server parses the message.
5. The eSpace EMS server reports the parsed message to the eSpace EMS client.
6. The eSpace EMS client shows the message tracing result in graphics based on the
parameters such as the trace type.
Fault Location Guideline

This topic describes how to locate and rectify faults when a tracing message is displayed.
Solutions to common faults when a tracing message is displayed are as follows:
No Message Is Reported After a Tracing Task Is Created
• Cause
− An NE does not report a message to the UOA.
− An NE has reported a message, but the message does not reach the UOA. A common
reason is that the tracing message reported by the NE is filtered out by the platform.
− The UOA receives the message reported by the NE, but does not report the message
to the Mediation Node.
− The Mediation receives the message, but does not report the message to the eSpace
EMS client.
• Principle
An NE reports a tracing message to the UOA. The UOA reports the message to the
Mediation Node. The Mediation Node then sends the message to the eSpace EMS server.
Finally, the eSpace EMS server reports the message to the eSpace EMS client, and the
eSpace EMS client presents the message on the GUI.
• Solution
1. Ask the NE maintenance personnel to check whether an NE has reported a message.
If no, ask the NE maintenance personnel to locate and rectify faults based on the NE
troubleshooting guide.
2. Check the duoa_trace_agent.log file in UOA installation directory/log for records that
indicate the UOA has received messages from NEs.
If the log file contains the following information, the UOA has received messages from
NEs.
Nov 21 18:44:16 [Debug3] ThreadID:10236 >>> --------------ReportTraceMsg-------------
ModuleCode = 0054040110001
IsSender = 0
TraceCode = 0xff00000e
RcvMsgTimeMs = 739
RcvMsgTimeSec = 1321872256
TraceProtocol = 0
GeneralIDType = 1
GeneralID = 123
Nov 21 18:44:16 [Debug3] ThreadID:7216 >>> ------------- CReportTraceMsgToOMCMsg ------------
m_nTotal_Length = 74
m_sVersion = 2
m_sCommand_ID = 0x3
m_nSequence_ID = 0
m_uiTraceTaskID = 4278190094
m_usSequenceNum = 1
m_cMsgDirection = 0
m_uiTraceMsgTimeSec = 1321872256
m_uiTraceMsgTimeMilliSec = 739
m_sTraceProtocol = 0
m_szModuleCode = 0054040110001
m_GID.ucGeneralIDType = 1
m_GID.strGeneralID = 123
m_strTraceContent:
000000 46 72 6F 6D 20 53 52 56-4D 61 6E From SRVMan
m_strTraceExtInfo:
000000 7C 6C 65 76 65 6C 3D 30 |level=0
If the preceding information is not displayed, the UOA does not receive the message
reported by the NE. In this case, you can view the log file of the UOA to locate and
troubleshoot the fault. If the fault still persists, contact Huawei technical support.
3. Check the duoa_trace_agent.log file in UOA installation directory/log/debug for
records that indicate the UOA has reported messages to the Mediation Node.
If the log file contains the following information, the UOA has reported message to the
Mediation Node.
The following information indicates that the UOA sends a message to the Mediation
Node whose IP address is 10.138.48.145.
Nov 21 19:20:27 [Debug3] ThreadID:8620 >>> Put message to queue(1) (destination=10.138.48.145:4308), length is 74.
Nov 21 19:20:27 [Debug3] ThreadID:9560 >>> Send message to remote(IP-10.138.48.145:PORT-4308:HANDLE-1188), message stream:
000000 00 00 00 4A 00 02 00 03-00 00 00 00 FF 00 00 0E ...J............
000010 00 01 30 30 35 34 30 34-30 31 31 30 30 30 31 00 ..0054040110001.
000020 01 03 31 32 33 4E CA 33-FB 00 00 01 9B 00 00 00 ..123N.3........
000030 00 00 0B 46 72 6F 6D 20-53 52 56 4D 61 6E 00 00 ...From SRVMan..
000040 00 08 7C 6C 65 76 65 6C-3D 30 ..|level=0
If the UOA does not send the message to the Mediation Node, check the UOA log file to
locate and rectify the fault. If the fault persists, contact Huawei technical support.
4. Check the log file of the Mediation Node for records indicating that the Mediation Node
has received messages from the UOA.
Log file path: {install path}/run/log/oms/trace/trace_node_*.log
If the log file contains the following information, the Mediation Node has received
messages from the UOA.
2011-11-21 19:20:27,411 DEBUG [T=1245][com.huawei.oms.net.trace.uoa.agent.AgentDispatcher.dispatch() 117] Receive message from remote ip = 10.138.48.145, port = 6601
2011-11-21 19:20:27,411 DEBUG [T=1245][com.huawei.oms.net.trace.uoa.agent.AgentDispatcher.dispatch() 120] Receive message command id = 3.
2011-11-21 19:20:27,411 DEBUG [T=1245][com.huawei.oms.net.trace.uoa.agent.AgentDispatcher.dispatch() 121]
000000 00 00 00 4A 00 02 00 03-00 00 00 00 FF 00 00 0E ...J............
000010 00 01 30 30 35 34 30 34-30 31 31 30 30 30 31 00 ..0054040110001.
000020 01 03 31 32 33 4E CA 33-FB 00 00 01 9B 00 00 00 ..123N.3........
000030 00 00 0B 46 72 6F 6D 20-53 52 56 4D 61 6E 00 00 ...From SRVMan..
000040 00 08 7C 6C 65 76 65 6C-3D 30 ..|level=0
If the preceding information is not displayed, the Mediation does not receive the message
forwarded by the UOA. In this case, you need to check whether the connection between
the UOA and the Mediation is normal and view the log file of the Mediation to locate
and troubleshoot the fault. If the fault still persists, contact Huawei technical support.
5. In the log file of the eSpace EMS server, check whether the eSpace EMS server receives
the message from the Mediation.
Log file path: {install path}/run/log/oms/trace/trace_app_*.log
If the log file contains the following information, the eSpace EMS server receives the
message from the Mediation Node.
2011-11-21 19:20:27,411 DEBUG [T=1245][com.huawei.oms.net.trace.uoa.agent.AgentDispatcher.dispatch() 117] Receive message from remote ip = 10.138.48.145, port = 6601
2011-11-21 19:20:27,411 DEBUG [T=1245][com.huawei.oms.net.trace.uoa.agent.AgentDispatcher.dispatch() 120] Receive message command id = 3.
2011-11-21 19:20:27,411 DEBUG [T=1245][com.huawei.oms.net.trace.uoa.agent.AgentDispatcher.dispatch() 121]
000000 00 00 00 4A 00 02 00 03-00 00 00 00 FF 00 00 0E ...J............
000010 00 01 30 30 35 34 30 34-30 31 31 30 30 30 31 00 ..0054040110001.
000020 01 03 31 32 33 4E CA 33-FB 00 00 01 9B 00 00 00 ..123N.3........
000030 00 00 0B 46 72 6F 6D 20-53 52 56 4D 61 6E 00 00 ...From SRVMan..
000040 00 08 7C 6C 65 76 65 6C-3D 30 ..|level=0
If the preceding information is not displayed, the eSpace EMS server does not receive
the message forwarded by the Mediation. In this case, you need to check whether the
connection between the eSpace EMS server and the Mediation is normal and view the

log files of the app to locate and troubleshoot the fault. If the fault still persists, contact
Huawei technical support.
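When comparing the hexadecimal message stream in these logs with the field dump in the UOA log (m_nTotal_Length, m_sVersion, m_sCommand_ID, m_nSequence_ID, m_uiTraceTaskID, m_usSequenceNum, m_szModuleCode), the first bytes of the stream appear to carry those fields in big-endian order. The following Python sketch decodes them on that basis. The field layout is inferred from the sample only and is an assumption, not a documented protocol definition; the function name parse_trace_header is hypothetical.

```python
import struct

# Assumed layout: 4-byte length, 2-byte version, 2-byte command ID,
# 4-byte sequence ID, 4-byte trace task ID, 2-byte sequence number (big-endian).
HEADER = struct.Struct(">IHHIIH")
MODULE_CODE_LENGTH = 13  # module codes in this guide are 13-digit numbers


def parse_trace_header(data):
    """Decode the leading fields of a trace message stream into a dict."""
    (total_length, version, command_id,
     sequence_id, task_id, sequence_num) = HEADER.unpack_from(data, 0)
    start = HEADER.size
    module_code = data[start:start + MODULE_CODE_LENGTH].decode("ascii")
    return {
        "total_length": total_length,
        "version": version,
        "command_id": command_id,
        "sequence_id": sequence_id,
        "task_id": task_id,
        "sequence_num": sequence_num,
        "module_code": module_code,
    }


# First 32 bytes of the message stream shown in the log examples.
SAMPLE = bytes.fromhex(
    "0000004A 0002 0003 00000000 FF00000E 0001"
    "30303534303430313130303031 00"
)
```

For the sample bytes, the decoded length is 74, the command ID is 3, and the task ID is 0xFF00000E (4278190094), which agrees with the values in the UOA field dump. Comparing these fields across the UOA, Mediation, and server logs can help confirm that the same message is being traced at each hop.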
Unknown Icons Exist in Flowcharts in the Chart Display Area of the iTrace
Client (As Shown in Figure 4-7)
Figure 4-7 Unknown icons
• Cause
− An NE registers the GeneralID with the UOA.
− An unidentified module is registered on the UOA, but the module type is not
specified in the resource file.
• Principle
After receiving a tracing message, the eSpace EMS client searches for the module type
in the local NE data based on the module code, and draws a tracing flowchart based on
the obtained module type. If the module code does not exist in the local eSpace EMS NE
data, the eSpace EMS client cannot find the module type, and cannot draw an icon that
can be identified. Therefore, the icon is displayed as the module code in the chart display
area. The module code registered with the UOA is a 13-digit number.
• Solution
Check whether the unknown module is registered with the UOA by viewing the
$UOA_RUN_ROOT/data/middata/module.dat file on the UOA server.
The first field in every line of this file is a module code. The following are examples:
0054040101001|4040101|SEE_testrptmsg|10.137.97.244|1|1|V100R001C02B121|1111|40401|SEE_244|o60585|0054040101001|||soapadapter_100
0054040101002|4040101|SEE_testrptmsg2|10.137.97.244|1|1|V100R001C02B121|1111|40401|SEE_244|o60585|0054040101002|||soapadapter_100
− If the module code exists, the module has been registered with the UOA.
− If the module code does not exist, the module is not registered with the UOA.
Because the $UOA_RUN_ROOT/data/middata/module.dat file cannot be
modified manually, contact the NE maintenance personnel to register the module
with the UOA.
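The lookup in module.dat can be sketched as follows in Python. The function name registered_module_codes is hypothetical; the sketch relies only on the statement above that the first field of every line, delimited by a vertical bar, is a module code.

```python
def registered_module_codes(lines):
    """Collect the first '|'-separated field of every non-empty line."""
    codes = set()
    for line in lines:
        line = line.strip()
        if line:
            codes.add(line.split("|", 1)[0])
    return codes
```

For example, read the file with open(path) and pass the file object to the function, then test whether the module code shown in the chart display area is in the returned set. The file is only read, never modified.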

Unknown Icons Exist in Flowcharts in the Chart Display Area of the iTrace
Client (As Shown in Figure 4-8)
Figure 4-8 Unknown icons
• Cause
− An NE fails to register the GeneralID with the UOA.
− The module GeneralID is not reported.
• Principle
The GeneralID of a module is identified in either of the following ways:
− An NE registers the GeneralID with the UOA.
The process of identifying the GeneralID is as follows:
Assume that the module code is 0054040110001, and the GeneralID is
DOID://0A4769E7/00000001/00000002/000000020054100100002.
1. The NE notifies the UOA that the GeneralID of the module 0054040110001 is
DOID://0A4769E7/00000001/00000002/000000020054100100002.
2. When the iTrace server sends a request to the UOA for creating a tracing task, the UOA
reports the GeneralID to the iTrace server.
3. The NE reports the tracing message with the other party information being the module
GeneralID DOID://0A4769E7/00000001/00000002/000000020054100100002.
4. The iTrace server changes the other party information from
DOID://0A4769E7/00000001/00000002/000000020054100100002 to 0054040110001.
5. The iTrace server reports the message with the module code to the iTrace client for
display.
− An NE reports the module GeneralID directly in the additional information about a
message. This method does not require preprocessing of the iTrace server. The iTrace
client directly changes the GeneralID to the module code.
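The translation from a GeneralID back to a module code can be pictured as a lookup table that is filled when the UOA reports GeneralIDs and consulted when a tracing message arrives. The following Python sketch illustrates that idea only; the class name GeneralIdResolver and its methods are hypothetical and do not describe the iTrace server internals.

```python
class GeneralIdResolver:
    """Map GeneralIDs reported by the UOA to module codes."""

    def __init__(self):
        self._module_by_general_id = {}

    def register(self, module_code, general_id):
        """Record a GeneralID reported for a module (steps 1 and 2 above)."""
        self._module_by_general_id[general_id] = module_code

    def resolve(self, other_party):
        """Return the module code for a GeneralID (step 4 above).

        An unknown value is returned unchanged, which is one way an
        unidentified icon could end up in the chart display area.
        """
        return self._module_by_general_id.get(other_party, other_party)
```

Using the values from the example above, registering module code 0054040110001 with its DOID GeneralID lets resolve return 0054040110001 for that GeneralID. A GeneralID that was never registered is passed through as is.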
• Solution
The fault locating method varies according to the method of identifying the GeneralID.
− An NE registers the GeneralID with the UOA.
View UOA logs for creating a tracing task.
If the following log information exists, the GeneralID is successfully registered with
the UOA. Otherwise, contact Huawei technical support engineers.
The following information indicates that the UOA reports a GeneralID message to the iTrace server. For
example, the message is SynGeneralIDListMsg, the message ID is 0x56, and the GeneralID of the
module 0054030104001 is DOID://0A4769E7/00000001/00000002/000000020054100100001.
Jun 04 09:50:30 [Debug3] ThreadID:1479543712 >>> Put message to queue(0) (destination=10.137.97.248:52241), length is 89.
Jun 04 09:50:30 [Debug3] ThreadID:1479543712 >>> ------------- CSynGeneralIDListMsg ------------- m_nTotal_Length = 89 m_sVersion = 2 m_sCommand_ID = 0x56 m_nSequence_ID = 0 m_ucSynType = 0 m_szModuleCode = 0054030104001
Jun 04 09:50:30 [Debug3] ThreadID:1479543712 >>> m_vModuleGeneralIDList.size ===== 2 ucGeneralIDType : 1 strGeneralID : 0054030104001 ucGeneralIDType : 0 strGeneralID : DOID://0A4769E7/00000001/00000002/000000020054100100001
− An NE reports the module GeneralID directly in the additional information about a
message.
View the additional message information in the form display area of the iTrace client
to check whether the module alias similar to |alias=... is displayed.
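When checking the UOA log by hand is tedious, the registration records can be scanned with a short script. The following is a minimal sketch, assuming the log fields look like the m_szModuleCode and strGeneralID fields in the sample above; the function names and regular expressions are illustrative and are not part of the eSpace EMS.

```python
import re

# Field names are taken from the sample UOA log; the patterns themselves are
# assumptions about spacing and line wrapping.
_FIELD = re.compile(
    r"m_szModuleCode\s*=\s*(\S+)"          # group 1: module code
    r"|strGeneralID\s*:\s*(DOID://\S+)"    # group 2: DOID-style GeneralID
)


def general_ids_by_module(log_text):
    """Map each module code to the DOID GeneralIDs logged after it."""
    mapping = {}
    current = None
    for match in _FIELD.finditer(log_text):
        if match.group(1) is not None:
            current = match.group(1)
        elif current is not None:
            mapping.setdefault(current, []).append(match.group(2))
    return mapping


def module_code_for(general_id, mapping):
    """Reverse lookup: return the module code registered for a GeneralID."""
    for code, general_ids in mapping.items():
        if general_id in general_ids:
            return code
    return None
```

If module_code_for returns None for a GeneralID that appears in a tracing message, the GeneralID was probably not registered with the UOA in the log that was scanned.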
4.4 iCnfg Analysis

This topic describes the principles of the iCnfg common functions and the fault location
methods.
Implementation Principles of Configuration Management

Figure 4-9 shows the implementation principles of configuration management.

Figure 4-9 Implementation principles of configuration management
Table 4-5 describes the implementation process of configuration management.
The operations described in Table 4-5 are not in sequence.
Table 4-5 Implementation process of configuration management
Operation: Synchronize configuration data
Procedure: The administrator triggers the operation of synchronizing configuration data on
the eSpace EMS client.
1. The eSpace EMS client sends a data synchronization request to the server.
2. The server obtains the latest configuration data from NEs.
3. The server synchronizes the latest data to the eSpace EMS database and returns a
synchronization result to the eSpace EMS client.
4. The eSpace EMS client shows the synchronization result.

Operation: Add, delete, or modify configuration items
Procedure: The administrator adds, deletes, or modifies configuration items on the eSpace
EMS client.
1. The eSpace EMS client sends a request for pre-editing configuration data to the server.
2. The server submits the data to NEs.
3. The eSpace EMS client shows the data submission result.

Fault Location
If an error occurs when you perform configuration management operations, you can locate the
fault using either of the following two methods:
1. Locate a fault based on the error message provided by the eSpace EMS client.
2. Locate a fault based on the error log and analysis for the configuration management
implementation process.
Typically, you can locate a fault based on the error message provided by the eSpace EMS
client. If you cannot find the fault cause based on the error message, you can check the error
log and configuration management implementation process. The error log information is as
follows:
 Log file name: cm_[TIMESTAMP].log
 Log file path: {install path}/run/log/oms/cm
The following is a complete sample log entry:
2011-11-15 09:50:13,248 DEBUG
[T=205][com.huawei.oms.cm.as.support.ExtensionActivator.start() 43]
ExtensionActivator is starting.
Table 4-6 describes each part in the log information.
Information: 2011-11-15 09:50:13,248
Description: Time when a fault occurs. The time is accurate to the millisecond.

Information: DEBUG
Description: Log level.

Information: [T=205]
Description: ID of the current thread.

Information: [com.huawei.oms.cm.as.support.ExtensionActivator.start() 43]
Description: Code information, such as the class name, method name, and code line.

Information: ExtensionActivator is starting.
Description: Log information. The information may span multiple lines.
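The log entry layout described in Table 4-6 can be split into its parts programmatically, which helps when filtering a large cm log by level or thread. The following is a minimal sketch, assuming every entry follows the sample layout; the CmLogEntry type and the parse_cm_log_entry function are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional


class CmLogEntry(NamedTuple):
    time: str        # for example "2011-11-15 09:50:13,248"
    level: str       # for example "DEBUG"
    thread_id: int   # the number after "T="
    location: str    # class and method name
    line: int        # code line
    message: str     # free-form log information


# Pattern derived from the single sample entry; other layouts are not handled.
_ENTRY = re.compile(
    r"(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[T=(?P<thread>\d+)\]"
    r"\[(?P<location>\S+)\s+(?P<line>\d+)\]\s*"
    r"(?P<message>.*)",
    re.DOTALL,  # the message may continue on following lines
)


def parse_cm_log_entry(text: str) -> Optional[CmLogEntry]:
    """Return the parts of one cm log entry, or None if the layout differs."""
    match = _ENTRY.match(text.strip())
    if match is None:
        return None
    return CmLogEntry(
        time=match["time"],
        level=match["level"],
        thread_id=int(match["thread"]),
        location=match["location"],
        line=int(match["line"]),
        message=match["message"].strip(),
    )
```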
4.5 DR Fault Analysis

This topic describes the principles of the disaster recovery (DR) system, which helps you to
locate and rectify a DR fault quickly.
Operation Principles of the GDR System

Figure 4-10 shows the network deployed in geographical disaster recovery (GDR) mode.

Figure 4-10 Network deployed in GDR mode
Typically, the production machine of the eSpace EMS provides services. If the production
machine is faulty, a manual switchover is performed to switch services from the production
machine to the redundancy machine.
The GDR software synchronizes data between the production machine and the redundancy
machine and performs resource management for application services.
 DRService: primary process of the GDR software. The process exists on both the
production machine and the redundancy machine. The DRService process of the
production machine monitors the status of the replication link and helps the DRService
process of the redundancy machine to perform a failover. The DRService process of the
redundancy machine monitors the GDR software and prepares the application services
and database for a switchover or failover.
 DRAgent: agent of the GDR software. The DRAgent exists only on the redundancy
machine and is responsible for preparing the application programs for a failover.
 Disaster recovery command line interface (DRCLi): operation mode of the GDR
software. The DRCLi encapsulates data synchronization commands at the bottom layer
to provide users with simple command interfaces and provide the DRService process
with the unified management interface, command interface, and message interface.
 Filesync: file synchronization tool of the GDR software. The tool can run only on the
production machine and is responsible for synchronizing files from the production
machine to the redundancy machine.
 DataGuard: component of the Oracle database. The component is responsible for
replicating data of the database.

Data Synchronization in GDR Mode

In the eSpace EMS deployed in GDR mode, data synchronization is classified into the
following types:
 File synchronization based on Filesync: implemented by using the GDR software and
managed by the Filesync process of the GDR software.
 Oracle database synchronization based on the DataGuard: implemented by using the
DataGuard component of the Oracle database.
 File synchronization based on Filesync:
The synchronization process is as follows:
1. The GDR software periodically scans the files on the production machine.
2. If certain files are modified, the GDR software synchronizes the modified files from the
production machine to the redundancy machine by running the SCP or RCP command
of the operating system.
 If the update time of a file changes, the file is considered modified. Even if the file
contents are not actually modified, the file is still synchronized from the production machine to the
redundancy machine.
 In the configuration file of the GDR software, you need to configure information such as the files or
directories to be synchronized and the synchronization type.
The files to be synchronized are listed as follows:
− {install path}/run/repository
− {install path}/run/hedex
− {install path}/run/plugins
− {install path}/run/pickup
− {install path}/run/dump
− {install path}/run/data
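The rule that a changed update time alone marks a file as modified can be illustrated with a short scan routine. This is a sketch of the general technique only, not the Filesync implementation: the function name find_modified_files and the last_seen dictionary are hypothetical, and the real tool reads its directories from the GDR configuration file.

```python
import os


def find_modified_files(roots, last_seen):
    """Return files whose modification time differs from the recorded one.

    roots:     directories to scan, for example the run/ subdirectories
               listed above.
    last_seen: dict mapping file path to the modification time recorded by
               the previous scan; it is updated in place.
    """
    modified = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                mtime = os.stat(path).st_mtime_ns
                # Only the timestamp is compared, so a file that is touched
                # without a content change is still reported.
                if last_seen.get(path) != mtime:
                    modified.append(path)
                    last_seen[path] = mtime
    return sorted(modified)
```

Comparing timestamps is cheap but over-reports, which matches the note above: a file whose contents are unchanged is synchronized again if its update time changes.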
 Oracle database synchronization based on the DataGuard:
Figure 4-11 shows the synchronization process.

Figure 4-11 Network deployed in GDR mode
The following four processes are responsible for synchronization:

− Log network server (LNS): transmits the redo log to the redundancy machine.
− Remote file server (RFS): receives redo data from the production machine and writes
redo data in the redo log file on the redundancy machine.
− Managed recovery process (MRP): applies the received logs to physical disks of the
redundancy machine.
− Archive (ARCH): archives logs on the redundancy machine.
The synchronization process is as follows:
1. The LGWR process writes redo logs in the production database in online mode.
2. The LNS process reads online redo logs from the production database and sends the logs to
the RFS process of the redundancy machine.
3. The RFS process receives the redo logs from the LNS process.
4. The RFS sends an acknowledgment message to the LNS, saying that the redo logs are
received successfully.
5. The RFS process writes the redo logs to the redundancy machine.
6. The MRP obtains the redo logs from the redundancy machine.
7. The MRP applies the redo logs to the database of the redundancy machine.

Switching
Switching is classified into switchover and failover:
 Switchover: You perform a switchover only when the production machine is running
properly. A switchover is typically triggered for testing during installation, debugging, or
routine maintenance.
 Failover: You need to perform a failover when the production machine is faulty.
In switchover mode, you need to stop the service of the production machine and then start the service of
the redundancy machine. In failover mode, however, you need only to start the service of the
redundancy machine.
Table 4-7 describes the switching process.
Table 4-7 Switching processes in switchover and failover modes
Switching mode: Switchover
Switching process:
1. You run the drcli command on the redundancy machine to trigger a switchover.
2. After receiving the switchover request, the DRService of the redundancy machine notifies
the DRAgent to prepare for the switchover. If the production machine is running properly, the
DRService of the redundancy machine also notifies the DRService of the production machine
to prepare for the switchover.
3. The DRAgent starts the eSpace EMS service of the redundancy machine.
4. The DRService of the production machine stops the eSpace EMS service of the production
machine.
5. The DRService of the production machine stops data replication from the original
production machine to the original redundancy machine.
6. On the redundancy machine, you run the command for role switching to start data
synchronization from the current production machine to the current redundancy machine.

Switching mode: Failover
Switching process:
1. When the production machine is faulty, you run the drcli command on the redundancy
machine to trigger a failover.
2. After receiving the failover request, the DRService of the redundancy machine notifies
the DRAgent to prepare for the failover.
3. The DRAgent starts the eSpace EMS service of the redundancy machine.
4. You repair the original production machine. After repair, the data is synchronized from
the current production machine to the current redundancy machine.
Fault Location
When a fault occurs in the GDR system, you can locate the fault based on logs.

The GDR software logs the operating information about the GDR system. Common log files
are listed as follows:
 drcli.log in /opt/huawei/gdr/log: operating logs of the drcli command
 filesync_sh.log in /opt/huawei/gdr/log: operating logs about data replication
 filesync.log in /opt/huawei/gdr/log: operating logs of the Filesync process
 drservice.log in /opt/huawei/gdr/log: operating logs of the DRService process

5 Troubleshooting
About This Chapter

This topic describes the common operations in the troubleshooting, including the operations
of checking the statuses of the eSpace EMS, VCS, and DR resources and starting and
stopping the eSpace EMS, VCS, and DR system.
5.1 Checking the Running Status of the eSpace EMS
This topic describes how to start the eSpace EMS, stop the eSpace EMS, and view the running
status of eSpace EMS services.
5.2 Checking the Running Status of the DR System
This topic describes how to check the running status of the eSpace EMS DR system.
5.1 Checking the Running Status of the eSpace EMS

This topic describes how to start the eSpace EMS, stop the eSpace EMS, and view the running
status of eSpace EMS services.
5.1.1 Starting the eSpace EMS Service

This topic describes how to start the eSpace EMS. You need to start the Oracle database
before starting the eSpace EMS.
Command Syntax
./omsd.sh start
Procedure
2. Access {install path}/run/bin.
> cd {install path}/run/bin
3. Start the eSpace EMS.
> ./omsd.sh start

Output Example
Help System ......................................................... started

Kernel Module ....................................................... started
Base Module ......................................................... started
Net Adapter Module .................................................. started
Mediation Module .................................................... started
Topo Module ......................................................... started
MORE Module ......................................................... started
Audit Module ........................................................ started
Access Module ....................................................... started
Dump Module ......................................................... started
Fault Module ........................................................ started
Security Module ..................................................... started
Core Platform ....................................................... started
Performance Module .................................................. started
License Monitor Module .............................................. started
ConfigManager Module ................................................ started
NBI Module .......................................................... started
UOA Module .......................................................... started
Trace Module ........................................................ started
I2000 PM ............................................................ started
SoftwareManagement ideploy.ui ....................................... started
SoftwareManagement swm.ui ........................................... started
Startup Monitor ..................................................... started
Finished
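With more than twenty modules in the output, a module that failed to start is easy to overlook. The following is a minimal sketch that parses output in the "name ..... status" layout shown above and lists the modules that are not in the expected state. The function names are hypothetical, and the output layout is assumed to match the example exactly.

```python
import re

# A status line is a module name, a run of dots, and a status word.
_STATUS_LINE = re.compile(r"^(?P<name>.+?)\s*\.{3,}\s*(?P<status>\w+)\s*$")


def module_statuses(output):
    """Return {module name: status} for every status line in the output."""
    statuses = {}
    for line in output.splitlines():
        match = _STATUS_LINE.match(line.strip())
        if match:
            statuses[match["name"]] = match["status"]
    return statuses


def modules_not_in_state(output, expected="started"):
    """Return the sorted names of modules whose status is not `expected`."""
    return sorted(
        name
        for name, status in module_statuses(output).items()
        if status != expected
    )
```

The same helper can be applied to the output of the stop command by passing expected="stopped".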
5.1.2 Querying the eSpace EMS Service Status

This topic describes how to query the eSpace EMS service status.
Command Syntax
./omscli.sh checkstate process
Procedure
2. Access {install path}/run/bin.
3. Query the eSpace EMS service status.
> ./omscli.sh checkstate process
Output Example
System process already started.
5.1.3 Stopping the eSpace EMS Service

This topic describes how to stop the eSpace EMS.

Command Syntax
./omsd.sh stop
Procedure
2. Access {install path}/run/bin.
3. Stop the eSpace EMS.
> ./omsd.sh stop
Output Example
Dump Module ......................................................... stopped

MORE Module ......................................................... stopped
Audit Module ........................................................ stopped
Security Module ..................................................... stopped
ConfigManager Module ................................................ stopped
Trace Module ........................................................ stopped
Core Platform ....................................................... stopped
I2000 PM ............................................................ stopped
Performance Module .................................................. stopped
Access Module ....................................................... stopped
License Monitor Module .............................................. stopped
SoftwareManagement ideploy.ui ....................................... stopped
NBI Module .......................................................... stopped
Net Adapter Module .................................................. stopped
Startup Monitor ..................................................... stopped
Topo Module ......................................................... stopped
Fault Module ........................................................ stopped
UOA Module .......................................................... stopped
Mediation Module .................................................... stopped
SoftwareManagement swm.ui ........................................... stopped
Base Module ......................................................... stopped
Kernel Module ....................................................... stopped
Help System ......................................................... stopped
Finished
5.2 Checking the Running Status of the DR System

This topic describes how to check the running status of the eSpace EMS DR system.
5.2.1 Starting the GDR Software

This topic describes the commands for starting the drservice and filesync processes on the
production machine and the drservice and dragent processes on the redundancy machine.
Procedure
Step 1 Start the GDR software of the production machine.

> drservice -c
> p
UID PID PPID C STIME TTY TIME CMD

root 3306 1 0 Sep15 ? 00:00:02 drservice -c
root 3307 3306 0 Sep15 ? 00:00:55 filesync 3 243353354 11112
Step 2 Start the GDR software of the redundancy machine.

> drservice -m
> p

root 24065 1 0 Sep15 ? 00:00:08 drservice -m
root 24069 24065 0 Sep15 ? 00:00:00 dragent 0 i2000
----End
5.2.2 Checking the Process Status of the GDR Software

If the process of the GDR software is abnormal, data synchronization, failover, or switchover
may fail. This topic describes how to check the process status of the GDR software.
Procedure
Step 1 Log in to the production machine and redundancy machine by using the gdr account that
was used to install the GDR software.
Step 2 Check the process status of the GDR software on the production machine.
> p
If the information contains the drservice and filesync processes, the GDR software is running
properly on the production machine.

root 1632 1 0 14:38 ? 00:00:02 drservice -c
root 1640 1632 0 14:38 ? 00:00:01 dragent 0 db
root 1641 1632 0 14:38 ? 00:00:01 dragent 1 pub
root 1639 1632 0 14:38 ? 00:00:00 filesync 3 1471304970 11112
Step 3 Check the process status of the GDR software on the redundancy machine.
> p
If the information contains the drservice and dragent processes, the GDR software is running
properly on the redundancy machine.

root 20393 1 0 14:40 ? 00:00:02 drservice -m
root 20400 20393 0 14:40 ? 00:00:02 dragent 0 i2000
root 20401 20393 0 14:40 ? 00:00:02 dragent 1 db
The dragent process exists on the redundancy machine, but the filesync process does not exist on the
redundancy machine.
----End
Check Result
Check that the GDR software is running properly on both the production machine and
redundancy machine.
Exception Handling
 If a GDR process is abnormal or not running, check the following logs:
− {GDRWORKDIR}/log/drcli.log
− {GDRWORKDIR}/log/drservice.log
− {GDRWORKDIR}/log/filesync.log
− {GDRWORKDIR}/log/filesync_sh.log
 If the drservice or dragent process is abnormal, stop the process and then start the
drservice process.
For details, see 5.2.7 Stopping the GDR Software and 5.2.1 Starting the GDR Software.
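The process check in this procedure can be expressed as a small helper that takes the process listing as text and reports which expected GDR processes are absent. This is a sketch under stated assumptions: the listing has the column layout of the sample output (command name in the eighth column), the expected process names per machine are those named in Step 2 and Step 3, and the function and table names are hypothetical.

```python
# Processes expected on each machine, as stated in the procedure above.
REQUIRED = {
    "production": ("drservice", "filesync"),
    "redundancy": ("drservice", "dragent"),
}


def missing_gdr_processes(listing, role):
    """Return the expected process names that do not appear in the listing.

    listing: text of the process listing, one process per line.
    role:    "production" or "redundancy".
    """
    running = set()
    for line in listing.splitlines():
        fields = line.split()
        # Sample layout: UID PID PPID C STIME TTY TIME CMD [arguments...]
        if len(fields) >= 8:
            running.add(fields[7])
    return [name for name in REQUIRED[role] if name not in running]
```

An empty result means that every expected process was found. A non-empty result names the processes to investigate in the logs listed under Exception Handling.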
5.2.3 Checking the States of DR Resources

This topic describes how to check the states of DR resources.
Context
Run the command only on the redundancy machine.
The states of DR resources include:

 PreOnline: Resources or resource groups are available. After DR prestart is performed
successfully, resources or resource groups are in this state.
 PreOnlining: Resources or resource groups are being used for DR prestart. When DR
prestart is being performed, resources or resource groups are in this state.
 Online: Resources or resource groups are available. After a switchover or failover is
performed successfully, resources or resource groups are in this state.
 Onlining: Resources or resource groups are being used for DR restart. When a switchover
or failover is being performed, resources or resource groups are in this state.
 Offline: Resources or resource groups are stopped.
 Offlining: Resources or resource groups are being stopped.
 PreOnlineFailed: Resources or resource groups fail to be used for DR prestart.
 OnlineFailed: Resources or resource groups fail to be used for a switchover or failover.
 OfflineFailed: Stopping resources or resource groups fails.
 Unknown: The state of resources or resource groups is unknown.
 PostOnline: After a switchover or failover is performed successfully, resources or
resource groups are in this state.
 PreOnlinePending: At prestart, database resources are suspended at start until the
production machine finishes the checkpoint operation.
 OnlinePending: Resources or resource groups are being started but suspended. This state
occurs in the following scenarios:
− ResourceGroup.n requires a switchover or failover, but ResourceGroup.m that
conflicts with ResourceGroup.n has initiated DR prestart. Therefore,
ResourceGroup.n is in the OnlinePending state until the state of ResourceGroup.m
changes to Offline.
− If the fast switching function is started before database resources are started, database
resources are in the OnlinePending state until the IBC message of the production
machine is executed successfully.
Procedure
Step 1 Log in to the redundancy machine as DR user gdr.
Step 2 Check the DR resource states.
> drcli -c drstate -l
RG STATE
Group State DRState
RG.1 PostOnline Normal
RESOURCE STATE
Group Resource ID Type State
RG.1 1001 -- Offline
RG.1 100101 App(i2000) Offline
RG.1 100102 DB(ORACLE) Offline
----End
Check Result
Check that all resources on the redundancy machine are in the Offline state.
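The check result can be verified mechanically by parsing the RESOURCE STATE part of the command output. The following is a minimal sketch, assuming the output has the layout of the example in Step 2 (four columns per resource row, with a numeric resource ID); the function names are hypothetical.

```python
def resource_states(output):
    """Return {resource ID: state} from the RESOURCE STATE section."""
    states = {}
    in_resources = False
    for line in output.splitlines():
        text = line.strip()
        if text == "RESOURCE STATE":
            in_resources = True
            continue
        fields = text.split()
        # Resource rows: Group, Resource ID, Type, State. The column header
        # row has five words and a non-numeric second field, so it is skipped.
        if in_resources and len(fields) == 4 and fields[1].isdigit():
            states[fields[1]] = fields[3]
    return states


def resources_not_offline(output):
    """Return the resources whose state is anything other than Offline."""
    return {
        resource_id: state
        for resource_id, state in resource_states(output).items()
        if state != "Offline"
    }
```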
5.2.4 Checking the Database Synchronization Status

This topic describes how to check whether database synchronization is normal.
Procedure
Step 1 Log in to the production machine by using the gdr account that was used to install
the GDR software.
Step 2 Check the database synchronization status.
> drcli -c checkrep ResID100101

Example: > drcli -c checkrep 100102
RepType: DataGuard
DBName : I2KDB
RlinkName : omsdb[omsdb]_dr_omsdb
Log_Dest_Status : Connected
Time_Computed : None
TransportLag : None
ApplyLag : None
EstimatedOpenTime : None
RealTimeApply : None
MRP0Status : None
OracleDBStatus : READ WRITE
Step 3 Log in to the redundancy machine by using the gdr account that was used to install
the GDR software.
Step 4 Check the database synchronization status.
> drcli -c checkrep ResID100101
Example: > drcli -c checkrep 100102
RepType: DataGuard
DBName : I2KDB
RlinkName : omsdb_dr_omsdb[omsdb]
Time_Computed : 26-JAN-2010 13:43:54
TransportLag : +00 00:00:00
ApplyLag : +00 00:00:03
EstimatedOpenTime : 13(S)
RealTimeApply : ON
MRP0Status : APPLYING_LOG
OracleDBStatus : READ ONLY
----End
Check Result
If the preceding information in bold is displayed, database synchronization is normal.
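The "Name : Value" output of the checkrep command is straightforward to parse, which makes it possible to compare the fields on both machines or to track the lag values over time. The following is a minimal sketch, assuming the output layout of the examples above; the function names are hypothetical, and no threshold for an acceptable lag is implied.

```python
import re

# Lag values in the sample output look like "+DD HH:MM:SS".
_LAG = re.compile(r"^\+(\d+) (\d{2}):(\d{2}):(\d{2})$")


def parse_checkrep(output):
    """Return the checkrep fields as a {name: value} dictionary."""
    fields = {}
    for line in output.splitlines():
        # Split at the first colon only: values such as timestamps
        # contain further colons.
        name, separator, value = line.partition(":")
        if separator:
            fields[name.strip()] = value.strip()
    return fields


def lag_seconds(value):
    """Convert a "+DD HH:MM:SS" lag to seconds, or return None if unset."""
    match = _LAG.match(value)
    if match is None:
        return None
    days, hours, minutes, seconds = (int(part) for part in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds
```

On the production machine the lag fields are reported as None in the sample output, so lag_seconds returns None there and a number only on the redundancy machine.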
5.2.5 Checking the File Synchronization Status

This topic describes how to check whether the file synchronization tool Filesync is running
properly.
Procedure
Step 1 Log in to the production machine by using the gdr account that was used to install
the GDR software.
Step 2 Check whether the Filesync process is running properly.
> p

If the following information is displayed, the Filesync process is running properly.

root 1632 1 0 14:38 ? 00:00:02 drservice -c
root 1640 1632 0 14:38 ? 00:00:01 dragent 0 db
root 1641 1632 0 14:38 ? 00:00:01 dragent 1 pub
root 1639 1632 0 14:38 ? 00:00:00 filesync 3 1471304970 11112
Step 3 Run the following commands:

> cd ${GDRWORKDIR}/log/
> more filesync.log
> more filesync_sh.log
> more filesync.prt
Check whether the latest records in the preceding log files contain error or failed. If the latest
records do not contain error or failed, file synchronization is normal.
----End
Check Result
 The Filesync process is running properly.
 The latest records in the log files do not contain error or failed.
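Scanning the end of each log file for the two keywords can be scripted instead of paging through the files. The following is a minimal sketch. The function name, the default of 20 records, and the case-insensitive match are assumptions, because the procedure does not define how many records count as the latest ones.

```python
def suspicious_records(lines, tail=20, keywords=("error", "failed")):
    """Return the records among the last `tail` lines that contain a keyword.

    lines:    an iterable of log lines, for example an open file object.
    tail:     how many trailing lines to treat as "the latest records"
              (an assumed default, adjust as needed).
    keywords: substrings to look for; matching ignores case.
    """
    recent = list(lines)[-tail:]
    return [
        line.rstrip("\n")
        for line in recent
        if any(keyword in line.lower() for keyword in keywords)
    ]
```

To use it, open each of filesync.log, filesync_sh.log, and filesync.prt in the log directory and pass the file object to suspicious_records. An empty list for every file corresponds to the normal result.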
5.2.6 Checking the Statuses of the Switched Roles of the DR System
This topic describes how to check the statuses of the switched roles of the DR system.
Procedure
Step 1 Log in to the production machine as the DR user gdr.
Step 2 Run the drcli -s switchovercheck command to check whether the running status of the DR
environment is normal.
The command is used to check the following information:
 Data replication status of a database resource
 Status of the file synchronization through the file synchronization tool
 Running status of the DR software GDR
 Status of the eSpace EMS key information
After the command is run, the system writes the following execution result into the
switchcheck.prt file in /opt/huawei/GDR/log:
*****************************************************************************
Wed Dec 2 10:11:40 CST 2009
*****************************************************************************
---------------------Check DB replication State BEGIN------------------------

[SUCCESS] Stauts of DB replication is Normal.

---------------------Check DB replication State END--------------------------
---------------------Check Filesync State BEGIN------------------------------

[SUCCESS] Filesync status is Normal and all files have been replicated.
---------------------Check Filesync State END--------------------------------
---------------------Check GDR Process State BEGIN---------------------------

[SUCCESS] All processes of GDR are Normal.
---------------------Check GDR Process State END-----------------------------
---------------------Check Application State BEGIN---------------------------

[PROMPT] /opt/huawei/gdr/tools/check_i2000.sh execute successfully.
[SUCCESS] All App Scripts execute successfully.
---------------------Check Application State END-----------------------------
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[SUCCESS] All check finished. Status are all Normal.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
----End
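The result lines in switchcheck.prt carry a bracketed tag, so the file can be summarized with a few lines of code. The following is a minimal sketch. Only the [SUCCESS] and [PROMPT] tags appear in the sample above, so treating every other tag as needing review is an assumption, and the function names are hypothetical.

```python
import re

# A result line starts with a bracketed uppercase tag, for example [SUCCESS].
_RESULT = re.compile(r"^\[(?P<tag>[A-Z]+)\]\s*(?P<text>.*)$")


def check_results(report):
    """Return (tag, text) pairs for every tagged line in the report."""
    results = []
    for line in report.splitlines():
        match = _RESULT.match(line.strip())
        if match:
            results.append((match["tag"], match["text"]))
    return results


def results_needing_review(report, ok_tags=("SUCCESS", "PROMPT")):
    """Return tagged lines whose tag is not in ok_tags (assumed benign)."""
    return [(tag, text) for tag, text in check_results(report) if tag not in ok_tags]
```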
5.2.7 Stopping the GDR Software

This topic describes the commands for stopping the drservice, filesync, and dragent
processes.
Context
When you stop the GDR software on the production machine, the drservice and filesync
processes are stopped.
When you stop the GDR software on the redundancy machine, the drservice and dragent
processes are stopped.
Procedure
Step 1 Log in to the production or redundancy machine as user gdr.
Step 2 Run the following command:
> drcli -s stop
----End

6 Collecting Fault Information
About This Chapter

This topic describes the information to be collected for fault analysis and location and the
commands used for collecting the information. The fault information to be collected involves
all devices dedicated to providing services in the product, including the devices related to
networks, storage, security, and servers, and their physical networking.
6.1 OS Information
This topic describes the common commands used to collect OS information.
6.2 Network Device Information
This topic describes the information about network devices that is to be collected.
6.3 DR Information
This topic describes the common information that is collected when a DR fault occurs.
6.4 Oracle Database Information
This topic describes the common information to be collected when you handle database faults.
6.5 Collecting Logs
6.6 Version Information
This topic describes the commands for querying the information about the versions of the
eSpace EMS.
6.1 OS Information
This topic describes the common commands used to collect OS information.
Table 6-1 lists the information about the Linux OS that needs to be collected and the commands
used for collecting the information.

Table 6-1 Information to be collected and required commands (Linux OS)
No. 1
Information to be collected: OS version
Command: # uname -a
Description: View the output result.

No. 2
Information to be collected: System performance status
Command: # top
Description: Run the top command as the root user, capture a screenshot, and then save the
screenshot.
NOTE
Before you run the top command, make sure that the top software is installed.

No. 3
Information to be collected: System error log
Command: # more /var/log/messages
Description: View the current error information of the system.

No. 4
Information to be collected: Space information of the hard disk
Command: # df -k
Description: View the space usage of the hard disk.

No. 5
Information to be collected: Space usage of the file system
Command: # du -sh *
Description: View the space usage of the file system in the current directory.

No. 6
Information to be collected: Information about a network adapter
Command: # ifconfig -a
Description: View the status and IP address of the network adapter.
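The non-interactive commands in Table 6-1 can be run in one pass and their output saved for the fault report. The following is a minimal sketch. The command lines come from Table 6-1, while the labels, function names, and the 30-second timeout are assumptions. The top and du -sh * commands are left out because the first is interactive and the second depends on the current directory.

```python
import subprocess

# Command lines taken from Table 6-1; the dictionary keys are illustrative.
COMMANDS = {
    "os_version": ["uname", "-a"],
    "disk_space": ["df", "-k"],
    "network_adapters": ["ifconfig", "-a"],
}


def run_command(argv):
    """Run one command and return its combined output as text."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A missing or hanging command should not abort the collection.
        return "<not collected: %s>" % exc
    return done.stdout + done.stderr


def collect(commands=COMMANDS, runner=run_command):
    """Return {label: output} for every command in `commands`."""
    return {label: runner(argv) for label, argv in commands.items()}
```

The runner parameter exists so that the collection logic can be exercised without executing anything, for example with runner=" ".join to preview the command lines.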
6.2 Network Device Information

This topic describes the information about network devices that is to be collected.
Currently, Huawei Quidway S5600 series Ethernet switches are adopted. Commands may
vary according to switch. For actual commands, see the corresponding manual.
Table 6-2 lists the information about Huawei Quidway S5600 that is to be collected and the
commands used for collecting the information.
Table 6-2 Information to be collected and required commands (switch)

No. 1
Information to be collected: Status of fault indicators
Command: -
Description: Check the LED indicators and interface indicators on the front panel of a switch.

No. 2
Information to be collected: Status of switch interfaces
Command:
 Command for checking all interfaces: display interface
 Command for checking specified interfaces: display interface GigabitEthernet 1/0/20
Description: Record the status of all valid interfaces displayed in a window.

No. 3
Information to be collected: Contents of switch logs
Command: display log
Description: Query logs for any error information or the situation that a certain interface
is frequently up and down.
NOTE
The command varies according to switch. For the actual command, see the corresponding
manual.
Figure 6-1 shows the status of the indicators on Huawei Quidway S5600.
Figure 6-1 Status of the indicators on Huawei Quidway S5600
Table 6-3 describes the modules shown in Figure 6-1.
Table 6-3 Module description
No. Description
1 Status indicators of twenty-four 10/100/1000Base-T
auto-negotiation Ethernet ports
2 Indicators of gigabit SFP combo ports
3 Fabric indicator
4 RPS indicator

5 Power indicator
6 Module indicator
7 Indicator of port mode switch
8 Mode switch button for port status indicators
9 Seven-segment LED display
10 Console
Table 6-4 describes the status of the indicators.
Table 6-4 Status of the indicators
Indicator: (5) Power indicator
Mark on the panel: PWR
Status and meaning:
 Steady green: A switch is normally started.
 Blink green (1 Hz): The system is performing the power-on self test.
 Steady red: The system fails in the power-on self test. A fault occurs.
 Blink yellow (1 Hz): Certain ports fail in the power-on self test. Their functions fail.
 Off: The switch is powered off.

Indicator: (4) RPS indicator
Mark on the panel: RPS
Status and meaning:
 Steady green: The AC part and the DC input are normal.
 Steady yellow: The DC input is normal. The AC part fails or the AC input power is not
connected.
 Off: The DC input power is not connected.

Indicator: (3) Fabric indicator
Mark on the panel: STK
Status and meaning:
 Green: The device is in the loop Fabric state. When a Fabric port receives or sends data,
the indicator blinks quickly.
 Yellow: A device is in the daisy chain Fabric state. When a Fabric port receives or sends
data, the Fabric indicator blinks quickly.
 Blink green (3 Hz): The device is separated from the Fabric device (valid when the device
is in the Fabric state).
 Off: Two Fabric ports are not connected.

Indicator: (6) Module indicator
Mark on the panel: Module (MOD)
Status and meaning:
 Steady green: The module is in position and runs normally.
 Blink yellow: The module is not supported or is faulty.
 Off: The module is not installed.
6.3 DR Information
This topic describes the common information that is collected when a DR fault occurs.
Table 6-5 describes the information that needs to be collected when a DR fault occurs and the
corresponding commands.
Table 6-5 Information that needs to be collected and the corresponding commands
No. 1
Collected information: Running status of the DR system
Command: > p
Description: Run the command on the eSpace EMS production machine and DR machine as the
DR user gdr. The DR system runs normally if the following information is displayed on the
eSpace EMS production machine:
root 1632 1 0 14:38 ? 00:00:02 drservice -c
root 1640 1632 0 14:38 ? 00:00:01 dragent 0 db
root 1641 1632 0 14:38 ? 00:00:01 dragent 1 pub
root 1639 1632 0 14:38 ? 00:00:00 filesync 3 1471304970 11112
and the following information is displayed on the eSpace EMS DR machine:
root 20393 1 0 14:40 ? 00:00:02 drservice -m
root 20400 20393 0 14:40 ? 00:00:02 dragent 0 i2000
root 20401 20393 0 14:40 ? 00:00:02 dragent 1 db
Otherwise, you need to collect the information, and then submit it to Huawei technical
support engineers.

No. 2
Collected information: Data replication status of a database resource
Command: > drcli -c checkrep 100102
Description: Run the command on the eSpace EMS production machine and DR machine as the
DR user gdr. If the following information is displayed on the eSpace EMS production machine,
it indicates that the data replication status of the database resource is normal on the eSpace
EMS production machine:
RepType: DataGuard
DBName : I2KDB
RlinkName : i2kdb[i2kdb]_dr_i2kdb
Time_Computed : None
TransportLag : None
ApplyLag : None
EstimatedOpenTime : None
RealTimeApply : None
MRP0Status : None
OracleDBStatus : READ WRITE
If the following information is displayed after the preceding command is run on the eSpace
EMS DR machine, it indicates that the data replication status of the database resource is
normal on the eSpace EMS DR machine:
RepType: DataGuard
DBName : I2KDB
RlinkName : dr_i2kdb_i2kdb[i2kdb]
Time_Computed : 04-JAN-2010 15:27:18
TransportLag : +00 00:00:00
ApplyLag : +03 15:04:53
EstimatedOpenTime : 10(S)
RealTimeApply : ON
MRP0Status :
OracleDBStatus : READ ONLY
Otherwise, you need to collect the displayed information, and then submit it to Huawei
technical support engineers.

No. 3
Collected information: Status of the file synchronization through the file synchronization tool
Command: > drcli -f check
Description: Run the command on the eSpace EMS production machine as the DR user gdr. The
command output will be written into the filesync.prt file in /opt/huawei/GDR/log. Then
submit the file to Huawei technical support engineers.

No. 4
Collected information: Status of the GDR resources
Command: > drcli -c drstate -l
Description: Run the command on the eSpace EMS DR machine as the DR user gdr. Then collect
the displayed information and submit it to Huawei technical support engineers.

No. 5
Collected information: Running status of the DR environment
Command: > drcli -s switchovercheck
Description: Run the command on the eSpace EMS production machine and DR machine as the
DR user gdr. The command output will be written into the switchcheck.prt file in
/opt/huawei/GDR/log. Then submit the file to Huawei technical support engineers.

No. 6
Collected information: Run logs of the GDR
Command: > cd /opt/huawei/gdr/log
Description: Pack all the files (including the filesync.prt and switchcheck.prt files) in the
preceding directory on the eSpace EMS production machine and DR machine, and then submit
the package to Huawei technical support engineers.
7 Configur > cd Pack all the files in the preceding directory on the
ation /opt/huawei/gd eSpace EMS production machine and DR machine,
informati r/config and then submit the package to Huawei technical
on of the support engineers.
GDR
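The process check in item 1 of Table 6-5 can be scripted. The following is a minimal sketch, assuming that a process listing (for example, the output of ps -ef) is supplied on standard input and that the process names match the sample output in the table. The function name check_dr_processes and its role argument are hypothetical and are not part of the GDR tools.

```shell
# Hypothetical helper: reads a process listing on stdin and reports whether the
# DR processes named in the sample output of Table 6-5 are present.
# $1 = role: "production" (expects drservice -c) or "dr" (expects drservice -m).
check_dr_processes() {
    role=$1
    listing=$(cat)
    case $role in
        production) service_flag='-c' ;;
        dr)         service_flag='-m' ;;
        *) echo "usage: check_dr_processes production|dr" >&2; return 2 ;;
    esac
    missing=0
    # The drservice parent process with the role-specific option.
    if ! printf '%s\n' "$listing" | grep -q -- "drservice $service_flag"; then
        echo "missing: drservice $service_flag"
        missing=1
    fi
    # The two dragent child processes (index 0 and 1 in the sample output).
    for idx in 0 1; do
        if ! printf '%s\n' "$listing" | grep -q "dragent $idx "; then
            echo "missing: dragent $idx"
            missing=1
        fi
    done
    # The sample output shows a filesync process on the production machine only.
    if [ "$role" = production ] && ! printf '%s\n' "$listing" | grep -q "filesync "; then
        echo "missing: filesync"
        missing=1
    fi
    if [ "$missing" -eq 0 ]; then
        echo "DR processes present"
    fi
    return "$missing"
}
```

A possible invocation is ps -ef | check_dr_processes production. A non-zero return value only means that the listing differs from the sample output; the displayed information still needs to be collected and submitted as described in the table.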
6.4 Oracle Database Information

This topic describes the common information to be collected when you handle database faults.
Table 6-6 lists the information to be collected from an Oracle database and the commands for
collecting the information.
Table 6-6 Information to be collected from an Oracle database and commands for collecting the
information

No. 1: Alarm logs
Command: > cd $ORACLE_BASE/diag/rdbms/$ORACLE_SID/$ORACLE_SID/alert
Description: Save the files that are located in the path and submit them to Huawei technical
support engineers. The command must be run by the oracle user.
NOTE
$ORACLE_BASE indicates the ORACLE_BASE set in the environment variables by the oracle user.
The field $ORACLE_SID indicates the name of the instance in use, such as i2kdb.
Example:
> $ORACLE_BASE/diag/rdbms/i2kdb/i2kdb/alert

No. 2: Connection logs
Command: > cd $ORACLE_BASE/diag/tnslsnr/${HOSTNAME}/listener/trace
Description: Save the files that are located in the path and submit them to Huawei technical
support engineers. The command must be run by the oracle user.
NOTE
The field ${HOSTNAME} indicates a host name, such as i2ksvr-1.
Example:
> $ORACLE_BASE/diag/tnslsnr/i2ksvr-1/listener/trace

No. 3: Admin configuration data
Command: > cd $ORACLE_HOME/network/admin
Description: Save the files that are located in the path and submit them to Huawei technical
support engineers. The command must be run by the oracle user.

No. 4: Initialization files of the database
Command: > cd $ORACLE_HOME/dbs
Description: Save the files that are located in the path and submit them to Huawei technical
support engineers. The command must be run by the oracle user.

No. 5: Database version
Command:
> sqlplus / as sysdba
SQL> select banner from sys.v_$version;
Description: Submit the query result to Huawei technical support engineers. The command must be
run by the oracle user.

No. 6: Memory usage of the database server
Command: # top > file2.txt
Description: Run the top command as the root user and submit the result to Huawei technical
support engineers.

No. 7: Database data export
Command: > exp system/password@i2kdb buffer=8092 full=y inctype=complete file=backup.dmp
Description: Back up the data exported from the database. The command must be run by the oracle
user.
NOTE
The field password indicates the password of the system user. The value varies according to the
actual situation.
The field i2kdb indicates the instance name of the eSpace EMS database.

No. 8: Port number, IP address, and host name of the current database
Command: > more $ORACLE_HOME/network/admin/listener.ora
i2kdb = (DESCRIPTION_LIST = (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = i2ksvr-1)(PORT = 1521)) ) )
Description: Obtain the value of PORT in the file, and then submit the result to Huawei technical
support engineers. The command must be run by the oracle user.

No. 9: Name of the current database instance
Command:
> sqlplus / as sysdba
SQL> select DB_UNIQUE_NAME from v$database;
Description: Submit the query result to Huawei technical support engineers. The command must be
run by the oracle user.

No. 10: Character set used in the current database
Command:
> sqlplus / as sysdba
SQL> show parameter nls_language
Description: Submit the query result to Huawei technical support engineers. The command must be
run by the oracle user.
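The host name and port requested in item 8 can be pulled out of listener.ora with standard text tools. The following is a minimal sketch, assuming that the file contains ADDRESS entries of the form shown in the table and no comment lines; the function name listener_endpoints is hypothetical.

```shell
# Hypothetical helper: prints "HOST PORT" for each ADDRESS entry found in
# listener.ora-style text read from stdin.
listener_endpoints() {
    # Join all lines, put each ADDRESS clause on its own line, then extract
    # the HOST and PORT values from every line that contains both.
    tr -d '\n' | sed 's/(ADDRESS/\
(ADDRESS/g' | sed -n 's/.*(HOST *= *\([^)]*\)).*(PORT *= *\([0-9]*\)).*/\1 \2/p'
}
```

A possible invocation is listener_endpoints < $ORACLE_HOME/network/admin/listener.ora, run as the oracle user. The sketch does not resolve the host name to an IP address.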
6.5 Collecting Logs


eSpace EMS Logs

Table 6-7 describes how the system collects logs when faults occur.
{install path} is the installation path of the eSpace EMS server. The default path is /opt/oms.

Security module
Log file path: {install path}/run/log/oms/sm/
author_*.log: Security authentication logs
nePermitGate_*.log: NE right gateway logs
sm_*.log: Main program logs

Alarm module
Log file path: {install path}/run/log/oms/fm/
fm_*.log: Main program logs
fmprobe_*.log: Collection layer logs
fmui_*.log: Alarm client logs
fmbackup_*.log: Alarm dump logs

Performance module
Log file path: {install path}/run/log/oms/pm/
pm_*.log: Logs related to performance monitoring templates, NE event processing, and view
monitoring
pmdata_*.log: Logs collected when performance data is saved to the database
pmds_*.log: DS layer logs
pmmeastype_*.log: Logs related to performance indicator instance management
pmprobe_*.log: Performance data collection logs
pmthreshold_*.log: Performance threshold management logs
pmui_*.log: Client operation logs in performance management

NE access module
Log file path: {install path}/run/log/oms/eam/
mimcache_*.log: Cache logs of the MIM
mim_*.log: NE management logs
iconmgr_*.log: NE icon processing logs
eam_*.log: Logs related to NE access operations such as NE lifecycle and type processing
eam_*.log: DS logs of the EAM
eam_*.log: Client operation logs related to NE access operations such as tree table refreshment

Topology module
Log file path: {install path}/run/log/oms/topo/
mapping_*.log: Logs related to topology object mapping processing
topo_*.log: DS layer logs related to topology operations such as right and domain allocation and
initialization of the data to be displayed on the client
topo_*.log: uiService logs, such as flex invocation Java errors
topomgr_*.log: Logs related to topology object management and alarm synchronization

Software management module
Log file path: {install path}/run/log/oms/swm/
ideploy_ui*.log: Running logs related to software management
Log file path: {install path}/run/log/oms/swm/task name
*.log: Execution logs of installation or upgrade tasks
NOTE
The task name is the name of the installation or upgrade task created in software management.

Message tracing module
Log file path: /opt/oms/run/log/omstrace/
trace_node_*.log: Logs related to interaction between the mediation node and the UOA
trace_app_*.log: Running logs of message tracing applications

MED module
Log file path: {install path}/run/log/oms/med/
med_*.log: MED framework logs and logs related to interaction between the MED and NEs over SNMP
or SOAP
ftp.server_*.log: Logs related to interaction between the MED and NEs over FTP
ftp.client_*.log: Logs related to interaction between the MED and NEs over FTP
ftp.med_*.log: Logs related to interaction between the MED and NEs over FTP
mml.med_*.log: Logs related to interaction between the MED and NEs over MML
mml.client_*.log: Logs related to interaction between the MED and NEs over MML
telnet.med_*.log: Logs related to interaction between the MED and NEs over Telnet
telnet.client_*.log: Logs related to interaction between the MED and NEs over Telnet
ssh.med_*.log: Logs related to interaction between the MED and NEs over SSH
ssh.client_*.log: Logs related to interaction between the MED and NEs over SSH

Northbound module
Log file path: {install path}/run/log/oms/nbi/
nbi_*.log: Running logs related to the northbound module, for example, forwarding alarms to the
upper NMS and performing tasks delivered by the upper NMS

Basic platform module
Log file path: {install path}/run/log/oms/core/
web.portal_*.log: Portal running logs
event_*.log: Event running logs
log.mgmt_*.log: Running logs of the tool used for dynamically changing log severities
task_*.log: Task running logs
sbus_*.log: sbus running logs
sbus.server_*.log: Running logs of the sbus server
sbus.heartbeat_*.log: sbus heartbeat check logs
ds.core.adapter_*.log: Running logs of the DS layer
fsm_*.log: Running logs of the file management module
persistence_*.log: Running logs of the persistence layer
sbus.client_*.log: Running logs of the sbus client
apache_*.log: Running logs of Tomcat
base_*.log: Running logs of the base module
cache_*.log: Running logs of the cache module

UC service log
Log file path: {install path}/run/log/uc/
snmptrap_*.log: Logs about sending and receiving trap messages between NEs and the eSpace EMS
through SNMP
cbm/*.log: Common functional module logs (such as cache, rotation, batch importing, and device
selection functions)
gs8/*.log: GS8 access and service logs
iad/*.log: IAD access and service logs
ippbx/*.log: IP PBX access and service logs
license/*.log: License management logs
other/*.log: NE detection, NE automatic access, and IP PBX/IAD backup and restoration logs
remotesupport/*.log: Remote maintenance logs
sftpclient/*.log: Log downloading logs
tr69/*.log: IP Phone/SBC/EGW NE access and service logs
ums/*.log: UMS NE access and service logs
vqm/*.log: NE voice quality monitoring logs
upgrade/*.log: NE upgrade logs

Startup log
Log file path: {install path}/run/log/virgo/
log.log: Startup logs
stop.exception.log: Startup failure logs

Garbage collection log
Log file path: {install path}/run/log/
gc.hprof.txt: Garbage collection logs
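When the logs of one module need to be submitted, the directory listed for that module can be packed into a single archive. The following is a minimal sketch, assuming that the module logs are stored under {install path}/run/log/oms/ as listed in Table 6-7 and that the tar command supports the -z option. The function name pack_module_logs and the archive name are hypothetical.

```shell
# Hypothetical helper: packs the log directory of one module into a tar archive.
# $1 = installation path (default in this guide: /opt/oms)
# $2 = module directory under run/log/oms (for example: fm, pm, med)
# $3 = archive file to create
pack_module_logs() {
    install_path=$1
    module=$2
    archive=$3
    log_dir="$install_path/run/log/oms/$module"
    if [ ! -d "$log_dir" ]; then
        echo "log directory not found: $log_dir" >&2
        return 1
    fi
    # -C keeps the paths inside the archive relative to the log root.
    tar -czf "$archive" -C "$install_path/run/log/oms" "$module"
}
```

For example, pack_module_logs /opt/oms fm /tmp/fm_logs.tar.gz packs the alarm module logs. The UC service, startup, and garbage collection logs are stored outside run/log/oms and are not covered by this sketch.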
6.6 Version Information

This topic describes the commands for querying the information about the versions of the
eSpace EMS.
1. Log in to the eSpace EMS server as the i2kuser user.
2. Run the following command to query the information about the versions of the eSpace
EMS:
> cd {install path}/run/config
> more oms.xml |grep "productVersion"
<param name="productVersion">V300R002C04</param>
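If only the version string is needed, for example for a fault report, it can be extracted from the matching line. The following is a minimal sketch, assuming that the productVersion parameter is written on one line as in the sample output; the function name product_version is hypothetical.

```shell
# Hypothetical helper: prints the value of the productVersion parameter found
# in oms.xml-style text read from stdin.
product_version() {
    sed -n 's/.*<param name="productVersion">\([^<]*\)<\/param>.*/\1/p'
}
```

With the default installation path, a possible invocation is product_version < /opt/oms/run/config/oms.xml.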

7 Troubleshooting Cases
About This Chapter

7.1 Filesync Exception
This topic describes how to handle an exception of the GDR file synchronization process
Filesync.
7.2 DataGuard Synchronization Exception
This topic describes how to handle a DataGuard synchronization exception.
7.3 GDR Process Exception
This topic describes how to handle a GDR process exception.
7.4 Modifying Information About the Master Node Corresponding to the Mediation Node
After Switching
In distributed deployment mode, you need to associate the Mediation node with the current
production machine after switching.
7.5 The Performance Data of Some Network Devices Cannot Be Collected on the eSpace
EMS.
The performance data of some network devices such as routers or switches cannot be
collected on the eSpace EMS.
7.6 Fault Rectification About IP PBX Performance Data Collection Status
This topic describes the methods to rectify faults about IP PBX performance data collection
status.
7.7 Fault Rectification in the File System
This topic describes the methods to rectify faults in the file system.
7.8 eSpace EMS Page Is Leftward Offset in IE 8.0
This topic provides the method to use when the eSpace EMS page is leftward offset in IE 8.0.
7.9 File Download Dialog Box Is Displayed After a Click on the Upload Icon
This topic describes the method to use when the File Download dialog box is displayed after
a click on the upload icon.

7.10 Failure to Export Data

7.11 Browser Page Cannot Be Properly Displayed or Some Browser Functions Are
Unavailable
7.1 Filesync Exception

This topic describes how to handle an exception of the GDR file synchronization process
Filesync.
Symptom
The drcli -s switchovercheck command fails after file synchronization is stopped or when the
Filesync process is synchronizing files.
Solution
The switching check fails if file synchronization has been stopped or if the Filesync process is
still synchronizing files. Perform the following operations to resume file synchronization:
 If file synchronization is stopped, run drcli -f resume on the production machine to start
file synchronization. After file synchronization is complete, run drcli -s switchovercheck again.
 If the Filesync process is synchronizing files, run drcli -f fullrep -l on the production
machine to start lightweight synchronization. After file synchronization is complete, run
drcli -s switchovercheck again.
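The choice between the two recovery commands can be written down as a small lookup. The following is a minimal sketch; the function name filesync_recovery_command and the state names stopped and syncing are hypothetical labels for the two cases described in this topic, and the function only prints the command instead of running it.

```shell
# Hypothetical helper: prints the drcli command that this topic gives for a
# Filesync state. $1 = "stopped" or "syncing" (state names are assumptions).
filesync_recovery_command() {
    case $1 in
        stopped) echo "drcli -f resume" ;;
        syncing) echo "drcli -f fullrep -l" ;;
        *) echo "unknown state: $1" >&2; return 2 ;;
    esac
}
```

Run the printed command on the production machine, wait until file synchronization is complete, and then run drcli -s switchovercheck.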
7.2 DataGuard Synchronization Exception

This topic describes how to handle a DataGuard synchronization exception.
Symptom
During a switchover or failover, the message DB synchronization has
disconnected is displayed. When users check the database synchronization status on the
production machine or redundancy machine, the value of Log_Dest_Status is Disconnected.
Solution
 If the value of Log_Dest_Status is Disconnected on the production machine:
1. Log in to the production machine as user oracle and run the following command:
> sqlplus / as sysdba
2. Check the status of LOG_ARCHIVE_DEST_2.
SQL> select dest_name,status from v$archive_dest_status where dest_id=2;
DEST_NAME STATUS
LOG_ARCHIVE_DEST_2 ERROR
If STATUS of LOG_ARCHIVE_DEST_2 is ERROR, an exception has occurred during synchronization
between the production machine and the redundancy machine.
3. Log in to the redundancy machine as user oracle.
4. Check whether the listener used for data synchronization has stopped on the redundancy
machine.
You can view the listener information in listener.ora under $ORACLE_HOME/network/admin.
According to the plan, the listener name of the database on the redundancy machine is omsdb.
> lsnrctl status omsdb
If the following information is displayed, the listener has stopped.
Connecting to
(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=float_ip_rep1)(PORT=1522)))
TNS-12541: TNS:no listener
TNS-12560: TNS:protocol adapter error
TNS-00511: No listener
Linux Error: 111: Connection refused
5. Start the listener on the redundancy machine.
> lsnrctl start omsdb
 If the value of Log_Dest_Status is Disconnected on the redundancy machine:
1. Log in to the redundancy machine as user oracle.
2. Check whether the TNS between the production machine and the redundancy machine is
normal.
You can view the TNS information in tnsnames.ora under
$ORACLE_HOME/network/admin. According to the plan, the TNS of the production
machine is omsdb.
> tnsping omsdb
If the following information is displayed, the TNS between the production machine and
the redundancy machine is normal.
TNS Ping Utility for Linux: Version 11.1.0.7.0 - Production on 24-OCT-2011 18:33:52
Copyright (c) 1997, 2008, Oracle. All rights reserved.
Used parameter files:

/opt/oracle/oradb/home/network/admin/sqlnet.ora
Used TNSNAMES adapter to resolve the alias

Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST =
10.85.178.87)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME =
omsdb)))
If the TNS between the production machine and the redundancy machine is abnormal,
check that the network connection between the two machines is normal.
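Steps 4 and 5 of the production-machine case (lsnrctl status followed by lsnrctl start on the redundancy machine) can be combined in a script. The following is a minimal sketch, assuming that a stopped listener always reports the TNS-12541 error shown in the sample output; the function name listener_stopped is hypothetical.

```shell
# Hypothetical helper: reads "lsnrctl status" output on stdin and returns 0 when
# it contains the TNS-12541 (no listener) error shown in the sample output.
listener_stopped() {
    grep -q 'TNS-12541'
}

# Possible usage on the redundancy machine as user oracle (omsdb is the name
# given in this topic):
#   if lsnrctl status omsdb 2>&1 | listener_stopped; then
#       lsnrctl start omsdb
#   fi
```

The check looks only for the error code, so other listener faults still need to be examined manually.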
7.3 GDR Process Exception

This topic describes how to handle a GDR process exception.
Symptom
The GDR process keeps restarting or is not running.

Solution
 If the GDR process on the production machine keeps restarting or is not running, collect
the configuration files and log files from {GDRWORKDIR}/config and {GDRWORKDIR}/log
respectively on the production machine. Then contact Huawei technical support.
 If the GDR process on the redundancy machine keeps restarting or is not running, collect
the files from {GDRWORKDIR}/config, {GDRWORKDIR}/config/i2000, and {GDRWORKDIR}/log
on the redundancy machine. Then contact Huawei technical support.
7.4 Modifying Information About the Master Node Corresponding to the Mediation Node
After Switching
In distributed deployment mode, you need to associate the Mediation node with the current
production machine after switching.
Procedure
Step 1 Log in to the Mediation node as user i2kuser.
Step 2 Modify the configuration file.
> vi {install path}/run/config/oms.xml
<config name="med">
  <config name="center">
    <param name="serverPort">31006</param>
    <param name="transportPackets">19998</param>
  </config>
  <config name="node">
    <param name="nodeId">Mediation_Masterself</param>
    <param name="centerIP">10.85.172.90</param>
    <param name="nodeIP">0.0.0.0</param>
    <param name="centerPort">31006</param>
    <param name="localPort">31007</param>
  </config>
Change the value of centerIP to the IP address of the current production machine.
Step 3 Restart the Mediation service.
> ./omsd.sh restart
----End
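The edit in Step 2 can also be made without an interactive editor. The following is a minimal sketch, assuming that the centerIP parameter is written as a single param element as in the excerpt above; the function name set_center_ip and the .bak backup suffix are hypothetical.

```shell
# Hypothetical helper: rewrites the centerIP parameter in an oms.xml-style file.
# $1 = configuration file, $2 = IP address of the current production machine.
set_center_ip() {
    file=$1
    new_ip=$2
    tmp_file="$file.tmp.$$"
    # Keep a backup so that the change can be rolled back.
    cp -p "$file" "$file.bak" || return 1
    sed "s|\(<param name=\"centerIP\">\)[^<]*\(</param>\)|\1$new_ip\2|" "$file" > "$tmp_file" || return 1
    # Overwrite in place so that the ownership and permissions of the file are kept.
    cat "$tmp_file" > "$file" && rm -f "$tmp_file"
}
```

After the file is changed, restart the Mediation service as described in Step 3.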
7.5 The Performance Data of Some Network Devices Cannot Be Collected on the eSpace EMS.
The performance data of some network devices such as routers or switches cannot be
collected on the eSpace EMS.
Symptom
The performance data of some network devices is not displayed in the performance
monitoring view and cannot be found in historical data.

Cause Analysis
The eSpace EMS obtains the performance data of network devices by running the SNMP Get
command. However, the SNMP access is disabled on the network devices for security reasons.
Therefore, the eSpace EMS cannot obtain the performance data by running the SNMP Get
command.
Solution
On the network devices, grant SNMP access rights to the eSpace EMS server. In a disaster
recovery network, grant SNMP access rights to both the production machine and the redundancy
machine.
For details, contact the device maintenance personnel.
7.6 Fault Rectification About IP PBX Performance Data Collection Status
This topic describes the methods to rectify faults about IP PBX performance data collection
status.
Symptom
 In the Monitoring Configuration window, the collection status is Abnormal.
 In the Monitoring View window, no performance data in the latest several data
collection periods is displayed.
Possible Causes
 The connection between the IP PBX and the eSpace EMS is abnormal.
 The IP PBX is upgrading or has been upgraded.
 The IP PBX is restarting or has been restarted.
 Boards on the IP PBX are restarting or have been restarted.
 The active/standby board switchover is being performed or has been performed on the IP
PBX.
Procedure
Step 1 Verify that the connection between the IP PBX and eSpace EMS is normal. If the connection
is abnormal, connect the IP PBX to the eSpace EMS correctly.
Step 2 In the system operation logs, check whether any user upgraded the IP PBX on the day when
the exception occurred.
1. Choose System > Log Management from the main menu.
2. Choose Query Logs > Operation Logs from the navigation tree on the left.
3. In the operation log list, check whether any user upgraded the IP PBX on the day when the
exception occurred.
 If no, go to Step 3.
 If yes, go to Step 4.
Step 3 View the IP PBX operation logs and check whether any user has restarted the IP PBX,
restarted boards, or performed an active/standby board switchover.
1. Choose Resource > Resource Management from the main menu.
2. In the Operation column of the device list, click .
The XXX Management window is displayed. In the window name, XXX indicates an
NE name.
3. Choose Manage Service > Operation Log from the navigation tree on the left.
4. In the operation log list, check whether any user has restarted the IP PBX, restarted boards,
or performed an active/standby board switchover.
Step 4 Restart the performance monitoring task.
1. Choose Performance > Monitoring Configuration from the main menu.
2. Select the performance counter whose collection status is Abnormal, click Stop, and
click Start.
 If the collection status is changed to Normal, the fault is rectified.
 If the collection status is still Abnormal, contact Huawei technical support engineers.
----End
7.7 Fault Rectification in the File System

This topic describes the methods to rectify faults in the file system.
Procedure
 Do not run the fsck command on a file system that is mounted. Otherwise, data will be lost.
 Ensure that the shared disk is not being used by other devices.
Step 1 Check whether the file system is mounted.
# mount
/dev/cciss/c0d0p2 on / type ext3 (rw,acl,user_xattr)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
debugfs on /sys/kernel/debug type debugfs (rw)
devtmpfs on /dev type devtmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,mode=1777)
devpts on /dev/pts type devpts (rw,mode=0620,gid=5)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
securityfs on /sys/kernel/security type securityfs (rw)
If the output contains a line in the format "device on mount point" for the file system, the file
system is mounted. Run the umount command to unmount the file system.
Step 2 Run the fsck -y command to check and restore the file system.
# fsck -y /dev/cciss/c0d0p2
fsck 1.38 (30-Jun-2005)
Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.

Bad nodes were found, Semantic pass skipped
1 found corruptions can be fixed only when running with --rebuild-tree
###########
reiserfsck finished at Wed May 27 15:47:08 2009
###########
fsck.reiserfs /dev/vgscp/lvscp failed (status 0x4). Run manually!
To check and restore the VxFS file system, run the fsck.vxfs command.
# fsck.vxfs -y /dev/sdb1
 If the system displays the message "passed", the checking and restoration complete.
After restarting, you can access the file system.
 If the restoration fails, the file system is damaged. Go to Step 3.
Step 3 Run the following command as prompted:
# fsck.reiserfs --rebuild-tree -y /dev/vgscp/lvscp
Step 4 If the file system cannot be restored, re-create a file system and use the backup data.
----End
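Because running fsck on a mounted file system causes data loss, the check in Step 1 can be automated before fsck is started. The following is a minimal sketch, assuming that the mount output uses the "device on mount point" format shown in Step 1; the function name is_mounted is hypothetical.

```shell
# Hypothetical helper: reads "mount" output on stdin and returns 0 when the
# given device appears as a mounted file system.
# $1 = device name, for example /dev/cciss/c0d0p2
is_mounted() {
    awk -v dev="$1" '$1 == dev && $2 == "on" { found = 1 } END { exit !found }'
}

# Possible usage before running fsck:
#   if mount | is_mounted /dev/cciss/c0d0p2; then
#       echo "unmount the file system before running fsck"
#   fi
```

The sketch compares the device name exactly as it is displayed by the mount command, so a device that is mounted under another name (for example, through a symbolic link) is not detected.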
Subsequent Processing
After restoration, check whether the file system status is normal.
Step 1 Run the tune2fs command to check the status of the ext2 or ext3 file system before mounting
the file system.
# tune2fs -l device name |grep state
# tune2fs -l /dev/sdb2 |grep state
Filesystem state: clean
If clean is displayed in the checking result, you do not need to perform further operations. Otherwise, go
to Step 2.
Step 2 Mount the file system and check logs in /var/log/messages.

If the logs contain no error messages, the file system is normal.
----End

7.8 eSpace EMS Page Is Leftward Offset in IE 8.0

This topic provides the method to use when the eSpace EMS page is leftward offset in IE 8.0.
Problem
When you use Internet Explorer 8.0 to download import templates in batches, the eSpace EMS page
is offset to the left, as shown in Figure 7-1.
Figure 7-1 Leftward-offset eSpace EMS page in IE 8.0
Cause
Internet Explorer is running in Internet Explorer 8.0 Compatibility View instead of the standard
Internet Explorer 8.0 mode.
Troubleshooting
1. Choose Tools > Developer Tools from the menu bar of the Internet Explorer.
The Developer Tools window is displayed.
2. Choose Browser Mode > Internet Explorer 8.0 from the menu bar, as shown in Figure
7-2.

Figure 7-2 Developer tool window
After the settings are complete, the eSpace EMS page is displayed normally.
7.9 File Download Dialog Box Is Displayed After a Click on the Upload Icon
This topic describes the method to use when the File Download dialog box is displayed after
a click on the upload icon.
Problem
Step 1 Click next to Resource file to import on the batch import page, and select an Excel file.
Step 2 Click . The File Download dialog box is displayed, as shown in Figure 7-3.
Figure 7-3 File Download dialog box
----End
Cause
 The selected Excel file does not match the template. For example, this problem occurs if
you select an IAD template on the Import IP PBX page.

 Extension ACTION is associated with an incorrect file type.

----End
Solution
Step 1 Close the File Download dialog box and the NE Management tab page.
Step 2 Click My Computer, and choose Tools > Folder Options from the main menu on the
displayed My Computer page.
Step 3 Click the File Types tab.
Step 4 Select extension ACTION, and click Delete.
----End
7.10 Failure to Export Data

Symptom
 When you attempt to export data, such as the current or historical alarm information and
signaling tracing data, the system displays a message shown in Figure 7-4.
Figure 7-4 Interception information
 Exporting the file failed.
Possible Causes
The automatic prompt function for downloading files is disabled.
Procedure
Step 1 Start the Internet Explorer.
Step 2 Choose Tools > Internet Options > Security > Custom Level from the main menu.

Figure 7-5 Internet options

Step 3 Under Downloads, select Enable for Automatic prompting for file downloads.
Figure 7-6 Security settings-Internet zone
Step 4 Click OK.

Step 5 Restart the Internet Explorer and log in to the eSpace EMS client. The fault is rectified.
----End
7.11 Browser Page Cannot Be Properly Displayed or Some Browser Functions Are Unavailable
Symptom
After you log in to the eSpace EMS client using a browser, the page cannot be properly
displayed or some functions on the page are unavailable. For example, slots are left
blank in the IP PBX device panel.

Possible Causes
The browsing history has not been cleared.
Procedure
Step 1 Clear the browsing history.
 Internet Explorer 8.0
1. Choose Tools > Internet Options from the main menu.
2. Click the General tab and click Delete.

Figure 7-7 Internet options

3. Click Delete in the Delete Browsing History dialog box.
Figure 7-8 Deleting the browsing history
 Firefox 3.6 browser

1. Choose Tools > Options from the main menu.
2. Click Privacy in the displayed Options dialog box.

Figure 7-9 Options
3. Click Private Data. In the displayed dialog box, click Clear Now.
----End


