IBM Tivoli Network Manager IP Edition 3.9 Best Practices v1.0
Note: Before using this information and the product it supports, read the information in Notices located at the end of this document.
Copyright IBM Corporation 2011, 2012. US Government Users Restricted Rights-Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents
About this publication
  Intended audience
  What this publication contains
  Conventions used in this publication
  Related publications
  Will discovery remove devices no longer in the network?
  Partial rediscovery will not work with file finder only
  Why has the discovery process started again immediately after discovery completion?
Chapter 3 Polling
  Overview
  Typical customer configurations
  Poller basics
  What devices should I be polling
  Ping (ICMP) versus SNMP impact on polling
  Thresholding
  MIB graphing
  Polled data storage
  Built-in device and interface polling capabilities
  Multiple polling considerations
  Don't fall behind in polling
  Time out and retry
  Strategy for LAN connected (fast responders) versus WAN connected (slower responders) polling
  Tuning poller threads
  Polling intervals
  When to add another poller
  Enhanced procedure for the creation of a new poller instance
Notices
  Trademarks
Intended audience
This publication is intended as essential reading for all technical staff who are responsible for:
- Developing IBM Tivoli Network Manager IP Edition
- Installing and administering IBM Tivoli Network Manager IP Edition
- Supporting IBM Tivoli Network Manager IP Edition
Typeface conventions
This publication uses the following typeface conventions:
Bold
- Lowercase commands and mixed case commands that are otherwise difficult to distinguish from surrounding text
- Interface controls (check boxes, push buttons, radio buttons, spin buttons, fields, folders, icons, list boxes, items inside list boxes, multicolumn lists, containers, menu choices, menu names, tabs, property sheets), labels (such as Tip: and Operating system considerations:)
- Keywords and parameters in text
Italic
- Citations (examples: titles of publications, diskettes, and CDs)
- Words defined in text (example: a nonswitched line is called a point-to-point line)
- Emphasis of words and letters (words as words example: "Use the word that to introduce a restrictive clause."; letters as letters example: "The LUN address must start with the letter L.")
- New terms in text (except in a definition list): a view is a frame in a workspace that contains data
- Variables and values you must provide: ... where myname represents....
Monospace
- Examples and code examples
- File names, programming keywords, and other elements that are difficult to distinguish from surrounding text
- Message text and prompts addressed to the user
- Text that the user must type
- Values for arguments or command options
Related publications
IBM Tivoli Network Manager IP Edition Administration Guide
http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/nmip_adm_pdf_39.pdf
Describes administration tasks for IBM Tivoli Network Manager IP Edition, such as how to administer processes, query databases and start and stop the product. This publication is for administrators who are responsible for the maintenance and availability of IBM Tivoli Network Manager IP Edition.
IBM Tivoli Network Manager IP Edition Discovery Guide
http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/nmip_dsc_pdf_39.pdf
Describes how to use IBM Tivoli Network Manager IP Edition to discover your network. This publication is for administrators who are responsible for configuring and running network discovery.
IBM Tivoli Network Manager IP Edition Event Management Guide
http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/nmip_poll_pdf_39.pdf
Describes how to use IBM Tivoli Network Manager IP Edition to poll network devices, to configure the enrichment of events from network devices, and to manage plug-ins to the Tivoli Netcool/OMNIbus Event Gateway, including configuration of the RCA plug-in for root-cause analysis purposes. This publication is for administrators who are responsible for configuring and running network polling, event enrichment, root-cause analysis, and Event Gateway plug-ins.
IBM Tivoli Network Manager IP Edition Getting Started Guide
http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/nmip_gs_pdf_39.pdf
Describes how to set up IBM Tivoli Network Manager IP Edition after you have installed the product. This guide describes how to start the product, make sure it is running correctly, and discover the network. This guide describes how to configure and monitor a first discovery, verify the results of the discovery, configure a production discovery, and how to keep the network topology up to date. Once you have an up-to-date network topology, this guide describes how to make the network topology available to network operators, and how to monitor the network. The essential tasks are covered in this short guide, with references to the more detailed, optional, or advanced tasks and reference material in the rest of the documentation set.
IBM Tivoli Network Manager IP Edition Product Overview
http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/nmip_ovr_pdf_39.pdf
Gives an overview of IBM Tivoli Network Manager IP Edition. It describes the product architecture, components and functionality.
Chapter 1 Introduction
The IBM Tivoli Network Manager IP Edition 3.9 architecture consists of the following functional areas:
Network discovery
Network discovery involves discovering your network devices, determining how they are connected (network connectivity), and determining which components each device contains (containment). The complete set of discovered devices, connectivity, and containment is known as a network topology. You build your network topology by performing a discovery and then ensuring that you always have an up-to-date network topology by means of regular rediscoveries.
Network polling
Network polling determines whether a network device is up or down, whether it has exceeded key performance parameters, or whether links between devices are faulty. If a poll fails, Network Manager generates a device alert, which operators can view in the Active Event List.
Topology storage
Network topology data is stored in the Network Connectivity and Inventory Model (NCIM) database. The NCIM database is a relational database that consolidates topology data discovered by Network Manager. The NCIM database can be implemented using any one of the following relational database management systems: DB2, IDS, MySQL, and Oracle.
Event enrichment
Event enrichment is the process by which Network Manager adds topology data to events, thereby enriching the event and making it easier for the network operator to analyze. Examples of topology data that can be used to enrich events include system location and contact information.
Root-cause analysis
Root cause analysis is the process of determining the root cause of one or more device alerts. Network Manager performs root cause analysis by correlating event information with topology information. The process determines cause and symptom events based on the discovered network device and topology data.
Event storage
Event data is generated by Network Manager polls and also by Tivoli Netcool/OMNIbus probes installed on network devices. A probe is a protocol or vendor specific piece of software that resides on a device, detects and acquires event data from that device, and forwards the data to the ObjectServer as alerts. Event data can also be received from other event sources. Event data from all of these event sources is stored in the Tivoli Netcool/OMNIbus ObjectServer. Note: Tivoli Netcool/OMNIbus is a separate product. If you do not already have OMNIbus then you must get a copy and install it. For more information, see the Network Manager installation documentation.
At any time a network administrator can set up polling of specific SNMP and ICMP data on one or more network devices. This data is stored in the NCPOLLDATA historical polled data database. By default, Network Manager implements the NCPOLLDATA database using a database schema within the NCIM database. You can optionally integrate Network Manager with IBM Tivoli Monitoring 6.2, with the integrated Tivoli Data Warehouse, to provide extra reporting capabilities, including better report response times, capacity, and isolation of the operational database (NCIM) from unpredictable reporting traffic.
Topology visualization
Network operators can use several topology visualization GUIs to view the network and to examine network devices. Using these GUIs operators can switch between topology views to explore connectivity or associations, and to see alert details in context. Operators also have access to diagnostic tools such as SNMP MIB Browser, which obtains MIB data for devices.
Event visualization
Operators can view event lists and use alert severity ratings to quickly identify high-priority device alerts. Operators can switch from event lists to topology views to see which devices are affected by specific alerts. They can also identify root-cause alerts and list the symptom alerts that contribute to the root cause.
Reporting
Network Manager provides a wide range of reports, including performance reports, troubleshooting reports, asset reports, and device monitoring reports. Right-click tools provide immediate access to reports from topology maps.
This Best Practices Guide covers the areas of network discovery, network polling, event enrichment, and root-cause analysis.
Chapter 2 Discovery
Overview
Discovery is the first and most important task after Network Manager has been installed. The more complete and accurate the topology, the more value you will gain from root cause analysis (RCA) and the faster you can troubleshoot reported network problems. An accurate topology provides a solid base on which to build out your network management solution for proactive and reactive monitoring and asset reporting. This section will help you set up your discovery efficiently and effectively, start and monitor discovery, and show you how to verify it and fix issues afterwards. The illustration below provides an overview of the discovery process and how assets, finders, filters, agents, stitchers (and so on) fit together in the process.
For a complete description of the discovery process, see the section About discovery in the IBM Tivoli Network Manager IP Edition Getting Started Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/overview/concept/nmip_ovr_disco.html
For standard definitions of the discovery phases, see the section Understanding discovery phases in the IBM Tivoli Network Manager IP Edition Getting Started Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/start/task/nmip_dsc_understandingdscphases.html
Interrogating devices
This phase consists of finding as many IP addresses as possible via the pinger and file finder. Once an IP address has been found, an SNMP query for the sysObjectId is made and the result is placed in the Details.returns table. Other IP addresses on the device are found from the SNMP ipAddress table and recorded in the translations.ipToBaseName table. Entries that match the pre-discovery filter are then distributed to the various agent tables. Each agent has a filter in the <agent>.agnt file to determine which devices to query. The results from each agent are accumulated in the <agent>.returns tables.
This phase ends when a period of time, by default 90 seconds, passes in which no new IP addresses are discovered. At that point what is called a blackout period begins. During the blackout period the rest of the discovery progresses normally, but any new IP addresses that are found are held until the end of discovery. The pingers may still be working, especially if they are ping sweeping sparse class B sized subnets. Any new IP addresses are placed in finders.pending until the rest of the phases are complete, at which point the discovery process restarts for these addresses.
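The quiet-period rule can be pictured with a small model. The Python sketch below is illustrative only and is not Network Manager code: the function name, the sample timestamps, and the choice to measure the gap from the last discovered address are assumptions. It models only the rule that the phase ends after a quiet interval (90 seconds by default) and that later addresses are held back, as finders.pending does.

```python
def split_at_quiet_period(arrivals, quiet_seconds=90.0):
    """Split (time, address) arrivals into those handled before the
    blackout period starts and those held back as pending.

    `arrivals` must be sorted by time, in seconds since the phase began.
    Assumption: the quiet interval is measured from the last new address.
    """
    processed, pending = [], []
    last_found = 0.0      # time of the most recent new address
    blackout = False
    for found_at, address in arrivals:
        # A gap of at least `quiet_seconds` ends the interrogation phase.
        if not blackout and found_at - last_found >= quiet_seconds:
            blackout = True
        if blackout:
            pending.append(address)      # held until discovery completes
        else:
            processed.append(address)
            last_found = found_at
    return processed, pending


# Hypothetical arrival times: a slow ping sweep reports two late addresses.
sample = [
    (5, "192.0.2.1"),
    (40, "192.0.2.2"),
    (110, "192.0.2.3"),
    (260, "192.0.2.4"),
    (265, "192.0.2.5"),
]
processed, pending = split_at_quiet_period(sample)
print("processed:", processed)
print("pending:  ", pending)
```

In this sample the 150-second gap before the fourth address exceeds the quiet interval, so the last two addresses would be picked up only in a follow-on discovery cycle.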
Resolving addresses
This phase is responsible for gathering the MAC/IP information using the ArpCache agent.
Downloading connections
Now that you have the MAC information, agents query the layer 2 switches for connectivity data. Once these agents have worked through their queues and populated the <agent>.returns tables, this phase ends.
Correlating connectivity
At this point, discovery has completed querying the network and begins analyzing the data to build the connectivity layers and containership relationships. This data moves through a set of tables and is finally consolidated in ncp_model's master.entityByName. ncp_model then moves the data into the NCIM topology database. The connection information is built up as layers from the agent information in the following tables (among others):
- IPLayer.entityByNeighbor
- switchTopology.entityByNeighbor
- CDPLayer.entityByNeighbor
These layers are consolidated in fullTopology.entityByNeighbor. Knowing this is useful for Support engineers who need to understand why a connection error occurred.
General Discovery
- Use the standard Network Manager 3.9 documentation as a reference point
- Plan your network discovery in a phased approach; get a strong baseline of your network first
Scopes
- Define the parts of the network that you want to discover
- If your subnet is sparsely populated, including individual routers is likely to result in a faster discovery
- Initially only discover a limited number of layer 2 and 3 devices and don't run in DEBUG mode
- Break up large networks into smaller manageable groups
Seeds
- Use noisy routers as seeds, as these make good initial seeds
- Use file and ping finders to intelligently seed your network
- To restrict discovery, seed with a list of devices using the File finder or the Ping finder, and disable feedback in the Advanced tab
Agents
- Focus on an initial reduced set of agents that meet your requirements
- Add additional agents for specific requirements (e.g. TraceRoute)
Filters
- Use pre-discovery filters to filter out end nodes, printers, and similar devices
- Use pre-discovery filters to filter out sensitive nodes that you do not want to monitor
- Use post-discovery filters to prevent instantiation of devices
Helpers
- Only modify the standard helpers if you are an experienced user
- To speed up the discovery process, you could reduce the helper timeouts and number of retries
- If you have a very reliable network in which devices respond quickly, you can specify a small default timeout
- To reduce the amount of network traffic caused by a discovery, you could increase the timeout and disable broadcast and multicast pinging

- Use OQL to determine where in the process discovery has halted
- Determine what state agents are in
- Run DEBUG on suspected failing components
- Generate a list of discovered IP addresses to make the discovery more efficient on a regular basis
- Increase number of threads on heavily used agents
Excluded IPs/subnets: List the IP addresses that are in scope but should be excluded, e.g. interfaces being monitored for DoS.
Network technologies: Briefly describe the networking technologies in place, e.g. MPLS, ATM/FR, metro Ethernet.
Number of main nodes and interfaces: An approximate number.
List of network devices: List the device types to be monitored, e.g. Cisco 2601, Juniper M5. Some devices may have specific discovery requirements.
NAT description: If it is a NAT environment, is it static or dynamic? Which vendor?
Security: Are there any ACLs / firewalls / security? List the community strings of all devices in the discovery scope. RO required for some Cisco Catalysts. Sometimes required in some instances for MPLS discovery (Cisco, Juniper, Huawei). Are any devices in an out-of-band management network? This information is needed to allow authentication of all devices and to ensure proper access to devices.
Miscellaneous: How do you want to name the devices (/etc/hosts, DNS, or sysName), and which of ifName / ifDescr / ifIndex do you want for the interface naming? Do you want the loopback address to be the main node IP address? Are there any firewalls that can never be discovered?
in the IBM Tivoli Network Manager IP Edition Getting Started Guide, or online at http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/start/task/nmip_dsc_settingdscboundaries.html
IPv4/IPv6
Network Manager does not support the IPv4-mapped IPv6 format and expects all IPv6 addresses to be in standard colon-separated IPv6 format. For example, Network Manager does not support an IPv4-mapped IPv6 address such as ::ffff:192.0.2.128. Instead, enter this address as ::ffff:c000:280 (standard colon-separated IPv6 format).
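If you hold addresses in the mixed notation, a short script can rewrite them before they are entered. The sketch below is not part of Network Manager. It assumes the example address is written in full as ::ffff:192.0.2.128, and the function name is hypothetical. It uses Python's standard ipaddress module only to parse the address and builds the colon-separated form by hand, so the result does not depend on how a given Python version prints mapped addresses.

```python
import ipaddress


def to_colon_hex(address: str) -> str:
    """Return an IPv6 address in colon-separated hexadecimal form,
    compressing the longest run of zero groups with '::'."""
    value = int(ipaddress.IPv6Address(address))
    # Eight 16-bit groups, most significant first.
    groups = [(value >> shift) & 0xFFFF for shift in range(112, -1, -16)]

    # Locate the longest run of consecutive zero groups.
    best_start, best_len = -1, 0
    i = 0
    while i < 8:
        if groups[i] == 0:
            j = i
            while j < 8 and groups[j] == 0:
                j += 1
            if j - i > best_len:
                best_start, best_len = i, j - i
            i = j
        else:
            i += 1

    hexes = [format(group, "x") for group in groups]
    if best_len < 2:
        return ":".join(hexes)   # nothing worth compressing
    head = ":".join(hexes[:best_start])
    tail = ":".join(hexes[best_start + best_len:])
    return head + "::" + tail


print(to_colon_hex("::ffff:192.0.2.128"))
```

The four IPv4 octets 192, 0, 2, 128 are c0, 00, 02, 80 in hexadecimal, which is why the last two groups become c000 and 280.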
Seeds
The pingFinder should be enabled if you want the discovery engine to find devices as a result of querying other devices - referred to as feedback. The pingFinder will ping sweep subnets, while the fileFinder provides a convenient way to seed the discovery with one or more lists of IP addresses. There are options that let you limit discovery strictly to a list of IP addresses (either in a file or pingFinder entries) if required. Keep in mind that seeds only work within the confines of the scope. Ping sweeping large class B subnets is less effort to configure but also less efficient than fileFinder lists. After successfully discovering your managed network, you may want to consider generating a list of discovered IP addresses to make discovery more efficient on a regular basis in production. New devices will still need to be discovered since feedback is enabled by default.
Noisy routers
Noisy routers are good initial seeds
Case 1 complex seed (manipulated scope)
Objective: sweep 192.168.0.0/16, splitting the scope into smaller seed networks (e.g. /19)
Threads = 8
Number of devices = 5000
Addresses to ping = 256 * 256 = 65536
Pings per second = 10
Number of retries = 1
Time taken (seconds) = (Addresses to ping / Number of threads / Pings per second) + ((Addresses to ping - Number of devices) * Number of retries / Number of threads / Pings per second) = (65536 / 8 / 10) + ((65536 - 5000) * 1 / 8 / 10) = 1576 (< 0.5 hours)

Threads = 1
Number of devices = 5000
Addresses to ping = 256 * 256 = 65536
Pings per second = 10
Number of retries = 1
Time taken (seconds) = (Addresses to ping / Number of threads / Pings per second) + ((Addresses to ping - Number of devices) * Number of retries / Number of threads / Pings per second) = (65536 / 1 / 10) + ((65536 - 5000) * 1 / 1 / 10) = 12607 (~ 3.5 hours)
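The two estimates above can be reproduced with a few lines of Python. This is a sketch of the arithmetic only. The function and variable names are illustrative, and the formula assumes, as the worked figures do, that the addresses that did not answer (addresses minus devices) are pinged once more per retry. The last lines use the standard ipaddress module to list the /19 networks that the /16 scope splits into.

```python
import ipaddress


def sweep_seconds(addresses, devices, threads, pings_per_second, retries):
    """Estimated ping sweep time in seconds, following the formula above."""
    first_pass = addresses / threads / pings_per_second
    # Only the addresses that did not respond are retried.
    retry_pass = (addresses - devices) * retries / threads / pings_per_second
    return first_pass + retry_pass


scope = ipaddress.ip_network("192.168.0.0/16")
addresses = scope.num_addresses          # 256 * 256

for threads in (8, 1):
    seconds = sweep_seconds(addresses, devices=5000, threads=threads,
                            pings_per_second=10, retries=1)
    print(f"threads={threads}: {seconds:.0f} s ({seconds / 3600:.2f} h)")

# Splitting the /16 scope into /19 seed networks.
seed_networks = list(scope.subnets(new_prefix=19))
print(len(seed_networks), "seed networks, starting with", seed_networks[0])
```

Changing the thread count, ping rate, or retry count in the call shows how sensitive the sweep time is to each setting before you commit to a seed layout.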
Ping finders
Any successful ping response results in an object being created and in further SNMP queries, which may result in other IP addresses being found from the routing table or ipNetToMedia table. This is known as feedback and can be controlled from the Advanced tab.
For example-based tutorial steps on how to configure the Ping finder, see the section Enabling the Ping finder and the feedback mechanism in the IBM Tivoli Network Manager IP Edition Getting Started Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/start/task/nmip_dsc_enablingpingfinder.html
You can specify the seeds as pingFinders, which define IP addresses to ping or subnet addresses to ping sweep. Noisy routers make good initial seeds.
Note: Be conscious of the size of the subnet you specify, especially with IPv6 subnets. Class B networks can take 30-40 minutes to ping, and class A networks a day or more. If these subnets are sparse, this can lead to long periods of silence, and the discovery will conclude that it is done and continue with the final stages to completion. However, the pinger continues, and as more ping responses come in they cause the discovery to begin a new cycle. It is designed this way to ensure the fullest possible discovery, but it does make it tricky to know when the discovery is really complete. IPv6 subnets should have a mask of at least 96 bits.
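To see why the mask length matters, it helps to count the addresses that a ping sweep would have to cover. The following sketch uses only Python's standard ipaddress module. The subnets are documentation-style examples chosen for illustration, not values from any product configuration.

```python
import ipaddress


def sweep_size(cidr: str) -> int:
    """Number of addresses contained in a subnet given in CIDR notation."""
    return ipaddress.ip_network(cidr).num_addresses


# Illustrative subnets: a class B sized and a class A sized IPv4 network,
# and IPv6 networks with 96-bit and 64-bit masks.
for cidr in ("192.168.0.0/16", "10.0.0.0/8", "2001:db8::/96", "2001:db8::/64"):
    print(f"{cidr:>18}  {sweep_size(cidr):,} addresses")
```

An IPv6 subnet with a 96-bit mask already contains 2^32 addresses, as many as the entire IPv4 address space, and every bit removed from the mask doubles that number.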
File finders
You can also create fileFinder entries which specify files containing lists of IP addresses. Thanks to the formatting flexibility, you can use existing files you might maintain outside of Network Manager and thus provide a basis for provisioning control. You must specify the delimiter and the column number for the IP address. Network Manager can extract just the IP address, or both the IP address and the corresponding name, which will be used as the display name for that device. Note that the delimiter is a regular expression, so the default [ ]+ indicates one or more spaces.
For example-based tutorial steps on how to configure the File finder for efficient production seed settings, see the section Configuring production discovery settings in the IBM Tivoli Network Manager IP Edition Getting Started Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/start/task/nmip_dsc_settingupdscconfigurationforproduction.html
Note: Be careful when formatting your Finder files. The IP address must be cleanly defined with no leading characters or spaces; otherwise it will not be used. Check ncp_df_file.<DOMAIN>.log for syntax errors.
Since fileFinders are more efficient for discovery, many users tend to use them in production. Continue to keep the PingFinder box checked, even if you have no pingFinder entries, so that new devices added to the network will be discovered.
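Before pointing a File finder at an existing file, it can be useful to check the file offline. The Python sketch below is not the product's parser. It only imitates the behaviour described above (a regular-expression delimiter and a column number for the address), and the function name, the 1-based column numbering, and the sample lines are assumptions. It also shows why a leading space is a problem: splitting on the default delimiter then yields an empty first field.

```python
import ipaddress
import re


def parse_seed_line(line, delimiter=r"[ ]+", ip_column=1, name_column=None):
    """Return (ip, name) for a usable seed line, or None if the address
    column does not hold a clean IP address.

    Columns are numbered from 1 here, which is an assumption of this sketch.
    """
    fields = re.split(delimiter, line.rstrip("\n"))
    try:
        raw_ip = fields[ip_column - 1]
        ipaddress.ip_address(raw_ip)     # raises ValueError if not an address
    except (IndexError, ValueError):
        return None
    name = None
    if name_column is not None and len(fields) >= name_column:
        name = fields[name_column - 1]
    return raw_ip, name


# Hypothetical seed lines.
print(parse_seed_line("192.0.2.10 core-rtr-01", name_column=2))
print(parse_seed_line(" 192.0.2.10 core-rtr-01", name_column=2))  # leading space
print(parse_seed_line("core-rtr-01,192.0.2.10", delimiter=",",
                      ip_column=2, name_column=1))
```

Running a check like this over the whole file gives a quick list of lines to clean up before discovery, in addition to reviewing ncp_df_file.<DOMAIN>.log afterwards.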
Agents
Full discovery agents
The default set of agents is typically a good starting point for assessing your specific needs. Click on each agent to see an explanation that will give you an idea of whether you will benefit from it.
For example-based tutorial steps on how to configure agents for a discovery, see the section Ensuring all network technologies are covered in the IBM Tivoli Network Manager IP Edition Getting Started Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/start/task/nmip_dsc_ensuringnwtechnologiesarecovered.html
Focusing on a core set of agents will allow you to collect the essential data for your network and speed up the overall discovery time. When running a layer 3 discovery, the Details and AssocAddress agents are run along with a combination of the following IP layer agents:
- IpRoutingTable
- IpBackupRoutes
- IPForwardingTable
- HSRP
- VRRP
- TraceRoute: this agent can be used if there is a firewall on the network, because SNMP calls cannot always be made through firewalls. If you use the TraceRoute agent, you must specify, as a discovery seed, the subnet node for the subnet on the other side of the firewall.
- IPv4/6 InetRouting: if you have IPv6 in your network, consider running this agent to discover the connectivity, particularly the IPv6 connectivity.
Some routers support layer 2 technologies. For example, when an ATM card is located in a router chassis, layer 3 discovery agents, such as the IpRoutingTable agent, only discover interfaces with an IP address. Therefore, to fully discover all the interfaces on routers that support layer 2 technologies, you must run the appropriate agents.
The ArpCache agent retrieves the physical address of a device, so it is only required (in conjunction with the Switch agents) when performing layer 2 discoveries.
Frame Relay agents should be run in conjunction with the IP layer agents if you need to add DLCI information to the interfaces of Frame Relay devices.
Switch agents must be run for a layer 2 discovery.
The Entity agent retrieves physical containment information from the Entity MIB, which is used for asset reporting and root cause analysis. It is resource intensive: it extends the discovery time and can create additional entity objects in the database. Some of the asset reports, and the Structure Browser (which shows containment information such as modules, cards, and slots), require information provided by the Entity and OSinfo agents. These two agents are disabled by default.
By default the IpRoutingTable agent is enabled. It learns about other IP addresses and subnets to feed back into the discovery. Alternatively, you can enable the IpBackupRoutes agent, which learns about IP addresses from the ipNetToMedia (ARP) table instead.
Note: If you plan on using the Partial Discovery feature, you will need to enable the checkbox Enable Caching of Discovery Tables on the Advanced tab. This stores the OQL caches to disk and enables you to continue doing partial discoveries after a restart of ncp_disco. The Ping finder on the Seeds tab also needs to be enabled (even if there are no ping finder entries).
Note: You can trigger a partial discovery by creating entries in the Disco OQL table finders.rediscovery. This can be done using a script and scheduled with cron to update a volatile region of your network more frequently.
Filters
Prediscovery filters
Prediscovery filters prevent the discovery from retrieving detailed data or connectivity data from the device and prevent discovered devices from being polled for connectivity information. Only devices matching the prediscovery filter are fully discovered. If no prediscovery filter is defined, then all devices within the scope are discovered.
For example-based tutorial steps on how to tune a scope zone using a prediscovery filter, see the section Fine tuning a subnet scope zone using prediscovery filters in the IBM Tivoli Network Manager IP Edition Getting Started Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/start/task/nmip_dsc_creatingcomplexscopes.html
Prediscovery filters provide a mechanism to base discovery on complex IP ranges that cannot be easily defined in the Scope tab. They can also be used to filter out devices based on their sysObjectId value. Default filters exist to filter out end nodes, printers, and similar devices. You can create quite complex multiple filters, which makes this feature very powerful, but try to ensure that filters are designed so that they can be easily maintained if you need to add new scopes. The filter acts on the fields of the Details.returns OQL table in the discovery (Disco) service, so you can use fields other than IP addresses, such as m_ObjectId (equivalent to sysObjectId). A device must pass all filters to be discovered. Search InfoCenter for Using Regular Expressions for detailed usage.
Note: A simple way to test your regular expression syntax is to use it in an OQL command in the Disco service, such as:
    select * from Details.returns where m_UniqueAddress LIKE '10\.30\..*\.[1-5]$'
This will result in a list of all the IP addresses discovered in that range. If you do an initial discovery without excluding these addresses, you can use this to test your syntax.
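The same pattern can also be tried offline against a list of addresses before it goes anywhere near the Disco service. The sketch below uses Python's standard re module, which is an assumption of convenience: the OQL LIKE operator may not treat every construct, or anchoring, exactly as Python does, so the OQL query above remains the authoritative test. The sample addresses are made up.

```python
import re

# The pattern from the OQL example above.
PATTERN = re.compile(r"10\.30\..*\.[1-5]$")


def matching(addresses):
    """Return the addresses that the pattern matches anywhere in the string."""
    return [address for address in addresses if PATTERN.search(address)]


sample = [
    "10.30.4.3",      # last octet within 1-5
    "10.30.200.5",    # last octet within 1-5
    "10.30.4.6",      # last octet outside 1-5
    "10.30.4.15",     # last octet has two digits
    "10.31.4.3",      # different second octet
    "110.30.4.1",     # contains "10.30." but does not start with it
]
print(matching(sample))
```

With an unanchored search the address 110.30.4.1 is also reported, because the pattern has no leading ^. If that is not what you intend, starting the expression with ^ makes the intent explicit.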
Sensitive nodes
You might want to apply a filter to sensitive devices that you do not want to poll. A device might be considered sensitive because there is a security risk involved in polling the device, or because polling might cause the device to overload.
Post discovery
You might want to apply this filter to devices that you do not want to poll, such as workstations and printers. A post-discovery filter restricts device instantiation. If a post-discovery filter is defined, only devices that pass the criteria are instantiated, that is, sent to ncp_model. If no post-discovery filter is defined, then all discovered devices are sent to ncp_model.
Helpers
The helpers are specialized applications that retrieve information from the network on demand. The default helper configuration is sufficient for most networks. However, you might decide to alter the configuration for several reasons. Configuring the Helper System can speed up network discovery, but is recommended only for experienced users.
Although the discovery agents retrieve connectivity information, they do not have any direct interaction with the network. Instead, they retrieve connectivity information through the Helper System, which consists of a Helper Server and various helpers.
Reasons to configure the helpers include:
- To speed up the discovery process, you could reduce the helper timeouts and number of retries. If you have a very reliable network in which devices respond quickly, you can specify a small default timeout.
- You might want to change the default timeouts for the SNMP and Telnet helpers if you have many devices that either do not respond to SNMP and Telnet or that are set up not to respond to Telnet or SNMP access. A large default timeout would mean that the helpers wait for a long time for responses they never receive.
- To reduce the amount of network traffic caused by a discovery, you could increase the timeout and disable broadcast and multicast pinging.
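The cost of a large timeout on unresponsive devices can be estimated with simple arithmetic. The sketch below is an illustration only. It assumes that each unanswered request waits for the full timeout, that the first attempt is followed by the configured number of retries, and that requests are handled one at a time unless parallel_requests is raised. The function name, its parameters, and the sample figures are hypothetical and are not Network Manager settings.

```python
def worst_case_wait_seconds(unresponsive_devices, timeout_s, retries,
                            parallel_requests=1):
    """Estimate time spent waiting on devices that never answer.

    Assumption: every attempt (1 initial + `retries`) waits the full timeout.
    """
    attempts = retries + 1
    per_device = timeout_s * attempts
    return unresponsive_devices * per_device / parallel_requests

if __name__ == "__main__":
    # Hypothetical figures: 200 silent devices.
    slow = worst_case_wait_seconds(200, timeout_s=3, retries=3)
    fast = worst_case_wait_seconds(200, timeout_s=1, retries=1)
    print(f"timeout 3 s, 3 retries: {slow:.0f} s")
    print(f"timeout 1 s, 1 retry:   {fast:.0f} s")
```

Under these assumptions, the shorter timeout and single retry cut the waiting time by a factor of six, which is the kind of gain the first two reasons above are aiming for.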
Passwords
Use the Passwords tab in the Discovery Configuration GUI to specify the SNMP community strings, including SNMPv3 credentials, used in your network. For example-based tutorial steps on how to configure SNMP community strings, see the section Configuring device and subnet access using SNMP community strings in the IBM Tivoli Network Manager IP Edition Getting Started Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/start/task/nmip_dsc_confdeviceaccessusingsnmpcommstrings.html
Typically you will not need Telnet access to start with. Once you have a good discovery, consider the Telnet-based agents to see whether they help you meet your goals, either by providing deeper modelling such as MPLS VRF/VPNs and NAT, or by filling gaps in the information from the SNMP-based agents for BGP, layer 2, and OSPF (e.g. Cisco SRP).
DNS
Configure the DNS service either to use the system DNS setup or to use a particular DNS server that you specify.
NAT
Once you have successfully completed your initial discoveries, refer to the documentation under Configuring NAT translation if you have NAT gateways. Refer to the section Configuring NAT discoveries in the IBM Tivoli Network Manager IP Edition Discovery Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/disco/task/nmip_dsc_confnatdiscoveries.html
Note: If you do not define any NAT gateways on this page, make sure the NAT checkbox is DISABLED. Otherwise discovery can hang and eventually complete with nothing discovered.
Advanced
Typically you do not need to change anything here for the initial discoveries. These options provide control over discovery that can help work around known network device eccentricities. For a deeper explanation of these items, click on the context help icon on this page. However, a few items are worth drawing your attention to:
- Enable Feedback Control: select this to discover additional devices learned from other devices. Disabling this helps to limit the number of IP addresses discovered.
- Enable Ping Verification: select this to force discovery to create objects only for devices that respond to a ping. The Detect best setting option (default) sets ping verification according to the state of feedback control: ping verification is enabled only if feedback control is enabled.
- Enable Caching of Discovery Tables: enable this to store a full set of discovery cache files. Partial rediscovery relies on the OQL data from the last discovery being available. When ncp_disco is stopped, the OQL in-memory data is lost, preventing partial rediscoveries. To ensure partial rediscoveries are always available, enable this option. It is also useful if you need to report a discovery problem to IBM Support. Be aware that it does affect discovery performance, especially for very large networks.
- Enable ifName/ifDescr Interface Naming: this option provides a more useful interface display name.
- Enable sysName Naming: enable this if sysName is set consistently on your network devices and you want to use it as the device display name.
- Enable VLAN Modelling: if you do not need VLAN modelling (useful for RCA), disable this to speed up discovery.
Note: If you need to edit the discovery configuration files themselves, make sure the Discovery Configuration is not open in the GUI. Otherwise, when changes are saved in the GUI, they could overwrite your file edits.
Running a discovery
Now that you have completed the configuration, move to the Discovery Status page to actually start discovery. For example-based tutorial steps on how to run and monitor a discovery, see the section Launch the discovery and monitor discovery progress in the IBM Tivoli Network Manager IP Edition Getting Started Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/start/task/nmip_dsc_startingandmonitoringdiscovery.html
Alternatively, use OQL to delete all the entities in the topology database, by executing this OQL command on the ncp_model service: delete from master.entityByName;
Object Query Language (OQL) is a SQL-like language used to manipulate memory resident databases in both Network Manager and Tivoli Netcool/OMNIbus. It is covered fully in the documentation under Reference->Reference Languages. OQL is used in the context of a service, which identifies the process containing the in-memory database. This allows you to see the real time data from services such as:
Service       Description
amos          Event processing within the root cause analysis engine.
ctrl          Information to control the automatic start and shutdown of ncp processes.
disco         Discovery data throughout the cycle.
ncim          Access to the NCIM topology database in any of the supported relational databases. In this case, you would use standard SQL rather than OQL syntax to query the tables.
ncoGate       Event processing.
ncp_model     Topology data output from the discovery process prior to transfer to the relational database.
ncp_poller    Query the set of devices being monitored by each policy. It permits filtering against a specific policy (e.g. show all devices currently monitored by the Interface Ping policy) and/or against a single entity (e.g. show all policies that are monitoring a particular device). Note that this is not a query of the database, and of what you think you should be polling, but a query of what is currently being processed by the poller.
SnmpHelper    Information used by the ncp_poller and SNMP stack.
To see a full list of available services, type: ncp_oql -options
You can execute OQL commands interactively either from the command line or from the GUI. In the GUI, select Administration->Network->Management Database Access. The warning about advanced users is to remind you that this is live data, and that changing it can corrupt processes. Typically you will only be viewing data, not changing it.
- Seed router on subnet other than the server
- SNMP timeout set too high
- Specific devices may crash or agent hang during discovery
Run DEBUG
Generally, run discoveries without debug. If a problem develops, run debug on the suspected failing components. Remember that by altering DiscoSchema.cfg and DiscoHelperServerSchema.cfg you can get debug output for every helper, finder, and agent.
Network Manager maintains a count, called linger time, for each device. By default, each device starts with a linger time of 3. On each discovery, the count is reset to 3 if the device responds to ICMP or SNMP; if not, the count is decremented. When the count reaches -1, the device is removed from the database. You can change the default by editing the $NCHOME/etc/precision/ModelSchema.cfg file and changing the master.entityByName Linger time field definition.
Use the RemoveNode.pl Perl script to remove specified devices from the network topology. It does this by setting the device to an unmanaged state and marking the device to be removed during the next full discovery.
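The linger time rule can be traced with a short simulation. This is a minimal sketch of the counting rule as described above, not the product's implementation; the function names and the response sequences are hypothetical.

```python
DEFAULT_LINGER = 3  # default starting value described in the text

def next_linger(current, responded, default=DEFAULT_LINGER):
    """Reset to the default on a response, otherwise decrement."""
    return default if responded else current - 1

def discoveries_until_removed(responses, default=DEFAULT_LINGER):
    """Return the 1-based discovery number at which the device is removed,
    or None if the count never reaches -1."""
    linger = default
    for number, responded in enumerate(responses, start=1):
        linger = next_linger(linger, responded, default)
        if linger <= -1:
            return number
    return None

if __name__ == "__main__":
    # A device that never responds again: 3 -> 2 -> 1 -> 0 -> -1.
    print(discoveries_until_removed([False, False, False, False]))
```

With the default of 3, a device has to miss four consecutive discoveries before it is removed; a single response in between resets the count.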
Why has the discovery process started again immediately after discovery completion ?
This usually occurs when the ping finder is used during the discovery. During phase 1 you expect to find all the devices; however, this may not be the case. Network Manager decides to exit phase 1 when the time since the last IP address was found exceeds a user-defined time limit, m_NothngFndPeriod, defined in DiscoSchema.XXXX.cfg. By default this is 90 seconds. Once out of phase 1, any newly found devices are placed in the finders.pending table. Once the initial discovery has completed, the discovery examines the finders.pending table and then discovers these devices, thereby restarting the discovery. To stop this immediate rediscovery, either increase the m_NothngFndPeriod value or add more seed IP addresses or subnets to the ping finder.
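The effect of the quiet period can be illustrated by splitting a list of "address found" times into those that fall inside phase 1 and those that end up pending. The sketch below models only the rule described above and assumes the clock starts at zero when discovery starts. The function name and the sample times are hypothetical.

```python
def split_phase1(found_times, quiet_period_s=90):
    """Split discovery times (seconds since start) into phase 1 and pending.

    Assumption: phase 1 ends the first time the gap since the last found
    address exceeds the quiet period; everything later is pending.
    """
    phase1, pending = [], []
    last_found = 0.0
    phase1_open = True
    for t in sorted(found_times):
        if phase1_open and t - last_found > quiet_period_s:
            phase1_open = False
        if phase1_open:
            phase1.append(t)
            last_found = t
        else:
            pending.append(t)
    return phase1, pending

if __name__ == "__main__":
    times = [5, 20, 60, 200, 210]  # hypothetical find times in seconds
    print(split_phase1(times, quiet_period_s=90))
    print(split_phase1(times, quiet_period_s=150))
```

With the default of 90 seconds, the two late addresses land in the pending list and would trigger the follow-on discovery; with a longer quiet period they are picked up in phase 1.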
Chapter 3 Polling
Overview
One of the most important functions of Network Manager is polling the managed network. While significant functionality exists to provide event display and to automate responses, successfully polling the target devices at an interval consistent with the customer's management objectives is critical. The Network Manager product overview documentation provides indicative configuration examples for typical network deployments. This section explains some of the key areas such as typical numbers of network domains and polling engines. These examples are only indicative, and a more detailed sizing and dimensioning activity should be carried out to ensure that you meet your required network performance.
Note: It is not possible in this document to share actual Network Manager performance and capacity results for the many platforms examined in the test lab or in customer engagements. This material is confidential and/or might be taken as a performance guarantee. Production settings can be much different than a controlled test lab. The material presented in this document is based on the collective experiences of our laboratory testing and with customers.
A successful deployment of Network Manager's polling function results from an incremental process of deployment and evaluation in the customer's unique environment. Read this material, start slowly, gain experience, grow the workloads, and monitor. Network Manager provides the user with the ability to activate workloads that can quickly overwhelm the single default poller instance. Most polling concerns result from unintentionally large workloads and devices that do not respond. Key areas that influence individual poller capacity and scalability are:
- Frequency of polling
- Single versus multiple core processors
- Data storage options (including MIB graphing, storage to Network Manager's database, storage to Tivoli Data Warehouse)
- Database considerations for data storage (actual database selected among those supported, OS kernel tuning, database memory and disk tuning options, and database server capacity)
- Real event versus post processing
- Timeouts
- Retries
- Tuning poller threads
- Network response time for the target devices (LAN or WAN connected)
- Average rate at which network targets fail to respond; poor network connections
- Type of polling (specific policies); the number of polling packets sent for individual policies (interface polls send many packets per device, whereas a chassis ping sends few)
- Complexity of polling (single policy mapped to a poller instance or multiple policies)
- Integration with other active Tivoli products
- Automation in place
- Unrealistic expectations
As part of optimizing specific polling variables, you will need to monitor the health of your poller without impacting overall performance. Setting up a light DEBUG process with low overhead to identify performance issues will play an important part in this process.
Poller basics
To poll the network, Network Manager periodically sends queries to the devices on the network. These queries determine the behavior of the devices, for example operational status, or the data in the Management Information Base (MIB) variables of the devices. Network polling is controlled by poll policies. Poll policies consist of the following:
- Poll definitions, which define the data to retrieve.
- Poll scope, consisting of the devices to poll. The scope can also be modified at a poll definition level to filter based on device class and interface.
- Polling interval and other poll properties.
Note: The poll scope is often the cause of a device not being polled (particularly in the class based part of the poll scope). When defining poll policies, give extra attention to filtering devices correctly.
For more information on poll policies, see the section About polling the network in the IBM Tivoli Network Manager IP Edition Event Management Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/poll/concept/nmip_mon_pollnwovr.html
Assigning correct poll policies is critical in effectively monitoring essential devices in your network. There is no one-size-fits-all solution, and each network needs to be assessed based on your needs:
- What devices do you want to constantly monitor?
- What type of information do you want to report?
- What polling frequency do you require?
- What type of polling do you require?
- What thresholds do you want to use?
- Do you want to store data for historical reporting?
For more information on poll policy scope, see the section Poll policy scope in the IBM Tivoli Network Manager IP Edition Event Management Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/poll/concept/nmip_poll_policyscope.html
SNMP polling involves retrieving Management Information Base (MIB) variables from devices in order to determine faulty behavior or connection problems. Faulty devices or faulty connections are then diagnosed by applying predefined formulas to the extracted MIB variables. Unlike ICMP polling policies, SNMP polling policies do not generally yield a one-for-one packet workload. For a network device, a given policy results in multiple SNMP queries. For example, a capacity of 50 SNMP packets per second does not mean you are polling 50 devices per second, but more likely just a few. Capacity planners need to consider not only the rate of packets but also the number of devices that can be supported.
Note: By default, Network Manager provides a single poller instance, and only Chassis Ping (ICMP) is active. No SNMP polling is active by default.
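The difference between packet rate and device rate can be made concrete with a small calculation. The sketch below is a planning aid under stated assumptions: the packets-per-device figures are hypothetical placeholders that you would replace with counts measured for your own policies, and the function names are not part of Network Manager.

```python
def devices_per_second(packets_per_second, packets_per_device):
    """Devices that can be polled each second at a given packet rate."""
    return packets_per_second / packets_per_device

def devices_per_interval(packets_per_second, packets_per_device, interval_s):
    """Whole devices that fit into one polling interval."""
    return int(packets_per_second * interval_s // packets_per_device)

if __name__ == "__main__":
    rate = 50  # packets per second, as in the example in the text
    # Hypothetical workloads: 1 packet for a chassis ping,
    # 25 packets for an SNMP interface policy on one device.
    print(devices_per_second(rate, 1))
    print(devices_per_second(rate, 25))
    print(devices_per_interval(rate, 25, interval_s=300))
```

Under these assumed figures the same 50 packets per second cover 50 devices per second for a chassis ping but only 2 devices per second for the SNMP policy.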
Thresholding
Use basic threshold polling to apply simple formulas to the MIB variables, or for filtering the scope at device and interface level. To filter at interface level, the poll definition must be set up for interface filtering. Use generic threshold polling for complex formulas, or for filtering the scope at device level only.
MIB graphing
Graphing a MIB variable is useful for fault analysis and resolution of network problems. By graphing a MIB, operators and administrators can see a real-time graph of specific MIB variables for a network device. The MIB variable is polled at a user-defined interval and displayed in a graph over time. As a first step, test the performance of MIB graphing using the default poller. If capacity challenges suggest a need, create another poller instance and map it to one of the primary active policies. This provides a more significant redistribution of the polling workload across the pollers.
For a list of built-in poll policies, see the section Default poll policies in the IBM Tivoli Network Manager IP Edition Event Management Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/poll/reference/nmip_mon_defaultpolldefs.html
For a list of built-in poll definitions, see the section Default poll definitions in the IBM Tivoli Network Manager IP Edition Event Management Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/poll/reference/nmip_mon_defpolldef.html
For more information on multiple pollers, see the section Administering multiple pollers in the IBM Tivoli Network Manager IP Edition Event Management Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/admin/task/nmip_adm_admindistpoll.html
Strategy for LAN connected (fast responders) versus WAN connected (slower responders) polling
Network response times can play an important role in polling capacity. LAN-connected devices are typically fast responders to a ping or poll. WAN-connected devices may respond considerably more slowly due to the specifics of the network architecture.
The polling process (ncp_poller) is multithreaded. The number of threads can be configured by editing the file $NCHOME/etc/precision/NcPollerSchema.cfg. While the thread count can be increased, doing so is generally not recommended as a way to gain additional single poller capacity. Tuning the threads is helpful in a slow response time setting, where you wait a long time (in networking terms) for a response. A given thread waits for a slow responding device (perhaps connected over a WAN) to finally respond, so more threads means you can talk to more targets while waiting for slow responders. For typical LAN speed connections, slow response is not a factor.
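Why extra threads help only with slow responders can be seen in a deliberately simplified model. The sketch below assumes that each thread handles one target at a time and is blocked for the full response time; it ignores CPU cost, retries, and whatever scheduling ncp_poller actually uses. The thread counts and response times are hypothetical.

```python
def time_to_poll_seconds(targets, response_time_s, threads):
    """Elapsed time to poll all targets once under the simplified model
    (one blocking request per thread at a time)."""
    return targets * response_time_s / threads

if __name__ == "__main__":
    targets = 1000
    for label, response in (("LAN", 0.005), ("WAN", 2.0)):
        for threads in (10, 40):
            elapsed = time_to_poll_seconds(targets, response, threads)
            print(f"{label} {threads:>2} threads: {elapsed:.2f} s")
```

In this model the LAN case is already well under a second with either thread count, so additional threads gain little, while the WAN case drops from 200 seconds to 50 seconds.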
Polling intervals
Status checking Chassis SNMP 2 minutes 5 minutes Data collection < 2 minutes 10-15 minutes
An example of a new ncp_poller instance in the CtrlServicesDOMAIN.cfg file is shown here:
    insert into services.inTray
    (
        serviceName,
        servicePath,
        domainName,
        argList,
        dependsOn,
        retryCount
    )
    values
    (
        "ncp_poller",
        "$PRECISION_HOME/platform/$PLATFORM/bin",
        "$PRECISION_DOMAIN",
        [ "-domain", "$PRECISION_DOMAIN", "-latency", "100000", "-debug", "0", "-messagelevel", "warn", "-name", "dave_poller1" ],
        [ "nco_p_ncpmonitor", "ncp_ncogate" ],
        5
    );
For more information on the ncp_poller command, see the section ncp_poller command-line options in the IBM Tivoli Network Manager IP Edition Administration Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/ref/reference/nmip_ref_startmonitor.html
Network Manager performs a number of tasks:
- Matches an event to an entity
- Enriches the event in Tivoli Netcool/OMNIbus with information about that entity
- Passes the event on to plugins, including RCA
The principal task of the Network Manager Event Gateway is to match an event in the Tivoli Netcool/OMNIbus alerts.status table to an entity in the NCIM topology database. Once the match is made, event enrichment can be performed:
- Standard out-of-the-box event enrichment using fields from the NCIM topology database
- Bespoke event enrichment using fields from the NCIM topology database
- RCA event enrichment using the RCA plugin
- Customizable event enrichment via plugins (e.g. zNetView)
- Additional tasks beyond enrichment, for example Disco, which invokes dynamic discovery based upon receipt of certain events
RCA suppresses events with the aim of identifying the root cause event so that it can be rapidly addressed by the customer. Some customers have emphasized that it is not really the event that is of interest, but the entity that is key. Provided you can identify the entity that has the problem, problem resolution can be quickly performed in most cases. Specific probe customization can be created to de-duplicate events on the same entity rather than suppressing them.
For a complete description of the Event Gateway plugins, see the section Plugin descriptions in the IBM Tivoli Network Manager IP Edition Event Management Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/event/reference/nmip_evnt_plugindescriptions.html
Event maps
Event enrichment is performed using event maps. The main function of an event map is to call a set of stitchers that perform topology lookup to determine the entity associated with the event and then enrich the event with topology data. The Event Gateway determines which event map to use based on the kind of event, as defined in the alerts.status EventId field. To utilize an event map:
- The event needs to be passed to an appropriate eventMap
- The gateway must successfully use that eventMap to match the event to an entity
- The plugin must have registered interest in that eventMap
If you choose to configure event map selection using the Event Gateway, then you must configure the Event Gateway config.precedence table. The config.precedence table is configured in the EventGatewaySchema.cfg configuration file. This file is located at $NCHOME/etc/precision/EventGatewaySchema.cfg.
All events not explicitly assigned an event map and precedence in the configuration are handled by the default Network Manager event map, if passed to the Network Manager gateway. This allows basic levels of event enrichment, but does not include the event in RCA (Root Cause Analysis) calculations. An event can be matched to an interface without performing RCA, but generally, it is advised that only events directly involved in RCA be used in the calculations due to performance implications.
Event maps should be selected based on the following criteria:
- What data is available to identify the entity associated with this event? This narrows down the available set of event maps. The event cannot be handled properly if the data expected by the event map is not available in the event in the expected format.
- Is this a Problem (Type 1) event? Only problem events are candidates for RCA.
- Does the event indicate that the entity can no longer receive or transmit network packets? Any event which identifies that network traffic is adversely affected is a candidate for RCA. It is worth considering an event map that will pass the event to the RCA engine.
- Can this event cause or be caused by another event? If the event indicates a standalone failure, it is not a candidate for RCA.
Selection is driven further by the data in the event that can be used to identify the entity. This could be any device-specific information such as:
- IP address
- SNMP sysName
- DNS name
- MAC address
- Interface identifiers (e.g. ifIndex, ifAlias, ifDescr)
The event fields containing such device-specific information are populated by the probe rules file. It may be possible to add limited additional data from the rules files when raising the event. Typical examples are:
- LocalNodeAlias
- Node
- LocalPriObj
- LocalSecObj
For a complete description of event fields and enrichment, see the section Event enrichment in the IBM Tivoli Network Manager IP Edition Event Management Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/event/concept/nmip_evnt_eventenrichment.html
For a complete description of event maps, see the section Event maps in the IBM Tivoli Network Manager IP Edition Event Management Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/event/concept/nmip_evnt_eventmaps.html
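The selection rule described in this section (look up the EventId, otherwise fall back to the default event map) can be sketched as a dictionary lookup. This is an illustration of the logic only, not the Event Gateway's code. The EventId values, event map names, and the DEFAULT_EVENT_MAP placeholder are hypothetical; the real entries live in the config.precedence table in EventGatewaySchema.cfg.

```python
# Placeholder name for the default Network Manager event map (assumption).
DEFAULT_EVENT_MAP = "DefaultMap"

# Hypothetical stand-in for config.precedence: EventId -> map and precedence.
PRECEDENCE_TABLE = {
    "EXAMPLE-ping-fail": {"event_map": "ExamplePingFailMap", "precedence": 300},
    "EXAMPLE-protocol-down": {"event_map": "ExampleProtocolMap", "precedence": 600},
}

def select_event_map(event):
    """Return (event map name, precedence) for an alerts.status-like dict.

    Events whose EventId is not configured fall back to the default map,
    which gives basic enrichment only.
    """
    entry = PRECEDENCE_TABLE.get(event.get("EventId"))
    if entry is None:
        return DEFAULT_EVENT_MAP, None
    return entry["event_map"], entry["precedence"]

if __name__ == "__main__":
    print(select_event_map({"EventId": "EXAMPLE-ping-fail"}))
    print(select_event_map({"EventId": "EXAMPLE-unconfigured"}))
```

Keeping the fallback explicit makes it easy to spot events that silently receive only basic enrichment because their EventId was never added to the table.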
Precedence
Precedence indicates the importance of an event: the higher the precedence, the more important the event. Currently, precedence is only relevant when considering events on exactly the same entity. Furthermore, it is currently used only during RCA. Both the NmosEventMap event field and the config.precedence OQL table identify precedence. Higher precedence events suppress lower precedence events. Higher precedence should be used for events that:
- Occur lower down the protocol stack. For example, confirmation that a physical port has failed would be assigned a higher precedence than an IP-layer problem on that interface.
- Identify a problem with higher confidence. For example, failure to ping an interface may be because the ICMP packet could not reach the interface, which could be due to a network problem between the polling station and the interface. An SNMP trap that explicitly states that a link has gone down is a more positive confirmation of a problem on the interface itself, or on its directly connected neighbor.
The table below shows the recommended values to use. Note also that the limits have special significance with respect to RCA:
Value    Meaning
0        This event cannot cause other issues. During RCA, it cannot suppress other events, but it can itself be suppressed.
300      This is reserved for non-authoritative events which suggest but do not necessarily indicate a failure on the device. For example, failure to reach a device does not necessarily indicate a problem on that device - it could be caused by a problem between the polling station and the device.
600      This is intended for protocol failures. Failures identified lower down the protocol stack should take higher precedence. For example, as OSPF runs over IP, an OSPF failure would be expected to have a lower precedence than an IP failure.
         Confirmed physical failures that indirectly imply a Link Down or Ping Fail (and most other events).
         Confirmed physical failures that directly indicate a Link Down or Ping Fail.
         This event cannot be caused by other issues. During RCA, it cannot be suppressed by other events, but it can become root-cause, suppressing other events.
Example events
- SYSLOG-cisco-ios-SYS-CPUHOG
- SYSLOG-cisco-ios-BGPNOTIFICATION00
- probeping-icmptimeout
- SNMPTRAP-IETF-OSPF-TRAP-MIBospfIfStateChange
- SNMPTRAP-IETF-OSPF-TRAP-MIBospfIfConfigError
For a complete description of precedence, see the section Precedence value in the IBM Tivoli Network Manager IP Edition Event Management Guide, or online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/event/concept/nmip_evnt_precedencevalue.html
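The same-entity rule for precedence can be sketched in a few lines. This illustrates only the rule stated in this section (on one entity, a higher precedence event suppresses lower precedence events); it ignores the topology-based part of RCA entirely. The event dictionaries, entity names, and the tie handling (events of equal precedence do not suppress each other) are assumptions made for the example.

```python
from collections import defaultdict

def suppressed_by_precedence(events):
    """Map each event id to the id of the event that suppresses it, or None.

    Only events on exactly the same entity are compared.
    """
    by_entity = defaultdict(list)
    for ev in events:
        by_entity[ev["entity"]].append(ev)

    result = {}
    for group in by_entity.values():
        top = max(group, key=lambda ev: ev["precedence"])
        for ev in group:
            # Strictly lower precedence than the top event: suppressed.
            if ev["precedence"] < top["precedence"]:
                result[ev["id"]] = top["id"]
            else:
                result[ev["id"]] = None
    return result

if __name__ == "__main__":
    events = [  # hypothetical events
        {"id": "ping", "entity": "router1/if3", "precedence": 300},
        {"id": "ospf", "entity": "router1/if3", "precedence": 600},
        {"id": "cpu", "entity": "router2", "precedence": 0},
    ]
    print(suppressed_by_precedence(events))
```

Because the comparison is strict, an event with precedence 0 can never suppress another event, which matches the meaning given for that value in the table.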
SAE
interest has been affected by an event.
Customized
Customized gateway plugins (e.g. zNetView) can be created to provide customizable event enrichment for selected events. These are stitcher-based plugins, allowing OQL tables to be defined, NCIM topology data to be queried, and ObjectServer tables to be modified. SQL configuration is required to enable a custom plugin. It must be listed in the gwPluginTypes table, enabled in the gwPlugins table, and have events of interest identified in the gwPluginEventMaps and gwPluginEventStates tables. For a complete set of plugin descriptions, see online at: http://publib.boulder.ibm.com/infocenter/tivihelp/v8r1/topic/com.ibm.networkmanagerip.doc_3.9/itnm/ip/wip/event/reference/nmip_evnt_plugindescriptions.html
Troubleshooting
Problem: How can I check if an event is handled?
- Diagnosis: Does the event match the nco2ncp EventFilter in EventGatewaySchema?
- Diagnosis: If so, does the gateway know how to process the event? Tip: Is the EventId of the event listed in the config.precedence inserts in the schema files? Is the NmosEventMap field of the event populated?
- Diagnosis: If so, do the fields of the event contain the expected data? Tip: The expected data format per event map is given in EventGatewaySchema.
- Diagnosis: Has the event been successfully matched against an entity in the topology? Tip: The NmosObjInst field will contain the main node entity ID if so; otherwise the event will not be passed to plugins.
- Diagnosis: If so, will a given plugin (e.g. RCA) see an event? Tip: Use ncp_gwplugins.pl to list the event maps and states handled by the plugin, e.g. ncp_gwplugins.pl -domain NCOMS -plugin RCA. At -messagelevel info, all serial numbers passed to all plugins are logged.
Problem: How can I create a new event map?
- Diagnosis: Is the probe rules file accessible? Tip: If so, use the NmosEventMap field of the alerts.status event when the event is raised.
- Tip: Add an entry to the config.precedence OQL table in the schema file, mapping the event EventId field to an eventMap name.
- Tip: Basic event enrichment is performed by the Event Gateway, but additional enrichment (or other action) can be performed using the optional Stitcher named in the event map.
- Tip: Some events (e.g. Network Manager health check events) do not correspond to topology entities.
Comment out ncp_g_event from the CtrlServices schema file, and run it independently on the command line. This allows it to be started and stopped as desired, so that the event handling can be closely monitored (this requires the ncp_model process to be running). It is recommended to start with a minimal topology, containing the entities that are to be related. Events can be fed into the ObjectServer from any source. It can be helpful to raise the events that are to be correlated before starting the gateway, as it will read them in and process them at startup.
The default event maps use a limited number of alerts.status fields to match an event to an entity. By default, the LocalNodeAlias is used to identify the main node - the chassis that is or that contains the affected entity. Events missing this field will not be handled by Network Manager.
Where is my update? If a field in the ObjectServer alerts.status table has not been updated as expected:
- Was an entity found? Is NmosObjInst non-zero?
- What stitcher populates that field? Is that stitcher triggered by the event map for this event? Modify the stitcher if not.
- What type of entity was found? Check the entityType of the NmosEntityId found. Some fields are available only for chassis, some only for interfaces, etc.
- Does the NCIM topology entity have the expected value populated? Some values can be NULL.
- The event map stitcher field is required to do a topology lookup.
- Only network events are matched to entities, e.g. Network Manager Status events do not correspond to entities.
- Check the cache, e.g. ncp_oql -domain NCOMS -service EventGateway
    select * from ncimCache.entityData where <filter>;
- Check the database, e.g. ncp_oql -domain NCOMS -service ncim -username ncim
    select * from entityData where <filter>;
- Check there is no mismatch between the NCIM topology data and the cache (the data should always be consistent).
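The field checks from this troubleshooting section can be collected into a small helper that inspects an event record and lists what to look at next. This is a sketch only: it assumes the event has already been exported as a dictionary keyed by alerts.status field names, and the function name and message texts are hypothetical. It does not query the ObjectServer or the gateway.

```python
def diagnose_event(event):
    """Return a list of hints for an alerts.status-like dict."""
    findings = []
    if not event.get("LocalNodeAlias"):
        findings.append(
            "LocalNodeAlias is empty: the default event maps cannot "
            "identify the main node."
        )
    if not event.get("NmosEventMap"):
        findings.append(
            "NmosEventMap is empty: check that the EventId is listed "
            "in config.precedence."
        )
    if not event.get("NmosObjInst"):
        findings.append(
            "NmosObjInst is zero: no entity was matched, so the event "
            "is not passed to plugins."
        )
    return findings

if __name__ == "__main__":
    # Hypothetical event with no entity match.
    event = {"LocalNodeAlias": "10.30.4.3", "NmosEventMap": "", "NmosObjInst": 0}
    for hint in diagnose_event(event):
        print(hint)
```

An empty result means that the basic fields are populated, and the next step is to check which stitcher populates the field you expected to see updated.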
Processor speed
Network Manager does not run a large number of processes, and the processes that it does run tend to be long-running and to require significant resources. As a result, the first key factor to consider when choosing processors for use with Network Manager is the core speed. This is especially the case for systems dedicated to Network Manager, with a single domain and a single poller.
Multi-core processors
For heavy workloads such as discovery, polling, reporting, data exchange with Network Manager component products such as Tivoli Netcool/OMNIbus, and data exchange with other Tivoli products, two or more processor cores are strongly recommended and in some cases required. Where multiple copies of key Network Manager processes are running, for example with multiple domains or with multiple pollers within a single domain, it is helpful to have multiple processors, and more than four processors. In these cases, the processor count is more important than the next processor speed upgrade.
Processors summary
Select a system with at least two processor cores for single server deployments. For most deployments with a single domain, a single poller, and a single Network Manager instance, choose faster clock speeds over additional cores. For deployments with multiple domains, pollers, or Network Manager instances, choose additional cores over a faster clock speed.
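The summary above amounts to a simple decision rule, which can be written down as a short sketch. The function and constant names are hypothetical and exist only to illustrate the rule; the rule itself is taken from the text.

```python
MINIMUM_CORES = 2  # "at least two processor cores" for single server deployments


def processor_priority(domains, pollers, instances):
    """Return which processor attribute to favour for a deployment.

    A deployment with exactly one domain, one poller, and one Network
    Manager instance favours clock speed; anything larger favours cores.
    """
    if min(domains, pollers, instances) < 1:
        raise ValueError("counts must be at least 1")
    if domains == 1 and pollers == 1 and instances == 1:
        return "clock speed"
    return "core count"


single = processor_priority(domains=1, pollers=1, instances=1)
multi_poller = processor_priority(domains=1, pollers=3, instances=1)
```

In both cases the minimum of two cores still applies; the rule only decides where to spend beyond that minimum.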
Health check
Performing basic health checks is important in ensuring the smooth operation of the pollers.
CPU usage
If the CPU consumption for a poller process (as measured by a CPU measurement tool such as top) suggests that the process is overloaded, or if polling intervals are missed (due to retry issues or poller capacity), it is time to consider additional pollers or even additional domains.
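One way to make this check repeatable is to scan process listing output for poller processes above a CPU threshold. The sketch below parses text in the three-column form produced by a command such as `ps -eo pid,pcpu,args`; the 90 percent threshold, the function name, and the sample listing are all assumptions made for illustration and are not values recommended by this document.

```python
def overloaded_pollers(ps_output, cpu_threshold=90.0):
    """Return (pid, cpu) pairs for ncp_poller processes at or above the threshold.

    Expects lines of the form "<pid> <%cpu> <command line>".
    """
    flagged = []
    for line in ps_output.strip().splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, pcpu, command = parts
        try:
            cpu = float(pcpu)
        except ValueError:
            continue  # skips the header row, where the column is "%CPU"
        if "ncp_poller" in command and cpu >= cpu_threshold:
            flagged.append((int(pid), cpu))
    return flagged


# Illustrative listing only; real output varies by platform.
sample = """
  PID %CPU COMMAND
 4312 97.5 ncp_poller -domain NCOMS
 4400 12.0 ncp_poller -domain NCOMS10
 5000 99.0 ncp_model -domain NCOMS
"""
flagged = overloaded_pollers(sample)
```

A single high reading is not conclusive; sampling repeatedly over several polling intervals gives a better indication that a poller is persistently overloaded.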
Tracing routes
Tracing the route to devices in the network map to check network paths is an important step in examining problem targets. To perform this procedure, you must be in the Network Views or in the Network Hop View.
1. From the Network Hop View or Network Views network map, select the device to which to trace the route. To select multiple devices, press Ctrl.
2. Right-click one of the selected devices and choose WebTools > Advanced Traceroute.
The results of the traceroute operation appear in one or more separate browser windows. It is also possible to perform a custom traceroute by customizing the traceroute settings.
Specialist scripts
Specialist scripts can also be written to perform more specific, repetitive tasks that apply to your architecture and configuration.
example Level 1 debug + INFO message mode to confirm target counts and successful polling
To implement debug level changes, restart Network Manager.
Observe the /opt/IBM/tivoli/netcool/log/precision/ncp_poller.NCOMS.log file for poller activation, target counts, and any issues.
Note that if multiple domains are active, the files are named like CtrlServices.NCOMS10.cfg and ncp_poller.NCOMS10.log. The default and first domain does not have its name in the files, but extra domains do have their names included.
Performance reports
The out-of-the-box performance reports allow you to view, for diagnostic purposes, any historical performance data that has been collected by the monitoring system. View trend and topN charts to gain insight into short-term behaviors. Use the trending report to see the average values collected for a list of selected devices, and drill down to see the trend over time for that data item. Trending is important because it highlights issues that develop progressively over a period of time and are not temporary. Issues identified in this way may need to be addressed through capacity planning.
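To make the two report ideas concrete, the arithmetic behind a topN ranking by average value and a simple trend can be sketched as follows. This is not how the product's reports are implemented; it is a hypothetical illustration that assumes the collected samples for each device are available as an ordered list of numbers, and it uses a least-squares slope against the sample index as one possible measure of trend.

```python
from statistics import mean


def top_n_by_average(samples, n):
    """Rank devices by the average of their collected values, highest first."""
    averages = {device: mean(values) for device, values in samples.items() if values}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:n]


def trend_slope(values):
    """Least-squares slope of the values against their sample index.

    A positive slope means the value is growing over the sampled period.
    """
    count = len(values)
    if count < 2:
        return 0.0
    x_mean = (count - 1) / 2
    y_mean = mean(values)
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(count))
    return numerator / denominator


# Hypothetical utilisation samples per device, oldest first.
samples = {
    "rtr-a": [10, 20, 30],
    "rtr-b": [50, 50, 50],
    "rtr-c": [5, 5, 5],
}
top_two = top_n_by_average(samples, 2)
rising = trend_slope(samples["rtr-a"])
flat = trend_slope(samples["rtr-b"])
```

A device with a modest average but a steadily positive slope, like rtr-a here, is the kind of progressive issue that trending is meant to surface before it appears in a topN chart.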
Notices
This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing IBM Corporation North Castle Drive Armonk, NY 10504-1785 U.S.A. For license inquiries regarding double-byte (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to: IBM World Trade Asia Corporation Licensing 2-31 Roppongi 3-chome, Minato-ku Tokyo 106-0032, Japan The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. 
Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact:
IBM Corporation 958/NH04 IBM Centre, St Leonards 601 Pacific Hwy St Leonards, NSW, 2069 Australia IBM Corporation 896471/H128B 76 Upper Ground London SE1 9PZ United Kingdom IBM Corporation JBF1/SOM1 294 Route 100 Somers, NY, 10589-0100 United States of America Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee. The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us. Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only. All IBM prices shown are IBM's suggested retail prices, are current and are subject to change without notice. Dealer prices may vary. This information is for planning purposes only. The information herein is subject to change before the products described become available.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. If you are viewing this information softcopy, the photographs and color illustrations may not appear.
Trademarks
IBM, the IBM logo and ibm.com are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at www.ibm.com/legal/copytrade.shtml. Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. Adobe, Acrobat, Portable Document Format (PDF), PostScript, and all Adobe-based trademarks are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others.