Event Infrastructure
Cheryl L. Jennings May 17, 2011
Trishali Nayar
The AIX® Event Infrastructure is an extensible framework for monitoring multiple types of
system events. This article gives an overview of the monitoring interface, as well as pointers for
writing event monitoring applications.
Introduction
The AIX Event Infrastructure is an event monitoring framework for monitoring predefined and
user-defined system events. Within the context of the AIX Event Infrastructure, an event is defined as
a change in state or in value which can be detected within the kernel or a kernel extension at the
time the change occurs.
Each type of event that may be monitored is associated with an event producer. Event
producers are simply sections of code which can detect an event as it happens and notify the AIX
Event Infrastructure of event occurrences. Some examples of available event producers are:
• modFile: The modFile event producer monitors for modifications to the content of files.
• utilFs: The utilFs event producer monitors the utilization of a file system.
• waitTmPgInOut: The waitTmPgInOut event producer monitors for the average wait time, in
milliseconds, of threads waiting for page in or page out operations to complete over a one
second period.
For a full listing of available event producers, see the Related topics section.
Events are represented as files within a pseudo file system. Existing file system interfaces (read(),
write(), select(), and so on) are used to specify how and when monitoring applications (also called
consumers) should be notified and to wait on and read data about event occurrences. The path
name to a monitor file within the AIX Event Infrastructure file system is used to determine which
event a consumer wishes to monitor.
Inside the AIX Event Infrastructure file system, there are four basic file types:
1. Monitor factories: Monitor factories are directories with the ".monFactory" extension. They
are directory representations of event producers. These directories are automatically created
when the AIX Event Infrastructure file system is mounted.
2. Monitor files: Monitor files are designated by a ".mon" extension and represent events to be
monitored. They only exist under a monitor factory which represents their associated event
producer.
3. List files: List files are special data files that end with the ".list" extension. Currently only one
list file exists, evProds.list. Reading this file will return the list of all available event producers.
4. Subdirectories: Subdirectories are used for ease of management and to represent the full
pathname of the event.
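The four file types above can be told apart by name alone. The following is a minimal sketch, assuming that any entry without one of the three listed extensions is a subdirectory. The enum and the function names are hypothetical and are not part of any AIX header.

```c
#include <string.h>

/* Hypothetical names for illustration; not defined by AIX. */
enum aha_entry_type {
    AHA_MON_FACTORY,  /* directory ending in ".monFactory" */
    AHA_MON_FILE,     /* file ending in ".mon" */
    AHA_LIST_FILE,    /* file ending in ".list", such as evProds.list */
    AHA_SUBDIR        /* assumption: anything else is a subdirectory */
};

/* Return nonzero if name ends with suffix. */
static int has_suffix(const char *name, const char *suffix)
{
    size_t n = strlen(name);
    size_t s = strlen(suffix);
    return n >= s && strcmp(name + n - s, suffix) == 0;
}

/* Classify one path component of the event file system by its extension. */
enum aha_entry_type classify_aha_entry(const char *name)
{
    if (has_suffix(name, ".monFactory"))
        return AHA_MON_FACTORY;
    if (has_suffix(name, ".mon"))
        return AHA_MON_FILE;
    if (has_suffix(name, ".list"))
        return AHA_LIST_FILE;
    return AHA_SUBDIR;
}
```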
You may mount the file system on any desired mount point. Examples in this article assume it has
been mounted on /aha.
For example, to monitor modifications to the file /etc/passwd when the AIX Event
Infrastructure file system is mounted at /aha, the full pathname for the monitor file would be:
/aha/fs/modFile.monFactory/etc/passwd.mon
It should be noted that monitor files can have the same pathname starting from the associated
monitor factory and represent different events. For example, the file
/aha/fs/modDir.monFactory/home/cherylfs.mon monitors for file creation and deletion within the
/home/cherylfs directory. The file /aha/fs/utilFs.monFactory/home/cherylfs.mon monitors for the
utilization of the filesystem /home/cherylfs.
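The naming scheme in these examples (mount point, then monitor factory, then the monitored pathname with ".mon" appended) can be sketched as a small helper. build_monitor_path is a hypothetical name introduced here for illustration; it is not an AIX interface.

```c
#include <stdio.h>

/*
 * Build "<mount>/<factory><monitored>.mon" into out.
 * Returns 0 on success, or -1 if the result does not fit in out.
 * Hypothetical helper for illustration only.
 */
int build_monitor_path(char *out, size_t out_size,
                       const char *mount,     /* e.g. "/aha" */
                       const char *factory,   /* e.g. "fs/modFile.monFactory" */
                       const char *monitored) /* absolute path, e.g. "/etc/passwd" */
{
    int n = snprintf(out, out_size, "%s/%s%s.mon", mount, factory, monitored);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Calling it with the same monitored pathname but different factories yields different monitor files, which is how /home/cherylfs can be watched by both modDir and utilFs.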
At this point, the monitoring application is asleep until an event occurrence is detected. The AIX
Event Infrastructure watches file operations to see if a monitored file is modified.
As seen in the previous example, the typical flow for a monitoring application is:
Each step is examined in this article. A sample program, mon_modFile_event.c, has been
provided as an example you can download.
Once the monitor file has been created and opened, specifications on how and when to be notified
must be written to it. A complete listing of the available options can be found through the Related
topics section (see "Writing to the monitor file").
While most of the monitoring specifications are straightforward, the AIX Event Infrastructure has
two behaviors for the notify count (NOTIFY_CNT) specification:
In the case of NOTIFY_CNT=-1, if another event occurrence is detected while the monitoring
application is not currently blocked in a select() or read() call, that event occurrence data is logged
in the buffer allocated for this consumer. Once the monitoring application attempts another select()
or blocking read(), those calls will return immediately since there is unread event occurrence data
waiting in the buffer.
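As a rough illustration of the monitoring specifications discussed above, the sketch below formats one specification string. INFO_LVL, NOTIFY_CNT and BUF_SIZE are named in this article; the CHANGED and WAIT_TYPE keys and the semicolon-separated key=value layout are assumptions recalled from the AIX documentation and should be checked against "Writing to the monitor file". format_mon_spec is a hypothetical helper name.

```c
#include <stdio.h>

/*
 * Format a monitoring specification into out.
 * Returns the string length on success, or -1 if it does not fit.
 * Assumption: options are written as KEY=VALUE pairs separated by ';'.
 * Assumption: CHANGED=YES and WAIT_TYPE=WAIT_IN_SELECT are valid options.
 */
int format_mon_spec(char *out, size_t out_size,
                    int info_lvl, int notify_cnt, int buf_size)
{
    int n = snprintf(out, out_size,
                     "CHANGED=YES;WAIT_TYPE=WAIT_IN_SELECT;"
                     "INFO_LVL=%d;NOTIFY_CNT=%d;BUF_SIZE=%d",
                     info_lvl, notify_cnt, buf_size);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

The resulting string would then be passed to write() on the open monitor file.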
In the sample program, mon_modFile_event.c, the file to be monitored through the modFile event
producer is passed in as an argument from the user. If necessary, subdirectories are created in the
AIX Event Infrastructure file system to create the necessary monitor file.
It is important to note that monitoring of events does not begin until the monitoring application
issues a select() or a blocking read() call. There are several conditions which cause select() or a
blocking read() to return. These conditions are listed in the AIX 6.1 information center (see Waiting
on events).
The sample program blocks in a select() call once the monitoring information is written to the
monitor file. If the select() call returns an error, a special flag is passed to the parsing function to
indicate that the error format will be used in the output.
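The blocking wait can be sketched with the standard select() call. This is a generic POSIX sketch rather than the sample program's code; wait_for_event is a hypothetical name, and the caller is assumed to have already written the monitoring specifications to fd.

```c
#include <stddef.h>
#include <sys/select.h>

/*
 * Block until fd is reported readable, with no timeout.
 * Returns select()'s result: greater than 0 when fd is ready,
 * -1 on error (the caller would then expect the error format).
 */
int wait_for_event(int fd)
{
    fd_set readfds;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    return select(fd + 1, &readfds, NULL, NULL, NULL);
}
```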
• The unmounting of a monitored file system through the utilFs event producer.
• The removing or renaming of a monitored file through the modFile event producer.
• The death of a process monitored through the processMon or pidProcessMon event producer.
Once an unavailable event occurrence has been detected, users may not monitor that event until
it becomes available again. Ideally, the monitoring application identifies unavailable event
occurrences and takes corrective action, which causes the event to become valid again. The
documentation for each event producer lists which return codes indicate that an unavailable
event occurrence has been detected. The return codes for event producers are defined in
sys/ahafs_evProds.h.
For local unavailable event occurrences, the monitor file associated with the event is deleted.
Monitoring applications may read event data from deleted monitor files while the file descriptor for
the monitor file is still open but may not block for further event occurrences. Once the monitoring
application has taken corrective action and the event is valid again, the following actions must be
taken to resume monitoring:
1. The file descriptor for the deleted monitor file must be closed.
2. The monitor file must be re-opened with the O_CREAT flag.
3. Monitoring specifications must be written to the file.
At this point, the monitoring application may wait for event occurrences again in select() or read().
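The three steps for resuming monitoring can be sketched as one helper. resume_monitoring is a hypothetical name, and the permission bits passed to open() are an illustrative choice; the specification string is whatever the application wrote originally.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Resume monitoring after an unavailable event occurrence.
 * Returns the new file descriptor, or -1 on failure.
 * Hypothetical helper for illustration only.
 */
int resume_monitoring(int old_fd, const char *mon_path, const char *spec)
{
    size_t len = strlen(spec);
    int fd;

    /* Step 1: close the descriptor for the deleted monitor file. */
    close(old_fd);

    /* Step 2: re-open the monitor file with O_CREAT. */
    fd = open(mon_path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    /* Step 3: write the monitoring specifications again. */
    if (write(fd, spec, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    return fd;
}
```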
The sample program inspects the return code from the event producer (RC_FROM_EVPROD)
to determine if the event occurrence was an unavailable event occurrence. It attempts corrective
action for some of the possible unavailable event occurrence types and ceases monitoring for
others.
Event data may only be read once and no more than one event's worth of data is returned in a
single read call. For example, suppose the data for two event occurrences has been copied into
the buffer before the consumer reads from the monitor file, and the event data for each event
occurrence has 256 bytes worth of data. If the consumer calls read() for 4096 bytes, only the 256
bytes of the first event is returned to the user. A second read call needs to be performed to obtain
the data from the second event.
In the sample program, event data is always read with a buffer of 4K. This is the recommended
read size since most event occurrence data will be less than 4K. Reading with a buffer this large
means that no partial event occurrences will be returned due to insufficient space in the buffer
passed to the read() call.
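A minimal sketch of reading one event occurrence with the recommended 4K buffer follows. read_one_event and EVENT_READ_SIZE are hypothetical names; the NUL terminator is added only so that the data can be handed to string functions afterwards.

```c
#include <unistd.h>

#define EVENT_READ_SIZE 4096  /* recommended read size from the text above */

/*
 * Read at most one event occurrence's data from fd into buf and
 * NUL-terminate it. Returns the number of bytes read, or -1 on error.
 * A second call is needed to obtain the next event occurrence.
 */
ssize_t read_one_event(int fd, char *buf, size_t buf_size)
{
    ssize_t n;

    if (buf_size < 2)
        return -1;
    n = read(fd, buf, buf_size - 1);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```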
For an event producer which has specified AHAFS_THRESHOLD_VALUE_HI and has not
specified AHAFS_STKTRACE_AVAILABLE and passes a message to event consumers, the three
levels of output look like this:
If there is an error from the event producer, all event producers use the following output format
for all INFO_LVLs:
BEGIN_EVENT_INFO
TIME_tvsec=1269868036
TIME_tvnsec=966708948
SEQUENCE_NUM=0
RC_FROM_EVPROD=19
END_EVENT_INFO
If a consumer is monitoring a value event and the current value already exceeds the requested
threshold, the following format is used to record this EALREADY event:
BEGIN_EVENT_INFO
TIME_tvsec=1281837726
TIME_tvnsec=446010404
SEQUENCE_NUM=0
CURRENT_VALUE=70
RC_FROM_EVPROD=56
END_EVENT_INFO
Each event producer documents what is included in the output for the event occurrences it
produces. The return codes for event producers are defined in sys/ahafs_evProds.h.
The sample program checks for the error format in the event occurrence data if the select()
call returned an error. Otherwise, it uses the format included in the modFile event producer's
documentation. It uses sscanf() to read in the values for the corresponding keywords.
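The keyword parsing described above might look like the following sketch. It is not the sample program's parser; it assumes the four keywords that appear in both output formats shown earlier (TIME_tvsec, TIME_tvnsec, SEQUENCE_NUM, RC_FROM_EVPROD) are always present. The struct and the function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields common to both formats. */
struct event_info {
    long tv_sec;
    long tv_nsec;
    long sequence_num;
    long rc_from_evprod;
};

/* Find "KEY=" in data and read the integer after it with sscanf(). */
static int find_long(const char *data, const char *key, long *out)
{
    const char *p = strstr(data, key);

    if (p == NULL)
        return -1;
    return sscanf(p + strlen(key), "%ld", out) == 1 ? 0 : -1;
}

/* Parse one event occurrence. Returns 0 on success, -1 if malformed. */
int parse_event_info(const char *data, struct event_info *ev)
{
    if (strstr(data, "BEGIN_EVENT_INFO") == NULL ||
        strstr(data, "END_EVENT_INFO") == NULL)
        return -1;
    if (find_long(data, "TIME_tvsec=", &ev->tv_sec) != 0 ||
        find_long(data, "TIME_tvnsec=", &ev->tv_nsec) != 0 ||
        find_long(data, "SEQUENCE_NUM=", &ev->sequence_num) != 0 ||
        find_long(data, "RC_FROM_EVPROD=", &ev->rc_from_evprod) != 0)
        return -1;
    return 0;
}
```

Producer-specific keywords, such as CURRENT_VALUE in the EALREADY format, could be read the same way once the common fields have been parsed.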
The sequence number is reset to 0 following any cessation in monitoring. The following image
illustrates the behavior of the SEQUENCE_NUM keyword.
In a buffer wrap condition, the first read returns only the keyword BUF_WRAP. The second read
returns the event data for the next, whole event occurrence. The SEQUENCE_NUM field should
be consulted to see how many event occurrences may have been overwritten from a buffer wrap.
If monitoring applications experience buffer wrap conditions, they may increase the size of their
monitoring buffer specified in BUF_SIZE. This cannot be done dynamically.
If the consumer is using a very small buffer, there is a possibility that the event data from one event
occurrence may be larger than the buffer. In this case, the keyword EVENT_OVERFLOW will be
written to the buffer, with as much event occurrence data as can fit inside the buffer. The first read
in an overflow case returns only the keyword EVENT_OVERFLOW. The next read returns the
event data that was able to fit in the buffer.
If a wrap condition occurred with this overflow condition, the first read returns the keyword
BUF_WRAP, the second read returns the keyword EVENT_OVERFLOW and the third read returns
the event data that was able to fit inside the buffer.
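Because BUF_WRAP and EVENT_OVERFLOW each arrive as the only content of a read, a consumer can classify every read before parsing it. The sketch below assumes a NUL-terminated read buffer; the enum and the function name are hypothetical.

```c
#include <string.h>

/* Hypothetical classification of what a single read() returned. */
enum read_kind {
    READ_EVENT_DATA,     /* ordinary (possibly truncated) event data */
    READ_BUF_WRAP,       /* older unread occurrences were overwritten */
    READ_EVENT_OVERFLOW  /* the next read holds only part of an event */
};

/* Classify one NUL-terminated read buffer by the keyword it contains. */
enum read_kind classify_read(const char *data)
{
    if (strstr(data, "BUF_WRAP") != NULL)
        return READ_BUF_WRAP;
    if (strstr(data, "EVENT_OVERFLOW") != NULL)
        return READ_EVENT_OVERFLOW;
    return READ_EVENT_DATA;
}
```

In the combined case described above, three consecutive reads would classify as READ_BUF_WRAP, READ_EVENT_OVERFLOW and READ_EVENT_DATA, in that order.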
The sample program checks to see if the event data passed in contains the BUF_WRAP keyword.
A warning is printed to the user if a buffer wrap condition has occurred. The sample program does
not check for EVENT_OVERFLOW since it specifies an INFO_LVL of 1 and uses the default buffer
size of 4K. Currently, the maximum size of a modFile event with an INFO_LVL of 1 will be less than
4K.
data. If the data from each event occurrence is exactly the same (except the timestamp and
sequence number), the timestamp and sequence number of the previously written event data are
updated to reflect this new, duplicate event occurrence.
Duplicate event data consolidation can be distinguished from buffer wrap conditions in that the
keyword BUF_WRAP will not be read in the consolidation case.
The sample program maintains its own internal sequence number. It compares the expected value
to the actual value read from the event occurrence data to determine if duplicate events were
consolidated. This internal sequence number is reset to 0 if an unavailable event was detected.
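The sequence number bookkeeping can be sketched as follows. This is an illustration of the comparison described above, not the sample program's code; seq_tracker, seq_gap and seq_reset are hypothetical names.

```c
/* Hypothetical tracker for the next SEQUENCE_NUM the consumer expects. */
struct seq_tracker {
    long expected;
};

/*
 * Compare the SEQUENCE_NUM just read with the expected value.
 * Returns how many occurrences were not delivered as separate reads
 * (0 means nothing was consolidated or overwritten), then advances
 * the expected value past the number just seen.
 */
long seq_gap(struct seq_tracker *t, long actual)
{
    long gap = actual - t->expected;

    t->expected = actual + 1;
    return gap;
}

/* Reset after an unavailable event, when numbering restarts at 0. */
void seq_reset(struct seq_tracker *t)
{
    t->expected = 0;
}
```

A nonzero gap without a preceding BUF_WRAP read points to duplicate consolidation; with BUF_WRAP, it indicates occurrences overwritten by a buffer wrap.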
Downloadable resources
Description                           Name                   Size
Program monitors for modifications    mon_modFile_event.c    10KB
Related topics
• Read the "Waiting on events" list of conditions that cause select() or a blocking read() to return:
http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.baseadmn/doc/baseadmndita/aix_ev_in_mon_wait.htm
• http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.baseadmn/doc/baseadmndita/aix_ev.htm
• See how to create monitor files.
• See "Writing to the monitor file" for a complete listing of monitor file options.