The Design and Implementation of a
Kernel Monitoring System for Network Servers
Yutaka Nakamura (yutaka-n@rd.center.osaka-u.ac.jp)
Eiji Kawai (eiji-ka@is.aist-nara.ac.jp), Japan Science and Technology Corporation
Shinji Shimojo (shimojo@cmc.center.osaka-u.ac.jp), Cyber Media Center of Osaka University
Suguru Yamaguchi (suguru@is.aist-nara.ac.jp), Nara Institute of Science and Technology
Hideki Sunahara (suna@is.aist-nara.ac.jp), Nara Institute of Science and Technology

Abstract

From the standpoint of server operation, measuring the performance of the server is important. A major part of this measurement is kernel monitoring, and most server administrators have been using legacy tools whose own load can influence server performance. In this paper, we describe the design and implementation of a new kernel monitoring system that avoids influencing server performance by separating the measurement module from the analysis module. Furthermore, our monitoring system can be applied to a distributed server system.

1. Introduction

To maintain good quality of service, a server administrator should tune the server for the best performance and avoid saturating it. However, many administrators still operate server systems based on intuition and experience, which frequently causes misconfiguration of the servers. Thus, we developed a server performance measurement tool called ENMA [1]. ENMA monitors packets outside the server and observes several performance indices of the server system.

However, packet monitoring has a problem: the obtained information does not give a direct picture of the kernel status. To solve this problem, we developed a new kernel monitoring system and integrated it with ENMA. It monitors several indices of kernel status and transports them to a remote machine that analyzes them in real time. Because it records precise timestamps in the log data, the data can be merged easily with other data such as the access log of the server.

Our kernel monitoring system, called rep2, has fairly low impact on the server even under high load. We ran experiments on actual servers with a huge number of real users on the Internet. The results show that our system allows more detailed observation of the server system without degrading server performance.

2. Design and Implementation

In general, a kernel monitoring program depends on the operating system on which it runs. To improve the portability of our system, we divided it into two parts: an OS-dependent part and an OS-independent part. Figure 1 shows the design of our kernel monitoring system.

[Figure 1: block diagram. Server System: Server Processes, Reportor2, and the Operating System, which supplies CPU, memory, process, and network information. Analyzing Host: Collector, Graph Generator, and Visualizer.]
Figure 1. Kernel Monitoring System (rep2)

The OS-dependent part interacts with the kernel through the kvm interface, because the target server host runs the Solaris operating system in our experiment. The kvm interface is also supported by other operating systems such as FreeBSD. Some information, such as the buffer status of the protocol stack and input/output packets, is not available through the kvm interface. We use the kstat and sysctl interfaces for this kind of information.

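The split between an OS-dependent part and an OS-independent part can be illustrated with a minimal Python sketch. It is not rep2's implementation: the sampler function, the counter names, the host name, and the JSON record format are all assumptions made for illustration. A real OS-dependent part would read the counters through kvm, kstat, or sysctl; here a stand-in returns fixed numbers so the sketch runs on any machine.

```python
import json
import time
from typing import Callable, Dict


def fake_kernel_sampler() -> Dict[str, int]:
    """Hypothetical OS-dependent part: returns raw kernel counters.

    A real backend would obtain these through kvm, kstat, or sysctl.
    The names and values here are invented for the example.
    """
    return {"cpu_idle": 900, "cpu_user": 60, "cpu_kernel": 40}


def make_record(sampler: Callable[[], Dict[str, int]],
                host: str,
                clock: Callable[[], float] = time.time) -> str:
    """OS-independent part: timestamp one sample and serialize it as a log line.

    The timestamp is what lets the record be merged later with other
    logs, such as the server's access log.
    """
    record = {"ts": clock(), "host": host, "counters": sampler()}
    return json.dumps(record, sort_keys=True)


# A fixed clock keeps the example deterministic.
line = make_record(fake_kernel_sampler, "www1", clock=lambda: 1000.5)
print(line)
```

Only `fake_kernel_sampler` would need to change when moving to another operating system; `make_record` and whatever ships the line to the analyzing host stay the same.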
3. Case Study of a Large-Scale WWW Server

To show that rep2 is effective, we apply it to a large-scale production WWW system. The system is used for broadcasting the 82nd National High-School Baseball Games of Japan. It received about 46 million accesses per day at peak.

[Figure 3: two plots of CPU utilization (%) against time, from 14:45:00 to 15:45:00, each broken down into IDLE, USER, and KERNEL.]
Figure 3. Time transition of CPU utilization
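
A breakdown into IDLE, USER, and KERNEL shares like the one in Figure 3 is commonly derived from cumulative per-state tick counters sampled at two points in time. The Python sketch below shows that calculation. The source does not say how rep2 computes these values, so the function, the counter names, and the sample numbers are assumptions for illustration only.

```python
from typing import Dict


def cpu_utilization(prev: Dict[str, int], curr: Dict[str, int]) -> Dict[str, float]:
    """Return the percentage of time spent in each CPU state between two samples."""
    # Counters are cumulative, so the work done in the interval is the difference.
    deltas = {state: curr[state] - prev[state] for state in curr}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("counters did not advance between samples")
    return {state: 100.0 * delta / total for state, delta in deltas.items()}


# Invented counter values for two consecutive samples.
earlier = {"idle": 1000, "user": 200, "kernel": 100}
later = {"idle": 1600, "user": 500, "kernel": 200}

shares = cpu_utilization(earlier, later)
print(shares)  # {'idle': 60.0, 'user': 30.0, 'kernel': 10.0}
```

Working on differences rather than raw counters means the sampling interval does not have to be exact, since each share is normalized by the total number of ticks observed in that interval.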
3.1. System composition

Figure 2 shows the network environment of this experiment. We prepared four servers and one layer 4 switch. The layer 4 switch used a round-robin algorithm to distribute accesses from many clients. WWW1 runs the Apache server on Solaris 2.7. WWW2 through WWW4 run Chamomile, our original server system built for this service.

[Figure 2: network diagram. The Internet connects to a Layer 4 Switch (Server Iron) in front of WWW1, WWW2, WWW3, and WWW4; each server is paired with ENMA, and the observations go to a Visualization host.]
Figure 2. Network Environment

In this experiment, we observe the servers using ENMA as shown in Figure 2. The visualization host acquires the observation data through the control line. Because the visualizer can handle data from several server hosts, we can see not only the data of each host but also the aggregate data of the whole system.

3.2. Results

Figure 3 shows the CPU utilization of WWW1 and WWW2. The CPU cycles consumed by kernel activity on the two systems are at the same level. However, the user-level activity of WWW2 is higher than that of WWW1. According to [2], a server system is saturated when the kernel share of CPU time reaches 90%. Because both servers stay below that level, WWW1 and WWW2 can process more WWW requests.

The left graph of Figure 4 shows the cumulative distribution of the response time of the two servers. The response time of Chamomile is a hundred times longer than that of Apache. One reason for the poor responsiveness lies in its session scheduler: Chamomile handles hundreds of sessions with a few threads, which causes large queuing delay.

We also investigate the performance degradation caused by running rep2 on the server system by comparing the server response time with and without rep2. The right graph of Figure 4 shows the cumulative distribution of the response time of WWW1 in both situations. Because the two distributions show no difference, our system is lightweight enough not to influence server performance.

[Figure 4: two plots. Left: cumulative distribution function (%) of response time (sec, logarithmic axis from 1e-05 to 100) for WWW1 and WWW2. Right: cumulative distribution function of response time (msec, logarithmic axis from 2 to 100) with rep2 and without rep2.]
Figure 4. Cumulative Distribution Function of Response time

4. Conclusion

We showed the effectiveness of our kernel monitoring system. We combined ENMA and rep2 and observed an actual system from both the outside and the inside.

Acknowledgement

This work was supported in part by the Research for the Future Program of the Japan Society for the Promotion of Science under the project "Integrated Network Architecture for Advanced Multimedia Application Systems" (JSPS-RFTF97R16301).

References

[1] Yutaka Nakamura, Ken-ichi Chinen, Hideki Sunahara, Suguru Yamaguchi. ENMA: The WWW Server Performance Measurement System via Packet Monitoring. In Inet'99, San Jose, CA, June 1999. http://enma.aist-nara.ac.jp/.
[2] J. Almeida, V. Almeida and D. Yates. Measuring the Behavior of a World-Wide Web Server. Computer Science Department, Boston University, October 29, 1996.
