Professional Documents
Culture Documents
Automated Network Management Using A Hybrid Multi-Agent System
Automated Network Management Using A Hybrid Multi-Agent System
SYSTEM
LP2 LP2
Proposals based on MAS usually replace the network 1 4 Mb 5 Mb 1 4 Mb 6 Mb
control mechanisms and the MAS is responsible for the LP1 4 5 LP1 4 5
network operation [6][7]. This results in poor robustness, 3 Mb 3 Mb
because if the MAS fails then the whole network fails. a) b)
4 Mb 6 Mb
Therefore, network providers are not confident about 5 Mb 2 3
LP2 2 3
using these systems. Another problem in these systems is LP3
LP3
their scalability. When the network resources and network LP2 1 5 Mb
1 4 Mb 6 Mb
users grow, the system becomes unmanageable, usually
4 5 4 5
due to too many co-ordination messages. LP1 LP1
3 Mb 3 Mb
c) d)
The main goals of our system are the integration and co- Fig. 1: Bandwidth Management. Initial situation: LP1 , LP2, and
ordination of all the mechanisms that act over the logical LP3 of 3, 4, an 5 Mb respectively. LP2 is congested and needs a
network, but with a robust and scalable MAS. The System bandwidth increase. a) using spare resources. b) using unused
resources assigned to other paths. c) re-routing the congested LP in
does not substitute the conventional Network control order to find the needed resources. d) re-routing of another path in
mechanisms but complements them (the management order to release the resources needed in the link to expand the
system can be activated or deactivated). If the MAS fails, congested one.
the network still works in a static way. Scalability is
assured by using a distributed view of the network in There are four typical cases, which are shown in Fig. 1.
order to minimise the management communications. (a) If there is enough spare bandwidth in the link, then the
congested LP is expanded using this bandwidth. (b) If
Moreover, the objectives of Traffic Engineering (TE) [8] there is not enough spare bandwidth and other LPs going
in an MPLS environment are similar to the objectives of through the same link are under-utilised, it is possible to
our own system, and there are also proposals to perform transfer resources from one LP to the other. If (a) and (b)
TE using MAS [9]. Accordingly, our system could be fail, then a re-routing is needed: (c) If the congested LP
considered as part of such TE mechanisms. finds another path with enough resources then it can be
re-routed. Otherwise, (d) other LPs may be re-routed
In Section 2 we briefly present the tasks the system through other links in order to free enough capacity to
should carry out and the system objectives. In Section 3 expand the congested LP.
we describe the MAS architecture and how it works. After
that, in section 4, we present the results obtained for a set Backup Path
2 3 2 3
of bandwidth management experiments. Finally, we give Backup Path
our conclusions and describe the work we plan to do in 1 4 1 4
Fault Fault
the future.
6 5 6 5
1 2 3
Working Path 2
Backup Backup
Path 1 Path 2
the logical network management system. In any case, the
system can also co-operate with these independent
systems.
6 5 4
Shared Bandwidth Our Multi-Agent architecture (Fig. 4) has two different
Fig. 3: Spare Capacity optimisation by sharing the bandwidth
sets of agents. First, there is a reactive type of agents
between backup paths.
whose main task is monitoring and they are called M-
Agents (Monitoring-Agents). Second, there is a set of
more deliberative agents, which are called P-Agents
3. SYSTEM CHARACTERISTICS AND (Performance-Agents), responsible for deciding the best
ARCHITECTURE way to achieve a maximum network performance. This
results in a hybrid agent architecture: M-Agents are
A Multi-Agent System is a good way of improving subordinated to P-Agents, and typically, any actions taken
network management for two main reasons: it is an by the M-Agents are under the supervision of the P-
inherently distributed solution and it introduces artificial Agents. When M-Agents detect a problem they cannot
intelligence based techniques in order to automate some deal with, then P-Agents take control.
day-to-day tasks of the human network managers.
P-Agent P-Agent
M M M M
Management M
Plane
P-Agent P-Agent
M M M M
M
M-Agents
Managed
Network
To achieve good scalability the number of P-Agents is The network simulation for the experiments is depicted in
static while M-Agents are lightweight processes. With Fig. 5. This network has 4 nodes and 4 bi-directional
respect to inter-agent communications, we apply the physical links. Each physical link has 100 Mbps of
constraint that only P-Agents are able to communicate capacity. There are 10 LPs numbered from 1 to 10 in the
outside a node and they can only establish communication figure. Initially each LP has an assigned capacity of 15
with their physical and logical network neighbours. If Mbps. All LPs have the same offered traffic load
some of the required information is not available in the (specified in table 1). Using negative exponential
neighbourhood, these neighbours can ask successively distributions for the interarrival time and duration, the
their own neighbours and so on. mean load for each LP is 100 Mbps, hence all links tend
to be congested.
If the logical network continuously changes, a
performance degradation is produced due to the increase
of management traffic. Therefore, the system should
control the number of bandwidth re-allocations.
link (LP 3 in Fig. 6). In the case of two LPs per link they
increased their bandwidth until they reached half of the
2 5 link capacity and then they competed for bandwidth (LPs
6
2 1 and 2 in Fig 6).
1 4
1 3 7 3
90
10 8
80
Logical 4 70
Paths 9
Capacity (Mbps)
60
50
Fig. 5: The simulated network with the established LPs. 40
30
Traffic Assigned Mean Interarrival Mean Duration 20
Class Bandwidth Time (s) (s) 10
1) 64Kbps 2 60 0
2) 2Mbps 10 120 0 500 1000 1500 2000 2500 3000 3500
3) 2Mbps 20 300
Time (s)
4) 4Mbps 30 120
5) 10Mbps 100 300 Bandwidth LP 1 Bandwidth LP 2 Bandwidth LP 3
Table 1: Generated traffic for each LP
Fig. 6: Capacities assigned to LPs begining in node 1 (LPs 1, 2, and 3),
using the Trigger function Reject-5 and every change of bandwidth is of
Another decision to take is the amount of bandwidth by 2Mbps.
which to increase an LP every time the trigger function On the other hand figure 7 shows some examples of
detects congestion. In these experiments we used a fixed interactions between agents. Most of the interactions are
capacity of 2Mbps. The simulation time was 1 hour in intra-node and only P-agents are able to communicate
each case and the general behaviour, shown in Fig. 6, was with other nodes.
as follows: In the case of a single LP per link, this LP
increased its bandwidth up to the maximum level of the
LP1 LP2
monitoring monitoring
Monitoring Monitoring
results results
Congestion Congestion
Detected Detected
Enough spare
Enough spare resources LP2 spans to other
LP1 resources nodes, LP2 congested
increase
LP1 Pre-reservation for
increased OK an LP2 increase
Enough spare
LP1 Pre-reservation for resources
increased OK an LP2 increase OK
OK LP2 increase
confirmation
[· · · ]
[· · · ]
[· · · ]
[· · · ]
[· · · ]
[· · · ]
[· · · ]
LP1
monitoring LP2
Monitoring
increased OK
results
LP2 increase
Congestion
authorised
Detected
LP1 congested LP2
Not enough increase
spare respurces LP2
Others M-agents on the increased OK
[· · · ]
[· · · ]
[· · · ]
LP1
increase
LP1
increased OK
LP1
increased OK
OK