
An HMM Approach for TCP Congestion Control Algorithm Estimation

Hugo Sousa Pinto
Faculdade de Engenharia da Universidade do Porto


dee11014@fe.up.pt

Abstract
TCP has an embedded congestion control algorithm by which senders limit the rate at which they send packets into the network, based on the congestion they perceive. There are two main stages in this algorithm: slow start (SS), where the TCP sender increases its rate exponentially, and congestion avoidance (CA), where the TCP sender adjusts its rate in an additive-increase, multiplicative-decrease fashion. Recently, researchers have argued that most of the flows in the Internet, such as short web-page transfers, never get enough time to exit the slow start phase. In practice, this means that they never get enough time to ramp up to full capacity. The main goal of this work is to study the impact that slow start has on different flows in the Internet. For that, we take a probabilistic approach based on an HMM to determine whether a flow has left slow start or not. After properly learning the HMM on a training set, this enables us to test and gather statistics on a large amount of data.

1. Introduction

The Internet is composed of a series of routers and links that have limited capacities. Millions of users are connected to the Internet, and it would not make sense to dimension resources to accommodate all users at the same time. As users join and leave the network, some resources are occupied and others are left free. These fluctuations are an inherent part of the Internet. When resource demands exceed capacity, congestion occurs. If all users always sent at their maximum rate, the buffers in the routers would overflow, which is what happened in the 1980s, when the Internet experienced its first congestion collapse. Researchers therefore had to come up with a mechanism to dynamically allocate resources to the different users. Different approaches to congestion control have been considered: end-to-end congestion control, where only the transport layer is part of the congestion control mechanism (routers are left out), and network-assisted congestion control, where routers would give explicit feedback to sources as to when to slow down. It is not clear which of the two approaches is better, but since the latter results in additional overhead, end-to-end congestion control has been widely deployed in the Internet.

1.1. Transmission Control Protocol

The Transmission Control Protocol (TCP) is one of the core protocols in the Internet [4]. Besides providing a reliable transport service between two processes running on different hosts, it also has a built-in congestion control mechanism. The approach taken by TCP is to have each sender limit the rate at which it sends traffic into its connection based on the perceived congestion. Each TCP sender keeps a congestion window cwd, which it initializes to 1 maximum segment size (MSS). The congestion window is a limit on the number of packets that can be sent without receiving their acknowledgements. Indirectly, this limits the sender's rate to roughly cwd/rtt, where rtt is the round-trip time. At an initial stage, called slow start (SS), the congestion window is increased by 1 MSS every time an acknowledgement packet is received. Since there is no explicit feedback regarding congestion, TCP takes advantage of the ACKs to infer information about network congestion. In particular, if ACKs are being received, the packets are reaching the destination and the network is not congested. During this stage, the sender begins transmitting at a slow rate (hence the designation slow start), but increases its rate exponentially. To see this, consider that one segment is sent into the network. After one RTT, its ACK will be received and two segments will be sent. After another RTT, two ACKs will be received and the congestion window will be increased by 2 MSS, resulting in 4 MSS being sent. Thus, the value of cwd doubles every RTT during this phase. If the congestion window reaches a given threshold or if TCP receives three duplicate acknowledgements, TCP interprets this as a signal of network congestion. Therefore, it cuts the congestion window in half and enters a new mode called congestion avoidance (CA). In this mode, the TCP sender additively increases its rate when it does not perceive congestion, and multiplicatively decreases its rate when it detects congestion. Each time an ACK is received, the congestion window is increased by MSS × (MSS/cwd). When a loss is detected by a triple duplicate ACK, the congestion window is cut in half. If a loss is detected by a timeout, the congestion window is set to 1 MSS and the connection re-enters slow start. A small simulation of these idealized dynamics is sketched below.
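To make the two phases concrete, here is a minimal Python sketch of the idealized window dynamics just described; the slow-start threshold, the number of rounds and the single loss event are illustrative assumptions, not measurements from this work.

```python
# Idealized evolution of the TCP congestion window (in MSS units).
ssthresh = 16          # assumed slow-start threshold (in MSS)
cwd = 1                # congestion window, initialized to 1 MSS
history = []

for rtt in range(12):
    history.append(cwd)
    if cwd < ssthresh:
        cwd *= 2       # slow start: the window doubles every RTT
    else:
        cwd += 1       # congestion avoidance: +1 MSS per RTT

# On a triple duplicate ACK, the window is cut in half (stays in CA);
# on a timeout, it would be reset to 1 MSS and re-enter slow start.
cwd = max(cwd // 2, 1)
history.append(cwd)

print(history)         # [1, 2, 4, 8, 16, 17, 18, 19, 20, 21, 22, 23, 12]
```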

1.2. Objectives
The main intention of this work is to study the impact that slow start has on flows across the Internet. Many researchers claim that most data transfers in the Internet never ramp up to full capacity, which results in underutilization of the links. As an alternative, they suggest, for example, that slow start should begin with a much larger congestion window [2]. It would thus be very interesting to know what percentage of flows never leave slow start, or what the average connection time or file size necessary for a flow to leave slow start is. Equipped with the proper probabilistic tools, we will try to answer some of these questions.

1.3. Hidden Markov models

Clearly, there are two distinct phases in the congestion control mechanism. One very important part of this machine learning project, as in many others, is to determine the features that allow us to distinguish the two phases. We can already get a bit ahead of ourselves and observe that what differs between the two phases is the way the throughput changes in time. We can also anticipate that our observations will be sequential and will depend, for example, on the current state of congestion control. For instance, at the beginning of a flow we will always be in slow start, and therefore it will be more probable to observe a significant increase in throughput. If we were to treat the observations as simple i.i.d. data, we would fail to exploit these sequential patterns. A very useful tool for treating sequential data are Markov models. Among these, hidden Markov models (HMMs) allow us to model systems where the state is not observable (it is hidden). An HMM can be represented by a Markov chain of latent variables, where each observation is conditioned on the state of the corresponding latent variable, as shown in Figure 1. In order to fully specify the model we need the transition probability matrix A, whose elements are the transition probabilities between states (latent variables), the prior probabilities of the initial latent variable, and the conditional distributions of the observed variables p(x_n|z_n). These last are known as emission densities and may be either continuous or discrete. In our application, the observations are continuous values and will be modelled as Gaussians, which means that they are fully specified by a mean and a covariance matrix.

Figure 1. Hidden Markov model representation ([1])

HMMs seem to be a very useful tool for our application of detecting whether a flow is in slow start or in congestion avoidance. In our problem, we observe sequences of data which depend on the state the flow is in. However, we do not have information about the current state; that is, for each packet that we observe there is no specific field saying whether it was sent in SS or CA. The generative structure of the model is sketched below.
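To make the model concrete, the following minimal Python sketch samples a sequence from a two-state Gaussian HMM; all parameter values here are hand-picked for illustration and are not the values learned later in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hand-picked parameters of a 2-state Gaussian HMM (state 0 = SS,
# state 1 = CA). These numbers are illustrative assumptions only.
prior = np.array([1.0, 0.0])        # flows always start in slow start
A = np.array([[0.95, 0.05],         # transition probability matrix
              [0.10, 0.90]])
means = np.array([1.0, 0.1])        # emission mean per state
stds = np.array([0.2, 0.5])         # emission std deviation per state

def sample_sequence(n):
    """Sample n observations, each conditioned on its hidden state."""
    z = rng.choice(2, p=prior)
    xs = []
    for _ in range(n):
        xs.append(rng.normal(means[z], stds[z]))   # emission p(x_n | z_n)
        z = rng.choice(2, p=A[z])                  # Markov transition
    return np.array(xs)

print(sample_sequence(10))
```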

2. Dataset
In order to perform this machine learning project, it was necessary to obtain a large number of packet traces from TCP flows in the Internet. One possible approach would have been to collect the data ourselves. However, the adopted solution was to download 2011 Internet traces from the CAIDA website [3]. The Cooperative Association for Internet Data Analysis (CAIDA) collects several different types of data at geographically and topologically diverse locations and makes this data available to the research community, while preserving the privacy of the individuals and organizations who donate data or network access. This means that, for example, the IP addresses in the traces are anonymized. However, this is done in such a way that if two IP addresses are equal in the original trace, they will also be equal in the anonymized trace. This preserves the most important characteristics and still allows researchers to employ machine learning techniques. Using datasets collected by specialists in the area seemed like a much more robust approach: seemingly minor methodological details can seriously influence or even invalidate any analysis that is subsequently performed on the data. The dataset used contains anonymized passive traffic traces from CAIDA's equinix-chicago and equinix-sanjose monitors on high-speed Internet backbone links. The Endace network cards used to record these traces provide timestamps with nanosecond precision. However, the anonymized traces are stored in pcap format with timestamps truncated to microseconds.

2.1. File format


The file format used to capture network traffic is pcap. It is the format used by the tcpdump program and can be read by Wireshark or CoralReef. However, these tools do not handle very large files well. As an alternative, it is possible to use the libpcap library to read pcap files.

At first, a pre-processing of the data was performed. The different TCP flows were separated based on their stream number. For each packet, the throughput was computed as:

Th = bytes sent / time elapsed = (tcp.seq + tcp.len) / frame.time_relative

An equivalent computation is sketched below.
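As an illustration of this computation (the project itself used Matshark, described next), the following Python sketch performs an equivalent per-packet throughput computation with Scapy, whose streaming reader can cope with files too large for the GUI tools. The trace file name is a placeholder, and using each flow's first data packet as the seq/time baseline is a simplifying assumption that mimics Wireshark's relative fields.

```python
from collections import defaultdict
from scapy.all import PcapReader, IP, TCP   # PcapReader streams large files

flows = defaultdict(list)   # (src, dst, sport, dport) -> [(elapsed, Th)]
start = {}                  # per-flow baseline: (first time, first seq)

for pkt in PcapReader("trace.pcap"):        # placeholder trace file
    if IP not in pkt or TCP not in pkt:
        continue
    payload = len(pkt[TCP].payload)
    if payload == 0:                        # ignore pure signalling packets
        continue
    key = (pkt[IP].src, pkt[IP].dst, pkt[TCP].sport, pkt[TCP].dport)
    if key not in start:                    # flow's first data packet is
        start[key] = (float(pkt.time), pkt[TCP].seq)   # used as the baseline
        continue
    t0, seq0 = start[key]
    elapsed = float(pkt.time) - t0
    if elapsed <= 0:
        continue
    # Th = bytes sent / time elapsed, with seq and time made relative
    th = (pkt[TCP].seq - seq0 + payload) / elapsed
    flows[key].append((elapsed, th))
```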

2.2. MATLAB toolbox


Sharktools is the name given to a small set of tools that allow the use of Wireshark's deep packet inspection capabilities from interpreted programming languages. The two currently supported languages are Python and Matlab; Matshark is the name of the tool for Matlab and was the one used in this project. Given an arbitrary pcap file, Sharktools uses Wireshark's display filter technology (which knows how to parse thousands of common and obscure network protocols) to cherry-pick packet fields of interest. Sharktools then provides this data as a cell array of structs in Matlab. A user can then plot packet fields with respect to time or carry out more complicated analysis of packet captures in his favorite programming environment. The change in throughput for each packet was then computed as the first differences of the throughput, with a step of 3. Only flows with at least 10 packets were considered. All the values were stored in a cell array, with each line corresponding to a flow and each flow having a series of change-in-throughput values. This change in throughput is a per-packet change and will not map directly onto the exponential or linear increase explained before. However, we can still expect the increase in SS to take different values compared to CA. A sketch of this feature computation is given below.
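A minimal sketch of the feature computation just described, assuming `per_flow_throughput` holds one array of per-packet throughput values per flow (the variable name and data layout are assumptions):

```python
import numpy as np

STEP = 3        # lag used for the first differences
MIN_PKTS = 10   # flows with fewer packets are discarded

features = []   # one entry per flow: its sequence of throughput changes
for th_values in per_flow_throughput:   # assumed: one Th array per flow
    th = np.asarray(th_values, dtype=float)
    if th.size < MIN_PKTS:
        continue
    dth = th[STEP:] - th[:-STEP]        # change in throughput, step of 3
    features.append(dth)
```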

2.3. Feature selection

Based on the packet fields provided by the TCP header and on the timings of the packets, an important step was to figure out what information could be used to distinguish slow start from congestion avoidance. It is important to note that we are monitoring packets at a backbone router, and information about the congestion window is not present in the packets. As should be clear from the explanation given at the beginning, one characteristic of slow start is that during this phase the throughput increases exponentially, whereas when a flow is in congestion avoidance the throughput only increases linearly. In order to measure the throughput of each of the flows, the following variables were read from the dataset:

frame.time_relative, which measures the time elapsed since the beginning of the capture.

tcp.seq, which corresponds to the sequence number of the first byte sent in the current packet.

tcp.len, the length of the current packet.

tcp.stream, the ID of the current TCP stream, which uniquely identifies a TCP connection between a 4-tuple (IP1, IP2, PORT1, PORT2).

The dataset was filtered so that only TCP packets were read. Furthermore, only packets whose TCP payload was greater than zero were considered, since we are interested in the bulk transfers and not in the signalling packets.

3. Learning

The next step was to use a big training set to learn the parameters of the HMM. This can be done using maximum likelihood estimation, that is, determining the parameters that maximize the probability of observing the given data, p(X). Since we do not know the values of the hidden states, we have to perform unsupervised learning. As it is not possible to obtain a closed-form solution in this case, expectation maximization (EM) has to be performed. The training starts with some initial parameters for the model, θ_old. In the E step, these parameters are used to find the posterior distributions of the latent variables, p(Z|X, θ_old). In the M step, we maximize the expectation of the logarithm of the complete-data likelihood, while keeping p(Z|X, θ_old) fixed. This is done iteratively until some stopping criterion is met. The learning was performed by applying an HMM toolbox to a training set. However, we still needed to provide the initial estimates for the prior of the first latent variable, for the transition matrix, and for the emission parameters. As suggested in [1], the emission parameters were initialized by fitting a mixture of Gaussians to the whole dataset. In this step we obviously lose the sequentiality of the data, but it gives us a good initialization for the means and covariances of the Gaussian distributions. The algorithm tried to find two mixture components, one corresponding to SS and the other to CA. Since a flow always starts in slow start, the prior of the SS state is set to 1 and the prior of CA is set to 0. The transition matrix was initialized based on the intuition that a flow will very rarely return to SS after shifting to CA. When in SS, it was assumed to be equally probable to stay in SS or to switch to CA. A sketch of this procedure is given below.
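As an illustrative analogue of this procedure (the original work used a Matlab HMM toolbox), the following sketch uses Python's hmmlearn, initializing the emissions with a Gaussian mixture fit and fixing the priors as described. The exact numbers in the CA row of the initial transition matrix are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from hmmlearn import hmm

# features: list of 1-D arrays of throughput changes, one per flow
X = np.concatenate(features)[:, None]       # pooled observations, shape (N, 1)
lengths = [len(f) for f in features]        # per-flow sequence lengths

# Initialize the emissions by fitting a 2-component Gaussian mixture
# to the pooled data, deliberately ignoring sequentiality [1].
gmm = GaussianMixture(n_components=2).fit(X)

model = hmm.GaussianHMM(n_components=2, covariance_type="full",
                        init_params="", params="tmc", n_iter=50)
model.startprob_ = np.array([1.0, 0.0])     # flows always start in SS
model.transmat_ = np.array([[0.50, 0.50],   # SS: stay or switch equally likely
                            [0.01, 0.99]])  # CA: rarely returns to SS (assumed)
model.means_ = gmm.means_
model.covars_ = gmm.covariances_

model.fit(X, lengths)                       # EM (Baum-Welch) until convergence
```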

4. Decoding
The problem of decoding corresponds to finding the most probable sequence of values for the hidden states. This allows us to estimate when a flow transitions from SS to CA. Note that this is a different problem from finding the most probable hidden state at each instant of time (the resulting sequence may not even be a possible path). Naively, we would have to evaluate exponentially many paths. However, the Viterbi algorithm allows us to compute the best path through a message passing scheme in which we only need to keep track of K paths: at each time step, we only keep the best path leading to each of the states. The Viterbi algorithm was applied, using a Matlab toolbox, to a test set different from the training set used to fit the parameters. The most probable paths for the latent variables of the different flows were then stored in a cell array structure. A minimal sketch of the algorithm is shown below.
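The following is a minimal log-space Viterbi sketch for scalar Gaussian emissions, written as a stand-in for the toolbox routine actually used; the parameters would come from the learning step above.

```python
import numpy as np

def viterbi(x, prior, A, means, variances):
    """Most probable hidden-state path for observations x (log-space).

    Scalar-Gaussian emissions; keeps only the best path into each state.
    """
    K, N = len(prior), len(x)
    # log-emission probabilities, one row per state
    logB = np.array([-0.5 * ((x - means[k]) ** 2 / variances[k]
                             + np.log(2 * np.pi * variances[k]))
                     for k in range(K)])
    with np.errstate(divide="ignore"):       # log(0) = -inf for the CA prior
        logA = np.log(A)
        delta = np.log(prior) + logB[:, 0]
    psi = np.zeros((K, N), dtype=int)        # best predecessor of each state
    for n in range(1, N):
        scores = delta[:, None] + logA       # scores[i, j]: best path i -> j
        psi[:, n] = np.argmax(scores, axis=0)
        delta = np.max(scores, axis=0) + logB[:, n]
    path = np.empty(N, dtype=int)            # backtrack the single best path
    path[-1] = int(np.argmax(delta))
    for n in range(N - 1, 0, -1):
        path[n - 1] = psi[path[n], n]
    return path

# e.g., with the hmmlearn model sketched above:
# path = viterbi(dth, model.startprob_, model.transmat_,
#                model.means_.ravel(), model.covars_.ravel())
```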

Figure 2. Percentage of flows that stay in SS or shift to CA

5. Results
In this section, the most important results of the work are presented. In the unsupervised training, the following parameters were learned: prior = mean = 1 0

In a critical perspective, it can be said that the loglikelihood values obtained during the training were very low. This is perhaps due to the fact there is a lot of data, and thus the proabiblity of observing so much data is almost equal to zero. One way to overcome this fact, could be to use a MAP with a prior covariance that is larger, so as to increase the probability of the observed data.

1.0038 0.8899

6. Conclusion
The focus of this project was more on applying the available machine learning tools in a proper way than to implement the actual tools. In this context, it was possible to put into practice tools and algorithms learned in class such as HMMs, maximum likelihood estimation, mixture of gaussian, k-means, EM, etc. The processment of the data and feature selection turned out to be a little cumbersome and time-consuming, but in the end it was possible to use a very big dataset with a lot of information concerning real-world ows in the Internet. The probabilistic model developed allowed to analyse a big test set of transfers, leading us to the conclusion that in fact many ows in the Internet never get past the slow start phase.

Sigma1 = 0.0383 Sigma2 = 0.8151 A= 0.9915 0.2092 0.0085 0.7908

The rst problem that we proposed to solve was to nd how many ows of the test set ever leave slow start. By analysing all the path sequences, we can easily verify how many of them shifted to CA and how many remained in SS. We observed that 38.48% of the ows never left slow start. Furthermore, we are only considering ows that have more than ten packets, so this value should be even bigger if we considered every ow. The second problem that we wanted to address was how many packets on average have to be sent before a ow enters congestion avoidance. For this, we analysed all the most probable path sequences and kept the index of the packet where the most probable state starts being CA. We can conclude that on average we have to wait for 14.83 packets to be sent before shifting to CA. For an MSS of 1500 bytes, this means that we have to send aproximately 22kb of data before shifting to CA.

7. Future work
An interesting extension to HMMs is the switching linear dynamical system (SLDS). In an HMM, the latent variables are discrete. An extension is to consider continuous latent variables, which results in a linear dynamical system (LDS). The SLDS can be viewed as a combination of linear dynamical systems with a hidden Markov model, where the HMM allows us to switch stochastically between the different LDSs. An LDS would be very useful in modelling the TCP congestion control mechanism, since the evolution of the congestion window is in fact a linear dynamical system, where the state (the congestion window) evolves through time, with each value generating a specific range of throughput emissions.

References
[1] C. M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
[2] N. Dukkipati, T. Refice, Y. Cheng, J. Chu, T. Herbert, A. Agarwal, A. Jain, and N. Sutin. An argument for increasing TCP's initial congestion window. SIGCOMM Comput. Commun. Rev., 40:26-33, June 2010.
[3] kc claffy, D. Andersen, and P. Hick. The CAIDA Anonymized 2011 Internet Traces.
[4] J. F. Kurose and K. W. Ross. Computer Networking: A Top-Down Approach. Addison-Wesley, 5th edition, 2009.
