Yang-SIP

HELSINKI UNIVERSITY OF TECHNOLOGY
Department of Computer Science

Laboratory of Telecommunication Software and Multimedia
Yang Yang
SIP over Client Initiated Connections
Master’s Thesis submitted in partial fulfillment of the requirements for the degree
of Master of Science in Technology.
Otaniemi, May 1, 2007
Supervisor: Professor Antti Ylä-Jääski

Instructor: Sasu Tarkoma, Ph.D.
HELSINKI UNIVERSITY OF TECHNOLOGY
ABSTRACT OF THE MASTER’S THESIS
Author: Yang Yang
Name of the Thesis: SIP over Client Initiated Connections
Date: May 1, 2007 Number of pages: 46 + 9
Department: Department of Computer Science
Professorship: T-110 Telecommunications Software and Multimedia
Supervisor: Prof. Antti Ylä-Jääski
Instructor: Sasu Tarkoma, Ph.D.
SIP outbound, an extension of SIP, enables client-initiated connections in the SIP
signaling system. This feature is desirable when a NAT or firewall sits between
the public and the private side, because in that situation connections are only
allowed from the private side to the public side. SIP outbound proposes a mechanism
that keeps the client-initiated connections between a UA and its proxies open and
later reuses the same connections to push data from the proxies to the UA. This
mechanism ensures successful traversal of the NAT or firewall.
In this thesis we implemented the SIP outbound protocol as an extension of SIP,
integrated it into the WeSAHMI experimental infrastructure, and then evaluated the
performance of the system as a whole.
Keywords: SIP, SIP outbound, STUN keepalive, backoff mechanism, flow token, NAT.
Acknowledgements
I want to thank my supervisor, Professor Antti Ylä-Jääski, and my instructor,
Sasu Tarkoma, Ph.D., for giving me the opportunity to participate in the WeSAHMI
project and for their guidance in completing my thesis.
Many thanks go to Jani Heikkinen and Sergio Lembo for their constructive ideas
and practical help.
My gratitude also goes to my parents, my husband and my friends for their mental
support.
Otaniemi, May 1, 2007
Yang Yang
Contents
Abbreviations vi
List of Figures ix
List of Tables x
1 Introduction
1.1 Research problem
1.2 Brief motivation
1.3 Structure of the thesis
2 Background
3 System Model
3.1 WeSAHMI architecture
3.2 WeSAHMI security architecture
3.3 SIP outbound
4 SIP client-initiated outbound
4.1 Overview of the mechanism
4.2 User agent behavior
4.2.1 Flow establishment
4.2.2 Flow recovery
4.2.3 Keepalive mechanism
4.3 Edge proxy behavior
4.3.1 Flow token
4.3.2 Forwarding Mechanism
4.3.3 Keepalive mechanisms
4.4 Registrar behavior
4.5 Authoritative proxy behavior
5 Implementation
5.1 Open source libraries
5.2 User agent routine
5.2.1 Termination of a flow
5.2.2 Failures of a flow
5.2.3 Re-registration
5.3 TCP keepalive
5.4 STUN keepalive over UDP
5.4.1 Overview of the mechanism
5.4.2 STUN server and client
5.4.3 STUN attributes
5.4.4 STUN retransmission mechanism
5.5 Edge proxy
6 Experimentation
6.1 Experimental infrastructure deployment
6.2 Experiment for SIP over UDP with SIP outbound features
6.2.1 Experiment for STUN keepalive
6.3 Experiment TCP keepalive
7 Discussion
8 Conclusions
8.1 Summary
8.2 Future work
A Appendix
A.1 Important data structures
A.2 Important modifications to the eXosip and osip libraries
A.3 APIs for base64 encoding
A.4 APIs for STUN keepalive
Abbreviations
AOR Address of Record, a well-known address for a user. In SIP, it is a
SIP URI.
ALG Application Layer Gateway
API Application Programming Interface
B2BUA Back to Back User Agent
DNS Domain Name System, a global decentralized directory that translates
domain names into IP addresses.
DNSSRV Domain Name System Service Record Working Group, an IETF
working group that specified a DNS extension enabling finding of
an IP address of a service based on a protocol and domain.
DHCP Dynamic Host Configuration Protocol, an Internet protocol for
automating the configuration of devices using TCP/IP.
DTLS Datagram Transport Layer Security
EP Edge Proxy, any proxy that is located topologically between the
registering User Agent and the Authoritative Proxy.
HTTP Hypertext Transfer Protocol, a web browsing protocol.
HMAC Hash-based Message Authentication Code, a type of message
authentication code calculated using a cryptographic hash function in
combination with a secret key.
ICE Interactive Connectivity Establishment
IETF Internet Engineering Task Force
IP Internet Protocol
NAT Network Address Translation, enables a local area network to use one
set of IP addresses for internal traffic and a second set of addresses
for external traffic.
NTP Network Time Protocol, a protocol for synchronizing the clocks of
computer systems over data networks.
SDP Session Description Protocol, a format for describing the types of
media to use in a session.
SHA-1 Secure Hash Algorithm Version 1.0, a standard for computing a
condensed representation of data.
SIPCOMP Signaling compression, a framework used to compress signaling
messages using arbitrary compression algorithms.
SIP Session Initiation Protocol
SIP URI A uniform resource identifier with the scheme "sip:". SIP systems
use the domain component along with DNS to determine where to
send SIP messages.
SMTP Simple Mail Transfer Protocol, a protocol for email.
SSL Secure Sockets Layer, a predecessor of TLS.
STUN Simple Traversal Underneath Network Address Translation
TCP Transmission Control Protocol, an Internet protocol that establishes
reliable connections over IP.
TLS Transport Layer Security
UAC User Agent Client
UDP User Datagram Protocol, a connectionless Internet protocol running
on top of IP.
UMTS Universal Mobile Telecommunications System
URL Uniform Resource Locator, a name used to represent an address or
location in the Internet.
UUID Universally Unique Identifier
WeSAHMI Web Services in Ad-Hoc and Mobile Infrastructure
List of Figures
2.1 Data push and pull service
2.2 Data pull service with an edge proxy
3.1 Deployment of SIP outbound in WeSAHMI security architecture
4.1 Explicit probe before sending STUN messages
4.2 Explicit probe after no success STUN response received
4.3 The format of S
4.4 Forwarding mechanism of EPs
6.1 Experimental environment
6.2 Flow sequence of SIP messages
List of Tables
4.1 Updated binding behaviour in SIP outbound
5.1 Registration behavior
5.2 STUN attributes supported by the implementation
6.1 REGISTER request proxied to the primary EP
6.2 REGISTER request proxied to the secondary EP
6.3 200OK response received by the UA from the primary EP
6.4 200OK response received by the UA from the secondary EP
6.5 SUBSCRIBE request sent by the UA to its Notifier
6.6 NOTIFY request sent from the Notifier to the UA
6.7 STUN Binding request
A.1 Modification to eXosip and osip libraries
Chapter 1
Introduction
The growth of Internet usage has led to the assimilation of telephony services
into Internet Protocol (IP) [6] technology, which has stimulated the development
of signaling protocols to set up and tear down multimedia sessions. Different
communities have proposed solutions in accordance with their own priorities and
interests. The Session Initiation Protocol (SIP), born in a computer science
laboratory within the last decade, satisfies the growing thirst for a new
generation of IP-based services [4].
SIP is a signaling protocol used for establishing sessions in an IP network. It
was developed by the IETF as part of the Internet Multimedia Conferencing
Architecture [29]. It incorporates elements of two well-known protocols: the Web’s
Hypertext Transfer Protocol (HTTP) formatting protocol [23] and the Simple Mail
Transfer Protocol (SMTP) e-mail protocol [22] [1]. Its first major use has been
signaling in Internet telephony [30]. However, SIP’s utility does not end with
telephony: it is already employed as a basic technology for instant messaging and
presence.
SIP resolves two significant issues in establishing these real-time communication
sessions. First, it helps the participants who want to communicate locate each
other on the Internet (rendezvous). Second, it allows those participants to
negotiate how they are willing to communicate.
Nowadays, more and more carriers and providers offer SIP-based services such as
local and long distance telephony, presence and instant messaging, voice messaging,
push-to-talk, rich media conferencing, and so on. All these media communications
rely on SIP as a signaling protocol, since SIP allows proxy servers to initiate
TCP connections and send asynchronous UDP datagrams to User Agents (UAs).
SIP will be used as the primary signaling technology in the next generation of
mobile communication.
However, because of the presence of Network Address Translators (NATs) and
firewalls, the network is segmented, so that SIP servers, such as registrars or
proxies, cannot initiate connections to UAs. A firewall located between the UA and
the proxy servers will block connections to the UA. Similarly, NATs only allow
connections from the private address side to the public side.
The effect of NAT has been studied in recent years. Several extensions to the
original SIP specification [8] have been proposed that allow a UA to receive
incoming signaling requests from the server side.
1.1 Research problem

SIP enables end systems and proxy servers to establish multimedia sessions
with each other. However, as discussed above, only connections initiated by the
UA towards the server can be established; connections in the reverse direction,
that is, server-initiated connections, are not possible. This is because a SIP
endpoint behind a NAT sends messages that carry only its private address and
unmapped port, which are useless to other endpoints not behind the same NAT.
Moreover, most NATs and firewalls prevent incoming TCP connections and UDP traffic
from the public side. This drawback of NATs impedes the end-to-end connectivity of
SIP. A SIP endpoint will not work in such a situation unless extensions to SIP
are implemented.
The above problem can be partially resolved by deploying an Application Layer
Gateway (ALG) [20] inside the NAT. A SIP-aware ALG can inspect the message
and map the internal addresses and ports to outside addresses and ports. But this
requires the ALG to know the nuances of every new use of SIP. Since SIP is a
framework protocol rather than a single application, this method cannot be a
cure-all mechanism.
An improved version of the same idea is to put a pair of UAs back to back across
the NAT/firewall point. This pair of UAs is known as a Back to Back User Agent
(B2BUA) or session border controller [10]. The B2BUA acts as a UA server on one
side and as a UA client on the other side, terminating and re-originating signaling
and media on both sides. However, a B2BUA has to learn any new protocol features
before allowing them to pass.
To make it easier for endpoints to traverse NATs, the Simple Traversal Underneath
NATs (STUN) protocol [12] was proposed years ago. Through the STUN protocol, a SIP
UA can detect how a NAT device maps its IP address and port from the private side
to the public side. But the addresses obtained may not be usable by all peers, so
STUN alone cannot solve the NAT traversal problem. An extension of STUN, known as
Traversal Using Relays around NAT (TURN), allows a SIP client behind a
NAT/firewall to receive incoming data over TCP or UDP connections [11]. However,
it only supports the connection of a client behind a NAT to a single peer, and the
cost of providing a TURN relay server is so high that TURN would only be desirable
as a last resort. The Interactive Connectivity Establishment (ICE) methodology
[26] can be used to discover the optimal means of connectivity using various
techniques, such as STUN and TURN [27].
In the worst case, a SIP client may find itself behind a NAT/firewall that prevents
all incoming traffic except packets of a TCP stream that the client itself opened.
The SIP outbound extension [5] was proposed for this case. It reuses the connection
initiated by the UA to the EP, once the UA has successfully established that
connection by sending REGISTER requests. When a UA initiates a connection to the
proxy, the proxy can later reuse this flow to push SIP messages to the UA. Since
the server cannot reach the UA, it is the UA’s duty to keep the flow active at
all times.
This thesis presents an implementation of the SIP outbound extension based on the
Internet draft [5], together with experiments and an evaluation of the new features
that examine the complexity and usability of SIP outbound. Some updates to the
specification were proposed to meet implementation needs.
1.2 Brief motivation

This thesis was carried out in the WeSAHMI project. The WeSAHMI project
implements an experimental infrastructure for interactive wireless applications
that can operate in an ad-hoc networking environment. In addition, a demo
application suite for an airport environment is to be implemented [2]. The WeSAHMI
security architecture employs SIP as the session-level communication protocol in
IP networks. After the upper layer has completed identification of all entities,
the client system starts a secure session with a gateway. This thesis implemented
the SIP outbound protocol as an extension of SIP. With the extended features
addressed in [5], the client can initiate a secure channel that opens ports for the
client in the gateway (called the EP in the following chapters). After the secure
channel has been established, the client keeps it active, so that the gateway can
later push SIP messages to the client.
1.3 Structure of the thesis

Chapter 1 introduces the general background knowledge and presents the research
problem. Chapter 2 addresses the effects of combining NATs and firewalls with SIP
signaling and gives background information on the WeSAHMI project. Chapter 3
introduces the system model of the WeSAHMI project and how SIP outbound fits into
the overall WeSAHMI architecture. Chapter 4 presents the SIP outbound protocol in
more detail and discusses its challenges from an implementation point of view.
Chapter 5 reviews our SIP outbound implementation and how we integrated the STUN
protocol into SIP. Chapter 6 tests the implementation in a simplified system
against the features required by SIP outbound. Chapter 7 discusses the performance
of the system after it has been extended with SIP outbound, and other possibilities
for the flow token algorithms. Chapter 8 concludes the thesis and presents future
work.
Chapter 2
Background
Originally, NAT devices were used to connect an isolated address realm to an
external realm with globally unique registered addresses [21], which effectively
extends the address space. SIP packets leave a NATed client with its private IP
addresses packed into the message headers (Via and Contact headers) and SDP bodies
[9], and a NAT device is not aware of them. So when the packets reach their
destination, they are processed, but the responses are sent to source addresses
that are completely useless outside the NAT.
The effect of NATs and firewalls on signaling systems has become an active research
topic [17] [28] [34]. Several solutions have been proposed to allow SIP to traverse
NATs and firewalls effectively [5], [26]. These include using TCP for SIP instead
of UDP, employing a keepalive program to maintain NAT bindings, and using
STUN/TURN servers.
The key to successful NAT/firewall traversal is that the remote host knows which
global port and IP address have been assigned by the NAT for a given flow. The
extension of SIP called ICE [26] relies on two new protocols being developed in the
IETF, STUN and TURN. STUN allows a host to learn the global IP address and
UDP port assigned by its outermost NAT box. The address can subsequently be
conveyed by SIP to allow direct UDP connectivity between hosts. TURN allows a
host to select a globally addressable TCP relay, which can subsequently be used to
bridge a TCP connection between two NATed hosts. Unlike STUN, TURN does
not allow direct connectivity between NATed hosts.
Unlike the ICE extension, SIP outbound inserts an extra network entity, the
edge proxy (EP), to traverse NATs and firewalls with a client-initiated connection
mechanism. The SIP client initiates secured connections to at least two EPs by
sending REGISTER requests. These secured connections are maintained by the client
and the EPs so that the EPs can later push data to the client through them.
This feature requires the EP to work not only as a SIP proxy but also as a
keepalive server. The EP also has to be able to distinguish the connections
initiated by different clients, which it does by assigning a different flow token
to each connection. Communication with untrusted external domains is delegated to
the EPs, since clients are invisible to outer domains. Failure tolerance is also
considered in [5], which proposes multiple registrations and deployment on multiple
physical hosts.
As part of the security model of the WeSAHMI architecture, this thesis presents
the implementation of SIP outbound as an extension of SIP. The WeSAHMI project
implemented an application for an airport environment. In the airport scenario, a
crucial matter is the delivery of real-time information updates to the passengers
and employees of the airport. Such updates include flight delays or cancellations,
changes of departure gates, and the like. The delay introduced by the information
delivery process is also crucial: the airline information system should push
information about the updated situation to passengers in time. In the WeSAHMI
project, two principal communication services are required between the Finnair
application server and the passengers: pull and push services. Both of these
services are carried out through SIP.
SIP enables clients to register to certain services. Once registered, clients can
pull information from the content server, and the server can send asynchronous
notifications to the client. As shown on the left side of Figure 2.1, the client
sends a SUBSCRIBE message, which is acknowledged by the notifier with a NOTIFY
message. This is the push service.
The pull service is similar. The client has to know what content to pull from the
notifier. The notifier can send descriptions of the available content by using the
push service. Once the client knows what services are available, it can decide what
content to pull from the notifier. As shown on the right side of Figure 2.1, the
notifier first sends a NOTIFY message which carries a description of the available
services. Later, the client sends a SUBSCRIBE request to query the service, which
is acknowledged by a NOTIFY with the real data of the service.
The security architecture of the WeSAHMI system is used to establish authentication
and authorization between clients and the WeSAHMI server. SIP outbound proposes
an additional networking element (the Edge Proxy) consisting of transport and
security mechanisms. The EP is inserted topologically between the UA and the
notifier, so the pull service procedure above has to be adjusted as shown in
Figure 2.2. The push service is similar, so it is not illustrated in the figure.
All incoming and outgoing messages have to be forwarded through the EP.
Figure 2.1: Data push and pull service
Figure 2.2: Data pull service with an edge proxy

Chapter 3
System Model
3.1 WeSAHMI architecture

In [2], an experimental infrastructure is specified for interactive wireless
applications operating in a mobile ad-hoc networking [25] environment. A practical
application is deployed for an airport environment. The system provides a mobile
check-in service for passengers in the airport. The user of the system can take the
necessary actions with her or his mobile device, such as check-in, registration for
a flight, baggage drop and passing the security gate.
To support the above functions, the infrastructure must provide identification
of mobile users and tracking of their presence; delivery of content, notifications,
and status updates to mobile users in a server-initiated fashion; and management
and updating of the state of both clients and servers in real time.
The WeSAHMI architecture consists of the following components:
WeSAHMI server: plays a central role, relaying data from the external model to
the client browser.
Client browser: an X-smile browser on a client node.
Security architecture: used to establish a secure channel between clients and
the server.
WWW server: an Apache WWW server that hosts user interface components
and relays client input to the WeSAHMI server.
3.2 WeSAHMI security architecture

Our implementation resides in the security architecture. The security architecture
is designed to push data from the trusted WeSAHMI environment to the untrusted
wireless network environment. An extra network entity, the edge proxy, is added to
the architecture to ensure secure push delivery of data. The edge proxy is equipped
with transport and security mechanisms. The edge proxy is a logical entity;
physically, we can deploy multiple hosts to reduce the chance of notifications
being lost because of the failure of a single element.
Other elements included in the architecture are the mobile hosts and the
notification service. The mobile host, working as a SIP UA, can initiate a
connection to the EP by sending a REGISTER request to the registrar. The registrar
then challenges the mobile host for authentication. After a successful
registration, indicated by a 200 OK response, the mobile host sends STUN Binding
requests over the same flow that it uses for sending SIP messages in order to keep
the flow active. This established and ongoing flow will later be used for secure
push. The notification service also works like a SIP UA. It fetches the contact
address of the mobile host by querying the registrar. The NOTIFY request is
forwarded to the EP, and the EP then forwards it to the mobile host through the
existing connection initiated by the mobile host.
3.3 SIP outbound

SIP is used to provide pull and push services to the WeSAHMI system. For example,
a client can register to certain services, and then pull data from the service
provider or receive asynchronous notifications from the service provider. But
because of the presence of NATs and firewalls, connections from the server side to
the client side become impossible. That is, the service provider cannot deliver
asynchronous data to clients, which is an expected feature of the WeSAHMI system.
To solve this problem, we have to add new features to basic SIP according to one
of the SIP extensions, namely SIP outbound [5].
We insert an extra entity into the security architecture, namely the EP. Any
client that wants to subscribe to a certain service must first establish a direct
flow to its EPs by sending REGISTER requests. A local daemon on the client takes
charge of the registration and also handles the SUBSCRIBE/NOTIFY messages.
After successful registrations, the daemon may send a SUBSCRIBE message to a
content server, forwarded through one of its outbound EPs, which the content
server acknowledges with a NOTIFY. Conversely, if a message from the content server
arrives, the daemon delivers the message to the client application, such as the
browser. Figure 3.1 shows where we deploy the SIP outbound component in the
WeSAHMI architecture.
Figure 3.1: Deployment of SIP outbound in WeSAHMI security architecture
The client daemon uses a keepalive mechanism to keep the flows to its outbound
EPs always active. So when the content server wants to push messages to clients,
it can always reach the client from the public side through a secured channel.
Chapter 4
SIP client-initiated outbound
This chapter briefly describes the SIP outbound extension. We adjusted the
structure of the SIP outbound draft [5] and organized it for convenient
implementation.
4.1 Overview of the mechanism

SIP outbound is specified for environments in which a registrar, or more generally
a proxy server, cannot initiate direct connections to a UA behind a NAT box or
firewall. The key idea of SIP outbound is that when a UA initiates a connection to
a proxy server, the proxy server can later reuse the same connection to forward
requests to the UA. The UA must of course keep the connection active by using a
keepalive mechanism.
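The connection reuse idea can be illustrated with a minimal Python sketch. It is
our own simplification and not code from the implementation: the names Flow,
FlowTable, on_register and push are hypothetical, flows are keyed directly by AOR
rather than by the flow tokens described in section 4.3.1, and a list stands in
for the real socket.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Flow:
    """A client-initiated connection as seen by the proxy (hypothetical model)."""
    remote_addr: Tuple[str, int]  # public (ip, port) mapping created by the NAT
    outbox: List[str] = field(default_factory=list)  # stands in for the socket

    def send(self, message: str) -> None:
        self.outbox.append(message)


class FlowTable:
    """Remembers which flows each UA opened so requests can be pushed back."""

    def __init__(self) -> None:
        self._flows: Dict[str, List[Flow]] = {}

    def on_register(self, aor: str, flow: Flow) -> None:
        # Record the connection on which the REGISTER request arrived.
        self._flows.setdefault(aor, []).append(flow)

    def push(self, aor: str, message: str) -> bool:
        # Reuse an existing flow; the proxy never opens a new connection itself.
        flows = self._flows.get(aor, [])
        if not flows:
            return False
        flows[0].send(message)
        return True
```

If no flow is known for the AOR, push reports failure instead of trying to connect,
which mirrors the constraint that only the UA can open connections through the NAT.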
To achieve high reliability of connections, the UA can form multiple flows to the
proxy server (known as the EP in SIP outbound) by registering multiple times over
different connections for the same SIP AOR. Each REGISTER request includes an
instance-id (used to identify the UA uniquely) and a reg-id label (to distinguish
different flows). Each flow is kept active by using either the STUN keepalive
mechanism over a UDP connection or TCP keepalive.
In the following sections, we describe in more detail the behaviors of the four
networking entities (UA, EP, registrar and authoritative proxy) that support the
SIP outbound features.
4.2 User agent behavior

4.2.1 Flow establishment
At configuration time, UAs obtain a set of SIP URIs representing the default
outbound proxy set. The configuration mechanism is excluded from [5]. However, it
is a key point for the implementation; for more implementation details, please
refer to chapters 5 and 7. The number of URIs in this set should be at least
two and no more than four. For each outbound proxy URI in the set, the UA must
send a REGISTER request to form a direct flow to the EP. The EP forwards the
request to the registrar, and then everything works as in normal SIP: the registrar
may challenge the UA for authentication; the UA sends its credentials and waits for
the 200OK response from the registrar, which indicates a successful registration.
The UAC is required to support the Path header mechanism by including the
’path’ option-tag in a Supported header field value in its REGISTER requests.
Successful registrations are indicated by the presence of ’outbound’ option-tags in
Supported header field values in responses, which shows that the registrar and all
EPs traversed by the request support the SIP outbound extension.
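The option-tag check described above can be sketched in a few lines of Python.
This is only an illustration under simplifying assumptions: the helper names are
hypothetical, the header value is assumed to be already extracted from the
message, and only a single comma-separated Supported value is handled.

```python
def option_tags(header_value: str) -> set:
    """Split a comma-separated Supported header value into lowercase tags."""
    return {tag.strip().lower() for tag in header_value.split(",") if tag.strip()}


def register_supported_value() -> str:
    # The UAC advertises the Path mechanism in its REGISTER request.
    return "path"


def outbound_confirmed(response_supported_value: str) -> bool:
    # Per the text above, the 'outbound' tag in the response signals that the
    # registrar and the traversed EPs support SIP outbound.
    return "outbound" in option_tags(response_supported_value)
```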
The failure of a registration is indicated by the UA receiving a 503 (Service
Unavailable) response with a Retry-After header field. The UA then needs to recover
the flow, using the backoff mechanism to decide when to re-register. Details about
flow recovery can be found in the paragraph on the backoff mechanism in section
4.2.2.
Instance ID and Register ID
SIP outbound [5] introduces two new parameters for the Contact header field:
Instance Identifier (instance-id) and Registration Identifier (reg-id). In a
signaling system supporting SIP outbound, each UA is identified uniquely by a
persistent instance-id URN. This instance-id must persist even if the UA reboots or
is power cycled, and must not change as the device moves from one network to
another. The UA uses a UUID URN [19] as its instance-id and attaches it to the
Contact header field as a "+sip.instance" media feature tag.
A UUID URN does not require a central registration process, so no centralized
authority is required to administer UUIDs. In our mobile wireless environment, this
is a favorable feature because it minimizes additional entities. Furthermore, a
UUID has a fixed size of 128 bits, which is reasonably small compared to other
alternatives. And because a new UUID can be generated without a registration
process, UUIDs are among the URNs with the lowest minting cost [19].

Another new Contact header field parameter is reg-id, added by the UA. The
UA uses the reg-id to distinguish different flows, since it can register multiple
times over different connections for the same SIP AOR. The reg-id does not have to
be incremented sequentially, but it has to be unique for each flow. When the UA
power cycles or reboots, the reg-id has to remain the same as that of the previous
flow so that the registrar can replace the older registration [5].
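To make the two Contact parameters concrete, the following Python sketch builds
such a header field. It is an illustration only: the function names are
hypothetical, the contact URI and UUID used in the usage note are example values,
and a real UA would generate the instance-id once and store it persistently
instead of creating a new one on every start.

```python
import uuid


def make_instance_id() -> str:
    """Return a UUID URN such as 'urn:uuid:...'.

    Assumption: a random (version 4) UUID is used; the result must then be
    stored persistently so that it survives reboots and network changes.
    """
    return uuid.uuid4().urn


def contact_header(contact_uri: str, instance_id: str, reg_id: int) -> str:
    """Build a Contact header field carrying reg-id and +sip.instance."""
    return (
        f"Contact: <{contact_uri}>"
        f";reg-id={reg_id}"
        f';+sip.instance="<{instance_id}>"'
    )
```

For two flows of the same UA, contact_header("sip:alice@192.0.2.2", iid, 1) and
contact_header("sip:alice@192.0.2.2", iid, 2) differ only in the reg-id value,
while the instance-id stays the same.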
4.2.2 Flow recovery

An ongoing flow may fail because of various network problems, so the UA should
be able to detect failures by means such as keepalive mechanisms. If a flow fails,
the UA uses the procedure described in section 4.2.1 to form a new flow to replace
the failed one. However, before recovering the flow, the UA should wait for some
time, as described in the following paragraph.
Backoff mechanism
The UA employs a backoff mechanism to avoid an avalanche restart on the EPs. That
is, the UA needs to wait for a certain amount of time before trying to establish a
new flow to replace the failed one.
The following algorithm is used to calculate the waiting time in seconds:
T IM Ewait = min(T IM Emax , (T IM Ebase × (2f ailures )))
T IM Emax : the default value is set to 1800 seconds.
failures: is the number of consecutive registration failure.
T IM Ebase : is set to 30 seconds if all of the flows to every URI in the outbound
proxy set have failed; otherwise, if at least one of the flows has not failed, it
is set to 90 seconds.
A flow is considered successful if the outbound registration succeeded and keepalives
have not expired for min-regtime seconds (default of 120 seconds) after a registration.
The time to re-register, known as the delay time, is computed by selecting a uniform
random time between 50 and 100 percent of $TIME_{wait}$. The UA must wait
for the delay time before re-registering. The default flow registration
backoff time table can be found in Appendix A of [5].
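The backoff rule above can be sketched in a few lines of Python. This is only an illustration of the formula, not the code of our implementation; the function name backoff_delay and its parameters are hypothetical.

```python
import random

def backoff_delay(failures, all_flows_failed, max_time=1800, rng=random):
    """Return (wait_time, delay) in seconds for the next registration attempt.

    failures: number of consecutive registration failures.
    all_flows_failed: True if every flow in the outbound proxy set has failed.
    """
    # Base time: 30 s if all flows failed, otherwise 90 s.
    base_time = 30 if all_flows_failed else 90
    # Upper bound of the waiting time, capped at max_time.
    wait_time = min(max_time, base_time * (2 ** failures))
    # The actual delay is uniform between 50 and 100 percent of wait_time.
    delay = rng.uniform(0.5 * wait_time, wait_time)
    return wait_time, delay

if __name__ == "__main__":
    for failures in range(7):
        wait_time, delay = backoff_delay(failures, all_flows_failed=True)
        print(failures, wait_time, round(delay, 1))
```

With a base time of 30 seconds the cap of 1800 seconds is reached after six consecutive failures, since 30 × 2^6 = 1920.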
4.2.3 Keepalive mechanism

Two keepalive methods are proposed: STUN over UDP and TCP keepalive. For
SIP over UDP, a limited version of the STUN [12] keepalive mechanism is employed.
The only STUN messages required by this usage are Binding Requests, Binding
Responses, and Error Responses.
The UAC sends STUN messages over the same UDP flow used for sending SIP
messages. The server (EP or registrar) must in turn provide a limited version
of a STUN server listening on the same network interface and port as the SIP proxy
server.
The UA needs two phases of validation for STUN keepalive support. In the first
phase, the UA inspects whether the URIs in its outbound proxy set contain the
'keep-stun' parameter. In most circumstances, this explicit indication should
be sufficient, but misconfiguration may happen. If a node sends binary STUN
data to a proxy that does not support STUN, the node could be blacklisted for UDP
traffic. So we need a second phase of validation, namely an explicit probe. A UA
can send an OPTIONS request to the next hop with the Max-Forwards header
field set to 0, and expect the next hop to respond with the 'sip-stun' option tag in
its Supported header field. If either of these two validation phases fails,
the UA must stop sending additional STUN messages.
The UA can perform the explicit probe just after it establishes a direct flow to the
EP, as shown in Figure 4.1, or probe STUN support after it sends a STUN Binding
Request and does not receive a STUN success response, as shown in Figure 4.2. The
order of these two phases of validation is implementation specific and is left
for the implementor to decide.
For SIP over TCP or SIP over TLS over TCP, TCP keepalive is sufficient to keep
the flow active. Some operating systems, such as Linux, support per-connection TCP
keepalive, which facilitates the keepalive support.
4.3 Edge proxy behavior

The Edge Proxy is located topologically between the UA and the AP and works
as a stateless forwarding proxy. It receives SIP requests and then forwards them
to the next hop (a registrar, another EP, or a UA). If it wishes to be
revisited for any subsequent requests, it adds itself to the Path vector [35]. The
EP should be able to use the ongoing flow for forwarding. To achieve
this, it inserts an identifier, containing information about the flow from
the previous hop, in its Path URI.
Figure 4.1: Explicit probe before sending STUN messages

Figure 4.2: Explicit probe after no success STUN response received
4.3.1 Flow token

When the EP receives a REGISTER request from a UA, it needs to create an
identifier value that uniquely identifies this flow, and add this identifier to the user
part of its Path URI. The identifier allows the EP to map future requests back to the
correct flow. Moreover, the user's authentication is checked indirectly by
verifying the presence of the identifier returned by a successful registration response.
SIP outbound [5] proposes the flow token as a flow identifier, together with two
algorithms for stateless flow token mechanisms. For the sake of security, our
implementation used algorithm 2 proposed in SIP outbound, but modified its input S by
replacing the local IP address and port with the file descriptor, and then encoded
the result with base64 encoding [15].
In SIP outbound [5], the first algorithm generates a 16-octet token.
Equation 4.1 is for a TCP connection, where NTP is the time at which the connection
was created [18]. Equation 4.2 is for a UDP-based transport, so no NTP time is needed, but
the remote IP address and port are required.
Algorithm 1:
$Token = BASE64_{encode}(fileDescriptor \,\|\, NTP)$ (4.1)

$Token = BASE64_{encode}(fileDescriptor \,\|\, remoteIP \,\|\, remotePort)$ (4.2)
This algorithm by itself gives no security assurance, so an attacker could easily hijack
another user's call. Unless we employ SIP level security protection, this
algorithm must not be used. But a security mechanism at the SIP level is expensive, so
we preferred the second algorithm.
Algorithm 2:
$Token = BASE64_{encode}(HMAC_{SHA1-80}(K, S) \,\|\, S)$ (4.3)
In equation 4.3, K is a 20-octet cryptographically random key that is distributed to
(for example, by a trusted third party) and shared among the EPs. The input S is formatted as
shown in Figure 4.3. We used HMAC-SHA1-80 [16] to compute the
keyed-hash value of S, and then encoded the concatenation of the HMAC of S and
S by using base64 encoding [15]. This will result in a 32-octet identifier.
Figure 4.3: The format of S
In our implementation, we used algorithm 2, but replaced the local IP and local
port fields of S with the file descriptor of the socket.
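A minimal Python sketch of algorithm 2 follows, using only the standard library. It is an illustration rather than our C implementation: the function names and the byte layout of S chosen in pack_s (transport, file descriptor, remote IPv4 address, remote port) are assumptions, since the exact layout is defined by Figure 4.3.

```python
import base64
import hashlib
import hmac
import os
import socket
import struct

def pack_s(transport, fd, remote_ip, remote_port):
    # Hypothetical layout of S: 1-byte transport id, 4-byte file descriptor,
    # 4-byte remote IPv4 address, 2-byte remote port (network byte order).
    return struct.pack("!BI4sH", transport, fd,
                       socket.inet_aton(remote_ip), remote_port)

def make_flow_token(key, s):
    # HMAC-SHA1-80 is the HMAC-SHA1 digest truncated to its first 10 octets.
    mac = hmac.new(key, s, hashlib.sha1).digest()[:10]
    return base64.b64encode(mac + s).decode("ascii")

def check_flow_token(key, token):
    """Return S if the token is authentic, otherwise None."""
    try:
        raw = base64.b64decode(token, validate=True)
    except ValueError:
        return None
    mac, s = raw[:10], raw[10:]
    expected = hmac.new(key, s, hashlib.sha1).digest()[:10]
    # Constant-time comparison avoids leaking how many octets matched.
    return s if hmac.compare_digest(mac, expected) else None

if __name__ == "__main__":
    key = os.urandom(20)                      # K, shared among the EPs
    s = pack_s(1, 7, "10.1.0.10", 5060)       # example values only
    token = make_flow_token(key, s)
    print(token, check_flow_token(key, token) == s)
```

Because the token carries S itself, the EP does not need to keep any per-flow state to validate it, which is what makes the mechanism stateless.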
4.3.2 Forwarding Mechanism

Two kinds of requests traverse the EP. The first kind is an intermediate
request, which is generated by a UA in another domain that has no direct
flow to the EP. The second kind is a request that the EP receives from a UA or another
EP, depending on the configuration. When an intermediate proxy receives a request
from another EP and is itself the host in the topmost Route header field value, the
proxy compares the flow in the flow token with the source of the request. If these
refer to the same flow, the EP removes the Route header and continues processing
the request. If the flow token is invalid, the EP has to reject the request.
Figure 4.4 shows a concrete example. The solid bi-directional arrowed lines indicate
direct flows between entities. The dashed lines indicate flows that are established
when needed. Suppose UA1 in domain 1 wants to contact UA2 with any kind of SIP request. First,
UA1 consults its registrar and gets the contact information of UA2 as well as the
flow token from the Path header [35]. Then it proxies its request to EP1, which has a
direct flow to it. EP1 finds that it is the topmost host in the Route header and that the
Route header contains a flow token, so EP1 checks whether the flow token is valid. If so,
it applies the normal routing procedure to decide the next hop. We assume that EP1's
next hop is EP2, so it routes the request to EP2. When EP2 receives the request, it
checks whether the request contains a valid flow token and whether the flow token was
created by itself. In this example, EP2 notices that the destination is UA2, which has a
direct flow to it. So EP2 sends the request to UA2 through the direct flow.
Figure 4.4: Forwarding mechanism of EPs
EP1 and EP2 process the flow token according to the algorithm they use to
generate it. If they use algorithm 1, they first decode the user part of
the Route header using base64. Then, for a TCP-based transport, if the connection
specified by the file descriptor matches its creation time, they forward the request
over that connection. For a UDP-based transport, they forward the request from
the encoded file descriptor to the encoded remote IP address and port.
If they use algorithm 2, they likewise decode the flow token. Then they
verify the HMAC by recomputing it and checking whether the two values match.
If the HMACs do not match, the EPs should send a 403 (Forbidden) response.
Otherwise, they should forward the request on the flow specified by the
information in the flow identifier. To ensure that mid-dialog requests are routed over
the existing flow, [13] proposes that the EP add a Record-Route entry to each
dialog-initiating request. The Record-Route contains a SIP URI comprising a
flow token and a domain name. If this flow no longer exists, the EP should send a
430 (Flow Failed) response to the requesting side.
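The decision an EP takes for algorithm 2 can be summarized by the following Python sketch. It is a simplified model under stated assumptions: the flow table is a plain dictionary keyed by S, the function names are hypothetical, and a status of 0 is used here only to mean "forward on the returned flow".

```python
import base64
import hashlib
import hmac

def _mac80(key, s):
    # HMAC-SHA1 truncated to 80 bits (10 octets).
    return hmac.new(key, s, hashlib.sha1).digest()[:10]

def make_token(key, s):
    return base64.b64encode(_mac80(key, s) + s).decode("ascii")

def route_by_token(key, token, active_flows):
    """Return (status, flow) for a request carrying a flow token.

    status 403: token forged or corrupted; 430: flow no longer exists;
    status 0: forward the request on the returned flow.
    """
    try:
        raw = base64.b64decode(token, validate=True)
    except ValueError:
        return 403, None
    mac, s = raw[:10], raw[10:]
    if not hmac.compare_digest(mac, _mac80(key, s)):
        return 403, None
    flow = active_flows.get(s)        # hypothetical flow table keyed by S
    if flow is None:
        return 430, None
    return 0, flow

if __name__ == "__main__":
    key = b"k" * 20
    flows = {b"flow-to-ua2": "socket-7"}
    print(route_by_token(key, make_token(key, b"flow-to-ua2"), flows))
    print(route_by_token(key, make_token(key, b"gone"), flows))
```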
4.3.3 Keepalive mechanisms

The EP must also support the keepalive mechanisms presented in section 4.2.3: it
functions as a STUN server for UDP flows and supports TCP keepalive for TCP connections.
4.4 Registrar behavior

As described in the SIP specification [8], a SIP client periodically sends REGISTER
requests to a server (known as a SIP registrar) to associate the client's SIP or
SIPS URI with the machine into which the client is currently logged (conveyed as a
SIP or SIPS URI in the Contact header field). The registrar writes this association,
also called a binding, to a database, called the location service. A REGISTER request
can add a new binding between an AOR and one or more contact addresses. A
client can also remove previous bindings or query to determine which bindings are
currently in place for an AOR.
SIP outbound updates the definition of a binding in [8]. The updated binding
behavior is shown in table 4.1, according to the presence of instance-id
and reg-id.
instance-id   reg-id    Binding behaviour at the registrar
present       present   Bind an AOR with the combination of instance-id and reg-id
present       absent    Valid registration without a reg-id
absent        present   Invalid; reg-id to be ignored
absent        absent    Normal binding behaviour
Table 4.1: Updated binding behaviour in SIP outbound
According to table 4.1, a Contact header field value with an instance-id but
no reg-id is still valid. The reverse situation, with a reg-id but no instance-id,
is not valid, so the reg-id parameter is simply ignored when the
instance-id is not present. Moreover, the registrar must also be prepared to receive,
for the same AOR, some registrations that use instance-id and reg-id and some that do
not. This implies that the registrar has to work both as a normal SIP registrar and as a
registrar supporting SIP outbound when needed.
The registrar must store all the Contact header field information, and store the
time at which the binding was last updated. If a Path header field is present, the
registrar stores this information as well. If the registrar receives a re-registration, it
must update any information that uniquely identifies the network flow over which
the request arrived, and should update the time the binding was last updated.
The registrar must include the ’outbound’ option-tag in a Supported header field
value in its responses to REGISTER requests for which it has performed outbound
processing. This explicitly informs EPs and UAs that this registrar supports SIP
outbound.
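The binding rules of this section can be illustrated with a small in-memory location service in Python. This is a sketch under assumptions, not the registrar used in our experiments: the class and field names are hypothetical, and a registration with an instance-id but no reg-id is treated here like a normal binding keyed by the contact URI.

```python
import time

class LocationService:
    """Toy location service that keys bindings as described in section 4.4."""

    def __init__(self):
        self.bindings = {}

    def register(self, aor, contact, instance_id=None, reg_id=None,
                 path=None, now=None):
        if instance_id is None:
            reg_id = None          # a reg-id without an instance-id is ignored
        if instance_id is not None and reg_id is not None:
            # Outbound binding: a re-registration with the same instance-id
            # and reg-id replaces the older flow.
            key = (aor, instance_id, reg_id)
        else:
            key = (aor, contact)   # normal binding behaviour
        self.bindings[key] = {
            "contact": contact,
            "path": path,          # stored if a Path header was present
            "updated": time.time() if now is None else now,
        }
        return key

if __name__ == "__main__":
    ls = LocationService()
    aor = "sip:caller@10.1.0.10"
    ls.register(aor, "sip:caller@10.1.0.10:5060", "urn:uuid:example", 1)
    ls.register(aor, "sip:caller@10.1.0.10:5060", "urn:uuid:example", 2)
    print(len(ls.bindings))
```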
4.5 Authoritative proxy behavior

The AP is present when the UA needs the location service. The location
service contains information that allows a proxy to input a URI and receive a set of
zero or more URIs that tell the proxy where to send the request [8]. This information
is created by registrations. As shown in Figure 4.4, UA1 looks up a registration
binding to get the contact information of UA2 by using the location service provided
by the AP and then sends a request through EP1. An AP selects a contact in the
normal way, with a few additional rules:
The proxy must not populate the target set with more than one contact with
the same AOR and instance-id at a time. If a request for a particular AOR and
instance-id fails with a 430 (Flow Failed) response, the proxy should replace
the failed branch with another target (if one is available) with the same AOR
and instance-id, but a different reg-id.
If the proxy receives a final response from a branch other than a 408 (Request
Timeout) or a 430 (Flow Failed) response, the proxy must not forward the
same request to another target representing the same AOR and instance-id.
The targeted instance has already provided its response.
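The two rules above can be sketched as follows. The binding dictionaries and function names are hypothetical and only illustrate the selection logic; a real proxy would combine this with its normal target-set processing.

```python
def instance_key(binding):
    # Contacts registered without an instance-id are independent targets.
    return (binding["aor"], binding["instance_id"] or binding["contact"])

def initial_targets(bindings):
    """At most one contact per AOR and instance-id enters the target set."""
    chosen = {}
    for b in bindings:
        chosen.setdefault(instance_key(b), b)
    return list(chosen.values())

def replacement_target(bindings, failed, status, tried_reg_ids):
    """Pick another flow to the same instance, or None.

    Only a 408 or 430 final response allows trying a different reg-id;
    any other final response means the instance has already answered.
    """
    if status not in (408, 430):
        return None
    for b in bindings:
        if (instance_key(b) == instance_key(failed)
                and b["reg_id"] not in tried_reg_ids):
            return b
    return None

if __name__ == "__main__":
    b1 = {"aor": "sip:a@x", "instance_id": "urn:uuid:1", "reg_id": 1, "contact": "c1"}
    b2 = dict(b1, reg_id=2, contact="c2")
    print(initial_targets([b1, b2]))
    print(replacement_target([b1, b2], b1, 430, {1}))
```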
Chapter 5
Implementation
In this chapter, we describe how SIP outbound [5] was implemented as an
extension of the existing SIP framework, and how our implementation was integrated
into the WeSAHMI architecture.
5.1 Open source libraries

We used the open source SIP libraries eXosip2 and oSIP to build the basic SIP
application routine. To minimize changes to the original libraries' interfaces, we
implemented most SIP outbound features at the application level. That is, all SIP
outbound features, except for the keepalive mechanisms (STUN and TCP keepalive),
were implemented by calling APIs provided by the eXosip2 library. eXosip2 is an
extension of the oSIP library, a low-level SIP library that implements SIP
transactions and provides SIP message parsing and wrappers.
eXosip2 sends and receives SIP messages in isolation, and creates a separate thread
for the SIP application built upon the eXosip2 and oSIP libraries. A transaction
state machine of the oSIP library calls callback functions to send SIP messages. A
listening socket needs to be initialized in another thread to receive incoming SIP
messages in the application program. eXosip2 provides the implementation of
the callback functions that send outgoing SIP requests over a network transport.
In order to reuse the already established TCP connections, eXosip2 looks
up a data structure that stores all the previously active UDP or TCP connections.
The STUN keepalive mechanism and the flow token algorithm were implemented in
separate files. Please refer to appendix A for important modifications and data
structures. Other open source libraries, including OpenSSL, uuid and base64, were
also used to facilitate our implementation. OpenSSL is a cryptographic implementation
of the SSL and TLS [32] and the DTLS [7] protocols. We used APIs provided
by OpenSSL to construct the HMAC for the flow token.
5.2 User agent routine

First, the UA daemon (also called the client daemon) initializes the eXosip library, which
constructs some important data structures. Then it registers with the registrar by
sending two REGISTER requests through its primary and secondary EPs
respectively. The registrar may challenge the UA, so the UA should provide its identity
as its credential. Successful registrations are indicated by the UA receiving 200 OK
responses. This finally leads to the establishment of two direct flows between the
UA and its EPs. If either of these two flows fails, for example because one of the
situations described in section 5.2.2 occurs, the UA should use the backoff mechanism
to re-register.
After this initial flow establishment, a timer is triggered, and the UA can
start normal SIP traffic. We assume it sends a SUBSCRIBE to a remote service
provider. The UA should first consult the registrar to get the contact information
of the service provider. Then it proxies the request to either of its two proxies using
an established flow. There is no preference as to which EP should be used first. In our
implementation, we always pick the primary EP to proxy requests. A more intelligent
mechanism is discussed in chapter 7. When the timer expires, keepalive messages
are sent. For SIP over UDP, STUN Binding requests are sent (refer to section 5.4
for details); for TCP or TLS over TCP, the Linux kernel uses TCP keepalive to keep
the flow active.
5.2.1 Termination of a flow

Our system should be able to terminate a flow gracefully. Once the user wants to
terminate SIP communication, he or she can send a REGISTER request with a value of
0 in the Expires header field. The registrar removes the binding so that no further
requests will be sent to the user's UA.
Depending on the presence of the Contact and Expires headers [14] in the REGISTER
request, the registrar will take different actions as shown in Table 5.1.
The REGISTER request may contain an expires parameter in the Contact header
or an Expires header field. According to [8], a REGISTER request with a wildcard
Contact header field must only be used with an Expires header whose value
is 0, to remove all registrations. The expires parameter in the Contact header is
optional and only indicates the desired expiration time of the registration. If it is
absent, the Contact header uses the Expires header as the default value.
Request headers                        Registration behavior

Contact: *                             Cancel all registrations
Expires: 0
Contact: sip:callee@example.com;       Add URL to current registrations;
expires=30                             registration expires in 30 seconds
Table 5.1: Registration behavior
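As a small illustration of these rules, the following Python sketch resolves the effective expiration time and the resulting registrar action. The function names are hypothetical, and the fallback of 3600 seconds when neither value is present is an assumption (it is the default recommended in [8], but a registrar may be configured differently).

```python
def effective_expires(contact_expires, expires_header, default=3600):
    """Expiry in seconds; the Contact 'expires' parameter takes precedence."""
    if contact_expires is not None:
        return contact_expires
    if expires_header is not None:
        return expires_header
    return default   # assumed registrar default

def register_action(contact, contact_expires=None, expires_header=None):
    """Classify a REGISTER request by its Contact and Expires values."""
    if contact == "*":
        # A wildcard Contact is only valid together with Expires: 0.
        if expires_header != 0:
            raise ValueError("wildcard Contact requires Expires: 0")
        return "remove-all"
    if effective_expires(contact_expires, expires_header) == 0:
        return "remove"
    return "add-or-refresh"

if __name__ == "__main__":
    print(register_action("*", expires_header=0))
    print(register_action("sip:callee@example.com", contact_expires=30))
    print(register_action("sip:callee@example.com", expires_header=0))
```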
5.2.2 Failures of a flow

Taking the STUN keepalive and implementation practice into account, we categorize
the situations in which a flow fails as follows:
503 (Service Unavailable) response;
XOR-MAPPED-ADDRESS attribute changes in the STUN Binding Response;
408 (Request Timeout) response to a next-hop OPTIONS probe for STUN support;
430 (Flow Failed) response;
any transport layer failure, such as a fatal ICMP error;
failure of a STUN request, such as a STUN retransmission timeout.
If any of the above situations occurs, the UA considers the flow to have failed.
It then cleans up the flow and
waits for the right time to re-register by using the backoff mechanism.
5.2.3 Re-registration
We implemented the backoff mechanism described in section 4.2.2, so before the
UA registers again, it has to wait for a certain amount of time. The UA has to use
the same reg-id as for its previous flow, so that the registrar knows this is a new
flow replacing the old one.
5.3 TCP keepalive

For SIP over TCP, or SIP over TLS over TCP, we use TCP keepalive. The Linux
kernel supports per-connection TCP keepalive, but it is disabled by default.
We enabled it by setting the SO_KEEPALIVE socket option at the SOL_SOCKET
level [31]. This feature is integrated into the eXosip library. Namely,
when the UA routine calls eXosip_listen_addr with the TCP protocol,
eXosip creates a TCP socket with the keepalive mechanism enabled. In addition,
three TCP keepalive parameters still need to be configured:
/proc/sys/net/ipv4/tcp_keepalive_time: the number of seconds the keepalive
routines wait before sending the first keepalive probe;
/proc/sys/net/ipv4/tcp_keepalive_intvl: the time interval between keepalive
messages after the first probe;
/proc/sys/net/ipv4/tcp_keepalive_probes: the number of consecutive probes
before the connection is marked as broken.
Many alternative methods can also be used to modify these parameters. We
simply picked the one most convenient for us.
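For illustration, the same socket option can be set from Python's standard socket module. This sketch is not the eXosip code: the helper name enable_keepalive and the numeric values are assumptions, and the optional per-socket overrides (TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT) are a different technique from the system-wide /proc settings described above.

```python
import socket

def enable_keepalive(sock, idle=None, interval=None, probes=None):
    """Enable TCP keepalive on sock, optionally overriding the timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket overrides are platform specific (available on Linux);
    # where they are missing, the system-wide values apply.
    overrides = (("TCP_KEEPIDLE", idle),
                 ("TCP_KEEPINTVL", interval),
                 ("TCP_KEEPCNT", probes))
    for name, value in overrides:
        option = getattr(socket, name, None)
        if option is not None and value is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)

# Example values only: first probe after 120 s, then every 30 s, 4 probes.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock, idle=120, interval=30, probes=4)
```

Setting the timers per socket leaves other applications on the same host unaffected, whereas the /proc entries change the defaults for every TCP connection.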
5.4 STUN keepalive over UDP

5.4.1 Overview of the mechanism
Before addressing more technical details, we must clarify one point that may appear
confusing later. STUN support is relatively independent of SIP outbound. SIP
outbound requires STUN support, but a UA or proxy that supports STUN does not
necessarily need to support SIP outbound. So STUN, or more generally a keepalive
mechanism, can be perceived as an extension of SIP. This is one reason why STUN
keepalive was integrated into eXosip as independent files.
As specified in [5], we implemented a limited version of a STUN client and server on
the SIP UA and the SIP EP respectively. Only STUN Binding Requests, Binding
Responses, and Error Responses are needed.
The UA must generate STUN keepalive messages towards the EP to refresh the
binding on the NAT before it expires. Rather than using expensive application layer
messages such as SIP messages, the UA sends a STUN Binding request to the EP, to
exactly the same transport address used for SIP, such as port 5060 or 5061. This has
the effect of keeping the bindings in the NAT alive. The STUN Binding responses
inform the UA that the EP is still responsive, and also tell the UA whether its transport
address towards the EP has changed. In our case, a change of transport address
suggests a flow failure. The time interval between STUN Binding requests is a
random time between 24 and 29 seconds [12].
The binding refresh usage requires multiplexing STUN traffic on the same transport
address as SIP, so STUN messages must first be separated from SIP messages.
A distinguishing feature is that all STUN messages start with
a first byte of either 0 or 1, whereas the first byte of a SIP packet never has a value of 0
or 1. This may not suffice if there are valid application layer data packets that
could be confused with STUN packets. STUN therefore defines a special field called the magic
cookie, which has the fixed 32-bit value 0x2112A442. Even if a SIP packet could carry
an arbitrary value in its second 32-bit word, there is only a one
in $2^{32}$ chance that it equals the magic cookie.
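The two checks described above can be sketched in Python as follows. The function name is_stun is hypothetical; the sketch only tests the first byte and the magic cookie of a received datagram, assuming the 20-byte STUN header.

```python
import struct

MAGIC_COOKIE = 0x2112A442
STUN_HEADER_LEN = 20

def is_stun(datagram):
    """Return True if the datagram looks like a STUN message, not SIP."""
    # A STUN message is at least one 20-byte header and starts with 0 or 1.
    if len(datagram) < STUN_HEADER_LEN or datagram[0] > 1:
        return False
    # The magic cookie occupies the second 32-bit word (bytes 4 to 7).
    (cookie,) = struct.unpack_from("!I", datagram, 4)
    return cookie == MAGIC_COOKIE

if __name__ == "__main__":
    binding_request = struct.pack("!HHI12s", 0x0001, 0, MAGIC_COOKIE, bytes(12))
    print(is_stun(binding_request))
    print(is_stun(b"REGISTER sip:10.1.0.7 SIP/2.0\r\n"))
```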
For SIP over UDP, eXosip opens one UDP socket, which we access through
eXosip.net_interface[0].net_socket. The eXosip variable is globally visible once the
eXosip library is initialized. STUN messages are sent through this socket periodically.
To reduce processing on the UA (which is a mobile phone in the WeSAHMI
scenario), all the registrations share the same timer. That is, when the timer expires,
the UA traverses all of its registrations and sends a STUN Binding request for
each of them.
5.4.2 STUN server and client

On the STUN server side, the server daemon reads the buffer from a socket and
then checks whether it holds a SIP or a STUN packet. If it is a STUN message, the daemon
passes the message to the STUN message parser instead of the SIP parser. Depending on
the type of STUN message, the SIP state machine may mark one of three kinds of events.
These events do not trigger any state transition in the SIP state machine. They are
just used to mark the type of non-SIP messages received on the SIP port.
The new events added to the oSIP event types are as follows:
RCV_BIND_REQUEST: an incoming STUN Binding request
RCV_BIND_RESPONSE: an incoming STUN Binding response
RCV_BIND_ERROR_RESPONSE: an incoming STUN Error response
So the receiver (either STUN client or server) may generate the above events
after parsing the buffer. If the message is a STUN Binding request, the server encodes a
STUN Binding response including the STUN attributes and sends it over the same flow.
5.4.3 STUN attributes

The following attributes may be present in the attributes field of STUN response
messages, as shown in table 5.2:
Value    Name                  Binding Response   Error Response

0x0001   MAPPED-ADDRESS        *
0x0004   SOURCE-ADDRESS        *
0x0005   CHANGED-ADDRESS       *
0x0009   ERROR-CODE                               *
0x000A   UNKNOWN-ATTRIBUTES                       *
0x0020   XOR-MAPPED-ADDRESS    *
Table 5.2: STUN attributes supported by the implementation
After receiving a STUN response with any of the above attributes, the STUN
client decides its next action by checking the attributes present in the response.
5.4.4 STUN retransmission mechanism

Because UDP is a connectionless transport protocol, the reliability of STUN messages
is ensured by the STUN client retransmission mechanism. Clients should
retransmit the request starting with an interval of RTO [33], doubling after each
retransmission.
The initial value for RTO should be configurable; 3 seconds is recommended [33]. The
value of RTO must not be rounded up to the nearest second.
The value of RTO should be cached by an agent after the completion of the
transaction, and used as the starting value for RTO for the next transaction to the
same host. The value should be considered stale and discarded after 10 minutes.
Retransmissions continue until a response is received, or until a total of 7 requests have
been sent. If no response is received by 1.6 seconds after the last request has been
sent, the client should consider the flow to have failed [12].
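The resulting timeline can be computed with a short Python sketch. The function name and parameters are hypothetical; the limit of 7 requests and the final wait of 1.6 seconds are taken from the text above, while the RTO value passed in is only an example.

```python
def retransmission_schedule(rto, max_requests=7, final_wait=1.6):
    """Return (send_times, failure_time) in seconds from the first request.

    The interval between requests starts at rto and doubles after each
    retransmission; the flow is declared failed final_wait seconds after
    the last request if no response has arrived.
    """
    send_times = []
    t, interval = 0.0, rto
    for _ in range(max_requests):
        send_times.append(t)
        t += interval
        interval *= 2
    return send_times, send_times[-1] + final_wait

if __name__ == "__main__":
    times, failed_at = retransmission_schedule(rto=1.0)
    print(times, failed_at)
```

With an RTO of 1 second, the requests are sent at 0, 1, 3, 7, 15, 31 and 63 seconds, so a dead flow is only detected after more than a minute; a smaller initial RTO shortens the detection time accordingly.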
5.5 Edge proxy

For the sake of security, our system uses the second algorithm described in
section 4.3.1, since the first algorithm can only be used if the connection
between the EP and the registrar is integrity protected. The second algorithm uses a
keyed HMAC to assure the integrity of the flow token. This is a cheap and efficient
way to protect against malicious modification.
When it decides to generate a flow token according to the mechanism described in
section 4.3.2, the EP first generates a 20-octet random key, and then computes the
keyed hash value of S, formatted according to figure 4.3, with the newly generated
key. By calling APIs provided by the OpenSSL library, we obtain a 20-octet
message digest. The EP uses only the first 10 octets and concatenates them
with S. The final step is to apply base64 encoding to the result.
The validation of the token is the reverse procedure. We base64-decode the
token, compute the HMAC of the S extracted from the token, and check whether it
is identical to the HMAC carried in the token. We implemented base64 encoding in
independent files. The important interfaces can be found in appendix A.
Chapter 6
Experimentation
6.1 Experimental infrastructure deployment

Figure 6.1 shows the experimental environment used for testing our implementation.
In the initial stage, the UA is manually configured with two outbound
proxy URIs (the minimal number of URIs required in [5]). We ignored DNS and
the location service and used IP addresses directly for the sake of simplicity. Another
open issue, left for future work, is that we did not test the reliability of our
system. Even though we established two direct flows to the UA's two EPs, we did
not test how our system would behave if the primary EP failed and the UA had
to use the secondary EP.
The solid bi-directional arrowed lines indicate the direct flows between the UA
and the EP, namely always-active UDP or TCP flows. The dashed bi-directional
arrowed lines indicate indirect flows between the EPs and the registrar, which
are established when needed. We did not deploy APs, since we ignored the location
service.
Figure 6.1: Experimental environment
Figure 6.2: Flow sequence of SIP messages
Figure 6.2 illustrates the basic registration and SUBSCRIBE/NOTIFY procedure
that we tested on our system. In the following sections, we present these messages
in detail.
6.2 Experiment for SIP over UDP with SIP outbound features

The UA registers twice to the same registrar through its primary and secondary EPs
respectively. The REGISTER requests generated by the UA are listed as follows:
These two REGISTER requests are almost the same, except for the Route headers
and the reg-id parameters in the Contact header fields, as shown in tables 6.1 and
6.2. In the Route header, we specified the IP address of each EP together with two
parameters. In this way, the REGISTER requests are proxied to these two
EPs, and the two parameters indicate that the EPs support loose routing and STUN
keepalive, that is, the EPs can work as STUN keepalive servers.
REGISTER sip:10.1.0.7 SIP/2.0

Via: SIP/2.0/UDP 10.1.0.10:5060;rport;branch=z9hG4bK1835142445
Route: <sip:10.1.0.11;lr;keep-stun>
From: <sip:caller@10.1.0.10>;tag=37305113
To: <sip:caller@10.1.0.10>
Call-ID: 1951009167@10.1.0.10
CSeq: 1 REGISTER
Contact: <sip:caller@10.1.0.10:5060>;
+sip-instance="<urn:uuid:c00bb5b6-677f-4ab3-bdd7-f9ae756ea544>";
reg-id=1
Max-Forwards: 70
User-Agent: eXosip/3.0.1
Expires: 3600
auth: ffn:hash
Supported: path
Content-Length: 0
Table 6.1: REGISTER request proxied to the primary EP
In table 6.2, the reg-id parameter is set to 2 in the Contact header of the SIP
message sent to the secondary EP. We later use this parameter to identify the different
flows established by the same UA. This information is recorded by the registrar
together with the Contact header. The Supported header shows that the UA supports
the Path header, so the EPs can later use this function if needed. We used a very
simple authentication mechanism, adding an Auth header to the request. The registrar
is configured to recognize the value of this field, so that requests with different
values are denied. A more intelligent mechanism is expected in future work.
REGISTER sip:10.1.0.7 SIP/2.0

Call-ID: 1951009167@10.1.0.10
CSeq: 2 REGISTER
reg-id=2
Max-Forwards: 70
Expires: 3600
auth: ffn:hash
Supported: path
Content-Length: 0
Table 6.2: REGISTER request proxied to the secondary EP
After receiving the REGISTER requests, the two EPs proxy them
to the registrar and deliver the registrar's responses to the UA. The responses
received by the UA from the registrar through the two EPs are listed in tables 6.3
and 6.4. In these responses we notice a new header, the Path header, with three
parameters. This is because each EP generates a flow token, inserts it into
the Path header, and adds the Path header to the REGISTER request before
proxying the request to the registrar. The registrar records the flow token as
part of the binding information. Then the registrar forms its response by copying the
Path header, which eventually appears in the 200 OK response received by the UA.
The value of the Supported header is set to outbound, indicating that the registrar
supports the SIP outbound extension.
After the UA receives the two 200 OK responses, it sends a SUBSCRIBE request,
shown in table 6.5, to its content service provider, the Notifier, through its primary EP.
The choice between the primary and the secondary EP is arbitrary; if one EP fails,
the UA can use the other. In our experiment, the logical Notifier is physically hosted
on the registrar. Compared to the REGISTER request, the Route header URI carries
a new field: the flow token that the UA extracted from
the Path header of the 200 OK response. We do not list the response to the SUBSCRIBE
request, since it is a normal SIP response.
SIP/2.0 200 OK
Via: SIP/2.0/UDP 10.1.0.10:5060;rport=5060;branch=z9hG4bK1835142445
Call-ID: 1951009167@10.1.0.10
CSeq: 1 REGISTER
reg-id=1
Path:<sip:m+KF6bT1Jd+31XUKAQAKxBMKCQCGxNM=@10.1.0.11:5060;lr;ob>
Max-forwards: 70
User-agent: eXosip/3.0.1
Expires: 3600
auth: ffn:hash
Supported: outbound
Content-Length: 0
Table 6.3: 200OK response received by the UA from the primary EP
Table 6.6 lists the NOTIFY request sent by the Notifier. Similarly, we notice the
flow token in the Route header. This request is forwarded to the UA's primary EP,
which sends it to its final destination by parsing the flow token to find the exact
flow.
6.2.1 Experiment for STUN keepalive

SIP/2.0 200 OK
Via: SIP/2.0/UDP 10.1.0.10:5060;rport=5060;branch=z9hG4bK1094232440
Call-ID: 1951009167@10.1.0.10
CSeq: 2 REGISTER
reg-id=2
Path: <sip:n+IF6bT2Ud+31XUKAQAKxBMKAQAGxBM=@10.1.0.5:5060;lr;ob>
Max-forwards: 70
User-agent: eXosip/3.0.1
Expires: 3600
auth: ffn:hash
Supported: outbound
Content-Length: 0
Table 6.4: 200OK response received by the UA from the secondary EP
After the first successful registration, we set the STUN keepalive interval to a random
time between 24 and 29 seconds. The UA then sends STUN Binding requests
periodically.
The UA sends the STUN Binding request to its two EPs in binary form.
Table 6.7 lists the parsed binary data in a human-readable form. We did not
give any value for the attributes field. This field may be used
later when errors occur in STUN messages. For the rest of the STUN message, we
just padded with zeros to align it to 20 bytes. The data structure used in the program is
listed in appendix A.
The STUN Binding response is similar to the Binding request except for the
STUN message type field, which is 0x0101.
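The 20-byte Binding request described above can be reproduced with Python's struct module. This is only a sketch of the wire format in table 6.7, not our C data structure from appendix A; the function names are hypothetical, and the all-zero value for the remaining 12 bytes follows the zero padding mentioned in the text.

```python
import struct

BINDING_REQUEST = 0x0001
BINDING_RESPONSE = 0x0101
MAGIC_COOKIE = 0x2112A442

def build_binding_request(rest=bytes(12)):
    """Pack the 20-byte header: type, length, magic cookie, remaining 12 bytes."""
    # Message length is 0 because no attributes follow the header.
    return struct.pack("!HHI12s", BINDING_REQUEST, 0x0000, MAGIC_COOKIE, rest)

def parse_header(message):
    """Unpack the first 20 bytes of a STUN message into a dictionary."""
    msg_type, length, cookie, rest = struct.unpack("!HHI12s", message[:20])
    return {"type": msg_type, "length": length, "cookie": cookie, "rest": rest}

if __name__ == "__main__":
    request = build_binding_request()
    print(request.hex())
    print(parse_header(request))
```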
6.3 Experiment for TCP keepalive

TCP keepalive is supported by the Linux kernel. We enabled the TCP keepalive feature in
our code as described in section 5.3. We captured the TCP keepalive replies,
which were segments with the ACK flag set and no data.
SUBSCRIBE sip:callee@10.1.0.7 SIP/2.0

Route: <sip:n+IF6bT2Ud+31XUKAQAKxBMKAQAGxBM=@10.1.0.11;lr;keep-stun>
To: <sip:callee@10.1.0.7>
Call-ID: 93687336@10.1.0.10
CSeq: 20 SUBSCRIBE
Contact: <sip:caller@10.1.0.10:5060>
Max-Forwards: 70
Expires: 3600
Event: resource-update
Service: finnair
Content-Length: 0
Table 6.5: SUBSCRIBE request sent by the UA to its Notifier.

NOTIFY sip:caller@10.1.0.7:5060 SIP/2.0

Route: <sip:n+IF6bT2Ud+31XUKAQAKxBMKAQAGxBM=@10.1.0.11;lr;keep-stun>
From: <sip:callee@10.1.0.7>;tag=439406621
To: <sip:caller@10.1.0.10>;tag=1530564204
Call-ID: 93687336@10.1.0.10
CSeq: 21 NOTIFY
Contact: <sip:callee@10.1.0.7:5060>
Max-Forwards: 70
Subscription-State: active;expires=3595
Event: resource-update
Content-Type: application/soap+xml
Content-Length: 9
Table 6.6: NOTIFY request sent from the Notifier to the UA.
Name                     Length    Value

Header  First two bits   2 bits    0
        Message type     2 bytes   0x0001
        Message length   2 bytes   0x0000
        Magic cookie     4 bytes   0x2112A442
Table 6.7: STUN Binding request.

Chapter 7
Discussion
SIP outbound [5] does not specify how the outbound proxy registration URIs are
configured. The configuration procedure can be considered an implementation
issue. A trusted third party can distribute the outbound-proxy-set to the UAs
in the initial stage. In the WeSAHMI scenario, the WeSAHMI server, which provides
the backbone for the whole platform, can act as this third party.
Each URI in the outbound-proxy-set can resolve to several different physical
hosts. That is, one URI represents one logical EP, but one logical EP can be
deployed on several physical hosts. This kind of deployment enhances scalability
and reliability, since the failure of a single server cannot bring down the whole
system. To deploy the system in this fashion, a DNS service is needed, set up so
that the different URIs in the outbound-proxy-set do not resolve to the same host.
Every UA may have at least two and up to four logical EPs. The SIP outbound
draft does not specify which of them should be used to proxy requests. In our
implementation, we simply used the primary EP to proxy requests unless it failed.
In a large system with many UAs, however, the primary EP may become overloaded
while the other EPs sit idle.
To optimize the system, we may design a way to distribute the workload evenly. We
might limit the number of direct flows from an EP to UAs. When this limit is
reached, the EP refuses a UA's connection and responds with a message telling
the UA to try another EP in its outbound-proxy-set. This response could be a
200 OK SIP response carrying a special header that distinguishes it from normal
responses to requests. The value of the limit should be determined by practical
measurement or by a mathematical model.
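The proposed flow limit can be sketched as follows. This is only an in-memory model of the idea, not part of our implementation: the type edge_proxy_t and the functions ep_try_accept and ua_pick_proxy are hypothetical names, and the refusal is modelled as a return value rather than as a SIP response.

```c
#include <stddef.h>

/* Hypothetical model of an edge proxy that caps its direct flows. */
typedef struct {
    const char *uri;        /* URI from the outbound-proxy-set */
    unsigned    flows;      /* direct flows currently held */
    unsigned    max_flows;  /* limit, to be chosen by measurement or model */
} edge_proxy_t;

/* EP side: accept a new flow if the limit has not been reached.
 * Returns 1 if accepted (and counts the flow), 0 if the EP is full. */
int ep_try_accept(edge_proxy_t *ep)
{
    if (ep->flows >= ep->max_flows)
        return 0;
    ep->flows++;
    return 1;
}

/* UA side: try the EPs in order, starting with the primary one.
 * Returns the index of the first EP that accepts, or -1 if all refuse. */
int ua_pick_proxy(edge_proxy_t *set, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (ep_try_accept(&set[i]))
            return (int)i;
    return -1;
}
```

With a limit of one flow on the primary EP, the first UA would be served by the primary EP and the next UA would fall through to the secondary EP.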
We only implemented STUN over UDP, so client retransmission is desirable to
achieve reliability. STUN is not tied to a particular transport protocol, so it
is also possible to implement it over TCP. If we implement STUN over TCP, we do
not need to add client retransmission to STUN, since TCP already provides
reliable, connection-oriented delivery.
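One common way to organize client retransmission over UDP is a doubling timer with an upper bound. The sketch below only computes the waiting time before each retransmission; the function name stun_retransmit_wait_ms and its parameters are hypothetical, and neither the initial interval nor the cap is fixed by this thesis.

```c
/* Hypothetical helper: time to wait before the next retransmission.
 * attempt:    0 for the wait after the first transmission, 1 for the
 *             wait after the first retransmission, and so on
 * initial_ms: first waiting time in milliseconds
 * cap_ms:     upper bound on the waiting time in milliseconds
 * The wait doubles with every attempt until it reaches cap_ms. */
unsigned stun_retransmit_wait_ms(unsigned attempt, unsigned initial_ms,
                                 unsigned cap_ms)
{
    unsigned wait = initial_ms;
    unsigned i;

    for (i = 0; i < attempt && wait < cap_ms; i++)
        wait = (wait > cap_ms / 2) ? cap_ms : wait * 2;  /* no overflow */
    return wait < cap_ms ? wait : cap_ms;
}
```

For example, with an illustrative initial interval of 100 ms and a cap of 1600 ms, the successive waits are 100, 200, 400, 800 and 1600 ms, after which the wait stays at 1600 ms. The client would give up after a fixed number of unanswered requests and treat the flow as failed.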
Chapter 8
Conclusions
8.1 Summary
In this thesis, we studied the SIP outbound protocol and its applications, and
described our implementation of SIP outbound as a component of the WeSAHMI
system. SIP outbound, as an extension of SIP, updates several behaviors of general
SIP and makes it possible to reach UAs behind a NAT. We then described how our
implementation was integrated into the WeSAHMI architecture and how it worked
with the whole system. Finally, we designed several experiments to evaluate
our implementation. The experiments mainly concern the client-initiated connection
features of SIP outbound and the keepalive mechanisms.
During the implementation, most of the difficulties we encountered stemmed from
the lack of documentation for the open source libraries we used, eXosip and
oSIP. This may be a common problem for open source developers. Our
implementation is built at the application level on top of these two libraries,
so in principle it is enough to know which application programming interfaces
(APIs) they provide. However, the documentation does not explain clearly or
sufficiently how to use these APIs, so we had to inspect the source code
thoroughly. Going through such a large body of source code was time consuming.
On the other hand, it taught us how SIP transactions are implemented in the
library. With this knowledge, we may later be able to integrate all the SIP
outbound features into the library, so that other application developers can
use it directly to build SIP applications that support the SIP outbound extension.
8.2 Future work

Our implementation only realized STUN keepalive over UDP and enabled TCP
keepalive in the kernel. The SIP outbound draft [5] also proposes CRLF keepalive.
To make our system more intelligent, in the future we may let the UA select a
keepalive approach according to its transport protocol and preferences.
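Such a selection could be as simple as the following sketch. The enumerations and the function select_keepalive are hypothetical, and the mapping is an assumption: STUN keepalive on UDP flows, as in our implementation, and either CRLF or kernel TCP keepalive on TCP flows, depending on a UA preference.

```c
/* Hypothetical types for the transport of a flow and the keepalive used. */
typedef enum { TRANSPORT_UDP, TRANSPORT_TCP } transport_t;
typedef enum { KEEPALIVE_STUN, KEEPALIVE_CRLF, KEEPALIVE_TCP } keepalive_t;

/* Choose a keepalive approach for one flow.
 * prefer_kernel: nonzero if the UA prefers kernel TCP keepalive over
 *                CRLF keepalive on TCP flows (ignored for UDP flows). */
keepalive_t select_keepalive(transport_t transport, int prefer_kernel)
{
    if (transport == TRANSPORT_UDP)
        return KEEPALIVE_STUN;
    return prefer_kernel ? KEEPALIVE_TCP : KEEPALIVE_CRLF;
}
```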
In our experiments, we co-located the registrar and the notifier on one physical
host. The logical registrar stored the binding information in main memory (RAM)
instead of in a database or on persistent storage. This was only a temporary
solution for the registrar and should be improved in the future. If the binding
information is written to files, we need to consider how to format it so that
it is easy to look up.
Since STUN keepalive is not tied to a particular transport protocol, we may also
extend it to TCP connections. Its performance could then be evaluated in
comparison with the kernel-enabled TCP keepalive. We may also implement a client
retransmission mechanism for STUN over UDP to achieve higher reliability.
Scalability is also expected of a SIP outbound system. To achieve high
scalability and fault tolerance, multiple physical hosts may be deployed for one
logical EP entity. This may need an extra mechanism such as DNS SRV [3].
Moreover, an individual timer should be set for each registration when the
registrar performs its binding operation.
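A per-registration timer does not have to be a separate running timer; one simple option is to store an absolute expiry time with each binding and compare it with the current time on lookup. The sketch below assumes this design. The type binding_t and both functions are hypothetical and are not part of our registrar.

```c
#include <time.h>

/* Hypothetical registrar binding with its own expiry time. */
typedef struct {
    const char *contact;     /* registered contact URI */
    unsigned    reg_id;      /* reg-id of the flow */
    time_t      expires_at;  /* absolute time at which the binding lapses */
} binding_t;

/* Called when a REGISTER creates or refreshes the binding.
 * expires_s is the lifetime granted by the registrar, in seconds. */
void binding_refresh(binding_t *b, time_t now, unsigned expires_s)
{
    b->expires_at = now + (time_t)expires_s;
}

/* Returns nonzero if the binding must no longer be used. */
int binding_is_expired(const binding_t *b, time_t now)
{
    return now >= b->expires_at;
}
```

With the Expires value of 3600 seconds seen in the experiments, a binding refreshed at time t would be treated as expired from t + 3600 onwards.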
SIP outbound also mentions SigComp compression [24]. When SigComp
is applied, both communicating endpoints need to perform compression and
decompression. This feature is desirable, since a SIP message may reach
two thousand bytes or more, which is too large for wireless transmission.
Bibliography
[1] Understanding SIP. Internet, 2007. www.sipcenter.com/sip.nsf/.
[2] WeSAHMI System Specification, 2007.
[3] A. Gulbrandsen, P. Vixie, and L. Esibov. A DNS RR for specifying the location
of services (DNS SRV). Network Working Group, 2000.
[4] Abhijit Sur, Dean Skidmore, and Shoma Chakravarty. Web services based SOA for
next generation telecom networks. In IEEE International Conference on Services
Computing, page 520, 2006.
[5] C. Jennings and R. Mahy. Managing Client Initiated Connections in the Session
Initiation Protocol. Internet Draft (work in progress), Internet Engineering
Task Force, 2007.
[6] Marina del Rey. Internet Protocol. Network Working Group, September, 1981.
[7] E. Rescorla and N. Modadugu. Datagram Transport Layer Security. Network
Working Group, April, 2006.
[8] J. Rosenberg et al. SIP: Session Initiation Protocol RFC 3261. Internet
Engineering Task Force, 2002.
[9] M. Handley, V. Jacobson, and C. Perkins. SDP: Session Description Protocol.
Network Working Group, July, 2006.
[10] Henry Sinnreich and Alan B. Johnston. Internet Communication Using SIP. 1st
edition, October, 2001.
[11] J. Rosenberg, R. Mahy, P. Matthews, and D. Wing. Traversal Using Relays around
NAT (TURN): Relay Extensions to Session Traversal Utilities for NAT (STUN).
Internet Engineering Task Force, 2007.
[12] J. Rosenberg, C. Huitema, R. Mahy, and D. Wing. Simple Traversal Underneath
Network Address Translators (NAT) (STUN). Internet Draft (work in
progress), Internet Engineering Task Force, 2006.
[13] K. Johns. Routing of mid dialog requests using sip-outbound. Internet Draft
(work in progress), Internet Engineering Task Force, 2006.
[14] Alan B. Johnston. Understanding the Session Initiation Protocol. 1st edition,
2001.
[15] S. Josefsson. The Base16, Base32, and Base64 Data Encodings RFC 3548.
Internet Draft (work in progress), Internet Engineering Task Force, 2006.
[16] H. Krawczyk. HMAC: Keyed-Hashing for Message Authentication RFC 2104.
[17] Milind Buddhikot, Adiseshu Hari, Kundan Singh, and Scott Miller. MobileNAT:
A New Technique for Mobility Across Heterogeneous Address Spaces. Mobile
Networks and Applications, 10(3), 2005.
[18] David L. Mills. Computer Network Time Synchronization: The Network Time
Protocol. 1st edition, March, 2006.
[19] P. Leach, M. Mealling, and R. Salz. A Universally Unique IDentifier (UUID) URN
Namespace RFC 4122. Internet Engineering Task Force, 2005.
[20] P. Srisuresh and M. Holdrege. IP Network Address Translator (NAT) Terminology
and Considerations RFC 2663. Network Working Group, 1999.
[21] P. Srisuresh and M. Holdrege. IP Network Address Translator (NAT) Terminology
and Considerations. Internet Draft (work in progress), Internet Engineering
Task Force, August, 1999.
[22] Jonathan B. Postel. Simple Mail Transfer Protocol. Network Working Group,
August, 1982.
[23] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and
T. Berners-Lee. Hypertext Transfer Protocol–HTTP/1.1. Network Working Group,
June, 1999.
[24] R. Price, C. Bormann, J. Christoffersson, H. Hannu, and Z. Liu. Signaling
Compression (SigComp). Network Working Group, January 2003.
[25] Howard Rheingold. Smart Mobs: The Next Social Revolution. 1st edition,
October, 2002.
[26] J. Rosenberg. Interactive Connectivity Establishment (ICE): A Methodology for
Network Address Translator (NAT) Traversal for Offer/Answer Protocols.
Internet Draft (work in progress), Internet Engineering Task Force, 2005.
[27] J. Rosenberg. Interactive Connectivity Establishment (ICE): A Protocol for
Network Address Translator (NAT) Traversal for Offer/Answer Protocols.
Internet Draft (work in progress), Internet Engineering Task Force, 2007.
[28] Saikat Guha, Yutaka Takeda, and Paul Francis. NUTSS: A SIP-based Approach
to UDP and TCP Network Connectivity. ACM SIGCOMM, 2004.
[29] What is SIP? Internet, 2007. http://www.sipcenter.com/sip.nsf/html/Background.
[30] Robert Sparks. SIP Basics and Beyond. ACM Press, 2007.
[31] W. Richard Stevens. UNIX Network Programming Volume 1 Networking APIs:
Sockets and XTI. 2nd edition, January, 1998.
[32] T. Dierks and E. Rescorla. The Transport Layer Security (TLS) Protocol Version
1.1. Network Working Group, April, 2006.
[33] V. Paxson and M. Allman. Computing TCP’s Retransmission Timer RFC 2988.
[34] Victor Paulsamy and Samir Chatterjee. Network Convergence and the
NAT/Firewall Problems. In System Sciences, 2003. Proceedings of the 36th
Annual Hawaii International Conference, page 10, 2003.
[35] D. Willis and B. Hoeneisen. Session Initiation Protocol (SIP) Extension Header
Field for Registering Non-Adjacent Contacts RFC 3327. Internet Engineering
Task Force, 2002.
Appendix A
Appendix
A.1 Important data structures

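STUN message header data structure and STUN message data structure:
typedef struct stun_msg_hdr stun_msg_hdr_t;
typedef struct stun_msg stun_msg_t;

/* STUN message header (20 bytes on the wire) */
struct stun_msg_hdr
{
    u_int16_t msgType;       /* message type, e.g. 0x0001 for a Binding request */
    u_int16_t msgLength;     /* message length field */
    u_int32_t magic_cookie;  /* 0x2112A442 */
    u_int96_t id;            /* 96-bit field; u_int96_t is not a standard C
                                type and has to be defined by the program */
};

/* Parsed STUN message: the header followed by the optional attributes.
   Each hasX flag indicates whether attribute X is present. */
struct stun_msg
{
    stun_msg_hdr_t msgHdr;
    int hasMappedAddress;
    stun_atr_address4_t mappedAddress;
    int hasSourceAddress;
    stun_atr_address4_t sourceAddress;
    int hasChangedAddress;
    stun_atr_address4_t changedAddress;
    int hasErrorCode;
    stun_atr_error_t errorCode;
    int hasUnknownAttributes;
    stun_atr_unknown_t unknownAttributes;
    int hasXorMappedAddress;
    stun_atr_address4_t xorMappedAddress;
};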
A.2 Important modifications to the eXosip and osip libraries

Library   File name              Function name

eXosip    eXconf.c               eXosip_keep_alive
          eXregister_api.c       eXosip_register_send_register
          eXtransport.c          eXosip_tcp_connect_socket
          udp.c                  eXosip_read_message
          stun.c; stun.h         new files
          base64.c; base64.h     new files
osip      osipevent.c            osip_message_parse
          osip_message_parse.c   pro_stunmsg; compare_addr
Table A.1: Modifications to the eXosip and osip libraries
A.3 APIs for base64 encoding
void base64_encode (const unsigned char *in, size_t inlen,
                    unsigned char *out, size_t outlen)
bool base64_decode (const unsigned char *in, size_t inlen,
                    unsigned char *out, size_t *outlen)
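
To make the encoding step concrete, the following is a minimal, self-contained Base64 encoder. It is a stand-in written for illustration and is not the base64_encode function of our implementation: the name b64_encode_sketch, its return value and its handling of a too-small output buffer are assumptions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Base64 encoder (standard alphabet, with
 * '=' padding). Writes a NUL-terminated string to out.
 * Returns the number of characters written (excluding the NUL), or 0 if
 * the output buffer is too small or the input is empty. */
size_t b64_encode_sketch(const unsigned char *in, size_t inlen,
                         char *out, size_t outlen)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t need = 4 * ((inlen + 2) / 3);  /* 4 output chars per 3 bytes */
    size_t i, o = 0;

    if (outlen < need + 1)
        return 0;
    /* Full 3-byte groups map to 4 characters each. */
    for (i = 0; i + 2 < inlen; i += 3) {
        unsigned long v = ((unsigned long)in[i] << 16) |
                          ((unsigned long)in[i + 1] << 8) | in[i + 2];
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = tbl[(v >> 6) & 63];
        out[o++] = tbl[v & 63];
    }
    /* A trailing group of 1 or 2 bytes is padded with '='. */
    if (i < inlen) {
        unsigned long v = (unsigned long)in[i] << 16;
        if (i + 1 < inlen)
            v |= (unsigned long)in[i + 1] << 8;
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = (i + 1 < inlen) ? tbl[(v >> 6) & 63] : '=';
        out[o++] = '=';
    }
    out[o] = '\0';
    return o;
}
```

The trailing '=' in the user part of the Path and Route URIs shown in Chapter 6 is consistent with this kind of padded output.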
A.4 APIs for STUN keepalive
int stun_parse_message( char* buf, unsigned int bufLen,
                        stun_msg_t *pmsg, int verbose)
unsigned int stun_encode_message( const stun_msg_t msg, char* buf,
                                  unsigned int bufLen, int verbose)
