
Benchmarks

HiveMQ 3.0 on AWS

Introduction

HiveMQ is a highly scalable Enterprise MQTT Broker designed for lowest latency and very high
throughput. This benchmark document shows typical HiveMQ use cases and the performance
characteristics of HiveMQ. The individual scenarios are designed to reflect real-world use cases
MQTT brokers face in typical projects and deployments.

The goal of these benchmarks is to show the scalability and performance characteristics of
HiveMQ with huge numbers of MQTT clients and very high message throughput. The servers used
in the benchmark scenarios are typical servers HiveMQ customers use on a day-to-day basis, and
no obscure settings were applied which could falsify the results. HiveMQ is installed with the
default configuration, and while there are many performance-relevant knobs available in HiveMQ,
the benchmarks were executed with this default configuration. All Quality of Service benchmarks
use disk persistence, so all guarantees the MQTT specification requires are in place.

INFO: This is a technical document and it’s assumed that the reader is familiar with the basic
principles and concepts of MQTT and TCP/IP.


Table of Contents
Benchmark Scenarios
Benchmark Environment
  AWS
Hardware
  HiveMQ Server Instance
  MQTT Client Instances
Linux Configuration
Java Config
HiveMQ Configuration
Benchmark Tool
Latency Test
  Benchmark Setup
  QoS 0 Results (Results, Discussion)
  QoS 1 Results (Results, Discussion)
  QoS 2 Results (Results, Discussion)
Telemetry Test
  Benchmark Setup
  QoS 0 Results (Results, Discussion)
  QoS 1 Results (Results, Discussion)
  QoS 2 Results (Results, Discussion)
Fan-Out Test
  Benchmark Setup
  QoS 0 Results (Results, Discussion)
  QoS 1 Results (Results, Discussion)
  QoS 2 Results (Results, Discussion)
Conclusion
Appendix A: /etc/sysctl.conf
Appendix B: /etc/security/limits.conf


Benchmark Scenarios
This benchmark document shows 3 typical use cases of HiveMQ, each with a focus on different
performance characteristics. All benchmarks were executed for 30 to 45 minutes (depending on
the test scenario) to verify that the results are stable and show realistic numbers. This also means
that Java Garbage Collections on the server were executed multiple times during the benchmarks,
so the results are not whitewashed by short execution times. The benchmark scenarios in this
document are the following:

1. Latency Test: This test shows the latency of HiveMQ for different Quality of Service levels,
different numbers of MQTT clients and high throughput.

2. Telemetry Test: This benchmark shows a typical telemetry scenario with a high incoming
message rate and a few subscribers which consume the messages. This benchmark discusses
the resource usage of HiveMQ for telemetry use cases.

3. Fan-Out Test: This benchmark is designed to show the performance characteristics of HiveMQ
in a fan-out scenario where a huge number of subscribers receive messages at the same time.
The focus in this benchmark is the resource consumption of HiveMQ for very high message
amplifications.

IMPORTANT: Always bear in mind that MQTT relies on TCP/IP. By design, TCP sends ACK
segments in order to meet its guarantees. These ACK segments are of course delivered over the
network and also count toward incoming / outgoing traffic. Bear that in mind when reading the
results, especially for the bandwidth measurements.


Benchmark Environment
All benchmarks were executed on Amazon Web Services (AWS), a cloud infrastructure provider.

AWS allows deploying servers in a shared environment and is often used for cloud services. AWS
is by far the most popular cloud provider among HiveMQ customers, so using AWS for the
benchmark was a natural choice.

Important: Virtual Machines by definition can't be as performant as physical hardware. If multiple
VMs share the same hardware, the CPU utilization is not necessarily as expressive as on real
hardware. No dedicated hardware servers and no AWS dedicated instances were used in this
benchmark.


AWS

All Virtual Machines were hosted in the Frankfurt (eu-central-1) region of AWS. All EC2 instances
were launched in the same Availability Zone (eu-central-1b), which means all Virtual Machines
were hosted in the same computing center. This results in very low network latency, so the
benchmark results (especially the latency results) do not suffer from latency variability.

The following figure shows the test architecture:

[Figure: test architecture]

We chose one mid-range server for HiveMQ and smaller mid-range servers for the clients to
eliminate the possibility of falsified benchmarks due to overloaded clients.

The HiveMQ server CPU utilization was monitored with AWS CloudWatch at the highest
resolution for data samples, which is 1 minute for CloudWatch. We also deployed a script which
reports the current RAM usage to CloudWatch (CloudWatch does not support RAM monitoring out
of the box). CloudWatch is black-box monitoring, so the monitored CPU and RAM reflect the
utilization of the whole EC2 instance, not only the HiveMQ process.

While this AWS benchmark scenario is tougher for HiveMQ to prove its scalability than e.g. a
bare-metal scenario, we believe it's the right choice since these environments are the environments
real MQTT deployments face on a day-to-day basis. Bear in mind that if you execute the same
benchmarks on your own dedicated hardware, you can expect even better results.


Important: All EC2 instances in this benchmark are shared instances, not dedicated instances,
which means the CPU steal time is significantly higher than on dedicated instances. This is the most
common deployment type we saw from customers, so the goal was not to artificially improve the
results by using an uncommon deployment setup.


Hardware

HiveMQ Server Instance


The following EC2 Instance Type was used for the HiveMQ installation:

HiveMQ EC2 Instance Details

Name                 Value
Instance Type        c4.2xlarge
RAM                  15 GiB (~16 GB)
vCPU                 8
Physical Processor   Intel Xeon E5-2666 v3
Clock Speed (GHz)    2.9

The c4.2xlarge Instance Type is an AWS EC2 Instance for computing-intensive applications and
has 8 (virtual) cores. The c4.2xlarge series ranges from hourly costs of $0.2615 - $0.562 (in
September 2015).

MQTT Client Instances


The following EC2 Instance Type was used for the client installations:

MQTT Client EC2 Instance Details

Name                 Value
Instance Type        c4.xlarge
RAM                  7.5 GiB (~8 GB)
vCPU                 4
Physical Processor   Intel Xeon E5-2666 v3
Clock Speed (GHz)    2.9

The c4.xlarge Instance Type is an AWS EC2 Instance for computing-intensive applications and has
4 (virtual) cores. The c4.xlarge series ranges from hourly costs of $0.1307 - $0.281 (in September
2015).

Important: AWS vCPUs are not the same as physical CPUs. Amazon uses the following definition for
vCPUs:


“Each vCPU is a hyperthread of an Intel Xeon core for M4, M3, C4, C3, R3, HS1, G2, I2, and
D2.“


Linux Configuration

All servers (broker and client instances) were deployed with the standard AWS Linux AMI Amazon
Linux AMI 2015.03.1 x86_64 HVM. The following files were configured to allow the clients and
HiveMQ to open more sockets, and the open file limit was increased:

• /etc/sysctl.conf
• /etc/security/limits.conf

The whole configuration of both files is documented in Appendix A and Appendix B.

Java Config

The default Garbage Collector was used and no Garbage Collection Tuning was made.

The following Java version was used:

java version "1.7.0_85"
OpenJDK Runtime Environment (amzn-2.6.1.3.61.amzn1-x86_64 u85-b01)
OpenJDK 64-Bit Server VM (build 24.85-b03, mixed mode)

HiveMQ Configuration
There is no magic HiveMQ configuration used in these tests which could falsify the results. All
benchmarks are executed with a vanilla HiveMQ 3.0.0 without any further configuration. The
following default plugins were deleted from the plugins folder of HiveMQ:

• JMX Plugin
• SYS Topic Plugin

In order to increase the memory allocated by the JVM, the run.sh script of HiveMQ was modified
to use about two thirds (66%) of the memory available on the server. The relevant line in the
run.sh script is the following:

JAVA_OPTS="$JAVA_OPTS -Djava.net.preferIPv4Stack=true -Xms2G -Xmx10G"

This means HiveMQ starts with 2GB of allocated memory and can scale up to 10GB of RAM.


Benchmark Tool

The HiveMQ Team developed a Netty based MQTT Benchmark Tool which is able to scale up to
many thousands of MQTT clients at once due to its non-blocking and event-driven implementation.
The tool is Java based and uses NIO to implement the non-blocking behaviour. For metric
capturing the tool uses the Dropwizard Metrics library with a UniformReservoir, so all metrics
captured over the runtime of the benchmark tool are weighted equally.

The benchmark tool is designed to avoid Java Garbage Collection as much as possible, so
benchmark results are not falsified by excessive Garbage Collection on the client side. All
benchmark scenarios ran for 45 minutes, so Garbage Collection wasn't avoided totally on the
client side, but the results are significant enough to show the real-world behaviour of HiveMQ
without too much additional latency introduced by the clients.

If you want to perform your own benchmarks with HiveMQ, we can provide the benchmark tool
for free so you can validate the results on your own hardware. All you need to do is contact
contact@hivemq.com and answer some general and benchmark-specific questions about your
MQTT test case.


Latency Test
Latency is key for IoT systems at high scale, where responsiveness and a real-time experience
are key acceptance factors for end users and downstream systems. The following benchmark
shows how HiveMQ performs in an end-to-end scenario with real network roundtrips. The
benchmark consists of different tests with an increasing number of clients and messages per
second. Besides key metrics like average roundtrip latencies, the benchmark also shows relevant
quantiles which help to understand the distribution of the results and outlier data samples.

The latency benchmark is designed to measure the real network roundtrip time of an MQTT
message, which means that network latency fluctuation is also included in these real end-to-end
measurements.


Benchmark Setup

The benchmark uses the HiveMQ Benchmark Tool with the roundtrip scenario. The HiveMQ
Benchmark Tool is used to create a massive amount of truly non-blocking MQTT clients. Each
client acts as publisher and subscriber. Each client publishes 1 message / second to a unique
topic the same (and only the same!) client also subscribes to. This means number of clients =
messages / second. The clients start a timer when publishing a message and stop the timer when
the published message is received again by the same client. Each result is then written to the
metric store which calculates the statistics.

Each published message has the following properties:

Topic     ${clientId}/1
QoS       The QoS defined in the test scenario
Payload   Random 128 byte payload

The clients do not use disk persistence for QoS > 0 and do not implement an in-flight window, in
order to increase throughput. All clients are started with -Xmx6g and -Xms6g JVM parameters in
order to reduce Garbage Collection (which would introduce latency on the client). GC can't be
avoided completely, so the clients occasionally introduce additional millisecond latencies to the
test results.

Each EC2 instance starts 10.000 clients. So if a test uses e.g. 30.000 clients, 3 EC2 instances
are used for the clients.
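
For illustration, the roundtrip measurement described above could look roughly like the following
minimal sketch. This is not the actual (Netty based) HiveMQ Benchmark Tool; it is a simplified
single-client example using the Eclipse Paho client and a Dropwizard Metrics Timer with a
UniformReservoir, as described in the Benchmark Tool chapter. The broker address, client ID and
message count are placeholder values.

import com.codahale.metrics.Timer;
import com.codahale.metrics.UniformReservoir;
import org.eclipse.paho.client.mqttv3.MqttClient;

import java.util.Random;
import java.util.concurrent.TimeUnit;

public class RoundtripClient {

    public static void main(String[] args) throws Exception {
        String clientId = "bench-client-1"; // placeholder client ID
        // Uniform sampling, so all results captured over the runtime are weighted equally.
        Timer latencies = new Timer(new UniformReservoir());

        MqttClient client = new MqttClient("tcp://broker.example.com:1883", clientId);
        client.connect();

        // Each client subscribes to its own unique topic (QoS as defined in the test scenario).
        final long[] publishedAt = new long[1];
        client.subscribe(clientId + "/1", 0, (topic, message) ->
                // Stop the timer when the published message arrives back at the same client.
                latencies.update(System.nanoTime() - publishedAt[0], TimeUnit.NANOSECONDS));

        byte[] payload = new byte[128]; // random 128 byte payload
        new Random().nextBytes(payload);

        // Publish 1 message / second; start the timer on each publish.
        for (int i = 0; i < 60; i++) {
            publishedAt[0] = System.nanoTime();
            client.publish(clientId + "/1", payload, 0, false);
            Thread.sleep(1000);
        }
        client.disconnect();
    }
}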


All tests measured the following metrics (per 10.000 clients):

Metric            Description
Mean              The average latency
75th percentile   The 75th percentile
95th percentile   The 95th percentile
98th percentile   The 98th percentile
99th percentile   The 99th percentile
Median            The median (50th percentile)
StdDev            The standard deviation

If more than one EC2 instance is used, an average of all results of each instance is calculated.
That means the individual results of each EC2 instance are summed up and divided by the number
of instances, i.e. the arithmetic mean is used for each metric:

mean = (result1 + result2 + ... + resultN) / N
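
Reading the captured metrics and averaging them across instances could look like the following
sketch. It uses the Dropwizard Metrics Snapshot API (Timer durations are recorded in
nanoseconds); the class and helper names are illustrative, not taken from the benchmark tool.

import com.codahale.metrics.Snapshot;
import com.codahale.metrics.Timer;

public class LatencyReport {

    // Reads the per-instance metrics from a Timer's snapshot and converts them to milliseconds.
    static void printSnapshot(Timer latencies) {
        Snapshot s = latencies.getSnapshot();
        System.out.printf("mean=%.3f ms median=%.3f ms 75th=%.3f ms 95th=%.3f ms "
                        + "98th=%.3f ms 99th=%.3f ms stddev=%.3f ms%n",
                s.getMean() / 1e6, s.getMedian() / 1e6, s.get75thPercentile() / 1e6,
                s.get95thPercentile() / 1e6, s.get98thPercentile() / 1e6,
                s.get99thPercentile() / 1e6, s.getStdDev() / 1e6);
    }

    // Arithmetic mean over the per-instance results of one metric, as described above.
    static double average(double... perInstanceResults) {
        double sum = 0;
        for (double result : perInstanceResults) {
            sum += result;
        }
        return sum / perInstanceResults.length;
    }

    public static void main(String[] args) {
        // Example: mean latency reported by three client instances (placeholder values in ms).
        System.out.println(average(0.62, 0.58, 0.61)); // prints 0.6033...
    }
}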


QoS 0 Results

This benchmark tests the end-to-end latency of MQTT messages with QoS 0 guarantees. This
means HiveMQ does not use disk persistence, due to the fire-and-forget semantics of QoS 0.
No messages were lost in this test since the TCP connection was stable all the time.

Results

QoS 0 Latency Results

[Chart: latency (0 ms to 11 ms) vs. number of clients / messages per second (10.000 to 50.000)]

This chart shows all measurements, which include Mean, 75th percentile, 95th percentile, 98th
percentile, 99th percentile, Median and the standard deviation.

Discussion

These test results show that the throughput and latency were stable for the whole measurement
time (45 minutes) of every individual test. With an increasing number of clients and
messages / second the latency did not increase significantly. The average roundtrip time was
always sub-millisecond, even with the linearly increasing throughput and number of subscriptions
HiveMQ maintains.

QoS 0 Latency Results (ms)

         10.000        20.000        30.000        40.000        50.000
Mean     0,601411133   0,704656201   0,554681218   0,695654531   0,775675179
75th     0,29779875    0,322965875   0,363472      0,456715563   0,54715865
95th     0,35568555    0,413587525   0,508441367   0,8039185     1,15333623
98th     0,39161474    0,47640335    0,625745567   1,21136467    1,983686516
99th     0,43736171    0,841774955   0,823965263   3,487792645   6,131923234
Median   0,264643      0,28450425    0,309943667   0,35570175    0,3862116
StdDev   7,640493554   10,25841411   5,392033737   6,531972426   5,234000972

This benchmark demonstrated that HiveMQ delivers very high QoS 0 message throughput
(50.000 messages per second) with sub-millisecond latency.


QoS 1 Results

This benchmark tests the end-to-end latency of MQTT messages with QoS 1 guarantees. This
means HiveMQ uses disk persistence for every outgoing MQTT message, due to the at-least-
once semantics of QoS 1. No messages were lost in this test since the TCP connection was stable
all the time and the QoS 1 guarantees were in place.

Results

QoS 1 Latency Results

[Chart: latency (0 ms to 40 ms) vs. number of clients / messages per second (2.500 to 17.500)]

This chart shows all measurements, which include Mean, 75th percentile, 95th percentile, 98th
percentile, 99th percentile, Median and the standard deviation.


QoS 1 Latency Results (ms)

         2.500         5.000         7.500         10.000        12.500        15.000        17.500
Mean     0,415047152   0,604632525   0,767690498   1,062635311   1,163355852   2,032116374   2,459094346
75th     0,358394      0,389408      0,461854      0,525717875   0,675208875   0,857987      1,044124375
95th     1,17566645    1,2139728     1,374251375   1,3377911     2,4430323     3,134619025   3,9443258
98th     2,64368922    3,07982922    2,95271823    3,6996623     7,28728491    13,49425474   18,23148569
99th     2,8415169     5,4699688     7,403215115   12,88660823   19,44312396   32,44365795   33,49844204
Median   0,3032895     0,328127      0,3717255     0,40715475    0,481312      0,55862825    0,63633975
StdDev   0,506549184   2,889852539   5,889325998   7,383906539   5,198095769   12,02608314   17,18158651

Discussion

This test shows that the throughput and latency were stable for the whole measurement
time (45 minutes) of every individual test. With an increasing number of clients and messages per
second the latency did not increase significantly. The average roundtrip time was always in the
low single-digit milliseconds. Even with linearly increasing throughput and number of
subscriptions, all measured latencies remained very low.

Every single message was persisted to disk before delivery, so the additional latency compared to
QoS 0 messages is a result of the additional disk I/O overhead.

This benchmark demonstrated that HiveMQ delivers very high QoS 1 message throughput
(> 15.000 messages per second) with single-digit millisecond latency on average, while complying
with the QoS 1 at-least-once guarantees.

If the QoS 1 guarantees were weakened by using the in-memory persistence, which HiveMQ
also supports, results close to the QoS 0 results can be expected.

QoS 2 Results

This benchmark tests the end-to-end latency of MQTT messages with QoS 2 guarantees. This
means HiveMQ uses disk persistence for every outgoing MQTT message and message
acknowledgement, due to the exactly-once semantics of QoS 2. No messages were lost in this
test since the TCP connection was stable all the time and the QoS 2 guarantees were in place.

Results

QoS 2 Latency Results

[Chart: latency (0 ms to 40 ms) vs. number of clients / messages per second (2.000 to 12.000)]

This chart shows all measurements, which include Mean, 75th percentile, 95th percentile, 98th
percentile, 99th percentile, Median and the standard deviation.


Discussion

QoS 2 Latency Results (ms)

         2.000         4.000         6.000         8.000         10.000        12.000
Mean     0,738507188   1,241731747   1,037575518   2,007192856   2,727078588   3,73635089
75th     0,7084635     0,758113      1,02890375    1,372976      1,81565575    2,93099875
95th     2,5722831     2,02603975    2,51790775    3,8749513     5,23610975    8,826081525
98th     3,22340022    3,31387626    4,40526338    6,88232687    11,48056945   19,44313161
99th     3,68480135    5,93211478    7,60917181    17,13762828   33,23997715   36,30610799
Median   0,521314      0,5783345     0,7021285     0,85303175    1,0951385     1,50396275
StdDev   0,869120678   10,85796472   1,461630153   13,12471279   15,76470025   15,05738578

This test shows that the throughput and latency were stable for the whole measurement
time (45 minutes) of every individual test and that with an increasing number of clients and
messages / second the latency did not increase significantly. The average roundtrip time was
always in the low single-digit milliseconds. Even with linearly increasing throughput and number of
subscriptions, the measured latencies remained very low.

Every single message and message acknowledgement (PUBREL) was persisted to disk before
delivery, so the additional latency compared to QoS 0 messages is a result of the additional disk
I/O overhead.

QoS 2 uses a four-way message flow, so network latencies play a bigger role here than in the
other latency benchmarks. The number of disk persistence operations is also doubled in this test
compared to QoS 1 messages.

This benchmark demonstrated that HiveMQ delivers very high QoS 2 message throughput with
a single-digit millisecond latency average, while complying with the QoS 2 exactly-once guarantees.

If the QoS 2 guarantees are weakened by using the in-memory persistence, which HiveMQ also
supports, results close to the QoS 0 results can be expected. The additional network latency which is a
result of the four-way communication for QoS 2 guarantees needs to be considered, though.
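
For reference, the four-way packet flow of a QoS 2 delivery between a sender and a receiver, as
defined by the MQTT specification, looks like this (HiveMQ persists the PUBLISH and the PUBREL,
as described above):

PUBLISH (QoS 2)   sender --> receiver
PUBREC            sender <-- receiver
PUBREL            sender --> receiver
PUBCOMP           sender <-- receiver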


Telemetry Test
MQTT brokers are often deployed in environments where it's key to collect data from a huge
number of devices while only a few subscribers process the data published by the devices. A
typical use case is a telemetry scenario where the MQTT broker needs to process a very high
incoming MQTT message rate.

The following benchmarks focus on the throughput of HiveMQ in such a scenario. In order to
understand the runtime behaviour of HiveMQ in a telemetry scenario, all relevant runtime statistics
like CPU usage, RAM and used bandwidth are measured. So this benchmark is focused on the
resource consumption of HiveMQ while delivering constant message throughput.

NOTE: A more sophisticated way to process data in such a scenario is to use the HiveMQ plugin
system instead of MQTT subscribers for processing data. Typical MQTT subscriber
applications get overwhelmed quickly (even with a few thousand messages/second) while HiveMQ
plugins allow constant data delivery to your backend systems, independent of the load on specific
"hot MQTT topics".


Benchmark Setup

The benchmark uses different instances of the HiveMQ Benchmark Tool with either the publish
scenario or the subscribe scenario enabled. The HiveMQ Benchmark Tool is used to create a
massive amount of truly non-blocking MQTT clients. Each client instance either acts as publisher
or subscriber. Every single publishing client publishes 1 message / second to a specific topic.
Each EC2 instance with MQTT clients publishes to a number of topics, so that one topic
is used per 10.000 clients. This also means number of publishing clients = messages / second.
One subscribing client is used for each topic, so every subscriber deals with 10.000 messages /
second, which is a rate a subscribing client can handle without getting overwhelmed.

All subscribers subscribe with QoS 0. This ensures that the subscribing clients are not
overwhelmed by the extreme amount of messages and don't need to acknowledge each MQTT
packet. The other main reason for this decision is that, in order to meet the QoS 1 and 2
guarantees, HiveMQ needs to maintain an in-flight window per topic to meet the Ordered
Topic guarantees of the MQTT specification. This means if the subscriber is not able to
acknowledge all incoming messages at once (which is impossible), messages are queued on the
broker, which would mean the test wouldn't be executable and would at best measure other
behaviours (like load shedding).

Every single benchmark was executed for 30 minutes to show that the results are stable when
running under constant load for a long period of time. CPU, average bandwidth usage per minute
and RAM were at a constant level for the whole time in every single benchmark.

1 Subscriber Instance (which subscribes to all topics) and 2 publishing EC2 instances were
used in this test.

Each published message has the following properties:

Topic     topic/${clientNumber % 10.000}
QoS       The QoS defined in the test scenario
Payload   Random 128 byte payload

Example:
In a scenario where an EC2 instance spawns 20.000 clients and 20.000 messages/second are
published by the instance, the following topics are used for the specific test:

• topic/1
• topic/2

This means all MQTT messages are distributed across 2 topics in this example.
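
A minimal sketch of this topic assignment, assuming the intent is one shared topic per block of
10.000 publishing clients (the exact expression used by the benchmark tool may differ):

// Maps publishing client number 1..10.000 to topic/1, 10.001..20.000 to topic/2, and so on.
static String topicFor(int clientNumber) {
    return "topic/" + ((clientNumber - 1) / 10000 + 1);
}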

The clients do not use disk persistence for QoS > 0 and do not implement an in-flight window, in
order to increase throughput. All clients are started with -Xmx6g and -Xms6g JVM parameters in
order to reduce Garbage Collection (which would introduce latency on the client). GC can't be
avoided completely, so the clients occasionally introduce additional millisecond latencies to the
test results.

For the measurements of CPU, RAM and bandwidth, AWS CloudWatch was used in these tests.
CloudWatch doesn't report RAM metrics by default, so the AWS CloudWatch Monitoring Scripts
(provided by Amazon) were used, and the reports took place once a minute via cron. You can
learn more about these monitoring scripts in the AWS documentation.
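
The cron entry for the RAM reports looked roughly like the following (the script path is an
assumption; mon-put-instance-data.pl and its flags are part of the Amazon CloudWatch
Monitoring Scripts):

* * * * * /home/ec2-user/aws-scripts-mon/mon-put-instance-data.pl --mem-util --mem-used --from-cron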


QoS 0 Results

This benchmark tests the resource consumption of HiveMQ with incoming QoS 0 messages.
The following measurements were captured during the test executions:

• Average CPU utilization
• Used total memory of the EC2 instance
• Incoming and outgoing traffic per minute

No messages were lost in this test since the TCP connection was stable all the time.

The parameters for the benchmark clients were the following:

Number of Publisher Instances                    2
Number of topics per EC2 instance                1 per 10.000 clients
Number of Subscriber Instances                   1
Messages each subscribing client subscribes to   10.000

Results

QoS 0 Telemetry CPU Utilization

[Chart: average CPU utilization vs. incoming QoS 0 messages / second -
10.000: 21,74% | 20.000: 48,79% | 30.000: 64,4% | 40.000: 78,45% | 50.000: 85,88% |
60.000: 91,77% | 70.000: 94,63%]

QoS 0 Telemetry RAM Usage

[Chart: RAM usage vs. incoming QoS 0 messages / second -
10.000: 1.415 MB | 20.000: 1.537 MB | 30.000: 1.963 MB | 40.000: 2.214 MB | 50.000: 2.445 MB |
60.000: 2.731 MB | 70.000: 2.866 MB]

QoS 0 Telemetry Bandwidth Usage

[Chart: incoming and outgoing bandwidth (MB / min) vs. incoming QoS 0 messages / second
(10.000 to 70.000)]

Discussion
Increasing the total number of QoS 0 messages per second linearly results in a linear
bandwidth increase, while CPU and RAM usage grow at a predictable level.

A notable observation is that while the bandwidth usage increases linearly with the number of
messages per second, the CPU and RAM usage do not increase linearly. HiveMQ delivers
constant and predictable results until the CPU limits of the EC2 instance are reached. RAM is
negligible in this test since 3GB of RAM usage was never exceeded, although the machine was
configured to reserve up to 10GB of RAM for HiveMQ. The limiting factor in this test is clearly CPU,
and even higher throughput can be expected for machines with more computing power. The
multithreaded nature of HiveMQ allows it to scale with the number of CPUs.

Another observation is that the CPU usage seems counter-intuitive at first sight, since HiveMQ
starts with comparatively high CPU usage and then increases the CPU usage at a decreasing
rate. This behaviour is something we see a lot with EC2, while the behaviour of HiveMQ on physical
hardware tends to increase more steadily but start with lower CPU utilization.
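
As a rough plausibility check for the bandwidth numbers (an estimate, not a measured value): an
MQTT 3.1.1 QoS 0 PUBLISH with a 128 byte payload and a short topic name is on the order of
140 bytes, so 70.000 incoming messages / second correspond to roughly 70.000 x 140 bytes,
i.e. about 9,8 MB/s or 588 MB/min of pure MQTT traffic. The measured per-minute bandwidth is
higher because the TCP/IP headers and the ACK segments mentioned in the introduction are
included in the CloudWatch numbers.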


QoS 1 Results

This benchmark tests the resource consumption of HiveMQ with incoming QoS 1 messages.
As discussed in the Benchmark Setup chapter, the subscribing clients subscribe with QoS 0. The
following measurements were captured during the test executions:

• Average CPU utilization
• Used total memory
• Incoming and outgoing traffic per minute

No messages were lost in this test since the TCP connection was stable all the time.

The parameters for the benchmark clients were the following:

Number of Publisher Instances                    2
Number of topics per EC2 instance                1 per 10.000 clients
Number of Subscriber Instances                   1
Messages each subscribing client subscribes to   10.000


Results

QoS 1 Telemetry CPU Utilization

[Chart: average CPU utilization vs. incoming QoS 1 messages / second -
10.000: 23,83% | 20.000: 48,53% | 30.000: 64,92% | 40.000: 80,4% | 50.000: 89,32% |
60.000: 95,9%]

QoS 1 Telemetry RAM Usage

[Chart: RAM usage vs. incoming QoS 1 messages / second -
10.000: 1.414,68 MB | 20.000: 1.823,64 MB | 30.000: 2.286,87 MB | 40.000: 2.649,76 MB |
50.000: 2.798,94 MB | 60.000: 2.835,48 MB]

QoS 1 Telemetry Bandwidth Usage

[Chart: incoming and outgoing bandwidth (MB / min) vs. incoming QoS 1 messages / second
(10.000 to 60.000)]

Discussion
Increasing the total number of QoS 1 messages per second linearly results in a linear
bandwidth increase, while CPU and RAM usage grow at a predictable level.

A notable observation is that while the bandwidth usage increases linearly with the number of
messages / second, the CPU and RAM usage do not increase linearly. HiveMQ delivers constant
and predictable results until the CPU limits of the EC2 instance are reached. RAM is negligible in
this test since 3GB of RAM usage was never exceeded, although the machine was configured to
reserve up to 10GB of RAM for HiveMQ. The limiting factor in this test is clearly CPU, and even
higher throughput can be expected for machines with more computing power. The multithreaded
nature of HiveMQ allows it to scale with the number of CPUs.

Another observation is that the CPU usage seems counter-intuitive at first sight, since HiveMQ
starts with comparatively high CPU usage and then increases the CPU usage at a decreasing
rate. This behaviour is something we see a lot with EC2, while the behaviour of HiveMQ on physical
hardware tends to increase more steadily but start with lower CPU utilization.

The bandwidth measurements show that TCP overhead plays a bigger role here than with QoS 0
since the payload in this benchmark is relatively small (128 byte). The fact that the incoming QoS 1
packets are bigger than the outgoing QoS 0 messages (they include a message identifier) is also
worth considering.
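
To put rough numbers on this packet size difference (an estimate for MQTT 3.1.1, assuming a
short topic name like topic/1): a PUBLISH packet consists of a 1 byte fixed header, a remaining-
length field (2 bytes at this size), the length-prefixed topic (2 + 7 bytes), a 2 byte packet identifier
for QoS > 0, and the 128 byte payload. An incoming QoS 1 PUBLISH is therefore around
142 bytes versus around 140 bytes for an outgoing QoS 0 PUBLISH, plus a 4 byte PUBACK
flowing back to each publisher.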


QoS 2 Results

This benchmark tests the resource consumption of HiveMQ with incoming QoS 2 messages in a
telemetry scenario. As discussed in the Benchmark Setup chapter, the subscribing clients use QoS
0. The following measurements were captured during the test executions:

• Average CPU utilization
• Used total memory
• Incoming and outgoing traffic per minute

No messages were lost in this test since the TCP connection was stable all the time.

The parameters for the benchmark clients were the following:

Number of Publisher Instances                    2
Number of topics per EC2 instance                1 per 10.000 clients
Number of Subscriber Instances                   1
Messages each subscribing client subscribes to   10.000


Results

QoS 2 Telemetry CPU Utilization

[Chart: average CPU utilization vs. incoming QoS 2 messages / second -
10.000: 37,84% | 20.000: 66,82% | 30.000: 85,77% | 40.000: 96,98%]

QoS 2 Telemetry RAM Usage

[Chart: RAM usage vs. incoming QoS 2 messages / second -
10.000: 1.422,08 MB | 20.000: 1.892,91 MB | 30.000: 2.734,95 MB | 40.000: 2.779,15 MB]

QoS 2 Telemetry Bandwidth Usage

[Chart: incoming and outgoing bandwidth (MB / min) vs. incoming QoS 2 messages / second
(10.000 to 40.000)]


Discussion
Increasing the total number of QoS 2 messages per second linearly results in a linear
bandwidth increase, while CPU and RAM usage grow at a predictable level.

A notable observation is that while the bandwidth usage increases linearly with the number of
messages / second, the CPU and RAM usage do not increase linearly. RAM utilization even
stabilizes at ~3GB. HiveMQ delivers constant and predictable results until the CPU limits of the EC2
instance are reached. RAM is negligible in this test since 3GB of RAM usage was never exceeded,
although the machine was configured to reserve up to 10GB of RAM for HiveMQ. The limiting
factor in this test is clearly CPU, and even higher throughput can be expected for machines with
more computing power. The multithreaded nature of HiveMQ allows it to scale with the number of
CPUs.

The bandwidth measurements show that the QoS 2 overhead plays a bigger role than with QoS
0 and QoS 1 in this benchmark. The payload of the MQTT PUBLISH messages is relatively small
(128 byte) while the total bandwidth usage is quite high. The results show that although the
overhead of QoS 2 is very high, HiveMQ is able to scale predictably with QoS 2 and delivers
high message throughput.


Fan-Out Test
Due to its Publish / Subscribe nature, MQTT is often used for broadcasting systems where a single
message needs to be delivered to multiple subscribing clients. This can result in very high
message amplification, which could potentially drain resources on the broker quickly. Fan-out tests
are sometimes considered the supreme discipline in messaging due to the high amplification
rate fan-out deliveries cause.

The following benchmarks are designed to show the behaviour of HiveMQ in scenarios with very
high, up to extreme, message amplification and to measure the resource consumption of HiveMQ
while dealing with a very high amount of outgoing messages / second.


Benchmark Setup

The benchmark uses different instances of the HiveMQ Benchmark Tool with either the publish
scenario or the subscribe scenario enabled. The HiveMQ Benchmark Tool is used to create a
massive amount of truly non-blocking MQTT clients. Each client instance either acts as publisher
or subscriber. Exactly one publisher is used, which constantly publishes one message per second
to the broker. The number of outgoing publishes / second for each test is the number of subscribing
clients, so the number of subscribers = number of outgoing messages / second.

A variable number of EC2 instances hosting the subscribing clients is used for each test, since the
subscribing clients turned out to be the bottleneck in this benchmark. So tests with lower outgoing
messages per second rates use a smaller number of EC2 instances with subscribing clients.

All subscribers use the QoS level defined in the test and the publisher also publishes with the
same QoS level. Although HiveMQ needs to maintain an in-flight window per topic in order to meet
the Ordered Topic guarantees of the MQTT specification, the in-flight message queue is not
expected to increase in this test.

Every single benchmark was executed for 30 minutes to show that the results are stable when
running under constant load for a long period of time. CPU, average bandwidth usage per minute
and RAM were at a constant level for the whole time in every single benchmark.
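
The publishing side of this scenario can be illustrated with a minimal sketch. Again, this is not the
actual benchmark tool; broker address, topic and QoS are placeholder values.

import org.eclipse.paho.client.mqttv3.MqttClient;

import java.util.Random;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FanOutPublisher {

    public static void main(String[] args) throws Exception {
        MqttClient publisher = new MqttClient("tcp://broker.example.com:1883", "fanout-publisher");
        publisher.connect();

        byte[] payload = new byte[128]; // random 128 byte payload
        new Random().nextBytes(payload);

        // Exactly one publish per second to a single topic. Every subscriber of that
        // topic receives a copy, so subscribers = outgoing messages / second.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                publisher.publish("fanout/topic", payload, 1, false); // QoS as defined in the test
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, 0, 1, TimeUnit.SECONDS);
        // Runs until the process is terminated; the scheduler thread keeps the JVM alive.
    }
}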


QoS 0 Results

This benchmark tests the resource consumption of HiveMQ with a huge amount of outgoing
QoS 0 messages. The following measurements were captured during the test executions:

• Average CPU utilization
• Used total memory
• Incoming and outgoing traffic per minute

No messages were lost in this test since the TCP connection was stable all the time.


Results

QoS 0 Fan-Out CPU Utilization

[Chart: average CPU utilization vs. outgoing QoS 0 messages / second -
25.000: 13,07% | 50.000: 25,03% | 75.000: 36,74% | 100.000: 52,04% | 125.000: 72,138% |
150.000: 74,29%]

QoS 0 Fan-Out RAM Usage

[Chart: RAM usage vs. outgoing QoS 0 messages / second -
25.000: 2.009,06 MB | 50.000: 2.822,13 MB | 75.000: 3.336,92 MB | 100.000: 7.919,43 MB |
125.000: 8.007 MB | 150.000: 8.566,35 MB]

QoS 0 Fan-Out Bandwidth Usage

[Chart: incoming and outgoing bandwidth (MB / min) vs. outgoing QoS 0 messages / second
(25.000 to 150.000)]

Discussion

The results show that with a higher amplification rate (from 25.000x up to 150.000x), the total
throughput increases as expected. TCP overhead plays a big role in these tests. The incoming
message rate stays the same (1 publish / second) while the overall traffic increases drastically,
which can be explained by the excessive sending of (required) TCP ACK segments at the
subscriber side.

The overall throughput starts to stagnate at ~125.000 messages/second, which is most likely
caused by the MQTT subscriber test setup or by bandwidth limitations of EC2. The CPU utilization
of the broker EC2 instance was always below 75% and the available RAM was not exhausted,
while network usage was very high on the client and broker side.

A notable observation is that the RAM usage increases significantly between 75.000 and 100.000
messages / second, which correlates with the outgoing bandwidth increase. After the initial drastic
increase, the RAM usage stays stable at a high level. The RAM limits weren't exceeded in these
benchmarks.

HiveMQ was able to deliver an extreme amount of fan-out messages without hitting any
CPU or RAM limits in the 30-minute individual tests.

QoS 1 Results

This benchmark tests the resource consumption of HiveMQ with a huge amount of outgoing
QoS 1 messages. In order to meet the QoS 1 guarantees of the MQTT specification, HiveMQ uses
disk persistence to save every outgoing message to disk before delivering it. HiveMQ also
maintains a separate in-flight window for each individual subscriber (and topic) in order to meet the
Ordered Topic guarantee that is required for all QoS 1 and 2 messages.
The following measurements were captured during the test executions:

• Average CPU utilization
• Used total memory
• Incoming and outgoing traffic per minute

No messages were lost in this test since the TCP connection was stable all the time.


Results

QoS 1 Fan-Out CPU Utilization

[Chart: average CPU utilization vs. outgoing QoS 1 messages / second -
2.000: 4,61% | 4.000: 11,141% | 6.000: 18,1% | 8.000: 25,757% | 10.000: 32,736% |
12.000: 41,217% | 14.000: 47,32% | 16.000: 61,019%]

QoS 1 Fan-Out RAM Usage

[Chart: RAM usage vs. outgoing QoS 1 messages / second -
2.000: 2.678,93 MB | 4.000: 2.709,95 MB | 6.000: 3.434,086 MB | 8.000: 4.704,383 MB |
10.000: 5.699,695 MB | 12.000: 5.998,88 MB | 14.000: 5.983,379 MB | 16.000: 6.924,168 MB]

QoS 1 Fan-Out Bandwidth Usage

[Chart: incoming and outgoing bandwidth (MB / min) vs. outgoing QoS 1 messages / second
(2.000 to 16.000)]


Discussion
QoS 1 message fan-out rates are significantly lower than with QoS 0, which is the result of the
additional disk persistence used for outgoing QoS 1 messages. Although disk I/O is one of the
limiting factors for this benchmark, the throughput, CPU and RAM grow predictably with the
message rate. At a rate of 15.000 outgoing QoS 1 messages / second, a more significant increase
in RAM and CPU can be observed. Higher QoS 1 message rates are expected to come at a higher
cost in terms of CPU and RAM in this concrete scenario.

TCP overhead plays a big role in these tests for bandwidth usage. The incoming message rate
stays the same (1 publish / second) while the overall traffic increases drastically, which can be
explained by the excessive sending of (required) TCP ACK segments at the subscriber side.

A notable observation is that the RAM usage hits a sweet spot from ~10.000 messages / second
up to 14.000 messages / second while the throughput increases linearly. After the initial drastic
increase, the RAM usage stays stable at a high level.

HiveMQ was able to deliver more than 15.000 QoS 1 messages per second while using disk
persistence and without hitting any CPU or RAM limits in the 30-minute individual tests.


QoS 2 Results

This benchmark tests the resource consumption of HiveMQ with a high amount of outgoing QoS
2 messages. In order to meet the QoS 2 guarantees of the MQTT specification, HiveMQ uses disk
persistence to save every outgoing message and acknowledgement to disk before delivering.
HiveMQ also maintains a separate in-flight window for each individual subscriber (and topic) in
order to meet the Ordered Topic guarantee that is required for all QoS 1 and 2 messages.
The following measurements were captured during the test executions:

• Average CPU utilization
• Used total memory
• Incoming and outgoing traffic per minute

No messages were lost in this test since the TCP connection was stable all the time.


Results

QoS 2 Fan-Out CPU Utilization

[Chart: average CPU utilization vs. outgoing QoS 2 messages / second -
2.000: 7,78% | 4.000: 18,425% | 6.000: 29,3% | 8.000: 42,26% | 10.000: 53,72% | 12.000: 63,91%]

QoS 2 Fan-Out RAM Usage

[Chart: RAM usage vs. outgoing QoS 2 messages / second -
2.000: 2.701,77 MB | 4.000: 3.320,63 MB | 6.000: 4.615,72 MB | 8.000: 6.072,672 MB |
10.000: 6.544,582 MB | 12.000: 7.082,44 MB]

QoS 2 Fan-Out Bandwidth Usage

[Chart: incoming and outgoing bandwidth (MB / min) vs. outgoing QoS 2 messages / second
(2.000 to 12.000)]


Discussion
QoS 2 message fan-out rates are significantly lower than with QoS 0, which is the result of the
additional disk persistence used for outgoing QoS 2 messages and their PUBREL
acknowledgements. Although disk I/O is one of the limiting factors for this benchmark, the
throughput, CPU and RAM grew predictably with the message rate. Significantly higher QoS 2
message rates are expected to come at a higher cost in terms of CPU and RAM in this concrete
scenario.

Besides the four-way MQTT message flow for QoS 2, TCP overhead plays a role in these tests for
bandwidth usage. The incoming message rate stayed the same (1 publish / second) while the
overall traffic increased drastically, which can be explained by the excessive sending of (required)
TCP ACK segments at the subscriber side.

A notable observation is that the CPU usage grows predictably and linearly with the throughput.

HiveMQ was able to deliver more than 12.000 QoS 2 messages per second while using disk
persistence and without hitting any CPU or RAM limits in the 30-minute individual tests.


Conclusion
This benchmark document focused on three completely different test scenarios which are heavily
inspired by real-world MQTT broker use cases. These different tests gave insight into how HiveMQ
behaves even under very high load. The following long-running (30-45 min) tests were executed:

• Latency tests
• Telemetry tests
• Fan-Out tests

All tests were executed with QoS 0, QoS 1 and QoS 2, and the findings were discussed in the
individual chapters.

The Latency tests proved that HiveMQ delivers MQTT messages with lowest, mostly sub-
millisecond latencies even at message rates up to 50.000 MQTT messages / second. Besides
average roundtrip times, different quantiles and the standard deviation were also part of the
measurement to show the real-world behaviour of HiveMQ, including outlier data samples.

The Telemetry tests showed that HiveMQ handles more than 60.000 messages / second with
minimal resource consumption (RAM < 3 GB) and with linearly increasing throughput up to
15 MB/s for each of incoming and outgoing traffic.

The Fan-Out tests discussed the HiveMQ performance characteristics under a single-publisher,
multi-subscriber benchmark. HiveMQ served up to 150.000 messages / second with 33MB/s to
subscribers with medium CPU utilization and used RAM below 9 GB. Due to disk persistence,
HiveMQ met all QoS 1 and 2 guarantees in the fan-out tests at the cost of a lower message rate
per second.

This benchmark document proved that HiveMQ runs rock stable and with predictable
performance in all major MQTT areas of operation with very high throughput and very low,
mostly sub-millisecond, latency. HiveMQ is suitable for mission-critical deployments in
enterprise environments, at dedicated data centers and in the cloud.


Appendix A: /etc/sysctl.conf

The following /etc/sysctl.conf configuration was used on both the HiveMQ server and the
MQTT client instances:

# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

# Controls the default maximum size of a message queue
kernel.msgmnb = 65536

# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

# ipv6 support in the kernel, set to 0 by default
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1

# For HiveMQ load testing
net.ipv4.tcp_fin_timeout = 30
fs.file-max = 5097152
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.core.rmem_default = 524288
net.core.wmem_default = 524288
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.ip_local_port_range = 1024 65535

Appendix B: /etc/security/limits.conf

The following /etc/security/limits.conf configuration was used on both the HiveMQ and
the MQTT client server instances:

#<domain> <type> <item> <value>
#
#* soft core 0
#* hard rss 10000
#@student hard nproc 20
#@faculty soft nproc 20
#@faculty hard nproc 50
#ftp hard nproc 0
#@student - maxlogins 4

* hard nofile 500000
* soft nofile 500000
root hard nofile 500000
root soft nofile 500000

# End of file
