HiveMQ 3 Benchmarks AWS
Introduction
HiveMQ is a highly scalable Enterprise MQTT Broker designed for lowest latency and very high
throughput. This benchmark document shows typical HiveMQ use cases and the performance
characteristics of HiveMQ. The individual scenarios are designed to show real-world use cases
MQTT brokers face in typical projects and deployments.
The goal of these benchmarks is to show the scalability and performance characteristics of
HiveMQ with huge amounts of MQTT clients and very high message throughput. The servers used
in the benchmark scenarios are typical servers HiveMQ customers use on a day-to-day basis and
there are no obscure settings applied which could distort the results. HiveMQ is installed with the
default configuration and, while there are many performance-relevant knobs available in HiveMQ,
the benchmarks were executed without tuning them. All Quality of Service benchmarks use disk
persistence so all guarantees the MQTT specification requires are in place.
INFO: This is a technical document and it’s assumed that the reader is familiar with the basic
principles and concepts of MQTT and TCP/IP.
Table of Contents
Benchmark Scenarios
Benchmark Environment
AWS
Hardware
HiveMQ Server Instance
MQTT Client Instances
Linux Configuration
Java Config
HiveMQ Configuration
Benchmark Tool
Latency Test
Benchmark Setup
QoS 0 Results
Results
Discussion
QoS 1 Results
Results
Discussion
QoS 2 Results
Results
Discussion
Telemetry Test
Benchmark Setup
QoS 0 Results
Results
Discussion
QoS 1 Results
Results
Discussion
QoS 2 Results
Results
Discussion
Fan-Out Test
Benchmark Setup
QoS 0 Results
Results
Discussion
QoS 1 Results
Results
Discussion
QoS 2 Results
Results
Discussion
Conclusion
Appendix A: /etc/sysctl.conf
Appendix B: /etc/security/limits.conf
Benchmark Scenarios
The benchmark document shows 3 typical use cases of HiveMQ, all with a focus on different
performance characteristics. All benchmarks were executed for 30 to 45 minutes (depending on
the test scenario) to verify that the results are stable and show realistic numbers. This also means
that Java Garbage Collections on the server were executed multiple times during the benchmarks,
so the results are not whitewashed by short execution times. The benchmark scenarios in this
document are the following:
1. Latency Test: This test shows the latency of HiveMQ for different Quality of Service levels,
different numbers of MQTT clients and high throughput.
2. Telemetry Test: This benchmark shows a typical telemetry scenario with a high incoming
message rate and a few subscribers which consume the messages. This benchmark discusses
the resource usage of HiveMQ for telemetry use cases.
3. Fan-Out Test: This benchmark is designed to show the performance characteristics of HiveMQ
in a fan-out scenario where a huge number of subscribers receive messages at the same time.
The focus in this benchmark is the resource consumption of HiveMQ for very high message
amplifications.
IMPORTANT: Always bear in mind that MQTT relies on TCP/IP. By design, TCP sends ACK
segments in order to meet its guarantees. These ACK segments are delivered over the network
and count toward incoming / outgoing traffic. Keep this in mind when reading the results,
especially the bandwidth measurements.
Benchmark Environment
All benchmarks were executed on Amazon Web Services (AWS), a cloud infrastructure provider.
AWS allows servers to be deployed in a shared environment and is often used for cloud services.
AWS is by far the most popular cloud provider among HiveMQ customers, so using AWS for the
benchmark was a natural choice.
AWS
All Virtual Machines were hosted in the Frankfurt (eu-central-1) region of AWS. All EC2 instances
were launched in the same Availability Zone (eu-central-1b), which means all Virtual Machines
are hosted in the same data center. This results in very low network latency, so the
benchmark results (especially the latency results) do not suffer from latency variability.
We have chosen one mid-range sized server for HiveMQ and also smaller mid-range sized
servers for the clients to eliminate the possibility of falsified benchmarks due to overloaded
clients.
The HiveMQ server CPU utilization was monitored with AWS CloudWatch at the highest
resolution for data samples, which is 1 minute for CloudWatch. We also deployed a script which
reports the current RAM usage to CloudWatch (CloudWatch does not support RAM monitoring).
CloudWatch is a black-box monitoring service, so the monitored CPU and RAM values reflect the
utilization of the whole EC2 instance, not only the HiveMQ process.
While this AWS benchmark scenario is a tougher environment for HiveMQ to prove its scalability
than e.g. a bare-metal scenario, we believe it's the right choice since these are the environments
real MQTT deployments face on a day-to-day basis. So bear in mind, if you execute the same
benchmarks on your own dedicated hardware, you can expect even better results.
Important: All EC2 instances in this benchmark are shared instances, not dedicated instances,
which means the CPU steal time is significantly higher than on dedicated instances. This is the most
common deployment type we saw from customers, so the goal was not to artificially improve the
results by using an uncommon deployment setup.
Hardware
HiveMQ Server Instance
The c4.2xlarge Instance Type is an AWS EC2 Instance for compute-intensive applications and
has 8 (virtual) cores. The hourly cost of the c4.2xlarge series ranges from $0.2615 to $0.562
(in September 2015).
HiveMQ EC2 Instance Details
Name    Value
vCPU    8
MQTT Client Instances
The c4.xlarge Instance Type is an AWS EC2 Instance for compute-intensive applications and has
4 (virtual) cores. The hourly cost of the c4.xlarge series ranges from $0.1307 to $0.281
(in September 2015).
Name    Value
vCPU    4
Important: AWS vCPUs are not the same as physical CPUs. Amazon uses the following definition for
vCPUs:
“Each vCPU is a hyperthread of an Intel Xeon core for M4, M3, C4, C3, R3, HS1, G2, I2, and D2.”
Linux Configuration
All servers (broker and client instances) were deployed with the standard AWS Linux AMI (Amazon
Linux AMI 2015.03.1 x86_64 HVM). The following files were modified to allow the clients and
HiveMQ to open more sockets and to increase the open file limit:
• /etc/sysctl.conf
• /etc/security/limits.conf
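Whether a raised open file limit is actually in effect for a process can be checked at runtime. The following Python sketch is not part of the original benchmark setup; it uses the standard `resource` module (available on Linux), and the figure of 10.000 sockets and the `reserve` of 64 descriptors are illustrative assumptions.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def can_open(sockets, soft_limit, reserve=64):
    """True if `sockets` connections fit under the soft limit.

    `reserve` is an assumed number of descriptors kept free for
    stdin/stdout, log files and similar.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return sockets + reserve <= soft_limit


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft={soft} hard={hard} enough_for_10000={can_open(10_000, soft)}")
```

Each MQTT connection occupies one socket, and therefore one file descriptor, on both the client and the broker side, which is why the limit has to be raised on all instances.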
Java Config
The default Garbage Collector was used and no Garbage Collection Tuning was made.
HiveMQ Configuration
There is no magic HiveMQ configuration used in these tests which could falsify the results. All
benchmarks are executed with a vanilla HiveMQ 3.0.0 without any further configuration. The
following default plugins were deleted from the plugins folder of HiveMQ:
• JMX Plugin
• SYS Topic Plugin
In order to increase the memory allocated by the JVM, the run.sh script of HiveMQ was modified
to use 66% of the memory available on the server. With this change HiveMQ starts with 2GB of
allocated memory and can scale up to 10GB of RAM.
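The arithmetic behind these heap sizes can be sketched as follows. This is not the line from run.sh, which is not reproduced in this document. The sketch assumes a server with 15 GiB of RAM (the published size of a c4.2xlarge instance, not stated in the text above) and uses the standard JVM flags `-Xms` (initial heap) and `-Xmx` (maximum heap); the function name `heap_flags` is hypothetical.

```python
def heap_flags(total_ram_gib, fraction=0.66, initial_gib=2):
    """Build JVM heap flags: fixed initial heap, maximum heap = fraction of RAM.

    The maximum is rounded to whole gigabytes. All parameter values are
    assumptions chosen to match the numbers quoted in the text.
    """
    max_gib = round(total_ram_gib * fraction)
    if max_gib < initial_gib:
        raise ValueError("maximum heap would be smaller than the initial heap")
    return f"-Xms{initial_gib}g -Xmx{max_gib}g"


print(heap_flags(15))  # -Xms2g -Xmx10g
```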
Benchmark Tool
The HiveMQ Team developed a Netty-based MQTT Benchmark Tool which is able to scale up to
many thousands of MQTT clients at once due to its non-blocking and event-driven implementation.
The tool is Java based and uses NIO to implement the non-blocking behaviour. For metric
capturing the tool uses the Dropwizard Metrics library with a UniformReservoir, so all metrics
captured over the runtime of the benchmark tool are weighted equally.
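To illustrate what "weighted equally" means, here is a minimal Python sketch of a uniform reservoir based on Vitter's Algorithm R, the sampling technique the Dropwizard Metrics `UniformReservoir` is documented to use. It is an illustration, not the library's code: the reservoir size of 1028 and the nearest-rank quantile are simplifying assumptions.

```python
import math
import random


class UniformReservoir:
    """Fixed-size uniform sample of a stream (Vitter's Algorithm R)."""

    def __init__(self, size=1028, rng=None):
        self.size = size
        self.count = 0      # total number of values seen so far
        self.values = []    # current sample, at most `size` entries
        self.rng = rng or random.Random()

    def update(self, value):
        self.count += 1
        if len(self.values) < self.size:
            self.values.append(value)  # fill phase: keep every value
        else:
            # Keep the new value with probability size / count by
            # overwriting a randomly chosen slot.
            slot = self.rng.randrange(self.count)
            if slot < self.size:
                self.values[slot] = value

    def quantile(self, q):
        """Nearest-rank quantile (0 < q <= 1) of the current sample."""
        if not self.values:
            raise ValueError("reservoir is empty")
        ordered = sorted(self.values)
        rank = max(1, math.ceil(q * len(ordered)))
        return ordered[rank - 1]
```

Because every measurement has the same chance of being in the sample regardless of when it was recorded, a latency spike early in a run influences the reported quantiles as much as one at the end.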
The benchmark tool is designed to avoid Java Garbage Collection as much as possible so that
benchmark results are not falsified by excessive Garbage Collection on the client side. The
benchmark scenarios ran for 30 to 45 minutes, so Garbage Collection wasn’t avoided totally on the
client side, but the results are significant enough to show the real-world behaviour of HiveMQ
without too much additional latency introduced by the clients.
If you want to perform your own benchmarks with HiveMQ, we can provide the benchmark tool
for free so you can validate the results on your own hardware. All you need to do is contact
contact@hivemq.com and answer some general and benchmark-specific questions about your
MQTT test case.
Latency Test
Latency is key for IoT systems at high scale where responsiveness and the real time experience
are key acceptance factors of end users or downstream systems. The following benchmark shows
how HiveMQ performs in an end-to-end scenario with real network roundtrip for latencies. The
benchmark consists of different tests with an increasing number of clients and messages per
second. Besides key metrics like average roundtrip latencies, the benchmark also shows relevant
quantiles which help to understand the distribution of the results and outlier data samples.
The latency benchmark is designed to measure the real network roundtrip time of an MQTT
message, which means that network latency fluctuation is also included in these real end-to-end
measurements.
Benchmark Setup
The benchmark uses the HiveMQ Benchmark Tool with the roundtrip scenario. The HiveMQ
Benchmark Tool is used to create a massive amount of truly non-blocking MQTT clients. Each
client acts as publisher and subscriber. Each client publishes 1 message / second to a unique
topic the same (and only the same!) client also subscribes to. This means number of clients =
messages/second. The clients start a timer when publishing a message and stop the timer when
the published message is received again by the same client. Each result is then written to the
Metric store which calculates the statistics.
Topic ${clientId}/1
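The timing logic described above can be sketched in Python as follows. This is an illustration, not the code of the HiveMQ Benchmark Tool: the class and method names are hypothetical, and a real client would call `on_publish` and `on_receive` from its MQTT callbacks.

```python
import time


class RoundtripTimer:
    """Tracks publish timestamps and computes the roundtrip latency on receipt."""

    def __init__(self, clock=time.perf_counter_ns):
        self._clock = clock      # monotonic clock returning nanoseconds
        self._pending = {}       # message key -> publish timestamp (ns)
        self.latencies_ms = []   # completed roundtrips in milliseconds

    def on_publish(self, key):
        """Start the timer for the message identified by `key`."""
        self._pending[key] = self._clock()

    def on_receive(self, key):
        """Stop the timer; returns the latency in ms, or None if unknown."""
        started = self._pending.pop(key, None)
        if started is None:
            return None  # unknown or duplicate message: ignore
        latency_ms = (self._clock() - started) / 1_000_000
        self.latencies_ms.append(latency_ms)
        return latency_ms


def topic_for(client_id):
    """Unique per-client topic, following the ${clientId}/1 pattern."""
    return f"{client_id}/1"
```

A monotonic clock is used so that wall-clock adjustments on the client machine cannot produce negative or inflated latencies.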
To increase throughput, the clients neither use disk persistence for QoS > 0 nor implement an
in-flight window. All clients are started with the -Xmx6g and -Xms6g JVM parameters in
order to reduce Garbage Collection (which would introduce latency on the client). GC can’t be
avoided completely, so the clients occasionally add latency in the millisecond range to the test
results.
Each EC2 instance starts 10.000 clients. So if the test e.g. uses 30.000 clients, 3 EC2 instances
are used for the clients.
Metric Description
If more than one EC2 instance is used, an average of the results of all instances is calculated:
the individual results of each EC2 instance are summed up and divided by the number of
instances. That means the following formula (Arithmetic Mean) is used for each metric:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
where x_i is the result reported by EC2 instance i and n is the number of instances.
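The averaging step can be sketched in Python as follows. The function name and the dictionary layout are assumptions made for illustration, and the numbers in the usage example are made up, not benchmark results.

```python
def average_across_instances(per_instance):
    """Arithmetic mean of each metric over all client instances.

    `per_instance` is a list of dicts, one per EC2 instance, mapping a
    metric name to the value reported by that instance.
    """
    if not per_instance:
        raise ValueError("at least one instance result is required")
    n = len(per_instance)
    return {
        name: sum(result[name] for result in per_instance) / n
        for name in per_instance[0]
    }


# Made-up example values for two client instances (milliseconds).
print(average_across_instances([
    {"mean": 0.5, "p99": 2.0},
    {"mean": 1.5, "p99": 4.0},
]))
```

Note that the mean of per-instance percentiles is generally not identical to the percentile of the pooled samples; the procedure described here uses the former.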
QoS 0 Results
This benchmark tests the end-to-end latency of MQTT messages with QoS 0 guarantees. This
means, HiveMQ does not use disk persistence due to the fire-and-forget semantics of QoS 0.
No messages were lost in this test since the TCP connection was stable all the time.
Results
[Chart: roundtrip latency (y-axis 0 ms to 9,9 ms) over number of clients / messages per second
(x-axis 10.000 to 50.000).]
This chart shows all measurements, which include Mean, 75th percentile, 95th percentile, 98th
percentile, 99th percentile, Median and the standard deviation.
Discussion
These test results show that throughput and latency were stable for the whole measurement
time (45 minutes) of every individual test. With an increasing number of clients and
messages / second the latency did not increase significantly. Average roundtrip time was always
sub-millisecond, even with linearly increasing throughput and number of subscriptions HiveMQ
maintains.
This benchmark demonstrated that HiveMQ delivers very high QoS 0 message throughput
(50.000 messages per second) with sub-millisecond latency.
QoS 1 Results
This benchmark tests the end-to-end latency of MQTT messages with QoS 1 guarantees. This
means, HiveMQ uses disk persistence for every outgoing MQTT message due to the at-least-
once semantics of QoS 1. No messages were lost in this test since the TCP connection was stable
all the time and the QoS 1 guarantees were in place.
Results
[Chart: roundtrip latency (y-axis 0 ms to 20 ms) over number of clients / messages per second
(x-axis 2.500 to 17.500).]
This chart shows all measurements, which include Mean, 75th percentile, 95th percentile, 98th
percentile, 99th percentile, Median and the standard deviation.
Discussion
This test shows that throughput and latency were stable for the whole measurement time
(45 minutes) of every individual test. With an increasing number of clients and messages per
second the latency did not increase significantly. Average roundtrip time was always in the low
single-digit milliseconds. Even with linearly increasing throughput and number of subscriptions
all measured latencies remained very low.
Every single message was persisted to disk before delivery, so the additional latency compared to
QoS 0 messages is a result of the additional disk I/O overhead.
This benchmark demonstrated that HiveMQ delivers very high QoS 1 message throughput
(> 15.000 messages per second) with a single-digit millisecond latency on average, while
complying with the QoS 1 at-least-once guarantees.
If the QoS 1 guarantees were weakened by using an in-memory persistence, which HiveMQ
also supports, results close to the QoS 0 results can be expected.
QoS 2 Results
This benchmark tests the end-to-end latency of MQTT messages with QoS 2 guarantees. This
means HiveMQ uses disk persistence for every outgoing MQTT message and message
acknowledgement due to the exactly-once semantics of QoS 2. No messages were lost in this
test since the TCP connection was stable all the time and the QoS 2 guarantees were in place.
Results
[Chart: roundtrip latency (y-axis 0 ms to 20 ms) over number of clients / messages per second
(x-axis 2.000 to 12.000).]
This chart shows all measurements, which include Mean, 75th percentile, 95th percentile, 98th
percentile, 99th percentile, Median and the standard deviation.
Discussion
This test shows that throughput and latency were stable for the whole measurement time
(45 minutes) of every individual test and that with an increasing number of clients and
messages / second the latency did not increase significantly. Average roundtrip time was always in
the low single-digit milliseconds. Even with linearly increasing throughput and number of
subscriptions the measured latencies remained very low.
Every single message and message acknowledgement (PUBREL) was persisted to disk before
delivery, so the additional latency compared to QoS 0 messages is a result of the additional
disk I/O overhead.
QoS 2 uses a four-way message flow, so network latency plays a bigger role here than in the
other latency benchmarks. The number of disk persistence operations is also doubled in this test
compared to QoS 1 messages.
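The message flows per QoS level, as defined by MQTT 3.1.1, can be written down as a small Python table. The helper `packets_per_roundtrip` is a hypothetical name; it assumes the same QoS level on both hops (publisher to broker and broker to subscriber), as in this latency benchmark.

```python
# Control packets exchanged for one PUBLISH on a single hop
# (sender -> receiver), as defined by MQTT 3.1.1.
QOS_FLOW = {
    0: ["PUBLISH"],
    1: ["PUBLISH", "PUBACK"],
    2: ["PUBLISH", "PUBREC", "PUBREL", "PUBCOMP"],
}


def packets_per_roundtrip(qos):
    """MQTT packets for publisher -> broker -> subscriber at the same QoS."""
    return 2 * len(QOS_FLOW[qos])


for level in (0, 1, 2):
    print(level, packets_per_roundtrip(level))
```

A QoS 2 roundtrip therefore involves four times as many MQTT packets as a QoS 0 roundtrip, before any TCP ACK segments are counted.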
This benchmark demonstrated that HiveMQ delivers very high QoS 2 message throughput with
a single-digit millisecond latency on average, while complying with the QoS 2 exactly-once
guarantees.
If the QoS 2 guarantees are weakened by using an in-memory persistence, which HiveMQ also
supports, results close to the QoS 0 results can be expected. The additional network latency
which results from the four-way communication for QoS 2 guarantees needs to be considered,
though.
Telemetry Test
MQTT brokers are often deployed in environments where it's key to collect data from a huge
number of devices while only a few subscribers process the data published by the devices. A
typical use case is a telemetry scenario where the MQTT broker needs to process a very high
incoming MQTT message rate.
The following benchmarks focus on the throughput of HiveMQ in such a scenario. In order to
understand the runtime behaviour of HiveMQ in a telemetry scenario, all relevant runtime
statistics (CPU usage, RAM usage and bandwidth usage) are measured. So this benchmark is
focused on the resource consumption of HiveMQ while delivering constant message throughput.
NOTE: A more sophisticated way to process data in such a scenario is by using the HiveMQ plugin
system instead of using MQTT subscribers for processing data. Typical MQTT subscriber
applications get overwhelmed quickly (even with a few thousand messages/second) while HiveMQ
plugins allow constant data delivery to your backend systems, independent of the load on specific
"hot MQTT topics".
Benchmark Setup
The benchmark uses different instances of the HiveMQ Benchmark Tool with either the publish
scenario or subscribe scenario enabled. The HiveMQ Benchmark Tool is used to create a massive
amount of truly non-blocking MQTT clients. Each client instance either acts as publisher or
subscriber. Every single publishing client publishes 1 message / second to a specific topic.
Each EC2 instance hosting MQTT clients publishes to a number of topics such that one topic
is used per 10.000 clients. This also means number of publishing clients = messages/second.
One subscribing client is used for each topic, so every subscriber deals with 10.000 messages /
second, which is a rate a subscribing client can handle without getting overwhelmed.
All subscribers subscribe with QoS 0. This ensures that the subscribing clients are not
overwhelmed by the extreme number of messages and don’t need to acknowledge each MQTT
packet. The other main reason for this decision is that, in order to meet the QoS 1 and 2
guarantees, HiveMQ needs to maintain an in-flight window per topic to satisfy the Ordered
Topic guarantees of the MQTT specification. If the subscriber is not able to
acknowledge all incoming messages at once (which is impossible), messages are queued on the
broker, which would mean the test would not be executable and would at best measure other
behaviours (like load shedding).
Every single benchmark was executed for 30 minutes to show that the results are stable when
running under constant load for a long period of time. CPU, average bandwidth usage per minute
and RAM were at a constant level for the whole time in every single benchmark.
1 Subscriber Instance (which subscribes to all topics) and 2 publishing EC2 instances were
used in this test.
Example:
In a scenario where an EC2 instance spawns 20.000 clients and 20.000 messages/second are
published by the instance, the following topics are used for the specific test:
• topic/1
• topic/2
This means all MQTT messages are distributed across 2 topics in this example.
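The topic distribution from this example can be sketched in Python as follows. Only the rule "one topic per 10.000 publishing clients" and the topic names come from the text; how the benchmark tool assigns an individual client to a topic is not described, so the index-based mapping below is an assumption.

```python
import math

CLIENTS_PER_TOPIC = 10_000  # from the setup: one topic per 10.000 publishers


def topic_for_publisher(index):
    """Topic for the publisher with zero-based `index` (assumed mapping)."""
    if index < 0:
        raise ValueError("index must be non-negative")
    return f"topic/{index // CLIENTS_PER_TOPIC + 1}"


def topics_in_use(publishers):
    """All topics needed for the given number of publishing clients."""
    count = math.ceil(publishers / CLIENTS_PER_TOPIC)
    return [f"topic/{n}" for n in range(1, count + 1)]


print(topics_in_use(20_000))  # ['topic/1', 'topic/2']
```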
To increase throughput, the clients neither use disk persistence for QoS > 0 nor implement an
in-flight window. All clients are started with the -Xmx6g and -Xms6g JVM parameters in
order to reduce Garbage Collection (which would introduce latency on the client). GC can’t be
avoided completely, so the clients occasionally add latency in the millisecond range to the test
results.
For the measurements of CPU, RAM and bandwidth, AWS CloudWatch was used in these tests.
CloudWatch doesn’t report RAM metrics by default, so the AWS CloudWatch Monitoring Scripts
(provided by Amazon) were used and the reports took place once a minute with cron.
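As an illustration of how such a RAM metric can be derived on Linux, the following Python sketch computes the memory utilization from the contents of /proc/meminfo. It is not the Amazon monitoring script; treating buffers and page cache as available memory is an assumption about how "used" memory is defined.

```python
def memory_used_percent(meminfo_text):
    """Percentage of RAM in use, computed from the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # values are in kB
    total = fields["MemTotal"]
    # Assumption: buffers and page cache are reclaimable, so they are
    # counted as available rather than used.
    available = (
        fields["MemFree"] + fields.get("Buffers", 0) + fields.get("Cached", 0)
    )
    return 100.0 * (total - available) / total
```

On a Linux host the function can be fed with `open("/proc/meminfo").read()`; a cron job running once a minute would then push the resulting value to the monitoring backend.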
QoS 0 Results
This benchmark tests the resource consumption of HiveMQ with incoming QoS 0 messages.
The following measurements were executed during the test executions:
No messages were lost in this test since the TCP connection was stable all the time.
Results
[Chart: CPU % over incoming QoS 0 messages / second (x-axis 10.000 to 70.000). Data labels
in ascending order: 21,74%, 48,79%, 64,4%, 78,45%, 85,88%.]
[Chart: RAM (MB) over incoming QoS 0 messages / second (x-axis 10.000 to 70.000). Data labels
in ascending order: 1.415 MB, 1.537 MB, 1.963 MB, 2.214 MB, 2.445 MB.]
[Chart: incoming and outgoing bandwidth (MB / min, y-axis 0 MB to 880 MB) over incoming QoS 0
messages / second (x-axis 10.000 to 70.000).]
Discussion
Increasing the total number of QoS 0 messages per second linearly results in a linear
bandwidth increase while CPU and RAM usage grow at a predictable level.
A notable observation is that, while the bandwidth usage increases linearly with the number of
messages per second, the CPU and RAM usage do not increase linearly. HiveMQ delivers
constant and predictable results until the CPU limits of the EC2 instance are reached. RAM is
negligible in this test since 3GB of RAM usage was never exceeded although the machine was
configured to reserve up to 10GB of RAM for HiveMQ. The limiting factor in this test is clearly CPU
and even higher throughput can be expected for machines with more computing power. The
multithreaded nature of HiveMQ allows it to scale with the number of CPUs.
Another observation is that the CPU usage seems counter-intuitive at first sight, since HiveMQ
starts with comparatively high CPU usage and then increases the CPU usage at a decreasing
rate. This behaviour is something we see a lot with EC2, while the CPU usage of HiveMQ on
physical hardware tends to increase more steadily but starts at a lower utilization.
QoS 1 Results
This benchmark tests the resource consumption of HiveMQ with incoming QoS 1 messages.
As discussed in the Benchmark Setup chapter, the subscribing clients subscribe with QoS 0. The
following measurements were executed during the test executions:
No messages were lost in this test since the TCP connection was stable all the time.
Results
[Chart: CPU % over incoming QoS 1 messages / second (x-axis 10.000 to 60.000). Data labels
in ascending order: 23,83%, 48,53%, 64,92%, 80,4%, 89,32%.]
[Chart: RAM (MB) over incoming QoS 1 messages / second (x-axis 10.000 to 60.000). Data labels
in ascending order: 1.414,68 MB, 1.823,64 MB, 2.286,87 MB.]
[Chart: incoming and outgoing bandwidth (MB / min, y-axis 0 MB to 1.000 MB) over number of
clients / messages per second (x-axis 10.000 to 60.000).]
Discussion
Increasing the total number of QoS 1 messages per second linearly results in a linear
bandwidth increase while CPU and RAM usage grow at a predictable level.
A notable observation is that, while the bandwidth usage increases linearly with the number of
messages / second, the CPU and RAM usage do not increase linearly. HiveMQ delivers constant
and predictable results until the CPU limits of the EC2 instance are reached. RAM is negligible in
this test since 3GB of RAM usage was never exceeded although the machine was configured to
reserve up to 10GB of RAM for HiveMQ. The limiting factor in this test is clearly CPU and even
higher throughput can be expected for machines with more computing power. The multithreaded
nature of HiveMQ allows it to scale with the number of CPUs.
Another observation is that the CPU usage seems counter-intuitive at first sight, since HiveMQ
starts with comparatively high CPU usage and then increases the CPU usage at a decreasing
rate. This behaviour is something we see a lot with EC2, while the CPU usage of HiveMQ on
physical hardware tends to increase more steadily but starts at a lower utilization.
The bandwidth measurements show that TCP overhead plays a bigger role than with QoS 0 in
this benchmark since the payload in this benchmark is relatively small (128 bytes). The fact that
the incoming QoS 1 packets are bigger (they include a message identifier) than the outgoing
QoS 0 messages may also be worth considering.
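The size difference between QoS 0 and QoS 1 PUBLISH packets follows from the MQTT 3.1.1 packet layout and can be computed with a short Python sketch. The topic name `topic/1` is taken from the setup example; the function names are hypothetical, and TCP/IP headers and ACK segments are deliberately not included.

```python
def remaining_length_bytes(n):
    """Number of bytes MQTT uses to encode a remaining length of n."""
    if not 0 <= n <= 268_435_455:
        raise ValueError("remaining length out of range")
    size = 1
    while n > 127:       # each encoded byte carries 7 bits of the length
        n //= 128
        size += 1
    return size


def publish_packet_size(topic, payload_len, qos):
    """Size in bytes of an MQTT 3.1.1 PUBLISH packet."""
    topic_bytes = len(topic.encode("utf-8"))
    # 2-byte topic length prefix, topic, and a 2-byte packet
    # identifier that is only present for QoS 1 and QoS 2.
    variable_header = 2 + topic_bytes + (2 if qos > 0 else 0)
    remaining = variable_header + payload_len
    return 1 + remaining_length_bytes(remaining) + remaining


print(publish_packet_size("topic/1", 128, qos=0))  # 140
print(publish_packet_size("topic/1", 128, qos=1))  # 142
```

With a 128-byte payload the MQTT framing itself is small, so a sizeable share of the measured bandwidth is attributable to TCP/IP headers and acknowledgements rather than to MQTT.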
QoS 2 Results
This benchmark tests the resource consumption of HiveMQ with incoming QoS 2 messages in a
telemetry scenario. As discussed in the Benchmark Setup chapter, the subscribing clients use QoS
0. The following measurements were executed during the test executions:
No messages were lost in this test since the TCP connection was stable all the time.
Results
[Chart: CPU % over incoming QoS 2 messages / second (x-axis 10.000 to 40.000). Data labels
in ascending order: 37,84%, 66,82%, 85,77%.]
[Chart "QoS 2 Telemetry RAM Usage": RAM (MB, y-axis 0 MB to 2.800 MB) over incoming QoS 2
messages / second (x-axis 10.000 to 40.000). Data labels in ascending order: 1.422,08 MB,
1.892,91 MB, 2.734,95 MB, 2.779,15 MB.]
[Chart "QoS 2 Bandwidth Usage": incoming and outgoing bandwidth (MB / min, y-axis 0 MB to
1.000 MB) over incoming QoS 2 messages / second (x-axis 10.000 to 40.000).]
Discussion
Increasing the total number of QoS 2 messages per second linearly results in a linear
bandwidth increase while CPU and RAM usage grow at a predictable level.
A notable observation is that, while the bandwidth usage increases linearly with the number of
messages / second, the CPU and RAM usage do not increase linearly. RAM utilization even
stabilizes at ~3GB. HiveMQ delivers constant and predictable results until the CPU limits of the
EC2 instance are reached. RAM is negligible in this test since 3GB of RAM usage was never
exceeded although the machine was configured to reserve up to 10GB of RAM for HiveMQ. The
limiting factor in this test is clearly CPU and even higher throughput can be expected for
machines with more computing power. The multithreaded nature of HiveMQ allows it to scale with
the number of CPUs.
The bandwidth measurements show that the QoS 2 overhead plays a bigger role than with QoS
0 and QoS 1 in this benchmark. The payload for MQTT PUBLISH messages is relatively small
(128 bytes) while the total bandwidth usage is quite high. The results show that, although the
overhead of QoS 2 is very high, HiveMQ scales predictably with QoS 2 and delivers
high message throughput.
Fan-Out Test
Due to its Publish / Subscribe nature, MQTT is often used for broadcasting systems where a single
message needs to be delivered to multiple subscribing clients. This can result in very high
message amplification, which could potentially drain resources on the broker quickly. Fan-Out tests
are sometimes considered as the supreme discipline in messaging due to the high amplification
rate fan-out deliveries cause.
The following benchmarks are designed to show the behaviour of HiveMQ in scenarios with very
high, up to extreme message amplification and measure the resource consumption of HiveMQ
while dealing with a very high amount of outgoing messages / sec.
Benchmark Setup
The benchmark uses different instances of the HiveMQ Benchmark Tool with either the publish
scenario or subscribe scenario enabled. The HiveMQ Benchmark Tool is used to create a massive
amount of truly non-blocking MQTT clients. Each client instance either acts as publisher or
subscriber. Exactly one publisher is used to constantly publish one message per second to the
broker. The number of outgoing publishes / second for each test is the number of subscribing
clients. So the number of subscribers = number of outgoing messages / second.
A variable number of EC2 instances hosting the subscribing clients is used for each test, since the
subscribing clients turned out to be the bottleneck in this benchmark. So tests with lower outgoing
message rates use a smaller number of EC2 instances with subscribing clients.
All subscribers use the QoS level defined in the test and the publisher also publishes with the
same QoS level. Although HiveMQ needs to maintain an in-flight window per topic in order to meet
the Ordered Topic guarantees of the MQTT specification, the in-flight message queue is not
expected to grow in this test.
Every single benchmark was executed for 30 minutes to show that the results are stable when
running under constant load for a long period of time. CPU, average bandwidth usage per minute
and RAM were at a constant level for the whole time in every single benchmark.
QoS 0 Results
This benchmark tests the resource consumption of HiveMQ with a huge amount of outgoing
QoS 0 messages. The following measurements were captured during the test executions:
No messages were lost in this test since the TCP connection was stable all the time.
Results
[Chart: CPU % (y-axis 0% to 64%) over outgoing QoS 0 messages / second (x-axis 25.000 to
150.000). Data labels in ascending order: 13,07%, 25,03%, 36,74%, 52,04%.]
[Chart: RAM (MB) over outgoing QoS 0 messages / second (x-axis 25.000 to 150.000). Data labels
in ascending order: 2.009,06 MB, 2.822,13 MB, 3.336,92 MB, 7.919,43 MB, 8.007 MB.]
[Chart: incoming and outgoing bandwidth (MB / min, y-axis 0 MB to 1.925 MB) over outgoing
QoS 0 messages / second (x-axis 25.000 to 150.000).]
Discussion
The results show that with a higher amplification rate (from 25.000x up to 150.000x), the total
throughput increases as expected. TCP overhead plays a big role in these tests. The incoming
message rate stays the same (1 publish / second) while the overall traffic increases drastically,
which can be explained by the large number of (required) TCP ACK segments sent by the
subscriber side.
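The amplification arithmetic can be made concrete with a short Python sketch that estimates the outgoing traffic at the MQTT layer only. The packet size of 140 bytes is a hypothetical value (the payload and topic used in the fan-out test are not stated here), and the result is a lower bound because TCP/IP headers and ACK segments are not included.

```python
def outgoing_mqtt_mb_per_minute(subscribers, packet_bytes, publishes_per_second=1):
    """MQTT-level outgoing traffic in MB per minute for a fan-out scenario.

    Every incoming publish is delivered once to each subscriber, so the
    outgoing message rate is subscribers * publishes_per_second.
    """
    bytes_per_minute = subscribers * publishes_per_second * packet_bytes * 60
    return bytes_per_minute / 1_000_000


# Hypothetical 140-byte PUBLISH packet delivered to 100.000 subscribers.
print(outgoing_mqtt_mb_per_minute(100_000, 140))  # 840.0
```

Comparing such an estimate with the measured bandwidth gives a rough idea of how much of the traffic is protocol overhead below the MQTT layer.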
The overall throughput starts to stagnate at ~125.000 messages/second, which is most likely
caused by the MQTT subscriber test setup or by bandwidth limitations of EC2. The CPU utilization
of the broker EC2 instance was always below 75% and the available RAM was not exhausted
while network usage was very high on the client and broker side.
A notable observation is that the RAM usage increases significantly between 75.000 and 100.000
messages / second, which correlates with the outgoing bandwidth increase. After the initial drastic
increase, the RAM usage stays stable at a high level. The RAM limits weren’t exceeded in these
benchmarks.
HiveMQ was able to deliver an extreme number of fan-out messages without hitting any
CPU or RAM limits in the 30 minute long individual tests.
QoS 1 Results
This benchmark tests the resource consumption of HiveMQ with a huge amount of outgoing
QoS 1 messages. In order to meet the QoS 1 guarantees of the MQTT specification, HiveMQ uses
disk persistence to save every outgoing message to disk before delivering. HiveMQ also
maintains a separate in-flight window for each individual subscriber (and topic) in order to meet the
Ordered Topic guarantee that is required for all QoS 1 and 2 messages.
The following measurements were captured during the test executions:
No messages were lost in this test since the TCP connection was stable all the time.
Results
[Chart: incoming and outgoing bandwidth (MB / min, y-axis 0 MB to 225 MB) over outgoing QoS 1
messages / second (x-axis 2.000 to 16.000).]
[Chart: CPU % (y-axis 0% to 56%) over outgoing QoS 1 messages / second (x-axis 2.000 to
16.000). Data labels in ascending order: 4,61%, 11,141%, 18,1%, 25,757%, 32,736%, 41,217%,
47,32%.]
[Chart: RAM (MB) over outgoing QoS 1 messages / second (x-axis 2.000 to 16.000). Data labels
in ascending order: 2.678,93 MB, 2.709,95 MB, 3.434,086 MB, 4.704,383 MB, 5.699,695 MB,
5.983,379 MB, 5.998,88 MB.]
Discussion
QoS 1 message fan-out rates are significantly lower than with QoS 0, which is the result of the
additional disk persistence used for outgoing QoS 1 messages. Although disk I/O is one of the
limiting factors for this benchmark, the throughput, CPU and RAM grow predictably with the
message rate. At a rate of 15.000 outgoing QoS 1 messages / second, a more significant increase
in RAM and CPU can be observed. Higher QoS 1 message rates are expected to come at a higher
cost in terms of CPU and RAM in this concrete scenario.
TCP overhead plays a big role in these tests for bandwidth usage. The incoming message rate
stays the same (1 publish / second) while the overall traffic increases drastically, which can be
explained by the large number of (required) TCP ACK segments sent by the subscriber side.
A notable observation is that the RAM usage hits a sweet spot from ~10.000 messages / second
up to 14.000 messages / second while the throughput increases linearly. After the initial drastic
increase, the RAM usage stays stable at a high level.
HiveMQ was able to deliver more than 15.000 QoS 1 messages per second while using disk
persistence and without hitting any CPU or RAM limits in the 30 minute long individual
tests.
QoS 2 Results
This benchmark tests the resource consumption of HiveMQ with a large number of outgoing QoS
2 messages. In order to meet the QoS 2 guarantees of the MQTT specification, HiveMQ uses disk
persistence to save every outgoing message and acknowledgement to disk before delivering.
HiveMQ also maintains a separate in-flight window for each individual subscriber (and topic) in
order to meet the Ordered Topic guarantee that is required for all QoS 1 and 2 messages.
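The QoS 2 delivery towards a single subscriber follows the four-packet flow defined by the MQTT specification (PUBLISH, PUBREC, PUBREL, PUBCOMP). The sketch below illustrates the idea of persisting state before each send and of keeping in-flight messages in publish order. It is not HiveMQ's implementation; the class name, the `send` and `persist` callbacks and the state names are hypothetical.

```python
from enum import Enum


class State(Enum):
    AWAIT_PUBREC = "await_pubrec"    # PUBLISH sent, waiting for PUBREC
    AWAIT_PUBCOMP = "await_pubcomp"  # PUBREL sent, waiting for PUBCOMP


class OutboundQos2Flow:
    """Hypothetical per-subscriber QoS 2 sender; illustration only."""

    def __init__(self, send, persist):
        self.send = send        # assumed callback: send(packet_type, packet_id)
        self.persist = persist  # assumed callback: persist(packet_id, state)
        self.in_flight = {}     # packet_id -> State, kept in publish order

    def publish(self, packet_id):
        # Persist first, then send, so the message survives a crash.
        self.persist(packet_id, State.AWAIT_PUBREC)
        self.in_flight[packet_id] = State.AWAIT_PUBREC
        self.send("PUBLISH", packet_id)

    def on_pubrec(self, packet_id):
        if self.in_flight.get(packet_id) is not State.AWAIT_PUBREC:
            return  # this sketch ignores unexpected or repeated PUBRECs
        self.persist(packet_id, State.AWAIT_PUBCOMP)
        self.in_flight[packet_id] = State.AWAIT_PUBCOMP
        self.send("PUBREL", packet_id)

    def on_pubcomp(self, packet_id):
        if self.in_flight.get(packet_id) is not State.AWAIT_PUBCOMP:
            return  # this sketch ignores unexpected PUBCOMPs
        self.persist(packet_id, None)  # None marks the flow as finished
        del self.in_flight[packet_id]

    def pending(self):
        """Packet ids still in flight, in the order they were published."""
        return list(self.in_flight)
```

Each outgoing message causes at least two writes through `persist` before the flow completes, which is one way to picture why QoS 2 rates are bounded by disk I/O.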
The following measurements were captured during the test executions:
No messages were lost in this test since the TCP connection was stable all the time.
[Figure: QoS 2 Fan-Out Bandwidth Usage. Incoming and outgoing bandwidth (MB / min) by outgoing QoS 2 messages / second, 2.000 to 12.000; y-axis from 0 MB to 400 MB]
[Figure: QoS 2 Fan-Out CPU Utilization. CPU % by outgoing QoS 2 messages / second, 2.000 to 12.000; data labels from lowest to highest: 7,78 %, 18,425 %, 29,3 %, 42,26 %, 53,72 %, 63,91 %]
Results
[Figure: RAM usage (MB) by outgoing QoS 2 messages / second, 2.000 to 12.000; data labels from lowest to highest: 2.701,77 MB, 3.320,63 MB, 4.615,72 MB, 6.072,672 MB, 6.544,582 MB]
Discussion
QoS 2 message fan-out rates are significantly lower than with QoS 0, which is the result of the
additional disk persistence used for outgoing QoS 2 messages and their PUBREL
acknowledgement. Although disk I/O is one of the limiting factors for this benchmark, the
throughput, CPU and RAM grew predictably with the message rate. Significantly higher QoS 2
message rates are expected to come at a higher cost in terms of CPU and RAM in this particular
scenario.
Besides the four-packet MQTT message flow for QoS 2 (PUBLISH, PUBREC, PUBREL, PUBCOMP), TCP
overhead contributes to the bandwidth usage in these tests. The incoming message rate stayed the
same (1 publish / second) while the overall traffic increased drastically, which can be explained
by the large number of (required) TCP ACK segments sent at the subscriber side.
A notable observation is that the CPU usage grows predictably and linearly with the throughput.
HiveMQ was able to deliver more than 12.000 QoS 2 messages per second while using disk
persistence and without hitting any CPU or RAM limits in the individual 30-minute tests.
Conclusion
This benchmark document focused on three completely different test scenarios which are heavily
inspired by real-world MQTT broker uses. These tests gave insight into how HiveMQ behaves
even under very high load. The following long-running (30-45 min) tests were executed:
• Latency tests
• Telemetry tests
• Fan-Out tests
All tests were executed with QoS 0, QoS 1 and QoS 2 and the findings were discussed in the
individual chapters.
The Latency tests showed that HiveMQ delivers MQTT messages with very low, mostly
sub-millisecond latencies even at message rates up to 50.000 MQTT messages / second. Besides
average roundtrip times, different quantiles and the standard deviation were also part of the
measurement to show the real-world behaviour of HiveMQ, including outlying data samples.
The Telemetry tests showed that HiveMQ handles more than 60.000 messages / second with
minimal resource consumption (RAM < 3 GB) and with linearly increasing throughput up to
15 MB/s each for incoming and outgoing traffic.
The Fan-Out tests discussed the HiveMQ performance characteristics under a single-publisher,
multi-subscriber benchmark. HiveMQ served up to 150.000 messages / second at 33 MB/s to
subscribers with medium CPU utilization and RAM usage below 9 GB. Due to disk persistence,
HiveMQ met all QoS 1 and 2 guarantees in the fan-out tests at the cost of a lower message rate
per second.
This benchmark document showed that HiveMQ runs stably and with predictable
performance in all major MQTT areas of operation with very high throughput and very low,
mostly sub-millisecond, latency. HiveMQ is suitable for mission-critical deployments in
enterprise environments at dedicated data centers and in the cloud.
Appendix A: /etc/sysctl.conf
The following /etc/sysctl.conf configuration was used on both the HiveMQ server and the
MQTT client instances:
Appendix B: /etc/security/limits.conf
The following /etc/security/limits.conf configuration was used on both the HiveMQ server and
the MQTT client instances: