
TLS Fingerprinting with JA3 and JA3S

John Althouse


Jan 15, 2019 · 10 min read

TL;DR
In this blog post, I’ll go over how to utilize JA3 with JA3S as a method to fingerprint
the TLS negotiation between client and server. This combined fingerprinting can
assist in producing higher fidelity identification of the encrypted communication
between a specific client and its server. For example:

Standard Tor Client:


JA3 = e7d705a3286e19ea42f587b344ee6865  ( Tor Client )


JA3S = a95ca7eab4d47d051a5cd4fb7b6005dc ( Tor Server Response )

The Tor servers always respond to the Tor client in exactly the same way, providing
higher confidence that the traffic is indeed Tor. Further examples:

Trickbot malware:

JA3 = 6734f37431670b3ab4292b8f60f29984  ( Trickbot )


JA3S = 623de93db17d313345d7ea481e7443cf ( C2 Server Response )

Emotet malware:

JA3 = 4d7a28d6f2263ed61de88ca66eb011e3  ( Emotet )


JA3S = 80b3a14bccc8598a1f3bbe83e71f735f  ( C2 Server Response )


In these malware examples, the command and control server always responds to the
malware client in exactly the same way; it does not deviate. So even though the traffic
is encrypted and one may not know the command and control server’s IPs or domains
as they are constantly changing, we can still identify, with reasonable confidence, the
malicious communication by fingerprinting the TLS negotiation between client and
server.

JA3 and JA3S have been open sourced and can be found
here: https://github.com/salesforce/ja3

Some Background on JA3


We open sourced JA3, a method for fingerprinting TLS clients on the wire, in this blog
post in 2017:

Open Sourcing JA3


SSL/TLS Client Fingerprinting for Malware Detection
engineering.salesforce.com

The primary concept for fingerprinting TLS clients came from Lee Brotherston’s 2015
research which can be found here and his DerbyCon talk which is here. If it weren’t
for Lee’s research and open sourcing of it, we would not have started work on JA3. So,
thank you Lee and all those who blog and open source!

To recap: TLS and its predecessor, SSL, are used to encrypt communication for both
common applications, to keep your data secure, and malware, so it can hide in the
noise. To initiate a TLS session, a client will send a TLS Client Hello packet following
the TCP 3-way handshake. This packet and the way in which it is generated depend
on the packages and methods used when building the client application. The
server, if accepting TLS connections, will respond with a TLS Server Hello packet that
is formulated based on server-side libraries and configurations as well as details in the
Client Hello. Because TLS negotiations are transmitted in the clear, it’s possible to
fingerprint and identify client applications using the details in the TLS Client Hello
packet.
This exquisitely drawn network diagram shows the SSL/TLS initial communication pattern.

The JA3 method is used to gather the decimal values of the bytes for the following
fields in the Client Hello packet: Version, Accepted Ciphers, List of Extensions,
Elliptic Curves, and Elliptic Curve Formats. It then concatenates those values together
in order, using a “,” to delimit each field and a “-” to delimit each value in each field.
Example Client Hello packet as viewed in Wireshark
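
To make the field list above concrete, here is a minimal sketch of pulling those five fields out of a Client Hello. It assumes you already hold the raw bytes of the Client Hello handshake body (record and handshake headers stripped) and that the message is well formed. The function name `parse_client_hello` is hypothetical; this is not the code from the JA3 repository, and it does no GREASE filtering or error handling.

```python
import struct


def parse_client_hello(body: bytes):
    """Return (version, ciphers, extensions, curves, point_formats).

    `body` is assumed to be a well-formed ClientHello handshake body.
    All values are returned as plain decimal integers.
    """
    pos = 0
    (version,) = struct.unpack_from("!H", body, pos)
    pos += 2 + 32                      # version (2 bytes) + random (32 bytes)
    pos += 1 + body[pos]               # session ID length byte + session ID
    (cs_len,) = struct.unpack_from("!H", body, pos)
    pos += 2
    ciphers = list(struct.unpack_from(f"!{cs_len // 2}H", body, pos))
    pos += cs_len
    pos += 1 + body[pos]               # compression methods

    extensions, curves, point_formats = [], [], []
    if pos < len(body):                # extensions block is optional
        (ext_total,) = struct.unpack_from("!H", body, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", body, pos)
            pos += 4
            data = body[pos:pos + ext_len]
            pos += ext_len
            extensions.append(ext_type)
            if ext_type == 10 and len(data) >= 2:    # elliptic curves
                count = struct.unpack_from("!H", data, 0)[0] // 2
                curves = list(struct.unpack_from(f"!{count}H", data, 2))
            elif ext_type == 11 and len(data) >= 1:  # EC point formats
                point_formats = list(data[1:1 + data[0]])
    return version, ciphers, extensions, curves, point_formats
```

The extension types are recorded in the order they appear on the wire, since that ordering is part of what makes one client differ from another.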

The field order is as follows:


TLSVersion,Ciphers,Extensions,EllipticCurves,EllipticCurvePointFormats

Example:

769,47-53-5-10-49161-49162-49171-49172-50-56-19-4,0-10-11,23-24-25,0

If there are no TLS Extensions in the Client Hello, the fields are left empty.

Example:

769,4-5-10-9-100-98-3-6-19-18-99,,,

These strings are then MD5 hashed to produce an easily consumable and shareable 32
character fingerprint. This is the JA3 TLS Client Fingerprint.

769,47-53-5-10-49161-49162-49171-49172-50-56-19-4,0-10-11,23-24-25,0 → ada70206e40642a3e4461f35503241d5

769,4-5-10-9-100-98-3-6-19-18-99,,, → de350869b8c85de67a350c8d186f11e6
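
The concatenate-then-hash step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the repository's implementation: the function names `ja3_string` and `ja3_hash` are hypothetical, the inputs are assumed to be lists of decimal integers already extracted from the Client Hello, and the delimiter within a field is the plain ASCII hyphen.

```python
import hashlib


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Join the five fields with ',' and the values in each field with '-'."""
    fields = [
        str(version),
        "-".join(str(v) for v in ciphers),
        "-".join(str(v) for v in extensions),
        "-".join(str(v) for v in curves),
        "-".join(str(v) for v in point_formats),
    ]
    return ",".join(fields)


def ja3_hash(fingerprint: str) -> str:
    """MD5 the fingerprint string into a 32-character hex digest."""
    return hashlib.md5(fingerprint.encode("ascii")).hexdigest()


# No extensions: the last three fields are simply empty.
s = ja3_string(769, [4, 5, 10, 9, 100, 98, 3, 6, 19, 18, 99], [], [], [])
print(s, ja3_hash(s))
```

Empty lists produce empty fields, so a Client Hello without extensions still yields the four commas the format expects.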

We also needed to introduce some code to account for Google’s GREASE (Generate
Random Extensions And Sustain Extensibility) as described here. Google uses this as
a mechanism to prevent extensibility failures in the TLS ecosystem. JA3 ignores these
values completely to ensure that programs utilizing GREASE can still be identified
with a single JA3 hash.
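
As a rough sketch of what ignoring GREASE means in practice: the reserved GREASE values are the 16 two-byte values of the form 0x0A0A, 0x1A1A, and so on up to 0xFAFA, so they can be dropped from each list before the fingerprint string is built. The names `GREASE_VALUES` and `strip_grease` are hypothetical and not taken from the JA3 code.

```python
# The 16 reserved GREASE values: 0x0A0A, 0x1A1A, ..., 0xFAFA.
GREASE_VALUES = frozenset(0x0A0A + 0x1010 * i for i in range(16))


def strip_grease(values):
    """Return the values in their original order with GREASE entries removed."""
    return [v for v in values if v not in GREASE_VALUES]


# A cipher list with GREASE entries at both ends.
print(strip_grease([0x0A0A, 4865, 4866, 0xFAFA]))
```

The same filter would be applied to the cipher, extension, and elliptic curve lists, since a GREASE value can show up in any of them.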

Does JA3 work on TLS 1.3? Yes.


Here we have TLS 1.3 Client Hello packets for two different browsers, each ordering
their ciphers and extensions differently as well as including (or excluding) different
ciphers and extensions. Therefore the JA3 will still be unique per client.

JA3S
After creating JA3 we started playing with using the same method to fingerprint the
server side of the TLS handshake, the TLS Server Hello message. The JA3S method is
to gather the decimal values of the bytes for the following fields in the Server Hello
packet: Version, Accepted Cipher, and List of Extensions. It then concatenates those
values together in order, using a “,” to delimit each field and a “-” to delimit each value
in each field.

The field order is as follows:


TLSVersion,Cipher,Extensions

Example:

769,47,65281-0-11-35-5-16

If there are no TLS Extensions in the Server Hello, the Extensions field is left empty.
Example:
769,47,

These strings are then MD5 hashed to produce an easily consumable and shareable 32
character fingerprint. This is the JA3S Fingerprint.

769,47,65281-0-11-35-5-16 → 4835b19f14997673071435cb321f5445
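
The server side follows the same pattern with three fields instead of five. A minimal sketch, assuming the version, the single selected cipher, and the extension list have already been extracted as decimal integers; `ja3s_string` and `ja3s_hash` are hypothetical names used only for illustration.

```python
import hashlib


def ja3s_string(version, cipher, extensions):
    """Join TLSVersion, Cipher, and Extensions with ','; extensions use '-'."""
    return ",".join([
        str(version),
        str(cipher),
        "-".join(str(v) for v in extensions),
    ])


def ja3s_hash(fingerprint: str) -> str:
    """MD5 the JA3S string into a 32-character hex digest."""
    return hashlib.md5(fingerprint.encode("ascii")).hexdigest()


s = ja3s_string(769, 47, [65281, 0, 11, 35, 5, 16])
print(s, ja3s_hash(s))
```

Note that the server selects exactly one cipher, which is why the second field is a single value and not a list.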

We MD5 hash because there is no limit to how many ciphers or extensions can be
added to the Client Hello or Server Hello respectively, and our rule of thumb is that if
the fingerprint cannot fit in a tweet, then it’s too long. We also use MD5 so the JA3
method can be more easily integrated into existing technologies. Remember that JA3
is a method that is designed to work within any application on any hardware. I admit,
fuzzy hashing would be better, but we wanted to use a method that could be
incorporated into currently-deployed technologies and most of them do not yet have
fuzzy hashing support, while even the oldest Netscreen Firewall can churn out MD5s.
Also, given the limited data set, hash collisions are not a concern here. I know MD5
can be a point of contention within the security community so I hope this helps
explain our reasons behind using it.

Our code does allow for the entire string to be logged along with the hash value for
added analysis. I highly recommend that if you are able, you log the entire fingerprint
string for JA3 and JA3S as well as the hash values. The added analysis capability can
come in handy. Though, if your organization is the type that’s short on log space, just
logging the hash should do you just fine.
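
One way to follow that recommendation is to emit a structured log record that carries both forms. This is only a sketch: the field names (`ja3`, `ja3_string`, `ja3s`, `ja3s_string`) and the function `fingerprint_record` are assumptions for illustration and do not reflect the log format of any particular tool.

```python
import hashlib
import json


def fingerprint_record(ja3_str, ja3s_str, include_strings=True):
    """Build a JSON log line with the hashes and, optionally, the full strings."""
    record = {
        "ja3": hashlib.md5(ja3_str.encode("ascii")).hexdigest(),
        "ja3s": hashlib.md5(ja3s_str.encode("ascii")).hexdigest(),
    }
    if include_strings:
        # Keeping the raw strings lets an analyst see which ciphers and
        # extensions produced the hash without needing the packet capture.
        record["ja3_string"] = ja3_str
        record["ja3s_string"] = ja3s_str
    return json.dumps(record, sort_keys=True)


print(fingerprint_record("769,4-5-10-9-100-98-3-6-19-18-99,,,", "769,47,"))
```

Setting `include_strings=False` gives the hash-only variant for environments that are short on log space.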

Why JA3S Works


We found that the same server will formulate its Server Hello message differently
depending on the Client Hello message and its contents. So it’s not possible to
fingerprint a server just based on its Hello message like we could with clients and JA3.
Because of this, some suggested that there was no value here. But we ran with it
anyway because Salesforce has a never-ending supply of caffeine. After some time we
found that, though servers will respond to different clients differently, they will always
respond to the same client in the same way.
In this network diagram, we can see that the client is sending a TLS Client Hello
packet of all A’s. Therefore the server responds with A and will always respond to A’s
with A.

Here a different client sends all B’s. The same server as before now responds with B
and will always respond to B’s with B. Different client, different response, but always
the same for each client.

Real World Example:

In this log output JA3 is on the left and JA3S is on the right
In this example I contacted the same server four times using the same client. I
then contacted it four more times using a different client. The way that the server
responds is always the same for the same client, though different for a different client.
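
That observation can be checked against your own logs. The sketch below assumes each log entry has been reduced to a `(server, ja3, ja3s)` tuple; the function names and the placeholder fingerprint values are hypothetical. It groups the JA3S values seen for each server and client fingerprint pair and confirms that each pair maps to exactly one response.

```python
from collections import defaultdict


def responses_per_client(records):
    """Map (server, ja3) to the set of JA3S values observed for that pair."""
    seen = defaultdict(set)
    for server, ja3, ja3s in records:
        seen[(server, ja3)].add(ja3s)
    return dict(seen)


def is_consistent(records):
    """True if every (server, ja3) pair produced exactly one JA3S value."""
    return all(len(v) == 1 for v in responses_per_client(records).values())


# Placeholder values standing in for real hashes: client A four times,
# then client B four times, all against the same server.
records = [("server-1", "ja3-A", "ja3s-A")] * 4 + [("server-1", "ja3-B", "ja3s-B")] * 4
print(responses_per_client(records))
print(is_consistent(records))
```

A pair that maps to more than one JA3S value would be worth a closer look, for example a server-side configuration change or a load balancer fronting different backends.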

Usage for Security


In the event that a threat actor custom-built their own malware executable, it’s likely
that the JA3 fingerprint will be unique to that executable. For example here is the
Client Hello of a custom piece of malware developed by pen testers for an
engagement:

You can see that there is only a single strong accepted cipher suite which is anomalous
and the resulting JA3 was unique in our environment, making this easy to detect, no
matter the destination.
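
Finding a JA3 that is unique in the environment comes down to counting how many distinct hosts produce each fingerprint. A minimal sketch, assuming logs reduced to `(host, ja3)` tuples; `rare_ja3`, the `max_hosts` threshold, and the sample values are all hypothetical.

```python
from collections import defaultdict


def rare_ja3(records, max_hosts=1):
    """Return JA3 values seen on at most `max_hosts` distinct hosts."""
    hosts = defaultdict(set)
    for host, ja3 in records:
        hosts[ja3].add(host)
    return sorted(j for j, h in hosts.items() if len(h) <= max_hosts)


# "aaa" is common across the fleet; "bbb" appears on a single host.
records = [("host-1", "aaa"), ("host-2", "aaa"), ("host-3", "aaa"), ("host-4", "bbb")]
print(rare_ja3(records))
```

Rarity alone does not mean malicious, since a niche but legitimate application would surface the same way, so treat the result as a starting point for hunting.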

Other pen testing tools such as PupyRAT will specify their ciphers and ordering as
seen here in the Pupy code:
This makes for an unusual and unique splattering of ciphers in the Client Hello which
therefore generates a unique JA3:
One can then pivot on the JA3 to enhance their hunting or response operations.

But what if the client application uses common libraries or OS sockets for
communication like Python or Windows Socket? The JA3 would be common in the
environment and therefore not as useful for detection. This is where JA3S can assist
in identifying the malicious communication.

For example, both Metasploit’s Meterpreter and Cobalt Strike’s Beacon use a
Windows socket to initiate TLS communication. For Windows 10, that
is JA3=72a589da586844d7f0818ce684948eea (when going to an IP)
and JA3=a0e9f5d64349fb13191bc781f81f42e1 (when going to a domain). Other legitimate
applications on Windows use the same socket, making identification of the malicious
communication difficult. However, the way that the C2 servers on Kali Linux respond
to this client application is unique compared to the way normal servers on the internet
respond to this socket. So if we combine JA3 + JA3S, we are then able to identify this
malicious communication regardless of destination IP, Domain, or Certificate Details.
The search (at the time of this writing) could look like:

Metasploit Win10 to Kali:


(JA3=72a589da586844d7f0818ce684948eea OR JA3=a0e9f5d64349fb13191bc781f81f42e1) AND JA3S=70999de61602be74d4b25185843bd18e

Cobalt Strike Win10 to Kali:


(JA3=72a589da586844d7f0818ce684948eea OR JA3=a0e9f5d64349fb13191bc781f81f42e1) AND JA3S=b742b407517bac9536a77a7b0fee28e9
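
If your logs are not in a search platform, the same logic is easy to express in code. The sketch below uses the hashes quoted in this post and assumes each log entry is a dict with `ja3` and `ja3s` keys; those key names and the function `match_c2` are assumptions for illustration.

```python
# JA3 values quoted above for the Windows 10 socket (to an IP, to a domain).
WIN10_SOCKET_JA3 = {
    "72a589da586844d7f0818ce684948eea",
    "a0e9f5d64349fb13191bc781f81f42e1",
}

# JA3S values quoted above for the C2 servers running on Kali.
C2_JA3S = {
    "70999de61602be74d4b25185843bd18e": "Metasploit",
    "b742b407517bac9536a77a7b0fee28e9": "Cobalt Strike",
}


def match_c2(record):
    """Return the tool name if both JA3 and JA3S match, otherwise None."""
    if record.get("ja3") in WIN10_SOCKET_JA3:
        return C2_JA3S.get(record.get("ja3s"))
    return None


sample = {
    "ja3": "a0e9f5d64349fb13191bc781f81f42e1",
    "ja3s": "b742b407517bac9536a77a7b0fee28e9",
}
print(match_c2(sample))
```

Requiring both halves to match is the point: the JA3 alone would also hit every legitimate application using the same Windows socket.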

As with everything, there is a risk of false positives. You could think of JA3 as the TLS
equivalent of the User-Agent string. Just because one piece of software or malware
has a particular string doesn’t mean it will always be unique to that software. It is
possible for other software to use the same string. However, there’s no reason not to
use the string to augment your analysis and detections. Just like other network
metadata, JA3 is an extra piece of information to be used in enriching your data.
JA3S, when used in conjunction with JA3, can significantly reduce the level of false
positives if you’re looking for something specific.

Pen Tester Example
In another example we have pen testers using the Python version of Empire as their
malware of choice. The JA3 in this case would be that of Python, not unique in any
developer environment.

If we were to search for this JA3 across the environment the results would look
something like this:
However, the pen tester’s C2 server responded to the Python client in a unique way.
So when we search for the JA3 of Python and the JA3S of the way their C2 server
responded, the results looked more like this:

I forgot to take screenshots so you’ll just have to trust me that this is exactly what Splunk looked like.

The resulting output shows the beacons of the malware to their C2 server. As you can see,
JA3 and JA3S combined essentially create a fingerprint of the cryptographic
negotiation between client and server.

Following the eradication of the pen testers, they moved their C2 image to another IP
and domain. However, the malware and server remained the same applications and
therefore the fingerprints remained the same. The previous detection worked
immediately. Finally the pen testers purchased space in a completely different service
provider, purchased a new legitimate looking certificate, purchased a new domain,
and moved their C2 image there. Detection was instant.
Because detection was based on the infrastructure and technology, not on destination
IPs, domains, or certs, we no longer had to rely on traditional IOCs which are easily
changed. This moved the detection near the top of David Bianco’s Pyramid of
Pain and increased the cost of engagement for the adversary.

Conclusion
JA3 and JA3S are TLS fingerprinting methods. JA3 fingerprints the way that a client
application communicates over TLS and JA3S fingerprints the server response.
Combined, they essentially create a fingerprint of the cryptographic negotiation
between client and server. While not always a silver bullet for TLS-based detection or a
guaranteed mapping to client applications, they are always valuable as a pivot point
for analysis.

We designed these methods so that they can be easily applied to existing technologies.
The resulting fingerprints are easy to consume and easy to share. The BSD 3-Clause
license makes it easy to implement. We just wanted it to be easy. In doing so, our hope
is that it becomes a valuable addition to your defensive arsenal and that it inspires
others to build off of our research and push the industry forward.
Zeek/Bro and Python versions of JA3 and JA3S are available
at https://github.com/salesforce/ja3 as well as links to other tools which have
implemented the methods.

JA3 was created by:


John Althouse

Jeff Atkinson

Josh Atkins

For SSH client and server fingerprinting, please see HASSH at https://github.com/salesforce/hassh

For automatic client to JA3 or HASSH mapping, please see Bro-Sysmon at https://github.com/salesforce/bro-sysmon/

For any questions or comments, please feel free to contact me on LinkedIn or @4A4133.
