TLS Fingerprinting With JA3 and JA3S - by John Althouse - Salesforce Engineering
TL;DR
In this blog post, I’ll go over how to utilize JA3 with JA3S as a method to fingerprint
the TLS negotiation between client and server. This combined fingerprinting can
assist in producing higher fidelity identification of the encrypted communication
between a specific client and its server. For example —
The Tor servers always respond to the Tor client in exactly the same way, providing
higher confidence that the traffic is indeed Tor. Further examples —
Trickbot malware:
Emotet malware:
JA3 and JA3S have been open sourced and can be found
here: https://github.com/salesforce/ja3
The primary concept for fingerprinting TLS clients came from Lee Brotherston’s 2015
research which can be found here and his DerbyCon talk which is here. If it weren’t
for Lee’s research and open sourcing of it, we would not have started work on JA3. So,
thank you Lee and all those who blog and open source!
To recap: TLS and its predecessor, SSL, are used to encrypt communication for both
common applications, to keep your data secure, and malware, so it can hide in the
noise. To initiate a TLS session, a client will send a TLS Client Hello packet following
the TCP 3-way handshake. This packet and the way in which it is generated is
dependent on packages and methods used when building the client application. The
server, if accepting TLS connections, will respond with a TLS Server Hello packet that
is formulated based on server-side libraries and configurations as well as details in the
Client Hello. Because TLS negotiations are transmitted in the clear, it’s possible to
fingerprint and identify client applications using the details in the TLS Client Hello
packet.
This exquisitely drawn network diagram shows the SSL/TLS initial communication pattern.
The JA3 method is used to gather the decimal values of the bytes for the following
fields in the Client Hello packet: Version, Accepted Ciphers, List of Extensions,
Elliptic Curves, and Elliptic Curve Formats. It then concatenates those values together
in order, using a “,” to delimit each field and a “-” to delimit each value in each field.
Example Client Hello packet as viewed in Wireshark
TLSVersion,Ciphers,Extensions,EllipticCurves,EllipticCurvePointFormats
Example:
769,47-53-5-10-49161-49162-49171-49172-50-56-19-4,0-10-11,23-24-25,0
If there are no TLS Extensions in the Client Hello, the fields are left empty.
Example:
769,4-5-10-9-100-98-3-6-19-18-99,,,
These strings are then MD5 hashed to produce an easily consumable and shareable 32
character fingerprint. This is the JA3 TLS Client Fingerprint.
769,47-53-5-10-49161-49162-49171-49172-50-56-19-4,0-10-11,23-24-25,0 →
ada70206e40642a3e4461f35503241d5
769,4-5-10-9-100-98-3-6-19-18-99,,, → de350869b8c85de67a350c8d186f11e6
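As a minimal sketch of the string-building and hashing steps described above, the Python below joins the five fields and MD5-hashes the result. The function names are hypothetical, not taken from the JA3 repository, and the sketch assumes the field values have already been extracted from the Client Hello as decimal integers and that the value delimiter is the plain ASCII hyphen.

```python
import hashlib

def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Join the five Client Hello fields: ',' between fields, '-' between values.

    Empty lists yield empty fields, so a Client Hello with no extensions
    ends in ',,,'.
    """
    fields = [[version], ciphers, extensions, curves, point_formats]
    return ",".join("-".join(str(v) for v in field) for field in fields)

def ja3_hash(fingerprint_string):
    """MD5 the fingerprint string into a 32-character hex digest."""
    return hashlib.md5(fingerprint_string.encode("ascii")).hexdigest()

# The two example Client Hellos from the text above.
example = ja3_string(
    769,
    [47, 53, 5, 10, 49161, 49162, 49171, 49172, 50, 56, 19, 4],
    [0, 10, 11],
    [23, 24, 25],
    [0],
)
no_ext = ja3_string(769, [4, 5, 10, 9, 100, 98, 3, 6, 19, 18, 99], [], [], [])

print(example, ja3_hash(example))
print(no_ext, ja3_hash(no_ext))
```

Keeping the string and the hash as separate steps makes it easy to log both.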
We also needed to introduce some code to account for Google’s GREASE (Generate
Random Extensions And Sustain Extensibility) as described here. Google uses this as
a mechanism to prevent extensibility failures in the TLS ecosystem. JA3 ignores these
values completely to ensure that programs utilizing GREASE can still be identified
with a single JA3 hash.
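One way to ignore GREASE values is to filter them out of each list before building the fingerprint string. The sketch below is an assumption about how that could look, not the code from the JA3 repository. The set of reserved values comes from the GREASE specification (RFC 8701), not from this post, and strip_grease is a hypothetical helper name.

```python
# The 16 reserved GREASE values defined in RFC 8701:
# 0x0A0A, 0x1A1A, 0x2A2A, ... 0xFAFA.
GREASE_VALUES = {0x0A0A + 0x1010 * i for i in range(16)}

def strip_grease(values):
    """Return the values in their original order with GREASE entries removed."""
    return [v for v in values if v not in GREASE_VALUES]

# Hypothetical cipher list with GREASE values at the front and back.
ciphers = [0x0A0A, 47, 53, 0xFAFA]
print(strip_grease(ciphers))
```

Order is preserved on purpose, since the fingerprint depends on the order in which the client lists its values.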
JA3S
After creating JA3 we started playing with using the same method to fingerprint the
server side of the TLS handshake, the TLS Server Hello message. The JA3S method is
to gather the decimal values of the bytes for the following fields in the Server Hello
packet: Version, Accepted Cipher, and List of Extensions. It then concatenates those
values together in order, using a “,” to delimit each field and a “-” to delimit each value
in each field.
Example:
769,47,65281-0-11-35-5-16
If there are no TLS Extensions in the Server Hello, the field is left empty.
Example:
769,47,
These strings are then MD5 hashed to produce an easily consumable and shareable 32
character fingerprint. This is the JA3S Fingerprint.
769,47,65281-0-11-35-5-16 → 4835b19f14997673071435cb321f5445
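The JA3S steps can be sketched the same way as JA3, with three fields instead of five. The function names here are hypothetical, and the sketch assumes the Server Hello values are already available as decimal integers and that the value delimiter is the plain ASCII hyphen.

```python
import hashlib

def ja3s_string(version, cipher, extensions):
    """Join the three Server Hello fields: ',' between fields, '-' between extensions."""
    ext_field = "-".join(str(e) for e in extensions)
    return ",".join([str(version), str(cipher), ext_field])

def ja3s_hash(fingerprint_string):
    """MD5 the fingerprint string into a 32-character hex digest."""
    return hashlib.md5(fingerprint_string.encode("ascii")).hexdigest()

# The two example Server Hellos from the text above.
with_ext = ja3s_string(769, 47, [65281, 0, 11, 35, 5, 16])
without_ext = ja3s_string(769, 47, [])

print(with_ext, ja3s_hash(with_ext))
print(without_ext, ja3s_hash(without_ext))
```

The server selects exactly one cipher, so that field is a single value rather than a list.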
We MD5 hash because there is no limit to how many ciphers or extensions can be
added to the Client Hello or Server Hello respectively, and our rule of thumb is that if
the fingerprint cannot fit in a tweet, then it’s too long. We also use MD5 so the JA3
method can be more easily integrated into existing technologies. Remember that JA3
is a method that is designed to work within any application on any hardware. I admit,
fuzzy hashing would be better, but we wanted to use a method that could be
incorporated into currently-deployed technologies and most of them do not yet have
fuzzy hashing support, while even the oldest Netscreen Firewall can churn out MD5s.
Also, given the limited data set, hash collisions are not a concern here. I know MD5
can be a point of contention within the security community so I hope this helps
explain our reasons behind using it.
Our code does allow for the entire string to be logged along with the hash value for
added analysis. I highly recommend that if you are able, you log the entire fingerprint
string for JA3 and JA3S as well as the hash values. The added analysis capability can
come in handy. Though, if your organization is the type that’s short on log space, just
logging the hash should do you just fine.
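To illustrate the logging trade-off, here is a small sketch of a log record that carries the hashes and, optionally, the full strings. The record layout and the key names (ja3, ja3s, ja3_string, ja3s_string) are assumptions made for this example, not the output format of the published tools.

```python
import hashlib
import json

def fingerprint_record(ja3_str, ja3s_str, include_strings=True):
    """Build a log record with both hashes, plus the full strings if space allows."""
    record = {
        "ja3": hashlib.md5(ja3_str.encode("ascii")).hexdigest(),
        "ja3s": hashlib.md5(ja3s_str.encode("ascii")).hexdigest(),
    }
    if include_strings:
        # The full strings make later analysis possible, at the cost of log space.
        record["ja3_string"] = ja3_str
        record["ja3s_string"] = ja3s_str
    return record

full = fingerprint_record("769,4-5-10-9-100-98-3-6-19-18-99,,,", "769,47,")
slim = fingerprint_record("769,4-5-10-9-100-98-3-6-19-18-99,,,", "769,47,",
                          include_strings=False)
print(json.dumps(full, sort_keys=True))
print(json.dumps(slim, sort_keys=True))
```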
Here a different client sends all B’s. The same server as before now responds with B
and will always respond to Bs with B. Different client, different response, but always
the same for each client.
In this log output JA3 is on the left and JA3S is on the right
In this example I have contacted the same server 4 times over using the same client. I
then contacted it again using a different client 4 times over. The way that the server
responds is always the same for the same client, though different for a different client.
You can see that there is only a single strong accepted cipher suite which is anomalous
and the resulting JA3 was unique in our environment, making this easy to detect, no
matter the destination.
Other pen testing tools such as PupyRAT will specify their ciphers and ordering as
seen here in the Pupy code:
This makes for an unusual and unique splattering of ciphers in the Client Hello which
therefore generates a unique JA3:
One can then pivot on the JA3 to enhance their hunting or response operations.
But what if the client application uses common libraries or OS sockets for
communication like Python or Windows Socket? The JA3 would be common in the
environment and therefore not as useful for detection. This is where JA3S can assist
in identifying the malicious communication.
For example, both Metasploit’s Meterpreter and Cobalt Strike’s Beacon use a
Windows socket to initiate TLS communication. For Windows 10, that
is JA3=72a589da586844d7f0818ce684948eea (when going to an IP)
and JA3=a0e9f5d64349fb13191bc781f81f42e1 (when going to a domain). Other legitimate
applications on Windows use the same socket, making identification of the malicious
communication difficult. However, the way that the C2 servers on Kali Linux respond
to this client application is unique compared to the way normal servers on the internet
respond to this socket. So if we combine JA3 + JA3S, we are then able to identify this
malicious communication regardless of destination IP, Domain, or Certificate Details.
The search (at the time of this writing) could look like:
JA3S=70999de61602be74d4b25185843bd18e
JA3S=b742b407517bac9536a77a7b0fee28e9
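The combined JA3 + JA3S logic can also be sketched outside of a search tool. The Python below filters log records for a Windows socket JA3 paired with one of the JA3S values listed above. The record layout, key names, and sample records are hypothetical. The text as it appears here does not say which JA3S value belongs to which C2 server, so the sketch treats both as one set of values to watch for.

```python
# JA3 values given above for the Windows 10 socket (to an IP, to a domain).
WINDOWS_SOCKET_JA3 = {
    "72a589da586844d7f0818ce684948eea",
    "a0e9f5d64349fb13191bc781f81f42e1",
}
# JA3S values from the search above.
SUSPECT_JA3S = {
    "70999de61602be74d4b25185843bd18e",
    "b742b407517bac9536a77a7b0fee28e9",
}

def is_suspect(record):
    """True when a common client JA3 is paired with one of the listed JA3S values."""
    return (record.get("ja3") in WINDOWS_SOCKET_JA3
            and record.get("ja3s") in SUSPECT_JA3S)

# Hypothetical log records; the IPs are documentation addresses.
records = [
    {"dst": "192.0.2.10", "ja3": "72a589da586844d7f0818ce684948eea",
     "ja3s": "b742b407517bac9536a77a7b0fee28e9"},
    {"dst": "198.51.100.7", "ja3": "72a589da586844d7f0818ce684948eea",
     "ja3s": "0" * 32},  # same client, ordinary server response
]
hits = [r for r in records if is_suspect(r)]
print(hits)
```

Note that the destination is never consulted, which is the point: the match survives a change of IP, domain, or certificate.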
As with everything, there is a risk of false positives. You could think of JA3 as the TLS
equivalent of the User-Agent string. Just because one piece of software or malware
has a particular string doesn’t mean it will always be unique to that software. It is
possible for other software to use the same string. However, there’s no reason not to
use the string to augment your analysis and detections. Just like other network
metadata, JA3 is an extra piece of information to be used in enriching your data.
JA3S, when used in conjunction with JA3, can significantly reduce the level of false
positives if you’re looking for something specific.
Pen Tester Example
In another example we have pen testers using the Python version of Empire as their
malware of choice. The JA3 in this case would be that of Python, not unique in any
developer environment.
If we were to search for this JA3 across the environment the results would look
something like this:
However, the pen tester’s C2 server responded to the Python client in a unique way.
So when we search for the JA3 of Python and the JA3S of the way their C2 server
responded, the results looked more like this:
I forgot to take screenshots so you’ll just have to trust me that this is exactly what Splunk looked like.
The resulting output shows the malware beaconing to its C2 server. As you can see,
JA3 and JA3S combined essentially create a fingerprint of the cryptographic
negotiation between client and server.
Following the eradication of the pen testers, they moved their C2 image to another IP
and domain. However, the malware and server remained the same applications and
therefore the fingerprints remained the same. The previous detection worked
immediately. Finally the pen testers purchased space in a completely different service
provider, purchased a new legitimate looking certificate, purchased a new domain,
and moved their C2 image there. Detection was instant.
Because detection was based on the infrastructure and technology, not on destination
IPs, domains, or certs, we no longer had to rely on traditional IOCs which are easily
changed. This moved the detection near the top of David Bianco’s Pyramid of
Pain and increased the cost of engagement for the adversary.
Conclusion
JA3 and JA3S are TLS fingerprinting methods. JA3 fingerprints the way that a client
application communicates over TLS and JA3S fingerprints the server response.
Combined, they essentially create a fingerprint of the cryptographic negotiation
between client and server. While not always a silver bullet for TLS-based detection or a
guaranteed mapping to client applications, they are always valuable as a pivot point
for analysis.
We designed these methods so that they can be easily applied to existing technologies.
The resulting fingerprints are easy to consume and easy to share. The BSD 3-Clause
license makes it easy to implement. We just wanted it to be easy. In doing so, our hope
is that it becomes a valuable addition to your defensive arsenal and that it inspires
others to build off of our research and push the industry forward.
Zeek/Bro and Python versions of JA3 and JA3S are available
at https://github.com/salesforce/ja3 as well as links to other tools which have
implemented the methods.
John Althouse
Jeff Atkinson
Josh Atkins