
06/03/2024, 22:02 Downloading “undownloadable” web PDFs with Fiddler.

| by A B | Medium

Downloading “undownloadable” web PDFs with Fiddler.
A B
7 min read · Jul 13, 2018

I was once teaching a course on backend software engineering. I didn’t own the
course material; my duties included going over and presenting the slide deck that
the course coordinator had provided, answering any outstanding questions from the
class, being on time, having lunch, and promptly getting lost at 5:30 pm. At the end
of the course, naturally, the students asked me to share the slide deck with them so
they could go over it on their own. And that’s when the issue revealed itself: the
course slides were provided to me via a secure document sharing platform, let’s call
it PDFLord [I won’t mention the actual name for the sake of… reasons], which imposed
downloading and printing restrictions on all the course PDFs. So, unfortunately, the
students had to leave the class empty-handed.

However, something didn’t seem right to me. If you can see the document on your
screen, surely its source is hiding somewhere in the files downloaded or cached by
your browser, and consequently the download restriction is artificial in a sense.
In this article I will show you a method to overcome these restrictions that I
discovered in the two days following the course. My tutorial assumes a macOS
(High Sierra) environment, the Chrome browser, and the PDFLord platform, but similar
steps could be taken for other operating systems and other document sharing
platforms.

To begin with, let’s list the reasons why PDFLord was the bane of my existence:

https://medium.com/@peacefuleast/downloading-undownloadable-web-pdfs-with-fiddler-32094da02285 1/20

1. As mentioned before, the PDFs had downloading and printing restrictions (as
indicated by the grayed out icons in the top right corner).

2. The PDFs were copy-protected, meaning I could not select any text (as indicated
by the “Protected File” pop-up on mouse click).

3. The PDFs were unsearchable, meaning I had to memorize the page numbers of
all sections in the course that I wanted to quickly navigate to.

4. There was no fullscreen or present button.

My first instinct was to examine the page source files. I will skip the parts where I
was randomly clicking through all possible directories and folders while looking for
the right files, and go straight to the ones relevant to this tutorial. Press
Command+Shift+C to open the developer tools in Chrome, then open the Sources tab.


In the Sources tab there is a pdflord.com directory, with a plugins folder under
assets. If you scroll down, you will find a folder called pdfjs, which contains two
files: pdf.js and viewer.js. It turns out that PDFLord uses PDF.js, Mozilla’s
open-source JavaScript library for parsing and rendering PDFs, which you can find
here: https://mozilla.github.io/pdf.js/

Let’s dig through the viewer.js file a bit more. After some inspection we find a
method which sounds like it deals with page rendering:

function webViewerPageRendered(evt)

Let’s add a breakpoint on line 2141 inside this method right after the pageView
variable and reload the page. Our goal is to examine what the object pointed at by
this variable represents.


Surely, now we can just write a script to go over every page in the PDF, extract the
image data arrays, convert them to JPEGs, and end up with a sequence of images of
the PDF file. To be honest, I wasn’t quite satisfied with this finding: I would still
not be able to select any text or search through the images. I was looking for a
better way.

If we examine the viewer.js file a bit more, we find another interesting function:


In particular, there is this very intriguing line which looks like it deals with
restricting downloads:

if (PDFViewerApplication &&
PDFViewerApplication.appConfig.allowdownload) {

And then we also find the following sequence which deals with binding events to
button click listeners. It’s amusing how the “print” and “download” events are very
sloppily commented out, most likely to handle print and download logic in a
different part of the code.
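
To make the effect of a commented-out binding concrete, here is a minimal sketch. MiniEventBus, the log array, and the buttons object are hypothetical stand-ins written for this illustration, not the actual viewer.js code; only the event names 'zoomin' and 'download' come from the article.

```javascript
// Hypothetical stand-in for the viewer's event bus; not the pdf.js source.
class MiniEventBus {
  constructor() {
    this.listeners = new Map();
  }

  // Register a handler for a named event.
  on(name, handler) {
    if (!this.listeners.has(name)) {
      this.listeners.set(name, []);
    }
    this.listeners.get(name).push(handler);
  }

  // Run every handler registered for the event and report how many ran.
  dispatch(name) {
    const handlers = this.listeners.get(name) || [];
    handlers.forEach((handler) => handler());
    return handlers.length;
  }
}

const log = [];
const eventBus = new MiniEventBus();
eventBus.on('zoomin', () => log.push('zoomed in'));
eventBus.on('download', () => log.push('download requested'));

// Each toolbar button is reduced to the function its click listener would run.
const buttons = {
  zoomIn: () => eventBus.dispatch('zoomin'),
  // download: () => eventBus.dispatch('download'), // commented out, so no button dispatches 'download'
};

buttons.zoomIn();
console.log(log);
```

The 'download' handler still exists on the bus, but with its binding commented out no button ever dispatches it, which is roughly the situation the toolbar code appears to be in.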


At this point our action plan is clear:

1. We will rebind one of the buttons to serve as a download button (simply
uncommenting the download event listener didn’t work; I didn’t dig too much
into why).

2. Change the download permissions logic to not require allowdownload.

3. ???

4. Proceed to downloading the PDF.


To change the JavaScript files returned by a web page we need a man-in-the-middle
proxy server. For this purpose, we will be using Fiddler, a free web debugging proxy
by Telerik (https://www.telerik.com/fiddler). Fiddler was originally developed as a
Windows application, and was only recently ported to Mac. On macOS it runs on Mono,
an open-source implementation of the .NET Framework. You can follow this tutorial to
install Mono and Fiddler:
https://www.telerik.com/blogs/introducing-fiddler-for-os-x-beta-1
The only difference is that the 64-bit version of Fiddler doesn’t work on OS X, so
you need to use this command to start Fiddler and avoid errors:

mono --arch=32 Fiddler.exe

Most websites nowadays use HTTPS, so we need to configure Fiddler to correctly
capture and decrypt HTTPS traffic. Open Tools->Options->HTTPS, and check the
Decrypt HTTPS Traffic checkbox.

Since Fiddler acts as a proxy, browser traffic gets redirected to it. Browsers
protect user data from man-in-the-middle attacks by refusing to exchange traffic
with parties whose certificates are not trusted. To get past this, we need the
system to trust Fiddler’s root certificate: click Actions->Export Root Certificate
To Desktop. Next, open Keychain Access, the macOS app that manages certificates, and
drag and drop the exported certificate from your desktop into the Keychain window.
The certificate will appear as DO_NOT_TRUST_FiddlerRoot. Double-click it, and in the
new window select Always Trust.

The final step is to actually redirect the traffic from Chrome to Fiddler. Open System
Preferences->Network->Advanced->Proxies. Check Web Proxy and Secure Web Proxy,
and for both set the host to 127.0.0.1 and the port to 8888 (Fiddler’s default).
Click OK, then Apply.


You should now start seeing the traffic from your browser in the main Fiddler
window. If you don’t see anything, try using an Incognito Window.

Now the fun part: hacking the JavaScript files and serving them in place of the
originals. Download (or copy and paste) the viewer.js file, open it in your favorite
editor, and replace line 10279 with:

items.zoomIn.addEventListener('click', function() {
  //eventBus.dispatch('zoomin');
  eventBus.dispatch('download');
});


In short, we are binding the download event to the zoom-in button. Next, remove
`PDFViewerApplication.appConfig.allowdownload` from lines 1475 and 5067 (and
anywhere else in the file for that matter):

if (PDFViewerApplication)
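
The difference between the two forms of the check can be sketched in isolation. The app object and both function names below are hypothetical and exist only for this illustration; the only detail taken from the article is the appConfig.allowdownload flag.

```javascript
// Hypothetical application object; only the allowdownload flag mirrors the article.
const app = { appConfig: { allowdownload: false } };

// Original shape of the check: the object must exist AND the flag must be truthy.
function passesOriginalCheck(application) {
  return Boolean(application && application.appConfig.allowdownload);
}

// Edited shape of the check: the flag is no longer consulted.
function passesEditedCheck(application) {
  return Boolean(application);
}

console.log(passesOriginalCheck(app));
console.log(passesEditedCheck(app));
```

With the flag set to false, the first check fails and the second passes, which is why the condition has to be edited everywhere it guards the download path.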

Our substitute viewer.js file is ready for deployment. Find and select the viewer.js
resource in Fiddler (you might want to pause capturing by unchecking File->Capture
Traffic, so the session list stops refreshing while you look).

Actual name of the website replaced with pdflord.

Then in the panel on the right select AutoResponder->Add Rule. In the bottom drop-
down menu choose Find File, select your substitute viewer.js file and click Save. Make
sure both Enable rules and Unmatched requests passthrough are checked.
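
Conceptually, the two checkboxes behave roughly like the decision function below. This is a simplified, hypothetical model and not Fiddler’s implementation: resolveRequest, the rule shape, the action names, the placeholder URL, and the local path are all made up for illustration. As far as I understand, leaving the passthrough option unchecked means unmatched requests get an error response instead of reaching the server, which would break the rest of the page.

```javascript
// Simplified, hypothetical model of a proxy's substitution rules.
// This is not Fiddler's code; the rule shape and names are invented for illustration.
function resolveRequest(url, rules, options) {
  // With rules disabled, every request goes to the real server.
  if (!options.enableRules) {
    return { action: 'forward', target: url };
  }
  // Only exact URL matches are modeled here.
  const rule = rules.find((candidate) => candidate.match === url);
  if (rule) {
    return { action: 'serve-local-file', target: rule.localFile };
  }
  // Unmatched requests reach the server only when passthrough is enabled.
  if (options.unmatchedPassthrough) {
    return { action: 'forward', target: url };
  }
  return { action: 'reject', target: url };
}

const rules = [
  {
    match: 'https://pdflord.example/assets/plugins/pdfjs/viewer.js', // placeholder URL
    localFile: '/Users/me/Desktop/viewer.js', // placeholder path
  },
];
const options = { enableRules: true, unmatchedPassthrough: true };

console.log(resolveRequest(rules[0].match, rules, options).action);
console.log(
  resolveRequest('https://pdflord.example/assets/plugins/pdfjs/pdf.js', rules, options).action
);
```

With both options enabled, only the viewer.js request is answered from disk and everything else loads normally.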


Aaaaaand… drum roll… we are done! We are ready to download our PDF.

Open your Chrome window with the PDF viewer. With the developer tools open,
right-click the reload button and choose Empty Cache and Hard Reload. Don’t forget
to re-enable Capture Traffic in Fiddler.


Emptying the cache keeps Chrome from reusing its cached copy of the original
viewer.js and forces it to request the file again. Fiddler intercepts that request
and answers it with our custom file.

Now, whenever you click on the Zoom In button (“+”), your PDF will get
downloaded. Great success!

Final thoughts and lessons learned:

Once data reaches someone’s computer, there is absolutely no way to guarantee that
it stays protected.

Basing your business model on a premise that the data you share is fully secure
and protected is a terrible idea.

Hope y’all who got this far had as much fun with this tutorial as I did when fiddling
with this challenge.

Disclaimer: use at your own risk. Make sure you are not breaching any contracts
with your document providers. There is a very obvious potential harm to the
business models of the secure document sharing companies.
