
A Practical Guide to Writing a Software Technical Design Document


The Iterative Option Analysis Approach

Disclaimer: The methodologies discussed in this article are the result of my collective experience over
the years in software, hardware, and engineering in general. They do not represent the views of any
company.

Writing a design document is not only a brain dump of how you designed the software so that other
people can follow it (like a recipe). Most importantly, it is a tool for the software designer to form
a concrete idea of how to solve a problem systematically and critically.

Over the years, I have been iterating on my methodology for writing software designs. I call it
Iterative Option Analysis (IOA). In this article, I will talk in depth about what it is and how it
works, using the example of building a transcribing app to illustrate the idea throughout.

Document Structure from a Bird’s-eye view

Problem

Background

Goals

Non-Goals

Design Options

Option 1
- Pros
- Cons

Option 2
- Pros
- Cons

Conclusion
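
The outline above can also be kept as a small data structure, so that every new document starts from the same skeleton. The following is a minimal Python sketch; the names `DesignDoc`, `Option`, and `outline` are illustrative assumptions, not part of the methodology itself.

```python
from dataclasses import dataclass, field


@dataclass
class Option:
    title: str
    pros: list[str] = field(default_factory=list)
    cons: list[str] = field(default_factory=list)


@dataclass
class DesignDoc:
    problem: str = ""
    background: str = ""
    goals: list[str] = field(default_factory=list)
    non_goals: list[str] = field(default_factory=list)
    options: list[Option] = field(default_factory=list)
    conclusion: str = ""

    def outline(self) -> str:
        """Render the section headings in the order used by this guide."""
        lines = ["Problem", "Background", "Goals", "Non-Goals", "Design Options"]
        for number, option in enumerate(self.options, start=1):
            lines.append(f"  Option {number}: {option.title}")
            lines.append("    - Pros")
            lines.append("    - Cons")
        lines.append("Conclusion")
        return "\n".join(lines)


# Hypothetical usage: two options produce two Pros/Cons pairs in the skeleton.
doc = DesignDoc(options=[Option("Call the ASR service directly"), Option("Embed ASR in the app")])
print(doc.outline())
```

Because options are a list, adding a third or fourth option later only extends the skeleton; the rest of the structure stays the same.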


Diving Deep on Each Section

Problem
Provide a concise description of an issue to be addressed, or a process to be improved upon.

Problem

We would like to design a Speech-to-Text app for Android devices. A user can tap on a button
on the app, and the transcript will show up on the screen as the user speaks.

Background
This section gives the reviewers context and clarifies why we need to solve this problem.

Background

According to recent research and interviews with Patent Attorneys, there is an increasing need
for automatic dictation while they interview their clients. Patent Attorneys also need to
review the original voice recordings if the transcript is not accurate.

How much context should I provide? What if I could write a history book about this? you might
ask. Explain it the way you would explain it to an interview candidate in a short time.
Interview candidates usually know almost nothing about how your company builds its systems,
and none of your company’s jargon. The background should be concise and on point.

In Scope / Goals
This section is often called In Scope, Goals, Tenets, or Requirements. It describes everything
you want to achieve by this design. I see Goals as rulers in different dimensions, to measure
how good the options are.

Goals

1. The transcribing accuracy rate should be 90% or higher.
2. The speech binary data can be captured and stored for future auditing.
3. The user data should be secure.
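
Goals work best as rulers when they can be measured. The example does not say how the accuracy rate in Goal 1 is defined, so the sketch below assumes one common choice: word accuracy, computed as 1 minus the word error rate (WER). The function names and the 0.90 default threshold are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # previous[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1       # a reference word is missing
            insertion = current[j - 1] + 1   # the hypothesis has an extra word
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


def meets_accuracy_goal(reference: str, hypothesis: str, threshold: float = 0.90) -> bool:
    """Assumed reading of Goal 1: word accuracy (1 - WER) at or above the threshold."""
    return 1.0 - word_error_rate(reference, hypothesis) >= threshold


reference = "the claim covers a method for storing audio data on a server"
hypothesis = "the claim covers a method for storing audio data on the server"
print(word_error_rate(reference, hypothesis), meets_accuracy_goal(reference, hypothesis))
```

With a definition like this written down, a pro such as "meets Goal #1" can be backed by a measurement on sample recordings rather than by a vendor's headline number alone.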

Authored by Grace Huang https://gumroad.com/gracehuang



If your Product Manager already has a Product Requirement Document ready, which lays out all
the details about how the product will work, you can summarize and extract the key points, and
list them as goals as well.

Out of Scope / Non-Goals


Out of Scope is also often called Non-Goals. It draws the boundaries of the design: anything
outside them will not be considered in the Options Analysis. Each Non-Goal needs to come
with a reason that explains why it is a Non-Goal.

A good hint that a Non-Goal is needed is a reviewer asking why something is not considered in
your design when you have a specific reason for not including it. Mentioning Non-Goals
preemptively avoids such questions or gaps during the final design review.

Non-Goals

1. Build the speech data auditing process.
Reason: We will decide how to build this later. Currently, we need to preserve user info
and speech data.

2. Design the Android app.
Reason: In this document, we will mainly design the backend system. There will be another
client design document for the Android client.

Design Option Analysis


There are millions of ways (unlimited, in my personal belief) to solve the same problem.

For example, if you want to get to Los Angeles from San Francisco, you can drive on Highway 1
or Interstate 5, take a Delta flight, or walk to New York and walk all the way back to Los
Angeles. If you optimize for time, flying is your best option, and walking sounds outrageous and
stupid. If you want to witness the ultimate beauty of America and have unlimited time to kill,
walking across the country does not sound unreasonable.

Pick the 3 or 4 most reasonable options to consider. Given the limitations of software tooling and
environment, the reasonable options are usually few, but rarely just one.

In this example, we are considering a list of backend design options that support transcribing.

Option 1: The app calls a cloud-based Automatic Speech Recognition (ASR) service directly


Option 2: Build the pre-trained embedded ASR in the App

Option 3: An HTTP/1.1 service makes proxy calls to a cloud-based Automatic Speech Recognition (ASR) service

Option 4: A bidirectional RPC service to proxy calls to a cloud-based ASR service

For each option, describe in detail what the proposed system looks like. Be generous with
diagrams! As the old saying goes, “a picture is worth a thousand words”. A diagram is the best
tool to illustrate a system.

[Diagram: Option 4, a bidirectional RPC service to proxy calls to a cloud-based ASR service]

For each option, provide an exhaustive list of pros and cons. Each option should be given
the same weight of consideration, no matter how much worse than the others it sounds at
the beginning. It is too early to draw such a conclusion without deep analysis. If you find it
difficult to come up with pros and cons at first, that is okay! Goals can give you some
ideas. Use each Goal to measure the option, and see whether it is good or bad in that dimension.

Option 1: The app calls a cloud-based Automatic Speech Recognition (ASR) service directly

We could use a 3rd party cloud service (such as Amazon Transcribe or Google
Speech-to-Text) directly. The Android app can make requests to the ASR service directly, and
display the response on the screen immediately.


Pros
1. High accuracy. By now, a mature ASR service such as Amazon Transcribe is able to
provide transcribing accuracy of 95% or higher. (This meets Goal #1)
2. Easy setup. No need to set up a server.
Cons
1. Unable to capture the speech binary data. (This is against Goal #2)
2. The 3rd party ASR service is not free.
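
Tagging each pro and con with the Goal it speaks to, as in the example above, makes it easy to spot Goals that nothing has measured yet. The following Python sketch records Option 1's points that way; the `Point` type and the `goal_coverage` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Point:
    text: str
    is_pro: bool
    goal: Optional[int] = None  # Goal number this point speaks to; None for a bonus or minus point


def goal_coverage(points: list[Point], goal_count: int) -> dict[str, list[int]]:
    """Group Goal numbers by whether a pro meets them, a con goes against them, or nothing mentions them."""
    met = sorted({p.goal for p in points if p.is_pro and p.goal is not None})
    against = sorted({p.goal for p in points if not p.is_pro and p.goal is not None})
    mentioned = set(met) | set(against)
    unassessed = [g for g in range(1, goal_count + 1) if g not in mentioned]
    return {"met": met, "against": against, "unassessed": unassessed}


option_1 = [
    Point("High accuracy from a mature ASR service", is_pro=True, goal=1),
    Point("Easy setup, no server needed", is_pro=True),
    Point("Unable to capture the speech binary data", is_pro=False, goal=2),
    Point("The 3rd party ASR service is not free", is_pro=False),
]

print(goal_coverage(option_1, goal_count=3))
# {'met': [1], 'against': [2], 'unassessed': [3]}
```

An unassessed Goal is a prompt to go back to the option and look for a pro or con in that dimension.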

In the example below, during the pros and cons analysis, I noticed that we also care about the
size of the Android APK and about what the transcribing process is like. So I added the pro and
the con related to those points, added two more Goals, and then reviewed the other options
with the new Goals in mind.

Goals
1. The transcribing accuracy rate should be 90% or higher.
2. The speech binary data can be captured and stored for future auditing.
3. The user data should be secure.
4. Keep the size of the Android APK under 100 MB.
5. The transcribing process should be continuous, meaning that the user can see
the transcripts while the user is talking.

Option 3: An HTTP/1.1 service makes proxy calls to a cloud-based Automatic Speech Recognition (ASR) service

We could set up a proxy service between the Android app and a 3rd party cloud service (such
as Amazon Transcribe or Google Speech-to-Text). When the user taps the button, the app
records the speech binary data and sends it to the proxy service. As the proxy
service pipes the data to the ASR service, it can also grab the data and store it elsewhere
(for example, Google Cloud Storage).

Pros
1. Google Speech-to-Text service can provide 95% transcribing accuracy. (This meets
Goal #1)
2. The speech binary data can be captured and stored. (This meets Goal #2)
3. Keep the size of APK small. The Android app would be a thin client without
transcribing itself. (This meets Goal #4)
Cons
1. An additional service needs to be set up. However, it is very easy to set up a service
with HTTP/1.1.
2. The transcribing process would not be continuous because of the request/response
nature of the HTTP/1.1 service: a transcript comes back only after the audio for that
request has been sent. (This is against Goal #5)


Now you can see that this approach is iterative: we keep improving each part of the analysis as
more ideas come up.

Oftentimes, evaluating the cons gives you ideas for other options, simply by asking
“How can we mitigate this con?”

It is perfectly okay to have a con or a pro that is not tied to any Goal. They can be bonus points
or minus points.

Remember, there is no wrong option, but always a better option in the context of Goals.

Conclusion
Based on the in-depth analysis, you will see the best option start to emerge: the one with the
most pros that meet the Goals, or the one that meets all of them. The chosen solution can also
have cons, which are often not deal breakers for the decision. In this section, we can also add
what we will do to mitigate the cons of the chosen option.

Conclusion

Based on the analysis above, the proposed approach is Option 4, which meets all the Goals.

See the full example in Appendix A.
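
To see how the conclusion falls out of the analysis, the verdicts can be laid side by side. The Python sketch below copies the "meets" and "against" tags from the pros and cons in Appendix A; the data layout and the function names `summarize` and `candidates` are assumptions made for illustration, and a Goal that no pro or con mentions is reported as unassessed rather than guessed.

```python
GOAL_COUNT = 5

# True = a pro says the option meets the Goal, False = a con says it goes against it.
# A Goal that no pro or con mentions is simply absent from the option's entry.
verdicts = {
    "Option 1": {1: True, 2: False},
    "Option 2": {1: False, 2: False, 4: False},
    "Option 3": {1: True, 2: True, 4: True, 5: False},
    "Option 4": {1: True, 2: True, 4: True, 5: True},
}


def summarize(option_verdicts):
    """Split one option's Goals into met, against, and not yet assessed."""
    met = [g for g, ok in sorted(option_verdicts.items()) if ok]
    against = [g for g, ok in sorted(option_verdicts.items()) if not ok]
    unassessed = [g for g in range(1, GOAL_COUNT + 1) if g not in option_verdicts]
    return met, against, unassessed


def candidates(all_verdicts):
    """Options with no Goal violated, most Goals met first."""
    viable = [name for name, v in all_verdicts.items() if all(v.values())]
    return sorted(viable, key=lambda name: -sum(all_verdicts[name].values()))


for name, option_verdicts in verdicts.items():
    met, against, unassessed = summarize(option_verdicts)
    print(f"{name}: meets {met}, against {against}, unassessed {unassessed}")
print("No Goal violated:", candidates(verdicts))
```

Option 4 is the only option with no Goal against it, which matches the conclusion above. The sketch also lists Goal #3 as unassessed for every option, because none of the listed pros or cons refers to it; a reviewer may well ask about that.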

Design Review Process

Proofreading
Look at your design document as someone else’s doc and critique every statement.

● See whether assumptions are being made without evidence. If yes, do research and add
more evidence — Past design/code, historical data, etc.
● See whether the words are subjective. If yes, remove subjective words. Here are some
words that could lead to subjectivity: “I/We think”, “I/We feel”, “Good/Great/Nice”,
“Bad/Terrible”, etc.
● See whether you would ask other questions. If yes, add more clarification to cover the
questions.
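
The check for subjective words in the list above is easy to automate as a first pass. Here is a minimal Python sketch that flags the phrases listed there; the pattern list and the function name `find_subjective_wording` are assumptions, and a match is only a hint to reread the sentence, not proof that it is subjective.

```python
import re

# Phrases taken from the proofreading list above; extend as needed.
SUBJECTIVE_PATTERNS = [
    r"\b(?:I|we) think\b",
    r"\b(?:I|we) feel\b",
    r"\b(?:good|great|nice|bad|terrible)\b",
]
SUBJECTIVE_RE = re.compile("|".join(SUBJECTIVE_PATTERNS), re.IGNORECASE)


def find_subjective_wording(text: str) -> list[tuple[int, str]]:
    """Return (line number, matched phrase) pairs, with line numbers starting at 1."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in SUBJECTIVE_RE.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


draft = """We think gRPC is a great fit.
The proxy stores audio in object storage."""
print(find_subjective_wording(draft))
# [(1, 'We think'), (1, 'great')]
```

The other two checks, unsupported assumptions and unanswered questions, still need a human reader.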

Picking Reviewers
The ones who matter, a.k.a. stakeholders: the people who would be affected by the
design. They include, but are not limited to:


● Product Manager (who cares whether the design meets the product requirements)
● Teammates (who need to be aligned with the design, because they will eventually
implement and maintain the system)
● Test Engineers (who need to know how they would test the final implementation)
● Dependent Teams (whose systems are your system’s dependencies, clients,
downstream services)

In a small startup where people wear many hats, it is still worthwhile to get people together to
review the document and provide feedback.

The more experienced ones, a.k.a. engineers more senior than you. In an
environment that encourages learning, each task should be treated as an opportunity to grow.
Invite the experienced ones to come and challenge you. Sometimes, they may ask just one
question that is important enough to make a big difference in the design. Through the
process, you also learn something new.

Under Review
Treat the review process as a proofreading process. Treat your reviewers as partners. However,
when you present the doc for review, it should be the best version it can be.

I always treat people’s comments very seriously. If they have questions, it is a signal that my
doc is not clear enough and I need to address their questions in the doc. The more information I
have, the better the design decision I can make.

It is very important to time-box the review process. Set expectations with your reviewers
about the timeframe within which you would like the review to be complete.

Making a Decision
Before the doc is presented for review, you should already have a preliminary conclusion on
what to go with. Take in the feedback from the reviewers, see how it changes the
conclusion, make a decision together, and keep all the reviewers on the same page.

Here are some common outcomes:

● Go with the option you concluded previously
● Switch to another option that is listed
● A new option emerges during the review, is considered, and is chosen
● Have one option as a long-term solution and another option as a short-term solution. This is
very common in a time-sensitive project.
● Have one or more questions to answer before making a final decision. It is common that
your project has a strong outside dependency. In this case, create a tentative plan to
guide the decision, for example, “If the answer is A, we go with Option X. If the answer is
B, we then go with Option Y”. This way, we don’t need to revisit the whole design with all
the reviewers but have a strategy to reach the final decision.

Other Applications
The Iterative Option Analysis can also be applied to other real-life decision-making processes.

At Roxy (the first startup I co-founded), we used this approach when we were vetting
manufacturers. With three of the founders having no prior experience in manufacturing, it
was difficult to know where to start. There were thousands of manufacturers in Shenzhen. How
should we choose the one for Roxy?

We started by listing what we cared about (Goals).

● Capabilities of the manufacturers: whether they can mass-produce an intelligent
speaker with an OS
● Reliability: whether they have the track record to deliver on time
● Stability: whether they will stay in business and not go bankrupt soon

Then, we started to search for options by visiting expos and talking to investors and friends who
knew manufacturing in China.

At the very beginning, we had a long list of options. By checking against the Goals, we filtered
out some weak options at an early stage. The rest were the ones we would scrutinize
and analyze in depth. To know the pros and cons, we needed to find ways to capture
more information. Some things were not obvious from the surface, such as financial stability. We
would ask a friend who knew the industry well for their input. We visited the factories,
talked to the owners, and observed how the employees worked and how the managers treated
them.

Decisions are hard to make in manufacturing. It is a big investment: you commit to a manufacturer
and have to have faith in their ability once you put the money down (usually a 6-digit figure).

Final Note
Next time you don’t know where to start or you are afraid of making a decision, come
back to Grace’s Iterative Option Analysis. I hope it will inspire you in some way.

But if the decision is about choosing a girlfriend or boyfriend, Grace’s Iterative Option Analysis
is not applicable and is discouraged! (!!)


Appendix A: Technical Design Document Example

Transcribing App For Patent Attorneys


Problem
We would like to design a Speech-to-Text app for Android devices. A user can tap on a button
on the app, and the transcript will show up on the screen as the user speaks.

Background
According to recent research and interviews with Patent Attorneys, there is an increasing need
for automatic dictation while they interview their clients. Patent Attorneys also need to review
the original voice recording if the transcript is not accurate.

Goals
1. The transcribing accuracy rate should be 90% or higher.
2. The speech binary data can be captured and stored for future auditing.
3. The user data should be secure.
4. Keep the size of the Android APK under 100 MB.
5. The transcribing process should be continuous, meaning that the user can see the
transcripts while the user is talking.

Non-Goals
1. Build the speech data auditing process
a. Reason: we will decide how to build this later. Currently, we need to preserve
user info and speech data.
2. Design the Android app.
a. Reason: In this document, we will mainly design the backend system. There will
be another client design document for the Android client.


Design Options

Option 1: The app calls a cloud-based Automatic Speech Recognition (ASR) service directly
We could use a 3rd party cloud service (such as Amazon Transcribe or Google
Speech-to-Text) directly. The Android app can make requests to the ASR service directly, and
display the response on the screen immediately.

Pros
1. By now, a mature ASR service such as Amazon Transcribe is able to provide transcribing
accuracy of 95% or higher. (This meets Goal #1)
2. No need to set up a server.

Cons
1. The speech binary data cannot be captured. (This is against Goal #2)
2. The 3rd party ASR service is not free.

Option 2: Build the pre-trained embedded ASR in the App
We could include a 3rd party embedded ASR engine in the app. Without making requests to an
existing service, the app can get the results from the local engine and display them to the
user.


Pros
1. No need to set up any server.

Cons
1. The transcribing accuracy of any embedded ASR solution (e.g. CMU Sphinx) is low.
(This is against Goal #1)
2. The speech binary data cannot be captured on the device. (This is against Goal #2)
3. The pre-trained model could significantly increase the size of the Android APK. (This is
against Goal #4)

Option 3: An HTTP/1.1 service makes proxy calls to a cloud-based Automatic Speech Recognition (ASR) service
We could set up a proxy service between the Android app and a 3rd party cloud service (such
as Amazon Transcribe or Google Speech-to-Text). When the user taps the button, the app
records the speech binary data and sends it to the proxy service. As the proxy service pipes
the data to the ASR service, it can also grab the data and store it elsewhere (for example,
Google Cloud Storage).

Pros
1. Google Speech-to-Text service can provide 95% transcribing accuracy. (This meets Goal
#1)


2. The speech binary data can be captured and stored. (This meets Goal #2)
3. The Android app would be a thin client without transcribing itself. The size of the app
would not increase significantly. (This meets Goal #4)

Cons
1. An additional service needs to be set up. However, it is very easy to set up a service with
HTTP/1.1.
2. The transcribing process would not be continuous because of the request/response nature
of the HTTP/1.1 service: a transcript comes back only after the audio for that request has
been sent. (This is against Goal #5)

Option 4: A bidirectional RPC service to proxy calls to the Google Speech-to-Text service
We could set up a proxy service between the Android app and a 3rd party cloud service (such
as Amazon Transcribe or Google Speech-to-Text). This service will use gRPC over HTTP/2
to enable bidirectional streaming. When the user taps the button, the app records the speech
binary data and sends it to the proxy service. As the proxy service pipes the data to
the ASR service, it can also grab the data and store it elsewhere (for example, Google
Cloud Storage).


Pros
1. Google Speech-to-Text service can provide 95% transcribing accuracy. (This meets Goal
#1)
2. The speech binary data can be captured and stored. (This meets Goal #2)
3. The Android app would be a thin client without transcribing itself. The size of the app
would not increase significantly. (This meets Goal #4)
4. The transcribing process would be continuous. (This meets Goal #5)

Cons
1. An additional service needs to be set up, and it is more complicated to do because a
bidirectional RPC service requires managing server and client state in response to stream events.
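
The data flow of Option 4 can be modeled in a few lines before any gRPC code is written. The Python sketch below is an in-process model only, using generators in place of network streams; `proxy_stream`, `fake_recognize_stream`, and the `store_chunk` callback are hypothetical names, and a real implementation would use a gRPC bidirectional streaming handler plus the ASR vendor's streaming client instead.

```python
from typing import Callable, Iterable, Iterator


def proxy_stream(
    audio_chunks: Iterable[bytes],
    recognize_stream: Callable[[Iterator[bytes]], Iterator[str]],
    store_chunk: Callable[[bytes], None],
) -> Iterator[str]:
    """Forward audio to the ASR stream, keep a copy of every chunk, and relay transcripts as they arrive."""

    def tee() -> Iterator[bytes]:
        for chunk in audio_chunks:
            store_chunk(chunk)  # capture the audio for future auditing (Goal #2)
            yield chunk         # pipe the same chunk onward to the ASR service

    yield from recognize_stream(tee())


def fake_recognize_stream(chunks: Iterator[bytes]) -> Iterator[str]:
    """Stand-in for a streaming ASR client: emits one partial result per chunk received."""
    for index, chunk in enumerate(chunks, start=1):
        yield f"partial transcript after chunk {index} ({len(chunk)} bytes)"


stored: list[bytes] = []
microphone = (bytes([i]) * 320 for i in range(3))  # three fake audio chunks
for transcript in proxy_stream(microphone, fake_recognize_stream, stored.append):
    print(transcript)  # shown while later chunks have not been recorded yet (Goal #5)
```

Each partial transcript is produced before the next chunk is read, which is the continuous behavior Goal #5 asks for. The model leaves out what the con above warns about: stream lifecycle, errors, and reconnects on both sides of the proxy.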

Conclusion
Based on the analysis above, the proposed approach is Option 4, which meets all the Goals.
