Lecture 1 - Getting To Know Scalability

Getting to Know Scalability
BITS Pilani Dr. Shreyas Rao
Associate Prof. (Off Campus), CSIS, BITS-Pilani
Pilani Campus
Instructor Profile
Dr. Shreyas Rao
• 18+ Years of Experience in IT, Teaching and Research
• Working as Associate Professor (Off Campus), Dept. of CSIS, BITS-Pilani, WILP
• B.E from VTU, M.S in Software Systems from BITS (WILP) and PhD from MAHE
• Worked as Business Analyst and Team Lead at SLK Software Services for 7+ years
• Previously worked at Presidency University and Sahyadri College, Mangaluru as R&D Head, CSE
• Executed 10+ Consultancy projects
• COE member in AI&ML and COE member in Data Science (Govt. Sponsored for 1.2 Cr)

SE ZG583, Scalable Services

Lecture No. 1
Course Objectives
No. | Course Objective
CO1 | Build competence to design, develop, implement and manage scalable information systems
CO2 | Gain understanding of different techniques & tools for building and managing scalable services
CO3 | Gain understanding of challenges and best practices in creating and managing scalable services

Text Books and Reference

Evaluation
Evaluation Component | Name (Quiz, Lab, Project, Midterm exam, End semester exam, etc.) | Type (Open book, Closed book, Online, etc.) | Weight | Duration | Day, Date, Session, Time
EC – 1 | Quiz 1 | - | 5% | - | February 13-23, 2023
EC – 1 | Quiz 2 | - | 5% | - | March 20-30, 2023
EC – 1 | Lab (exploring different tools) | - | 10% | - | To be announced
EC – 1 | Assignment (end to end app development) | - | 10% | - | To be announced
EC – 2 | Mid-term Exam | Open book | 30% | 2 hours | Saturday, 11/03/2023 (FN)
EC – 3 | End Semester Exam | Open book | 40% | 2 ½ hours | Saturday, 20/05/2023 (FN)

Flipkart Big Billion Day Sale
• Launched on 6-Oct-2014
• Opened at 8am
• Big discounts in 70+ categories, flash sales, lucky draw
• Sold large volumes of Nokia Lumia 525 phones and Samsung Galaxy Tabs at throwaway prices
• Three lakh orders in 6 hours!

Flipkart Big Billion Day Sale
Negatives:
• Website crashed under the heavy traffic
• Products already added to the cart vanished after recovery or appeared as sold out!
• Money was deducted from accounts, but the orders were not executed
• Large number of customer complaints
• Reviews were hidden; no refunds and no cancellation of orders
From an architecture perspective, what could have gone wrong?

Hotstar Case Study
What steps did the Hotstar cloud architects take to handle 25.3 million concurrent users?

Current Day Scenario!
Daily Micro-Deployments
Company | Deployments
Amazon | 23,000 / day
Google | 5,500 / day
Netflix | 500 / day
Twitter | 3 / week
* Source: Phoenix Project (2021)
Application categories: Streaming / OTT, Social Media Apps, E-Commerce Apps

Software Quality Attributes
• Scalability
• Performance
• Availability
• Reliability
• Interoperability
• Testability
• Usability
• Modifiability
• Security
• Portability
• Maintainability

Agenda
1. What is Scalability
2. Need for Scalable Architectures
3. Principles of Scalability
4. Scale Cube
5. CAP Theorem
6. Guidelines for Building Highly Scalable Systems
7. Architecture’s Scalability Requirements
8. Challenges for Scalability
9. Case Study – Uber / Netflix

What is Scalability
• “Scalable” is the term used to describe software systems that can accommodate growth
• It is a non-functional requirement
• Capability to handle growth in “some dimension” of its operations
Operational dimensions can include:
• number of simultaneous users or requests a system can process
• amount of data a system can effectively process and manage

Scale Up vs Scale Down
• Scaling Up refers to increasing the size or capacity of a system
Ex: Adding more physical servers, RAM, CPUs, virtual machines, containers, etc.
• Scaling Down refers to reducing the size or capacity of a system

Vertical vs Horizontal Scaling
• Vertical Scaling (Scaling Up) increases a single machine’s capacity by adding resources
(CPU, RAM, storage) to the same logical server or unit
• Horizontal Scaling (Scaling Out) increases the capacity of a service by adding new
instances of the server and distributing the workload across them

Vertical vs Horizontal Scaling
• Vertical Scaling - Adding more power to an existing EC2 instance
• Horizontal Scaling - Adding more EC2 instances

Need for Scalable Architectures
The foundations of scale need to be built in from the beginning, with the
recognition that the components will evolve over time.
“Hyper scalable systems exhibit exponential growth in computational
and storage capabilities while exhibiting linear growth rates in the costs
of resources required to build, operate, support and evolve the required
software and hardware resources.”
- Ian Gorton

Scalability and Costs
Ex: We have a Web-based system (e.g. a web server and a database) that can
service a load of 100 concurrent requests with a mean response time of 1
second. As the request load increases, we see the mean response time
steadily grow to 10 seconds at the projected load.
Requirement: Scale this system to handle 1000 concurrent requests with the
same mean response time.
Solution:
Scale Up - run the Web server on a more powerful machine (Vertical)
Scale Out - run multiple instances of the Web server to increase capacity
(Horizontal, Cloning)
Is it that easy to scale?
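A rough way to size this requirement is Little's Law (L = λ × W): mean concurrency equals throughput times mean response time. The sketch below applies it to the numbers in the example. Reading the "projected load" as 1000 concurrent requests, and treating each clone as sustaining the baseline throughput independently, are assumptions made for illustration; the function names are hypothetical.

```python
import math

def throughput(concurrent_requests: int, mean_response_s: float) -> float:
    """Little's Law rearranged: lambda = L / W, in requests per second."""
    return concurrent_requests / mean_response_s

def instances_needed(target_rps: float, per_instance_rps: float) -> int:
    """Clones required if every instance sustains per_instance_rps on its own."""
    return math.ceil(target_rps / per_instance_rps)

baseline_rps = throughput(100, 1.0)     # today: 100 concurrent requests at 1 s
saturated_rps = throughput(1000, 10.0)  # assumed projected load at 10 s
target_rps = throughput(1000, 1.0)      # requirement: 1000 concurrent at 1 s

print(baseline_rps, saturated_rps, target_rps)
print(instances_needed(target_rps, baseline_rps))
```

Under these assumptions the unscaled system delivers no more throughput at the higher load, it only makes requests wait longer, and the requirement amounts to a tenfold increase in throughput. The instance count is a lower bound: it ignores shared downstream components such as the database.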

Scalability and Costs (contd)
Some potential causes that increase effort and cost:
1) Database becomes less responsive with 1000 requests per second - requires
upgrading the database server
2) Web server may generate slow dynamic responses (it depends on other downstream
services) - requires identifying bottlenecks and scaling them
3) Web server framework that was selected emphasized ease of development over
scalability - a complete rewrite is required
All of the above incur significant development and upgrade costs!

If a system is not designed intrinsically to scale, then the downstream costs and
resources of increasing its capacity to meet requirements may be massive

Tenets of Scalable Architecture
Architectures evolve over time.
1. Avoid single points of failure: Have multiple application instances (Ex: EC2)
2. Scale horizontally, not vertically
3. Stateless - no client context stored on the server
4. Caching - Well-managed caching eliminates some client–server interactions,
improving scalability and performance
5. Loose coupling - Targeted, granular changes are possible, simplifying scaling
approaches
6. Role of observability - Scaling a system requires understanding its behavior.
Logs, metrics, and traces provide that information
7. Use of APIs
All of the above target availability, performance and reliability
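To make the caching tenet concrete, here is a minimal sketch of a time-to-live (TTL) cache placed in front of an expensive lookup. The class `TTLCache`, the loader `load_product` and the 30-second TTL are hypothetical names and values chosen for illustration; a production system would more likely use a shared cache service so that the application servers themselves stay stateless.

```python
import time
from typing import Any, Callable, Dict, Tuple

class TTLCache:
    """Tiny in-process cache; an entry is reused until it is ttl_s seconds old."""

    def __init__(self, ttl_s: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_s = ttl_s
        self.clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl_s:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = loader(key)              # the expensive backend call
        self._store[key] = (now, value)
        return value

backend_calls = []

def load_product(product_id: str) -> dict:
    backend_calls.append(product_id)     # stands in for a database query
    return {"id": product_id, "name": f"product-{product_id}"}

cache = TTLCache(ttl_s=30.0)
for _ in range(5):
    cache.get_or_load("42", load_product)
print(cache.hits, cache.misses, len(backend_calls))
```

Five requests for the same product reach the backend only once. The TTL is the knob that trades freshness for load: a longer TTL means fewer backend calls but staler data.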

Twelve Architectural Principles
1. N+1 Design. Never less than two of anything, and remember the rule of three.
2. Design for Rollback. Ensure you can roll back any release of functionality.
3. Design to Be Disabled. Be able to turn off anything you release.
4. Design to Be Monitored. Think about monitoring during design, not after.
5. Design for Multiple Live Sites. Don’t box yourself into one-site solutions.
6. Use Mature Technologies. Use things you know work well.
7. Asynchronous Design. Communicate synchronously only when absolutely necessary.
8. Stateless Systems. Use state only when the business return justifies it.
9. Scale Out Not Up. Never rely on bigger, faster systems.
10. Buy When Non Core. If you aren’t the best at building it and it doesn’t offer competitive
differentiation, buy it.
11. Commodity Hardware. Cheaper is better most of the time.
12. Design for at Least Two Axes. Think one step ahead of your scale needs.
Ref: R2 – The Art of Scalability


Scale Cubes: 3 ways to scale an application
The scale cube defines three separate ways to scale an application:
• X-axis scaling load balances requests across multiple, identical instances;
• Z-axis scaling routes requests based on an attribute of the request;
• Y-axis scaling functionally decomposes an application into services.
Source: “Microservices Patterns” by Chris Richardson

Scale Cubes
• X-axis application splits scale linearly with transaction growth. They do not help with the growth in
code complexity, customers, or data. X-axis splits are “clones” of each other.
• Y-axis application splits help scale code complexity as well as transaction growth. They are mostly
meant for code scale because they are not as efficient as x-axis splits for transaction growth.
• Y-axis splits tend to be more costly to implement than x-axis splits because of the engineering time
needed to separate monolithic code bases.
• Y-axis splits aid in fault isolation.
• Z-axis application splits help scale customer growth, some elements of data growth, and
transaction growth.
• As with y-axis splits, z-axis splits aid in fault isolation.
• The choice of when to use which method or axis of scale is both art and science.
• Production data should be used over time to help inform the decision.
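The three axes can be sketched as three routing decisions. In the example below, the service names, instance names and the use of a customer id as the z-axis attribute are all hypothetical; the point is only to show how each axis picks a target.

```python
import hashlib
from itertools import cycle

# Y-axis: the application is split by function; each service has its own pool.
SERVICES = {
    "search": ["search-1", "search-2", "search-3"],
    "billing": ["billing-1", "billing-2"],
}

# X-axis: identical clones of one service behind a round-robin load balancer.
_round_robin = {name: cycle(pool) for name, pool in SERVICES.items()}

def route_x(service: str) -> str:
    """Pick the next clone in turn; any clone can serve any request."""
    return next(_round_robin[service])

def route_z(service: str, customer_id: str) -> str:
    """Pick an instance from an attribute of the request (here, the customer id)."""
    pool = SERVICES[service]
    digest = hashlib.sha256(customer_id.encode()).digest()
    return pool[int.from_bytes(digest[:8], "big") % len(pool)]

print([route_x("search") for _ in range(4)])
print(route_z("search", "alice") == route_z("search", "alice"))
```

With x-axis routing, consecutive requests rotate through the clones. With z-axis routing, the same customer always lands on the same instance, which is what allows each instance to hold only a subset of the data.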

Scale Cubes: X-Axis Scaling

Scale Cubes : Z-Axis Scaling

Scale Cubes : Y-Axis Scaling

Scale Cubes: Three-Axis Split

CAP Theorem
CAP Theorem states that a distributed system can only guarantee two out of the three
characteristics: Consistency, Availability and Partition Tolerance
• Consistency is the ability for the system to read from any replicated service instance,
and always return the most recent write made across replicated services of the same type.
• Availability is the ability for a service to execute a request and receive a response
within a reasonable time interval without timing out or throwing an error.
• A network partition occurs when one or more of the application's services become
unreachable. Partition Tolerance is the application's ability to continue to function
in the event a network partition occurs.

CAP Theorem
A distributed system always needs to be partition tolerant: we should not build a
system in which a network partition brings down the whole system.
So, a distributed system is always built to be Partition Tolerant.
In simple words, the CAP theorem means that if there is a network partition and you want
your system to keep functioning, you can provide either Availability or Consistency, but
not both.
https://bikas-katwal.medium.com/mongodb-vs-cassandra-vs-rdbms-where-do-they-stand-in-the-cap-theorem-1bae779a7a15

Dealing with Network Partitions
Sacrifice Consistency - AP (maintain Availability and Partition tolerance): In this
scenario we allow writes on both sides of the partition. When the partition has been
resolved, both sides of the partition need to merge their data to return to a
consistent state.
Sacrifice Availability - CP (maintain Consistency and Partition tolerance): In this
scenario, after a partition has occurred, one side of the partition is disabled to
prevent inconsistency. The remaining partition remains consistent, and the
application continues with a lower degree of availability.
CA is not possible in distributed systems - it requires a single process
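The two choices can be illustrated with a toy two-replica cluster. This is a simplified sketch, not how any particular database works: the `Cluster` class is hypothetical, replica "b" is assumed to be the side that gets disabled in CP mode, and a single shared version counter stands in for real timestamps so that a last-write-wins merge is possible.

```python
class Unavailable(Exception):
    """Raised when a replica refuses a request to protect consistency."""

class Cluster:
    """Two replicas: 'a' stays enabled, 'b' is the side disabled in CP mode."""

    def __init__(self, mode: str):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.partitioned = False
        self.version = 0                      # simplification: one global counter
        self.replicas = {"a": {}, "b": {}}    # key -> (version, value)

    def _check(self, replica: str) -> None:
        if self.partitioned and self.mode == "CP" and replica == "b":
            raise Unavailable("replica b is disabled during the partition")

    def write(self, replica: str, key: str, value: str) -> None:
        self._check(replica)
        self.version += 1
        targets = [replica] if self.partitioned else ["a", "b"]
        for name in targets:
            self.replicas[name][key] = (self.version, value)

    def read(self, replica: str, key: str):
        self._check(replica)
        entry = self.replicas[replica].get(key)
        return None if entry is None else entry[1]

    def heal(self) -> None:
        """End the partition and merge, keeping the highest version per key."""
        self.partitioned = False
        a, b = self.replicas["a"], self.replicas["b"]
        for key in set(a) | set(b):
            winner = max(a.get(key, (0, None)), b.get(key, (0, None)),
                         key=lambda entry: entry[0])
            a[key] = b[key] = winner

ap = Cluster("AP")
ap.partitioned = True
ap.write("a", "cart", "phone")
ap.write("b", "cart", "tablet")
print(ap.read("a", "cart"), ap.read("b", "cart"))  # replicas disagree
ap.heal()
print(ap.read("a", "cart"), ap.read("b", "cart"))  # replicas agree again
```

In AP mode both sides keep accepting writes and diverge until `heal()` merges them. Constructing `Cluster("CP")` instead makes every read or write on replica "b" raise `Unavailable` while the partition lasts, so the data never diverges but part of the system stops serving requests.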

Eventual vs Strong Consistency
Strong Consistency - simply means that all the nodes across the world should
contain the same value for an entity at any point in time.
Eventual Consistency - at some point the system will let all users read the most
recently made updates, but not all users will see them immediately. This greatly reduces
the synchronization needed in our application and allows the system to remain available
to all users, even in the face of network partitions - during the partition, users may see
inconsistent data, but as the partition heals the consistency of the system will
eventually be restored. [Decide what level of staleness is acceptable]
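Eventual consistency can also be seen without any partition, as plain replication lag. The sketch below assumes a single primary that ships its updates to a read replica in the background; the class names and the `sync` method are hypothetical, and the number of pending updates is used as a simple measure of staleness.

```python
from collections import deque

class Primary:
    def __init__(self):
        self.data = {}
        self.log = deque()           # updates not yet applied on the replica

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

class ReadReplica:
    def __init__(self, primary: Primary):
        self.primary = primary
        self.data = {}

    def read(self, key):
        return self.data.get(key)    # may be stale

    def lag(self) -> int:
        return len(self.primary.log) # pending updates, a measure of staleness

    def sync(self, max_updates=None) -> None:
        """Apply pending updates in order; run periodically in the background."""
        pending = len(self.primary.log)
        n = pending if max_updates is None else min(max_updates, pending)
        for _ in range(n):
            key, value = self.primary.log.popleft()
            self.data[key] = value

primary = Primary()
replica = ReadReplica(primary)
primary.write("likes", 1)
primary.write("likes", 2)
print(replica.read("likes"), replica.lag())  # stale read, two updates behind
replica.sync()
print(replica.read("likes"), replica.lag())  # caught up
```

Writes return as soon as the primary has them, which keeps the write path fast and available. Readers on the replica may briefly see old values, and the acceptable `lag()` is the staleness budget mentioned above.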

Guidelines for building highly scalable systems
1. Avoid shared resources, as they might become bottlenecks
2. Avoid slow services
3. Scaling the data tier is tricky
4. Cache is the key
5. Monitoring is important

Architecture’s Scalability Requirements
Identify scalability requirements early in the software life cycle so that the
architectural framework can become sound as development proceeds.
System scalability criteria could include the ability to accommodate:
• Increasing number of users [Ex: Facebook]
• Increasing number of transactions per millisecond [Ex: Banking applications]
• Increase in the amount of data [Ex: Instagram Reels / Amazon Prime / Flipkart]

Challenges for Scalability
1. Centralized Approach
2. Synchronous Communication
3. Cost

Case Study 1 - Uber
Uber Monolithic Architecture (Previous)
• A REST API through which the passenger and the driver connect.
• Three different adapters, each with an API inside it, perform actions such as
billing, payments, and sending the emails/messages that we see when we book a cab.
• A MySQL database to store all their data.
Source - https://dzone.com/articles/microservice-architecture-learn-build-and-deploy-a

Basic Microservices Architecture

Case Study 1 - Uber
Challenges of Monolithic Application
1. All the features had to be rebuilt, deployed and tested again and again to
update a single feature.
2. Fixing bugs became extremely difficult in a single repository, as developers had
to change the same code again and again.
3. Scaling existing features while also introducing new features worldwide was
difficult to manage together.

Case Study 1 - Uber
Uber Microservices Architecture (Current)

Case Study 1 - Uber
Advantages of Microservices Architecture
1. The units are individual, separately deployable units performing separate functions.
For example: If you want to change anything in the billing microservice, you only have to deploy the
billing microservice and do not have to deploy the others.
2. All the features are now scaled individually, i.e. the interdependency between features
was removed.
For example, the number of people searching for cabs is considerably higher than the number of
people actually booking a cab and making payments. From this we can infer that the number of
processes working on the passenger management microservice is greater than the number of processes
working on payments.

Case Study 2 - Netflix
• Netflix is a subscription-based video-on-demand OTT streaming service
• Hosts a large catalogue of movies, television content and web series
• Has 180 Million subscribers across 200+ countries
What is Netflix Architecture, and how does it provide scalability?
Source - https://www.geeksforgeeks.org/system-design-netflix-a-complete-architecture/

• Open Connect CDN (Content Delivery Network)
• AWS Cloud
• Client device – Laptop / Mobile phone
• Backend services (includes database)
Source - https://www.geeksforgeeks.org/system-design-netflix-a-complete-architecture/

Video Encoding

• AWS Services – Login, Recommendation, Search, User History, Home Page,
Billing, Customer Support
Other components
• ZUUL – Gateway service that provides dynamic routing based on the incoming
request
• Inbound filter – Authentication, routing
• Endpoint filter – Returns a static response or forwards the request to a backend service
• Outbound filter – Compresses (zips) the content and calculates metrics, then sends the
response to the Netty web server and on to the client
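The inbound, endpoint and outbound stages can be pictured as a small pipeline. The sketch below is written in Python purely for illustration and does not use Zuul's actual API; the routes, the backend handlers and the demo token are hypothetical.

```python
import gzip
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

@dataclass
class Request:
    path: str
    headers: Dict[str, str] = field(default_factory=dict)
    backend: Optional[str] = None

@dataclass
class Response:
    status: int
    body: bytes
    headers: Dict[str, str] = field(default_factory=dict)

# Hypothetical routing table and backend services.
ROUTES = {"/search": "search-service", "/billing": "billing-service"}
BACKENDS: Dict[str, Callable[[Request], Response]] = {
    "search-service": lambda req: Response(200, b"search results"),
    "billing-service": lambda req: Response(200, b"invoice"),
}
metrics = {"responses": 0}

def inbound_filter(req: Request) -> Optional[Response]:
    """Authenticate, then choose a backend from the request path."""
    if req.headers.get("Authorization") != "Bearer demo-token":
        return Response(401, b"unauthorized")
    req.backend = ROUTES.get(req.path)
    return None

def endpoint_filter(req: Request) -> Response:
    """Return a static response or forward to the chosen backend."""
    if req.backend is None:
        return Response(404, b"no route")
    return BACKENDS[req.backend](req)

def outbound_filter(resp: Response) -> Response:
    """Compress the body and record metrics before replying."""
    resp.body = gzip.compress(resp.body)
    resp.headers["Content-Encoding"] = "gzip"
    metrics["responses"] += 1
    return resp

def handle(req: Request) -> Response:
    early = inbound_filter(req)
    resp = early if early is not None else endpoint_filter(req)
    return outbound_filter(resp)

ok = handle(Request("/search", {"Authorization": "Bearer demo-token"}))
print(ok.status, gzip.decompress(ok.body))
```

Keeping authentication, routing and compression in the gateway means the backend services do not each have to implement them, and new routes can be added without touching the clients.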

Other components (contd)
Microservices
• Critical services – searching for a video, navigating to a video, playing a video
• Keep the critical services independent of other services
Database
• MySQL (RDBMS) – Data for billing information, user information and transaction
information. Deployed on EC2 instances.
• Cassandra (NoSQL) – Viewing history data (over 50 Cassandra clusters with
500 nodes)

References
1) Building Scalable Systems by Ian Gorton
2) The Art of Scalability: Scalable Web Architecture, Processes, and Organizations
for the Modern Enterprise, Second Edition by Michael T. Fisher and Martin L. Abbott,
Addison-Wesley Professional, 2015
3) Microservices Patterns by Chris Richardson, Manning Publications, 2018

Summary
1. What is Scalability
2. Need for Scalable Architectures
3. Principles of Scalability
4. Scale Cube
5. CAP Theorem
6. Guidelines for Building Highly Scalable Systems
7. Architecture’s Scalability Requirements
8. Challenges for Scalability
9. Case Study – Uber / Netflix

Thank You!
