
Helicopter Racing League

Company overview
Helicopter Racing League (HRL) is a global sports league for competitive helicopter racing. Each
year HRL holds the world championship and several regional league competitions where teams
compete to earn a spot in the world championship. HRL offers a paid service to stream the
races all over the world with live telemetry and predictions throughout each race.

What do we know after reading the Overview?

● Users subscribe to the service, something similar to:
○ https://www.peacocktv.com/sports/wwe
● Telemetry data is being streamed (see the publisher sketch below)
○ Speed, latitude, longitude, other metrics
● What would the predictions be based on?
○ The telemetry data, maybe weather (wind/rain), pilot, other?
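
As a concrete sketch of that telemetry stream, here is a minimal Pub/Sub publisher a race-side agent might run. The project, topic name, and message fields are illustrative assumptions, not from the case study.

```python
# Hypothetical telemetry publisher - project, topic, and fields are assumptions.
import json
import time

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("hrl-project", "race-telemetry")  # assumed names

def publish_telemetry(heli_id: str, speed: float, lat: float, lon: float) -> None:
    """Publish one telemetry sample as a JSON message."""
    message = {
        "heli_id": heli_id,
        "speed_kts": speed,
        "lat": lat,
        "lon": lon,
        "ts": time.time(),
    }
    future = publisher.publish(topic_path, json.dumps(message).encode("utf-8"))
    future.result()  # block until Pub/Sub acknowledges the message

publish_telemetry("HRL-042", 161.5, 47.61, -122.33)
```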

Reference documents:
● Dynamic schema mapping for IoT streaming ingestion
● Real-time data processing with IoT Core
● Why companies should do their own video processing:
https://medium.com/@saibalaji4/do-it-yourself-video-solutions-on-google-cloud-884836c0faf8

Solution concept
HRL wants to migrate their existing service to a new platform to expand their use of managed
AI and ML services to facilitate race predictions. Additionally, as new fans engage with the
sport, particularly in emerging regions, they want to move the serving of their content, both
real-time and recorded, closer to their users. (Recorded content – Cloud Storage in areas close
to their users. CDN for caching at edge locations)

What do we know after reading the Solution concept?

● Will migrate all existing services to Google Cloud
● Want to expand usage of AI and ML for better race predictions
● Want to be able to serve content closer to end-users
○ Streaming data - races as they are happening
○ Recorded data - past events
Existing technical environment
HRL is a public cloud-first company; the core of their mission-critical applications runs on their
current public cloud provider. Video recording and editing is performed at the race tracks, and
the content is encoded and transcoded, where needed, in the cloud.

Enterprise-grade connectivity and local compute are provided by truck-mounted mobile data
centers. Their race prediction services are hosted exclusively on their existing public cloud
provider. Their existing technical environment is as follows:

● Existing content is stored in an object storage (Cloud Storage) service on their existing
public cloud provider.

● Video encoding and transcoding is performed on VMs created for each job. (Compute
Engine w/GPUs, or could be done on a GKE cluster with GPUs)

● Race predictions are performed using TensorFlow running on VMs in the current public
cloud provider. (TensorFlow on Google Cloud)

What do we know about the existing technical environment?

● Have mobile data centers on trucks
○ VMs are doing video editing - probably encoding the videos to reduce their size
prior to sending them to the cloud provider
● Using another cloud provider
○ Content is stored in object storage
○ VMs run TensorFlow to create race predictions
■ How is the telemetry data being transmitted?
○ Another set of VMs do video encoding/transcoding
● “Enterprise” connectivity between the truck data centers and the cloud provider
Business requirements
HRL’s owners want to expand their predictive capabilities and reduce latency for their viewers in
emerging markets. Their requirements are:

● Support ability to expose the predictive models to partners
o APIs - Apigee
▪ A powerful video API is crucial for broadcasters that want complete
control over their video streaming workflow. Developers can use APIs to
integrate video content into nearly any website, mobile app, business
software, or cloud application

● Increase predictive capabilities during and before races:
o Race results
▪ Vertex AI/AI Platform, TensorFlow
o Mechanical failures
▪ Does this data exist as of now?
o Crowd sentiment
▪ Natural Language API (see the sentiment sketch after this list)

● Increase telemetry and create additional insights
o Vertex AI/AI Platform

● Measure fan engagement with new predictions.
o Video Intelligence API

● Enhance global availability and quality of the broadcasts.
o Cloud Storage & CDN for prior content
o Managed Instance Groups or GKE for transcoding

● Increase the number of concurrent viewers.
o Cloud Storage & CDN for prior content

● Minimize operational complexity.
o Global load balancing
o Automatic scaling with Managed Instance Groups and/or GKE

● Ensure compliance with regulations
o Follow best practices
● Create a merchandising revenue stream.
o Video Intelligence API
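
For the crowd sentiment item above, here is a minimal sketch of scoring a fan post with the Natural Language API. How HRL would collect fan posts is not specified in the case study; the input text is just an example.

```python
# Minimal crowd-sentiment sketch using the Natural Language API.
from google.cloud import language_v1

client = language_v1.LanguageServiceClient()

def fan_sentiment(text: str) -> float:
    """Return a sentiment score in [-1.0, 1.0] for one fan post."""
    document = language_v1.Document(
        content=text, type_=language_v1.Document.Type.PLAIN_TEXT
    )
    response = client.analyze_sentiment(request={"document": document})
    return response.document_sentiment.score

print(fan_sentiment("That overtake on the final lap was unbelievable!"))
```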

Technical requirements
● Maintain or increase prediction throughput and accuracy.
o Improve the machine learning model via iterative development and testing:
https://cloud.google.com/learn/what-is-predictive-analytics

● Reduce viewer latency.
o Multi-regional Cloud Storage and CDN

● Increase transcoding performance.
o Machines with GPUs, local SSDs
o Transcoder API a possibility (see the sketch after this list)

● Create real-time analytics of viewer consumption patterns and engagement.
o Data analytics in BigQuery and presentation in Data Studio/Looker
o Make more educated decisions: what promotions to run, which content to
suggest and what region to enter next
o https://cloud.google.com/solutions/stream-analytics

● Create a data mart to enable processing of large volumes of race data.
o Cloud Storage, partitioning in BigQuery, or another database product
o https://cloud.google.com/solutions/smart-analytics
o https://cloud.google.com/learn/what-is-a-data-lake
o https://cloud.google.com/architecture/build-a-data-lake-on-gcp
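
Since the Transcoder API shows up in the transcoding bullet, here is a minimal sketch of submitting a job with a built-in preset. The project, region, and bucket paths are assumptions.

```python
# Hypothetical Transcoder API job - project, region, and buckets are assumptions.
from google.cloud.video import transcoder_v1

client = transcoder_v1.TranscoderServiceClient()
parent = "projects/hrl-project/locations/us-central1"  # assumed project/region

job = transcoder_v1.Job(
    input_uri="gs://hrl-raw-footage/race-2024-01.mp4",  # assumed bucket
    output_uri="gs://hrl-transcoded/race-2024-01/",     # assumed bucket
    template_id="preset/web-hd",  # built-in preset for web-friendly renditions
)

response = client.create_job(parent=parent, job=job)
print(f"Created transcoding job: {response.name}")
```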

Executive statement
Our CEO, S. Hawke, wants to bring high-adrenaline racing to fans all around the world. We listen
to our fans, and they want enhanced video streams that include predictions of events within
the race (e.g., overtaking). Our current platform allows us to predict race outcomes but lacks
the facility to support real-time predictions during races and the capacity to process
season-long results.
EHR Healthcare
Company overview

EHR Healthcare is a leading provider of electronic health record software to the
medical industry. EHR Healthcare provides their software as a service to
multi-national medical offices, hospitals, and insurance providers.

Solution concept

Due to rapid changes in the healthcare and insurance industry, EHR Healthcare’s
business has been growing exponentially year over year. They need to be able to
scale their environment, adapt their disaster recovery plan, and roll out new
continuous deployment capabilities to update their software at a fast pace.
Google Cloud has been chosen to replace their current colocation facilities.

Existing technical environment

EHR’s software is currently hosted in multiple colocation facilities. The lease on
one of the data centers is about to expire.

Customer-facing applications are web-based, and many have recently been
containerized to run on a group of Kubernetes clusters (GKE). Data is stored in a
mixture of relational and NoSQL databases (MySQL, MS SQL Server (Cloud SQL?),
Redis (Cloud Memorystore), and MongoDB (https://cloud.google.com/mongodb,
or a VM with MongoDB installed, or Cloud Marketplace)).

EHR is hosting several legacy file and API-based integrations with insurance
providers on-premises. These systems are scheduled to be replaced over the next
several years. There is no plan to upgrade or move these systems at the current
time.

Users are managed via Microsoft Active Directory.

● If interested in moving Active Directory to Google Cloud:
https://cloud.google.com/managed-microsoft-ad
● Else can integrate with on-prem AD using federation:
https://cloud.google.com/architecture/identity/federating-gcp-with-active-directory-introduction
o Google Cloud Directory Sync:
https://tools.google.com/dlpage/dirsync/
o Cloud Identity: https://cloud.google.com/identity/docs/overview

Monitoring is currently being done via various open-source tools. Alerts are sent
via email and are often ignored.

The alerts are probably ignored because they are not based on SLOs. Will
probably replace/supplement their monitoring with Cloud Operations Monitoring
and Logging. Can export logs to third-party tools (see the log sink sketch below).
https://cloud.google.com/architecture/integrating-monitoring-logging-trace-observability-and-alerting
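
As a sketch of the "export logs" idea, here is a minimal Cloud Logging sink that routes error-level entries to BigQuery. The project, sink name, filter, and dataset are assumptions.

```python
# Hypothetical log sink to BigQuery - names and filter are assumptions.
from google.cloud import logging

client = logging.Client(project="ehr-prod")  # assumed project

sink = client.sink(
    "ehr-error-sink",            # assumed sink name
    filter_="severity>=ERROR",   # forward only error-level entries
    destination="bigquery.googleapis.com/projects/ehr-prod/datasets/ops_logs",
)

if not sink.exists():
    sink.create()
    print(f"Created sink {sink.name}")
```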

Business requirements

● On-board new insurance providers as quickly as possible.
o Not sure exactly what this means - possibly automating the setup of
provider integrations and data ingestion.
● Provide a minimum 99.9% availability for all customer-facing systems.
o https://cloud.google.com/kubernetes-engine/sla
▪ Cannot use zonal GKE deployments – must be regional
● https://cloud.google.com/kubernetes-engine/docs/concepts/regional-clusters
● Provide centralized visibility and proactive action on system performance
and usage.
o Cloud Operations; Dashboards; Alert Policies, Logging, maybe exports
to BigQuery
● Increase ability to provide insights into healthcare trends.
o Healthcare analytics
▪ https://cloud.google.com/solutions/healthcare-life-sciences
▪ https://cloudonair.withgoogle.com/events/hcls-healthcare-analytics-webinars
● https://services.google.com/fh/files/misc/8kmiles_summary.pdf
● Reduce latency to all customers.
o Regional deployments of GKE clusters
o Web app – probably make use of CDN
● Maintain regulatory compliance.
o HIPAA: https://cloud.google.com/security/compliance/hipaa
● Decrease infrastructure administration costs.
o Autoscaling to increase/decrease nodes as needed
● Make predictions and generate reports on industry trends based on
provider data.
o Load data: Cloud Storage > Pub/Sub > Dataflow > TensorFlow (?) (see the
pipeline sketch after this list)
o https://cloud.google.com/dataflow/docs/samples/molecules-walkthrough
o https://www.guru99.com/what-is-tensorflow.html
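
Here is a minimal sketch of the Pub/Sub > Dataflow > BigQuery leg of that pipeline as an Apache Beam job. The topic, table, and schema are assumptions; a real job would also pass Dataflow runner options.

```python
# Hypothetical streaming pipeline: Pub/Sub -> Dataflow (Beam) -> BigQuery.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# In practice, also pass --runner=DataflowRunner, project, region, etc.
options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadProviderData" >> beam.io.ReadFromPubSub(
            topic="projects/ehr-prod/topics/provider-data"  # assumed topic
        )
        | "ParseJson" >> beam.Map(json.loads)  # messages assumed to match the schema
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "ehr-prod:analytics.provider_records",  # assumed table
            schema="provider_id:STRING,record_type:STRING,payload:STRING",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```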

Technical requirements

● Maintain legacy interfaces to insurance providers with connectivity to both
on-premises systems and cloud providers.
o Cloud Interconnect? VPN?
● Provide a consistent way to manage customer-facing applications that are
container-based.
o Kubernetes Engine, Anthos
● Provide a secure and high-performance connection between on-premises
systems and Google Cloud.
o Cloud Interconnect
● Provide consistent logging, log retention, monitoring, and alerting
capabilities.
o Cloud Operations
● Maintain and manage multiple container-based environments.
o Anthos
● Dynamically scale and provision new environments.
o Provisioning – Terraform/Deployment Manager
● Create interfaces to ingest and process data from new providers.
o Pub/Sub, Dataflow (?)

Executive statement

Our on-premises strategy has worked for years but has required a major
investment of time and money in training our team on distinctly different systems,
managing similar but separate environments, and responding to outages. Many of
these outages have been a result of misconfigured systems, inadequate capacity
to manage spikes in traffic, and inconsistent monitoring practices. We want to use
Google Cloud to leverage a scalable, resilient platform that can span multiple
environments seamlessly (Anthos) and provide a consistent and stable user
experience that positions us for future growth.
TerramEarth
Company overview

TerramEarth manufactures heavy equipment for the mining and agricultural
industries. They currently have over 500 dealers and service centers in 100
countries. Their mission is to build products that make their customers more
productive.

Solution concept

There are 2 million TerramEarth vehicles in operation currently, and we see 20%
yearly growth. Vehicles collect telemetry data from many sensors during
operation. A small subset of critical data is transmitted from the vehicles in real
time to facilitate fleet management. The rest of the sensor data is collected,
compressed, and uploaded daily when the vehicles return to home base. Each
vehicle usually generates 200 to 500 megabytes of data per day.

The critical data is streaming in, so Pub/Sub & Dataflow are probably needed.
Data is time-series data, which means that each record has a time stamp,
identifiers indicating the piece of equipment that generated the data, and a series
of metrics. Bigtable is a good option when you need to write large volumes of data
in real time at low latency. Bigtable has support for regional replication, which
improves availability.
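
As a sketch of that Bigtable write path, here is a minimal example of writing one telemetry sample. The project, instance, table, column family, and row-key scheme (vehicle ID plus reversed timestamp to spread writes) are assumptions.

```python
# Hypothetical Bigtable write for vehicle time-series data.
import datetime
import sys

from google.cloud import bigtable

client = bigtable.Client(project="terramearth-prod")            # assumed project
table = client.instance("telemetry").table("vehicle-metrics")   # assumed names

def write_sample(vehicle_id: str, metric: str, value: float) -> None:
    now = datetime.datetime.now(datetime.timezone.utc)
    # Reversed timestamp keeps recent rows spread out and newest-first in scans.
    reversed_ts = sys.maxsize - int(now.timestamp() * 1000)
    row = table.direct_row(f"{vehicle_id}#{reversed_ts}")
    row.set_cell("metrics", metric, str(value), timestamp=now)  # assumed column family
    row.commit()

write_sample("TE-000123", "engine_temp_c", 96.4)
```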

Data uploaded upon return to home base: 500 megabytes * 2 million vehicles = 1
petabyte per day. From reading the links below, it looks like the Storage Transfer
Service for on-premises data is the correct option to choose (see the upload sketch
after the links). It looks like they already have Cloud Interconnect set up at their
data centers.
● https://cloud.google.com/architecture/migration-to-google-cloud-transferring-your-large-datasets
● https://cloud.google.com/storage-transfer/docs/overview#what-is-storage-transfer
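
For the upload itself, the Storage Transfer Service would do the heavy lifting at this scale; the sketch below only shows the per-vehicle idea with the Cloud Storage client. The project, bucket, and object layout are assumptions.

```python
# Hypothetical daily upload of one vehicle's compressed telemetry file.
import datetime

from google.cloud import storage

client = storage.Client(project="terramearth-prod")      # assumed project
bucket = client.bucket("terramearth-daily-telemetry")    # assumed bucket

def upload_daily_dump(vehicle_id: str, local_path: str) -> None:
    today = datetime.date.today().isoformat()
    blob = bucket.blob(f"{vehicle_id}/{today}.tar.gz")   # assumed object layout
    blob.upload_from_filename(local_path)                # resumable upload under the hood

upload_daily_dump("TE-000123", "/data/telemetry/2024-05-01.tar.gz")
```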

Existing technical environment

TerramEarth’s vehicle data aggregation and analysis infrastructure resides in
Google Cloud and serves clients from all around the world. A growing amount of
sensor data is captured from their two main manufacturing plants and sent to
private data centers that contain their legacy inventory and logistics management
systems. The private data centers have multiple network interconnects
configured to Google Cloud.

The web frontend for dealers and customers is running in Google Cloud and allows
access to stock management and analytics.

Business requirements

● Predict and detect vehicle malfunction and rapidly ship parts to dealerships
for just-in-time repair where possible. Cloud Storage – Pub/Sub – Dataflow.
Not quite sure where Dataflow will store the resulting data:
o BigQuery ML is a possibility (see the sketch after this list).
https://amygdala.github.io/gcp_blog/ml/automl/bigquery/2020/07/14/bqml_tables.html
● Decrease cloud operational costs and adapt to seasonality. Need more info
on what they are trying to optimize.
https://cloud.google.com/architecture/cost-efficiency-on-google-cloud
● Increase speed and reliability of development workflow. CI/CD -
https://cloud.google.com/architecture/continuous-delivery-toolchain-spinnaker-cloud
● Allow remote developers to be productive without compromising code or
data security. See below
● Create a flexible and scalable platform for developers to create custom API
services for dealers and partners.
https://cloud.google.com/architecture/serverless-integration-solution-for-google-marketing-platform
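
As a sketch of the BigQuery ML possibility from the malfunction-prediction bullet: the dataset, tables, label, and feature columns are all assumptions.

```python
# Hypothetical BigQuery ML model for malfunction prediction - all names assumed.
from google.cloud import bigquery

client = bigquery.Client(project="terramearth-prod")  # assumed project

# Train a simple classifier on daily telemetry.
client.query(
    """
    CREATE OR REPLACE MODEL `fleet.malfunction_model`
    OPTIONS (model_type = 'logistic_reg',
             input_label_cols = ['failed_within_7d']) AS
    SELECT engine_temp_c, vibration_rms, oil_pressure, hours_since_service,
           failed_within_7d
    FROM `fleet.telemetry_daily`
    """
).result()

# Score current fleet data with the trained model.
rows = client.query(
    """
    SELECT vehicle_id, predicted_failed_within_7d_probs
    FROM ML.PREDICT(MODEL `fleet.malfunction_model`,
                    (SELECT * FROM `fleet.telemetry_current`))
    """
).result()
```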

Technical requirements

● Create a new abstraction layer for HTTP API access to their legacy systems
to enable a gradual move into the cloud without disrupting operations.
● Modernize all CI/CD pipelines to allow developers to deploy
container-based workloads in highly scalable environments.
https://cloud.google.com/architecture/best-practices-continuous-integration-delivery-kubernetes
● Allow developers to run experiments without compromising security and
governance requirements
o https://cloud.google.com/blog/products/identity-security/configuring-secure-remote-access-for-compute-engine-vms
o https://www.helpnetsecurity.com/2020/04/23/google-remote-access-service/
● Create a self-service portal for internal and partner developers to create
new projects, request resources for data analytics jobs, and centrally
manage access to the API endpoints. Apigee
https://pages.apigee.com/rs/351-WXY-166/images/API-Best-Practices-ebook-2016-12.pdf (warning: this resource is from 2016)
● Use cloud-native solutions for keys and secrets management and optimize
for identity-based access. Cloud KMS and Secret Manager (see the sketch
after this list)
o https://cloud.google.com/secret-manager
o https://cloud.google.com/security-key-management
● Improve and standardize tools necessary for application and network
monitoring and troubleshooting. Cloud Operations
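
Here is a minimal sketch of reading a secret with Secret Manager, per the keys/secrets bullet. The project and secret IDs are assumptions.

```python
# Hypothetical Secret Manager read - project and secret IDs are assumptions.
from google.cloud import secretmanager

client = secretmanager.SecretManagerServiceClient()

def get_secret(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Fetch and decode one secret version."""
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version}"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("utf-8")

api_key = get_secret("terramearth-prod", "dealer-api-key")
```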

Executive statement

Our competitive advantage has always been our focus on the customer, with our
ability to provide excellent customer service and minimize vehicle downtimes.
After moving multiple systems into Google Cloud, we are seeking new ways to
provide best-in-class online fleet management services to our customers and
improve operations of our dealerships. Our 5-year strategic plan is to create a
partner ecosystem of new products by enabling access to our data, increasing
autonomous operation capabilities of our vehicles, and creating a path to move
the remaining legacy systems to the cloud.
Mountkirk Games
Company overview

https://cloud.google.com/architecture/deploying-xonotic-game-servers

Mountkirk Games makes online, session-based, multiplayer games for mobile
platforms. They have recently started expanding to other platforms after
successfully migrating their on-premises environments to Google Cloud.

Their most recent endeavor is to create a retro-style first-person shooter (FPS)
game that allows hundreds of simultaneous players to join a geo-specific digital
arena from multiple platforms (phones, PCs, etc.) and locations. A real-time digital
banner will display a global leaderboard of all the top players across every active
arena.

Solution concept

Mountkirk Games is building a new multiplayer game that they expect to be very
popular. They plan to deploy the game’s backend on Google Kubernetes Engine so
they can scale rapidly and use Google’s global load balancer to route players to
the closest regional game arenas. In order to keep the global leader board in sync,
they plan to use a multi-region Spanner cluster. (Could also use Memorystore)
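
As a sketch of the leaderboard write path on Spanner: the project, instance, database, table, and columns are assumptions. insert_or_update keeps the write idempotent, which suits frequent score updates.

```python
# Hypothetical Spanner leaderboard update - all names are assumptions.
from google.cloud import spanner

client = spanner.Client(project="mountkirk-prod")             # assumed project
database = client.instance("leaderboard").database("game")    # assumed names

def record_score(player_id: str, score: int) -> None:
    """Upsert one player's score in the assumed Scores table."""
    with database.batch() as batch:
        batch.insert_or_update(
            table="Scores",
            columns=("player_id", "score"),
            values=[(player_id, score)],
        )

record_score("player-77", 129350)
```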

Existing technical environment

The existing environment was recently migrated to Google Cloud, and five games
came across using lift-and-shift virtual machine migrations, with a few minor
exceptions.

Each new game exists in an isolated Google Cloud project nested below a folder
that maintains most of the permissions and network policies. Legacy games with
low traffic have been consolidated into a single project. There are also separate
environments for development and testing.

Business requirements

● Support multiple gaming platforms. (phone, desktop, tablet)
● Support multiple regions. (Clusters in multiple regions)
● Support rapid iteration of game features. CI/CD: Cloud Build, Artifact
Registry, Cloud Source Repositories, etc. Look at the GKE w/Spinnaker example:
https://cloud.google.com/docs/ci-cd
● Minimize latency. Deploy to multiple regions. Also CDN
● Optimize for dynamic scaling. GKE supports this – will use a custom metric:
https://cloud.google.com/kubernetes-engine/docs/tutorials/autoscaling-metrics#custom-metric
● Use managed services and pooled resources. Create multiple node pools in GKE
● Minimize costs. Scaling up/down as needed

Technical requirements

● Dynamically scale based on game activity. Custom metrics
● Publish scoring data on a near real-time global leaderboard. Cloud
Spanner/Memorystore
● Store game activity logs in structured files for future analysis. Cloud Storage
(see the load sketch after this list)
● Use GPU processing to render graphics server-side for multi-platform
support. GPU support is offered on Compute Engine and GKE (be sure to click the
link in this doc for GKE info):
https://cloud.google.com/blog/products/ai-machine-learning/cheaper-cloud-ai-deployments-with-nvidia-t4-gpu-price-cut
● Support eventual migration of legacy games to this new platform. TBD
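
For the structured activity logs bullet, here is a minimal sketch of batch-loading newline-delimited JSON logs from Cloud Storage into BigQuery for analysis. The bucket layout, dataset, and table are assumptions.

```python
# Hypothetical batch load of game activity logs (JSON) into BigQuery.
from google.cloud import bigquery

client = bigquery.Client(project="mountkirk-prod")  # assumed project

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    autodetect=True,  # infer the schema from the structured log files
)

load_job = client.load_table_from_uri(
    "gs://mountkirk-game-logs/arena-*/2024-05-01/*.json",  # assumed layout
    "mountkirk-prod.analytics.game_activity",              # assumed table
    job_config=job_config,
)
load_job.result()  # wait for the load to finish
```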

Executive statement

Our last game was the first time we used Google Cloud, and it was a tremendous
success. We were able to analyze player behavior and game telemetry in ways
that we never could before. Must be using Big Data related products already
(Pub/Sub, Dataflow, Dataproc, BigQuery, ML, other). We don’t have specifics
about this, but they may be mentioned in an exam question. This success allowed
us to bet on a full migration to the cloud and to start building all-new games using
cloud-native design principles. Our new game is our most ambitious to date and
will open up doors for us to support more gaming platforms beyond mobile.
Latency is our top priority, although cost management is the next most important
challenge. As with our first cloud-based game, we have grown to expect the cloud
to enable advanced analytics capabilities (ML) so we can rapidly iterate on our
deployments of bug fixes and new functionality.
