
About the Company

Motivation

90% of ML models never make it to production! For those that do, it takes most companies anywhere from 2-3 months. Compare this to the 15 minutes it takes teams at Facebook to deploy a model. Half of the models that companies do deploy break because of scaling issues, and a good share of the rest are discarded due to problems that can't be detected without the right monitoring and observability systems!

What do we do?

At TrueFoundry, we want to automate the parts of the ML pipeline that can be automated and empower data scientists and ML engineers to test models in production and iterate fast, with as few dependencies on other teams as possible. We draw our motivation from products created by platform teams at top tech companies like Facebook, Google, and Netflix, which allow all teams to move much faster and to deploy and iterate on their own!

Our workflow stores provide structure to data scientists' code, make it production-ready, and automatically deploy model APIs with autoscaling enabled, monitoring and observability dashboards, and business-user-facing demos.

Basically, you can go from model to live in 15 minutes, with best practices built in!

Mission

Our mission is to make machine learning productionisation and monitoring simpler for everyone. Developers (software engineers, ML engineers, data scientists) should just focus on writing business logic at very high iteration speed, and everything else should be abstracted away from them!

Vision

Imagine a flow where, after writing the code, you can call a genie and describe your requirements: the kind of service (serverless, cron job, database, API service), resource requirements like CPU and memory, and so on. The genie then creates the service with best practices like GitOps and Infrastructure as Code (IaC) and shows a dashboard with all the metrics it created.

ML Intern Assignment
HuggingFace Model Deployment

Background:
● TrueFoundry currently supports deploying Hugging Face models directly with just a few clicks.
● When the model is deployed (you need to get access to the platform [1] and deploy the model on our platform [2]), the API endpoints are exposed with the V2 Inference Protocol. This is because we use Seldon's MLServer internally to deploy the model. You can read about the V2 Inference Protocol here.

● The problem with the V2 protocol is that it is difficult to understand and unintuitive for the user, so we want to improve this user experience.

Goal:
● The goal of this assignment is to write a FastAPI service that takes input for the model (in the same format as the Inference Endpoints API of Hugging Face models on the Hugging Face Hub website), internally converts the input to the V2 inference protocol, and returns the response.

● Since the input format will be different for different Hugging Face pipelines, you should expose the final API as:

POST https://<your-service-endpoint>/predict/

with the body:

JSON
{
    "hf_pipeline": string,
    "model_deployed_url": string,
    "inputs": any,
    "parameters": {...}
}
where:

○ hf_pipeline is the Hugging Face pipeline name, like text-classification
○ model_deployed_url is the deployed endpoint of the Hugging Face model (which you will paste from TrueFoundry's dashboard)
○ inputs and parameters vary according to hf_pipeline. You can find the inference API format of Hugging Face models here:
https://huggingface.co/docs/inference-endpoints/supported_tasks
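
As a rough sketch (not part of the spec), the service interface could look like this. The class name PredictRequest and the SUPPORTED_PIPELINES set are illustrative choices, not prescribed by the assignment:

Python
# Minimal outline of the /predict/ service described above.
# PredictRequest and SUPPORTED_PIPELINES are illustrative names,
# not part of the assignment specification.
from typing import Any, Dict, Optional

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

SUPPORTED_PIPELINES = {
    "text-generation",
    "zero-shot-classification",
    "token-classification",
    "object-detection",
}


class PredictRequest(BaseModel):
    hf_pipeline: str                 # e.g. "text-generation"
    model_deployed_url: str          # endpoint copied from TrueFoundry's dashboard
    inputs: Any                      # shape depends on the pipeline
    parameters: Optional[Dict[str, Any]] = None


@app.post("/predict/")
def predict(request: PredictRequest) -> Dict[str, Any]:
    if request.hf_pipeline not in SUPPORTED_PIPELINES:
        raise HTTPException(status_code=400,
                            detail=f"Unsupported pipeline: {request.hf_pipeline}")
    # Convert the Hugging Face-style input to a V2 payload, POST it to
    # request.model_deployed_url, and return the model's response
    # (conversion sketches appear later in this document).
    raise NotImplementedError

A client would then call the service with a plain POST, e.g. requests.post(f"{service_url}/predict/", json={...}).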

Additional Information:
● You should support the following four Hugging Face pipelines. We have also mentioned one model for each pipeline that you can use for testing:
1. sshleifer/tiny-gpt2 [text-generation]

Input Format for your service:
https://huggingface.co/docs/inference-endpoints/supported_tasks#text-generation

JSON

{
    "hf_pipeline": "text-generation",
    "model_deployed_url": string,
    "inputs": string,
    "parameters": {
        // ... any keyword arguments accepted by text-generation
    }
}
2. typeform/distilbert-base-uncased-mnli [zero-shot-classification]

Input Format for your service:
https://huggingface.co/docs/inference-endpoints/supported_tasks#zero-shot-classification

JSON

{
    "hf_pipeline": "zero-shot-classification",
    "model_deployed_url": string,
    "inputs": string,
    "parameters": {
        "candidate_labels": string[],
        // ... any keyword arguments accepted by zero-shot-classification
    }
}

3. d4data/biomedical-ner-all [token-classification]

Input Format for your service:
https://huggingface.co/docs/inference-endpoints/supported_tasks#token-classification

JSON

{
    "hf_pipeline": "token-classification",
    "model_deployed_url": string,
    "inputs": string,
    "parameters": {
        // ... any keyword arguments accepted by token-classification
    }
}
4. TahaDouaji/detr-doc-table-detection [object-detection]

Input Format for your service:
https://huggingface.co/docs/inference-endpoints/supported_tasks#object-detection

JSON

"hf_pipeline": "object-detection",

"model_deployed_url": string,

"inputs": string, // http url to image OR base64 encoded


image bytes

"parameters": {

// ... Any keyword arguments accepted by


object-detection

● The model deployment on TrueFoundry uses Seldon's MLServer in the background. To understand the input format for each pipeline, you can refer to this file:
https://github.com/SeldonIO/MLServer/blob/master/runtimes/huggingface/mlserver_huggingface/metadata.py
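
To make this concrete, here is a hedged sketch of the conversion for two of the pipelines. The input tensor name ("args"), the BYTES datatype, and the assumption that model_deployed_url accepts a V2-style POST (.../v2/models/<model-name>/infer) are based on the V2 protocol and MLServer's Hugging Face runtime conventions; verify them against the metadata file above before relying on them:

Python
# Hedged sketch: Hugging Face-style inputs -> V2 inference payloads.
# The "args" input name and BYTES datatype are assumptions based on
# MLServer's huggingface runtime (see metadata.py linked above).
import base64
from typing import Any, Dict

import requests


def text_generation_to_v2(inputs: str, parameters: Dict[str, Any]) -> Dict[str, Any]:
    # A single prompt string becomes one BYTES tensor named "args".
    return {
        "inputs": [
            {"name": "args", "shape": [1], "datatype": "BYTES", "data": [inputs]}
        ],
        "parameters": parameters or {},
    }


def load_image_bytes(inputs: str) -> bytes:
    # object-detection accepts either an HTTP(S) URL or base64-encoded bytes.
    if inputs.startswith(("http://", "https://")):
        resp = requests.get(inputs, timeout=30)
        resp.raise_for_status()
        return resp.content
    return base64.b64decode(inputs)


def call_v2_endpoint(model_deployed_url: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    # Assumes model_deployed_url points at the V2 inference route,
    # i.e. .../v2/models/<model-name>/infer.
    resp = requests.post(model_deployed_url, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()

The V2 protocol also exposes GET /v2/models/<model-name>, which returns the model's input/output schema and can be used to discover the expected tensor names instead of hardcoding them.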

Starter Template and Test Runner:
https://gist.github.com/chiragjn/62d9b0612c5e05a09d62b1b956a6154a

Evaluation Points:
● Try to support as many different pipelines as possible (from the list shown above).
● Try to make your code modular and write it in a way that can be extended to any other Hugging Face pipeline (one possible structure is sketched after this section).
● Finally, at the time of submission, submit a deployed link for any one of the pipelines, hosted on TrueFoundry. Relevant link:
https://docs.truefoundry.com/docs/deploying-your-first-service
For the host value in the Port section, follow this format:

Unset
https://<your-app-name>-<workspace-name>-<port>.demo1.truefoundry.com

The New Service form on the UI can help you generate this value.
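
On the modularity point above, one possible structure (purely illustrative, with hypothetical names) is a registry mapping each hf_pipeline name to a converter function, so that supporting a new pipeline only means registering one more converter:

Python
# Illustrative registry pattern for extensible pipeline support.
# All names here are hypothetical; the "args"/BYTES layout is an
# assumption to be checked against MLServer's metadata.
from typing import Any, Callable, Dict

# Maps hf_pipeline name -> converter(inputs, parameters) -> V2 payload.
V2_CONVERTERS: Dict[str, Callable[[Any, Dict[str, Any]], Dict[str, Any]]] = {}


def register(pipeline_name: str) -> Callable:
    def decorator(fn: Callable) -> Callable:
        V2_CONVERTERS[pipeline_name] = fn
        return fn
    return decorator


@register("token-classification")
def token_classification_to_v2(inputs: str, parameters: Dict[str, Any]) -> Dict[str, Any]:
    return {
        "inputs": [
            {"name": "args", "shape": [1], "datatype": "BYTES", "data": [inputs]}
        ],
        "parameters": parameters or {},
    }

The /predict/ handler can then dispatch with V2_CONVERTERS.get(request.hf_pipeline), and adding a fifth pipeline becomes a one-function change.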

References

[1] Getting Access to TrueFoundry's Platform:

You need to sign up at https://app.truefoundry.com, join the Slack channel, and message there to get access to the resources that will be necessary for you to complete the assignment.

1. To get access to a workspace, sign up at https://app.truefoundry.com and personally message Parth Kathuria on Slack, stating your username, to get workspace access.
2. Once you get access, you can start working on the assignment. You can post your doubts about using our platform or about the assignment in the Slack channel named "ml-intern-assgn".

NOTE: You will be given workspace access for 7 days. You have to complete and submit the assignment within this time; after that, the workspace will be deleted.

[2] Deploying a Hugging Face model on TrueFoundry:

Get access to https://app.truefoundry.com and the necessary resources by joining our Slack channel as described above.

● Go to Deployments -> Model -> New Deployment
● Select your workspace
● Select Hugging Face Model Hub and enter the model's name
● Configure the resources wisely [GPU is not required and should NOT be used for any model] and click on Submit
● You can use the following values for requests and limits:

For text-generation, token-classification, zero-shot-classification

YAML
resources:
cpu_limit: 0.1
cpu_request: 0.1
memory_request: 850
memory_limit: 900
ephemeral_storage_limit: 1000
ephemeral_storage_request: 1000

For the object-detection model

YAML
resources:
cpu_limit: 0.1
cpu_request: 0.1
memory_request: 1300
memory_limit: 1500
ephemeral_storage_limit: 1000
ephemeral_storage_request: 1000

● Your model should be live in a few minutes
● Reference Link:
https://docs.truefoundry.com/docs/model-deployment-1#huggingfacemodelhub

How to submit the project?

The project should be done on GitHub; please send us the link to the GitHub repository by filling out this form: https://forms.gle/GFjibmvmT2tjgHWSA

We will review your code, and you will hear back from us if we are ready to move you to the next stage. Please write a proper README in your GitHub repo so that it is easy to understand how to run the code.
The final stage involves a technical interview in which we will discuss your projects, review the application you built above, and test some general problem-solving skills.

Feel free to email parth@truefoundry.com if you have any questions. Best of luck with the project!
