Truefoundry - ML Engineer Intern Assignment
Motivation
Most ML models never make it to production, and it takes anywhere from 2-3 months for those that do. Compare this to the 15 minutes it takes teams on our platform. Many models break because of scaling issues, and a good percentage of the remaining are discarded due to issues that can't be detected in the absence of the right monitoring and observability systems!
What do we do?
We make the path to production automated and empower Data Scientists and ML Engineers to test models in production and iterate fast, with as few dependencies on other teams as possible. We draw our motivation from products created by Platform teams at top tech companies like Facebook, Google, and Netflix, which allow all teams to move much faster. Our workflows provide structure to a Data Scientist's code, make it production-ready, and deploy model APIs with autoscaling, monitoring, and observability enabled. Basically, you can go from model to live in 15 minutes, with best practices built in!
Mission
Make shipping ML to production easy for anyone. Developers (SEs, ML Engineers, DSs) should just focus on writing the business logic at very high iteration speeds; everything else should be abstracted away.
Vision
Imagine a flow where, after writing the code, you can call a genie and tell it your requirements: the kind of service (Serverless, CronJob, Database, an API service), resource requirements like CPU, memory, etc. The genie then creates the service with best practices like GitOps and Infrastructure as Code (IaC) baked in.
ML Intern Assignment
HuggingFace Model Deployment
Background:
● Truefoundry currently supports deploying HuggingFace models directly with
just a few clicks.
● When the model is deployed (you need to get access to the platform [1] and
deploy the model on our platform [2]), the API endpoints are exposed with the V2
Inference Protocol. This is because we use Seldon's MLServer internally to
deploy the model.
You can read about the V2 inference protocol here.
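As a rough illustration, a V2 inference request wraps the raw input in a list of named tensors. The exact tensor names and codecs depend on how MLServer's HuggingFace runtime is configured, so treat the names below ("args", BYTES) as assumptions to verify against your deployed model's metadata:

```python
# Sketch: wrap a plain string into a V2 Inference Protocol request body.
# The tensor name "args" and the BYTES datatype are assumptions; check
# your deployed model's /v2/models/<name> metadata for the real names.
def build_v2_request(text: str) -> dict:
    return {
        "inputs": [
            {
                "name": "args",      # assumed tensor name
                "shape": [1],
                "datatype": "BYTES",
                "data": [text],
            }
        ]
    }

def parse_v2_response(body: dict) -> list:
    # V2 responses mirror the request: a list of named output tensors.
    return [out["data"] for out in body.get("outputs", [])]
```

`build_v2_request("Hello")` produces the JSON body you would POST to `<model_deployed_url>/v2/models/<name>/infer`.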
Goal:
● The goal of this assignment is to write a FastAPI service which takes input for
the model (in the same format as the Inference Endpoints API for HuggingFace
models on the HuggingFace Hub website), internally converts the input to the V2
inference protocol, and returns the response.
● Since the input format differs across HuggingFace pipelines, you should
expose the final API as:
POST https://<your-service-endpoint>/predict/
with body as

{
    "hf_pipeline": string,
    "model_deployed_url": string,
    "inputs": any,
    "parameters": {...}
}

where hf_pipeline is one of the supported tasks listed at
https://huggingface.co/docs/inference-endpoints/supported_tasks
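One possible shape for the service, sketched with plain functions (the FastAPI route would be a thin wrapper around `predict`). The field names mirror the body above; the V2 tensor names are assumptions to verify against your deployment:

```python
import json
from urllib import request as urlrequest

def to_v2_payload(hf_pipeline: str, inputs, parameters: dict) -> dict:
    """Convert HuggingFace-style inputs into a V2 request body.
    Tensor names here are illustrative, not guaranteed."""
    tensors = [{
        "name": "args",  # assumed tensor name
        "shape": [1],
        "datatype": "BYTES",
        "data": [inputs if isinstance(inputs, str) else json.dumps(inputs)],
    }]
    if hf_pipeline == "zero-shot-classification":
        # This pipeline also needs candidate_labels as a second tensor.
        labels = parameters.get("candidate_labels", [])
        tensors.append({"name": "candidate_labels", "shape": [len(labels)],
                        "datatype": "BYTES", "data": labels})
    return {"inputs": tensors}

def predict(body: dict) -> dict:
    """Handler logic for POST /predict/ — forwards the converted payload."""
    v2 = to_v2_payload(body["hf_pipeline"], body["inputs"],
                       body.get("parameters") or {})
    req = urlrequest.Request(
        body["model_deployed_url"],  # e.g. <url>/v2/models/<name>/infer
        data=json.dumps(v2).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlrequest.urlopen(req) as resp:
        return json.loads(resp.read())
```

Keeping the conversion separate from the HTTP forwarding makes the conversion easy to unit-test without a live model endpoint.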
Additional Information:
● You should support the following four HuggingFace pipelines. We have also
mentioned one model for each pipeline that you can use for testing:
1. sshleifer/tiny-gpt2 [text-generation]

{
    "hf_pipeline": "text-generation",
    "model_deployed_url": string,
    "inputs": string,
    "parameters": {...}
}
2. typeform/distilbert-base-uncased-mnli [zero-shot-classification]

{
    "hf_pipeline": "zero-shot-classification",
    "model_deployed_url": string,
    "inputs": string,
    "parameters": {
        "candidate_labels": string[]
    }
}
3. d4data/biomedical-ner-all [token-classification]

{
    "hf_pipeline": "token-classification",
    "model_deployed_url": string,
    "inputs": string,
    "parameters": {...}
}
4. TahaDouaji/detr-doc-table-detection [object-detection]

{
    "hf_pipeline": "object-detection",
    "model_deployed_url": string,
    "inputs": any,
    "parameters": {...}
}
Evaluation Points:
● Try to support as many different pipelines as possible (from the list shown
above).
● Try to make your code modular and write it in a way that it can be extended to
any other HuggingFace pipeline.
● Finally, at the time of submission, submit a deployed link for any one of the
pipelines, hosted on Truefoundry. Relevant link:
https://docs.truefoundry.com/docs/deploying-your-first-service
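To keep the converter extensible across pipelines (per the modularity point above), one common pattern is a registry mapping pipeline names to converter functions. Everything below is a hypothetical sketch, and the tensor names are assumptions:

```python
from typing import Any, Callable, Dict

# Registry: hf_pipeline name -> function turning (inputs, parameters)
# into a V2-style request body.
CONVERTERS: Dict[str, Callable[[Any, dict], dict]] = {}

def converter(pipeline: str):
    """Decorator that registers a converter under a pipeline name."""
    def register(fn):
        CONVERTERS[pipeline] = fn
        return fn
    return register

@converter("text-generation")
def _text_generation(inputs, parameters):
    return {"inputs": [{"name": "args", "shape": [1],
                        "datatype": "BYTES", "data": [inputs]}]}

@converter("token-classification")
def _token_classification(inputs, parameters):
    return {"inputs": [{"name": "args", "shape": [1],
                        "datatype": "BYTES", "data": [inputs]}]}

def convert(hf_pipeline: str, inputs, parameters):
    try:
        return CONVERTERS[hf_pipeline](inputs, parameters or {})
    except KeyError:
        raise ValueError(f"unsupported pipeline: {hf_pipeline}")
```

With this layout, supporting a new HuggingFace pipeline is one more decorated function, and the `/predict/` handler never changes.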
For the host value in the port section, follow this format:

https://<your-app-name>-<workspace-name>-<port>.demo1.truefoundry.com

The New Service form on the UI can help you generate this value.
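For example, filling in the placeholders (the app name, workspace name, and port below are hypothetical):

```python
# Hypothetical values; substitute your own app name, workspace, and port.
app_name, workspace_name, port = "hf-predictor", "my-workspace", 8000
host = f"{app_name}-{workspace_name}-{port}.demo1.truefoundry.com"
```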
References
You need to sign up on https://app.truefoundry.com, join the Slack channel, and
message there to get access to resources that will be necessary for you to complete
the assignment.
NOTE: You will be given the workspace access for 7 days. You have to complete and
submit the assignment within this time. After that - the workspace will be deleted.
● Configure the resources wisely [GPU is not required and should NOT be used
for any model] and click on submit
● You can use the following values for requests and limits:
YAML

resources:
  cpu_limit: 0.1
  cpu_request: 0.1
  memory_request: 850
  memory_limit: 900
  ephemeral_storage_limit: 1000
  ephemeral_storage_request: 1000
We will review your code, and you will hear back from us if we are ready to move you
to the next stage. Please write a proper README in your GitHub repo so that it's
easy to understand how to run the code.
The final stage involves a technical interview wherein we will discuss your projects,
review the application you built above, and test some general problem-solving skills.
Feel free to email parth@truefoundry.com if you have any questions. Best of luck
with the project!