2.1.1.1 Technical skills · MLIB


2.1.1.1 Technical skills

1. Software engineering. As ML models often require extensive engineering to train and deploy,
it’s important to have a good understanding of engineering principles. Aspects of computer
science that are more relevant to ML include algorithms, data structures, time/space
complexity, and scalability. You should be comfortable with the usual suspects: Python,
Jupyter Notebook or Google Colab, NumPy, scikit-learn [19], and a deep learning framework.
Knowing at least one performance-oriented language such as C++ or Go can come in handy.
BestPracticer has an interesting list of the engineering skills expected at different levels.

2. Data cleaning, analytics, and visualization. Data handling is important yet often overlooked
in ML education. It’s a huge bonus when a candidate knows how to collect, explore, and clean
data, as well as how to create training datasets. You should be comfortable with
dataframe manipulation (pandas, Dask) and data visualization (seaborn, Altair, Matplotlib,
etc.). SQL is popular for relational databases and R for data analysis. Familiarity with
distributed toolkits like Spark and Hadoop is also very useful.
3. Machine learning knowledge. You should understand ML beyond citing buzzwords. Ideally,
you should be able to explain every architectural choice you make. You might not need this
understanding if all you do is clone an existing open-source implementation and it runs
flawlessly on your data. But models seldom run flawlessly, so you need this understanding
to evaluate potential solutions and debug your models.
4. Domain-specific knowledge. You should have knowledge relevant to the products of the
company you’re interviewing with. If it’s in the autonomous vehicle space, you’re probably
expected to know computer vision techniques as well as computer vision tasks such as
object detection, image segmentation, and motion analysis. If the company builds speech
recognition systems, you should know about mel-filterbank features, CTC loss, and common
benchmark datasets for the task of speech recognition.
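
To make the data-cleaning skill in the list above a bit more concrete, here is a minimal sketch of the kind of dataframe manipulation you might be asked about, written with pandas. The column names (`age`, `label`), the toy data, and the helper `clean_dataset` are all hypothetical and chosen only for illustration. Median imputation is one common choice among several, not a recommendation for every dataset.

```python
import pandas as pd


def clean_dataset(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: dedupe, normalize labels, impute missing ages."""
    # Remove exact duplicate rows; copy so the caller's frame is not mutated.
    out = df.drop_duplicates().copy()
    # Normalize label text: strip surrounding whitespace and lowercase.
    out["label"] = out["label"].str.strip().str.lower()
    # A row without a label cannot be used for supervised training.
    out = out.dropna(subset=["label"])
    # Fill missing ages with the median of the remaining rows (an assumption).
    out["age"] = out["age"].fillna(out["age"].median())
    return out.reset_index(drop=True)


# Toy data with a duplicate row, inconsistent label text, and missing values.
raw = pd.DataFrame(
    {
        "age": [25.0, 25.0, None, 40.0, 31.0],
        "label": ["Cat", "Cat", " dog", "DOG ", None],
    }
)

clean = clean_dataset(raw)
print(clean)
print(clean["label"].value_counts())
```

In an interview, it usually matters more that you can justify each step (why drop instead of impute, why median instead of mean) than which exact calls you use.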

19: As of 2019, scikit-learn is as popular as TensorFlow.
