Dp-100 Exam Ques
Question 1
Your task is to predict if a person suffers from a disease by setting up a binary classification model.
Your solution needs to be able to detect the classification errors that may appear.
Considering the description below, which error type does it represent?
“A person does not suffer from a disease. Your model classifies the case as having no disease”.
1 / 1 point
True negatives
False negatives
False positives
True positives
Correct
A true negative is an outcome where the model correctly predicts the negative class.
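The four possible outcomes can be told apart with a small helper. The sketch below is illustrative only: the function name classify_outcome and the encoding (1 for "has the disease", 0 for "no disease") are assumptions, not part of the quiz.

```python
def classify_outcome(actual: int, predicted: int) -> str:
    """Name the confusion-matrix cell for one prediction (1 = positive, 0 = negative)."""
    if predicted == 1:
        return "true positive" if actual == 1 else "false positive"
    return "true negative" if actual == 0 else "false negative"


# The person has no disease (0) and the model predicts no disease (0).
print(classify_outcome(actual=0, predicted=0))
```

Swapping the predicted value to 1 for the same person gives the false positive case.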
2.
Question 2
Your company asks you to analyze a dataset that contains historical data obtained from a local
car-sharing company. For this task, you decide to develop a regression model that predicts
the price of a trip. To evaluate the regression model correctly, you have
to use performance metrics.
1 / 1 point
Correct
RMSE and R2 are both metrics for regression models. Root mean squared error (RMSE) creates a
single value that summarizes the error in the model.
An R-Squared value close to 0
Correct
RMSE and R2 are both metrics for regression models. Coefficient of determination, often referred to
as R2, represents the predictive power of the model as a value between 0 and 1. Zero means the
model is random (explains nothing); 1 means there is a perfect fit.
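Both metrics are easy to compute by hand, which may help when interpreting them. The following is a minimal sketch using only the standard library; the function names and the trip prices are made up for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, expressed in the same unit as the label."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical trip prices and two sets of predictions.
prices = [10.0, 12.0, 15.0, 20.0]
perfect = list(prices)                       # predicts every price exactly
mean_only = [sum(prices) / len(prices)] * 4  # always predicts the average price

print(rmse(prices, perfect), r_squared(prices, perfect))
print(rmse(prices, mean_only), r_squared(prices, mean_only))
```

A perfect model gives an RMSE of 0 and an R2 of 1, while a model that always predicts the mean gives an R2 of 0. A model that does worse than predicting the mean can score below 0 with this formula.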
3.
Question 3
In order to predict the price for a student’s craftwork, you have to rely on the following variables: the
student’s length of education, degree type, and art form. You decide to set up a linear regression
model that you will have to evaluate. Solution: Apply the following metrics: Mean Absolute Error,
Root Mean Absolute Error, Relative Absolute Error, Accuracy, Precision, Recall, F1 score, and AUC:
1 / 1 point
Yes
No
Correct
Accuracy, Precision, Recall, F1 score, and AUC are metrics for evaluating classification models;
Mean Absolute Error, Root Mean Absolute Error, Relative Absolute Error are OK for the linear
regression model.
4.
Question 4
Your task is to create and evaluate a model. You decide to use a metric whose value is
directly proportional to how well the model fits.
1 / 1 point
Correct
This is the evaluation metric described. In essence, this metric represents how much of the variance
between predicted and actual label values the model is able to explain.
5.
Question 5
How should the following sentence be completed?
Decision tree algorithms are one example of machine learning […] type models.
0 / 1 point
Classification
Clustering
Regression
Incorrect
Try going back and reviewing Train and Evaluate Regression Models.
6.
Question 6
You have a Pandas DataFrame named df_sales that contains the sales data for each day. Your
DataFrame contains these columns: year, month, day_of_month, sales_total. Which of the following
code snippets should you choose if your goal is to return the average sales_total value?
0 / 1 point
df_sales['sales_total'].mean()
df_sales['sales_total'].avg()
mean(df_sales['sales_total'])
Incorrect
Try going back and reviewing Exercise - Explore data.
7.
Question 7
Choose from the list below the evaluation metric that provides you with an absolute metric in the
same unit as the label.
0 / 1 point
Incorrect
Try going back and reviewing Exercise - Train and evaluate a regression model.
8.
Question 8
Which are two appropriate ways to approach a problem when using multiclass classification?
1 / 1 point
One vs Rest
Correct
One vs Rest (OVR), in which a classifier is created for each possible class value, with a positive
outcome for cases where the prediction is this class, and negative predictions for cases where the
prediction is any other class.
One vs One
Correct
One vs One (OVO), in which a classifier for each possible pair of classes is created.
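The two strategies differ in how many binary classifiers they need. The sketch below only enumerates the binary tasks; it does not train anything, and the helper names and class labels are hypothetical.

```python
from itertools import combinations


def one_vs_rest_tasks(classes):
    """One binary task per class: this class versus every other class."""
    return [(c, "rest") for c in classes]


def one_vs_one_tasks(classes):
    """One binary task per unordered pair of classes."""
    return list(combinations(classes, 2))


classes = ["setosa", "versicolor", "virginica", "other"]  # hypothetical labels
print(len(one_vs_rest_tasks(classes)))  # 4 classifiers
print(len(one_vs_one_tasks(classes)))   # 6 classifiers
```

For n classes, OVR needs n classifiers and OVO needs n * (n - 1) / 2.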
0 / 1 point
model = KMeans(n_clusters=4)
model = Kmeans(n_init=4)
model = Kmeans(max_iter=4)
Incorrect
Try going back and reviewing Exercise - Train and evaluate a clustering model.
10.
Question 10
Which of the layer types below is a principal layer type that extracts important features from
images and works by applying a filter to them?
1 / 1 point
Convolutional layer
Pooling layer
Flattening layer
Correct
One of the principal layer types is a convolutional layer that extracts important features in images. A
convolutional layer works by applying a filter to images.
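To make "applying a filter" concrete, here is a minimal pure-Python sketch of sliding a small filter over an image with no padding and a stride of 1. The image, the filter values, and the function name convolve2d are assumptions chosen for illustration; deep learning frameworks learn the filter values during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The non-zero column in the feature map marks where the edge sits in the image, which is the kind of feature a convolutional layer extracts.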
11.
Question 11
You want to set up a new Azure subscription. The subscription doesn’t contain any resources.
0 / 1 point
Run Python code that uses the Azure ML SDK library and calls the Workspace.get method with
name, subscription_id, and resource_group parameters.
Correct
This is one way to achieve the goal.
Use the Azure Command Line Interface (CLI) with the Azure Machine Learning extension to call the
az group create function with --name and --location parameters, and then the az ml workspace
create function, specifying -w and -g parameters for the workspace name and resource group.
Correct
This is one way to achieve the goal.
Run Python code that uses the Azure ML SDK library and calls the Workspace.create method with
name, subscription_id, resource_group, and location parameters.
Correct
This is one way to achieve the goal.
12.
Question 12
You decide to use GPU-based training to develop a deep learning model on Azure Machine
Learning service that is able to recognize images.
The context where you have to configure the model needs to allow real-time GPU-based inferencing.
Considering that you have to set up compute resources for model inferencing, what is the most
suitable compute type?
0 / 1 point
Incorrect
Try going back and reviewing Deploy real-time machine learning services with Azure Machine
Learning.
13.
Question 13
You decide to use the code below for the deployment of a model as an Azure Machine Learning
real-time web service:
service.wait_for_deployment(True)
You have to troubleshoot the deployment failure in order to determine what actions were taken while
deploying and to identify the one action that encountered a problem and didn’t succeed.
For this scenario, which of the following code snippets should you use?
0 / 1 point
service.state
service.get_logs()
service.serialize()
service.update_deployment_state()
Incorrect
Try going back and reviewing Deploy real-time machine learning services with Azure Machine
Learning.
14.
Question 14
You decide to register and train a model in your Azure Machine Learning workspace.
Your pipeline needs to ensure that the client applications are able to use the model for batch
inferencing.
Your single ParallelRunStep step pipeline uses a Python inferencing script in order to obtain
predictions from the input data.
Your task is to configure the inferencing script for the ParallelRunStep pipeline step.
Which are the most suitable two functions that you should use? Keep in mind that every correct
answer presents a part of the solution.
1 / 1 point
main()
init()
Correct
This function is called when the pipeline is initialized.
score(mini_batch)
batch()
run(mini_batch)
Correct
This function is called for each batch of data to be processed.
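The shape of such an inferencing script can be sketched without any Azure ML import. In the example below, only the init() and run(mini_batch) names come from the question; the placeholder model and the file names are hypothetical stand-ins for a registered model and real input files.

```python
# Sketch of a batch scoring script with the two entry points named above.
model = None


def _placeholder_model(item):
    """Hypothetical stand-in for a real model: scores an item by its length."""
    return len(str(item))


def init():
    """Called once when the step starts: load the model into a global."""
    global model
    model = _placeholder_model


def run(mini_batch):
    """Called once per mini-batch: return one result per input item."""
    results = []
    for item in mini_batch:
        results.append(f"{item}: {model(item)}")
    return results


init()
print(run(["a.csv", "bb.csv"]))  # ['a.csv: 5', 'bb.csv: 6']
```

Loading the model in init() rather than in run() means the cost is paid once per worker instead of once per mini-batch.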
15.
Question 15
After installing the Azure Machine Learning Python SDK, you decide to use it to create a
workspace named “aml-workspace” in your subscription.
1 / 1 point
ws = Workspace.create(name='aml-workspace',
subscription_id='123456-abc-123...',
resource_group='aml-resources',
create_resource_group=False,
location='eastus'
)
ws = Workspace.create(name='aml-workspace',
subscription_id='123456-abc-123...',
resource_group='aml-resources',
location='eastus'
)
from azureml.core import Workspace
ws = Workspace.create(name='aml-workspace',
subscription_id='123456-abc-123...',
resource_group='aml-resources',
create_resource_group=True,
location='eastus'
)
Correct
This is the correct and complete command to run for this scenario.
16.
Question 16
If your goal is to use a configuration file in order to ensure connection to your Azure ML workspace,
what Python command would be the most appropriate?
0 / 1 point
ws = from.config_Workspace()
ws = Workspace.from.config
ws = Workspace.from_config()
Incorrect
Try going back and reviewing Azure Machine Learning tools and interfaces.
17.
Question 17
If you want to retrieve a dataset after it has been registered, which methods of the Dataset class
are the most suitable?
0 / 1 point
find_by_name
get_by_id
get_by_name
find_by_id
18.
Question 18
What are the most appropriate SDK commands you should choose if you want to publish the
pipeline that you created?
0 / 1 point
publishedpipeline = pipeline_publish(name='training_pipeline',
version='1.0')
published.pipeline = pipeline.publish(name='training_pipeline',
version='1.0')
published.pipeline = pipeline_publish(name='training_pipeline',
description='Model training pipeline',
version='1.0')
published_pipeline = pipeline.publish(name='training_pipeline',
version='1.0')
Incorrect
Try going back and reviewing Publish pipelines.
19.
Question 19
True or False?
1 / 1 point
True
False
Correct
You must define parameters for a pipeline before publishing it.
20.
Question 20
Choose from the options below the one that explains how hyperparameter values are selected
by random sampling.
0 / 1 point
It tries to select parameter combinations that will result in improved performance from the previous
selection
Incorrect
Try going back and reviewing Configuring sampling.
21.
Question 21
What Python code should you write if your goal is to implement a median stopping policy?
0 / 1 point
early_termination_policy = MedianStoppingPolicy(evaluation_interval=1,
delay_evaluation=5)
evaluation_interval=1,
delay_evaluation=5)
early_termination_policy = MedianStoppingPolicy(truncation_percentage=10,
evaluation_interval=1,
delay_evaluation=5)
Incorrect
Try going back and reviewing Configuring early termination.
22.
Question 22
What code should you write for a PFIExplainer if you have a model entitled loan_model?
0 / 1 point
from interpret.ext.blackbox import PFIExplainer
initialization_examples=X_test,
classes=['loan_amount','income','age','marital_status'],
features=['reject', 'approve'])
from interpret.ext.blackbox
initialization_examples=X_test,
features=['loan_amount','income','age','marital_status'],
classes=['reject', 'approve'])
features=['loan_amount','income','age','marital_status'],
classes=['reject', 'approve'])
explainable_model= DecisionTreeExplainableModel,
features=['loan_amount','income','age','marital_status'],
classes=['reject', 'approve'])
Incorrect
Try going back and reviewing Using explainers.
23.
Question 23
Your task is to train a binary classification model in order for it to be able to target the correct
subjects in a marketing campaign.
What actions should you take if you want to ensure that your model is fair and will not be inclined to
ethnic discrimination?
1 / 1 point
Evaluate each trained model with a validation dataset, and use the model with the highest accuracy
score. An accurate model is inherently fair.
Compare disparity between selection rates and performance metrics across ethnicities.
Correct
By using ethnicity as a sensitive field, and comparing disparity between selection rates and
performance metrics for each ethnicity value, you can evaluate the fairness of the model.
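A selection rate comparison can be sketched in a few lines of plain Python. The group labels, predictions, and function name below are hypothetical; a dedicated fairness library would typically be used for a real assessment.

```python
from collections import defaultdict


def selection_rates(groups, predictions):
    """Fraction of positive predictions (1) within each sensitive-feature group."""
    selected = defaultdict(int)
    total = defaultdict(int)
    for group, prediction in zip(groups, predictions):
        total[group] += 1
        selected[group] += prediction
    return {group: selected[group] / total[group] for group in total}


# Hypothetical sensitive feature values and model predictions.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]

rates = selection_rates(groups, predictions)
disparity = max(rates.values()) - min(rates.values())
print(rates)      # {'A': 0.75, 'B': 0.25}
print(disparity)  # 0.5
```

A large gap between groups is a signal to investigate further. The same comparison can be repeated for performance metrics such as recall or precision per group.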
24.
Question 24
You decided to preprocess and filter down only the relevant columns for your AirBnB housing
dataframe.
The columns that you kept are: id, host_name, bedrooms, neighbourhood_cleansed, price.
In order to obtain the first initial from the host_name column, you have written the following function
that you entitled firstInitialFunction:
def firstInitialFunction(name):
    return name[0]
firstInitialFunction("George")
Your goal is to use spark.udf.register in order to create a UDF from the function above, because
you want to ensure that the UDF will be created in the SQL namespace.
Considering this scenario, what code should you write?
0 / 1 point
airbnbDF.createAndReplaceTempView("airbnbDF")
spark.udf.register(sql_udf.firstInitialFunction)
airbnbDF.replaceTempView("airbnbDF")
spark.udf.register("sql_udf", firstInitialFunction)
airbnbDF.createTempView("airbnbDF")
spark.udf.register(sql_udf = firstInitialFunction)
airbnbDF.createOrReplaceTempView("airbnbDF")
spark.udf.register("sql_udf", firstInitialFunction)
Incorrect
Try going back and reviewing Work with user-defined functions.
25.
Question 25
You discover a median value for a number of variables in your AirBnB Housing dataset, variables
like the number of rooms, per capita crime and economic status of residents.
Depending on the average number of rooms, you want to be able to predict the median home value
by using Linear Regression.
You decided to use VectorAssembler to import the dataset and to create your column entitled
features that includes a single input variable entitled rm.
0 / 1 point
from pyspark import LinearRegression
lr = LinearRegression(featuresCol="features", labelCol="medv")
lrModel = lr.fit(bostonFeaturizedDF)
lr = LinearRegression(featuresCol="rm", labelCol="medv")
lrModel = lr_fit(bostonFeaturizedDF)
lr = LinearRegression(featuresCol="features", labelCol="medv")
lrModel = lr.fit(bostonFeaturizedDF)
lrModel = lr_fit(bostonFeaturizedDF)
Incorrect
Try going back and reviewing Train a machine learning model.
26.
Question 26
You want to evaluate a Python NumPy array that has six data points with the following definition:
data = [10, 20, 30, 40, 50, 60]
Your task is to use the k-fold algorithm implementation in the Python Scikit-learn machine learning
library to generate the output that follows: train: [10 40 50 60], test: [20 30] train: [20 30 40 60], test:
[10 50] train: [10 20 30 50], test: [40 60]
To give the correct answer, you have to replace the code comments that are bolded with some
suitable code options that you find in the answer area.
Considering this, what snippet should you choose to complete the code?
0 / 1 point
K-means, 6, array
K-fold, 3, array
CrossValidation, 3, data
K-fold, 3, data
Incorrect
Try going back and reviewing Perform model selection with hyperparameter tuning.
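The idea behind k-fold splitting can be shown without scikit-learn. The sketch below is a simplified illustration, not the library's implementation: it uses contiguous folds without shuffling, so the order of the folds differs from the shuffled output quoted in the question, and the function name is hypothetical.

```python
def k_fold_splits(samples, k):
    """Split samples into k contiguous folds; each fold is the test set exactly once."""
    fold_size = len(samples) // k  # assumes len(samples) is divisible by k
    for fold in range(k):
        start, stop = fold * fold_size, (fold + 1) * fold_size
        test = samples[start:stop]
        train = samples[:start] + samples[stop:]
        yield train, test


data = [10, 20, 30, 40, 50, 60]
for train, test in k_fold_splits(data, k=3):
    print(f"train: {train}, test: {test}")
```

With six data points and three folds, every split has four training points and two test points, and each point appears in a test set exactly once.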
27.
Question 27
For your experiment in Azure Machine Learning you decide to run the following code:
ws = Workspace.from_config()
run_config = RunConfiguration()
run_config.target = 'local'
script_config = ScriptRunConfig(source_directory='./script', script='experiment.py',
                                run_config=run_config)
run = experiment.submit(config=script_config)
run.wait_for_completion()
The experiment run generates several output files that need identification.
In order to retrieve the output file names, you must write some code. Which of the following code
snippets should you choose to complete the script?
0 / 1 point
files = run.get_metrics()
files = run.get_properties()
files = run.get_file_names()
files = run.get_details_with_logs()
Incorrect
Try going back and reviewing Work with Azure Machine Learning to deploy serving models.
28.
Question 28
One of the categorical variables of your AirBnB dataset is room type.
You have three room types, as follows: private room, entire home/apt, and shared room.
In order for the machine learning model to know how to handle the room types, you first have to
encode every unique string into a number.
0 / 1 point
from pyspark.ml.feature import StringIndexer
uniqueTypesDF = airbnbDF.select("room_type").distinct()
indexerModel = indexer.transform(uniqueTypesDF)
indexedDF = indexerModel.transform(uniqueTypesDF)
display(indexedDF)
uniqueTypesDF = airbnbDF.select("room_type").distinct()
indexerModel = indexer.fit(uniqueTypesDF)
indexedDF = indexerModel.transform(uniqueTypesDF)
display(indexedDF)
uniqueTypesDF = airbnbDF.select("room_type").distinct()
indexer = StringIndexer(inputCol="room_type")
indexerModel = indexer.fit(uniqueTypesDF)
indexedDF = indexerModel.transform(uniqueTypesDF)
display(indexedDF)
uniqueTypesDF = airbnbDF.select("room_type").distinct()
indexer = StringIndexer(inputCol="room_type", outputCol="room_type_index")
indexerModel = indexer.fit(uniqueTypesDF)
indexedDF = indexerModel.transform(uniqueTypesDF)
display(indexedDF)
Incorrect
Try going back and reviewing Perform featurization of the dataset.
29.
Question 29
Your task is to extract the last run from the list of experiment runs.
0 / 1 point
runs[0].data.metrics
runs[0].data.metrics
runs[0].data.metrics
runs[0].data.metrics
Incorrect
Try going back and reviewing Use MLflow to track experiments, log metrics, and compare runs.
30.
Question 30
Choose from the list below the cross-validation technique that belongs to the exhaustive type.
0 / 1 point
K-fold cross-validation
Leave-one-out cross-validation
Leave-p-out cross-validation
Correct
Leave-p-out cross-validation (LpO CV) is an exhaustive type of cross-validation technique. It
involves using p observations as the validation set and the remaining observations as the training
set. This is repeated on all ways to cut the original sample on a validation set of p observations and
a training set.
Holdout cross-validation
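"Exhaustive" here means that every possible validation set of size p is tried. A minimal standard-library sketch, with a hypothetical function name and made-up observations, could look like this:

```python
from itertools import combinations
from math import comb


def leave_p_out(samples, p):
    """Yield every (train, validation) split with exactly p validation samples."""
    indices = range(len(samples))
    for validation_idx in combinations(indices, p):
        validation = [samples[i] for i in validation_idx]
        train = [samples[i] for i in indices if i not in validation_idx]
        yield train, validation


data = [10, 20, 30, 40]  # hypothetical observations
splits = list(leave_p_out(data, p=2))
print(len(splits), comb(len(data), 2))  # 6 6
```

The number of splits is the binomial coefficient "n choose p", which grows quickly with n and is why exhaustive techniques become expensive on larger datasets.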
31.
Question 31
You decided to use Azure Machine Learning and your goal is to train a Diabetes Model and build a
container image for it.
You choose to make use of the scikit-learn ElasticNet linear regression model.
You want to use Azure Kubernetes Service (AKS) for the model deployment to production.
You have to create an active AKS cluster by using the Azure ML SDK.
1 / 1 point
(name = aks_cluster_name,
provisioning_configuration = prov_config)
name = aks_cluster_name,
provisioning_configuration = prov_config)
name = aks_cluster_name,
provisioning_configuration = prov_config)
name = aks_cluster_name,)
Correct
This is the correct code for this task.
32.
Question 32
If you want to list the generated files after your experiment run is completed, which method of
the run object should you use?
0 / 1 point
list_file_names
download_files
download_file
get_file_names
Incorrect
Try going back and reviewing Registering models.
33.
Question 33
Your hyperparameter tuning needs to have a search space defined. The values of the batch_size
hyperparameter can be 128, 256, or 512 and the normal distribution values for the learning_rate
hyperparameter can have a mean of 10 and a standard deviation of 3.
What Python code should you write in order to achieve this goal?
0 / 1 point
param_space = {
'--learning_rate': lognormal(10, 3)
param_space = {
'--learning_rate': qnormal(10, 3)
param_space = {
'--learning_rate': normal(10, 3)
}
from azureml.train.hyperdrive import choice, uniform
param_space = {
'--learning_rate': uniform(10, 3)
Incorrect
Try going back and reviewing Defining a search space.
34.
Question 34
You intend to use the Hyperdrive feature of Azure Machine Learning to determine the optimal
hyperparameter values when training a model.
You need to use Hyperdrive to try combinations of the following hyperparameter values:
You must configure the search space for the Hyperdrive experiment.
Which two parameter expressions should you use? Each correct answer presents part of the
solution.
0 / 1 point
Correct
Discrete hyperparameters are specified as a choice among discrete values. choice can be: one or
more comma-separated values -- a range object -- any arbitrary list object.
Correct
Continuous hyperparameters are specified as a distribution over a continuous range of values.
Supported distributions include:
-- uniform(low, high) - Returns a value uniformly distributed between low and high.
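The difference between a discrete and a continuous expression can be illustrated with the standard library alone. In the sketch below, choice and uniform are local stand-ins written for this example; they are not the Hyperdrive functions of the same name, and the parameter names and ranges are hypothetical.

```python
import random


def choice(*options):
    """Discrete hyperparameter: pick one of the listed values."""
    return lambda rng: rng.choice(options)


def uniform(low, high):
    """Continuous hyperparameter: draw a value uniformly between low and high."""
    return lambda rng: rng.uniform(low, high)


# Hypothetical search space; names and ranges are illustrative only.
param_space = {
    "--batch_size": choice(128, 256, 512),
    "--learning_rate": uniform(0.001, 0.1),
}

rng = random.Random(0)  # seeded so the draw is repeatable
sample = {name: draw(rng) for name, draw in param_space.items()}
print(sample)
```

Each sampled combination corresponds to one training run with those hyperparameter values.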
35.
Question 35
You are evaluating a completed binary classification machine learning model.
0 / 1 point
Box plot
A violin plot
Gradient descent
Incorrect
Try going back and reviewing Create a classification model with Azure AI.
1.
Question 1
Your task is to predict if a person suffers from a disease by setting up a binary classification model.
Your solution needs to be able to detect the classification errors that may appear.
Considering the description below, which error type does it represent?
“A person does not suffer from a disease. Your model classifies the case as having a disease”.
1 / 1 point
True positives
False positives
False negatives
True negatives
Correct
A false positive is an outcome where the model incorrectly predicts the positive class.
2.
Question 2
As a senior data scientist, you need to evaluate a binary classification machine learning model.
As evaluation metric, you have to use the precision. Considering this, which is the most appropriate
visualization?
0 / 1 point
Violin plot
Scatter plot
Gradient descent
Incorrect
Try going back to Train and evaluate Classification models.
3.
Question 3
In order to predict the price for a student’s craftwork, you have to rely on the following variables: the
student’s length of education, degree type, and art form. You decide to set up a linear regression
model that you will have to evaluate. Solution: Apply the following metrics: Mean Absolute Error,
Root Mean Absolute Error, Relative Absolute Error, Accuracy, Precision, Recall, F1 score, and AUC:
1 / 1 point
Yes
No
Correct
Accuracy, Precision, Recall, F1 score, and AUC are metrics for evaluating classification models;
Mean Absolute Error, Root Mean Absolute Error, Relative Absolute Error are OK for the linear
regression model.
4.
Question 4
Your task is to create and evaluate a model. One of the metrics shows an absolute metric in the
same unit as the label.
1 / 1 point
Correct
This is the described metric. This means that the smaller the value, the better the model.
5.
Question 5
Python offers extensive functionality through powerful statistical and numerical
libraries. What is TensorFlow used for?
0 / 1 point
Incorrect
Try going back and reviewing Explore & Analyse Data with Python.
6.
Question 6
If you multiply a list and a NumPy array by 2, what results would you get?
1 / 1 point
Multiplying a list by 2 creates a new list 2 times the length with the original sequence repeated 2
times.
Correct
This is how a list behaves when multiplied.
Multiplying a NumPy array by 2 performs an element-wise calculation on the array, which sees the
array stay the same size, but each element has been multiplied by 2.
Correct
This is how a NumPy array behaves when multiplied.
Multiplying a NumPy array by 2 creates a new array 2 times the length with the original sequence
repeated 2 times.
Multiplying a list by 2 performs an element-wise calculation on the list, which sees the list stay the
same size, but each element has been multiplied by 2.
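The two behaviours can be checked directly. This short sketch assumes NumPy is installed; the values are arbitrary.

```python
import numpy as np

values = [1, 2, 3]
doubled_list = values * 2             # repeats the sequence
doubled_array = np.array(values) * 2  # multiplies each element

print(doubled_list)            # [1, 2, 3, 1, 2, 3]
print(doubled_array.tolist())  # [2, 4, 6]
```

To double every element of a plain list, a list comprehension such as [v * 2 for v in values] is needed instead.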
7.
Question 7
Choose from the list below the evaluation metric that provides you with an absolute metric in the
same unit as the label.
1 / 1 point
Correct
This is the described metric. This means that the smaller the value, the better the model.
8.
Question 8
Precision and Recall are calculated from four possible prediction outcomes.
What is the outcome in the scenario where the predicted label is 1, but the actual label is 0?
0 / 1 point
False Positive
True Negative
True Positive
False Negative
Incorrect
Try going back and reviewing Exercise - Train and evaluate a classification model.
9.
Question 9
Your deep neural network is in the process of training. You set the training process
configuration to use 30 epochs.
0 / 1 point
The first 30 rows of data are used to train the model, and the remaining rows are used to validate it
The training data is split into 30 subsets, and each subset is passed through the network
Incorrect
Try going back and reviewing Train a deep neural network.
10.
Question 10
Which of the layer types below is a principal layer type that extracts important features from
images and works by applying a filter to them?
1 / 1 point
Convolutional layer
Flattening layer
Pooling layer
Correct
One of the principal layer types is a convolutional layer that extracts important features in images. A
convolutional layer works by applying a filter to images.
11.
Question 11
You are using an Azure Machine Learning service for your data science project. In order to deploy
the project, you have to choose a compute target. For this scenario, which of the following Azure
services is the most suitable?
0 / 1 point
Azure Databricks
Incorrect
Try going back and reviewing Work with Compute in Azure Machine Learning.
12.
Question 12
You have a set of CSV files that contain sales records. All of the CSV files follow an identical data
schema.
The sales records for each month are held in a CSV file named sales.csv.
Each file is stored in a folder that indicates the month and year in which the data was
recorded. A datastore has been set up in an Azure Machine Learning workspace for the folders, which
are kept in an Azure blob container. The folders are organized in a parent folder named sales to create
the hierarchical structure below:
/sales
/01-2019
/sales.csv
/02-2019
/sales.csv
/03-2019
/sales.csv
…
A new folder containing the month’s sales is added to the sales folder at the end of each
month. You want to train a machine learning model by using the sales data while complying with the
requirements below:
- The dataset must load all of the sales data to date into a structure that enables easy
conversion to a dataframe.
- Experiments must be able to use only the data created up to a specific
previous month, disregarding any data added after that month.
- The number of registered datasets must be kept to the minimum possible.
Considering that the sales data have to be registered as a dataset in the Azure Machine Learning
service workspace, what actions should you take?
1 / 1 point
Create a tabular dataset that references the datastore and explicitly specifies each 'sales/mm-
yyyy/sales.csv' file every month. Register the dataset with the name sales_dataset each month,
replacing the existing dataset and specifying a tag named month indicating the month and year it
was registered. Use this dataset for all experiments.
Create a tabular dataset that references the datastore and explicitly specifies each 'sales/mm-
yyyy/sales.csv' file. Register the dataset with the name sales_dataset each month as a new version
and with a tag named month indicating the month and year it was registered. Use this dataset for all
experiments, identifying the version to be used based on the month tag as necessary.
Create a new tabular dataset that references the datastore and explicitly specifies each 'sales/mm-
yyyy/sales.csv' file every month. Register the dataset with the name sales_dataset_MM-YYYY each
month with appropriate MM and YYYY values for the month and year. Use the appropriate month-
specific dataset for experiments.
Create a tabular dataset that references the datastore and specifies the path 'sales/*/sales.csv',
register the dataset with the name sales_dataset and a tag named month indicating the month and
year it was registered, and use this dataset for all experiments.
Correct
This is the correct approach to this scenario.
13.
Question 13
You decide to use Azure Machine Learning designer for your real-time service endpoint. You can
make use of only one Azure Machine Learning service compute resource.
You start training the model and preparing the real-time pipeline for deployment.
If you want to obtain a web service by publishing the inference pipeline, what is the most suitable
compute type?
0 / 1 point
HDInsight
Azure Databricks
Incorrect
Try going back and reviewing Deploy real-time machine learning services with Azure Machine
Learning.
14.
Question 14
Yes or No?
In order to explain the model’s predictions, you have to calculate the importance of all the features,
taking into account the overall global relative importance value, but also the measure of local
importance for a certain set of predictions.
You decide to obtain the global and local feature importance values that you need by using an
explainer.
Yes
No
Incorrect
Try going back and reviewing Explain machine learning models with Azure Machine Learning.
15.
Question 15
Yes or No?
You use a logistic regression algorithm to train your classification model. In order to explain the
model’s predictions, you have to calculate the importance of all the features, taking into account the
overall global relative importance value, but also the measure of local importance for a certain set of
predictions.
You decide to obtain the global and local feature importance values that you need by using an
explainer.
0 / 1 point
Yes
No
Incorrect
Try going back and reviewing Explain machine learning models with Azure Machine Learning.
16.
Question 16
If your goal is to use a configuration file in order to ensure connection to your Azure ML workspace,
what Python command would be the most appropriate?
0 / 1 point
ws = from.config_Workspace()
from azureml.core import Workspace
ws = Workspace.from_config()
ws = Workspace.from.config
Incorrect
Try going back and reviewing Azure Machine Learning tools and interfaces.
17.
Question 17
If you want to use the from_delimited_files method of the Dataset.Tabular class to configure and
register a tabular dataset, what are the most appropriate Python commands?
0 / 1 point
blob_ds = ws.get_default_datastore()
(blob_ds, 'data/files/archive/*.csv')]
tab_ds = Dataset.Tabular.from_delimited_files()
blob_ds = ws.get_default_datastore()
(blob_ds, 'data/files/archive/*.csv')]
tab_ds = Dataset.Tabular.from_delimited_files(path=csv_paths)
tab_ds = tab_ds.register(workspace=ws, name='csv_table')
blob_ds = ws.get_default_datastore()
(blob_ds, 'data/files/archive/csv')]
tab_ds = Dataset.Tabular.from_delimited_files(path=csv_paths)
blob_ds = ws.change_default_datastore()
(blob_ds, 'data/files/archive/*.csv')]
tab_ds = Dataset.Tabular.from_delimited_files(path=csv_paths)
Incorrect
Try going back and reviewing Introduction to datasets.
18.
Question 18
Your task is to use the SDK in order to define a compute configuration for a managed compute
target.
Which of the following commands returns the expected result?
0 / 1 point
compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS11_V2',
min_nodes=0, max_nodes=4,
vm_priority='dedicated')
compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS11_V2',
min_nodes=0, max_nodes=0,
vm_priority='dedicated')
compute_config = AmlCompute.provisioning.configuration(vm_size='STANDARD_DS11_V2',
min_nodes=0, max_nodes=4,
vm_priority='dedicated')
compute_config = AmlCompute_provisioning_configuration(vm_size='STANDARD_DS11_V2',
min_nodes=0, max_nodes=4,
vm_priority='dedicated')
Incorrect
Try going back and reviewing Create compute targets.
19.
Question 19
Your task is to deploy your service on an AKS cluster that is set up as a compute target.
Which SDK commands return the expected result?
0 / 1 point
cluster_name = 'aks-cluster'
compute_config = AksCompute.provisioning_configuration(location='eastus')
production_cluster.wait_for_completion(show_output=True)
from azureml.core.compute import ComputeTarget, AksCompute
cluster_name = 'aks-cluster'
compute_config = AksCompute.provisioning_configuration(location='eastus')
production_cluster.wait_for_completion(show_output=True)
cluster_name = 'aks-cluster'
compute_config = AksCompute.provisioning_configuration(location='eastus')
production_cluster.wait_for_completion(show_output=True)
cluster_name = 'aks-cluster'
compute_config = AksCompute.provisioning_configuration(location='eastus')
production_cluster.wait_for_completion(show_output=True)
Incorrect
Try going back and reviewing Deploy a model as a real-time service.
20.
Question 20
If you want to extract the parallel_run_step.txt file from the output of the step after the pipeline run
has ended, what code should you choose?
1 / 1 point
prediction_run = next(pipeline_run.get_children())
prediction_output = prediction_run.get_output_data('inferences')
prediction_output.download(local_path='results')
if file.endswith('parallel_run_step.txt'):
result_file = os.path.join(root,file)
print(df)
Correct
This code will find the parallel_run_step.txt file.
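The file search in the answer above can be sketched with the standard library alone. The helper name find_result_file and the temporary folder layout are assumptions made for the demonstration; in the real scenario the folder would be the 'results' directory filled by prediction_output.download.

```python
import os
import tempfile


def find_result_file(folder, suffix="parallel_run_step.txt"):
    """Walk `folder` and return the path of the first file ending in `suffix`."""
    for root, _dirs, files in os.walk(folder):
        for file in files:
            if file.endswith(suffix):
                return os.path.join(root, file)
    return None


# Demonstration on a throwaway folder that mimics a downloaded step output.
with tempfile.TemporaryDirectory() as results:
    nested = os.path.join(results, "azureml", "run-id", "inferences")
    os.makedirs(nested)
    with open(os.path.join(nested, "parallel_run_step.txt"), "w") as handle:
        handle.write("file1.csv 0\n")
    found = find_result_file(results)
    found_name = os.path.basename(found)
```

The walk is needed because the download places the file several folders deep, so its exact path is not known in advance.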
21.
Question 21
What code should you write using the SDK if your goal is to extract the best run and its model?
0 / 1 point
best_run_metrics = best_run_get_metrics(1)
metric = best_run_metrics[metric_name]
print(metric_name, metric)
metric = best_run_metrics[metric_name]
print(metric_name, metric)
best_run_metrics = best_run.get_metrics()
metric = best_run_metrics[metric_name]
print(metric_name, metric)
best_run_metrics = best_run.get_metrics()
metric = best_run_metrics[metric_name]
print(metric_name, metric)
Incorrect
Try going back and reviewing Running automated machine learning experiments.
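The visible options only read metrics from best_run; the step that produces best_run is missing from the extracted text. A minimal sketch, assuming automl_run is a completed AutoMLRun from the azureml SDK, whose get_output() method returns the best child run together with its fitted model. The helper name report_best_run is hypothetical.

```python
def report_best_run(automl_run, metric_name):
    """Return the best child run, its fitted model and one named metric.

    `automl_run` is assumed to be a completed azureml AutoMLRun.
    """
    best_run, fitted_model = automl_run.get_output()
    best_run_metrics = best_run.get_metrics()  # dict of metric name -> value
    metric = best_run_metrics[metric_name]
    print(metric_name, metric)
    return best_run, fitted_model, metric
```

Note that get_metrics is a method of the run object and takes no positional argument here, which is what separates the valid options from best_run_get_metrics(1).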
22.
Question 22
What code should you write for a PFIExplainer if you have a model entitled loan_model?
0 / 1 point
from interpret.ext.blackbox
features=['loan_amount','income','age','marital_status'],
classes=['reject', 'approve'])
features=['loan_amount','income','age','marital_status'],
classes=['reject', 'approve'])
initialization_examples=X_test,
classes=['loan_amount','income','age','marital_status'],
features=['reject', 'approve'])
explainable_model= DecisionTreeExplainableModel,
features=['loan_amount','income','age','marital_status'],
classes=['reject', 'approve'])
Incorrect
Try going back and reviewing Using explainers.
23.
Question 23
If you want to minimize disparity in combined true positive rate and false positive rate across
sensitive feature groups, which parity constraint is the most suitable to use with any of the
mitigation algorithms?
0 / 1 point
Equalized odds
Incorrect
Try going back and reviewing Mitigate unfairness with Fairlearn.
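To make the two rates in the question concrete, here is a small pure-Python sketch that computes the true positive rate and false positive rate per sensitive group and summarizes the disparity as the larger of the two gaps. The labels, predictions and group names are made-up illustration data, and the function names are hypothetical; this is not Fairlearn code, only the arithmetic behind the idea.

```python
def _ratio(numerator, denominator):
    # Guard against groups that have no positive (or no negative) examples.
    return numerator / denominator if denominator else 0.0


def rates_by_group(y_true, y_pred, groups):
    """Return {group: (true_positive_rate, false_positive_rate)}."""
    counts = {}
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts.setdefault(group, {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
        if truth == 1:
            c["tp" if pred == 1 else "fn"] += 1
        else:
            c["fp" if pred == 1 else "tn"] += 1
    return {
        group: (_ratio(c["tp"], c["tp"] + c["fn"]), _ratio(c["fp"], c["fp"] + c["tn"]))
        for group, c in counts.items()
    }


def equalized_odds_gap(y_true, y_pred, groups):
    """Larger of the TPR gap and the FPR gap between groups."""
    rates = rates_by_group(y_true, y_pred, groups)
    tprs = [tpr for tpr, _ in rates.values()]
    fprs = [fpr for _, fpr in rates.values()]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


# Made-up data: group A is classified perfectly, group B is not.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = rates_by_group(y_true, y_pred, groups)
gap = equalized_odds_gap(y_true, y_pred, groups)
```

A constraint that targets both rates at once aims to drive this gap toward zero, whereas a constraint on only one rate would leave the other free to differ.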
24.
Question 24
You decided to preprocess and filter down only the relevant columns for your AirBnB housing
dataframe.
The columns that you kept are: id, host_name, bedrooms, neighbourhood_cleansed, price.
In order to obtain the first initial from the host_name column, you have written the following function
that you entitled firstInitialFunction:
def firstInitialFunction(name):
return name[0]
firstInitialFunction("George")
Your goal is to use spark.udf.register to create a UDF from the function above, because
you want to ensure that the UDF will be created in the SQL namespace.
0 / 1 point
airbnbDF.createTempView("airbnbDF")
spark.udf.register(sql_udf = firstInitialFunction)
airbnbDF.createAndReplaceTempView("airbnbDF")
spark.udf.register(sql_udf.firstInitialFunction)
airbnbDF.replaceTempView("airbnbDF")
spark.udf.register("sql_udf", firstInitialFunction)
airbnbDF.createOrReplaceTempView("airbnbDF")
spark.udf.register("sql_udf", firstInitialFunction)
Incorrect
Try going back and reviewing Work with user-defined functions.
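A minimal sketch of how the registered UDF could then be used from SQL, assuming spark is an active SparkSession and airbnbDF is the Spark DataFrame described in the question. The wrapper function and the column alias first_initial are hypothetical; createOrReplaceTempView and spark.udf.register(name, function) are the PySpark calls involved.

```python
def firstInitialFunction(name):
    return name[0]


def register_and_query(spark, airbnbDF):
    """Hypothetical wrapper: expose the DataFrame and the UDF to Spark SQL."""
    # Make the DataFrame queryable by name, replacing any earlier view.
    airbnbDF.createOrReplaceTempView("airbnbDF")
    # First argument is the SQL name of the UDF, second the Python function.
    spark.udf.register("sql_udf", firstInitialFunction)
    return spark.sql("SELECT sql_udf(host_name) AS first_initial FROM airbnbDF")
```

Unlike createTempView, createOrReplaceTempView does not fail when the cell is run a second time in the same session.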
25.
Question 25
You decided to use the AirBnB Housing dataset and the Linear Regression algorithm for which you
want to tune the Hyperparameters.
At this point, for the Boston data set you have executed a test split and for the linear regression you
have built a pipeline.
You now want to test the maximum number of iterations by using ParamGridBuilder(), regardless of
whether you want to use an intercept with the y axis or whether you want to standardize the
features.
0 / 1 point
paramGrid = (ParamGridBuilder(lr)
.run()
paramGrid = (ParamGridBuilder(lr)
.create()
paramGrid = (ParamGridBuilder()
.build()
paramGrid = (ParamGridBuilder()
.addGrid(lr.maxIter, [1, 10, 100])
.search()
Incorrect
Try going back and reviewing Perform model selection with hyperparameter tuning.
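The answer options lost most of their lines in extraction. As a hedged sketch of what a complete grid for this scenario could look like in PySpark: ParamGridBuilder() takes no arguments, each addGrid call adds one hyperparameter with its candidate values, and build() returns the list of combinations. The maxIter values come from the surviving option line; the True/False lists for fitIntercept and standardization are assumptions based on the question text, and lr is assumed to be a pyspark.ml.regression.LinearRegression instance.

```python
from itertools import product

# Candidate values: three iteration limits, with and without intercept / standardization.
MAX_ITER = [1, 10, 100]
FIT_INTERCEPT = [True, False]
STANDARDIZATION = [True, False]

# The grid is the Cartesian product of the value lists: 3 * 2 * 2 combinations.
combinations = list(product(MAX_ITER, FIT_INTERCEPT, STANDARDIZATION))


def build_param_grid(lr):
    """Build the same grid with PySpark (import kept local to this function)."""
    from pyspark.ml.tuning import ParamGridBuilder

    return (ParamGridBuilder()
            .addGrid(lr.maxIter, MAX_ITER)
            .addGrid(lr.fitIntercept, FIT_INTERCEPT)
            .addGrid(lr.standardization, STANDARDIZATION)
            .build())
```

Every extra hyperparameter multiplies the number of models that have to be trained, which is worth keeping in mind before adding long value lists.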
26.
Question 26
You decided to use Python code interactively in your Conda environment. You have all the required
Azure Machine Learning SDK and MLflow packages in the environment.
In order to log metrics in your Azure Machine Learning experiment entitled mlflow-experiment, you
have to use MLflow.
To give the correct answer, you have to replace the code comments that are bolded with some
suitable code options that you find in the answer area.
Considering this, what snippet should you choose to complete the code?
import mlflow
ws = Workspace.from_config()
print("Finished!")
0 / 1 point
#1 mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri()), #2 mlflow.get_run('mlflow-experiment'), #3 mlflow.start_run(), #4 run.log()
#1 mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri()), #2 mlflow.set_experiment('mlflow-experiment'), #3 mlflow.start_run(), #4 mlflow.log_metric
#1 mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri()), #2 mlflow.get_run('mlflow-experiment'), #3 mlflow.start_run(), #4 mlflow.log_metric
Incorrect
Try going back and reviewing Use MLflow to track experiments, log metrics, and compare runs.
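Put together, the four placeholders could be filled in along the following lines. This is a sketch, assuming ws is an azureml Workspace and that the azureml-mlflow integration package is installed so the workspace tracking URI is understood by MLflow. The function name, the metric name and the metric value are hypothetical.

```python
def log_to_azure_ml(ws, metric_name="example_metric", value=1.0):
    """Hypothetical helper: log one metric to the 'mlflow-experiment' experiment."""
    import mlflow

    # 1. Point MLflow at the Azure Machine Learning workspace.
    mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
    # 2. Select (or create) the experiment by name.
    mlflow.set_experiment("mlflow-experiment")
    # 3. Start a run; the context manager ends it automatically.
    with mlflow.start_run():
        # 4. Log a metric through the MLflow API.
        mlflow.log_metric(metric_name, value)
    print("Finished!")
```

mlflow.get_run expects a run ID rather than an experiment name, so it does not fit step 2.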
27.
Question 27
You want to deploy in your Azure Container Instance a deep learning model.
In order to call the model API, you have to use the Azure Machine Learning SDK.
To invoke the deployed model, you have to use native SDK classes and methods.
To give the correct answer, you have to replace the code comments that are bolded with some
suitable code options that you find in the answer area.
Considering this, what snippet should you choose to complete the code?
import json
ws = Workspace.from_config()
service_name = "mlmodel1-service"
1 / 1 point
Correct
These are the correct commands for this task.
28.
Question 28
One of the categorical variables of your AirBnB dataset is room type.
You have three room types, as follows: private room, entire home/apt, and shared room.
In order for the machine learning model to know how to handle the room types, you have to firstly
encode every unique string into a number.
0 / 1 point
uniqueTypesDF = airbnbDF.select("room_type").distinct()
indexerModel = indexer.transform(uniqueTypesDF)
indexedDF = indexerModel.transform(uniqueTypesDF)
display(indexedDF)
uniqueTypesDF = airbnbDF.select("room_type").distinct()
indexerModel = indexer.fit(uniqueTypesDF)
indexedDF = indexerModel.transform(uniqueTypesDF)
display(indexedDF)
uniqueTypesDF = airbnbDF.select("room_type").distinct()
indexer = StringIndexer(inputCol="room_type")
indexerModel = indexer.fit(uniqueTypesDF)
indexedDF = indexerModel.transform(uniqueTypesDF)
display(indexedDF)
uniqueTypesDF = airbnbDF.select("room_type").distinct()
indexerModel = indexer.fit(uniqueTypesDF)
indexedDF = indexerModel.transform(uniqueTypesDF)
display(indexedDF)
Incorrect
Try going back and reviewing Perform featurization of the dataset.
29.
Question 29
You can use the MlflowClient object as the pathway to query previous runs programmatically.
1 / 1 point
client = MlflowClient()
client.list_experiments()
client = MlflowClient()
list.client_experiments()
client = MlflowClient()
client.list_experiments()
client = MlflowClient()
list.experiments()
Correct
This is the correct code syntax for this job.
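A short sketch of the same query wrapped in a helper. One caveat worth knowing: list_experiments() was removed in MLflow 2.0 in favor of search_experiments(), so the helper below falls back to the newer call when the older one is absent. The helper name is hypothetical.

```python
def list_experiment_names(client=None):
    """Return experiment names; builds a default MlflowClient when none is passed."""
    if client is None:
        from mlflow.tracking import MlflowClient
        client = MlflowClient()
    # Older MLflow versions expose list_experiments(); 2.x uses search_experiments().
    if hasattr(client, "list_experiments"):
        experiments = client.list_experiments()
    else:
        experiments = client.search_experiments()
    return [experiment.name for experiment in experiments]
```

In both cases the method is called on the client object, which is why options such as list.client_experiments() cannot work.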
30.
Question 30
If you want to explore the hyperparameters of a model, knowing that every algorithm uses
different hyperparameters for tuning, which method is the most appropriate?
0 / 1 point
exploreParams()
explainParams()
showParams()
getParams()
Incorrect
Try going back and reviewing Describe model selection and hyperparameter tuning.
31.
Question 31
Your task is to clean up the deployments and terminate the “dev” ACI webservice by making use of
the Azure ML SDK after your work with Azure Machine Learning has ended.
0 / 1 point
dev_webservice.delete()
dev_webservice.remove()
dev_webservice.flush()
dev_webservice.terminate()
Incorrect
Try going back and reviewing Use Azure Machine Learning to deploy serving models.
32.
Question 32
The DataFrame you are currently working on contains data regarding the daily sales of ice cream. In
order to compare the avg_temp and units_sold columns you decided to use the corr method which
returned a result of 0.95.
0 / 1 point
Days with high avg_temp values tend to coincide with days that have high units_sold values
On the day with the maximum units_sold value, the avg_temp value was 0.95
Incorrect
Try going back and reviewing Exercise - Explore data.
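To see what a value such as 0.95 means, the Pearson correlation (the default used by the pandas corr method) can be computed by hand. The temperature and sales figures below are made-up illustration data, not values from the exercise.

```python
from math import sqrt


def pearson_corr(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up daily data: warmer days sell more ice cream.
avg_temp = [15, 18, 21, 24, 27, 30]
units_sold = [20, 28, 33, 45, 50, 62]

r = pearson_corr(avg_temp, units_sold)
```

The coefficient always lies between -1 and 1 and describes how closely the two columns move together; it is not a temperature and says nothing about any single day.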
33.
Question 33
You can enable Application Insights in the deployment configuration when you deploy a new
real-time service.
By using the SDK, what code should you write to achieve this goal?
1 / 1 point
dep_config = AciWebservice.deploy_configuration(cpu_cores = 1,
memory_gb = 1,
appinsights=True)
dep_config = AciWebservice.deploy_configuration(cpu_cores = 1,
memory_gb = 1,
enable_app_insights=True)
dep_config = AciWebservice.deploy_configuration(cpu_cores = 1,
memory_gb = 1,
app_insights(True))
dep_config = AciWebservice.deploy_configuration(cpu_cores = 1,
memory_gb = 1,
app_insights=True)
Correct
This is the correct code.
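A minimal sketch of both ways to switch Application Insights on with the azureml-core (v1) SDK: through the enable_app_insights argument of the deployment configuration for a new service, or through update() for a service that is already deployed. The helper names are hypothetical, and service is assumed to be an existing Webservice object.

```python
# Keyword arguments for the deployment configuration, as in the question.
DEPLOY_SETTINGS = {"cpu_cores": 1, "memory_gb": 1, "enable_app_insights": True}


def make_deploy_config():
    """Build an ACI deployment configuration with Application Insights enabled."""
    from azureml.core.webservice import AciWebservice

    return AciWebservice.deploy_configuration(**DEPLOY_SETTINGS)


def enable_on_existing_service(service):
    """Turn Application Insights on for a service that is already deployed."""
    service.update(enable_app_insights=True)
```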
34.
Question 34
You usually take the following steps when you use HorovodRunner in order to develop a distributed
training program:
2. Using the methods described in Horovod usage, define a Horovod training method and make sure
that the import statements are added inside the method.
0 / 1 point
hr = HorovodRunner(tf)
def train():
import tensorflow as np
hvd.init(2)
hr.run(train)
hr = HorovodRunner()
def train():
import tensorflow as tf
hvd.init(np)
hr.run(train)
hr = HorovodRunner(np)
def train():
import tensorflow as tf
hvd.init()
hr.run(train)
hr = HorovodRunner(np=2)
def train():
import tensorflow as tf
hvd.init()
hr.run(train)
Incorrect
Try going back and reviewing Use Horovod to train a deep learning model.
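The pattern behind the options can be sketched as follows, assuming a Databricks ML runtime where sparkdl, TensorFlow and Horovod are available. HorovodRunner takes the number of processes through the np keyword, hvd.init() takes no arguments, and the imports sit inside train() so they are resolved on each worker. The wrapper run_distributed is hypothetical, and the model code is deliberately left out.

```python
def train():
    # Imports live inside the function so they are resolved on each worker.
    import tensorflow as tf  # used by the model code omitted from this sketch
    import horovod.tensorflow as hvd

    hvd.init()
    # Model definition and training loop would follow here.


def run_distributed(num_processes=2):
    """Hypothetical wrapper: launch train() on `num_processes` Horovod processes."""
    from sparkdl import HorovodRunner

    hr = HorovodRunner(np=num_processes)
    return hr.run(train)
```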
35.
Question 35
You’re using the Azure Machine Learning Python SDK to define a pipeline to train a model.
The data used to train the model is read from a folder in a datastore.
You need to ensure the pipeline runs automatically whenever the data in the folder changes.
1 / 1 point
Create a PipelineParameter with a default value that references the location where the training data
is stored
Create a ScheduleRecurrence object with a Frequency of auto. Use the object to create a schedule
for the pipeline
Create a Schedule for the pipeline. Specify the datastore in the datastore property, and the folder
containing the training data in the path_on_datastore property
Correct
To schedule a pipeline to run whenever data changes, you must create a Schedule that monitors a
specified path on a datastore.
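A hedged sketch of such a data-triggered schedule with the azureml-pipeline-core (v1) SDK. Schedule.create accepts datastore and path_on_datastore arguments for this purpose; the schedule name, description, experiment name and folder path below are hypothetical placeholders, and ws, pipeline_id and datastore are assumed to exist already.

```python
def schedule_on_data_change(ws, pipeline_id, datastore):
    """Hypothetical helper: run a published pipeline whenever the folder changes."""
    from azureml.pipeline.core import Schedule

    return Schedule.create(
        ws,
        name="Reactive Training",                 # placeholder name
        description="trains model on data change",
        pipeline_id=pipeline_id,                  # ID of the published pipeline
        experiment_name="Training_Pipeline",      # placeholder experiment
        datastore=datastore,                      # datastore to monitor
        path_on_datastore="data/training",        # placeholder folder path
    )
```

No ScheduleRecurrence is passed here: a time-based recurrence and a datastore trigger are two different ways of starting the pipeline.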