
The result of scaling is a variable in the range of [1, 10]. – False
Input variables are also known as feature variables. – True
The objective function for linear regression is also known as Cost Function. – True
For different parameters of the hypothesis function, we get the same hypothesis function. – False
What is the process of dividing each feature by its range called? – Feature Scaling
What is the function that takes the input and maps it to the output variable called? – Hypothesis Function
How are the parameters updated during Gradient Descent process? – Simultaneously
What is the process of subtracting the mean of each variable from its values called? – Mean Normalization
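The two preprocessing steps above can be sketched in a few lines of Python (a minimal illustration; the sample feature values are made up):

```python
# Feature scaling: divide each feature by its range (max - min).
# Mean normalization: subtract the mean of each feature from its values.

values = [10.0, 20.0, 40.0]  # hypothetical feature column

rng = max(values) - min(values)                   # range = 30
scaled = [v / rng for v in values]                # feature scaling
mean = sum(values) / len(values)                  # mean = 23.33...
normalized = [(v - mean) / rng for v in values]   # mean normalization + scaling
```

After mean normalization the feature is centred on zero, which typically helps gradient descent converge faster.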
____________ controls the magnitude of a step taken during Gradient Descent. – Learning Rate
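A minimal sketch of gradient descent for single-variable linear regression, showing both the learning rate and the simultaneous parameter update mentioned above (the data and the learning rate of 0.1 are made up for illustration):

```python
# Gradient descent for hypothesis h(x) = theta0 + theta1 * x.

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]           # underlying relation y = 2x
theta0, theta1 = 0.0, 0.0
alpha = 0.1                    # learning rate: magnitude of each step
m = len(xs)

for _ in range(1000):
    errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
    grad0 = sum(errors) / m
    grad1 = sum(e * x for e, x in zip(errors, xs)) / m
    # Simultaneous update: both gradients are computed before either theta changes.
    theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
```

Updating `theta0` first and then using the new value to compute `grad1` would violate the simultaneous-update rule.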
Problems that predict real values outputs are called – Regression Problems
What is the Learning Technique in which the right answer is given for each example in the data called? – Supervised Learning
The cost function in linear regression is also called the squared error function. – True
Overfitting and Underfitting are applicable only to linear regression problems. – False
High values of threshold are good for the classification problem. – False
A suggested approach for evaluating the hypothesis is to split the data into training and test set. – True
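The train/test split suggested above can be sketched with the standard library (an illustrative 80/20 split on made-up data; the seed is fixed only to make the example reproducible):

```python
import random

data = list(range(100))        # hypothetical labelled examples
random.seed(0)                 # fixed seed so the split is reproducible
random.shuffle(data)           # shuffle before splitting to avoid ordering bias

split = int(0.8 * len(data))   # 80% train, 20% test
train, test = data[:split], data[split:]
```

The hypothesis is then fit on `train` and evaluated only on `test`.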
What is the range of the output values for a sigmoid function? – (0, 1)
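The sigmoid's output range can be checked directly; it asymptotes at 0 and 1 but never reaches them:

```python
import math

def sigmoid(z):
    # Maps any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# sigmoid(0) is exactly 0.5; large |z| approaches the 0 and 1 asymptotes.
```

This is why the sigmoid output can be read as a class probability in logistic regression.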
Underfit data has a high variance. – False

Overfit data has high bias. – False


For ____________, the error is calculated by finding the sum of squared distances between actual and predicted values. – Regression
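For regression, that error can be computed directly (the actual/predicted values here are made up):

```python
actual = [3.0, 5.0, 7.0]
predicted = [2.5, 5.0, 8.0]

# Sum of squared distances between actual and predicted values.
squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
# 0.5**2 + 0**2 + 1**2 = 1.25
```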
Linear Regression is an optimal function that can be used for classification problems. – False
____________ is the line that separates y = 0 and y = 1 in a logistic function. – Decision Boundary
I have a scenario where my hypothesis fits my training set well but fails to generalize for the test set. What is this scenario called? – Overfitting
Where does the sigmoid function asymptote? – 0 and 1

For an underfit data set, the training and the cross-validation error will be high. – True
For an overfit data set, the cross-validation error will be much bigger than the training error. - True
____________ measures how far the predictions are from the actual values. – Bias

When an ML Model has a high bias, getting more training data will help in improving the model. – False
For ____________, the error is determined by getting the proportion of values misclassified by the model. – Classification
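For classification, that proportion is the misclassification rate (illustrative labels):

```python
actual = [1, 0, 1, 1, 0]
predicted = [1, 1, 1, 0, 0]

# Proportion of examples where the predicted class differs from the actual one.
error_rate = sum(a != p for a, p in zip(actual, predicted)) / len(actual)
# 2 of the 5 labels differ, so error_rate = 0.4
```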
____________ function is used as a mapping function for classification problems. – Sigmoid
What measures the extent to which the predictions change between various realizations of the model? – Variance
Problems, where discrete-valued outputs are predicted, are called? – Classification Problems

Axioms:
Which learning methodology applies the conditional probability of all the variables with respect to the dependent variable? – Supervised
Are the heuristics for rule learning and the heuristics for decision trees the same? – False
Which learning category does a Decision Tree fall into? – Supervised
What is the benefit of Naïve Bayes? – Requires less training data
What is the advantage of using an iterative algorithm like gradient descent? (select the best) – For non-linear regression problems, there is no closed-form solution
For which one of these relationships could we use a regression analysis? – Relationship between Height and Weight (both quantitative)

Does Logistic Regression check for a linear relationship between dependent and independent variables? – False
What helps SVM implement the algorithm in high-dimensional space? – Kernel
Kernel methods can be used for supervised and unsupervised problems – True
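The kernel idea behind the two questions above — computing similarities as if the data lived in a higher-dimensional space, without ever constructing that space — can be sketched for the polynomial kernel (a hand-rolled illustration, not a library API):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, z):
    # k(x, z) = (x . z)**2, computed entirely in the original space.
    return dot(x, z) ** 2

def quad_features(x):
    # Explicit quadratic feature map for 2-D input: (x1^2, x2^2, sqrt(2)*x1*x2).
    return [x[0] ** 2, x[1] ** 2, math.sqrt(2) * x[0] * x[1]]

x, z = [1.0, 2.0], [3.0, 4.0]
# Both routes give the same similarity: (1*3 + 2*4)**2 = 121.
```

The kernel evaluates the dot product in the mapped space at the cost of a dot product in the original space, which is what makes high-dimensional (even infinite-dimensional) feature spaces tractable for SVMs.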
A Perceptron is ____________. – A single-layer feed-forward neural network
While running the same algorithm multiple times, which algorithm produces the same results? – Hierarchical Clustering

Correlation and regression are concerned with the relationship between ____________. – Two quantitative variables
SVM will not perform well on noisy data because – the target classes could overlap
What is the main problem with using a single regression line? – Merging of groups
If the outcome is continuous, which model should be applied? – Linear Regression
The standard approach to supervised learning is to split the set of examples into a training set and a test set. – True
Which model helps SVM to implement the algorithm in high-dimensional space? – Kernel
Which methodology works well with clear margins of separation between points? – SVM
Which type of clustering can handle Big Data? – K-Means clustering
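A minimal 1-D K-Means sketch (made-up data; k = 2 with hypothetical starting centroids), illustrating why it scales to large data — each pass is linear in the number of points:

```python
points = [1.0, 1.5, 2.0, 10.0, 10.5, 11.0]
centroids = [0.0, 5.0]            # hypothetical initial centroids

for _ in range(10):               # a few assignment/update passes
    clusters = [[], []]
    for p in points:
        # Assign each point to its nearest centroid.
        nearest = min(range(2), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    # Move each centroid to the mean of its assigned points.
    centroids = [sum(c) / len(c) for c in clusters]
```

Note that the result depends on the initial centroids, which is why K-Means (unlike hierarchical clustering) can produce different results across runs when initialization is random.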
SVM uses which method for pattern analysis in high-dimensional space? – Kernel

Which of these best represents a property of Kernels? – Modularity


Consider a regression equation. Which of the following could not be answered by regression? – Estimate whether the association is linear or non-linear
Which of the following is not an example of Clustering? – RFM
Which technique implicitly defines the class of possible patterns by introducing a notion of similarity between data? – Kernel
