Machine Learning MCQs


Columns in the original sheet: Sl. No., Question, Options 1–4, Number of options, Correct answer, Marks, Negative marks. Negative marks are 0 throughout; where the sheet omits the correct answer it is shown below as "—".

1-Mark Questions

1. To find the minimum or maximum of a function, we set the gradient to zero because:
   1) the value of the gradient at extrema of a function is always zero  2) depends on the type of problem  3) depends on the accuracy  4) none
   Answer: 1

2. The negative of the gradient vector ∇E(w) gives the direction of ----------.
   1) steepest decrease  2) steepest increase  3) nothing
   Answer: 1

3. What is true regarding the back-propagation rule?
   1) error in the output is propagated backwards only to determine weight updates  2) there is no feedback of signal at any stage  3) it is also called the generalized delta rule  4) all the mentioned
   Answer: 4

4. What are the general limitations of the back-propagation rule?
   1) local minima problem  2) slow convergence  3) scaling  4) all the mentioned
   Answer: 4

5. What is a perceptron?
   1) an auto-associative neural network  2) a single-layer feed-forward neural network with preprocessing  3) a double-layer auto-associative neural network  4) a neural network that contains feedback
   Answer: 2

6. What is an auto-associative network?
   1) a neural network that contains no loops  2) a neural network that has only one loop  3) a neural network that contains feedback  4) a single-layer feed-forward neural network with preprocessing
   Answer: 3

7. What is back propagation?
   1) it is another name given to the curvy function in the perceptron  2) it is the transmission of error back through the network to adjust the inputs  3) it is the transmission of error back through the network to allow weights to be adjusted so that the network can learn  4) none
   Answer: 3

8. The logic function that cannot be implemented by a perceptron having two inputs is ----.
   1) AND  2) OR  3) NOR  4) XOR
   Answer: 4
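Worked check for question 8 (a sketch in Python): a two-input perceptron is a linear threshold unit, so a brute-force search over a small grid of weights and thresholds finds settings that realize AND and OR but none that realize XOR. The grid range and step are arbitrary illustration choices.

```python
from itertools import product

def threshold_unit(w1, w2, theta, x1, x2):
    """A two-input perceptron: fires iff the weighted sum reaches the threshold."""
    return 1 if w1 * x1 + w2 * x2 >= theta else 0

def realizable(target):
    """Brute-force search over a small grid of weights and thresholds."""
    grid = [x * 0.5 for x in range(-8, 9)]  # -4.0 .. 4.0 in steps of 0.5
    for w1, w2, theta in product(grid, repeat=3):
        if all(threshold_unit(w1, w2, theta, x1, x2) == target(x1, x2)
               for x1, x2 in product([0, 1], repeat=2)):
            return True
    return False

print(realizable(lambda a, b: a & b))  # AND: True
print(realizable(lambda a, b: a | b))  # OR:  True
print(realizable(lambda a, b: a ^ b))  # XOR: False -- not linearly separable
```

The XOR failure is exactly the linear-separability limitation asked about in questions 14, 26, and 77.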
9. The fitness function defines the criterion for ---- potential hypotheses.
   1) Symbolic  2) Ranking  3) candidate  4) Mutate
   Answer: 2

10. The hypothesis space of an artificial neural network is -----------------.
   1) Discrete  2) Continuous  3) neither discrete nor continuous  4) none
   Answer: 2

11. One of the following is not an application of ANN:
   1) Time-series prediction  2) Unmanned vehicles  3) Remote control  4) Robot control
   Answer: 3

12. The representation of an ANN is a ______ type of graph.
   1) Cyclic  2) Acyclic  3) Directed  4) All of the above
   Answer: 4

13. In back propagation, α represents the
   1) Learning rate  2) Momentum constant  3) simulation rate
   Answer: 2

14. The necessary condition to apply the perceptron rule is that the data should be
   1) Linearly separable  2) free from empty values  3) free from noise  4) none of the above
   Answer: 1

15. The sigmoid function is used for multi-layer networks because it is a
   1) Smooth, differentiable function  2) Smooth, continuous function  3) piece-wise continuous function  4) piece-wise differentiable function
   Answer: 1
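The smooth-differentiable property in question 15 is what lets gradient descent flow error terms backward through a multi-layer network; a minimal sketch:

```python
import math

def sigmoid(y):
    """Logistic squashing function: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-y))

def sigmoid_derivative(y):
    """Differentiable everywhere: sigma'(y) = sigma(y) * (1 - sigma(y))."""
    s = sigmoid(y)
    return s * (1.0 - s)

print(sigmoid(0.0))             # 0.5
print(sigmoid_derivative(0.0))  # 0.25, the maximum slope
```

The convenient form of the derivative is why the back-propagation error terms contain the factor o(1 − o).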

16. The following is not associated with GA:
   1) Gradient descent  2) Crossover  3) Selection  4) Mutation
   Answer: 1

17. In GA, the searching process performs like
   1) jumping from hypothesis to hypothesis  2) moving smoothly from hypothesis to hypothesis  3) moving quickly from hypothesis to hypothesis  4) there is no search in GA
   Answer: 1

18. Various selection mechanisms in GA:
   1) Proportionate selection  2) tournament selection  3) rank selection  4) All of the above
   Answer: 4

19. If OUTLOOK has the three values sunny, overcast, rain, then OUTLOOK = sunny or overcast or rain is represented as the bit string
   1) 111  2) 110  3) 100  4) none of the above
   Answer: 1

20. In a typical GA, how many inputs do we have to provide?
   1) 5  2) 4  3) 3  4) 2
   Answer: 1

21. ANNs provide a general, practical method for learning ---------- functions from examples.
   1) real-valued, discrete-valued, and vector-valued  2) only real-valued  3) only discrete-valued  4) only vector-valued
   Answer: 1

22. ANN learning is robust to ------------ in the training data.
   1) Semantics  2) errors  3) syntaxes  4) consistency
   Answer: 2

23. The human brain is estimated to contain a densely interconnected network of approximately ------------ neurons.
   1) 10^10  2) 10^9  3) 10^11  4) 10^8
   Answer: 3

24. If the training example is correctly classified by the perceptron, then the value of (t − o) is ----------.
   1) 2  2) 1  3) −1  4) 0
   Answer: 4

25. If xi = 0.8, η = 0.1, t = 1 and o = −1, then the weight update will be ---------.
   1) 0.15  2) 0.14  3) 0.13  4) 0.16
   Answer: 4
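Worked check for question 25 (a Python sketch of the perceptron training rule Δwi = η(t − o)xi):

```python
def perceptron_weight_update(eta, t, o, x_i):
    """Perceptron training rule: delta w_i = eta * (t - o) * x_i."""
    return eta * (t - o) * x_i

# Values from question 25: eta = 0.1, target t = 1, output o = -1, input x_i = 0.8
delta_w = perceptron_weight_update(eta=0.1, t=1, o=-1, x_i=0.8)
print(round(delta_w, 2))  # 0.1 * 2 * 0.8 = 0.16 (option 4)
```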

26. The perceptron rule finds a successful weight vector when the training examples are -------------- separable.
   1) Linearly  2) Non-linearly  3) Probabilistically
   Answer: 1

27. The learning rate determines the -------- in the gradient-descent search.
   1) Vector  2) step size  3) optimal solution  4) Global minima
   Answer: 2
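The step-size role of the learning rate in question 27 can be seen on a toy objective; this sketch minimizes E(w) = w² (an illustrative choice, not from the source), where a small η converges and an oversized η overshoots and diverges:

```python
def gradient_descent(w0, eta, steps):
    """Minimize E(w) = w^2 (gradient 2w); eta scales each step."""
    w = w0
    for _ in range(steps):
        w -= eta * 2 * w  # w_new = w - eta * dE/dw
    return w

print(gradient_descent(w0=1.0, eta=0.1, steps=20))  # shrinks toward the minimum at 0
print(gradient_descent(w0=1.0, eta=1.1, steps=20))  # too-large eta: oscillates and diverges
```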

28. Incremental gradient descent is also called ------------.
   1) Delta rule  2) perceptron rule  3) Stochastic gradient descent  4) standard gradient descent
   Answer: 3

29. In standard gradient descent, the error is ----------- over all examples before updating weights.
   1) conjoined  2) disjoined  3) summed  4) multiplied
   Answer: 3

30. The squashing function is used to calculate the output of a ---------- unit.
   1) perceptron  2) sigmoid  3) both perceptron and sigmoid  4) neither perceptron nor sigmoid
   Answer: 2

31. Identify which of the following is not a valid activation function.
   1) tanh  2) sigmoid  3) logarithmic  4) relu
   Answer: —

32. The process which searches through the hypothesis space to find the weights that best fit the training examples:
   1) gradient descent  2) concept learning  3) feed-forward network  4) activation function
   Answer: —

33. The error of the output units is adjusted among the network weights of an artificial neural network through the ______ algorithm.
   1) Backward pass  2) Forward pass  3) perceptron  4) Back propagation
   Answer: —

34. A method of representing the error of the entire distribution in terms of the sample error on the training data is _________.
   1) Confidence intervals  2) Normal distribution  3) chi-squared test  4) Pearson's coefficient
   Answer: —

35. The Central Limit Theorem is applicable for examples taken from a ___________ distribution.
   1) binomial  2) log-linear  3) non-linear  4) normal
   Answer: —

36. The operator not used in genetic algorithms is _________.
   1) mutation  2) selection  3) crossover  4) evaluation
   Answer: —

37. "Add Alternative" and "Drop Condition" are employed in the __________ system.
   1) Genetic Programming  2) GABIL  3) Genetic Algorithms  4) Generation
   Answer: —

38. "*" is used to represent ________ in the schema theorem.
   1) no value is valid  2) any legal value is valid  3) only some values are valid  4) exponentiation
   Answer: —

39. The population size of the future generation with respect to the current generation ____.
   1) increases  2) decreases  3) remains constant  4) increases or decreases
   Answer: —

40. Optimised computer programs can be automatically generated using the __________ approach.
   1) Genetic Algorithms  2) Genetic Programming  3) Schema Theorem  4) Candidate Generation
   Answer: —

2-Mark Questions

41. A 3-input neuron is trained to output a zero when the input is 110 and a one when the input is 111. After generalization, the output will be zero when and only when the input is ---------------?
   1) 000 or 110 or 011 or 101  2) 010 or 100 or 110 or 101  3) 000 or 010 or 110 or 100  4) 100 or 111 or 101 or 001
   Answer: 3

42. A 4-input neuron has weights 1, 2, 3 and 4. The transfer function is linear with the constant of proportionality equal to 2. The inputs are 4, 10, 5 and 20 respectively. What will be the output?
   1) 238  2) 76  3) 119  4) 123
   Answer: 1
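Worked check for question 42 (a Python sketch): a linear transfer function with proportionality constant k simply scales the weighted sum.

```python
def linear_neuron(weights, inputs, k):
    """Linear transfer function: output = k * sum(w_i * x_i)."""
    return k * sum(w * x for w, x in zip(weights, inputs))

# Values from question 42: weights 1,2,3,4; inputs 4,10,5,20; constant k = 2
print(linear_neuron([1, 2, 3, 4], [4, 10, 5, 20], k=2))  # 2 * (4+20+15+80) = 238
```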

43. What are the offspring after applying 2-point crossover on the two chromosomes given as chromosome 1 = 11011 | 00100 | 110110 and chromosome 2 = 10101 | 11000 | 011110?
   1) offspring 1 = 1101100100011110, offspring 2 = 1010111000110110  2) offspring 1 = 1101111000110110, offspring 2 = 1010100100011110  3) offspring 1 = 11011011011011, offspring 2 = 101010111100101  4) none
   Answer: 2

44. The ---------- operator produces small random changes to the bit string by choosing a single bit at random, then changing its value.
   1) crossover  2) mutation  3) crossover and mutation  4) selection
   Answer: 2

45. What are the offspring after applying 1-point crossover on the given chromosomes chromosome 1 = 100 | 11101 and chromosome 2 = 101 | 01011?
   1) offspring 1 = 10001011, offspring 2 = 10111101  2) offspring 1 = 10011101, offspring 2 = 10101011  3) offspring 1 = 10001001, offspring 2 = 10111111  4) none
   Answer: 1
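Worked check for question 45 (a Python sketch): one-point crossover swaps the tails of the two parents after the chosen point.

```python
def one_point_crossover(parent1, parent2, point):
    """Swap the tails of two bit strings after the crossover point."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

# Chromosomes from question 45, crossover point after the third bit
print(one_point_crossover("10011101", "10101011", 3))  # ('10001011', '10111101')
```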
46. Neural networks are complex ------------ with many parameters.
   1) linear functions  2) non-linear functions  3) discrete functions  4) exponential functions
   Answer: 1

47. In which crossover is all the data beyond that point in the organism string swapped between the two parent organisms?
   1) single-point crossover  2) two-point crossover  3) uniform crossover  4) mutation
   Answer: 1

48. What is two-point crossover?
   1) two random points are used from one generation to the next  2) two random points are chosen on the individual chromosomes (strings) and the genetic material is exchanged at these points  3) a crossover point on the parent organism string is selected  4) none
   Answer: 2
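The exchange described in question 48 can be sketched directly; applied to the chromosomes of question 43 (points after bit 5 and bit 10), it reproduces the offspring in option 2 of that question:

```python
def two_point_crossover(parent1, parent2, p1, p2):
    """Exchange the middle segment between two chosen points on the parents."""
    child1 = parent1[:p1] + parent2[p1:p2] + parent1[p2:]
    child2 = parent2[:p1] + parent1[p1:p2] + parent2[p2:]
    return child1, child2

# Chromosomes from question 43, crossover points after bit 5 and bit 10
c1, c2 = two_point_crossover("1101100100110110", "1010111000011110", 5, 10)
print(c1)  # 1101111000110110
print(c2)  # 1010100100011110
```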

49. An artificial neuron receives n inputs x1, x2, …, xn with weights w1, w2, …, wn attached to the input links. The weighted sum ---------- is computed to be passed on to a non-linear filter ∅ called the activation function to release the output.
   1) ∑wi  2) ∑xi  3) ∑wi + ∑xi  4) ∑wi·xi
   Answer: 4

50. Back propagation is a learning technique that adjusts weights in the neural network by propagating weight changes ----.
   1) Forward from source to sink  2) Backward from sink to source  3) Forward from source to hidden nodes  4) Backward from sink to hidden nodes
   Answer: 2

51. ANNs cannot provide learning mechanisms for _______ functions.
   1) Real-valued functions  2) Discrete-valued functions  3) Random-valued functions  4) none of the above
   Answer: 4

52. In which aspect are ANNs more efficient than biological neurons?
   1) Parallel processing  2) distributed processing  3) Information transmission time  4) Solving complex problems
   Answer: 3

53. In a multi-layer neural network, if there are 30 units in the input layer and 5 units in the hidden layer, then the size of the weight matrix used to transfer the data is
   1) 31×5  2) 5×31  3) 30×5  4) 5×30
   Answer: 2

54. Which of the following is not true for standard back propagation?
   1) Weight updation is performed after processing the complete training data  2) it may often be affected by local minima  3) it tries to minimize the squared error between target and estimated output  4) It does give a guarantee for the solution
   Answer: 4

55. The following is not true for recurrent NNs (RNN):
   1) Directed cyclic graphs are used to depict them  2) They have more representational power  3) the output at the previous instant is considered as input to the next instant  4) Neurons will be added dynamically
   Answer: 4

56. Crowding is an issue in GA; which of the following is not used to resolve it?
   1) Tournament mechanism  2) rank selection  3) restricting members having the highest fitness value  4) None of the above
   Answer: 4

57. Which of the following is not correct?
   1) Crossover creates new offspring by considering parents  2) Selection is used to perform the next generation  3) mutation is used to create new offspring by considering two parents  4) the fitness function is used to calculate the fitness value
   Answer: 3

58. The algorithm which is not an evolutionary algorithm:
   1) Ant colony algorithm  2) Wolf colony algorithm  3) Whale colony algorithm  4) Lion colony algorithm
   Answer: 4

59. GAs are more popular, but the following reason is not suitable:
   1) Adopted from natural species  2) Highly parallelized  3) Can simulate a more complex hypothesis space  4) Highly randomized
   Answer: 4

60. Individual learning impacting the rate of the evolutionary process is called
   1) Darwinian evolution  2) Lamarckian evolution  3) Baldwin evolution  4) Mendelian evolution
   Answer: 3

61. Genetic Programming is a variant of Genetic Algorithms in which the hypotheses being manipulated are ------------ rather than bit strings.
   1) Computer programs  2) Nodes  3) Edges  4) Trees
   Answer: 1

62. The most fit members of the population are selected to produce -----------.
   1) Results  2) new offspring  3) two hypotheses  4) two nodes
   Answer: 2

63. Genetic algorithms conduct a randomized, parallel, ----------- search for hypotheses that optimize a predefined fitness function.
   1) DFS  2) BFS  3) Hill-climbing
   Answer: 3

64. The output of (DU (MT CS) (NOT CS)) is ------------.
   1) Stack-full condition  2) stack-not-full condition  3) stack empty  4) stack with one block
   Answer: 3

65. m(1**1, t) = ----------
   1) 12  2) 4  3) 8  4) 16
   Answer: 2
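Worked check for question 65 (a sketch, assuming the population at time t contains every 4-bit string): in schema notation, "1**1" matches any 4-bit string that starts and ends with 1, and m(s, t) counts the instances of schema s in the population.

```python
from itertools import product

def matches(schema, string):
    """A string matches a schema if it agrees at every non-'*' position."""
    return all(s == '*' or s == c for s, c in zip(schema, string))

population = [''.join(bits) for bits in product('01', repeat=4)]  # all 4-bit strings
count = sum(matches('1**1', s) for s in population)
print(count)  # 4: 1001, 1011, 1101, 1111
```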
66. A practical difficulty in some GA applications is the problem of --------.
   1) Crowding  2) Crossover  3) Mutation  4) Uniqueness
   Answer: 1

67. Two-point crossover on 1 0 [0 1] 1 and 1 [1 1 0] 1 = ---------- and --------------.
   1) 11100, 10101  2) 10000, 11111  3) 10111, 10101  4) 101101, 1011
   Answer: 4

68. The number of operands taken by the operator mutation is ---------------.
   1) 2  2) 3  3) 4  4) 1
   Answer: 4

69. Find the selection function among the following:
   1) fitness function  2) roulette wheel  3) probability  4) threshold
   Answer: 2

70. In GA, the number of pairs to be replaced in the population is -------.
   1) r·p/2  2) (1−r)·p/2  3) p/2  4) p·r
   Answer: 1

71. The parameter which determines the step size in gradient descent is termed ____________.
   1) linear separability  2) learning rate  3) non-linear activation function  4) convex hypothesis space
   Answer: —

72. The error of an artificial neural network after every pass is evaluated by computing ____________.
   1) Sum of absolute values of errors of all output units  2) Sum of squared errors of all output units  3) Sum of errors of all output units  4) Sum of cubed errors of all output units
   Answer: —

73. During back propagation, the error term provided is estimated for updating the ____________.
   1) Hidden unit  2) output unit  3) Input unit  4) network weights
   Answer: —

74. Suppose a data sample contains n = 50 training examples and the given hypothesis commits 10 errors over this data. Considering the sample error, then with 95% confidence, the lower bound on the error of the entire distribution is _________ approximately.
   1) 0.356  2) 0.3723  3) 0.1729  4) 0.427
   Answer: —
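For question 74, the textbook two-sided 95% interval (z = 1.96) around the sample error error_S = 10/50 = 0.2 can be sketched as below; note that the lower bound it yields does not match any of the listed options, so the question may intend a different formula or z value — this shows only the standard computation.

```python
import math

def error_bounds(errors, n, z=1.96):
    """Two-sided confidence interval for the true error given a sample error:
    error_S +/- z * sqrt(error_S * (1 - error_S) / n)."""
    e = errors / n
    margin = z * math.sqrt(e * (1 - e) / n)
    return e - margin, e + margin

lower, upper = error_bounds(errors=10, n=50)
print(round(lower, 4), round(upper, 4))  # 0.0891 0.3109
```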

75. One of the parameters determining a normal distribution is ____________.
   1) median  2) mode  3) recall  4) mean
   Answer: —

76. Paired t-tests are not applicable to _____.
   1) two samples drawn from the same normal distribution  2) two samples drawn from different distributions  3) two samples drawn from the same binomial distribution  4) one sample drawn from two different distributions
   Answer: —

77. Perceptrons are effective for problems that are ____________.
   1) Linearly separable  2) not linearly separable  3) represented by a polynomial target function  4) represented by an exponential target function
   Answer: —

78. The binary "AND" function is an example of a ______________ problem.
   1) linearly separable  2) not linearly separable  3) polynomial  4) exponential
   Answer: —

79. When the output of every unit in an artificial neural network is connected as input to every unit in the next layer, the neural network is said to be ________________ in nature.
   1) connected  2) connection-less  3) partially connected  4) fully connected
   Answer: —

80. Using the gradient descent algorithm in neural networks does not always guarantee ___________.
   1) Global optima  2) Local minima  3) Local maxima  4) Local optima
   Answer: —
