Hatdog 1.2
The main difference between the ID3 and C4.5 algorithms lies in the evaluation criteria of node classification
True
o False
If there is a complex non-linear relationship between independent variable x and dependent variable y, the tree model
may be used as a regression method.
True
o False
The C4.5 algorithm uses the Gini index as the evaluation criteria for node classification.
True
o False
When a function is called in Python, immutable objects such as numbers and characters are passed by value.
True
o False
For data with two dimensions, when k-means is used for clustering, the clustering result is displayed as a sphere in the
space.
True
o False
After training the support vector machine (SVM), you can only retain the support vector and discard all non-support
vectors. The classification capability of the model remains unchanged.
True
o False
In the convolutional neural network (CNN), convolutional layers and pooling layers must appear alternately.
True
o False
Principal component analysis (PCA) can greatly reduce the data dimension while retaining most of the information in the
original dataset.
True
o False
In Python, the title() function can capitalize the initial letter of a string.
True
o False
In Python, when an object is deleted, the destructor function is automatically called.
True
o False
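The destructor behavior in the statement above can be observed with a small sketch. It assumes CPython, where an object is reclaimed as soon as its last reference disappears; other interpreters may delay the call. The Resource class and the events list are hypothetical names used only for illustration.

```python
# Illustrative only: Resource and events are hypothetical names.
events = []

class Resource:
    def __del__(self):
        # Runs when the object is about to be destroyed.
        events.append("destructor called")

r = Resource()
del r  # removes the only reference; CPython then runs __del__ right away
print(events)
```

Note that del removes a name binding. The destructor runs only once no references to the object remain.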
True
o False
o True
False
Convolutional neural network (CNN) can only be used to solve visual problems and cannot be used for natural language
processing.
o True
False
Support vector machine (SVM) has a good effect in dealing with high-dimensional nonlinear problems.
True
o False
If the number of layers of a neural network is too large, gradient disappearance or gradient explosion may occur.
True
o False
In Python, a static method can be directly accessed and does not need to be called using
CLASSNAME.STATIC_METHOD_NAME().
True
o False
When a function is called in Python, mutable objects such as lists and dictionaries are passed by reference.
True
o False
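The difference between immutable and mutable arguments can be seen in a short sketch. The function names rebind and mutate are hypothetical. Strictly speaking, Python passes object references in both cases; what differs is whether the object itself can be changed in place.

```python
def rebind(n):
    n += 1           # builds a new int; the caller's variable is untouched
    return n

def mutate(items):
    items.append(4)  # changes the same list object the caller holds

x = 1
rebind(x)
lst = [1, 2, 3]
mutate(lst)
print(x, lst)
```

After both calls, x still refers to 1, while lst has gained a fourth element.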
In Python, the string function capitalize() can capitalize the initial letter of a string.
True
o False
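A quick comparison of the two string methods mentioned in these questions, using an arbitrary sample string: title() capitalizes the first letter of every word, while capitalize() capitalizes only the first character of the string and lowercases the rest.

```python
text = "hello world"

titled = text.title()            # first letter of each word
capitalized = text.capitalize()  # first letter of the whole string only

print(titled)
print(capitalized)
```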
True
o False
Multiple Choice Single Answer.
Assume that the statement print(6.3 - 5.9 == 0.4) is executed in the Python interpreter, and the result is False. Which of
the following statements about the result is true?
A. The Boolean operation cannot be used for comparing floating-point numbers
B. It is caused by the priority of operators
C. Python cannot exactly represent floating-point numbers
D. In Python, the non-zero value is interpreted as false
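The comparison from this question can be reproduced directly. The sketch below also shows one common workaround, math.isclose from the standard library, which compares floats within a tolerance instead of testing exact equality.

```python
import math

diff = 6.3 - 5.9

exact_equal = (diff == 0.4)              # binary floats cannot store these decimals exactly
close_enough = math.isclose(diff, 0.4)   # tolerance-based comparison

print(diff)
print(exact_equal)
print(close_enough)
```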
Data scientists may use multiple algorithms (models) at the same time for prediction, and integrate the results of these
algorithms for final prediction (ensemble learning). Which of the following statements about ensemble learning is true?
Assume that training data is sufficient, and the dataset is used to train a decision tree. To reduce the time required for
model training, which of the following statements is true?
Imbalanced data of binary classification refers to the dataset with a large difference between the proportion of positive
samples and the proportion of negative samples, for example, 9:1. If a classification model is trained based on the
dataset and the accuracy of the model on training samples is 90%, which of the following statements is true?
A. The accuracy of the model is high, and the model does not need to be optimized.
B. The accuracy of the model is not satisfactory, and the model needs to be retrained after data sampling.
B. Logistic regression
D. Random forest
Which of the following statements about support vector machines (SVM) is false?
B. In high-dimensional space, SVM uses hyperplanes with the maximum interval for classification.
The data output by binary classification can be considered as a probability value. Generally, a threshold is set, for
example, 0.5. If the value is greater than the threshold, it is a positive category. Otherwise, it is a negative category. If
the threshold is increased from 0.5 to 0.7, which of the following changes will occur in the precision and recall rate of
the model?
A. The precision increases or remains unchanged, and the recall rate increases or remains unchanged.
B. The precision increases or remains unchanged, and the recall rate decreases or remains unchanged.
C. The precision decreases or remains unchanged, and the recall rate increases or remains unchanged.
D. The precision decreases or remains unchanged, and the recall rate decreases or remains unchanged.
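The effect of raising the threshold can be checked on a toy example. The scores and labels below are made up purely for illustration, and precision_recall is a hypothetical helper, not part of any library.

```python
# Hypothetical scores and labels (1 = positive), for illustration only.
scores = [0.95, 0.85, 0.75, 0.65, 0.60, 0.55, 0.40, 0.30]
labels = [1, 1, 1, 0, 1, 0, 0, 1]

def precision_recall(threshold):
    # A sample is predicted positive when its score exceeds the threshold.
    predicted = [1 if s > threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

p_low, r_low = precision_recall(0.5)
p_high, r_high = precision_recall(0.7)
print(p_low, r_low)
print(p_high, r_high)
```

Raising the threshold can only shrink the set of predicted positives, so recall cannot go up. In this toy data precision rises as well, although how precision moves in general depends on the data.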
C. Linear regression assumes that data does not have multiple linear correlations.
Which of the following procedures is not a procedure for building a decision tree?
A. Feature selection
D. Pruning
When decision tree is used for classification, if the value of an input feature is continuous, the dichotomy is used to
discretize the continuous attribute. It means that the classification is performed based on whether the value is greater
than or less than a threshold. If the multi-path division is used, each value is divided into a branch. What is the biggest
problem of this method?
B. The performance of both the training set and the test set is poor.
C. The performance of the training set is good, but the performance of the test set is poor.
D. The performance of the training set is poor, and the performance of the test set is good.
For a dataset with only one independent variable x, what is the number of coefficient(s) required to construct a simplest
linear regression model?
A. 1
B. 2
C. 3
D. 4
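A simple linear regression on one input variable fits a line y = slope * x + intercept. The sketch below estimates both values with the ordinary least-squares formulas on made-up data; the data points and variable names are assumptions for illustration.

```python
# Made-up data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# Ordinary least squares for a single input variable.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

print(slope, intercept)
```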
Which of the following algorithms is not an ensemble algorithm?
A. XGBoost
B. GBDT
C. Random forest
Assume that a classification model is built using logistic regression to obtain the accuracy of training samples and test
samples. Then, add a new feature to the data, keep other features unchanged, and train the model again. Which of the
following statements is true?
About the values of four variables a, b, c, and d after executing the following code, which of the following statements is
false?
import copy
a = [1, 2, 3, 4, ['a', 'b']]
b = a
c = copy.copy(a)
d = copy.deepcopy(a)
a.append(5)
a[4].append('c')
A. a == [1, 2, 3, 4, ['a', 'b', 'c'], 5]
B. b == [1, 2, 3, 4, ['a', 'b', 'c'], 5]
C. c == [1, 2, 3, 4, ['a', 'b', 'c']]
D. d == [1, 2, 3, 4, ['a', 'b', 'c']]
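The code in this question can be run as-is to see how plain assignment, a shallow copy, and a deep copy differ. The sketch below is the same snippet with the final values printed.

```python
import copy

a = [1, 2, 3, 4, ['a', 'b']]
b = a                  # same object as a
c = copy.copy(a)       # new outer list, but the inner list is shared with a
d = copy.deepcopy(a)   # fully independent copy, inner list included

a.append(5)            # affects a and b only
a[4].append('c')       # affects a, b, and c through the shared inner list

print(a)
print(b)
print(c)
print(d)
```

Only the deep copy d keeps the original inner list ['a', 'b'].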
A. Increasing the number of neural network layers may increase the classification error rate of a test set.
B. Reducing the number of neural network layers can always reduce the classification error rate of a test set.
C. Increasing the number of neural network layers can always reduce the classification error rate of a training set.
For a multi-layer perceptron (MLP), the number of nodes at the input layer is 10, and the number of nodes at the hidden
layer is 5. The maximum number of connections from the input layer to the hidden layer is?
B. Less than 50
C. Equal to 50
D. Greater than 50
Assume that there is a trained deep neural network model for identifying cats and dogs, and now this model will be used
to detect the locations of cats in a new dataset. Which of the following statements is true?
B. Remove the last layer of the network and retrain the existing model.
C. Adjust the last several layers of the network and change the last layer to the regression layer.
Which of the following statements about the k-nearest neighbor (KNN) algorithm is false?
A. KNN is a non-parametric method which is usually used in datasets with irregular decision boundaries.
Assume the training data is sufficient, and the dataset is used to train a decision tree. To reduce the time required for
model training, which of the following statements is true?
If you want to predict the probability of n classes (p1, p2, …, pn), and the sum of probabilities of n classes is equal to 1,
which of the following functions can be used as the activation function in the output layer?
A. softmax
B. ReLu
C. sigmoid
D. tanh
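For reference, a minimal sketch of the softmax function, which maps arbitrary scores to values between 0 and 1 that sum to 1. The input scores are arbitrary, and subtracting the maximum is a common numerical-stability trick rather than part of the definition.

```python
import math

def softmax(logits):
    shift = max(logits)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)
print(sum(probs))
```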
In which case can a neural network model be called a deep learning model?
D. Python first
During the training of a convolutional neural network (CNN), it is often found that the precision of a model in a test set
gradually increases as the number of parameters increases. However, when a certain value is reached, the precision
decreases. What is the cause of this phenomenon?
A. Although the number of convolutional kernels increases, only a small number of convolutional kernels participate in
the prediction.
B. When the number of convolutional kernels increases, the prediction capability of the neural network decreases.
When data is too large to be processed at the same time in the RAM, which of the following gradient descent methods is
more effective?
C. Both A and B
D. Neither A nor B
To resolve an image recognition problem, such as finding out a cat in a photo, which of the following neural networks
offers the best solution?
A. Perceptron
A. x = y = z = 1
B. x = (y = z + 1)
C. x, y = y, x
D. x += y
Assume that the independent variable x is a continuous variable. To observe the relationship between the dependent
variable y and the independent variable x, which of the following graphs should be used?
A. Scatter chart
B. Histogram
C. Pie chart
a = 'a'
A. True
B. False
C. a > 'b'
D. c
def basefunc(first):
def innerfunc(second):
return innerfunc
A. base(2)(3) == 8
B. base(2)(3) == 6
C. base(3)(2) == 8
D. base(3)(2) == 6
A. (1)
B. (1,)
C. (1, 2)
Which of the following statements about the TensorFlow development framework is false?
A. TensorFlow supports various devices from small mobile phones to large computer clusters
A. 3
B. 3.0
C. 4
D. 4.0
B. A string with three single quotation marks (''') can contain special characters such as line feed and carriage return.
D. A string can be created by using single quotation marks (') or double quotation marks (").
A. int
B. float
C. 0
D. 0.5
Is it necessary to increase the size of a convolutional kernel to improve the effect of a convolutional neural network
(CNN)?
A. Yes
B. No
D. Uncertain
Deep learning can be used in which of the following natural language tasks?
A. Sentimental Analysis
B. Q&A system
C. Machine translation
A. ReLu
B. tanh
C. sigmoid
Which of the following functions cannot be used as an activation function of a neural network?
A. y = sin(x)
B. y = tanh(x)
C. y = max(0, x)
D. y = 2x
Polysemy can be defined as the coexistence of multiple meanings of a word or phrase in a text object. Which of the
following methods is the best choice to solve this problem?
B. Gradient explosion
C. Gradient disappearance
In deep learning, a large number of matrix operations are involved. Now the product ABC of three dense matrices A, B
and C needs to be calculated. Assume that sizes of the three matrices are m x n, n x p, and p x q respectively, and m < n <
p < q, then which of the following calculation sequences is the most efficient one?
A. (AB)C
B. A(BC)
C. (AC)B
D. A(CB)
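The two valid multiplication orders can be compared by counting scalar multiplications: multiplying an a x b matrix by a b x c matrix costs a * b * c. The sketch below uses hypothetical sizes that satisfy m < n < p < q; the function names are illustrative.

```python
def cost_ab_then_c(m, n, p, q):
    # (AB) costs m*n*p and yields an m x p matrix; multiplying by C costs m*p*q.
    return m * n * p + m * p * q

def cost_bc_then_a(m, n, p, q):
    # (BC) costs n*p*q and yields an n x q matrix; multiplying A by it costs m*n*q.
    return n * p * q + m * n * q

m, n, p, q = 2, 3, 4, 5  # hypothetical sizes with m < n < p < q
left_first = cost_ab_then_c(m, n, p, q)
right_first = cost_bc_then_a(m, n, p, q)
print(left_first, right_first)
```

The difference between the two costs is pq(n - m) + mn(q - p), which is positive whenever m < n and p < q, so (AB)C needs fewer multiplications under the stated ordering. Options C and D are not valid products, because the inner dimensions of AC and CB do not match.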
Assume that there are two neural networks with different output layers. There is one output node in the output layer of
network 1, whereas there are two output nodes in the output layer of network 2. For a binary classification problem,
which of the following methods do you choose?
A. Use network 1
B. Use network 2
When a pooling layer is added to a convolutional neural network (CNN), will the translation invariance be retained?
A. Uncertain
A. data?
B. ?data
C. _data
D. 9data
A. True
B. False
C. 'a' < 'b'
D. ‘ab’
data = [1, 3, 5, 7]
data.append([2, 4, 6, 8])
print(len(data))
A. 4
B. 5
C. 8
D. An error occurred
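The snippet above can be checked directly. The sketch below also contrasts append with extend, which is a common source of confusion: append adds its argument as a single element, while extend adds each item of the argument.

```python
data = [1, 3, 5, 7]
data.append([2, 4, 6, 8])   # the whole list becomes one nested element
appended_len = len(data)

other = [1, 3, 5, 7]
other.extend([2, 4, 6, 8])  # each item is added individually
extended_len = len(other)

print(appended_len, extended_len)
```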
The result of executing the following code is?
for i in range(1, 3):
    print(i)
for j in range(2):
    print(j)
A. 1 3 2
B. 1 2 0 1
C. 1 3 0 1
D. 1 3 0 2
Assume that there is a simple multi-layer perceptron (MLP) model with three neurons and the input [1,2,3], and the
weights of the neurons are 4, 5, and 6 respectively. If the activation function is linear with a constant factor of 3 (that
is, y = 3x), which of the following values is the output?
A. 32
B. 48
C. 96
D. 128
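The arithmetic behind this question can be sketched as follows. It assumes the three weighted inputs are summed into a single output with no bias term, which the question does not state explicitly.

```python
inputs = [1, 2, 3]
weights = [4, 5, 6]

# Weighted sum of the inputs (assumption: no bias term).
weighted_sum = sum(x * w for x, w in zip(inputs, weights))

# Linear activation y = 3x applied to the weighted sum.
output = 3 * weighted_sum

print(weighted_sum, output)
```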
Generally, which of the following methods is used to predict continuous dependent variables?
A. Linear regression
B. Logistic regression
About the single-underscored member _proc, the double-underscored member __proc, and __proc__ in Python, which of the
following statements are true?
A. from module import * can be directly used to import the single-underscored member _proc
B. from module import * cannot be directly used to import the double-underscored member __proc
C. In Python, the parser uses _classname__proc to replace the double-underscored member __proc
D. In Python, __proc__ is a specific indicator of magic methods
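The name mangling applied to double-underscored members can be observed in a short sketch. The Demo class and its attribute values are hypothetical; __len__ stands in for a typical magic method.

```python
class Demo:
    def __init__(self):
        self._proc = "single underscore"   # convention only: treated as internal
        self.__proc = "double underscore"  # stored under the mangled name _Demo__proc

    def __len__(self):                     # magic method, invoked by len()
        return 2

d = Demo()
single = d._proc
mangled = d._Demo__proc
has_plain_name = hasattr(d, "__proc")  # the unmangled name does not exist on the instance
size = len(d)

print(single, mangled, has_plain_name, size)
```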
B. With the increasing traffic volume, the prediction rate of the model is still acceptable
D. The user interface of the service system where the model is located is user-friendly.
Which of the following statements about the gradient boosting decision tree (GBDT) algorithm are true?
A. Increasing the minimum number of samples used for segmentation helps prevent overfitting.
B. Increasing the minimum number of samples used for segmentation may cause overfitting.
C. Reducing the sample ratio of each basic tree helps reduce the variance.
D. Reducing the sample ratio of each basic tree helps reduce the deviation.
Variable selection is used to select the best discriminator subset. What needs to be considered to ensure the efficiency of
the model?
Which of the following activation functions can be used for image classification at the output layer?
A. sigmoid
B. tanh
C. ReLu
D. Piecewise functions
Which of the following assumptions are used to derive linear regression parameters?
C. The error generally follows a normal distribution with a mean of 0 and a fixed standard deviation.
Which of the following methods can reduce the overfitting problem of a deep learning model?
Which of the following measures can be taken to prevent overfitting in the neural network?
A. Dropout
B. Data augmentation
C. Weight sharing
D. Early stopping
Which of the following statements about the convolutional neural network (CNN) are true?
A. Increasing the size of convolutional kernels can significantly improve the performance of the CNN.
C. Parameter sharing
Feature selection is necessary before model training. Which of the following statements are the advantages of feature
selection?
C. Learning rate
Which of the following statements about long short-term memory (LSTM) are true?
A. The forget phase of LSTM is to selectively forget the input transferred from the previous node.
D. The output phase of LSTM is to determine which will be considered as the output of current state.
Which of the following statements about generative adversarial network (GAN) are true?
A. The GAN contains a generative model (generator) that takes a random vector as input and decodes it as a specific
output.
B. The GAN contains an adversarial model (adversarial device) that transforms specific input and outputs adversarial
data that contradicts the input.
C. The GAN contains a discriminative model (discriminator) that can determine whether the input data is from the
training set or synthesized through data.
D. The GAN is a dynamic system. Its optimization process is not to find a minimum value, but to find a balance between
two forces.
The neural network is inspired by the human brain. A neural network consists of many neurons, and each neuron
receives an input and provides an output after processing the input. Which of the following statements about neurons
are true?
Which of the following layers are usually included in a deep neural network used for image recognition?
A. Convolutional layer
B. Pooling layer
C. Recurrent layer
A. TensorFlow 2.0 requires the construction of a computational graph at first, then you can start a session, import
data to the session, and perform training.
B. Eager execution is enabled in TensorFlow 2.0 by default. It is a form of imperative programming, which makes
execution simpler.
C. In TensorFlow 2.0, if you want to build a new layer, you can directly inherit tf.keras.layers.Layer
Which of the following statements about the application of deep learning methods are true?
A. Massive discrete data can be encoded using embedded mode as input of the neural network, which greatly
improves the effect of data analysis.
B. The convolutional neural network (CNN) is well applied in the field of image processing, but it cannot be used in
natural language processing.
C. The recurrent neural network (RNN) is mainly used to deal with sequence-to-sequence problems, but it often
encounters the problems of gradient disappearance and gradient explosion.
D. The generative adversarial network (GAN) is a method used for model generation.
A. The negative side of ReLu is a dead zone, which causes the gradient to become 0.
B. The sigmoid function is better than the ReLu function in preventing the gradient disappearance problem.
C. The long short term memory (LSTM) adds several channels and gates based on the recurrent neural network (RNN)
Which of the following statements are the functions of the pooling layer in a convolutional neural network (CNN)?
D. Preventing overfitting
During neural network training, which of the following phenomena indicate that gradient explosion problem may occur?
Which of the following statements about the recurrent neural network (RNN) are true?
A. The standard RNN solves the problem of information memory. Its advantage is that even if the number of memory
units is limited, the RNN can keep the long-term information.
B. The standard RNN can store context states and can extend on the time sequences.
C. The standard RNN captures dynamic information in serialized data by periodical connection of nodes at the hidden
layer
D. Intuitively, there is no need to connect nodes between the hidden layer at the current moment and the hidden layer
at the next moment in the RNN.
A. data[1 : -1]
B. data[1 : 7]
C. list(data)
D. data * 3
A. a[-1]
B. a[2 : 99]
C. a[ : - 1 : 2]
D. a[5 - 7]
Data cleansing is to clear dirty data in a dataset. The dirty data refers to?
Feature selection is necessary before model training. Which of the following statements are the advantages of feature
selection?
D. Combining data
Which of the following assumptions are used to derive linear regression parameters?
C. The error generally follows a normal distribution with a mean of 0 and a fixed standard deviation.
Principal component analysis (PCA) is a common and effective method for dimensionality reduction. Which of the
following statements about PCA are true?
When the parameters are the same in all cases, how does the number of sample observations affect overfitting?
B. The number of observations is small, and overfitting is not likely to occur.
D. The number of observations is large, and overfitting is not likely to occur.