IBM Big Data Engineer C2090-101 Exam Dumps Questions Updated
You can improve the performance by increasing the number of map tasks assigned to the
load
When loading large files the number of files that you load does not impact the performance
of the LOAD HADOOP statement
You can improve the performance by decreasing the number of map tasks that are assigned
to the load and adjusting the heap size
It is advantageous to run the LOAD HADOOP statement directly pointing to large files
located in the host file system as opposed to copying the files to the DFS prior to load
2. Which of the following statements are TRUE regarding the use of Data Click to load data into
BigInsights? (Choose two.)
Big SQL cannot be used to access the data moved in by Data Click because the data is in
Hive
You must import metadata for all sources and targets that you want to make available for
Data Click activities
Connections from the relational database source to HDFS are discovered automatically from
within Data Click
Hive tables are automatically created every time you run an activity that moves data from a
relational database into HDFS
HBase tables are automatically created every time you run an activity that moves data from a
relational database into HDFS
3. Which of the following statements regarding importing streaming data from InfoSphere
Streams into Hadoop is TRUE?
InfoSphere Streams can both read from and write data to HDFS
The Streams Big Data toolkit operators that interface with HDFS use Apache Flume to
integrate with Hadoop
Streams applications never need to be concerned with making the data schemas consistent
with those on Hadoop
Big SQL can be used to preprocess the data as it flows through InfoSphere Streams before
the data lands in HDFS
4. Which of the following is TRUE about storing an Apache Spark object in serialized form?
hive.exec.max.dynamic.partitions.pernode
hive.exec.max.dynamic.partitions
hive.exec.max.created.files
All of the above
9. Which of the following Jaql operators groups one or more arrays based on key values and
applies an aggregate expression?
join
group
expand
transform
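To make the operation described in question 9 concrete, the following is a minimal Python sketch, not Jaql syntax, of grouping records by a key value and applying an aggregate expression to each group. The function name `group_and_aggregate` and the sample records are hypothetical, and the sketch handles a single input array only.

```python
from collections import defaultdict


def group_and_aggregate(records, key, aggregate):
    """Group dict records by record[key], then apply `aggregate` to each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    # One output entry per distinct key value.
    return {k: aggregate(rows) for k, rows in groups.items()}


# Hypothetical sample data.
sales = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
]

totals = group_and_aggregate(
    sales, "region", lambda rows: sum(r["amount"] for r in rows)
)
print(totals)
```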
10. Which of the following are CRUD operations available in HBase? (Choose two.)
HTable.Put
HTable.Read
HTable.Delete
HTable.Update
HTable.Remove
The table definition can include other attributes such as the primary key or check constraints
When using Big SQL, the CREATE TABLE statement cannot be embedded in an application
program
If a sub-table is being defined, the authorization ID can be either the same as the owner of the
root table or an equivalent
When defining a staging table associated with a materialized query table, the privileges held
by the authorization ID of the statement only work with DBADM authority
12. Which of the following statements is TRUE regarding search visualization with Apache
Hue?
14. In order for an SPSS Modeler stream to be incorporated for use in an InfoSphere Streams
application leveraging SPSS Modeler Solution Publisher, you need to:
15. Which of the following Hive data types is directly supported in Big SQL without any
changes?
INT
STRING
STRUCT
BOOLEAN
16. Which parameters are considered when configuring the Big Match algorithm?
17. The GPFS implementation of Data Management API is compliant with which Open Group
storage management standard?
XSH
XBD
XDSM
X/Open
18. Which file formats support column data compression? (Choose two.)
Text
Avro
RCFile
Parquet
Sequence_text
20. When we create a new table in Hive, which clause can be used in HiveQL to indicate the
storage file format?
SAVE AS
MAKE AS
FORMAT AS
STORED AS
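As an illustration of where a storage-format clause sits in a Hive table definition, the sketch below builds a HiveQL `CREATE TABLE` statement as a string. The helper `create_table_ddl`, the table name `page_views`, and its columns are hypothetical. The sketch only assembles text and does not connect to Hive.

```python
def create_table_ddl(table, columns, file_format):
    """Build a HiveQL CREATE TABLE statement with a STORED AS clause.

    `columns` is a list of (name, hive_type) pairs.
    """
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return f"CREATE TABLE {table} ({cols}) STORED AS {file_format}"


# Hypothetical table and columns.
ddl = create_table_ddl(
    "page_views", [("user_id", "INT"), ("url", "STRING")], "PARQUET"
)
print(ddl)
```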
Low-latency queries
Schemas are optional
Nested relational data model
A high level abstraction on top of MapReduce
22. Given a file named readme.txt, which command will copy the readme.txt file to the <user>
directory on the HDFS?
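The answer choices for question 22 are not reproduced here. As a hedged illustration, one standard way to copy a local file into HDFS is the Hadoop file system shell's `-put` command. The sketch below only builds the argument list. The helper name `hdfs_put_command` and the destination `/user/alice` (standing in for the `<user>` directory) are assumptions.

```python
def hdfs_put_command(local_path, hdfs_dir):
    """Build the argument list for `hadoop fs -put <local_path> <hdfs_dir>`."""
    return ["hadoop", "fs", "-put", local_path, hdfs_dir]


# "/user/alice" is a hypothetical stand-in for the <user> directory.
cmd = hdfs_put_command("readme.txt", "/user/alice")
print(" ".join(cmd))
```

On a machine with a configured Hadoop client, the list could be passed to `subprocess.run(cmd, check=True)` to perform the copy.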
23. Which of the following is the most effective method for improving query performance on
large Hive tables?
Indexing
Bucketing
Partitioning
De-normalizing data
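To show what the partitioning option in question 23 refers to, here is a toy Python sketch of partition pruning: rows are stored per partition value, and a query that filters on the partition key reads only the matching partition instead of the whole table. The data layout, the `scan` function, and the dates are hypothetical and do not reflect Hive's internal implementation.

```python
# Rows stored per partition value, mimicking one directory per partition.
partitions = {
    "2015-01-01": [{"user": "a", "clicks": 3}],
    "2015-01-02": [{"user": "b", "clicks": 1}, {"user": "c", "clicks": 4}],
    "2015-01-03": [{"user": "d", "clicks": 2}],
}


def scan(partitions, wanted_date=None):
    """Return (rows, number_of_partitions_read).

    With no filter every partition is read. With a filter on the
    partition key only the matching partition is read.
    """
    if wanted_date is None:
        selected = list(partitions.values())
    else:
        selected = [partitions[wanted_date]] if wanted_date in partitions else []
    rows = [row for part in selected for row in part]
    return rows, len(selected)


full_rows, full_parts = scan(partitions)
pruned_rows, pruned_parts = scan(partitions, "2015-01-02")
print(full_parts, pruned_parts)
```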
24. Which one of the following is NOT provided by the SerDe interface?
25. Which of the following are capabilities of the Apache Spark project?
27. Which of the following techniques is NOT employed by Big SQL to improve performance?
Query Optimization
Predicate Push down
Compression efficiency
Load data into DB2 and return the data
28. When embedding SPSS models within InfoSphere Streams, what SPSS product must be
installed on the same machine with InfoSphere Streams?
SPSS Modeler
SPSS Solution Publisher
SPSS Accelerator for InfoSphere Streams
None, the SPSS software runs remotely to the Streams machine
29. Which of the following statements regarding Sqoop is TRUE? (Choose two.)
30. Use of Bulk Load in HBase for loading a large volume of data will result in which of the
following?
It will use less CPU but will use more network resource
It will use less network resource but more CPU
It will behave the same way as using the HBase API for loading a large volume of data
None of the above