IBM Big Data Engineer C2090-101 Exam Dumps Questions Updated


May 16, 2020

The Big Data Engineer works directly with the Data Architect and hands-on developers to
convert the architect's Big Data vision and blueprint into a Big Data reality. Good news for all
IBM Big Data Engineer exam candidates: we have updated the IBM C2090-101 exam dumps
questions so that candidates can pass the IBM Big Data Engineer certification exam smoothly.
We also offer free C2090-101 dumps questions and answers online, which you can use to test
yourself.

Big Data C2090-101 Free Exam Dumps Questions Online


1. Which statement is TRUE concerning optimizing the load performance?

You can improve the performance by increasing the number of map tasks assigned to the
load
When loading large files the number of files that you load does not impact the performance
of the LOAD HADOOP statement
You can improve the performance by decreasing the number of map tasks that are assigned
to the load and adjusting the heap size
It is advantageous to run the LOAD HADOOP statement directly pointing to large files
located in the host file system as opposed to copying the files to the DFS prior to load

2. Which of the following statements are TRUE regarding the use of Data Click to load data into
BigInsights? (Choose two.)

Big SQL cannot be used to access the data moved in by Data Click because the data is in
Hive
You must import metadata for all sources and targets that you want to make available for
Data Click activities
Connections from the relational database source to HDFS are discovered automatically from
within Data Click
Hive tables are automatically created every time you run an activity that moves data from a
relational database into HDFS
HBase tables are automatically created every time you run an activity that moves data from a
relational database into HDFS

3. Which of the following statements regarding importing streaming data from InfoSphere
Streams into Hadoop is TRUE?
InfoSphere Streams can both read from and write data to HDFS
The Streams Big Data toolkit operators that interface with HDFS use Apache Flume to
integrate with Hadoop
Streams applications never need to be concerned with making the data schemas consistent
with those on Hadoop
Big SQL can be used to preprocess the data as it flows through InfoSphere Streams before
the data lands in HDFS

4. Which of the following is TRUE about storing an Apache Spark object in serialized form?

It is advised to use Java serialization over Kryo serialization
Storing the object in serialized form will lead to faster access times
Storing the object in serialized form will lead to slower access times
All of the above

5. Which ONE of the following statements regarding Sqoop is TRUE?

HBase is not supported as an import target
Data imported using Sqoop is always written to a single Hive partition
Sqoop can be used to retrieve rows newer than some previously imported set of rows
Sqoop can only append new rows to a database table when exporting back to a database

6. Which one of the following statements is TRUE?

Spark SQL does not support HiveQL
Spark SQL does not support ANSI SQL
To use Spark with Hive, HiveQL queries have to be rewritten in Scala
Spark SQL allows relational queries expressed in SQL, HiveQL, or Scala

7. Which of the following statements regarding Big SQL is TRUE?

Big SQL doesn’t support stored procedures
Big SQL can be deployed on a subset of data nodes in the BigInsights cluster
Big SQL provides a SQL-on-Hadoop environment based on MapReduce
Only tables created or loaded via Big SQL can be accessed via Big SQL

8. The number of partitions created by dynamic partitioning in Hive can be controlled by which
of the following?

hive.exec.max.dynamic.partitions.pernode
hive.exec.max.dynamic.partitions
hive.exec.max.created.files
All of the above
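All three properties can be set per session in Hive; the values below are the commonly documented defaults, but check your distribution before relying on them:

```sql
-- Enable dynamic partitioning, then cap how many partitions/files a job may create
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.exec.max.dynamic.partitions.pernode = 100;   -- limit per mapper/reducer node
SET hive.exec.max.dynamic.partitions = 1000;          -- limit across the whole statement
SET hive.exec.max.created.files = 100000;             -- total files the job may create
```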

9. Which of the following Jaql operators groups one or more arrays based on key values and
applies an aggregate expression?

join
group
expand
transform

10. Which of the following are CRUD operations available in HBase? (Choose two.)

HTable.Put
HTable.Read
HTable.Delete
HTable.Update
HTable.Remove

11. Which statement is TRUE about Big SQL?

The table definition can include other attributes such as the primary key or check constraints
When using Big SQL, the CREATE TABLE statement cannot be embedded in an application
program
If a sub-table is being defined, the authorization ID can be either the same as the owner of the
root table or an equivalent
When defining a staging table associated with a materialized query table, the privileges held
by the authorization ID of the statement only work with DBADM authority

12. Which of the following statements is TRUE regarding search visualization with Apache
Hue?

Hue submits MapReduce jobs to Oozie
No additional setup is required to secure your session cookies
Hue applications require some code to be installed on the client
The File Browser application allows you to perform keyword searches across your Hadoop
data

13. A Resilient Distributed Dataset supports which of the following?

Creating a new dataset from an old one
Returning a computed value to the driver program
Both “Creating a new dataset from an old one” and “Returning a computed value to the
driver program”
Neither “Creating a new dataset from an old one” nor “Returning a computed value to the
driver program”

14. In order for an SPSS Modeler stream to be incorporated for use in an InfoSphere Streams
application leveraging SPSS Modeler Solution Publisher, you need to:

add a Type node
insert any Output node
add a Table node as the terminal node
make the terminal node a scoring branch

15. Which of the following Hive data types is directly supported in Big SQL without any
changes?

INT
STRING
STRUCT
BOOLEAN

16. Which parameters are considered when configuring Big Match algorithm?

Search and custom requirements
Accuracy, search, and performance
Adaptive weighting and standardization
Empirical components, accuracy, and performance

17. The GPFS implementation of Data Management API is compliant to which Open Group
storage management Standard?

XSH
XBD
XDSM
X/Open

18. Which file formats support column data compression? (Choose two.)
Text
Avro
RCFile
Parquet
Sequence_text

19. Which statement about the Jaql Programming Language is TRUE?

Jaql always produces a MapReduce job, but Combiner functionality is optional
Jaql includes the following operators: filter, extend, groupby, combine, and transform
Data that is read from multiple blocks (splits) is always processed in parallel by MapReduce
The read operator loads data from different sources and formats, and then converts this data
into JSON format for internal processing by the Jaql interpreter

20. When we create a new table in Hive, which clause can be used in HiveQL to indicate the
storage file format?

SAVE AS
MAKE AS
FORMAT AS
STORED AS
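For reference, the STORED AS clause goes at the end of the table definition; a hypothetical example (table and column names are illustrative):

```sql
-- PARQUET, ORC, and TEXTFILE are common STORED AS formats
CREATE TABLE page_views (
  user_id STRING,
  url     STRING,
  hits    INT
)
PARTITIONED BY (view_date STRING)
STORED AS PARQUET;
```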

21. Which of the following is NOT a capability of Pig?

Low-latency queries
Schemas are optional
Nested relational data model
A high level abstraction on top of MapReduce

22. Given a file named readme.txt, which command will copy the readme.txt file to the <user>
directory on the HDFS?

hadoop fs -cp readme.txt hdfs://test.ibm.com:9000/<user>
hadoop fs -cp hdfs://test.ibm.com:9000/<user> readme.txt
hadoop fs -put readme.txt hdfs://test.ibm.com:9000/<user>
hadoop fs -put hdfs://test.ibm.com:9000/<user> readme.txt
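As a usage sketch (the namenode URI and user directory are illustrative): `-put` copies from the local file system into HDFS, while `-cp` copies between paths that are already in HDFS.

```shell
# Copy a local file into a user's HDFS home directory
hadoop fs -put readme.txt hdfs://test.ibm.com:9000/user/alice/
# Confirm the file arrived
hadoop fs -ls hdfs://test.ibm.com:9000/user/alice/
```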

23. Which of the following is the most effective method for improving query performance on
large Hive tables?
Indexing
Bucketing
Partitioning
De-normalizing data

24. Which one of the following is NOT provided by the SerDe interface?

SerDe interface has to be built using C or C++ language
Allows SQL-style queries across data that is often not appropriate for a relational database
Serializer takes a Java object that Big SQL has been working with, and turns it into a format
that Big SQL can write to HDFS
Deserializer interface takes a string or binary representation of a record, and translates it into
a Java object that Big SQL can manipulate

25. Which of the following are capabilities of the Apache Spark project?

Large scale machine learning
Large scale graph processing
Live data stream processing
All of the above

26. Which of the following Big SQL statements is valid?

CREATE TABLE t1 WITH CS;
WITH t1 AS (…) (SELECT * FROM t1 WITH RR USE AND KEEP SHARE LOCKS)
UNION ALL (SELECT * FROM t1 WITH UR);
SELECT deptno, deptname, mgrno FROM t1 WHERE admrdept = 'A00' FOR READ ONLY
WITH RS USE AND KEEP EXCLUSIVE LOCKS
ALTER TABLE t1 ALTER COLUMN deptname SET DATA TYPE VARCHAR(100) USE
AND KEEP UPDATE LOCKS

27. Which of the following techniques is NOT employed by Big SQL to improve performance?

Query Optimization
Predicate Push down
Compression efficiency
Load data into DB2 and return the data

28. When embedding SPSS models within InfoSphere Streams, what SPSS product must be
installed on the same machine with InfoSphere Streams?
SPSS Modeler
SPSS Solution Publisher
SPSS Accelerator for InfoSphere Streams
None, the SPSS software runs remotely to the Streams machine

29. Which of the following statements regarding Sqoop is TRUE? (Choose two.)

All columns in a table must be imported
Sqoop bypasses MapReduce for enhanced performance
Each row from a source table is represented as a separate record in HDFS
When using a password file, the file containing the password must reside in HDFS
Multiple options files can be specified when invoking Sqoop from the command line

30. Use of Bulk Load in HBase for loading large volumes of data will result in which of the
following?

It will use less CPU but more network resources
It will use less network resources but more CPU
It will behave the same way as using the HBase API for loading large volumes of data
None of the above
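Bulk load sidesteps HBase's regular write path (WAL and MemStore) by preparing HFiles with a MapReduce job and then handing them to the region servers. A sketch using the stock utilities; the table name and paths are hypothetical, and the exact class names vary slightly between HBase versions:

```shell
# Step 1: run ImportTsv in bulk-output mode so it writes HFiles instead of issuing Puts
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:value \
  -Dimporttsv.bulk.output=/tmp/hfiles \
  mytable /data/input.tsv

# Step 2: move the completed HFiles into the table's regions
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles /tmp/hfiles mytable
```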
