Class On DDL Commands

DDL Statements

DDL refers to "Data Definition Language", a subset of SQL statements that change
the structure of the database schema in some way, typically by creating, deleting,
or modifying schema objects such as databases, tables, and views. Most Impala DDL
statements start with the keywords CREATE, DROP, or ALTER.

The Impala DDL statements are:

ALTER TABLE Statement
ALTER VIEW Statement
COMPUTE STATS Statement
CREATE DATABASE Statement
CREATE FUNCTION Statement
CREATE ROLE Statement (CDH 5.2 or higher only)
CREATE TABLE Statement
CREATE VIEW Statement
DROP DATABASE Statement
DROP FUNCTION Statement
DROP ROLE Statement (CDH 5.2 or higher only)
DROP TABLE Statement
DROP VIEW Statement
GRANT Statement (CDH 5.2 or higher only)
REVOKE Statement (CDH 5.2 or higher only)

After Impala executes a DDL command, information about available tables, columns,
views, partitions, and so on is automatically synchronized between all the Impala
nodes in a cluster. (Prior to Impala 1.2, you had to issue a REFRESH or INVALIDATE
METADATA statement manually on the other nodes to make them aware of the changes.)

If the timing of metadata updates is significant, for example if you use round-
robin scheduling where each query could be issued through a different Impala node,
you can enable the SYNC_DDL query option to make the DDL statement wait until all
nodes have been notified about the metadata changes.

Although the INSERT statement is officially classified as a DML (data manipulation
language) statement, it also involves metadata changes that must be broadcast to
all Impala nodes, and so is also affected by the SYNC_DDL query option.
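
As a minimal sketch of how the query option applies to both DDL and INSERT, the
following impala-shell session enables SYNC_DDL, runs one statement of each kind,
and then turns the option off again. The table name and columns are hypothetical
and chosen only for illustration.

```sql
-- Hypothetical table, for illustration only.
-- With SYNC_DDL enabled, each statement below returns only after
-- all Impala nodes have been notified of the metadata change.
SET SYNC_DDL=1;

CREATE TABLE IF NOT EXISTS sync_demo (id BIGINT, label STRING);

-- INSERT is DML, but it also changes metadata, so SYNC_DDL applies to it too.
INSERT INTO sync_demo VALUES (1, 'first row');

-- Restore the default so later statements do not pay the extra wait.
SET SYNC_DDL=0;
```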

Because the SYNC_DDL query option makes each DDL operation take longer than normal,
you might enable it only before the last DDL operation in a sequence. For example,
if you are running a script that issues multiple DDL operations to set up an
entire new schema, add several new partitions, and so on, you might minimize the
performance overhead by enabling the query option only before the last CREATE,
DROP, ALTER, or INSERT statement. The script finishes only when all the relevant
metadata changes are recognized by all the Impala nodes, so you could then connect
to any node and issue queries through it.
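
The pattern described above could look roughly like the following setup script.
This is a sketch under stated assumptions: the database, table, and partition
names are hypothetical, and the script is assumed to run in a single impala-shell
session so that the SET statement affects only the statements that follow it.

```sql
-- Hypothetical schema setup script; all names are illustrative.
-- The earlier statements run without SYNC_DDL, so they return quickly.
CREATE DATABASE IF NOT EXISTS sales_demo;
USE sales_demo;

CREATE TABLE IF NOT EXISTS orders (id BIGINT, amount DOUBLE)
  PARTITIONED BY (order_year INT);

ALTER TABLE orders ADD IF NOT EXISTS PARTITION (order_year=2013);

-- Enable the option only before the final DDL statement, so the wait
-- for cluster-wide notification is paid once, at the end of the script.
SET SYNC_DDL=1;
ALTER TABLE orders ADD IF NOT EXISTS PARTITION (order_year=2014);
```

When the final statement returns, the earlier changes are also expected to be
visible on every node, which is why enabling the option only once at the end is
sufficient.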

The classification of DDL, DML, and other statements is not necessarily the same
between Impala and Hive. Impala organizes these statements in a way intended to be
familiar to people who know relational databases or data warehouse products.
Statements that modify the metastore database, such as COMPUTE STATS, are
classified as DDL. Statements that only query the metastore database, such as SHOW
or DESCRIBE, are put into a separate category of utility statements.
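
To make the distinction concrete, the short sketch below pairs one statement that
Impala classifies as DDL with a few utility statements. The table name orders is
a hypothetical example and is assumed to exist already.

```sql
-- Classified as DDL: updates the statistics stored in the metastore database.
COMPUTE STATS orders;

-- Classified as utility statements: these only read metadata.
SHOW TABLES;
DESCRIBE orders;
SHOW TABLE STATS orders;
```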
