EMC Isilon OneFS with Hadoop and Hortonworks Installation Guide
Abstract
This guide walks you through the process of installing Isilon OneFS with
Hadoop for use with the Hortonworks Data Platform (HDP) and Apache
Ambari manager.
Copyright © 2017 Dell Inc. or its subsidiaries. All rights reserved.
April 2017
Dell believes the information in this publication is accurate as of its publication date. The information
is subject to change without notice.
Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may
be the property of their respective owners.
Published in the USA.
EMC Corporation
Hopkinton, Massachusetts 01748-9103
1-508-435-1000 In North America 1-866-464-7381
www.EMC.com
Audience
This guide is intended for systems administrators, IT program managers, IT architects, and IT managers who
will be installing OneFS with Hadoop.
Overview
The EMC Isilon OneFS scale-out network-attached storage (NAS) platform provides Hadoop clients with direct
access to Big Data through a Hadoop Distributed File System (HDFS) interface. An Isilon OneFS cluster
powered by the OneFS operating system delivers a scalable pool of storage with a global namespace.
Hadoop compute clients can access the data that is stored in an Isilon OneFS cluster by connecting to any
node over the HDFS protocol. All nodes configured for HDFS provide NameNode and DataNode functionality.
Each node boosts performance and expands the cluster's capacity. For Hadoop analytics, the Isilon scale-out
distributed architecture minimizes bottlenecks, rapidly serves Big Data, and optimizes performance for
MapReduce jobs.
In a traditional Hadoop deployment, the Hadoop compute nodes run analytics jobs against large sets of data.
A NameNode directs the nodes to the data stored on a series of DataNodes. The NameNode is a separate
server that holds metadata for every file that is stored on the DataNodes. Often data is stored in production
environments and then copied to a landing zone server to be loaded on to HDFS. This process is network
intensive and exposes the NameNode as a potential single point of failure.
The Hortonworks distribution is stored on the compute cluster, and the clients connect to the Isilon OneFS
cluster over the HDFS protocol to store and access Hadoop data.
Prerequisites
For supported versions, see the Hadoop Distributions and Products Supported by OneFS compatibility matrix.
An Isilon OneFS cluster running OneFS 8.0.0.1 or greater. OneFS 8.0.0.1 contains a number of
updates to facilitate the integration and deployment of Hadoop. We strongly recommend that you
use OneFS 8.0.0.1 or greater.
The Isilon OneFS cluster has access to your network and your network has access to the Internet.
Internet access is required to download components.
SmartConnect Advanced, a separately licensed Isilon module, is activated and SmartConnect is
configured on your Isilon OneFS cluster.
HDFS, a separately licensed Isilon module, is activated on your Isilon OneFS cluster. Contact your
EMC Isilon sales representative for more information about receiving your license keys.
A valid Isilon SmartConnect SSIP and Domain Name System (DNS) delegation is in place to
provide name resolution services for a SmartConnect zone. For more information, see Isilon
External Network Connectivity Guide—Routing, Network Topologies, and Best Practices for
SmartConnect.
A dedicated Isilon Access Zone is in use. This is not the same as the System Zone.
An Isilon HDFS root directory in the Access Zone.
A simple access model between Hadoop and Isilon: UID and GID, with parity.
Use the following table to record the components that you have installed:
Component Version/License
Isilon OneFS
SmartConnect module
HDFS module
1 Preparing OneFS
2 Preparing Ambari
3 Configuring Ambari
Preparing OneFS
Complete the following steps to configure your Isilon OneFS cluster for use with Ambari and Hortonworks Data
Platform. Preparing OneFS requires you to configure DNS, SmartConnect, and Access Zones to allow for the
Hadoop cluster to connect to the Isilon OneFS cluster. If these preparation steps are not successful, the
subsequent configuration steps might fail.
1. From a node in your Isilon OneFS cluster, confirm that the cluster is running OneFS 8.0.0.x by typing the
following command:
isi version
Note: If you are using a version of OneFS earlier than 8.0.0.0, the implementation steps in this guide will
not work.
2. Confirm that licenses for HDFS and SmartConnect Advanced are operational. If these licenses are not
active and valid, some commands in this guide might not work.
Run the following commands to confirm that HDFS and SmartConnect Advanced are installed:
isi license licenses list
isi license licenses view HDFS
isi license licenses view "SmartConnect Advanced"
3. If your modules are not licensed, obtain a license key from your EMC Isilon sales representative. Type the
following command to activate the license:
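The exact activation syntax varies by OneFS release, so the command below is an illustrative sketch only; confirm the subcommand with isi license --help before running it:
isi license activate <license-key>    # illustrative syntax; verify on your OneFS release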
Use the following table to record the configuration information for the integration of Isilon OneFS cluster with
HDP Ambari:
Parameter Value
Ambari server
Ambari NameNode
1. On a node in the Isilon OneFS cluster, create your Hadoop access zone.
isi zone zones create --name=zone1 --path=/ifs/zone1/hdp --create-path
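2. View the access zone that you created and note its numeric zone ID. You will need this ID later when you run zone-aware commands such as isi_run.
isi zone zones view zone1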
3. Create the HDFS root directory within the access zone that you created.
mkdir -p /ifs/zone1/hdp/hadoop
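The prerequisites call for an HDFS root directory in the access zone. If it is not already set, the following sketch points the zone at the directory that you just created (the --root-directory flag is an assumption here; verify it with isi hdfs settings modify --help on your release):
isi hdfs settings modify --zone=zone1 --root-directory=/ifs/zone1/hdp/hadoop    # flag assumed; confirm on your release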
Configure SmartConnect
On a node in the Isilon OneFS cluster, add a dynamic IP address pool and associate it with the access zone
you created earlier.
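The exact command depends on your network layout; the following is a minimal sketch with placeholder subnet, interface, IP range, and DNS zone values:
isi network pools create groupnet0.subnet0.pool1 --ranges=10.1.1.10-10.1.1.20 --alloc-method=dynamic --access-zone=zone1 --sc-dns-zone=hdp.example.com --ifaces=1-4:10gige-1    # all names and values are placeholders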
You can set up a SmartConnect zone for the connections from Hadoop compute clients. SmartConnect is a
module that specifies how the Isilon OneFS cluster handles connection requests from clients. For additional
information and best practices for SmartConnect, see the EMC Isilon External Network Connectivity Guide.
Each SmartConnect zone represents a specific pool of IP addresses. When you associate a SmartConnect zone
with an access zone, OneFS allows only clients that connect through the IP addresses in the SmartConnect
zone to reach the HDFS data in the access zone. A root HDFS directory is specified for each access zone. This
configuration isolates data within access zones and allows you to restrict client access to the data.
A SmartConnect zone distributes NameNode requests from Hadoop compute clients across the node
interfaces in the IP pool. Each node's NameNode process will reply with the IP address of any OneFS node
where the client can access the data. When a Hadoop compute client makes an initial DNS request to connect
to the SmartConnect zone FQDN, the Hadoop client requests are delegated to the SmartConnect Service IP,
which responds with a valid node to connect to. The client will then connect to an Isilon node that serves as a
NameNode. When a second Hadoop client makes a DNS request to connect to the SmartConnect zone, the
SmartConnect Service routes the client connection to a different node than the one used by the previous
Hadoop compute client.
When you create a SmartConnect zone, you must add a Name Server (NS) record as a delegated domain to the
authoritative DNS zone that contains the Isilon OneFS cluster.
For additional information and best practices, see the "DNS delegation best practices" section of the EMC
Isilon External Network Connectivity Guide.
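To validate the delegation, ping the SmartConnect zone FQDN several times from a client (the zone name below is a placeholder):
ping -c 1 hdp.example.com
ping -c 1 hdp.example.com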
When you view the output of this command, note that a different IP address is returned for each ping, because each DNS response returns the next address from the pool of potential IP addresses in round-robin fashion. This validates that the SmartConnect zone name FQDN is operating correctly.
4. Assign the Ambari NameNode in the access zone and associate the SmartConnect name with it.
isi hdfs settings modify --zone=zone1 --ambari-namenode=<my-smartconnect-zone-name>
6. Create an indicator file in the Hadoop directory to view your Isilon OneFS cluster and access zone through
HDFS.
touch /ifs/zone1/hdp/hadoop/THIS_IS_ISILON-hdp.txt
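Once a Hadoop client is configured later in this guide, listing the HDFS root from that client should show this indicator file, confirming that HDFS requests are reaching OneFS:
hadoop fs -ls /    # the indicator file THIS_IS_ISILON-hdp.txt should appear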
Note: ACL policies are cluster-wide, so you should understand this change before performing it on production
clusters.
isi auth settings acls modify --group-owner-inheritance=parent
isi auth settings acls view    # validate that the change took effect
Important
EMC Isilon recommends that you maintain consistent names and numeric IDs for all users and groups on the
Isilon OneFS cluster and your Hadoop clients. This consistency is important in multi-protocol environments
because the HDFS protocol refers to users and groups by name, and NFS refers to users and groups by their
numeric IDs (UIDs and GIDs). Maintaining this parity is critical in the behavior of OneFS multiprotocol file
access.
When installing Hadoop, the installer creates all of the required system accounts. For example, a Hadoop
system account, yarn, is created with the UID of 502 and the GID of 500 on the Hadoop cluster nodes. Since
the Hadoop installer cannot create the local accounts directly on OneFS, they need to be created manually.
You must create the Isilon yarn local account user in the Isilon access zone in which yarn will access data. You
need to create a local user yarn with the UID of 502 and the GID of 500 to ensure consistency of access and
permissions.
https://community.emc.com/community/products/isilon/blog/2016/06/22/isilon-and-hadoop-user-uid-parity
There are many methods of achieving UID and GID parity. You can leverage the create user and directory
scripts, perform manual matching, or create scripts that parse users and create the equivalent users. However
you choose to achieve this, the sequence will depend on your deployment methodology and management
practices. EMC highly recommends that you achieve consistency between the Hadoop cluster and OneFS,
for example, hdfs=hdfs, yarn=yarn, hbase=hbase, and so on from a UID and GID consistency perspective.
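For example, to spot-check parity for the yarn account (account and zone names follow the examples in this guide):
id yarn    # on a Hadoop client; note the uid and gid values
isi auth users view yarn --zone=zone1    # on OneFS; the UID and GID should match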
Create users and directories on the Isilon OneFS cluster using Isilon scripts
This methodology achieves parity by executing user creation in the following sequence:
If the local Hadoop system users and groups already exist on the Linux host, then the Ambari wizard does not
create them. If you created them with the same UIDs and GIDs as on OneFS, you will maintain parity.
You must add a user on the Isilon OneFS cluster for each user that runs Hadoop services and user that submits
jobs, and you must create any additional users that may access the Isilon OneFS cluster.
In the following steps, you can run an Isilon script to create local user and group accounts on the Isilon OneFS
cluster. The Isilon script adds a list of default Hadoop users to the Isilon OneFS cluster that are mapped to the
services and applications on the Hadoop cluster.
Note: If the users and groups must be defined by your directory services, such as Active Directory or LDAP, you
must create the users and groups manually as outlined later in Create users on the Isilon OneFS cluster
manually.
The isilon_create_users.sh script creates users that can be listed by running the following command:
cat zone1-hdp.passwd
For more information on the list of accounts you need based on services, go to
http://docs.hortonworks.com/index.html
1. On a node in the Isilon OneFS cluster, create a scripts directory. You will extract the scripts to this
directory.
mkdir -p /ifs/zone1/hdp/scripts
2. Change directories to the scripts directory that you created and change the execute permissions.
cd /ifs/zone1/hdp/scripts
3. Clone or download the latest version of the Isilon Hadoop tools as a ZIP file from
https://github.com/Isilon/isilon_hadoop_tools into the /ifs/zone1/hdp/scripts directory.
Alternatively, run the following commands if the Isilon OneFS cluster has direct Internet access:
wget https://raw.githubusercontent.com/Isilon/isilon_hadoop_tools/master/isilon_create_users.sh --no-check-certificate
wget https://raw.githubusercontent.com/Isilon/isilon_hadoop_tools/master/isilon_create_directories.sh --no-check-certificate
cd /ifs/zone1/hdp/scripts
chmod u+x *
6. To create all required local users and groups on your Isilon OneFS cluster for the Hadoop services and
applications, run the following script:
Script usage:
bash isilon_create_users.sh --dist=hwx --startgid=<GID> --startuid=<UID> --zone=<ZONE>
Where:
dist This will correspond to your Hadoop distribution - cdh, hwx, phd, phd3, or
bi.
startgid The beginning GID range for the creation of groups (the default is 1000).
startuid The beginning UID range for the creation of users (the default is 1000).
zone The name of the access zone where the users should be created (useful for
multi-tenant environments that will use a single KDC). The System zone is
the default.
Example:
bash isilon_create_users.sh --dist=hwx --startgid=501 --startuid=501 --zone=zone1
7. To create directories to map to the Hadoop users with appropriate ownership and permissions, download
the isilon_create_directories.sh from https://github.com/Isilon/isilon_hadoop_tools and run the
following script:
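Script usage:
bash isilon_create_directories.sh --dist=<DIST> --fixperm --zone=<ZONE>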
Where:
dist The distribution of Hadoop for which the script should create an HDFS
directory skeleton.
zone The name of the access zone where the directory skeleton should be
created.
fixperm The argument to alter the permissions and ownership of the directory skeleton
according to the users created by the isilon_create_users.sh script.
Example:
bash isilon_create_directories.sh --dist=hwx --fixperm --zone=zone1
9. On the Isilon OneFS cluster, collect the UIDs and GIDs of the local users and groups that were created.
10. List out and copy the contents of the zone1-hdp.passwd and zone1-hdp.group files that are created
when the isilon_create_users.sh script is executed.
cat zone1-hdp.passwd
cat zone1-hdp.group
1. Edit the /etc/passwd file on the Hadoop client and paste the contents of the zone1-hdp.passwd file
from Isilon at the end of the existing passwd file.
2. Edit the /etc/group file on the Hadoop client and paste the contents of the zone1-hdp.group file
from Isilon at the end of the existing group file.
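For example, after copying the two files from the Isilon OneFS cluster to the Hadoop client, you can append them from a shell (file locations are illustrative):
cat zone1-hdp.passwd >> /etc/passwd
cat zone1-hdp.group >> /etc/group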
All the relevant users and groups now exist on Isilon and on all the Hadoop clients with UID/GID parity.
Because the local accounts already exist on the Hadoop clients, Ambari will not re-create them; it will
leverage the existing accounts, maintaining ID parity with Isilon. Other methods of achieving this exist
but are beyond the scope of this document.
Warning
If you want the users and groups to be defined by your directory service, such as Active Directory or LDAP, do
not run these commands. This section addresses setting permissions on the HDFS root files and membership to
run jobs. These steps create local users, and running jobs will likely fail with this configuration.
2. Create the user and the user's Hadoop home directories on the Isilon OneFS cluster.
isi auth users create hduser1 --primary-group hduser1 --zone=zone1 --provider local
--home-directory /ifs/zone1/hdp/user/hduser1 --uid <UID>
3. Assign permissions to the user's home directory on the Hadoop cluster. The ID 2 in the example below is
from when you previously ran the isi zone zones view zone1 command.
isi_run -z2 chown hduser1:hduser1 /ifs/zone1/hdp/user/hduser1
chmod 755 /ifs/zone1/hdp/user/hduser1
Since you created a new user on OneFS to run jobs, you must create the same user with UID parity on any
Linux hosts that the user will access to run jobs.
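For example, on each Linux host, a minimal sketch (the UID and GID values must match those used on OneFS):
groupadd --gid <GID> hduser1
useradd --uid <UID> --gid <GID> hduser1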
Preparing Ambari
The steps in this phase occur on the Ambari hosts, which will become your Hadoop servers and clients.
Hadoop clusters and services rely heavily on DNS. All client hosts in your system must be configured for both
forward and reverse DNS lookups. Validate that all hosts can resolve each other's hostnames and IP
addresses.
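For example, from each host, check both lookup directions (the hostname and address below are placeholders):
nslookup hadoop-node1.example.com    # forward lookup
nslookup 10.1.1.10                   # reverse lookup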
Before you begin the installation of Ambari, ensure that all your hosts meet the requirements needed by
Ambari and Hortonworks to complete a successful Hadoop cluster installation. For more information and these
installation guides, go to the Hortonworks website: http://docs.hortonworks.com/index.html
Steps to perform on the Hadoop client
1. To prepare Ambari for implementation, follow the instructions for your version of Ambari in the
Hortonworks guide. In the Automated Install with Ambari guide (PDF), see the Install & Upgrade section.
The Automated Install with Ambari guide provides the steps that you must perform to prepare the Ambari
environment, install Ambari, and install and configure HDP.
You must complete the following steps that are described in the Hortonworks Guide:
1. Download the Ambari repository for the operating system that runs your installation host.
4. After you complete sections 1 and 2 of the Hortonworks Guide, Ambari should be operational, with the
Ambari server running on one of the Hadoop nodes.
5. Logging in to the Ambari server will bring you to the create cluster page.
6. Do not continue to section 3 of the Hortonworks Guide until the Isilon OneFS cluster is prepared in the
following steps and is ready to integrate into Ambari during the installation.
1. Specify the Hadoop client and the Isilon SmartConnect zone as target hosts.
1. After starting the Ambari service, open Ambari Web using a web browser.
5. Select Launch Install Wizard, and complete the installation steps in the Hortonworks guide, including
the following steps that are specific to Isilon.
7. Click Use Public Repository or Use Local Repository based on the availability of Internet access,
and in the next section, remove all the operating systems except redhat6.
10. Under Host Registration Information attach the SSH private key.
The creation of this host and key was performed prior to running the Ambari wizard.
See the Ambari prerequisites and Hortonworks Ambari documentation for additional details.
Important
Do not include the Isilon OneFS cluster in the target hosts.
To obtain the private key:
a. Log in to your Ambari host server.
f. Copy the output to a file and save the file on your desktop.
g. Copy the file to the machine on which you are running the web-based Ambari Install Wizard.
23. On the Assign Slaves and Clients screen, ensure that DataNode is selected for the Isilon OneFS
cluster, and cleared for all the other hosts.
24. Scroll to the right, and ensure that the Client option is selected for all the other desired hosts and
cleared for the Isilon OneFS cluster. Clear any other components that are assigned to the Isilon OneFS
cluster.
26. On the Customize Services screen, set the passwords for the service configurations that display a red
alert with the properties that require your attention.
If you see red alerts, select each service tab that has an alert and enter the relevant information.
a. Change dfs.namenode.http-address to the FQDN of the SmartConnect zone name, and change the
port from 50070 to 8082.
b. Change dfs.namenode.https-address to the FQDN of the SmartConnect zone name, and change the
port from 50470 to 8080.
e. Change the dfs.datanode.http.address port from 50075 to 8082. This setting prevents an error
that generates a traceback in ambari-server.log each time you log in to the Ambari server.
28. On the Customize Services screen, for the Yarn service, on the Advanced settings tab, set
yarn.scheduler.capacity.node-locality-delay to 0
For more information, see "Ambari Customizing HDP Services" in the Hortonworks guide.
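For reference, the Isilon-specific values from steps 26 and 28 take the following form (the SmartConnect zone FQDN is a placeholder):
dfs.namenode.http-address = hdp.example.com:8082
dfs.namenode.https-address = hdp.example.com:8080
dfs.datanode.http.address = <unchanged host>:8082
yarn.scheduler.capacity.node-locality-delay = 0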
You might see some warnings depending on your configuration settings. You can ignore some of the
warning messages related to HDFS.
31. On the Install, Start and Test screen, wait for the test results. The tests should result in success for
both the Isilon and compute nodes.
A progress indicator appears.
Ambari starts automatically. All services on the left pane should display as a green checkmark.
Note: You can also perform the above action through the OneFS web administration interface.
On the Versions tab, you will observe that the hosts count under Current increases by one.
Also, a previous occurrence of 1 under Not Installed now changes to 0.
You are now configured to use an Isilon OneFS cluster with Hortonworks for Apache Hadoop.
3. To test MapReduce, on the Ambari dashboard click the Services tab and then click MapReduce2 > Service
Actions > Run Service Check. A progress indicator shows the status of the service check.
This completes the high-level overview of deploying Ambari with OneFS.
With releases of OneFS earlier than 8.0.1.0, you can deploy Hortonworks with Isilon as an HDFS layer and use
Ranger to create authorization policies for all the supported Hadoop components, such as Hive. However,
HDFS policies are not recognized by OneFS with this implementation. With OneFS 8.0.1.0 and later releases,
HDFS policies that are defined in Ranger are checked before the native file access control is applied. This two-
layered authorization model differs in the way the standard Ranger HDFS policies are checked with Direct
Attached Storage (DAS), but the model is suitable for using OneFS as a multiprotocol data lake with Hadoop.
OneFS native file system ACL allows a storage administrator to correctly set up access control for multiple
workloads and with multiprotocol access to the HDFS dataset. A Ranger administrator can apply a further
restrictive Hadoop user access control to the same HDFS dataset, thus providing the administrators the
appropriate control span within their management domains.
In an Isilon OneFS cluster with Hadoop deployment, Ranger authorization policies serve as a filter before
applying the native file access control.
Note: The Ranger Audit and Transparent Data Encryption components are not currently supported.
You can enable Apache Ranger on Isilon OneFS clusters. OneFS then checks for new authorization policies,
receives HDFS requests from clients, and applies the authorization policies to each HDFS request; the result
can be DENY, ALLOW, or UNDETERMINED.
Note:
The Ranger DENY policy takes precedence over the ALLOW policy.
The Ranger DENY policy prevents user or group access to files and directories in OneFS that the file
system would have otherwise allowed the users or groups to access.
To understand how the Ranger policies are applied, consider an example where a user in the Sales group
requires access to four paths (A, B, C, and D) that have specific HDFS file system ownership and permissions.
The following table shows whether the user can access each path under different Ranger policies:

Ranger policy | A | B | C | D
None | Yes | No | Yes | Yes
One policy that provides everyone in the Hadoop group access to the root directory | No | No | No | No
One policy that provides everyone, including the Sales group, access to the root directory | Yes | No | Yes | Yes
Apache Ranger 0.6.0 or a later version is required because the Ranger DENY policy is supported only with
these versions.
1. On a per-access zone basis, perform the following steps using the OneFS Web administration interface or
the OneFS command-line administration interface to configure Ranger:
b. Specify the URL of the Apache Ranger Management console to get the policies.
Refer to the OneFS 8.0.1 Web Administration Guide and OneFS 8.0.1 CLI Administration Guide for
guidance and information on configuring Ranger with OneFS.
2. Make sure that a Ranger service account user is configured on OneFS within your access zone.
3. Install Ranger using the steps outlined in the Hortonworks Security Guide.
Note: Do not assign Ranger components to the OneFS host as mentioned in step 5 of section 1.3.1 in the
above-mentioned guide.
4. Enable the Apache Ranger HDFS plugin using the steps outlined in the Hortonworks Security Guide.
5. If you have a Kerberos-enabled cluster, follow the instructions in the Hortonworks Security Guide to
enable the Ranger HDFS plugin on the cluster.
6. Enable the Ranger Deny policy using the instructions in the Apache Ranger deny policies with OneFS
8.0.1.0 ECN blog post.
Create a new service instance for OneFS using the Create Service page in Ranger. See the Hortonworks
Security Guide for details. Specify values in the following fields to create a new service instance for OneFS
and make note of the values:
Specify a value in the Service Name field and make note of it because you must use the same
value in OneFS.
A service instance is created with a default policy all - path, granting access to all the files for
the user that you included in the Service Details page.
7. Add all of your groups and individual users associated with an access zone within OneFS to the default
policy in order to grant access to the groups and users. If you create local users in OneFS, or use Active
Directory, you must change the UserSync settings in Ambari or add the users in the Ranger interface.
Note: OneFS file system permissions take precedence even if the policy indicates that the user or group
can access everything.
8. Using the Edit Policy page in the Ranger interface, specify the group or users who will have limited access
to the repository within OneFS and indicate the permissions that must be denied to that path.
9. Create a DENY policy in Ranger using the steps outlined in the Hortonworks Security Guide, if required.
After you have saved the policy, OneFS enforces it at the next policy download. If you attempt an action
on a path to which a Ranger policy denies access, the denial is reported in the OneFS HDFS log at
/var/log/hdfs.log. For more information, refer to the Apache Ranger deny policies with OneFS 8.0.1.0
ECN blog post.
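For example, to watch for denials on a OneFS node (the grep filter is illustrative):
tail -f /var/log/hdfs.log | grep -i deny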
Note: OneFS metrics for specific access zones that contain HDFS datasets are not currently supported.
Prerequisites
To administer Ambari with OneFS, make sure that you have installed the following components:
1. Access Ambari Web by opening a supported browser and entering the Ambari Web URL.
2. Click Ambari Metrics Service > Metrics Collector to determine the host name where Ambari Metrics
Collector has been installed.
4. On the Ambari Web home page, click the Dashboard tab and then click one of the following widgets to
view details related to the metrics data reported by OneFS:
Note:
The Cores (CPU) and Memory values are not supported in OneFS 8.0.1.0.
All of the Ambari metrics data points reported by OneFS are also available within the Ambari metrics
Grafana dashboards that are available with Ambari 2.2.2 and later versions.
Note:
Custom alerts that are created for HDFS are not recognized and reported.
The NameNode Host CPU utilization alert defaults to incorrect thresholds. Manually set the
threshold to a reasonable value, or set the Check Interval option to 1 minute to receive frequent
updates from this alert.
For more information, refer to the Ambari metrics and alerts with EMC Isilon OneFS ECN blog post.
Note: HDFS wire encryption supported by EMC Isilon is different from the Apache HDFS Transparent Data
Encryption technology. For more information, refer to the Enhanced Hadoop security with OneFS 8.0.1 and
Hortonworks HDP ECN blog post.
You can configure HDFS wire encryption using the OneFS web administration interface or command-line
administration interface. Refer to the OneFS 8.0.1 Web Administration Guide and OneFS 8.0.1 CLI
Administration Guide for details.
You can also edit the previous attributes directly using the Ambari Admin console or Cloudera Manager
depending on how you are managing your Hadoop cluster.
Save the settings in Ambari or Cloudera and restart the affected services. You must also restart all the
components of ZooKeeper even though you might not be prompted to do so. Also, if you have installed Hive
on your cluster, restart the Hive service completely or restart the Hive Metastore component individually.
For more information, refer to the Enhanced Hadoop security with OneFS 8.0.1 and Hortonworks HDP ECN blog
post.
Both HTTPFS and Knox can be installed on Kerberized and Non-Kerberized Hadoop clusters. Refer to the white
paper IMPLEMENTING HTTPFS & KNOX WITH ISILON ONEFS TO ENHANCE HDFS ACCESS SECURITY for
information about integrating HTTPFS and Knox with OneFS.
Telephone Support:
United States: 800-782-4362 (800-SVC-4EMC)
Canada: 800-543-4782
Worldwide: +1-508-497-7901
Additional worldwide access numbers