PRACTICAL FILE
PRACTICAL - 1
AIM: WORKING WITH AMAZON WEB SERVICES.
2. S3
Browse buckets and view their properties.
View properties of objects.
4. Services Dashboard
Provides information about available services and their status.
All information related to the billing of the user.
Switch users to see the resources in multiple accounts.
Step 2: Create and Configure Your Virtual Machine
c) Now choose an instance type. Instance types comprise varying combinations of
CPU, memory, storage, and networking capacity. Then click Review and Launch.
d) Click Launch.
a) Select Create a new key pair and name it MyFirstKey. Then click Download Key Pair.
b) After you have downloaded and saved your key pair, click Launch Instance.
c) Click View Instances to view the instance you have just created.
a) Select the Windows Server instance you just created and click Connect
b) To connect to your Windows virtual machine instance, you need a user name and password:
o The User name defaults to Administrator
o To receive your password, click Get Password
c) Click Choose File and browse to the directory where you stored MyFirstKey. Your Key Pair will
surface in the text box. Click Decrypt Password.
d) These are your Windows Server admin login credentials.
f) When prompted to log in to the instance, use the User Name and Password you generated above to
connect to your virtual machine.
a) Click the Actions button, navigate to Instance State, and click Terminate.
c) Store and Retrieve a File with Amazon S3
c) Leave these options disabled and select Next.
e) Select Create bucket.
c) Click on your folder's name to navigate to the folder.
f) Leave the default values and select Next.
Step 4: Retrieve the Object
a) Select the checkbox next to the file you want to delete and select More > Delete.
b) Review and confirm the object you want to delete. Select Delete.
c) Select the bucket you created and select Delete. Type in the name of your bucket and select Confirm.
d) Create and Connect to a MySQL Database with Amazon RDS
d) Select the MySQL option under Dev/Test and click Next Step.
f) You are now on the Configure Advanced Settings page, where you can provide additional
information that RDS needs to launch the MySQL DB instance. Click Launch DB Instance.
Step 2: Download a SQL Client
a) Launch the MySQL Workbench application and go to Database > Connect to Database
c) Now you can start creating tables, inserting data, and running queries.
i. Create Table
ii. Insert Values Into Table
iii. Select Values
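The three operations above (create table, insert values, select values) can be sketched with the mysql command-line client. The endpoint, user, table name, and sample values here are assumptions for illustration; substitute the values from your own RDS instance.

```shell
# connect to the RDS endpoint (hostname and user below are placeholders)
mysql -h mydbinstance.example.rds.amazonaws.com -u masteruser -p <<'SQL'
CREATE DATABASE IF NOT EXISTS school;
USE school;
-- i. Create Table
CREATE TABLE students (id INT PRIMARY KEY, name VARCHAR(50));
-- ii. Insert Values Into Table
INSERT INTO students (id, name) VALUES (1, 'abc'), (2, 'xyz');
-- iii. Select Values
SELECT * FROM students;
SQL
```

The same statements can also be run interactively from the MySQL Workbench query editor mentioned above.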
a) Go back to your Amazon RDS Console. Select Instance Actions and click Delete
from the dropdown menu.
b) Check the acknowledgment box, and click Delete.
Step 2: Add Data to the NoSQL Table
a) Click the Items tab. Under the Items tab, click Create item.
d) Repeat the process to add a few more items to the table.
a) Using the drop-down list in the dark gray banner above the items, change Scan to
Query.
Step 4: Delete an Existing Item
Step 5: Delete a NoSQL Table
a) In the Amazon DynamoDB console, click the Actions dropdown and click Delete table.
PRACTICAL 2
AIM: WORKING WITH MICROSOFT AZURE.
1. Click on Virtual Machines in the Navigation pane on the left. The Virtual Machines
panel will open.
2. Click on Create Virtual Machine button. This will give you the options to choose the
type of Virtual Machine you want.
3. Click on Windows Server. A navigation pane with various OS options opens on the
right side. Choose the OS as required.
4. After selecting the OS, it is time to configure your virtual machine. First a basic
configuration page will open. Here, you can set the name of the virtual machine,
select the location, set the username and password, select subscription type, the type
of disk and resource group. Then click OK.
5. On the next screen, you will be asked to select the configuration based on size of
machine, memory available, number of CPUs, number of IOPS.
6. Next is configuring some optional features for the virtual machine such as High
availability, managed storage, network etc.
7. After clicking OK, the final page opens showing the details of the virtual machine,
asking to agree to the Terms and Conditions and then Purchase the virtual machine.
When Purchase is clicked, the deployment starts.
8. After waiting for some time, the machine gets deployed and is ready for use.
9. Click on the Connect button in the top bar. It downloads a Remote Desktop Connection
file. Open the file; it asks for the Username and Password.
PRACTICAL 3
INTRODUCTION:
Google Cloud Platform, offered by Google, is a suite of cloud computing services that runs
on the same infrastructure that Google uses internally for its end-user products, such
as Google Search and YouTube. Alongside a set of management tools, it provides a series of
modular cloud services including computing, data storage, data analytics and machine
learning.
A sample of products is listed below; this is not an exhaustive list.
Google Compute Engine - IaaS providing virtual machines.
Google App Engine - PaaS for application hosting.
Bigtable - IaaS massively scalable NoSQL database.
BigQuery - SaaS large-scale database analytics.
Google Cloud Functions - FaaS providing serverless functions to be triggered by cloud events (in beta testing as of August 2017).
Google Cloud Datastore - DBaaS providing a document-oriented database.
Cloud Pub/Sub - a service for publishing and subscribing to data streams and messages. Applications can communicate via Pub/Sub without direct integration between the applications themselves.
Google Storage - IaaS providing RESTful online file and object storage.
2. Click the Create instance button.
3. In the Boot disk section, click Change to begin configuring your boot disk.
6. In the Firewall section, select Allow HTTP traffic.
2. Under the Name column, click the name of your virtual machine instance.
3. At the top of the VM instance's details page, click the Create or reset Windows
Password button.
4. Specify a username, then click Set to generate a new password for this Windows
instance. Save the username and password so you can log into the instance.
CLEAN UP
To avoid incurring charges to your Google Cloud Platform account for the resources used in
this quickstart:
1. Install the Google plugin for the Eclipse IDE. After installation, click on Create New
Project, select the Google folder, and then Web Application Project.
2. Create a new project in Google cloud console and note down the project id.
3. Also click on Next in Eclipse. Now enter the same project name as created in GCP
and also enter the project id. Click on Finish.
4. Open the index.html file under the war folder and change the text you want to display
in the App Engine.
Also, in the Java file in the src folder, enter the message which will be displayed after
clicking the link.
5. In the project properties, change the compiler compliance level from 1.8 to 1.7.
6. Under the Google icon, select the option Deploy Project to Google App Engine. Click
on the Deploy option.
7. In the browser your first google app will be displayed. Click on the link.
8. The final text will be displayed. Save the app link and open your app any time.
http://1-dot-starry-expanse-137723.appspot.com/my_project
(C) CLOUD STORAGE
CREATE A BUCKET
1. Open the Cloud Storage browser in the Google Cloud Platform Console.
OPEN THE CLOUD STORAGE BROWSER
2. Click CREATE BUCKET.
That's it! You've just created a Cloud Storage bucket!
1. Click UPLOAD FILES.
2. In the file dialog, navigate to the file that you downloaded and select it.
After the upload completes, you should see the file name, size, type, and last modified date in
the bucket.
Try these ways of interacting with the objects from the Cloud Storage browser:
1. Right-click the file and select the option to save it.
CREATE FOLDERS
You should see the folder in the bucket with an image of a folder icon to distinguish it
from objects.
DELETE OBJECTS
CLEAN UP
To avoid incurring charges to your Google Cloud Platform account for the resources
used in this quickstart:
1. Open the Cloud Storage browser in the Google Cloud Platform Console.
OPEN THE CLOUD STORAGE BROWSER
2. Select the checkbox next to the bucket that you created.
3. Click DELETE.
4. Click Delete to permanently delete the bucket and its contents.
2. Select your project and click Continue.
3. Click Create Instance.
4. Click MySQL.
6. Enter myinstance for Instance ID.
You are returned to the instances list; your new instance is greyed out while it
initializes and starts.
CONNECT TO YOUR INSTANCE USING THE MYSQL CLIENT IN THE CLOUD
SHELL
1. In the Google Cloud Platform Console, click the Cloud Shell icon ( ) in the
upper right corner.
When the Cloud Shell finishes initializing, you should see:
Welcome to Cloud Shell! Type "help" to get started.
username@example-id:~$
3. Enter your root password.
You should see the mysql prompt.
INSERT INTO entries (guestName, content) values ("first guest", "I got here!");
INSERT INTO entries (guestName, content) values ("second guest", "Me too!");
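For the INSERT statements above to succeed, a database and an entries table must already exist. A minimal sketch via the mysql client in the Cloud Shell; the database name and column definitions are assumptions chosen to be consistent with the statements above:

```shell
mysql -u root -p <<'SQL'
CREATE DATABASE IF NOT EXISTS guestbook;
USE guestbook;
-- columns match the INSERT statements used in this quickstart
CREATE TABLE entries (
  entryID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  guestName VARCHAR(255),
  content VARCHAR(255)
);
SQL
```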
CLEAN UP
To avoid incurring charges to your Google Cloud Platform account for the resources
used in this quickstart:
1. Go to the Cloud SQL Instances page in the Google Cloud Platform Console.
GO TO THE CLOUD SQL INSTANCES PAGE
2. Select the myinstance instance to open the Instance details page.
4. In the Delete instance window, type myinstance, then click Delete to delete the
instance.
You cannot reuse an instance name for approximately 7 days after an instance is
deleted.
(E) CLOUD DATASTORE
STORE DATA
3. If you see the following page, you need to select a location. (Go to the next step if you
do not see this page.)
The location applies to both Cloud Datastore and Google App Engine for your Google
Cloud Platform project. You cannot change the location after it has been saved.
To save a location, select one of the location values and click Next.
7. Click Create. The console displays the Task entity that you just created.
You just stored data in Cloud Datastore!
RUN A QUERY
Cloud Datastore supports querying data by kind or by Google Query Language
(GQL); the instructions below walk you through the steps of doing both.
Run kind queries
1. Click Query by kind.
2. Select Task as the kind.
The query results show the Task entity that you created.
Next, add a query filter to restrict the results to entities that meet specific criteria:
1. Click Filter entities.
2. In the dropdown lists, select done, is a boolean, and that is false.
3. Click Apply filters. The results show the Task entity that you created, since
its done value is false
4. Now try a query of done, is a boolean, and that is true. The results do not include
the Task entity that you created, because its done value is not true.
Again, add a query filter to restrict the results to entities that meet specific criteria:
3. Now run a query such as SELECT * FROM Task WHERE done=true. The results do
not include the Task entity that you created, because its done value is not true.
CLEAN UP
1. Click Query by kind and ensure Task is the selected kind.
2. Click Clear filters.
3. Select the Task entity that you created
4. Click Delete, and then confirm you want to delete the Task entity. Once deleted, the
entity is permanently removed from Cloud Datastore.
The Task entity that you previously created is deleted from Cloud Datastore.
PRACTICAL 4
Aim: INSTALLATION OF APACHE HADOOP
Installing Hadoop 2.7.1
This method of installing Hadoop works for any version of Hadoop 2.x.x. As we know,
Hadoop requires a JVM to run, so we need to install Java before installing Hadoop. Before
installing Java, let us update our package list; doing this will automatically give the latest
version of Java from the Linux vendor.
Doing this will require an internet connection. Once you complete the above step, you can
install Java by typing the below command in the terminal. (Note: you can use any other
command to install Java.)
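The update and install steps described above can be sketched as follows for a Debian/Ubuntu system; the `openjdk-7-jdk` package name is an assumption, and any JDK providing Java 1.6 or above works:

```shell
# refresh the package list so the latest Java is offered by the vendor
sudo apt-get update
# install a JDK (package name assumed; any Java 1.6+ JDK is fine)
sudo apt-get install -y openjdk-7-jdk
```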
When you are done, check which Java version is installed:
$ java -version
The above result shows that our installed version is 1.7.65; make sure that
you have Java 1.6 or above.
ssh is the Secure Shell. This application allows us to get remote access to any machine (or
localhost) using a password other than root's, and it also allows us to bypass the password by
setting it to empty. To install ssh, use the following command:
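A sketch of the install command, assuming a Debian/Ubuntu system (the `openssh-server` package name is an assumption):

```shell
sudo apt-get install -y openssh-server
```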
If we try to connect to localhost (the local machine) through ssh, it will ask for the user
password. To check this, you can type this command in the terminal:
$ ssh localhost
Note: before going further we need to exit ssh; just type exit in the same terminal.
So we need to set up ssh for password-less communication. To do that, execute the following
command in the terminal.
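The key-generation command described in the next paragraph (note the two single quotes after -P, with no space between them):

```shell
# generate an RSA key pair with an empty passphrase
ssh-keygen -t rsa -P ''
```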
Please note that there are two single quotes after 'P' in the command, without a space. After
entering this command it will ask: Enter file in which to save the key (/home/abc/.ssh/id_rsa):
press Enter without typing anything. You will then get an image, called a randomart image.
This image varies from machine to machine, and the key will be used for authentication
between any two machines.
This command creates an RSA key pair with an empty password. Generally, using an
empty password is not recommended, but in this case it is needed to unlock the key without
your interaction (you don't want to enter the passphrase every time Hadoop interacts with its
nodes).
Now we need to append the generated public key to the local machine's authorized keys so
that it is trusted. To do this, use this command.
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
$ssh localhost
If this step asks you for a password, it means you have done something wrong. To repair
this, repeat the above steps starting from just after installing ssh. Note: before going
further we need to exit ssh; just type exit in the same terminal.
Once we have completed this, we need to download Hadoop 2.6 (or any version) from its
official site http://hadoop.apache.org/ and then extract the Hadoop tar.gz, manually or through
the terminal. Now we need to move the Hadoop folder to root. This step is optional, but it is
recommended that you move the folder to root. To move the Hadoop folder to its appropriate
location, use the following command (note: this command is only used to move the folder to
root; if you are placing it in another location you can do it manually).
sudo: This keyword temporarily grants the user superuser permission. It is a Linux native
command; it means "super user do".
mv: This is the Linux native command to move any file or directory to any location. It has two
parameters:
parameter 1: source address
parameter 2: destination address
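Putting the two parameters together, the move command looks like this (the source path matches the extraction location described below; adjust it to wherever you extracted the archive):

```shell
# move and rename the extracted folder to /usr/local/hadoop
sudo mv Desktop/hadoop-2.6.0 /usr/local/hadoop
```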
In the above command my source address is Desktop/hadoop-2.6.0 (you can change this
according to your source location) and my destination address is /usr/local/hadoop.
Note: I haven't given '/' after my destination; this means that I am renaming my source folder
from hadoop-2.6.0 to hadoop.
Now we need to set system environment variables so that our system identifies Hadoop. To do
this, open the bashrc file as root in any text editor (in my case I am using gedit).
Note: sometimes you get a blank file; please make sure that the file is ~/.bashrc.
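The environment-variable lines referenced below ("Line 3 to 8") are not reproduced in this document. A sketch of what they typically look like, assuming the /usr/local/hadoop install location used above; the JAVA_HOME path is an assumption and should point at your own JDK:

```shell
# lines 1-2: Java and Hadoop install locations
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
# lines 3-8: Hadoop component locations, defined once to reduce our work later
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
```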
Line 3 to 8: These are the Hadoop component locations. We define them to reduce our
work later; I will explain the use of these lines in depth later. Save and close ~/.bashrc.
As we have successfully added the environment variables, we need to reflect them in the
current session:
$ source ~/.bashrc
Once you have done this, you can check whether Hadoop is installed properly with the
following command.
$hadoop version
If you get something like this, it means you have successfully set up Hadoop on your system.
Now the last thing: we need to update JAVA_HOME in Hadoop, so open
/hadoop/etc/hadoop/hadoop-env.sh from your installed Hadoop path and find this line in it.
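The line to change typically ends up looking like the following; the JDK path is an assumption and should match the JAVA_HOME used in ~/.bashrc:

```shell
# in /usr/local/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
```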
Save it and exit.
3) Distributed Mode
This mode is also called Multinode mode. It needs some changes to be made on top of
Pseudo-distributed mode, along with ssh. This mode is generally used for commercial purposes.
Hadoop is by default configured in Standalone mode. Standalone mode is used only
for debugging purposes, but to develop any application we need to configure Hadoop in
Pseudo-distributed mode.
To configure hadoop in Pseudo Distributed mode we need to edit following files :
1)core-site.xml
2)hdfs-site.xml
3)mapred-site.xml
4)yarn-site.xml
Please note that we need to carry out the steps as explained in the previous section, Setting up
Hadoop 2.6.0 on Linux.
All the mentioned files are present in the Hadoop installation directory under etc/hadoop; in my
case, as per the previous document, the address is /usr/local/hadoop/etc/hadoop.
1) Configuring core-site.xml
core-site.xml is a file containing all the core properties of Hadoop, for example the namenode
URL, the temporary storage directory path, etc. Hadoop has predefined configurations, which
we can override: if we mention any configuration in core-site.xml, then during startup Hadoop
will read it and run using it. To get more details on Hadoop's default configuration you can visit
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml.
So let us configure some of our requirements.
Open this file in any text editor and add these contents between
<configuration></configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/abc/tmp</value>
</property>
property 1: fs.defaultFS
This property overrides the default namenode URL. Its syntax is hdfs://<ip-address of
namenode>:<port number>. This property was named fs.default.name in Hadoop 1.x.x
versions. Note: the port number can be any free port, up to 65535.
property 2: hadoop.tmp.dir
This property changes the temporary storage directory used during execution of any
algorithm in Hadoop. By default its location is /tmp/hadoop-${user.name}; in my case I have
created a directory named tmp in my home folder, so it is /home/abc/tmp.
2) Configuring hdfs-site.xml
This file contains all configuration of the Hadoop distributed file system, also called HDFS,
such as the storage location for the namenode, the storage location for the datanode, the
replication factor of HDFS, etc. Similar to core-site.xml, we need to place the below content
between the configuration fields; to get more information on this, visit the above-mentioned link.
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/abc/tmp/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/abc/tmp/datanode</value>
</property>
Property 1: dfs.replication
This property overrides the replication factor in Hadoop. By default its value is 3, but in a
single-node cluster it is recommended to be 1.
Property 2: dfs.namenode.name.dir
This property overrides the storage location of namenode data. By default the storage location
is inside /tmp/hadoop-${user.name}. To change this, we set the value to our folder location;
in my case it is inside the tmp directory created during the core-site.xml step.
Property 3: dfs.datanode.data.dir
This property overrides the storage location of datanode data. By default the storage location is
inside /tmp/hadoop-${user.name}. To change this, we set the value to our folder location;
in my case it is also inside the tmp directory created during the core-site.xml step.
Please make sure that if the location of both the datanode and the namenode is in the root
directory, we change their ownership and read/write access using the chown and chmod
commands in Linux. Also, we can create these directories manually before setting them in the
paths; otherwise Hadoop will create them for you.
3) Configuring mapred-site.xml
This file contains all configuration of the MapReduce component in Hadoop. Please note that
this file doesn't exist by default, but you can copy or rename it from mapred-site.xml.template.
The configuration for this file should be as follows.
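Creating the file from the template can be done like this; the path assumes the /usr/local/hadoop install location used earlier:

```shell
cd /usr/local/hadoop/etc/hadoop
# create mapred-site.xml from the shipped template
cp mapred-site.xml.template mapred-site.xml
```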
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
As we know, from Hadoop 2.x.x onwards a new layer of technology has been introduced to
improve the performance of the MapReduce algorithm. This layer is called YARN, that is,
Yet Another Resource Negotiator. So here we are configuring our Hadoop framework to be
yarn; if we don't specify this property, Hadoop will use MapReduce 1, also called MR1.
4) Configuring yarn-site.xml
This file contains all information about YARN. As we will be using MR2, we need to specify
the auxiliary services to be used with MR2, so add these lines to yarn-site.xml.
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
Now we have successfully configured Hadoop 2.6.0 (or, say, Hadoop 2.x.x) in Pseudo-distributed
mode.
Before starting Hadoop we need to format the namenode. Execute this command to format the
namenode.
$hdfs namenode -format
$ start-dfs.sh
$ start-yarn.sh
$ start-all.sh
To check which components are running, you can use the below command.
$ jps