Grid Topologies Unit I


INTRODUCTION & WORKING

 Grid Computing can be defined as a network of computers working
together to perform a task that would be difficult for a single machine.
 All machines on that network work under the same protocol to act like a
virtual supercomputer.
 The task that they work on may include analysing huge datasets or
simulating situations which require high computing power.
 Computers on the network contribute resources like processing power
and storage capacity to the network.
 Grid Computing is a subset of distributed computing, in which a virtual
supercomputer comprises machines on a network connected by some
bus, mostly Ethernet or sometimes the Internet.
 It can also be seen as a form of parallel computing where, instead of
many CPU cores on a single machine, the cores are spread across
machines in various locations.
 The concept of grid computing isn’t new, but it is not yet perfected, as
no standard rules and protocols have been established and widely
accepted.

Working:

A grid computing network mainly consists of three types of machines:

1. Control Node: A computer, usually a server or a group of servers, which
administers the whole network and keeps track of the resources in the
network pool.
2. Provider: A computer that contributes its resources to the network
resource pool.
3. User: A computer that uses the resources on the network.
 When a computer makes a request for resources to the control node, the
control node gives the user access to the resources available on the
network.
 When a computer is not in use, it should ideally contribute its resources
to the network. Hence a normal computer on the network can alternate
between being a user and a provider, based on its needs.
 The nodes may consist of machines with similar platforms running the
same OS (a homogeneous network), or machines with different platforms
running various operating systems (a heterogeneous network).
 This flexibility is what distinguishes grid computing from other
distributed computing architectures.
 For controlling the network and its resources, a software/networking
protocol generally known as middleware is used.
 The middleware is responsible for administering the network; the control
nodes are merely its executors.
 As a grid computing system should use only the unused resources of a
computer, it is the control node’s job to ensure that no provider is
overloaded with tasks.
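The control node / provider / user interaction above can be sketched as a toy resource pool. The class and method names below are illustrative inventions, not part of any real grid middleware:

```python
# Toy sketch of the control-node / provider / user roles described above.
# All names here are invented for illustration.

class ControlNode:
    def __init__(self):
        self.providers = {}  # provider name -> spare CPU cores on offer

    def register_provider(self, name, spare_cores):
        # A provider contributes its unused resources to the pool.
        self.providers[name] = spare_cores

    def request(self, cores_needed):
        # A user asks the control node for resources; the control node
        # grants them from the provider with the most free capacity,
        # so that no single provider is overloaded.
        name = max(self.providers, key=self.providers.get)
        if self.providers[name] < cores_needed:
            return None  # not enough free capacity anywhere
        self.providers[name] -= cores_needed
        return name

pool = ControlNode()
pool.register_provider("alice-pc", spare_cores=4)
pool.register_provider("bob-pc", spare_cores=8)
granted = pool.request(6)   # served by bob-pc, the least-loaded provider
```

A machine that finishes its own work could call `register_provider` again with a larger `spare_cores`, which is the "alternating between user and provider" behaviour described above.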
Applications of Grid Computing

1. General Applications of Grid Computing


 Distributed Supercomputing
 High-throughput Supercomputing
 On-demand Supercomputing
 Data-intensive Supercomputing
 Collaborative Supercomputing

2. Applications of Grid Computing Across Different Sectors


 Movie Industry
 Gaming Industry
 Life Sciences
 Engineering and Design
 Government

Distributed Supercomputing
Distributed supercomputing may sound like a fancy term, but it is easy to
understand: a grid computing network spread across different geographical areas,
even different countries, is distributed supercomputing.
High-throughput Supercomputing
As the name suggests, high-throughput supercomputing is characterized by tasks
that require a large amount of processing power for an extended period of time,
which can range from a few months to several years. High-throughput
supercomputing exists to accomplish exactly these tasks.
On-demand Supercomputing
One of the applications of grid computing is on-demand supercomputing, which
came into existence to overcome the problems that enterprises faced with
fluctuating demand. In this model, computing services are provided by a third
party. It can be characterized by three attributes: pay-per-use, self-service,
and scalability. On-demand supercomputing increases a business’s agility.
Data-intensive Supercomputing
This is a kind of parallel computing in which a massive volume of data is
divided into chunks that are then processed simultaneously. The data can be on
the scale of big data. This type of computing is very useful when there is a
time constraint on the task or project, since the data is divided and worked
upon at the same time.
Collaborative Supercomputing
 As the name suggests, this kind of computing happens when one organization
collaborates with another to make use of its supercomputing abilities.
 Whether you are a mid-size or a large enterprise, if you do not have the
required skills and resources, collaboration is often the only way out.
 Rolls-Royce, a British aerospace and aviation firm, was one of the first
companies to adopt collaborative supercomputing.
 These were a few general applications of grid computing. Now let us look at
how some industries are applying it.

Applications of Grid Computing Across Different Sectors

Grid Computing in the Movie Industry


 Nowadays, an IT position in the media industry is a glamorous position to be
in; large studios are trying hard to recruit the best IT specialists and
programmers out there.
 One of the major reasons for recruiting the best talent is the increased need
for realism in movies, or what we call special effects.
 Many films could not be made without grid computing, not only because of
special effects but also because grid computing enables faster production of
a film.
 Grid computing is an enhancement tool for the studios.

Grid Computing in the Gaming Industry

 Traditionally, game development was an in-house affair.
 But with the growing popularity of online gaming, networking is becoming
pivotal, and this is where grid computing comes in.
 There are a few areas where grid computing is gaining popularity:

• Internal creation of in-game art

• In-game cut scenes rendering

• Packaging game assets for multiple platforms

• Distribution of online programs

• Hosting of a massively multiplayer online game

Grid Computing in Life Sciences

 Advances in the life sciences sector have led to accelerated changes in the
ways drug discovery and drug treatment are conducted.
 With these rapid changes, new challenges are also surfacing, such as massive
amounts of data analysis, data caching, data mining, and data movement.
 With great power comes great responsibility; likewise, with these
complexities came surrounding requirements such as secure storage, privacy,
secure data access, and more.
 This requires a grid computing architecture that can manage all these
complexities while accurately analyzing the data.
 It provides top-notch information with faster responses and accurate
results.

Grid Computing in Engineering and Design

 The pressure on engineering firms and the industry as a whole is increasing
day by day, demanding shorter turnaround times.
 The industry is in grave need of capturing data and speeding up its analysis.

Grid computing can answer these complexities:

o Analysis of real-time data to find a particular pattern.
o Experiment modeling to create new designs.
o Verifying existing models for accuracy using simulation activities.
Grid Computing in Government

 We all can imagine the amount of data that the government has to process. Grid computing
can help in coordinating all the data, which is held across government agencies.
 This will make way for clear coordination, not only in case of emergencies but in normal
situations as well.
 Grid computing will enable virtual organizations which may include many government
agencies as participants.
 This is an essential step to provide the required data in real-time and simultaneously analyze
the data to detect any problem and its solution.
Grid Topologies
A grid topology refers to the structure and organization of resources
within a grid infrastructure. There are three basic topologies associated with
grid computing: intra-grids, extra-grids and inter-grids.

INTRA GRID

Intra-grids are the simplest of the three grid topologies. A typical intra-grid
topology exists within a single organization, providing a basic set of grid
services. The primary characteristics of an intra-grid are a single security
provider, high and always-available bandwidth on the private network, and a
single environment within a single network.

EXTRA GRID

Extra-grids are more complicated in that they involve a consolidation of
different intra-grids. An extra-grid is characterized by dispersed security,
multiple organizations and remote/wide area network (WAN) connectivity. Put
simply, an extra-grid typically involves more than one security provider, and
the level of management complexity increases. Security is an increased concern
because data passes beyond organizational boundaries. Resources are more
heterogeneous (due to multiple organizations being involved), more dynamic in
nature (organizations do not have control over each other’s resources) and
typically require policies in order to control utilization of the resources.
Data and resources in an extra-grid environment are confined to an organization
and the partners it wishes to share them with.


INTER GRID

Inter-grids are the most complicated of the three topologies. Inter-grids have
the same characteristics as extra-grids, except that data and resources within
the environment are global and are available to the public.

Regardless of the topology being used, the user is still presented with the
same view of the system. That is, grid data and grid resources are made to
appear as if they are part of the local machine.


GRID INFRASTRUCTURES

An infrastructure is an underlying foundation that provides the basic
facilities, services and installations needed for the functioning of an
organization or system. For example, the road system allows people to travel by
vehicle, the banking system allows people to transfer funds across borders, and
the Internet allows people to communicate with each other and with virtually
any electronic device.

Grid computing is an emerging infrastructure that provides scalable and secure
mechanisms for the access, sharing and discovery of resources amongst dynamic
collections of individuals and institutions. One of its goals is to provide
these services in a manner that hides the implementation details from the user.
The availability of high-speed networks, low-cost components and the ubiquity
of the Internet and Web technologies make it feasible to realize this vision.

Components and Layers


Components

There are several key software components that must be considered when
designing a grid infrastructure. These components are outlined below:

Management: All grid systems consist of some form of management software. At
the very least, a grid has a management component responsible for keeping track
of resource availability and the users of the grid. Some, if not most, grids
contain software components that deal with the management of distributed
resources, workload and other grid components.

Contribution: Machines shared as part of a grid require a mechanism which
allows them to “contribute” themselves as a grid resource. This is typically
achieved through some form of contribution software. Contribution software
registers a machine as part of the grid, enables grid management functions, and
can accept and execute applications submitted to it by grid management
software. In addition, contribution software usually monitors resource
utilization on the local machine (e.g., processor utilization, disk space,
etc.) and application execution (i.e., did a submitted application execute
successfully?) and reports it to higher-level management components. This
information is used to make high-level resource utilization and scheduling
decisions.
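A minimal sketch of what contribution software does, assuming an invented in-memory registry and report format rather than any real grid protocol:

```python
# Sketch of "contribution" software: a machine registers itself as a grid
# resource and reports local utilization and job outcomes upward. The
# registry and field names are illustrative assumptions.

registry = {}  # machine name -> latest report seen by management software

def contribute(machine, cpu_percent, free_disk_gb):
    """Register (or update) this machine as a grid resource."""
    registry[machine] = {"cpu": cpu_percent, "disk": free_disk_gb}

def report_result(machine, job_id, ok):
    """Report whether a submitted application executed successfully."""
    registry[machine].setdefault("jobs", {})[job_id] = "ok" if ok else "failed"

contribute("node-1", cpu_percent=12.5, free_disk_gb=250)
report_result("node-1", job_id=42, ok=True)
```

In a real grid, the registry would live on the management side and the reports would travel over the network; here both are collapsed into one process for brevity.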

Submission: Submission software enables a grid user to submit applications to
the grid and perform grid queries. For example, the user might want to find out
if certain resources are present on the grid, or query the status of a
submitted application. These actions typically require some form of
authentication on the part of the user, such as requiring a username and
password or obtaining some type of authentication “document” such as one
provided by a Certificate Authority (CA). Submission software can be bundled as
part of the contribution software or implemented as a separate entity.

Communication: Communication software is responsible for making sure that
applications running on the grid are capable of locating and communicating with
one another. For example, an application might consist of several sub-jobs that
perform a distributed computation. Once these sub-jobs have been scheduled and
deployed onto local machines on the grid, they need to be able to locate and
establish a communication link with each other in order to perform the
computation.

Scheduling: Grid systems that support the execution of grid applications
typically contain software components that handle application scheduling.
Scheduling software is responsible for locating the appropriate machines on
which to run applications. A simple scheduler queues requests to execute an
application and schedules these requests in order of arrival. A more advanced
scheduler would support application priority. An advanced scheduler might react
to grid workload, using information about machine utilization, by scheduling
applications to less utilized machines. Other advanced features include
detecting and re-submitting applications upon failure, and a reservation system
that guarantees that an application is able to execute at a specified date and
time.
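The simple (arrival-order) and priority-based schedulers described above can be sketched in a few lines; the `PriorityScheduler` name and interface are illustrative, not from any real grid system:

```python
import heapq

# Sketch of the two scheduler styles described above: with equal priorities
# this behaves as a simple FIFO scheduler; giving an application a lower
# priority number makes it run sooner.

class PriorityScheduler:
    def __init__(self):
        self._queue = []
        self._arrival = 0

    def submit(self, app, priority=0):
        # Lower number = higher priority; arrival order breaks ties, so
        # with default priorities this degrades to order-of-arrival.
        heapq.heappush(self._queue, (priority, self._arrival, app))
        self._arrival += 1

    def next_app(self):
        return heapq.heappop(self._queue)[2]

sched = PriorityScheduler()
sched.submit("batch-report")                    # default priority
sched.submit("urgent-simulation", priority=-1)  # jumps the queue
sched.submit("nightly-backup")
order = [sched.next_app() for _ in range(3)]
# urgent-simulation runs first; the rest run in arrival order
```

A workload-aware scheduler would additionally consult monitoring data (machine utilization) when picking where, not just when, each application runs.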

Monitoring: Monitoring software is responsible for monitoring the “health” of
the grid. It is sometimes referred to as “sensor” software. Grids contain tools
and components that monitor the load and activity on each of their machines.
Load information is used in order to discover usage patterns and to make
intelligent scheduling and resource utilization decisions. Information received
from monitoring software can also be used for accounting purposes, for
recording application profiles, and many other things. Application profiles are
used in order to predict the run time and usage patterns for certain types of
applications.
GRID ARCHITECTURE

Building a grid infrastructure requires the design and development of protocols
and services which address issues of security, resource aggregation, resource
discovery, resource selection, job scheduling, job execution management, and
more.

The key layers involved in the architecture are:

 Fabric: This consists of all the distributed resources, owned by different
individuals and organizations, shared on the grid. This includes
workstations, resource management systems, storage systems, specialized
devices, etc.
 Resource and Connectivity Protocols: This contains core communication
and authentication protocols that provide secure mechanisms for verifying
the identity of users and resources and allow data to be shared between
resources.
 Collective Services: This contains Application Programming Interfaces
(APIs) and services that implement interactions across collections of
resources. This includes directory and brokering services for resource
allocation and discovery, monitoring and diagnostic services, application
scheduling and execution, and more.
 User Applications: This contains programming tools and user applications
that depend on grid resources and services during their execution.
WEB SERVICES

Service-Oriented Architecture (SOA): A Service-Oriented Architecture (SOA) is
an architectural style involving a collection of services capable of
communicating with one another. This communication can involve either data
passing or two or more services coordinating some activity.

Web services are a form of SOA. Web service interfaces describe a collection of
operations that are network-accessible through the use of XML messaging. The
purpose of Web services is to facilitate application-to-application
communication. An example of this communication is messaging via the Simple
Object Access Protocol (SOAP).

Web Services define techniques for describing software components that allow
access to themselves, methods for accessing these components, and techniques
that allow for the identification and discovery of relevant service providers.

There are three roles that are assumed in the Web Services model. The requester
is an entity that requests the use of a particular service. Services are
created and deployed on a server that is capable of receiving messages in a
particular encoding over a transfer protocol (e.g., the Hypertext Transfer
Protocol (HTTP)). The server might support several encoding/protocol pairs,
each of which is called a binding because it binds a high-level service
definition to a low-level means of invoking the service. From the requester’s
point of view, a registry represents a collection of services which may provide
suitable implementations of the interfaces that it needs to use. The requester
can search for the interface it needs by filtering out interfaces based on
criteria associated with the binding.

Web services are based on three key technologies:

 Simple Object Access Protocol (SOAP)
 Web Services Description Language (WSDL)
 Web Service Inspection (WSI)

SOAP is an XML-based messaging protocol used for encoding Web Service request
and response messages before sending them over a network. SOAP defines a
convention for Remote Procedure Calls (RPC) and a convention for messaging,
independent of the underlying transport protocol. The messages can be
transported over a variety of protocols such as FTP, SMTP, MIME and HTTP.
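As a rough illustration, a simplified SOAP-style request for a hypothetical `getTemperature` operation can be assembled with Python's standard library. The `http://example.org/weather` namespace and the operation are invented for this example; real envelopes are produced by a SOAP toolkit:

```python
import xml.etree.ElementTree as ET

# Hand-built, simplified SOAP 1.1 request envelope. The service namespace
# and getTemperature operation are illustrative inventions.

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
call = ET.SubElement(body, "{http://example.org/weather}getTemperature")
ET.SubElement(call, "city").text = "Chennai"

# Serialize to the XML payload that would be carried by HTTP, SMTP, FTP,
# etc. -- SOAP itself does not care which transport is used.
message = ET.tostring(envelope, encoding="unicode")
```

The response would be another envelope of the same shape, with the result inside the `Body` element.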

WSDL is an XML-based language that defines the set of messages, the encodings
and the protocols that are used to communicate with a service. It allows
multiple bindings for a single interface.

WSI is an XML-based language designed to assist in locating service
descriptions; it involves rules that define how inspection-related information
should be made available. A WSI document is a collection of references to WSDL
documents.


TRADITIONAL PARADIGMS FOR DISTRIBUTED COMPUTING

1. Socket programming: Sockets provide a low-level API for writing distributed
client/server applications. Before a client communicates with a server, a
socket endpoint needs to be created. The transport protocol chosen for
communications can be either TCP or UDP in the TCP/IP protocol stack. The
client also needs to specify the hostname and port number that the server
process is listening on. The standard socket API is well-defined; however, the
implementation is language dependent. This means socket-based programs can be
written in any language, but the socket APIs will vary with each language used.
Typically, the socket client and server will be implemented in the same
language and use the same socket package, but they can run on different
operating systems (as in the Java case). Socket programming is a low-level
communication technique, but it has the advantage of being a low-latency,
high-bandwidth mechanism for transferring large amounts of data compared with
other paradigms. However, sockets are designed for the client/server paradigm,
and today many applications have multiple components interacting in complex
ways, which means that application development can be an onerous and
time-consuming task. This is due to the need for the developer to explicitly
create, maintain, manipulate and close multiple sockets.
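A minimal sketch of this workflow using Python's standard `socket` module: the server creates an endpoint, binds and listens on a port, and the client connects using the server's hostname and port (TCP here):

```python
import socket
import threading

# Toy echo server/client pair illustrating the socket workflow: create a
# socket endpoint, bind/listen on the server side, connect from the client.

def serve(server_sock):
    conn, _addr = server_sock.accept()   # wait for one client
    with conn:
        data = conn.recv(1024)           # read the request bytes
        conn.sendall(b"echo: " + data)   # write the reply bytes

server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_sock.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
server_sock.listen(1)
port = server_sock.getsockname()[1]
threading.Thread(target=serve, args=(server_sock,)).start()

# The client must know the server's hostname and port in advance.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello grid")
    reply = client.recv(1024)

server_sock.close()
```

Note how even this tiny example must explicitly create, connect and close several sockets; that bookkeeping is exactly the "onerous" part the paragraph above refers to.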

2. RPC: RPC is another mechanism that can be used to construct distributed
client/server applications. RPC can use either TCP or UDP for its transport
protocol. RPC relies heavily on an Interface Definition Language (IDL)
interface to describe the remote procedures executing on the server side. From
an RPC IDL interface, an RPC compiler can automatically generate a client-side
stub and a server-side skeleton. With the help of the stub and skeleton, RPC
hides the low-level communication and provides a high-level communication
abstraction for a client to directly call a remote procedure as if the
procedure were local. RPC itself is a specification, and implementations such
as Open Network Computing (ONC) RPC from Sun Microsystems and Distributed
Computing Environment (DCE) RPC from the Open Software Foundation (OSF) can be
used directly for implementing RPC-based client/server applications.


The steps to implement and run a client/server application with RPC are:

 Write an RPC interface in RPC IDL;
 Use an RPC compiler to compile the interface to generate a client-side
stub and a server-side skeleton;
 Implement the server;
 Implement the client;
 Compile all the code with an RPC library;
 Start the server;
 Start the client with the IP address of the server.
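The same client/server shape can be sketched with XML-RPC from Python's standard library. This is an analogy, not ONC or DCE RPC itself: there is no separate IDL/compiler step, and the dynamically generated proxy object plays the role of the client-side stub:

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# XML-RPC analogue of the RPC steps above.

# "Implement the server": expose one remote procedure named "add".
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
port = server.server_address[1]

# "Start the server" (one request handled in a background thread here).
threading.Thread(target=server.handle_request).start()

# "Start the client with the IP address of the server": the proxy hides
# the network communication, like a generated client-side stub.
client = ServerProxy(f"http://127.0.0.1:{port}")
result = client.add(2, 3)   # looks like a local call but runs remotely
server.server_close()
```

The call `client.add(2, 3)` is marshalled into an XML request, sent over HTTP, executed on the server, and the result is unmarshalled back, which is precisely the stub/skeleton division of labour described above.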

3. Java RMI: Java RMI is an object-oriented mechanism from Sun Microsystems
for building distributed client/server applications. Java RMI is an RPC
implementation in Java. Similar to RPC, Java RMI hides the low-level
communication between client and server by using a client-side stub and a
server-side skeleton (which is not needed in Java 1.2 or later) that are
automatically generated from a class that extends
java.rmi.server.UnicastRemoteObject and implements an RMI Remote interface. At
run time there are three interacting entities involved in an RMI application.
These are:

i. A client that invokes a method on a remote object.
ii. A server that runs the remote object, which is an ordinary object in the
address space of the server process.
iii. The object registry (rmiregistry), which is a name server that relates
objects with names. Remote objects need to be registered with the registry.
Once an object has been registered, the registry can be used to obtain access
to a remote object using the name of that object.

The steps to implement and run a Java RMI client/server application are:

• Write an RMI interface;
• Write an RMI object to implement the interface;
• Use the RMI compiler (rmic) to compile the RMI object to generate a
client-side stub and a server-side skeleton;
• Write an RMI server to register the RMI object;
• Write an RMI client;
• Use the Java compiler (javac) to compile all the Java source files;
• Start the RMI name server (rmiregistry);
• Start the RMI server;
• Start the RMI client.

4. DCOM: The Component Object Model (COM) is a binary standard for building
Microsoft-based component applications, which is independent of the
implementation language. DCOM is an extension to COM for distributed
client/server applications. Similar to RPC, DCOM hides the low-level
communication by automatically generating a client-side stub (called a proxy
in DCOM) and a server-side skeleton (called a stub in DCOM) using Microsoft’s
Interface Definition Language (MIDL) interfaces. DCOM uses a protocol called
the Object Remote Procedure Call (ORPC) to invoke remote COM components. DCOM
is language independent; clients and DCOM components can be implemented in
different languages. Although DCOM is available on non-Microsoft platforms, it
has only achieved broad popularity on Windows. Another drawback of DCOM is
that it only supports synchronous communications.

The steps to implement and run a DCOM client/server application are:

• Write an MIDL interface;
• Use an interface compiler (midl) to compile the interface to generate a
client-side stub and a server-side skeleton;
• Write the COM component to implement the interface;
• Write a DCOM client;
• Compile all the code;
• Register the COM component with a DCOM server;
• Start the DCOM server;
• Start the DCOM client.

5. CORBA: CORBA is an object-oriented middleware infrastructure from the
Object Management Group (OMG) for building distributed client/server
applications. Similar to Java RMI and DCOM, CORBA hides the low-level
communication between the client and server by automatically generating a
client-side stub and a server-side skeleton through an Interface Definition
Language (IDL) interface. CORBA uses the Internet Inter-ORB Protocol (IIOP) to
invoke remote CORBA objects. The Object Request Broker (ORB) is the core of
CORBA; it performs data marshalling and unmarshalling between CORBA clients
and objects. Compared with Java RMI and DCOM, CORBA is independent of
location, platform and programming language. CORBA supports both synchronous
and asynchronous communications. CORBA has an advanced directory service
called COSNaming, which provides the mechanisms to allow the transparent
location of objects. However, CORBA itself is only an OMG specification.

The steps to implement and run a CORBA client/server application are:

• Write a CORBA IDL interface;
• Use an IDL compiler to compile the interface to generate a client-side
stub and a server-side skeleton;
• Write a CORBA object to implement the interface;
• Write a CORBA server to register the CORBA object;
• Write a CORBA client;
• Compile all the source code;
• Start a CORBA name server;
• Start the CORBA server;
• Start the CORBA client.
