
Distributed System Unit No 2

What is Middleware?
Middleware is a type of computer software program that provides services to software
applications beyond those available from the operating system. It can be described as
"software glue".
Middleware makes it easier for software developers to implement communication and
input/output, so they can focus on the specific purpose of their application.

Role of middleware in distributed systems –


https://www.tutorchase.com/answers/ib/computer-science/what-is-the-role-of-
middleware-in-distributed-systems

Types of Middleware –
Message-Oriented Middleware (MOM):
Message-oriented middleware is designed to facilitate communication between
distributed applications through asynchronous message passing. Here's a more
detailed breakdown of its features and functionalities:
Asynchronous Communication: MOM systems allow applications to communicate
asynchronously, meaning that senders and receivers do not need to interact with each
other in real-time. Messages can be sent and received independently, enhancing
system flexibility and scalability.
Message Queuing: MOM typically employs message queues, which act as temporary
storage for messages sent from producers (senders) to consumers (receivers). This
decouples the communication process, enabling producers to send messages even if
consumers are temporarily unavailable.
Message Routing: MOM systems provide mechanisms for routing messages to their
intended destinations based on predefined rules or criteria. This ensures that messages
are delivered to the appropriate recipients efficiently.
Some of the routing mechanisms are point-to-point (P2P) messaging, content-based
messaging, and publish-subscribe messaging.
Message Transformation: MOM may offer features for transforming messages
between different formats or protocols. For instance, it can convert messages from one
data format to another to accommodate the requirements of different applications or
systems.
Reliability and Fault Tolerance: MOM systems often incorporate mechanisms for
ensuring message delivery reliability and fault tolerance. They may include features
such as message persistence, acknowledgment mechanisms, and failover capabilities
to handle system failures gracefully.
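To make the decoupling idea concrete, here is a minimal, broker-free sketch in Java: an
in-memory BlockingQueue stands in for the message queue a real MOM product would manage,
and the producer and consumer run as independent threads that never interact directly.
The class and message names are illustrative only.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class MomSketch {
    public static void main(String[] args) throws InterruptedException {
        // The queue stands in for a broker-managed message queue.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(100);

        // Producer thread: sends messages without waiting for the consumer.
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 5; i++) {
                    queue.put("order-" + i);          // enqueue and continue
                    System.out.println("sent order-" + i);
                }
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        // Consumer thread: picks messages up whenever it is ready.
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 1; i <= 5; i++) {
                    String msg = queue.take();        // blocks until a message arrives
                    System.out.println("processed " + msg);
                }
            } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}

In a real deployment the queue lives inside a broker such as RabbitMQ or Kafka, so the
producer and consumer can run on different machines entirely.
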
Examples of Message-Oriented Middleware include:
 Apache Kafka
 RabbitMQ
 IBM MQ (formerly known as WebSphere MQ)
 ActiveMQ
 ZeroMQ
Advantages of Message-Oriented Middleware (MOM):
 Asynchronous Communication
 Reliability
 Decoupling
 Message Transformation
 Scalability

Disadvantages of Message-Oriented Middleware (MOM):


Complexity: Implementing and managing MOM systems can be complex, especially
in scenarios involving high message volumes or complex routing requirements.
Latency: Asynchronous communication in MOM may introduce latency, as messages
are queued and processed at different times by producers and consumers.
Message Ordering: While MOM supports asynchronous communication, ensuring
strict message ordering can be challenging, particularly in distributed environments
with multiple message brokers or queues.
Overhead: Message queuing and processing overheads can impact system
performance, especially in scenarios where low-latency communication is essential.

Content-Centric Middleware:
Content-centric middleware focuses on managing and manipulating data or content
across distributed systems. Unlike MOM, which primarily deals with message
passing, content-centric middleware emphasizes operations related to data
dissemination, storage, and retrieval. Here's a closer look at its functionalities:
Content Discovery: Content-centric middleware enables applications to discover and
access content efficiently across distributed environments. It provides mechanisms for
indexing, cataloging, and searching content repositories, allowing users to locate
relevant data quickly.
Caching: Content-centric middleware often includes caching mechanisms to improve
data access performance and reduce network latency. Cached copies of frequently
accessed content are stored closer to the consumers, minimizing the need to retrieve
data from distant sources repeatedly.
Replication: To enhance data availability and resilience, content-centric middleware
supports content replication across distributed nodes or servers. Replication ensures
that multiple copies of data are available across the network, reducing the risk of data
loss or unavailability due to system failures.
Synchronization: Content-centric middleware facilitates data synchronization
between distributed replicas to ensure consistency and coherence. It manages
synchronization processes to update copies of content with the latest changes,
maintaining data integrity across the system.
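As an illustration of the replication idea, the following toy Java sketch copies every
write to a set of in-memory "nodes" so a read can still be served if one copy is missing.
The Node class and key names are hypothetical; real content-centric middleware adds
consistency protocols, partial replication, and synchronization on top of this basic pattern.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplicationSketch {
    // A "node" is just an in-memory map standing in for a remote server.
    static class Node {
        final Map<String, String> store = new HashMap<>();
    }

    private final List<Node> replicas;

    ReplicationSketch(List<Node> replicas) { this.replicas = replicas; }

    // Write the same key/value to every replica (full replication).
    void put(String key, String value) {
        for (Node n : replicas) {
            n.store.put(key, value);
        }
    }

    // Read from the first replica that has the key.
    String get(String key) {
        for (Node n : replicas) {
            String v = n.store.get(key);
            if (v != null) return v;
        }
        return null;
    }

    public static void main(String[] args) {
        ReplicationSketch mw = new ReplicationSketch(
                List.of(new Node(), new Node(), new Node()));
        mw.put("article-42", "cached article body");
        System.out.println(mw.get("article-42"));
    }
}
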
Examples of Content-Centric Middleware include:
1. Apache CouchDB
2. Riak
3. Amazon DynamoDB
4. Cassandra
5. Redis

Functionalities of Middleware –
1. Data Transformation and Formatting
Middleware often includes features for transforming and formatting data to ensure that
it conforms to the requirements of the receiving application or system. For instance,
consider a middleware component that receives data in JSON format from a web
application and needs to transform it into XML format before sending it to a legacy
system that only accepts XML. The middleware can perform this transformation
seamlessly, ensuring compatibility between the two systems.
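A minimal sketch of such a transformation step is shown below. To keep it self-contained,
a Java Map stands in for a parsed JSON object and the XML is built by hand; a production
middleware would use proper JSON and XML libraries and handle nesting, escaping, and schemas.

import java.util.LinkedHashMap;
import java.util.Map;

public class JsonToXmlSketch {

    // Emit a flat XML document from key/value fields for the legacy system.
    static String toXml(String rootName, Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<" + rootName + ">");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            xml.append("<").append(e.getKey()).append(">")
               .append(e.getValue())
               .append("</").append(e.getKey()).append(">");
        }
        return xml.append("</").append(rootName).append(">").toString();
    }

    public static void main(String[] args) {
        Map<String, String> order = new LinkedHashMap<>();
        order.put("id", "1001");
        order.put("customer", "Asha");
        // Prints: <order><id>1001</id><customer>Asha</customer></order>
        System.out.println(toXml("order", order));
    }
}
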

2. Security and Authentication


Middleware provides security mechanisms to safeguard data and communication
channels. It often includes features such as encryption, authentication, and
authorization to ensure that only authorized entities can access and manipulate the
shared resources.
For example, in an enterprise system where sensitive customer information is
exchanged between different departments, middleware can enforce authentication and
encryption protocols to ensure that only authorized users can access the data, thereby
maintaining confidentiality and integrity.
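As a small illustration of message-level authentication, the sketch below signs a payload
with an HMAC using a shared secret (the secret and payload here are placeholders). The
receiver recomputes the signature with the same secret and rejects the message if it does
not match; real middleware combines this with encryption and proper key management.

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class AuthSketch {
    public static void main(String[] args) throws Exception {
        String sharedSecret = "demo-secret-key";      // placeholder, for illustration only
        String payload = "customerId=42;balance=1000";

        // Sign the payload so the receiver can verify sender and integrity.
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(sharedSecret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
        String signature = Base64.getEncoder()
                .encodeToString(mac.doFinal(payload.getBytes(StandardCharsets.UTF_8)));

        // The signature travels with the message; the receiver recomputes and compares it.
        System.out.println(payload + " | sig=" + signature);
    }
}
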
3. Load Balancing
In distributed systems, middleware may offer load balancing capabilities to evenly
distribute workloads across multiple servers or components. This helps optimize
resource utilization and improves system performance. For instance, a middleware
load balancer can monitor the workload on each server and dynamically route requests
to the least busy server, optimizing resource utilization and improving overall system
performance.
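The sketch below illustrates the "route to the least busy server" idea in Java. The Server
class and request counting are simplified stand-ins; real middleware load balancers also
track health checks, response times, and weights.

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class LoadBalancerSketch {
    static class Server {
        final String name;
        final AtomicInteger activeRequests = new AtomicInteger();
        Server(String name) { this.name = name; }
    }

    private final List<Server> servers;
    LoadBalancerSketch(List<Server> servers) { this.servers = servers; }

    // Pick the server with the fewest active requests.
    Server route() {
        Server best = servers.get(0);
        for (Server s : servers) {
            if (s.activeRequests.get() < best.activeRequests.get()) best = s;
        }
        best.activeRequests.incrementAndGet();   // count the request we just routed
        return best;
    }

    public static void main(String[] args) {
        LoadBalancerSketch lb = new LoadBalancerSketch(
                List.of(new Server("app-1"), new Server("app-2")));
        for (int i = 0; i < 4; i++) {
            System.out.println("request " + i + " -> " + lb.route().name);
        }
    }
}
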
4. Event Handling and Notification
Middleware often supports event-driven architectures by providing mechanisms for
handling events and notifying interested parties about changes or specific occurrences.
This is important for real-time systems and applications.
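A minimal notification mechanism can be sketched as a list of callbacks: interested parties
subscribe, and the middleware invokes every subscriber when an event is published. The event
names below are illustrative only.

import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class EventBusSketch {
    private final List<Consumer<String>> listeners = new ArrayList<>();

    // Register a callback that wants to be notified about events.
    void subscribe(Consumer<String> listener) { listeners.add(listener); }

    // Deliver the event to every registered subscriber.
    void publish(String event) {
        for (Consumer<String> l : listeners) {
            l.accept(event);
        }
    }

    public static void main(String[] args) {
        EventBusSketch bus = new EventBusSketch();
        bus.subscribe(e -> System.out.println("billing saw: " + e));
        bus.subscribe(e -> System.out.println("audit saw:   " + e));
        bus.publish("order-created");
    }
}
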
5. Caching and Data Replication
Middleware can incorporate caching mechanisms to store frequently accessed data
locally, reducing latency and improving performance. For instance, in a content
delivery network (CDN), middleware caches frequently requested web content closer
to end-users, speeding up page load times. Additionally, middleware may support data
replication to ensure fault tolerance and high availability. For example, in a distributed
database system, middleware can replicate data across multiple nodes to prevent data
loss in case of hardware failures.
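The caching idea can be sketched with a small LRU cache built on Java's LinkedHashMap:
frequently accessed entries stay in memory and the least recently used entry is evicted
when the cache is full, roughly what a CDN edge node does for web content. The capacity
and URLs are illustrative.

import java.util.LinkedHashMap;
import java.util.Map;

public class CacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    CacheSketch(int capacity) {
        super(16, 0.75f, true);                   // access-order gives LRU behaviour
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;                 // evict when over capacity
    }

    public static void main(String[] args) {
        CacheSketch<String, String> cache = new CacheSketch<>(2);
        cache.put("/index.html", "<html>home</html>");
        cache.put("/about.html", "<html>about</html>");
        cache.get("/index.html");                 // touch: now most recently used
        cache.put("/contact.html", "<html>contact</html>"); // evicts /about.html
        System.out.println(cache.keySet());
    }
}
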
6. Monitoring and Logging
Middleware often includes monitoring and logging functionalities to track system
performance, detect anomalies, and assist debugging. For example, in a cloud-based
application deployment, middleware can log metrics such as CPU usage, memory
consumption, and network traffic, allowing administrators to monitor system health
and identify potential issues proactively. Additionally, middleware can generate logs
for debugging purposes, helping developers diagnose and troubleshoot errors in
distributed systems.
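As a tiny illustration, the sketch below uses java.util.logging to record a heap-usage
metric and raise a warning above an assumed threshold; real middleware would export such
metrics to dedicated monitoring systems rather than plain logs.

import java.util.logging.Level;
import java.util.logging.Logger;

public class MonitoringSketch {
    private static final Logger LOG = Logger.getLogger("middleware.health");

    public static void main(String[] args) {
        // Sample a simple health metric: heap memory currently in use.
        Runtime rt = Runtime.getRuntime();
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);

        LOG.info("heap used = " + usedMb + " MB");
        if (usedMb > 512) {                       // assumed threshold for the example
            LOG.log(Level.WARNING, "memory usage above threshold");
        }
    }
}
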

What is CORBA ?
https://datascientest.com/en/corba-infrastructure-definition-and-benefits

What is RMI ?
RMI stands for Remote Method Invocation. It is a mechanism that allows an object
residing in one system (JVM) to access/invoke an object running on another JVM.
RMI is used to build distributed applications; it provides remote communication
between Java programs. It is provided in the package java.rmi.
Goals of RMI
Following are the goals of RMI −
 To minimize the complexity of the application.
 To preserve type safety.
 To support distributed garbage collection.
 To minimize the difference between working with local and remote objects.
Architecture of an RMI Application –
In an RMI application, we write two programs, a server program (resides on the
server) and a client program (resides on the client).
 Inside the server program, a remote object is created and a reference to that
object is made available to the client (using the registry).
 The client program requests the remote objects on the server and tries to invoke
its methods.

The components of this architecture –


Transport Layer − This layer connects the client and the server. It manages the
existing connection and also sets up new connections.
Stub − A stub is a representation (proxy) of the remote object at the client. It resides
in the client system and acts as a gateway for the client program.
Skeleton − This is the object which resides on the server side. The stub communicates
with this skeleton to pass requests to the remote object.
RRL(Remote Reference Layer) − It is the layer that manages the references made by
the client to the remote object.
Working of an RMI Application -
 When the client makes a call to the remote object, it is received by the stub
which eventually passes this request to the RRL.
 When the client-side RRL receives the request, it invokes a method
called invoke() of the object remoteRef. It passes the request to the RRL on the
server side.
 The RRL on the server side passes the request to the Skeleton (proxy on the
server), which finally invokes the required method on the remote object.
 The result is passed all the way back to the client; a minimal code sketch of
this flow is shown below.
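The following minimal Java RMI sketch shows the pieces described above: a remote interface,
a server that exports the object and binds it in the registry, and a client that looks the
stub up and invokes it like a local object. The service name "Hello", the host "localhost",
and port 1099 are assumptions for the example.

import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Remote interface: every remote method must declare RemoteException.
interface Hello extends Remote {
    String sayHello(String name) throws RemoteException;
}

// Server side: implement the interface, export the object, bind it in the registry.
class HelloServer implements Hello {
    public String sayHello(String name) { return "Hello, " + name; }

    public static void main(String[] args) throws Exception {
        Hello stub = (Hello) UnicastRemoteObject.exportObject(new HelloServer(), 0);
        Registry registry = LocateRegistry.createRegistry(1099);   // default RMI port
        registry.rebind("Hello", stub);                            // make it visible to clients
        System.out.println("HelloServer ready");
    }
}

// Client side: look the stub up in the registry and call it like a local object.
class HelloClient {
    public static void main(String[] args) throws Exception {
        Registry registry = LocateRegistry.getRegistry("localhost", 1099);
        Hello hello = (Hello) registry.lookup("Hello");
        System.out.println(hello.sayHello("RMI"));
    }
}
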

DCE –
Distributed Computing Environment(DCE) is an integrated set of services and tools
which are used for building and running Distributed Applications. It is a collection of
integrated software components/frameworks that can be installed as a coherent
environment on top of the existing Operating System and serve as a platform for
building and running Distributed Applications.
Components of DCE –
Remote Procedure Call (RPC): It is a call made when a computer program wants to
execute a subroutine on a different computer (another computer on a shared network).
Distributed File System(DFS): DCE includes a distributed file system that allows
clients to access files and directories on remote servers as if they were local.
Directory Service: It is used to keep track of the location of virtual resources in the
distributed system. These resources include files, printers, servers, scanners, and
other machines. Processes request resources through this service and remain unaware of
their actual location.
Security Service: It allows a process to verify user authenticity. Only authorized users,
and only authorized computers on the network of distributed systems, can access protected
and secured resources.
Distributed Time Service: Inter-process communication between different system
components requires synchronization so that communication takes place in a
designated order only. This service is responsible for maintaining a global clock and
hence synchronizing the local clocks with a common notion of time.
Thread Service: The Thread Service provides the implementation of lightweight
processes (threads). It also helps in the synchronization of multiple threads within a
shared address space.
Advantages of DCE:
 Security
 Lower Maintenance Cost
 Scalability and Availability
 Reduced Risks

Middleware Issues-
Complexity: Middleware systems can introduce complexity to the software
architecture, especially in large-scale distributed environments. Managing middleware
components, configurations, and interactions can be challenging, requiring specialized
knowledge and expertise.
Performance Overheads: Middleware introduces additional layers of abstraction and
processing overhead, which can impact system performance. Message queuing, data
transformation, and protocol mediation can introduce latency and reduce throughput,
especially in high-volume environments.
Scalability Limitations: Some middleware solutions may have scalability limitations,
particularly when scaling horizontally across distributed nodes or servers. Bottlenecks
in message processing, resource contention, or communication overheads can hinder
the ability to scale the system effectively.
Maintenance and Upgrades: Middleware systems require ongoing maintenance,
updates, and patches to address security vulnerabilities, performance issues, and
compatibility issues with evolving technologies. Managing these maintenance tasks
across distributed deployments can be time-consuming and resource-intensive.
Cost: Deploying and maintaining middleware solutions can incur significant costs,
including licensing fees, infrastructure expenses, and operational overheads.
Organizations need to carefully evaluate the return on investment (ROI) and total cost
of ownership (TCO) of middleware solutions to justify their adoption.
Learning Curve: Mastering middleware technologies and best practices may require
a steep learning curve for developers, administrators, and IT personnel. Training and
skill development programs may be necessary to ensure effective utilization and
management of middleware resources.

Apache Kafka –
Apache Kafka is an open-source streaming data platform originally developed by
LinkedIn. As Kafka’s capabilities expanded, LinkedIn donated it to the Apache Software
Foundation for further development.

Kafka operates like a traditional pub-sub message queue, such as RabbitMQ, in that it
enables you to publish and subscribe to streams of messages. But it differs from
traditional message queues in 3 key ways:
1. Kafka operates as a modern distributed system that runs as a cluster and can
scale to handle any number of applications.
2. Kafka is designed to serve as a storage system and can store data as long as
necessary; most message queues remove messages immediately after the
consumer confirms receipt.
3. Kafka handles stream processing, computing derived streams and datasets
dynamically, rather than just passing batches of messages.
Core Concept of Kafka –
Broadly, Kafka accepts streams of events written by data producers. Kafka stores
records chronologically in partitions across brokers (servers); multiple brokers
comprise a cluster. Each record contains information about an event and consists of a
key-value pair; a timestamp and headers are optional additional information. Kafka
groups records into topics; data consumers get their data by subscribing to the topics
they want.
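As a small illustration of the producer side, the sketch below uses the Java kafka-clients
API to send one key-value record to a topic. The broker address localhost:9092 and the
topic name "events" are placeholders; a matching consumer would subscribe to the same
topic to receive the record.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KafkaProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Each record is a key/value pair appended to a partition of the topic.
            producer.send(new ProducerRecord<>("events", "user-42", "page_view"));
        }
    }
}
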
Kafka Use Cases –
 Real-time Data Ingestion
 Log Aggregation
 Event Sourcing
 Real-time Analytics
 Stream Processing
 Machine Learning Pipelines
 Commit Log

Adaptive software -
Adaptive software refers to software systems or applications that can modify their behavior,
structure, or configuration dynamically in response to changes in the environment, user
preferences, or system requirements. Adaptive software aims to improve flexibility,
adaptability, and responsiveness by adjusting its behavior autonomously without manual
intervention.

The concepts of Separation of Concerns (SoC) and Computational Reflection play a
crucial role in the design and implementation of adaptive software:
Separation of Concerns (SoC):
Separation of Concerns is a software engineering principle that advocates for dividing
a software system into distinct modules or components, each responsible for handling
a specific concern or aspect of the system. By separating concerns, developers can
manage complexity, promote modularity, and facilitate maintenance and evolution of
the software.
Computational Reflection:
Computational Reflection is a programming paradigm that enables a software system
to observe, introspect, and modify its own structure, behavior, or state at runtime.
Reflection mechanisms allow software components to inspect and manipulate
themselves, other components, or the execution environment dynamically.
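A minimal Java sketch of computational reflection is shown below: the program inspects one
of its own classes at runtime, lists its methods, and invokes one by name without
compile-time binding. The Worker class is purely illustrative; adaptive systems use the
same mechanism to swap or reconfigure components while running.

import java.lang.reflect.Method;

public class ReflectionSketch {
    public static class Worker {
        public String process(String input) { return "processed: " + input; }
    }

    public static void main(String[] args) throws Exception {
        Class<?> cls = Worker.class;

        // Observe: list the methods the class declares.
        for (Method m : cls.getDeclaredMethods()) {
            System.out.println("found method: " + m.getName());
        }

        // Act: pick a method by name at runtime and invoke it dynamically.
        Object worker = cls.getDeclaredConstructor().newInstance();
        Method process = cls.getMethod("process", String.class);
        System.out.println(process.invoke(worker, "request-1"));
    }
}
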
Example –
Dynamic Resource Allocation in Cloud Computing:
Separation of Concerns: In cloud computing platforms, dynamic resource allocation
is a crucial aspect of adaptive software. Cloud providers separate concerns by
decoupling resource allocation mechanisms from application logic. They provide
separate modules or services for resource provisioning, monitoring, and management,
allowing users to adjust resource allocations independently of their applications' core
functionality.
Computational Reflection: Cloud platforms use computational reflection to monitor
resource usage, analyze performance metrics, and make runtime decisions on resource
allocation. Reflection mechanisms enable the cloud infrastructure to introspect its own
state, detect resource bottlenecks or underutilization, and dynamically adjust resource
allocations to optimize performance, scalability, and cost-efficiency.
