DS Unit-1


DISTRIBUTED SYSTEMS

UNIT-1

CHARACTERIZATION OF DISTRIBUTED SYSTEMS:

What is a Distributed System?


A distributed system is a collection of autonomous computers that are physically separated but connected by a
computer network and equipped with distributed system software. The autonomous computers communicate with one
another by passing messages, share resources and files, and cooperate to perform the tasks assigned to
them.

Types of Distributed Systems:


There are many models and architectures of distributed systems in use today.
 Client-server systems, the most traditional and simple type of distributed system, involve a multitude of
networked computers that interact with a central server for data storage, processing or other common goal.
 Peer-to-peer networks distribute workloads among hundreds or thousands of computers all running the
same software.
 Cell phone networks are an advanced distributed system, sharing workloads among handsets, switching
systems and internet-based devices.
The most common forms of distributed systems today operate over the internet, handing off workloads to
dozens of cloud-based virtual server instances that are created as needed, and then terminated when the task is
complete.
Examples or applications of Distributed System:

1. The Internet: The most ubiquitous example of a distributed system, connecting billions of devices
worldwide through a decentralized network of servers and routers.

2. Cloud Computing Platforms: Services like Amazon Web Services (AWS), Google Cloud Platform (GCP), and
Microsoft Azure provide distributed computing resources over the internet, allowing users to deploy and
manage applications across a network of servers.
3. Peer-to-Peer (P2P) Networks: File-sharing networks like BitTorrent and blockchain networks like Bitcoin
operate as distributed systems, where nodes (or peers) share resources and information directly with each
other without a centralized authority.

4. Distributed Databases: Systems like Apache Cassandra, MongoDB, and Riak distribute data across multiple
nodes to ensure scalability, fault tolerance, and high availability.

5. Content Delivery Networks (CDNs): Services like Akamai and Cloudflare distribute web content across
geographically dispersed servers to minimize latency and improve website performance.

6. Distributed File Systems: Examples include Google File System (GFS), Hadoop Distributed File System (HDFS),
and Dropbox, which store and manage large volumes of data across multiple servers.

7. Distributed Messaging Systems: Platforms like Apache Kafka and RabbitMQ enable asynchronous
communication and real-time data streaming between distributed applications.

8. Distributed Computing Frameworks: Systems like Apache Spark and Apache Hadoop allow for the
distributed processing of large datasets across clusters of computers.

9. Sensor Networks: Distributed systems of interconnected sensors collect and transmit data from the physical
environment, commonly used in applications such as environmental monitoring, smart grids, and industrial
automation.

10. Social Networks: Platforms like Facebook, Twitter, and LinkedIn are distributed systems that handle millions
of users and interactions across multiple servers and data centers.

Characteristics of Distributed System:

1. Scalability: One of the primary characteristics of a distributed system is its ability to scale. This means that as the
demand for resources or services increases, the system can handle the additional load by adding more machines or
nodes to the network. Scalability allows distributed systems to accommodate growing user bases and handle larger
workloads without sacrificing performance.

2. Fault Tolerance: Distributed systems are designed to be fault-tolerant, meaning they can continue functioning
even if individual components or nodes fail. By distributing data and processing across multiple machines, a
distributed system can withstand failures and ensure uninterrupted operation. Techniques like redundancy,
replication, and error detection and recovery mechanisms help achieve fault tolerance.

3. Concurrency: Concurrency refers to the ability of a distributed system to handle multiple tasks or requests
simultaneously. In a distributed system, different nodes can work on different tasks concurrently, improving overall
efficiency and reducing response times. However, managing concurrency can be complex, and techniques like
synchronization, locking, and distributed algorithms are used to ensure data consistency and avoid conflicts.

4. Transparency: Transparency in a distributed system refers to the idea that users or applications should not be
aware of the underlying complexities of the system. There are different types of transparency, such as location
transparency (users are unaware of where resources are physically located), access transparency (users can access
resources without knowing their specific details), and failure transparency (users are shielded from the effects of
failures). Transparency simplifies the development and use of distributed systems.

5. Heterogeneity: Distributed systems often consist of different types of hardware, software, and network
components, making them heterogeneous. This heterogeneity poses challenges in terms of interoperability and
communication between different components. Standardized protocols, middleware, and APIs are used to ensure
seamless integration and communication across heterogeneous components.

6. Security: Security is a crucial aspect of distributed systems, as they often involve the exchange of sensitive data
over networks. Distributed systems employ various security measures, such as encryption, authentication, access
control, and secure communication protocols, to protect data and ensure the privacy and integrity of information.

7. Load Balancing: Load balancing is the process of distributing workloads evenly across multiple nodes in a
distributed system. It helps optimize resource utilization, improve performance, and prevent bottlenecks. Load
balancing techniques include dynamic resource allocation, task
8. Resource Sharing: The ability to use any hardware, software, or data anywhere in the system.

9. Openness: Concerns how readily the system can be extended and improved (i.e., how openly the software is
developed and shared with others).

Advantages of Distributed Systems:


 Better Performance: By using the resources of numerous computers to tackle the workload, distributed
systems can perform at higher levels than centralized systems.
 Cost Effectiveness: Although distributed systems have high implementation costs, they are relatively
cost-effective in the long run. A mainframe is a single system composed of several processors, whereas a
distributed system is made up of several separate computers working together. This type of
infrastructure is generally more cost-effective than a mainframe system.
 Efficiency: Distributed systems are made to be efficient in every aspect since they possess multiple
computers. Each of these computers could work independently to solve problems. This is not only considered
to be efficient, but also it significantly saves time for the user.
 Scalability: Distributed systems are scalable by design. Whenever the workload increases, more machines
can be added; there is no need to upgrade a single system. Moreover, there is no fixed limit on the number of
machines, so the system can be grown to handle high-demand workloads.
 Reliability: Distributed systems are far more reliable than single systems in terms of failures. Even in the
case of a single node malfunctioning, it does not pose problems to the remaining servers. Other nodes can
continue to function fine.
 Geographic Distribution: Geographic distribution is a feature of distributed systems that enables them to
offer services to users in various areas.
 Reduced Cost: Because distributed systems can make use of existing resources rather than needing to buy
new gear, they can be less expensive than centralized systems.
 Flexibility: Distributed systems are adaptable and can be tailored to fit a variety of needs, making them
suitable for a wide range of applications.
 Fault Tolerance: The ability to continue operating even when one or more nodes fail is known as fault
tolerance, and distributed systems can be built to be fault-tolerant.
 Reduced Latency: Distributed systems can lower latency. If a node is located close to the user, the
system can route that user's requests to it, so the user waits noticeably less for a response.
 Security: Data breaches and illegal access can be prevented by including security safeguards in distributed
systems.
 Innovation: Data analytics, machine learning, and the Internet of Things (IoT) are just a few of the areas
where distributed systems are fostering innovation.

Disadvantages of Distributed Systems:
 Compatibility: In a distributed system, compatibility across multiple nodes and software systems can be a
problem since they may employ various hardware, software, or protocols.
 Startup Cost: Compared to a single system, the implementation cost of a distributed system is significantly
higher. The infrastructure used in a distributed system makes it expensive. In addition to that, the constant
transmission of information and processing overhead further increases the cost.
 Security: Distributed systems carry security risks because of their open nature. User data is stored on
different workstations, so it must be secured on each of them. Moreover, unlike in a centralized computing
system, managing data access in a distributed system is not an easy task.
 Overheads: Overloading is a common problem faced by a distributed system. It happens when all the
workstations try to operate at once. Although this still produces the desired results, computing time
increases, which ultimately lengthens the system's response time.
 Testing and Debugging: Because of the complexity of the system or the interactions between many nodes,
testing and debugging distributed systems can be difficult.
 Network Dependency: Distributed systems are prone to network errors that cause communication
breakdowns. Messages may fail to be delivered, or may arrive out of sequence. Troubleshooting such
errors is also difficult because the data is distributed across various nodes.
 Consistency: Data consistency can be difficult to ensure across several nodes and may call for the
deployment of intricate algorithms and protocols.
 Complexity: Implementing, maintaining, and troubleshooting a distributed system is difficult, which makes
it a complex approach. Besides the hardware complexity, the software is also difficult to build: it must
handle communication and security carefully.

Applications Area of Distributed System:


 Finance and Commerce: Amazon, eBay, Online Banking, E-Commerce websites.
 Information Society: Search Engines, Wikipedia, Social Networking, Cloud Computing.
 Cloud Technologies: AWS, Salesforce, Microsoft Azure, SAP.
 Entertainment: Online Gaming, Music, YouTube.
 Healthcare: Online patient records, Health Informatics.
 Education: E-learning.
 Transport and logistics: GPS, Google Maps.
 Environment Management: Sensor technologies.

RESOURCE SHARING AND THE WEB:

Resource sharing on the web refers to the ability to access and distribute various types of resources, like documents,
images, videos, and more. It's one of the fundamental aspects that make the web such a powerful tool for
information exchange and collaboration.

To understand resource sharing on the web, we need to explore a few key concepts and technologies. Let's start
with the Hypertext Transfer Protocol (HTTP). This protocol allows users to request and receive resources from web
servers. When you type a URL into your browser or click on a link, your browser sends an HTTP request to the server
hosting the resource. The server then responds with the requested resource, which your browser displays.

Now, let's talk about Uniform Resource Locators (URLs). URLs are the addresses that identify and locate specific
resources on the web. They serve as unique identifiers for web pages, images, videos, and other resources. When
you share a URL with someone, they can use it to access the same resource you're viewing. It's like giving them
directions to a specific location on the web!
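To make the parts of a URL concrete, here is a minimal sketch using Python's standard `urllib.parse` module. The URL is a made-up example chosen for illustration; it does not come from these notes.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL used only for illustration.
url = "https://www.example.com:8443/docs/unit1.html?lang=en&page=2#models"

parts = urlsplit(url)

print(parts.scheme)            # protocol used to reach the resource
print(parts.hostname)          # server that hosts the resource
print(parts.port)              # TCP port on that server
print(parts.path)              # location of the resource on the server
print(parse_qs(parts.query))   # query parameters as a dict of lists
print(parts.fragment)          # position inside the resource
```

The scheme and host tell the browser which protocol to use and which server to contact, while the path and query identify the resource on that server.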

Resource sharing on the web has revolutionized the way we access and interact with information. It has made
knowledge more accessible to people all around the world. Before the web, accessing resources often required
physical copies or visiting specific locations. Now, with just a few clicks, we can access a vast amount of knowledge
from the comfort of our homes.

One of the significant advantages of resource sharing on the web is the ease and efficiency it provides. With the click
of a button, you can share documents, images, and videos with others, regardless of their physical location. This has
greatly facilitated collaboration and communication, enabling teams to work together seamlessly, even if they're
miles apart.

Another benefit of resource sharing on the web is the ability to disseminate information quickly and widely. Content
creators can publish their work online, making it accessible to a global audience. This has opened up new avenues
for artists, writers, musicians, and other creative individuals to showcase their talent and reach a broader audience.

However, as with any technology, there are challenges and concerns associated with resource sharing on the web.
Let's address a few of them.

Copyright infringement is a significant concern. It's important to respect the intellectual
property rights of content creators. When sharing resources, we should always ensure that we have the necessary
permissions or licenses to do so. This helps protect the rights of creators and fosters a fair and ethical online
ecosystem.

Another challenge is the issue of misinformation and fake news. With the ease of sharing information on the web, it
becomes crucial to verify the credibility and accuracy of the resources we encounter. It's always a good idea to cross-
check information from multiple reliable sources before accepting it as factual. By being critical consumers of
information, we can help combat the spread of misinformation.

Privacy and security are also significant concerns when it comes to resource sharing on the web. It's essential to be
mindful of the personal information we share and the platforms we use to share it. Understanding privacy settings,
using secure connections (like HTTPS), and being cautious about the information we provide can help protect our
online identities and data.

Now, let's shift gears a bit and talk about some exciting developments in resource sharing on the web. One notable
advancement is the rise of cloud storage and file-sharing services. These platforms allow users to store and share
large files and collaborate on projects seamlessly. Services like Google Drive, Dropbox, and OneDrive have become
integral parts of many individuals' and businesses' workflows.

Another exciting development is the emergence of open-source software and collaborative platforms. Open-source
software refers to software whose source code is freely available for anyone to view, modify, and distribute. This
fosters collaboration and innovation, as developers from around the world can contribute to the improvement of
these software projects. Platforms like GitHub have become hubs for open-source development, enabling
developers to share code and collaborate on projects.

In conclusion, resource sharing on the web has transformed the way we access and distribute information. It has
made knowledge more accessible, facilitated collaboration, and empowered content creators. However, it also
comes with challenges such as copyright infringement, misinformation, and privacy concerns. By being responsible
and informed users of the web, we can harness the power of resource sharing while mitigating these challenges.

Distributed Systems: Challenges/Failures


There are also multiple challenges of distributed systems that determine the performance of the overall
system.

Heterogeneity

Heterogeneity is one of the challenges of a distributed system that refers to differences in hardware, software,
or network configurations among nodes. This can present challenges for communication and coordination.
Techniques for managing heterogeneity include middleware, virtualization, standardization, and service-
oriented architecture. These approaches can help build robust and scalable systems that accommodate
diverse configurations.

Note: Service-oriented architecture (SOA) is an approach used to create a modular and reusable system with
well-defined functionality.

Scalability

Scalability is one of the challenges in distributed systems. As distributed systems grow in size and
complexity, it becomes increasingly difficult to maintain their performance and availability. The major
challenges are security, maintaining consistency of data in every system, network latency between systems,
resource allocation, and proper load balancing across multiple nodes.

Openness

Openness in distributed systems refers to achieving interoperability between systems that use different
standards, protocols, and data formats. It is crucial to ensure that different systems can communicate and
exchange data seamlessly without the need for extensive manual intervention. It is also important to
maintain the correct amount of transparency and security in such systems.

Transparency

Transparency refers to the level of abstraction present in the system to hide complex information from the
user. It is essential to ensure that failures are transparent to users and do not affect the overall system's
performance. Systems with different hardware and software configurations prove to be a challenge for
transparency. Maintaining security while preserving transparency is also a concern in distributed systems.

Concurrency

Concurrency is the ability to process data in parallel on different nodes of the system. One of the primary
challenges of concurrency in distributed systems is the issue of race conditions. Problems like
communication and synchronization between nodes also pose a challenge. When a node fails, the fault

Security

The distributed and heterogeneous nature of the distributed system makes security a major challenge for data
processing systems. The system must ensure confidentiality from unauthorized access as data is transmitted
across multiple nodes. Various methods like Digital signatures, Checksums, and Hash functions should be
used to verify the integrity of data as data is being modified by multiple systems. Authentication
mechanisms are also challenging as users and processes may be located on different nodes.

Failure Handling

One of the primary challenges of failure handling in distributed systems is identifying and diagnosing
failures as failure can occur at any node. Logging mechanisms should be implemented to identify the failed
nodes. Techniques like redundancy, replication, and checkpoints should be used to ensure the continuous
working of the system in case of a node failure. Data recovery should be implemented with techniques like
Rollback to recover data in the event of a failure.
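The checkpoint and rollback idea can be sketched in a few lines. This is only an in-memory illustration under simple assumptions (a single node, state small enough to copy); the class name `CheckpointedStore` and the keys are hypothetical, and a real system would write checkpoints to stable storage.

```python
import copy


class CheckpointedStore:
    """Key-value state that can be checkpointed and rolled back."""

    def __init__(self):
        self.state = {}
        self._checkpoint = {}

    def checkpoint(self):
        # Save a deep copy so later updates cannot alter the snapshot.
        self._checkpoint = copy.deepcopy(self.state)

    def rollback(self):
        # Discard everything written since the last checkpoint.
        self.state = copy.deepcopy(self._checkpoint)


store = CheckpointedStore()
store.state["balance"] = 100
store.checkpoint()

store.state["balance"] = 40      # partial update
store.state["pending"] = True    # assume a failure is detected here

store.rollback()
print(store.state)
```

After the rollback the store holds only the state captured at the checkpoint, so the half-finished update is gone.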

Communication and Coordination:

In a distributed system, components need to communicate and coordinate with each other to accomplish tasks.
However, ensuring reliable and efficient communication can be challenging due to factors like network
latency, message loss, and synchronization issues.
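One common way to cope with message loss is to resend until an acknowledgement arrives, waiting a little longer after each failure. The sketch below assumes a hypothetical `FlakyChannel` that drops the first few messages; it stands in for a real network and is not part of any library.

```python
import time


class FlakyChannel:
    """Hypothetical channel that drops the first `drops` messages."""

    def __init__(self, drops):
        self.drops = drops
        self.delivered = []

    def send(self, message):
        if self.drops > 0:
            self.drops -= 1
            raise TimeoutError("no acknowledgement received")
        self.delivered.append(message)
        return "ack"


def send_with_retry(channel, message, max_attempts=5, base_delay=0.01):
    """Resend until acknowledged, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return channel.send(message)
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            time.sleep(base_delay * 2 ** attempt)


channel = FlakyChannel(drops=2)
result = send_with_retry(channel, "update x=1")
print(result, channel.delivered)
```

Retrying is only safe if the receiver can tolerate duplicates, because a lost acknowledgement looks the same to the sender as a lost message.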

Consistency and Replication:

When data is replicated across multiple nodes in a distributed system, ensuring consistency becomes crucial.
Maintaining consistency in the face of concurrent updates and network failures can be complex. Different
consistency models, such as strong consistency or eventual consistency, are used to address this challenge.
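To illustrate eventual consistency, the sketch below uses a last-writer-wins rule with explicit version numbers. It is one simple way to reconcile replicas, chosen here for brevity; the `Replica` class and the version numbers are assumptions made for the example.

```python
class Replica:
    """One copy of a single value, tagged with a version number."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.version = 0

    def write(self, value, version):
        # Accept the write only if it is newer than what is stored.
        if version > self.version:
            self.value, self.version = value, version

    def sync_from(self, other):
        # Anti-entropy step: adopt the other replica's newer value, if any.
        self.write(other.value, other.version)


a, b, c = Replica("a"), Replica("b"), Replica("c")

a.write("v1", version=1)   # this update reaches only replica a
b.write("v2", version=2)   # a later update reaches only replica b

before = [r.value for r in (a, b, c)]
print(before)              # replicas disagree for now

# Each replica pulls from every other replica once.
for left in (a, b, c):
    for right in (a, b, c):
        left.sync_from(right)

after = [r.value for r in (a, b, c)]
print(after)               # all replicas now hold the newest value
```

Between the writes and the synchronisation a reader may see stale data, which is exactly what eventual consistency permits and strong consistency forbids.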

Scalability:

Distributed systems should be able to handle increasing workloads and accommodate a growing number of
users or data. Scaling a distributed system involves distributing the load across multiple nodes and ensuring
that performance remains acceptable as the system expands.
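One way to spread data across nodes is hash partitioning: each key is hashed and the hash decides which node stores it. The sketch below is a minimal illustration with hypothetical node names and keys; it uses a simple modulo scheme, not any particular database's placement algorithm.

```python
import hashlib
from collections import Counter


def node_for(key, nodes):
    """Map a key to a node using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["node-0", "node-1", "node-2"]        # hypothetical node names
keys = [f"user-{i}" for i in range(3000)]     # hypothetical keys

placement = Counter(node_for(k, nodes) for k in keys)
print(placement)
```

With modulo placement, adding or removing a node changes the target of most keys; consistent hashing is a common refinement that limits how many keys move.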

Load Balancing:
Efficiently distributing the workload across nodes in a distributed system is essential for optimal
performance. Load balancing algorithms and techniques are employed to ensure that no single node
becomes overwhelmed while others remain underutilized.
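Two simple load balancing policies can be sketched as follows: round robin, which rotates through the nodes in order, and least loaded, which picks the node with the fewest active requests. The class names and node names are hypothetical, and the sketch ignores node failures and weights.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out nodes in a fixed rotating order."""

    def __init__(self, nodes):
        self._nodes = cycle(nodes)

    def pick(self):
        return next(self._nodes)


class LeastLoadedBalancer:
    """Send each request to the node with the fewest active requests."""

    def __init__(self, nodes):
        self.active = {node: 0 for node in nodes}

    def pick(self):
        node = min(self.active, key=self.active.get)
        self.active[node] += 1
        return node

    def finish(self, node):
        # Called when a request completes on the node.
        self.active[node] -= 1


rr = RoundRobinBalancer(["n1", "n2", "n3"])
rr_order = [rr.pick() for _ in range(5)]
print(rr_order)

ll = LeastLoadedBalancer(["n1", "n2", "n3"])
first, second = ll.pick(), ll.pick()
ll.finish(first)            # the first node becomes idle again
third = ll.pick()
print(first, second, third)
```

Round robin needs no knowledge of node state, while the least-loaded policy adapts when some requests take much longer than others.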
Monitoring and Debugging:

Diagnosing and resolving issues in a distributed system can be complex due to the distributed nature of the
components. Effective monitoring and debugging tools are required to identify and address performance
bottlenecks, failures, and other issues.

Consensus and Distributed Algorithms:

Achieving consensus among distributed nodes is a fundamental challenge in distributed systems. Distributed
algorithms, such as the Paxos algorithm or the Raft consensus algorithm, are used to coordinate actions and
ensure agreement among nodes.
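Quorum-based protocols such as Paxos and Raft rely on majorities: a decision stands only if more than half of the nodes accept it. The sketch below shows just that counting rule and how many failures it tolerates. It is not an implementation of either algorithm, and the function names are made up for the example.

```python
def has_majority(votes_granted, cluster_size):
    """True if strictly more than half of the cluster agreed."""
    return votes_granted > cluster_size // 2


def tolerated_failures(cluster_size):
    """Nodes that may fail while a majority can still be formed."""
    return (cluster_size - 1) // 2


for n in (3, 4, 5):
    survivors = n - tolerated_failures(n)
    print(n, tolerated_failures(n), has_majority(survivors, n))
```

Because any two majorities of the same cluster share at least one node, two conflicting decisions cannot both be accepted. Note that a cluster of 4 tolerates no more failures than a cluster of 3, which is why odd cluster sizes are usual.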

These are just a few of the challenges that arise when designing, building, and maintaining distributed
systems. Each challenge requires careful consideration and often involves trade-offs to strike a balance
between performance, reliability, and scalability.

SYSTEM MODELS:

Types of System Models

Physical Model

A physical model is a representation of the underlying hardware elements of a distributed
system. It encompasses the hardware composition of a distributed system in terms of computers and
other devices and their interconnections. It is primarily used to design, manage, implement and
determine the performance of a distributed system. A physical model mainly consists of the following
components:
 Nodes – Nodes are the end devices that can process data, execute tasks and
communicate with other nodes. They are generally the computers at the user end,
but can also be servers, workstations etc. User-facing nodes provide the interface through which
the user interacts with back-end nodes, which can
be used for storage and database services, processing, web browsing etc. Each node has an
operating system, an execution environment and middleware that facilitate
communication and other vital tasks.
 Links – Links are the communication channels between different nodes and intermediate devices.
These may be wired or wireless. Wired links, or physical media, are implemented using copper wires,
fibre optic cables etc. The choice of medium depends on the environmental conditions and the
requirements. Generally, wired links are preferred for high-performance and real-time computing.
The connection types that can be implemented are as follows:
 Point-to-point links – It establishes a connection and allows data transfer between only
two nodes.
 Broadcast links – It enables a single node to transmit data to multiple nodes
simultaneously.
 Multi-Access links – Multiple nodes share the same communication channel to transfer
data. Requires protocols to avoid interference while transmission.
 Middleware – This is the software installed and executed on the nodes. By running
middleware on each node, the distributed computing system achieves decentralised control and
decision-making. It handles various tasks like communication with other nodes, resource
management, fault tolerance, synchronisation of different nodes and security to prevent malicious
and unauthorised access.
 Network Topology – This defines the arrangement of nodes and links in the distributed
computing system. The most common network topologies that are implemented are bus, star, mesh,
ring or hybrid. Choice of topology is done by determining the exact use cases and the requirements.
 Communication Protocols – Communication protocols are the sets of rules and procedures for
transmitting data over the links. Examples of these protocols include TCP, UDP, HTTPS, MQTT etc.
These allow the nodes to communicate and interpret the data.

Architectural Model

The architectural model of a distributed computing system is the overall design and structure of the system,
and how its different components are organised to interact with each other and provide the desired
functionalities. It gives an overview of how development, deployment and operations
will take place. A good architectural model is needed for cost efficiency and for good
scalability of the applications. The key aspects of the architectural model are –
 Client-Server model – It is a centralised approach in which the clients initiate requests for
services and servers respond by providing those services. It mainly works on the request-response
model, where the client sends a request to the server and the server processes it and responds to the
client accordingly. It is commonly implemented with HTTP at the application layer, running over TCP/IP. This is
mainly used in web services, cloud computing, database management systems etc.
 Peer-to-peer model – It is a decentralised approach in which all the distributed computing nodes,
known as peers, play equivalent roles and can both request and
provide services to other peers. It is a highly scalable model because peers can join and leave the
system dynamically, which makes it an ad-hoc form of network. The resources are distributed, and
peers look up the resources they need as and when required. Communication takes place
directly between peers, without intermediaries, according to the rules and
procedures defined by the P2P network. A well-known example of this type of computing is BitTorrent.

 Layered model – It involves organising the system into multiple layers, where each layer
provides a specific service. Each layer communicates with the adjacent layers using well-defined
protocols without affecting the integrity of the system. A hierarchical structure is obtained
where each layer abstracts the underlying complexity of lower layers.

 Micro-services model – In this model, a complex application or task is decomposed into multiple
independent services that run on different servers. Each service performs only a single
function and is focused on a specific business capability. This makes the overall system more
maintainable, scalable and easier to understand. Services can be independently developed, deployed
and scaled without affecting the ongoing services.

Fundamental Model

The fundamental model in a distributed computing system is a broad conceptual framework that helps in
understanding the key aspects of distributed systems. It is concerned with a more formal
description of the properties that are common to all architectural models. It represents the
essential components that are required to understand a distributed system's behaviour. The three
fundamental models are as follows:
 Interaction Model – Distributed computing systems are full of many processes interacting with
each other in highly complex ways. Interaction model provides a framework to understand the
mechanisms and patterns that are used for communication and coordination among various
processes. Different components that are important in this model are –
 Message Passing – It deals with passing messages that may contain data, instructions, a
service request, or process synchronisation information between different computing nodes. It may be
synchronous or asynchronous depending on the types of tasks and processes.
 Publish/Subscribe Systems – Also known as pub/sub systems. Here a publishing process
publishes a message on a topic, and the processes subscribed to that topic receive it
and act on it. This pattern is especially important in event-driven architectures.
 Remote Procedure Call (RPC) – It is a communication paradigm that lets a process invoke a
procedure or method in a remote process as if it were a local procedure call. The client process
makes a procedure call using RPC, and the call is passed as a message to the required server process using
communication protocols. The message-passing details are hidden from the caller, and the result, once
obtained from the server process, is sent back to the client process so that it can continue execution.
 Failure Model – This model addresses the faults and failures that occur in the distributed
computing system. It provides a framework to identify and rectify the faults that occur or may occur
in the system. Fault tolerance mechanisms are implemented so as to handle failures by replication
and error detection and recovery methods. Different failures that may occur are:
 Crash failures – A process or node unexpectedly stops functioning.
 Omission failures – It involves a loss of message, resulting in absence of required
communication.
 Timing failures – A process or message fails to meet its expected time bounds, which may lead to
delays or unsynchronised response times.
 Byzantine failures – The process may send malicious or unexpected messages that conflict
with the set protocols.
 Security Model – Distributed computing systems may suffer malicious attacks, unauthorised
access and data breaches. Security model provides a framework for understanding the security
requirements, threats, vulnerabilities, and mechanisms to safeguard the system and its resources.
Various aspects that are vital in the security model are –
 Authentication – It verifies the identity of the users accessing the system. It ensures that
only the authorised and trusted entities get access. It involves –
 Password-based authentication – Users provide a unique password to prove their
identity.
 Public-key cryptography – Entities possess a private key and a corresponding
public key, allowing verification of their authenticity.
 Multi-factor authentication – Multiple factors, such as passwords, biometrics, or
security tokens, are used to validate identity.
 Encryption – It is the process of transforming data into a format that is unreadable
without a decryption key. It protects sensitive information from unauthorized access or disclosure.
 Data Integrity – Data integrity mechanisms protect against unauthorised modifications or
tampering of data. They ensure that data remains unchanged during storage, transmission, or
processing. Data integrity mechanisms include:
 Hash functions – Generating a hash value or checksum from data to verify its integrity.
 Digital signatures – Using cryptographic techniques to sign data and verify its authenticity
and integrity.
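The integrity checks described above can be sketched with Python's standard `hashlib` and `hmac` modules. The message and key are made-up values. Note that an HMAC is a keyed hash shared between sender and receiver; it is used here as a simpler stand-in and is not a public-key digital signature.

```python
import hashlib
import hmac

message = b"transfer 100 to account 42"   # hypothetical payload
secret = b"shared-secret-key"             # hypothetical shared key

# Plain hash: detects accidental corruption of the data.
digest = hashlib.sha256(message).hexdigest()

# Keyed hash (HMAC): also detects tampering by anyone without the key.
tag = hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(received, received_tag, key):
    """Recompute the tag and compare it with the one that was sent."""
    expected = hmac.new(key, received, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(expected, received_tag)


ok = verify(message, tag, secret)
tampered = verify(b"transfer 900 to account 42", tag, secret)
print(ok, tampered)
```

The unmodified message verifies, while changing even one character of the payload makes verification fail.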
HARDWARE CONCEPTS OF DISTRIBUTED SYSTEMS:

Distributed Systems: Software Concepts
_ Distributed operating system
_ Network operating system
_ Middleware

Distributed Operating System

Some characteristics:
_ OS on each computer knows about the other computers
_ OS on different computers generally the same
_ Services are generally (transparently) distributed across computers
Network Operating System

Some characteristics:
_ Each computer has its own operating system with networking facilities
_ Computers work independently (i.e., they may even have different operating systems)
_ Services are tied to individual nodes (ftp, telnet, WWW)
_ Highly file oriented (basically, processors share only files)
Distributed System (Middleware)
Some characteristics:
_ OS on each computer need not know about the other computers
_ OS on different computers need not generally be the same
_ Services are generally (transparently) distributed across computers
Need for Middleware
Motivation: Too many networked applications were difficult to integrate:
_ Departments are running different NOSs
_ Integration and interoperability only at level of primitive NOS services
_ Need for federated information systems:
– Combining different databases, but providing a single view to applications
– Setting up enterprise-wide Internet services, making use of existing information systems
– Allow transactions across different databases
– Allow extensibility for future services (e.g., mobility, teleworking, collaborative applications)
_ Constraint: use the existing operating systems, and treat them as the underlying environment
(they provided the basic functionality anyway)
Communication services: Abandon primitive socket based message passing in favor of:
_ Procedure calls across networks
_ Remote-object method invocation
_ Message-queuing systems
_ Advanced communication streams
_ Event notification service
Information system services: Services that help manage data in a distributed system:
_ Large-scale, system wide naming services
_ Advanced directory services (search engines)
_ Location services for tracking mobile objects
_ Persistent storage facilities
_ Data caching and replication
Control services: Services giving applications control over when, where, and how they access data:
_ Distributed transaction processing
_ Code migration
Security services: Services for secure processing and communication:
_ Authentication and authorization services
_ Simple encryption services
_ Auditing service
NETWORKING IN DISTRIBUTED SYSTEMS:
In distributed systems, networking plays a crucial role in enabling
communication and coordination between different components spread across
multiple machines or locations. It allows these components to exchange data,
synchronize their actions, and collaborate to achieve a common goal. Let's dive
into some key concepts and technologies related to networking in distributed
systems.

1. Network Protocols: Network protocols define the rules and conventions for
communication between devices on a network. The TCP/IP suite (Transmission
Control Protocol/Internet Protocol) forms the foundation of the internet and is
widely used in distributed systems. IP handles addressing and routing of
packets, while TCP adds reliable, ordered delivery on top of it through error
detection and retransmission.

2. Message Passing: In distributed systems, message passing is a fundamental
mechanism for communication between components. It involves sending messages or
data packets from one component to another over the network. Message passing
can be either synchronous or asynchronous, depending on whether the sender
waits for a response or not. It enables coordination, data sharing, and
synchronization between distributed components.
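The difference between synchronous and asynchronous message passing can be sketched as follows. This is a single-process illustration under stated assumptions: a thread plays the role of the remote component and an in-memory queue plays the role of the network; the names `send_sync` and `send_async` are hypothetical.

```python
import queue
import threading


def worker(inbox: queue.Queue) -> None:
    """Receiving component: handles messages until it gets a stop marker."""
    while True:
        payload, reply_box = inbox.get()
        if payload is None:          # stop marker
            break
        result = payload.upper()     # stand-in for real processing
        if reply_box is not None:    # only synchronous senders expect a reply
            reply_box.put(result)


inbox: queue.Queue = queue.Queue()
receiver = threading.Thread(target=worker, args=(inbox,), daemon=True)
receiver.start()


def send_sync(payload: str) -> str:
    """Synchronous send: block until the receiver answers."""
    reply_box: queue.Queue = queue.Queue(maxsize=1)
    inbox.put((payload, reply_box))
    return reply_box.get(timeout=2)


def send_async(payload: str) -> None:
    """Asynchronous send: enqueue the message and return immediately."""
    inbox.put((payload, None))


send_async("fire and forget")   # sender continues without waiting
reply = send_sync("ping")       # sender waits for the reply
inbox.put((None, None))         # ask the receiver to stop
receiver.join()
print(reply)
```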

3. Remote Procedure Call (RPC): RPC is a technique that allows a distributed
system to invoke procedures or methods on remote components as if they were
local. It abstracts the complexities of network communication and provides a
familiar programming model. RPC frameworks like gRPC and Apache Thrift enable
developers to build distributed systems by defining interfaces and generating
code for client-server communication.
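The idea of calling a remote procedure "as if it were local" can be sketched with a hand-written client stub. This is not how gRPC or Thrift are used; it is a toy illustration in which the "network" is a plain function call and requests are encoded as JSON. The names `CalculatorService`, `handle_request`, and `RpcProxy` are hypothetical.

```python
import json


class CalculatorService:
    """Server-side object whose methods are exposed remotely."""

    def add(self, a, b):
        return a + b


def handle_request(service, raw: bytes) -> bytes:
    """Server side: decode the request, call the method, encode the result."""
    request = json.loads(raw.decode("utf-8"))
    method = getattr(service, request["method"])
    result = method(*request["params"])
    return json.dumps({"result": result}).encode("utf-8")


class RpcProxy:
    """Client stub: turns attribute access into an encoded request."""

    def __init__(self, transport):
        # transport is any callable taking request bytes and returning reply bytes
        self._transport = transport

    def __getattr__(self, name):
        def remote_call(*params):
            raw = json.dumps({"method": name, "params": list(params)}).encode("utf-8")
            response = json.loads(self._transport(raw).decode("utf-8"))
            return response["result"]
        return remote_call


service = CalculatorService()
# In a real system the transport would send bytes over a socket.
calculator = RpcProxy(lambda raw: handle_request(service, raw))

total = calculator.add(2, 3)   # reads like a local call
print(total)
```

The caller never sees the marshalling step, which is the abstraction that RPC frameworks generate automatically from interface definitions.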

4. Publish-Subscribe Systems: Publish-subscribe systems facilitate
communication between distributed components using a messaging pattern.
Publishers produce messages, while subscribers express interest in specific
types of messages. A message broker acts as an intermediary, delivering
messages from publishers to interested subscribers. Technologies like Apache
Kafka and RabbitMQ are widely used for building publish-subscribe systems.
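The publisher, subscriber, and broker roles can be sketched with an in-memory broker. This is only an illustration of the pattern; it does not reflect the Kafka or RabbitMQ APIs, and the `Broker` class and topic names are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Intermediary that routes each message to the subscribers of its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register interest in a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver a message to every subscriber of the topic; return the count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = Broker()
orders_seen = []
audit_log = []

# Two independent subscribers express interest in the same topic.
broker.subscribe("orders", orders_seen.append)
broker.subscribe("orders", audit_log.append)

delivered = broker.publish("orders", "order #1 created")
ignored = broker.publish("payments", "payment received")  # nobody subscribed

print(delivered, ignored, orders_seen)
```

The publisher does not know who the subscribers are, which is what lets components be added or removed without changing each other.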

5. Distributed File Systems: Distributed file systems provide a way to store
and access files across multiple machines in a distributed system. They
abstract the complexities of data storage and replication, ensuring fault
tolerance and scalability. Examples include the Google File System (GFS) and
the Hadoop Distributed File System (HDFS).

6. Load Balancing: Load balancing is a technique used to distribute incoming
network traffic across multiple servers or nodes in a distributed system. It
ensures that no single component becomes overwhelmed with requests, improving
performance and scalability. Load balancers like Nginx and HAProxy
intelligently distribute traffic based on various algorithms.
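Two commonly used balancing strategies, round robin and least connections, can be sketched as below. This is a simplified illustration and not the implementation used by any particular load balancer; the class names and the server names `node-a`, `node-b`, `node-c` are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Send each request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties are broken by the order in which servers were listed.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a request finishes."""
        self.active[server] -= 1


servers = ["node-a", "node-b", "node-c"]

rr = RoundRobinBalancer(servers)
rr_choices = [rr.pick() for _ in range(4)]

lc = LeastConnectionsBalancer(servers)
first = lc.pick()
second = lc.pick()
lc.release(first)     # the first request completes
third = lc.pick()     # the freed server is the least loaded again

print(rr_choices, first, second, third)
```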

What is the Internet? Definition, Uses, Working, Advantages and Disadvantages
Introduction to Internet

The Internet is one of the most important tools and resources in use today, relied on by people across the
globe. It connects millions of computers, websites, and servers. Using the Internet we can send emails,
photos, videos, and messages to our loved ones. In other words, the Internet is a widespread interconnected
network of computers and other electronic devices that support Internet access. It creates a communication
medium to share and get information online. Only a device that is connected to the Internet can access its
applications, websites, social media apps, and many other services. The Internet is nowadays considered one
of the fastest mediums for sending and receiving information.

How is the Internet Set Up?

The Internet is built on physical transmission media such as optical fiber cables and copper wires, which
link together networks of different sizes (LAN, MAN, WAN, etc.). Even wireless access through 2G, 3G, 4G,
or Wi-Fi ultimately depends on this physical cabling. No single body runs the Internet as a whole, but an
organization named ICANN (Internet Corporation for Assigned Names and Numbers), located in the USA,
coordinates the domain name system and the allocation of IP addresses.

How Does the Internet Work?

The Internet works through clients and servers. A client is a device, such as a laptop or phone, that
requests information over the Internet. Servers are the computers that store websites and respond to those
requests. Clients and servers reach the Internet through an ISP (Internet Service Provider), and each
server is identified by its IP address.

Each website has a domain name, because it is difficult for a person to remember long numeric IP
addresses. Computers, however, route traffic by IP address and cannot use the domain name directly. So,
whenever you enter a domain name in the address bar of the browser, the browser first asks a DNS server
(Domain Name System server) for the IP address that belongs to that name. The DNS server acts like a huge
phone directory for the network: it looks up the name and returns the matching IP address. This is similar
to finding a person's Aadhaar number in a long directory when we know the person's name.
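The directory lookup described above can be sketched with an in-memory table. This is a toy model, not a real DNS client: the `Resolver` class is hypothetical, the domain names are reserved example names, and the addresses come from the 192.0.2.0/24 range set aside for documentation.

```python
class Resolver:
    """Toy name resolver: a lookup table plus a cache of earlier answers."""

    def __init__(self, records):
        self._records = records      # stands in for the DNS "phone directory"
        self._cache = {}
        self.directory_lookups = 0   # how often the directory was consulted

    def resolve(self, domain: str) -> str:
        # Domain names are case-insensitive; a trailing dot is optional.
        domain = domain.lower().rstrip(".")
        if domain in self._cache:
            return self._cache[domain]
        self.directory_lookups += 1
        address = self._records.get(domain)
        if address is None:
            raise KeyError(f"no record for {domain}")
        self._cache[domain] = address
        return address


# Hypothetical records using documentation-only addresses.
records = {"example.com": "192.0.2.10", "example.org": "192.0.2.20"}
resolver = Resolver(records)

first_answer = resolver.resolve("Example.COM")
second_answer = resolver.resolve("example.com")   # served from the cache

print(first_answer, second_answer, resolver.directory_lookups)
```

In a real program the operating system's resolver does this work; in Python it is reachable through `socket.getaddrinfo`.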

After getting the IP address, the browser sends its request to the respective server, and the server
processes the request and returns the content of the website that the client wants. If you are using mobile
data such as 3G or 4G, the data travels through optical cables to a cell tower, and from there the signals
reach your phone or PC as electromagnetic waves. If you are using a router, the optical fiber connected to
it carries light signals, which are converted to electrical signals, and Ethernet cables then carry the data
to your computer.


World Wide Web (WWW)

The World Wide Web is the collection of all the web pages and web documents that you can reach on the
Internet through their URLs (Uniform Resource Locators). For example, www.geeksforgeeks.org is the URL of
the GFG website, and all the content of this site, such as its web pages and web documents, is part of the
World Wide Web. In other words, the World Wide Web is an information retrieval service that runs on the
Internet. It provides users with a huge array of documents that are connected to each other by means of
hypertext or hypermedia links. Hyperlinks are electronic connections that link related data so that users
can easily reach the related information. Hypertext allows the user to pick a word or phrase from the text
and use it to open other documents that contain additional information related to that word or phrase.
The World Wide Web was created by Tim Berners-Lee in 1989 at CERN, so that researchers could work together
effectively. An organization named the World Wide Web Consortium (W3C) was later founded to carry the
development of the web further.
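As a small illustration of what a URL contains, the sketch below splits one into its parts with Python's standard `urllib.parse` module. The host name is the one mentioned above; the path, query, and fragment are hypothetical values added only to show every component.

```python
from urllib.parse import urlparse

# Only the host comes from the text; the rest of the URL is made up.
url = "https://www.geeksforgeeks.org/computer-networks/?ref=home#intro"
parts = urlparse(url)

print(parts.scheme)    # protocol used to fetch the page
print(parts.netloc)    # host name that DNS resolves to an IP address
print(parts.path)      # which document on that host
print(parts.query)     # extra parameters sent to the server
print(parts.fragment)  # position inside the document, handled by the browser
```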


Difference Between World Wide Web and the Internet

The main differences between the World Wide Web and the Internet are:

World Wide Web | Internet
The World Wide Web holds all the web pages and web documents, and each website is found through its own URL. | The Internet is a global network of computers over which the World Wide Web is accessed.
The World Wide Web is a service. | The Internet is an infrastructure.
The World Wide Web is a subset of the Internet. | The Internet is the superset of the World Wide Web.
The World Wide Web is software-oriented. | The Internet is hardware-oriented.
The World Wide Web uses HTTP. | The Internet uses IP addresses.
The World Wide Web can be considered a book on one of the different topics inside a library. | The Internet can be considered the library.

Uses of the Internet

Some of the important uses of the Internet are:

 Online Businesses (E-commerce): Online shopping websites have made our lives easier. E-commerce
sites like Amazon, Flipkart, and Myntra provide their services with just one click, and this is a great
use of the Internet.
 Cashless Transactions: Merchants let their customers pay for products online through various digital
payment apps like Paytm, Google Pay, etc. Use of the UPI payment gateway is also increasing day by day.
The digital payment industry is said to be growing at a rate of 50% every year, driven by the Internet.
 Education: The Internet makes a large amount of educational material available to everyone from
servers across the web. Those who are unable to attend physical classes can choose any course on the
Internet and study it in detail just by sitting at home. Highly qualified faculty teach online on digital
platforms and provide quality education to students with the help of the Internet.
 Social Networking: The purpose of social networking sites and apps is to connect people all over
the world. With the help of social networking sites, we can talk and share videos and images with our
loved ones when they are far away from us. We can also create groups for discussion or for meetings.
 Entertainment: The Internet is also used for entertainment. There are numerous entertainment
options available on the Internet, like watching movies, playing games, listening to music, etc. You can
also download movies, games, songs, TV serials, etc., easily from the Internet.

Security and the Internet

A very large amount of data is handled across the Internet almost all the time, which creates a risk of data
breaches and many other security issues. Both hackers and crackers can disrupt the network and steal
important information such as login credentials and banking credentials.

Steps to Protect Online Privacy

 Install antivirus or antimalware software.
 Create random and difficult passwords, so that they are hard to guess.
 Use a private browsing window or a VPN when using the Internet.
 Try to use only HTTPS sites for better protection.
 Try to make your social media accounts private.

INTRANET

What is an Intranet?
An intranet is a private network used within a company or other organization. Its main objectives are to
support safe staff communication, information archiving, and teamwork. Contemporary intranets commonly
include social features that let employees create profiles and submit, like, comment on, and share posts.

Only the members or staff of the organization have access to its intranet. The organization's computers
(or whichever computers it wants to connect) are linked through this network, and because the network is
private, no one from the outside world can access it. Many organizations and companies run their own
intranet for this reason: it protects their data and provides data security, since a private network does
not expose data to the outside world.

Working of Intranet

An intranet is a network confined to a company, school, or organization that works like the Internet.
Consider a typical setup: a company or organization has created its own private network, or intranet, for
its work. The company has many employees; take three of them as an example, who access the intranet from
PC 1, PC 2, and PC 3 (in the real world there are as many PCs as the organization requires). The
organization also has its own server to store files and data, and a firewall protects the private network.
The firewall secures the intranet server and keeps its data from leaking to unwanted users. Only a user who
has access to the intranet can use this network, and no one from the outside world can reach it. An
intranet user can access the Internet, but a person on the Internet cannot access the intranet.
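The one-way access rule described above (intranet users may reach the Internet, outsiders may not reach the intranet) can be sketched as a simple address check. This is a simplified illustration, not a real firewall: the address range `10.0.0.0/24` for the intranet and all the sample addresses are assumptions of this sketch.

```python
import ipaddress

# Hypothetical private address range used by the organization's intranet.
INTRANET = ipaddress.ip_network("10.0.0.0/24")


def firewall_allows(source_ip: str, destination_ip: str) -> bool:
    """Allow traffic into the intranet only when it also starts inside it."""
    source = ipaddress.ip_address(source_ip)
    destination = ipaddress.ip_address(destination_ip)
    if destination in INTRANET:
        return source in INTRANET
    # Traffic leaving for the Internet is not restricted in this sketch.
    return True


print(firewall_allows("10.0.0.11", "10.0.0.5"))      # PC to intranet server
print(firewall_allows("203.0.113.7", "10.0.0.5"))    # outsider to intranet server
print(firewall_allows("10.0.0.11", "198.51.100.1"))  # PC to an Internet host
```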

Why is Intranet Important?

Intranets play a crucial role in organizations by providing a centralized platform for seamless internal
communication, collaboration, and knowledge sharing, thereby significantly enhancing productivity,
streamlining operations, and fostering a culture of innovation and efficiency. Here are the reasons that
increase its importance:

 Improves internal communication


 Connects employees across locations and time zones
 Boosts recognition and reward
 Simplifies employee onboarding
 Provides organizational clarity
 Encourages knowledge sharing

Features of Intranet
 Document management: The ability to store, organize, and share documents.
 Collaboration tools: The ability to collaborate on projects and tasks.
 News and announcements: The ability to share news and announcements with employees.
 Employee directory: The ability to find contact information for employees.
 Training and development: The ability to provide training and development resources to
employees.
 HR resources: The ability to access HR-related information, such as benefits and policies.
 Support services: The ability to submit support tickets and get help from IT.
Difference between the Internet and Intranet:

S.NO | Internet | Intranet
1. | The Internet is used to connect different networks of computers simultaneously. | An intranet is owned by a private firm.
2. | On the Internet, there are multiple users. | On an intranet, there are limited users.
3. | The Internet is less safe, since it is open to everyone. | An intranet is safer, since access is restricted.
4. | On the Internet, there is a larger number of visitors. | On an intranet, there is a smaller number of visitors.
5. | The Internet is a public network. | An intranet is a private network.
6. | Anyone can access the Internet. | Not everyone can access an intranet.
7. | The Internet provides unlimited information. | An intranet provides limited information.
8. | Example: using social media on your phone or researching resources via Google. | Example: a company communicating internally with its employees and sharing information.
9. | The Internet is a global network that connects millions of devices and computers worldwide. | An intranet is a private network that connects devices and computers within an organization.
10. | It is open to everyone and allows access to public information, such as websites and online services. | An intranet is only accessible to authorized users within the organization.
11. | It is used for communication, sharing of information, e-commerce, education, entertainment, and other purposes. | An intranet is primarily used for internal communication, collaboration, and information sharing within an organization.
12. | Users can access the Internet from any location with an Internet connection and a compatible device. | Access to an intranet is restricted to authorized users within the organization and is typically limited to specific devices and locations.
13. | Security measures, such as firewalls, encryption, and secure sockets layer (SSL) protocols, are used to protect against threats like hacking, viruses, and malware. | Intranets employ similar security measures to protect against unauthorized access and ensure the privacy and integrity of shared data.
14. | The Internet is a public network that is not owned by any particular organization or group. | Intranets are private networks that are owned and managed by the organization that uses them.
15. | Examples of Internet-based services include email, social media, search engines, and online shopping sites. | Examples of intranet-based services include internal communications, knowledge management systems, and collaboration tools.

COMPUTER NETWORKS AND THEIR TYPES:

Computer Networks
A computer network is a group of computers connected over a shared communication path so that resources
provided by or located on the network nodes can be shared from one computer to another.

Uses of Computer Networks


 Communicating using email, video, instant messaging, etc.
 Sharing devices such as printers, scanners, etc.
 Sharing files.
 Sharing software and operating programs on remote systems.
 Allowing network users to easily access and maintain information.

Types of Computer Networks

There are mainly four types of computer networks:
1. Personal Area Network (PAN)
2. Local Area Network (LAN)
3. Metropolitan Area Network (MAN)
4. Wide Area Network (WAN)

These are explained below.

1. Personal Area Network (PAN)

PAN is the most basic type of computer network. This network is restricted to a single person, that is, communication
between the computer devices is centered only on an individual's workspace. PAN offers a network range of 1 to 100
meters from person to device. Its transmission speed is very high, with very easy maintenance and very low cost.
PAN uses technologies such as Bluetooth, IrDA, and Zigbee.
Examples of devices in a PAN are a computer, phone, tablet, printer, PDA, and USB devices.


2. Local Area Network (LAN)

LAN is the most frequently used network. A LAN is a computer network that connects computers through a common
communication path, contained within a limited area, that is, locally. A LAN consists of two or more computers,
often connected through a server. The two important technologies involved in this network are Ethernet and Wi-Fi.
It ranges up to 2 km, and its transmission speed is very high, with easy maintenance and low cost.
Examples of LAN are networking in a home, school, library, laboratory, college, office, etc.


3. Metropolitan Area Network (MAN)

A MAN is larger than a LAN but smaller than a WAN. This is the type of computer network that connects computers over a
geographical distance through a shared communication path over a city, town, or metropolitan area. This network mainly
uses FDDI, CDDI, and ATM as the technology with a range from 5km to 50km. Its transmission speed is average. It is
difficult to maintain and it comes with a high cost.
Examples of MAN are networking in towns, cities, a single large city, a large area within multiple buildings, etc.


4. Wide Area Network (WAN)

WAN is a type of computer network that connects computers over a large geographical distance through a shared
communication path. It is not restrained to a single location but extends over many locations. WAN can also be defined as
a group of local area networks that communicate with each other with a range above 50km.
Here we use Leased-Line & Dial-up technology. Its transmission speed is very low and it comes with very high maintenance
and very high cost.
The most common example of WAN is the Internet.
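The ranges quoted above for the four network types can be turned into a small classifier. This is only a rough sketch: real networks are not classified by distance alone, the function name `classify_network` is hypothetical, and because the text leaves the span between 2 km and 5 km unassigned, this sketch counts it as MAN (an assumption).

```python
def classify_network(span_km: float) -> str:
    """Map a network's geographic span to PAN, LAN, MAN, or WAN."""
    if span_km < 0:
        raise ValueError("span must be non-negative")
    if span_km <= 0.1:   # PAN: up to about 100 meters
        return "PAN"
    if span_km <= 2:     # LAN: up to about 2 km
        return "LAN"
    if span_km <= 50:    # MAN: up to about 50 km (2 to 5 km assumed to be MAN)
        return "MAN"
    return "WAN"         # WAN: above 50 km


for span in (0.01, 1, 20, 500):
    print(span, classify_network(span))
```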
