4 It6602 Notes Unit 4
Data-flow architectures can be classified into batch sequential and pipe-and-filter styles.
In the batch sequential style each step runs to completion before the next step starts. In the
pipe-and-filter style, e.g. UNIX command-line pipes, steps may run concurrently, processing
parts of the data incrementally.
Dataflow network
Component : Transducer
Connectors : Data stream
Control Topology : Arbitrary
Control Synchronicity : Asynchronous
Binding time : Run time
Data Topology : Arbitrary
Data Continuity : Continuous, low or high volume
Data Mode : Passed
Data binding time : Run time
Control/Data Interaction : isomorphic shape
Flow Direction : Same
Type of Reasoning : Functional composition
Batch Sequential
In the batch sequential style, processing steps, or components, are independent programs, and
the assumption is that each step runs to completion before the next step starts. Each batch of data
is transmitted as a whole between the steps. The typical application for this style is classical data
processing.
Batch sequential is a classical data processing model, in which a data transformation subsystem
can initiate its process only after its previous subsystem has run to completion.
The flow of data carries a batch of data as a whole from one subsystem to another.
The communications between the modules are conducted through temporary intermediate
files which can be removed by successive subsystems.
It is applicable for those applications where data is batched, and each subsystem reads
related input files and writes output files.
Typical application of this architecture includes business data processing such as banking
and utility billing.
Separate programs are executed in order; data is passed as an aggregate from one
program to the next.
Connectors: The human hand carrying tapes between the programs, also known as
sneaker-net.
Data Elements: Explicit, aggregate elements passed from one component to the next
upon completion of the producing program's execution.
Typical uses: Transaction processing in financial systems.
Examples
Payroll computations
Tax reports
Each step runs to completion before the next step starts (atomic computations)
Data transmitted as a whole between steps
Typical applications
o Classical data processing
o Program developments
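The batch sequential flow above can be sketched as a small Python program. This is an illustrative sketch only: the step names, file names, and payroll figures are made up; the essential point is that each step runs to completion and passes its whole batch to the next step through a temporary intermediate file.

```python
import csv
import os
import tempfile

def extract(records, out_path):
    """Step 1: write the raw batch to an intermediate file, then terminate."""
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(records)

def transform(in_path, out_path):
    """Step 2: runs only after step 1 has finished; reads the whole batch."""
    with open(in_path, newline="") as f:
        rows = [(name, int(hours) * int(rate)) for name, hours, rate in csv.reader(f)]
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)

def report(in_path):
    """Step 3: consumes the complete output of step 2."""
    with open(in_path, newline="") as f:
        return {name: int(pay) for name, pay in csv.reader(f)}

# Each program runs to completion before the next starts; data is passed
# as a whole between steps via temporary intermediate files.
tmp = tempfile.mkdtemp()
raw, cooked = os.path.join(tmp, "raw.csv"), os.path.join(tmp, "pay.csv")
extract([("ada", 40, 50), ("bob", 35, 40)], raw)
transform(raw, cooked)
print(report(cooked))  # {'ada': 2000, 'bob': 1400}
```

Note that no step overlaps with another: this is exactly the batch organization of processing listed later as a disadvantage relative to pipe-and-filter.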
Pipe and Filter
A pipe has a source end that can only be connected to a filter's output port and a sink end that can
only be connected to a filter's input port.
The connections between modules are data streams: first-in/first-out buffers that can carry
streams of bytes, characters, or any other type of that kind. The main feature of this architecture
is its concurrent and incremental execution.
In a pipe and filter style each component has a set of inputs (read) and a set of outputs (write).
A component reads streams of data on its inputs and produces streams of data on its outputs,
delivering a complete instance of the result in a standard order. This is usually accomplished by
applying a local transformation to the input streams and computing incrementally so output
begins before input is consumed. Hence components are termed filters. The connectors of this
style serve as conduits (channel for conveying water or fluid) for the streams, transmitting
outputs of one filter to inputs of another. Hence the connectors are termed pipes.
A filter is an independent data stream transformer (stream transducer). It reads data from the
input data stream, processes it, and writes the transformed data stream over a pipe for the
next filter to process. It works in an incremental mode, in which it starts working as soon as data
arrives through the connected pipe. There are two types of filters: active filters and passive filters.
Active filter: An active filter pulls data in from, and pushes the transformed data out to, the
connected pipes. It operates with passive pipes, which provide read/write mechanisms
for pulling and pushing. This mode is used in the UNIX pipe-and-filter mechanism.
Passive filter: A passive filter lets the connected pipes push data in and pull data out. It
operates with active pipes, which pull data from one filter and push data into the next
filter. The filter must provide a read/write mechanism.
Among the important invariants of the style, filters must be independent entities: in particular,
they should not share state with other filters. Another important invariant is that filters do not
know the identity of their upstream and downstream filters.
Their specifications might restrict what appears on the input pipes or make guarantees about
what appears on the output pipes, but they may not identify the components at the ends of those
pipes. Furthermore, the correctness of the output of a pipe-and-filter network should not depend
on the order in which the filters perform their incremental processing, although fair scheduling
can be assumed.
Figure 1 illustrates this style. Common specializations of this style include pipelines, which
restrict the topologies to linear sequences of filters; bounded pipes, which restrict the amount of
data that can reside on a pipe; and typed pipes, which require that the data passed between two
filters, have a well-defined type.
Pipes form data transmission graphs. Overall computation: run pipes and filters (non-
deterministically) until no more computations are possible. Invariant: the correctness of the
output of a pipe-and-filter network should not depend on ordering.
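The incremental behaviour described above can be sketched with Python generators, where each generator is a filter and the generator connections play the role of pipes. This is only an analogy, not the UNIX mechanism itself; the filter names and sample data are made up.

```python
def source(lines):
    """Data source: emits items one at a time into the pipeline."""
    for line in lines:
        yield line

def grep(pattern, stream):
    """Filter: passes through only matching items, incrementally."""
    for item in stream:
        if pattern in item:
            yield item

def upper(stream):
    """Filter: local transformation; output begins before all input is consumed."""
    for item in stream:
        yield item.upper()

# The pipes are the generator connections; each filter pulls from its upstream
# neighbour without knowing that neighbour's identity.
pipeline = upper(grep("aug", source(["jan bill", "aug bill", "aug fee"])))
print(list(pipeline))  # ['AUG BILL', 'AUG FEE']
```

Each filter shares no state with the others and never names its upstream or downstream partner, which is exactly the invariant stated above.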
Examples
1. Programs written in Unix shells. Filters: Unix processes. Unix supports this style by providing
a notation for connecting components (represented as Unix processes) and by providing
runtime mechanisms for implementing pipes. Example: ls invoices | grep -e August | sort
2. Compilers (an example of a pipeline, though the phases are often not incremental): lexical
analysis, parsing, semantic analysis, code generation.
3. Signal processing domains
4. Parallel programming
5. Distributed programming
Advantages
Simplicity: Simple, intuitive, efficient composition of components
Reusability: High potential for reuse of components; any two filters can be
connected if they agree on data format
Evolvability: Filters can easily be added, replaced, or recombined
Efficiency: Supports concurrent execution (contrast batch-sequential)
Consistency: All components have the same interfaces, only one type of
connector
Distributability: Byte streams can be sent across networks
Ease of maintenance: filters can be added or replaced; Possible to hook any two
filters together
Potential for parallelism: filters implemented as separate tasks, consuming and
producing data incrementally
Supports certain analyses: throughput, latency, deadlock
Concurrent execution
Disadvantages
Batch-oriented processing (often leads to a batch organization of processing)
Must agree on lowest-common-denominator data format
Does not guarantee semantics
Limited application domain: stateless data transformation
Sharing global data expensive or limiting
Scheme is highly dependent on order of filters
Can be difficult to design incremental filters
Not appropriate for interactive applications
Error handling difficult: what if an intermediate filter crashes?
Data type must be the lowest common denominator, e.g. ASCII
Both batch sequential and pipe-and-filter decompose the task into a fixed sequence of
computations (components) interacting only through the data passed from one to another.
CALL-AND-RETURN STYLES
Call-and-return architectures have the goal of achieving the qualities of modifiability and
scalability. Call-and-return architectures have been the dominant architectural style in large
software systems for the past 30 years. However, within this style, a number of substyles, each
of which has interesting features, have emerged.
usually the caller waits until an invoked service completes and returns results before
continuing
components depend on invoked functionality to get their own work done
the correctness of each component may depend on the correctness of the functionality it
invokes
subroutines
o decomposition of main program into processing steps
functional modules
o aggregation of processing steps into modules
abstract data types
o bundle operations and data, hide representations and other decisions
objects
o sub-typing, polymorphism, dynamic binding of methods
client-server
o distribution, tiers
components
o multiple interfaces, advanced middleware services
services
o late binding of providers
Variations
Objects as concurrent tasks
Control Topology: Hierarchical
Control Synchronicity: Sequential
Binding time: Write/Compile time
Data Topology: Arbitrary
Data Continuity: Sporadic, low volume
Data Mode: Passed and shared
Data binding time: Run time
Control/Data Interaction: not isomorphic
Flow Direction: N/A
Type of Reasoning: Hierarchy
Remote procedure call systems are main-program-and-subroutine systems that are decomposed
into parts that live on computers connected via a network. The goal is to increase performance
by distributing the computations and taking advantage of multiple processors. In remote
procedure call systems, the actual assignment of parts to processors is deferred until runtime,
meaning that the assignment is easily changed to accommodate performance tuning. In fact,
except that subroutine calls may take longer to accomplish when they invoke a function on a
remote machine, a remote procedure call is indistinguishable from standard main-program-and-
subroutine systems.
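A minimal illustration of this, using Python's standard-library XML-RPC modules (the function name, host, and values are made up for the sketch): the client-side call reads exactly like an ordinary subroutine call, even though it executes in the server process.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

def add(a, b):
    """An ordinary subroutine, registered as a remotely callable procedure."""
    return a + b

# Server side: the subroutine lives in a (possibly remote) process.
# Port 0 asks the OS for any free port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
port = server.server_address[1]
server.register_function(add, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: syntactically a plain subroutine call, but the arguments and
# result travel over the network to the server process.
proxy = ServerProxy(f"http://127.0.0.1:{port}")
result = proxy.add(2, 3)
print(result)  # 5
```

Moving `add` to another machine only changes the URL given to `ServerProxy`, which is the deferred, easily retuned assignment of parts to processors described above.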
Object-oriented or abstract data type systems are the modern version of call-and-return
architectures. The object-oriented paradigm, like the abstract data type paradigm from which it
evolved, emphasizes the bundling of data and methods to manipulate and access that data (Public
Interface). The object abstractions form components that provide black-box services and other
components that request those services. The goal is to achieve the quality of modifiability.
This bundle is an encapsulation that hides its internal secrets from its environment. Access to
the object is allowed only through provided operations, typically known as methods, which are
constrained forms of procedure calls. This encapsulation promotes reuse and modifiability,
principally because it promotes separation of concerns: The user of a service need not know, and
should not know, anything about how that service is implemented.
Components: Classes
Connectors: Routine calls
Key aspects
A class describes a type of resource and all accesses to it (encapsulation)
Representation hidden from client classes
Variations : Objects as concurrent tasks
Disadvantages
Objects must know their interaction partners; when a partner changes, clients must change
Side effects: if A uses B and C also uses B, then C's effects on B can be unexpected to A
The identity of interacting objects needs to be known, and it must be changed in all objects
interacting with an object whose identity was modified
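The encapsulation idea described above can be sketched as a small Python class. The class, its fields, and the values are invented for illustration: clients reach the hidden representation only through the provided methods (constrained procedure calls).

```python
class Account:
    """Abstract data type: data and the operations on it are bundled together;
    the representation (_balance) is an internal secret of the object."""

    def __init__(self, owner):
        self._owner = owner
        self._balance = 0          # hidden state, never accessed directly by clients

    def deposit(self, amount):
        """A provided operation: the only sanctioned way to change the state."""
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def balance(self):
        """Clients see only the method, not the stored field."""
        return self._balance

# Clients use only the public interface; how the balance is stored could be
# changed (e.g. to a transaction log) without affecting them.
a = Account("ada")
a.deposit(100)
a.deposit(50)
print(a.balance())  # 150
```

This separation of concerns is what lets the representation evolve without touching client code, the modifiability goal stated above.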
Layered Systems
Layered systems are ones in which components are assigned to layers to control inter component
interaction. In the pure version of this architecture, each level communicates only with its
immediate neighbours.
The goal is to achieve the qualities of modifiability and, usually, portability. The lowest layer
provides some core functionality, such as hardware, or an operating system kernel. Each
successive layer is built on its predecessor, hiding the lower layer and providing some services
that the upper layers make use of.
Components (layers)
Programs or subprograms
Connectors (services)
Procedure calls or system calls
Configurations
Onion or stovepipe structure, possibly replicated
Underlying computational model
Procedure call/return
Stylistic invariants
Each layer provides a service only to the immediate layer above (at the next
higher level of abstraction) and uses only the services of the immediate layer
below (at the next lower level of abstraction)
Each layer provides certain services and hides part of the lower layer (invariant)
Provides well-defined interfaces of services to certain other layers
(invariant)
Various functions; kernels provide core capability, often as a set of procedures
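The strict layering rule above can be sketched in Python. The three layers and their methods are invented for illustration; what matters is that each layer holds a reference only to the layer immediately below it and calls only that layer's services.

```python
class Hardware:
    """Lowest layer: core functionality (stands in for disk hardware)."""
    def read_block(self, n):
        return f"raw-block-{n}"

class FileSystem:
    """Middle layer: uses only Hardware, and hides block addressing from above."""
    def __init__(self, hw):
        self._hw = hw
    def read_file(self, name):
        # Illustrative mapping of a file name to a block number.
        return self._hw.read_block(hash(name) % 8)

class Application:
    """Top layer: uses only FileSystem; never touches Hardware directly."""
    def __init__(self, fs):
        self._fs = fs
    def open_document(self, name):
        return f"doc[{self._fs.read_file(name)}]"

# Onion structure: each successive layer is built on its predecessor.
app = Application(FileSystem(Hardware()))
print(app.open_document("notes.txt"))
```

Swapping the `Hardware` implementation (say, for a network-backed one) leaves `Application` untouched, which is the portability advantage listed below.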
Advantages
Decomposability: Effective separation of concerns
Maintainability: Changes that do not affect layer interfaces are easy to make
Evolvability: Potential for adding layers
Adaptability/Portability: Can replace inner layers as long as interfaces remain the
same (consider swapping out a Solaris JVM for a Linux one)
Understandability: Strict set of dependencies allow you to ignore outer layers
Modifiability : Changing one layer influences only the two adjacent layers
Reuse : Different implementations easy to substitute
Disadvantages
Performance degrades with too many layers
Can be difficult to cleanly assign functionality to the right layer
Not all systems suitable for this
Performance may require other coupling
Finding the right abstraction for each layer can be quite hard
Note : Control Topology, Control Synchronicity, Binding Time, Control/Data Interaction, and Flow
Direction parameters are NOT APPLICABLE for this style
Another example of data-centered architectures is the web architecture, which has a common
data schema (i.e. the meta-structure of the Web), follows the hypermedia data model, and whose
processes communicate through the use of shared web-based data services.
Data-Centered Style
[Figure: Clients A, B, and C accessing a shared data store]
Advantages
Clients are independent from each other. Thus, a client can be changed, without affecting
the others. Also further clients can be added.
This advantage pales if the architecture is changed in such a way that clients are coupled
closely, for example in order to improve the performance of the system.
Disadvantages
Data consistency - synchronization of read/write operations
Data security, access control
Single point of failure
Types of Components
Access to shared data represents the core characteristic of data-centered architectures. Data
integrability is the principal goal of such systems.
The means of communication between the components distinguishes the subtypes of the data-
centered architectural style:
Interactions or communication between the data accessors happen only through the data store.
The data is the only means of communication among clients. The flow of control differentiates
the architecture into two categories: the repository style and the blackboard style.
In Repository Architecture Style, the data store is passive and the clients (software components
or agents) of the data store are active, which control the logic flow. The participating
components check the data-store for changes.
A client sends a request to the system to perform actions (e.g. insert data). The computational
processes are independent and triggered by incoming requests. If the types of transactions in an
input stream of transactions trigger the selection of processes to execute, then it is a traditional
database or repository architecture (a passive repository). This approach is widely used in
DBMSs, library information systems, the interface repository in CORBA, compilers, and CASE
(computer-aided software engineering) environments.
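The repository style described above can be sketched as follows. The store, client functions, and invoice data are invented for illustration; the point is that the data store is passive while the active clients control the logic flow and communicate only through the store.

```python
class Repository:
    """Passive data store: holds shared data, performs no logic of its own."""
    def __init__(self):
        self._data = {}
    def put(self, key, value):
        self._data[key] = value
    def get(self, key):
        return self._data.get(key)

# Active clients: each controls its own logic flow and checks the store itself.
def loader(repo):
    """Client that inserts a batch of data."""
    repo.put("invoice:1", {"amount": 250})

def tax_agent(repo):
    """Client that reads what the store holds and derives new data from it."""
    inv = repo.get("invoice:1")
    repo.put("invoice:1:tax", round(inv["amount"] * 0.2, 2))

repo = Repository()
loader(repo)       # the clients never call each other:
tax_agent(repo)    # the data store is their only means of communication
print(repo.get("invoice:1:tax"))  # 50.0
```

Because the clients are decoupled, a new agent (say, an auditing client) could be added without changing `loader` or `tax_agent`, matching the reusability advantage listed below.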
Advantages
Provides data integrity, backup and restore features.
Provides scalability and reusability of agents as they do not have direct communication
with each other.
Reduces overhead of transient data between software components.
Disadvantages
High dependency between data structure of data store and its agents.
Changes in data structure highly affect the clients.
Evolution of data is difficult and expensive.
Cost of moving data on network for distributed data.
In the Blackboard Architecture Style, the data store is active and its clients are passive. Therefore
the logical flow is determined by the current data status in the data store. It has a blackboard
component, acting as a central data repository, on which an internal representation is built and
acted upon by different computational elements.
Further, a number of components act independently on the common data structure stored
in the blackboard. In this style, the components interact only through the blackboard. The data
store alerts the clients whenever the data store changes. The current state of the solution is
stored in the blackboard, and processing is triggered by the state of the blackboard.
When changes occur in the data, the system sends notifications (known as triggers) and data to
the clients. This approach is found in certain AI applications and complex applications, such as
speech recognition, image recognition, security systems, and business resource management
systems.
If the current state of the central data structure is the main trigger of selecting processes to
execute, the repository can be a blackboard and this shared data source is an active agent.
A major difference with traditional database systems is that the invocation of computational
elements in a blackboard architecture is triggered by the current state of the blackboard, and not
by external inputs.
Control
Control manages tasks and checks the work state.
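The blackboard mechanism above can be sketched in Python. The knowledge sources and the toy speech-recognition data are invented for illustration; the key behaviour is that writing to the blackboard, not any external input, triggers the sources, and each source acts only when the current state enables it.

```python
class Blackboard:
    """Active data store: notifies registered knowledge sources on every change."""
    def __init__(self):
        self._state = {}
        self._sources = []
    def register(self, source):
        self._sources.append(source)
    def write(self, key, value):
        self._state[key] = value
        for source in self._sources:   # trigger: the current state, not external input
            source(self)
    def read(self, key):
        return self._state.get(key)

# Passive knowledge sources: each fires only when the blackboard state allows it.
def syllable_source(bb):
    if bb.read("phonemes") and not bb.read("syllables"):
        bb.write("syllables", ["hel", "lo"])

def word_source(bb):
    if bb.read("syllables") and not bb.read("word"):
        bb.write("word", "hello")

bb = Blackboard()
bb.register(syllable_source)
bb.register(word_source)
bb.write("phonemes", ["h", "e", "l", "o"])  # one write triggers the whole chain
print(bb.read("word"))  # hello
```

Note how control resides in the blackboard's notification loop (the "Control" component above), while the sources remain independent of each other.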
Advantages
The blackboard model provides concurrency, allowing all knowledge sources to work in parallel
as they are independent of each other. Its scalability facilitates easily adding or updating
knowledge sources. Further, it supports experimentation with hypotheses and reusability of
knowledge source agents.
Disadvantages
A structural change of the blackboard may have a significant impact on all of its agents, as
a close dependency exists between the blackboard and its knowledge sources.
The blackboard model is expected to produce an approximate solution; however, it can
sometimes be difficult to decide when to terminate the reasoning.
Further, this model suffers from problems in synchronizing multiple agents, which makes
the system challenging to design and test.
EVENT STYLES
Event Based Implicit Invocation
Examples
debugging systems (listen for particular breakpoints)
database management systems (for data integrity checking)
graphical user interfaces
Interesting properties
announcers of events do not need to know who will handle the event
Supports re-use, and evolution of systems (add new agents easily)
Disadvantages
Components have no control over ordering of computations
Components
Programs or program entities that announce and/or register interest in events
Events represent happenstances inside an entity that may (or may not) be of interest to
other entities
Connectors
Direct registration with announcing entities
Or, explicit event broadcast and registration infrastructure
Configurations
Implicit dependencies arising from event announcements and registrations
Publish-Subscribe Event-Based
A component may:
Announce events
Register a callback for events of other components
Connectors are the bindings between event announcements and routine calls (callbacks)
Components
o Publishers, subscribers
o Event generators and consumers
Connectors
Procedure calls
Event bus
Topology
Subscribers connect to publishers directly (or through network)
Components communicate with the event bus, not directly to each other
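A minimal event-bus sketch in Python illustrates the topology above. The topic name, subscriber callbacks, and order data are invented for illustration; the key properties are that components talk only to the bus and that the announcer has no idea which components will respond, or in what order.

```python
from collections import defaultdict

class EventBus:
    """Connector: components communicate with the bus, never with each other."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback for events published on a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        """Announce an event; the publisher cannot see who handles it."""
        for callback in self._subscribers[topic]:
            callback(payload)

bus = EventBus()
log = []

# Two independent subscribers; neither knows about the other.
bus.subscribe("order.created", lambda e: log.append(f"email to {e['user']}"))
bus.subscribe("order.created", lambda e: log.append(f"invoice #{e['id']}"))

# The publisher announces the event and relies on nothing about the responses.
bus.publish("order.created", {"id": 7, "user": "ada"})
print(log)  # ['email to ada', 'invoice #7']
```

A third component (say, a stock-level updater) could be added by subscribing to the same topic, without touching the publisher, which is the ease-of-evolution advantage listed below.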
Advantages
o Efficient dissemination of one-way information
o Provides strong support for reuse
o Allows for decoupling and autonomy of components
o Any component can be added, by registering/subscribing for events
o Eases system evolution
components may be replaced without affecting other components in the system
Disadvantages
o Need special protocols when number of subscribers is very large
o Event abstraction does not cleanly lend itself to data exchange
o Difficult to reason about behaviour of an announcing component independently of
components that register for its events
o When a component announces an event:
it has no idea what other components will respond to it,
it cannot rely on the order in which the responses are invoked
it cannot know when responses are finished
o Correctness hard to ensure: depends on context and order of invocation
QA evaluation
Summary:
o GUI
o Multi-player network-based games
What are the advantages of using the style?
o Subscribers are independent from each other
o Very efficient one-way information dissemination
What are the disadvantages of using the style?
o When a number of subscribers is very high, special protocols are needed
CASE STUDIES
CASE STUDY 1 : KWIC (KEY WORD IN CONTEXT)
REFER UNIT 1
It is an improvement over the layered model, as it does not isolate the functions in separate
partitions.
The main problem with this model is that
o It is not clear how the user should interact with it
DESIGN CONSIDERATIONS
REQ1: Supports deliberate and reactive behavior. The robot must coordinate the actions needed
to accomplish its mission with reactions to unexpected situations
REQ2: Allows for uncertainty and unpredictability of the environment. The situations are not
fully defined and/or predictable. The design should handle incomplete and unreliable information
REQ3: System must consider possible dangerous operations by Robot and environment
REQ4: The system must give the designer flexibility (missions change/requirement changes)
The third solution is based on a form of implicit invocation, as embodied in the Task Control
Architecture (TCA). The TCA design is based on hierarchies of tasks, or task trees
Parent tasks initiate child tasks
Temporal dependencies between pairs of tasks can be defined
e.g. A must complete before B starts (selective concurrency)
Allows dynamic reconfiguration of task tree at run time in response to sudden
change(robot and environment)
Uses implicit invocation to coordinate tasks
Tasks communicate using multicast messages (via a message server) sent to the tasks that are
registered for these events