
1805594

RISHABH DEO

Q5. Discuss some effective ways to measure the fault tolerance technique to increase the reliability
of a private, public and hybrid cloud.

Ans: Cloud computing has transformed the delivery model of information technology from a product to a service. However, the performance of cloud computing services is hampered by their inherent vulnerability to failures, a consequence of the scale at which they operate. Cloud computing services can be utilized to their maximum potential only if the performance-related issues of reliability, availability, and throughput are handled effectively by cloud service providers. Therefore, fault tolerance becomes a critical requirement for achieving high performance in cloud computing.

Fault tolerance in cloud computing means designing a blueprint for continuing the ongoing work even when a few parts are down or unavailable. This helps enterprises evaluate their infrastructure needs and requirements, and provide services when the associated devices are unavailable for some cause. It does not mean that the alternate arrangement can provide 100% of the full service, but the concept keeps the system running at a usable and, most importantly, reasonable level. This is important if enterprises are to keep growing continuously and increase their productivity levels.

Fault tolerance techniques can be measured using the following parameters/metrics:

Adaptive: all processes are executed automatically according to the prevailing conditions.

Performance: used to ensure the efficiency of the system.

Response Time: the total time taken to respond to a specific request.

Throughput: the number of tasks whose execution has been completed successfully.

Reliability: the ability to provide an accurate or acceptable result within a certain time period.

Availability: the probability that the system is functioning properly when it is requested/intended for use.

Usability: the degree to which a user can make use of the product to accomplish the target with efficiency, effectiveness, and satisfaction.

Overhead Associated: the total overhead involved while executing a fault-tolerance (FT) algorithm.

Cost-effectiveness: a monetary characterization of the system.

On the basis of these parameters, we can measure the reliability of a private, public and hybrid
cloud.
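
To make two of these metrics concrete, here is a minimal Python sketch (not a standard tool; the uptime and task figures are assumed measurements) that computes availability from mean time between failures (MTBF) and mean time to repair (MTTR), together with a simple success-rate throughput:

mtbf_hours = 480.0   # mean time between failures (assumed measurement)
mttr_hours = 2.0     # mean time to repair (assumed measurement)

# Availability: probability that the system is functioning when requested.
availability = mtbf_hours / (mtbf_hours + mttr_hours)

tasks_submitted = 10_000   # assumed workload figures
tasks_completed = 9_920

# Throughput here: fraction of tasks whose execution completed successfully.
throughput = tasks_completed / tasks_submitted

print(f"availability = {availability:.4f}")   # ~0.9959
print(f"throughput   = {throughput:.4f}")     # 0.9920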
Q7. Design a stock performance system to predict good and bad stocks based on their history
exhibiting parallel and distributed computing

A: Basically, stocks can be predicted by two methods:

i) from historical data, stocks can be classified as good or bad stocks;

ii) from market sentiment analysis, a prediction can be made about which stocks are potentially good or bad.

Here we will design our system using a time-series model based on historical data.

To apply the model with a good confidence interval, we need to provide a huge amount of historical data. To handle this huge amount of data, we will use the principles of distributed computing and weight sharing.

The distributed computing mechanism can be implemented by deploying a distributed master-client time-series model. In this setting, different nodes (clients) are responsible for processing batches of data (random chunks) to generate a time-series model. At regular time-stamps, the models generated on the different nodes are communicated to the master node, where they are aggregated. This greatly reduces the overall time and minimizes the computational cost. Every node works in parallel, making the overall data processing smooth and fault-tolerant. Communication between nodes is done using weight sharing, the process whereby the weights or parameters of a time-series model are sent from one node to another in order to pass along the underlying probability density of the time-series model.
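
As an illustration, here is a minimal Python sketch of this master-client pattern (the AR(1) fitting function and the synthetic series are assumptions for illustration): each client fits a simple autoregressive weight on its chunk of data in parallel, and the master aggregates the learned weights by averaging, which stands in for the weight-sharing step.

import numpy as np
from multiprocessing import Pool

def fit_ar1(chunk):
    # Least-squares estimate of w in x[t] = w * x[t-1] on one data chunk.
    x_prev, x_next = chunk[:-1], chunk[1:]
    return float(np.dot(x_prev, x_next) / np.dot(x_prev, x_prev))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    series = rng.normal(size=10_000).cumsum()   # synthetic price-like series
    chunks = np.array_split(series, 8)          # batches sent to 8 client nodes

    with Pool(processes=4) as pool:             # clients working in parallel
        local_weights = pool.map(fit_ar1, chunks)

    # Master node: aggregate the shared weights from all clients.
    global_weight = float(np.mean(local_weights))
    print(global_weight)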

The time-series model used here combines LSTM and ARIMA. The ARIMA model mainly captures linear relationships, whereas LSTM can capture higher-order effects and has the benefit of modelling past observations along with the present (i.e. it can remember and represent past data, which can improve the prediction considerably). We also cannot use the LSTM model alone, because ARIMA performs well in short-term forecasting. So a mixture of both will outperform either of them individually. The hybrid model can be generated by taking a weighted average of the forecast values predicted by LSTM and ARIMA, as sketched below.
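
A minimal sketch of the weighted-average combination step, assuming arima_forecast and lstm_forecast have already been produced by two separately trained models (the numbers and the weight are placeholders):

import numpy as np

# Hypothetical forecasts from the two separately trained models.
arima_forecast = np.array([101.2, 102.0, 102.5])
lstm_forecast  = np.array([100.8, 102.6, 103.1])

# The weight would be chosen from validation performance; 0.5 is arbitrary.
w = 0.5
hybrid_forecast = w * arima_forecast + (1 - w) * lstm_forecast
print(hybrid_forecast)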

Volume - The volume/amount of data is huge: terabytes, petabytes or more.

Velocity - Not only is the volume of data huge, the velocity at which it is generated is also very high. So we use batch processing (distributed processing) to handle the high-velocity data.

Variety - The data is structured. It is mostly available in proper tabular form.

Veracity - As the data is generated directly from real-time stock rises and falls, it is mostly untampered, reliable, and trustworthy.

Value - Modelling these data has very high statistical value, as it helps users decide where to invest, based on the prediction of good and bad stocks.
Q9. Explain the similarity and difference between JSON and BSON with suitable examples.
Ans: JSON, or JavaScript Object Notation, is a minimal, text-based, human-readable format for structuring data. The two primary parts that make up JSON are keys and values; together they make a key/value pair. At a high level, JSON has two structures: an object, which is a collection of name/value pairs, and an array, which is an ordered list of values. With the combination of the two, you can build a complete JSON structure.
Example:
{"menu": {
"id": "file",
"value": "File",
"popup": {
"menu item": [
{"value": "New", "onclick": "CreateDoc()"},
{"value": "Open", "onclick": "OpenDoc()"},
{"value": "Save", "onclick": "SaveDoc()"}
]
}
}}
BSON stands for Binary JSON, i.e. Binary JavaScript Object Notation. BSON is a lightweight binary-encoded serialization format capable of representing any JSON-like document. BSON is a superset of JSON: it supports all of the JSON types but also contains extensions, extra types that are not part of the JSON spec.
Example: the document {"hello": "world"} is encoded as:
\x16\x00\x00\x00 // total document size
\x02 // 0x02 = type String
hello\x00 // field name
\x06\x00\x00\x00world\x00 // field value (size of value, value, null terminator)
\x00 // 0x00 = type EOO ('end of object')
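
This layout can be verified in Python, assuming the bson package that ships with PyMongo is installed (pip install pymongo); the sketch below encodes the same document both ways and compares sizes:

import json
import bson   # shipped with PyMongo; an assumed dependency

doc = {"hello": "world"}

json_bytes = json.dumps(doc).encode("utf-8")   # human-readable text
print(len(json_bytes))                         # 18 bytes

bson_bytes = bson.encode(doc)                  # length-prefixed binary
print(len(bson_bytes))                         # 22 bytes: larger, but traversable

print(bson.decode(bson_bytes))                 # round-trip: {'hello': 'world'}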

Comparison:
As BSON is a superset of JSON, they share many similarities in function and usage, but unlike JSON, BSON is not in a human-readable format. Like JSON, it supports the embedding of documents and arrays within other documents and arrays, and it is easy for machines to parse and generate.
BSON is designed to be efficient in space, but in some cases it is not much more efficient than JSON; in some cases BSON uses even more space than JSON. The reason for this is another of the BSON design goals: traversability. BSON adds some "extra" information to documents, such as the lengths of strings and sub-objects, and this makes traversal faster.
BSON is also designed to be fast to encode and decode. For example, integers are stored as 32 (or
64) bit integers, so they don't need to be parsed to and from text. This uses more space than JSON
for small integers, but is much faster to parse.
In addition, BSON provides data types unavailable in JSON, notably the BinData and Date data types.
Here is a full list of differences with the basis of differentiation:

Basis of comparison: JSON vs. BSON

Type: JSON is a standard text file format; BSON is a binary file format.

Speed: JSON is comparatively slower; BSON is faster.

Space: JSON consumes comparatively less space; BSON consumes more space.

Usage: JSON is used for transmission of data; BSON is used for storage of data.

Encoding and decoding technique: JSON has no such technique; BSON has a faster encoding and decoding technique.

Characteristics: JSON is key/value pairs used only for transmission of data; BSON is lightweight, fast, and traversable.

Structure: JSON is a language-independent format used for asynchronous server-browser communication; BSON is binary JSON consisting of a list of ordered elements containing a field name, type, and value, where field names are typically strings.

Traversal: JSON doesn't skip; it skims through all the content; BSON instead indexes on the relevant content and skips all the content that does not have to be used.

Parsing: JSON formats need not be parsed, as they are already in a human-readable format; BSON needs to be parsed, but it is easy for machines to parse and generate.

Creation type: JSON broadly consists of objects and arrays, where an object is a collection of key/value pairs and an array is an ordered list of values; BSON's binary encoding carries additional information such as the lengths of strings and the object subtypes, and it supports the BinData and Date data types, which are not available in JSON.

Q18. In reference to time series,

- How to determine if it has a trend component?

- How to determine if it has a seasonal component?

- How to determine if it has both a trend and seasonal component?

A: Determining trend component in time series

As part of trend analysis, we remove random noise by applying smoothing. The moving average is the most common smoothing technique; it can be a simple average or a weighted average. The benefit of moving-average smoothing is that its results are less biased by outliers. Monotonic time-series data can be approximated by a linear function. For non-monotonic, non-linear data, we apply transformations such as the log, exponential, or square-root transformation to remove the non-linearity.
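
A minimal Python sketch of this procedure on synthetic data (the series, window length, and slope check are assumptions for illustration): smooth with a moving average, then fit a line to the smoothed values; a clearly nonzero slope suggests a trend component.

import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
t = np.arange(120)
series = pd.Series(0.3 * t + rng.normal(scale=3.0, size=t.size))  # noisy upward trend

smoothed = series.rolling(window=12, center=True).mean()  # remove random noise

valid = smoothed.dropna()
slope = np.polyfit(valid.index, valid.values, 1)[0]  # linear approximation
print(f"estimated slope: {slope:.3f}")               # close to the true 0.3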

Determining seasonality in time series

Seasonality is a structured pattern of changes within a year. Understanding the seasonality of data is useful for making predictions on any time-series data, for example for keeping the right inventory levels or planning production of the right products. Seasonality present in a time series can be identified by:

i) Domain knowledge of the time series, which helps in recognising structured patterns repeating over the year in a plot of the data.

ii) Obtaining a lagged autocorrelation function (ACF): significant spikes at the seasonal lags indicate seasonality.

iii) ANOVA (Analysis of Variance): used to test groups of observations (e.g. grouped by season) to check the difference between them.
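
To check for both a trend and a seasonal component at the same time, one common approach is classical decomposition, which splits the series into trend, seasonal, and residual parts. Below is a minimal Python sketch using statsmodels' seasonal_decompose on synthetic monthly data (the series and its parameters are assumptions for illustration):

import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose  # assumes statsmodels is installed

rng = np.random.default_rng(2)
t = np.arange(144)   # 12 years of monthly observations
series = pd.Series(
    0.5 * t                             # linear trend
    + 10 * np.sin(2 * np.pi * t / 12)   # yearly seasonal pattern
    + rng.normal(scale=2.0, size=t.size),
    index=pd.date_range("2010-01-01", periods=t.size, freq="MS"),
)

result = seasonal_decompose(series, model="additive", period=12)
print(result.trend.dropna().head())   # steadily rising values -> trend component
print(result.seasonal.head(12))       # repeating yearly pattern -> seasonal component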
