UNIT IV: Solution framework for IoT applications

The IoT Device Integration Concepts, Standards, and Implementations


i. There are several standards for device integration, especially for consumer and industrial
devices. These standards are continuously optimized toward integrating a wide variety of
distributed, decentralized, and disparate devices.
ii. However, the ultimate target is to establish smarter environments that readily link cross-
domain automation units [home, building and industry automation, manufacturing execution
systems (MES), cyber-physical systems (CPS), health care instruments, media players,
Internet gateways, consumer electronics, kitchen utensils, appliances, manufacturing
machines, defense equipment, vehicles, robots, sensor and actuator networks, personal digital
assistants, energy grids, etc.].
iii. Initially, industrial machinery was the focus for internal as well as external communication
and integration.

a. Machine-to-Machine Communication
i. Machine-to-machine (M2M) communication uses a device (sensor, meter, etc.) to capture an
event (temperature, inventory level, etc.), which is sent through a network (wireless, wired, or
hybrid) to an application (software program) that translates the captured event into
meaningful information (e.g., items need to be restocked).
ii. With the availability of implementation technologies, a large number of intelligent
machines sharing information and making decisions without direct human intervention is
being realized.
iii. The M2M applications are various and vast:
◾ Physical and homeland security through connected security and surveillance cameras,
alarm systems, and access control
◾ Object tracking and tracing: Fleet management, supply chain management, order
management, asset tracking and management, road tolling, and traffic optimization or
steering
◾ Automated payment through the integrated point of sales (PoS), vending machines,
gaming consoles, integrated dashboards, and so on
◾ Smarter health care through continuous monitoring of vital signs, ambient assisted living
(AAL), telemedicine, remote diagnostics, and so on
◾ Remote diagnostics, maintenance, or control of machines, sensors, actuators, lighting,
pumps, valves, elevators, roadside vending machines, transport vehicles, and so on
◾ Advanced metering infrastructure (AMI) for power, gas, water, heating, grid control,
industrial metering, and so on
◾ Industry automation through production chain monitoring and automation
◾ Home or building or campus networking and automation
◾ Mobile enablement for machine operations
iv. A variety of protocols exist that allow machines to connect and communicate with one
another. Machine data is collected and sent to centralised control and analytics systems,
where information is extracted and may be utilised for later decision-making and appropriate
actuation.

v. Cloud-based machine-centric services can be downloaded and installed on a wide range of
ground-level machines to make them more adaptive and intelligent. Thus, machine connectivity
brings about a wide range of new abilities and capacities for both machines and people.
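The M2M flow in point i (a device captures an event, sends it through a network, and an application turns it into meaningful information) can be sketched as follows. This is a minimal illustration, assuming the paho-mqtt Python library (1.x-style API) and a reachable broker; the broker host, topic, and reading function are invented placeholders.

```python
# Minimal M2M sketch: a sensor device publishes a captured event over MQTT
# to an application that interprets it. All names below are illustrative.
import json
import time

import paho.mqtt.client as mqtt  # assumes paho-mqtt 1.x-style API

BROKER_HOST = "broker.example.com"              # hypothetical broker address
TOPIC = "warehouse/freezer1/temperature"        # hypothetical topic

client = mqtt.Client(client_id="freezer1-sensor")
client.connect(BROKER_HOST, 1883)
client.loop_start()

def read_temperature() -> float:
    """Stand-in for a real sensor driver."""
    return 4.2

while True:
    event = {"ts": time.time(), "celsius": read_temperature()}
    # The application subscribed to this topic translates the raw event into
    # meaningful information (e.g., "door left open", "restock needed").
    client.publish(TOPIC, json.dumps(event), qos=1)
    time.sleep(60)
```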

b. Service Oriented Device Architecture for Device Integration


i. Service-Oriented Architecture (SOA) has been a successful architectural style, approach,
and pattern for designing and building next-generation enterprises.
ii. SOAs are often used to improve the flexibility and reusability of components in complex
distributed applications. This is achieved by exposing functional blocks as independent and
self-sufficient services.
iii. Several upcoming standards support the game-changing SOA concept. There are around 50
well-established web services (WS) standards for implementing new-generation service-
oriented systems.
iv. Every service is bound to expose one or more service interfaces; the service
implementation, which can be written in any programming language, stays hidden behind the
interfaces.
v. Service interfaces help identify, match, and integrate services. Composite services are
made of various services that are business-aligned and process-aware. So, in corporate and
cloud IT, service-based application integration is the new normal.
vi. Similarly, there is an integration established between applications and various data
sources. Thus, data, applications, systems, and networks are integrated spontaneously for
various services (IT and business).
vii. So, devices are hidden behind their service interfaces so that they can talk with each other
programmatically. That is, the SOA concept is being adjusted to service-oriented device
architecture (SODA) so that service-based devices can be integrated with each other.
viii. Devices and machines must be able to communicate and work together in order to
demonstrate intelligent behaviour. Bringing the devices together allows them to share their
unique service abilities with each other.
ix. Service requests and responses are getting fulfilled through SODA. Data and event
messages can be transmitted over any network among devices to accomplish the desired
tasks. Further, devices can be linked up with applications and data sources hosted in web and
cloud environments. Thereby devices can be activated over the Internet communication
infrastructure remotely.
x. Software infrastructure solutions that strictly follow SODA specifications and standards
make it easier and faster to monitor, measure, manage, and maintain devices used in
critical and disaster situations. That is, the Internet of Devices (IoD) concept is gaining
more and more attention.
xi. Finally, web-enabled devices are being designed, manufactured in large quantities, and
sent out to the market. For example, web-enabled refrigerators are already hitting the market.
SODA is an emerging concept for the device world. Several standards such as
DPWS, OSGi, RESTful services, and OPC fulfil the SODA idea.
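As a concrete illustration of the SODA idea via RESTful services, the sketch below hides a device behind a service interface using only the Python standard library. The endpoint path, device name, and payload shape are assumptions for illustration, not part of DPWS, OSGi, or OPC.

```python
# SODA-style sketch: a device capability exposed behind a RESTful service
# interface, so other programs can integrate it without knowing its internals.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def read_lamp_state() -> dict:
    """Stand-in for the real device driver."""
    return {"device": "lamp-42", "on": True, "brightness": 70}

class DeviceService(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/device/lamp-42/state":   # hypothetical endpoint
            body = json.dumps(read_lamp_state()).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    # Any client can now integrate this device by calling its service
    # interface, regardless of the implementation language behind it.
    HTTPServer(("0.0.0.0", 8080), DeviceService).serve_forever()
```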

Figure 1: The DPWS (Devices Profile for Web Services) OSI model.

Data acquisition
a. Data generation
Data is generated at devices and later transferred to the Internet through a gateway. Data
is generated as follows:
i. Passive device data: Data is generated at the device or system as the result of
interactions. A passive device does not have its own power source; an external source helps
such a device generate and send data. Examples are an RFID tag or an ATM debit card. The
device may or may not have an associated microcontroller, memory and transceiver. A
contactless card is an example of the former, and a label or barcode is an example of the
latter.
ii. Active device data: Data is generated at the device or system as the result of
interactions. An active device has its own power source. Examples are an active RFID tag, a
streetlight sensor or a wireless sensor node. An active device also has an associated
microcontroller, memory and transceiver.
iii. Event data: A device can generate data on an event only once. For example, detection
of traffic or of dark ambient conditions signals an event; the event on darkness
communicates a need to light up a group of streetlights. A system consisting of security
cameras can generate data on the event of a security breach or on detection of an intrusion. A
waste container with an associated circuit can generate data on the event of being filled to
90% or above. The components and devices in an automobile generate data on their
performance and functioning.
iv. Device real-time data: An ATM generates data and communicates it to the server
instantaneously through the Internet. This initiates and enables Online Transaction
Processing (OLTP) in real time.
v. Event-driven device data: Device data can be generated on an event only once. Examples
are: (a) a device receives a command from a controller or monitor and then performs action(s)
using an actuator; when the action completes, the device sends an acknowledgement; (b)
when an application seeks the status of a device, the device communicates the status.
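The waste-container example in point iii can be sketched as follows: data is generated only when the fill-level event occurs. The sensor function, threshold, and reporting callback are invented for illustration.

```python
# Event-driven data generation sketch: the container produces data only when
# the fill-level event occurs (filled to 90% or above), and only once per event.
import random
import time

FILL_THRESHOLD = 0.90

def read_fill_level() -> float:
    """Stand-in for an ultrasonic fill-level sensor (simulated here)."""
    return random.random()

def on_event(data: dict) -> None:
    """Stand-in for 'send to gateway'; a real device would transmit this."""
    print("event:", data)

already_reported = False
while True:
    level = read_fill_level()
    if level >= FILL_THRESHOLD and not already_reported:
        on_event({"container": "bin-7", "fill": level, "ts": time.time()})
        already_reported = True       # generate the event data only once
    elif level < FILL_THRESHOLD:
        already_reported = False      # reset after the bin is emptied
    time.sleep(30)
```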

b. Data Acquisition
i. Data acquisition means collecting data from IoT or M2M devices. The data is transferred
after interactions with the data acquisition system of the application. The application
interacts and communicates with a number of devices to acquire the needed data. The devices
send data on demand or at programmed intervals. Device data is communicated using the
network, transport and security layers.
ii. When devices have the ability to configure themselves, an application can set them up for
data transfer. The system, for example, can configure devices to send data at predefined
intervals; the frequency of data generation is determined by the device settings. For example,
the system can configure an umbrella device to obtain weather data from an Internet weather
service once every week.
iii. A vending machine can be programmed to communicate machine sales statistics and other
information every hour. It can also be programmed to communicate instantly in the event of a
problem or when a specific chocolate flavour requires a refill.
iv. The data-adaptation layer allows an application to customise data sending after filtering
or enrichment. The application-to-device gateway can do transcoding, data management, and
device management. Data management may include data integration, compaction, and fusion.
v. Device-management software maintains the device ID or address, activation, configuration,
registering, deregistering, attaching, and detaching.
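A small sketch of points ii and iii: the application configures a device's reporting interval, and the device also reports immediately when a problem occurs. The class, field names, and interval below are assumptions for illustration.

```python
# Interval-based acquisition sketch: scheduled reports at a configurable
# interval, plus an immediate report on an exceptional event.
import time

class VendingMachine:
    def __init__(self, report_interval_s: int):
        # The acquisition system can (re)configure this interval remotely.
        self.report_interval_s = report_interval_s
        self._last_report = 0.0

    def sales_statistics(self) -> dict:
        """Stand-in for reading the machine's real counters."""
        return {"machine": "vm-12", "sold": 37, "jam": False}

    def poll(self, send) -> None:
        stats = self.sales_statistics()
        due = time.time() - self._last_report >= self.report_interval_s
        if due or stats["jam"]:   # scheduled report, or report at once on a problem
            send(stats)
            self._last_report = time.time()

machine = VendingMachine(report_interval_s=3600)   # report every hour
while True:
    machine.poll(send=print)   # 'print' stands in for the network transfer
    time.sleep(5)
```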

c. Data Validation
i. Data acquired from the devices is not necessarily correct, meaningful or consistent. Data
consistency means the data is within the expected range, follows the expected pattern, and
has not been corrupted during transmission. The applications or services depend on valid
data. Therefore, data needs validation checks.
ii. Data validation software performs the validation checks on the acquired data by applying
logic and rules. Only then can the analytics, predictions, prescriptions, diagnoses and
decisions be acceptable.
iii. A large magnitude of data is acquired from a large number of devices, especially from
machines in industrial plants, embedded components in large numbers of automobiles, health
devices in ICUs, wireless sensor networks, and so on.
iv. Therefore, validation software consumes significant resources. An appropriate strategy
needs to be adopted.
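The checks described above (range, pattern, and transmission corruption) might look like the following sketch. The field names, temperature range, and MD5 checksum scheme are illustrative assumptions, not a standard.

```python
# Validation sketch: range, pattern, and corruption checks applied to a
# reading before it reaches analytics. All field names are invented.
import hashlib

EXPECTED_RANGE = (-40.0, 85.0)   # plausible range for a temperature sensor

def valid(reading: dict) -> bool:
    # Range check: value must fall within the expected range.
    t = reading.get("celsius")
    if not isinstance(t, (int, float)) or not EXPECTED_RANGE[0] <= t <= EXPECTED_RANGE[1]:
        return False
    # Pattern check: mandatory fields must be present.
    if "device_id" not in reading or "ts" not in reading:
        return False
    # Corruption check: recompute the checksum sent alongside the payload.
    payload = f'{reading["device_id"]}|{reading["ts"]}|{t}'.encode()
    return hashlib.md5(payload).hexdigest() == reading.get("checksum")

reading = {"device_id": "s1", "ts": 1700000000, "celsius": 21.5,
           "checksum": hashlib.md5(b"s1|1700000000|21.5").hexdigest()}
print(valid(reading))   # True; invalid data would be dropped or quarantined
```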

Data Integration
i. Data integration is the process of combining data from different sources into a single,
unified view. Data integration ultimately enables analytics tools to produce effective,
actionable business intelligence.
ii. There is no universal approach to data integration. However, data integration solutions
typically involve a few common elements, including a network of data sources, a master
server, and clients accessing data from the master server.
iii. In a typical data integration process, the client sends a request to the master server for
data. The master server then intakes the needed data from internal and external sources. The
data is extracted from the sources, then consolidated into a single, organised data set.
Integration thus helps businesses succeed.
iv. The advantages of data integration are that it improves collaboration and unification of
systems, saves time and boosts efficiency, reduces errors (and rework), and delivers more
valuable data.
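The client/master-server flow in point iii can be sketched as below. The two sources and the record schema are invented examples; a real master server would extract over the network rather than from in-process functions.

```python
# Data integration sketch: a master server extracts records from several
# sources and consolidates them into one organised data set for the client.
from typing import Callable

def crm_source() -> list[dict]:
    return [{"customer": "acme", "orders": 12}]      # invented source

def erp_source() -> list[dict]:
    return [{"customer": "acme", "invoices": 9}]     # invented source

class MasterServer:
    def __init__(self, sources: list[Callable[[], list[dict]]]):
        self.sources = sources

    def query(self) -> dict:
        unified: dict[str, dict] = {}
        for source in self.sources:                  # extract from each source
            for record in source():
                unified.setdefault(record["customer"], {}).update(record)
        return unified                               # single, unified view

server = MasterServer([crm_source, erp_source])
print(server.query())  # {'acme': {'customer': 'acme', 'orders': 12, 'invoices': 9}}
```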

a. Challenges to data integration


Taking several data sources and turning them into a unified whole within a single structure is a
technical challenge. As more businesses start to use data integration solutions, they have to
think about how to make sure that data always moves to where it needs to go. While this
provides time and cost savings in the short term, implementation can be slowed down by
numerous obstacles. Here are some common challenges that organizations face in building
their integration systems:
How to get to the finish line — Companies typically know what they want from data
integration. Anyone implementing data integration must understand what types of data need
to be collected and analyzed, where that data comes from, the systems that will use the data,
what types of analysis will be conducted, and how frequently data and reports will need to be
updated.
Data from legacy systems — Integration efforts may need to include data stored in legacy
systems.
Data from newer business demands — New systems today are generating different types of
data (such as unstructured or real-time) from all sorts of sources, such as videos, IoT devices,
sensors, and the cloud. Figuring out how to quickly adapt your data integration infrastructure
to meet the demands of integrating all this data becomes critical for your business to win.
External data — Data taken from external sources may not be provided at the same level of
detail as internal sources, making it difficult to examine.
Keeping up — As soon as an integration system has been set up, the job isn't done. The data
team has to make sure that the integration of data continues to be done in a way that meets
best practices.

b. Integration strategies for business


There are several ways to integrate data; which one fits depends on the size of the business,
the need being fulfilled, and the resources available.
Manual data integration is simply the process by which an individual user manually collects
necessary data from various sources by accessing interfaces, then cleans it up as needed, and
combines it into one warehouse. This is highly inefficient and inconsistent.
Middleware data integration is an integration approach where a middleware application acts
as a mediator, helping to normalize data and bring it into the master data pool. When legacy
applications do not play well with others, middleware comes into play.
Application-based integration is an approach to integration wherein software applications
locate, retrieve, and integrate data. During integration, the software must make data from
different systems compatible with one another so it can be transmitted from one source to
another.

Uniform access integration is a type of data integration that focuses on making data look
consistent when it is accessed from different sources; the data, however, is left within the
original source. Using this method, object-oriented database management systems can be used
to create the appearance of uniformity between unlike databases.
Common storage integration is the most frequently used approach to storage within data
integration. A copy of data from the original source is kept in the integrated system and
processed for a unified view.

c. Data integration tools


Data integration tools have the potential to simplify the data integration process. Desirable
features of a data integration tool are:
A lot of connectors: there are many systems and applications in the world; the more pre-built
connectors your data integration tool has, the more time you will save.
Open source: open-source architectures typically provide more flexibility while helping to
avoid vendor lock-in.
Portability: it is important, as companies increasingly move to hybrid cloud models, to be able
to build your data integrations once and run them anywhere.
Ease of use: data integration tools should be easy to learn and easy to use, with a GUI
to make visualizing your data pipelines simpler.
Cloud compatibility: your data integration tool should work in a single-cloud, multi-cloud, or
hybrid-cloud environment.

Data Categorisation for Storage


Services, business processes and business intelligence use data. Valid, useful and relevant
data can be categorised into three categories for storage: data alone; data as well as the
results of processing; and only the results of data analytics. The following are three cases for
storage:
1. Data which needs to be repeatedly processed, referenced or audited in future; therefore,
the data alone needs to be stored.
2. Data which needs processing only once, with the results used at a later time using the
analytics; both the data and the results of processing and analytics are stored. Advantages of
this case are quick visualisation and report generation without reprocessing. Also, the data
remains available for reference or auditing in future.
3. Online, real-time or streaming data which needs to be processed, with the results of this
processing and analysis needing storage. Data must be validated before storing. Data
aggregation, adaptation and enrichment are done before communicating to the Internet. Data
from a large number of devices and sources falls into a fourth category called Big data. Such
data is stored in databases at a server, in a data warehouse, or on a cloud as Big data.

Data store
A data store is a data repository of a set of objects which are integrated into the store.
Features of a data store are:
i. Objects in a data store are modelled using classes which are defined by the database
schemas.
ii. A data store is a general concept. It includes data repositories such as a database,
relational database, flat file, spreadsheet, mail server, web server, directory service and
VMware.
iii. A data store may be distributed over multiple nodes. Apache Cassandra is an example of a
distributed data store.
iv. A data store may consist of multiple schemas or of data in only one schema. An example
of a one-schema data store is a relational database. A repository in English means a
collection which can be relied upon when looking for required things, for special information
or knowledge; for example, a repository of paintings by artists. A database is a repository of
data which can be relied upon for reporting, analytics, processing, knowledge discovery and
intelligence. A flat file is another repository; a flat file is a file in which the records have
no structural interrelationship.

Data center management


A data centre is a facility which has multiple banks of computers, servers, large memory
systems, high-speed networks and Internet connectivity. The centre provides data security and
protection using advanced tools, full data backups along with data recovery, redundant data
communication connections, and full system power and electricity supply backups.
Large industrial units, banks, railways, airlines and organisations for whom data is a critical
component use the services of data centres. Data centres also possess dust-free environments
with heating, ventilation and air conditioning (HVAC), cooling, humidification and
dehumidification equipment, and pressurisation systems, within a physically highly secure
environment. The manager of a data centre is responsible for all technical and IT issues, the
operation of computers and servers, data entry, data security, data quality control, network
quality control and the management of the services and applications used for data processing.

Server management
Server management means managing services, setup and maintenance of systems of all types
associated with the server. A server needs to serve around the clock. Server management
includes managing the following:
i. Short reaction times when the system or network is down
ii. High security standards by routinely performing system maintenance and updates
iii. Periodic system updates for state-of-the art setups
iv. Optimised performance
v. Monitoring of all critical services, with SMS and email notifications
vi. Security of systems and protection
vii. Maintaining confidentiality and privacy of data
viii. High degree of security and integrity and effective protection of data, files and databases
at the organisation
ix. Protection of customer data and enterprise internal documents from attackers, which
includes spam mails, unauthorised use of access to the server, viruses, malware and worms
x. Strict documentation and audit of all activities.

Unstructured data is data which does not conform to a data model and has no easily
identifiable structure, such that it cannot be used by a computer program easily.
Unstructured data is not organised in a pre-defined manner and does not have a pre-defined
data model; thus, it is not a good fit for a mainstream relational database.

Characteristics of Unstructured Data:


i. The data neither conforms to a data model nor has any structure.
ii. The data cannot be stored in the form of rows and columns as in databases.
iii. The data does not follow any rules.
iv. The data lacks any particular format or sequence.
v. The data has no easily identifiable structure.
vi. Due to the lack of identifiable structure, it cannot be used by computer programs easily.

Sources of Unstructured Data:


Web pages, Images (JPEG, GIF, PNG, etc.), Videos, Memos, Reports, Word documents
and PowerPoint presentations, Surveys

Key differences between structured and unstructured data


While structured (quantitative) data gives a "bird's-eye view" of customers, unstructured
(qualitative) data provides a deeper understanding of customer behaviour and intent. Let's
explore some of the key areas of difference and their implications:
Sources: Structured data is sourced from GPS sensors, online forms, network logs, web
server logs, etc., whereas unstructured data sources include email messages, word-processing
documents, PDF files, etc.
Forms: Structured data consists of numbers and values, whereas unstructured data consists of
sensor data, text files, audio and video files, etc.
Models: Structured data has a predefined data model and is formatted to a set data structure
before being placed in data storage (e.g., schema-on-write), whereas unstructured data is
stored in its native format and not processed until it is used (e.g., schema-on-read).
Storage: Structured data is stored in tabular formats (e.g., Excel sheets or SQL databases)
that require less storage space. It can be stored in data warehouses, which makes it highly
scalable. Unstructured data, on the other hand, is stored as media files or in NoSQL databases,
which require more space. It can be stored in data lakes, which makes it difficult to scale.
Uses: Structured data is used in machine learning (ML) and drives its algorithms, whereas
unstructured data is used in text mining.
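The "Models" difference above (schema-on-write versus schema-on-read) can be made concrete with a short sketch using only Python's standard library; the table and document contents are invented examples.

```python
# Schema-on-write vs schema-on-read, in miniature.
import json
import sqlite3

# Schema-on-write: the table's structure is fixed before any data arrives,
# and every row must fit it.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE readings (device TEXT, celsius REAL)")
db.execute("INSERT INTO readings VALUES (?, ?)", ("s1", 21.5))

# Schema-on-read: the raw document is stored in its native format; structure
# is applied only at query time by whatever code reads it.
raw_document = '{"device": "s1", "note": "door left open", "celsius": 21.5}'
parsed = json.loads(raw_document)          # interpretation happens here
print(parsed["note"], db.execute("SELECT * FROM readings").fetchall())
```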

Device data storage - Unstructured data storage on cloud/local server


i. Unstructured data storage is very common, as many systems have to support uploading
attachments, images, press releases and document management functions.
ii. Unstructured data is often relatively large; it takes up more bandwidth and a certain
amount of server computing power, which influences servers with high performance
requirements.
iii. When the application needs more clustering support, the traditional way becomes more
difficult to work with, and many servers are compromised by Trojan viruses through file-
upload vulnerabilities.
iv. For unstructured data storage in a traditional file system (where cloud storage is not
required), a directory of the file system must be editable.
v. Technically advanced cloud storage has certain advantages. Cloud storage in the form of
object storage takes charge of storing and reading the actual content of documents.
vi. Cloud storage offers high elasticity, massive scale, high reliability and duplicate-file
merging, which help improve the quality of the storage service. The unstructured data cloud
storage architecture is built on this design; the hierarchy is shown in Figure 2.
vii. The application layer provides unstructured data application interfaces. These interfaces
serve various types of storage applications such as online storage, network drives, video, data
hosting and software download services.
viii. The session layer is responsible for user management, rights assignment, space
allocation and storage security policy. Depending on the security level, this layer develops
different security programmes to ensure data security.
ix. The role of the data layer is the unified management of unstructured data and metadata.
Unstructured data volumes range from MB to GB in size. The metadata information, such as
data identification, file length, type and other attribute information, totals no more than
1 KB.

Figure 2: Cloud storage tier architecture of unstructured data


x. The routing layer is responsible for cloud-node interoperability and the storage path
access interface.
xi. The physical layer provides storage space and computing resources for unstructured data
storage, and is responsible for maintaining the physical paths to storage nodes.

Overview: Authentication and Authorization
Authentication and authorization are two critical components in the everyday mission to
secure clients and devices on the Internet. That makes these components essential to any IoT
project, because the Internet of Things is simply devices, from simple sensors to complicated
cars and mobile devices, connecting together to share data. These connections must be
secured, and authentication and authorization make that possible.
The two concepts have some similarities, but each one means something very specific in this
discussion:
Authentication is the process of identifying the device. For Message Queuing Telemetry
Transport (MQTT), the process of authentication is to confirm that the device’s client ID is
valid; that is, the ID belongs to the device in question.
Authorization provides a mechanism to bind a specific device to certain permissions. With
Edge Connect, authorization is broken into two tasks:
Binding devices to groups
Binding groups to topics
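A sketch of device authentication for MQTT as described above: the client presents its client ID (and, commonly, credentials over TLS) so the broker can confirm the identity before authorization rules decide which topics it may use. The broker host, credentials, and topic are placeholders, the code assumes the paho-mqtt 1.x-style API, and Edge Connect has its own specific token and certificate requirements; consult its documentation for the exact scheme.

```python
# MQTT authentication sketch: the broker validates the client ID and
# credentials before accepting the connection; authorization then limits
# which topics this identity may use. All names are illustrative.
import ssl

import paho.mqtt.client as mqtt  # assumes paho-mqtt 1.x-style API

client = mqtt.Client(client_id="sensor-0042")      # identity to authenticate
client.username_pw_set("sensor-0042", "device-secret")
client.tls_set(cert_reqs=ssl.CERT_REQUIRED)        # encrypt the session

def on_connect(c, userdata, flags, rc):
    if rc == 0:
        # Authorization decides which topics this authenticated device may
        # publish to; publishing outside its permissions would be rejected.
        c.publish("devices/sensor-0042/telemetry", "ok")

client.on_connect = on_connect
client.connect("mqtt.example.com", 8883)           # hypothetical broker
client.loop_forever()
```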

Device-based authentication and authorization


Access Control
Three functional components (FCs) in a security functional group for ensuring security and
privacy are:
i. Authentication
ii. Authorisation
iii. Key exchange and management

Authentication
1. Identity (ID) establishment and authentication are essential elements of access control. A
hash function such as MD5 gives an irreversible result after many operations, and the
operations are one-way only. The algorithm generates a fixed-size, say 128- or 256-bit, hash
or digest value using the authentication data and a secret key.
2. Only the hash or digest value is communicated. The receiver end receives the value and
compares it with a stored value. If both are equal, then the sender is authenticated.
3. Hash function characteristic features are: pre-image resistance, meaning the hash should
not allow recovery of the pre-image (the original message) before or after communication;
second pre-image resistance, meaning an in-between entity (called an eavesdropper) should
not be able to substitute an altered message that yields the same hash as the one for the
original message; and collision resistance, meaning the hash should not be the same for any
altered form of the message.
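Points 1 and 2 describe computing a fixed-size digest from the authentication data and a secret key, sending only the digest, and comparing it at the receiver. The standard keyed construction for this is an HMAC; the sketch below uses Python's hmac and hashlib with an invented key and message.

```python
# Keyed-digest authentication sketch: only the fixed-size digest travels;
# the receiver recomputes it from the shared secret key and compares.
import hashlib
import hmac

SECRET_KEY = b"shared-device-key"            # exchanged beforehand (invented)

def make_digest(auth_data: bytes) -> str:
    # 256-bit digest, matching the fixed 128/256-bit sizes mentioned above.
    return hmac.new(SECRET_KEY, auth_data, hashlib.sha256).hexdigest()

# Sender side: compute and send the digest with the message.
message = b"device-17|open-valve"
sent_digest = make_digest(message)

# Receiver side: recompute from the stored key and compare.
expected = make_digest(message)
authenticated = hmac.compare_digest(sent_digest, expected)
print(authenticated)   # True; any altered message fails the comparison
```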

Authorisation
1. Access control allows only an authorised device or application/service to access a
resource, such as a web API input, an IoT device, sensor or actuator data, or a URL.
2. Authorisation model is an essential element of secure access control. The standard
authorisation models are as follows:
i. Access Control List (ACL) for coarse-grain access control
ii. Role-Based Access Control (RBAC) for fine-grain access control

iii. Attribute-Based Access Control (ABAC) or other capability-based fine-grain access
control.
An access control server and data communication gateway can be used centrally to control
access between applications/services and IoT devices. The server's central control can be on
a cloud server; each device can access the server and communicate data to another server.
Alternatively, a distributed architecture enables:
iv. Each device to request access from the server, with the server granting an
application/service access token
v. Each application/service to request access from the server, with the server granting a
device access token for the device.
A coarse-grain ACL check (model i) is sketched below.
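In this minimal sketch, an access control server keeps a table of which subject may perform which operation and grants or denies each request; all entries are invented examples.

```python
# Coarse-grain ACL sketch: permissions are listed per subject, and a request
# is allowed only if the ACL contains the requested permission.
ACL: dict[str, set[str]] = {
    "sensor-17":   {"read:temperature"},
    "dashboard-2": {"read:temperature", "read:humidity"},
    "actuator-9":  {"write:valve"},
}

def authorise(subject: str, permission: str) -> bool:
    """Allow the request only if the ACL lists this permission."""
    return permission in ACL.get(subject, set())

print(authorise("sensor-17", "read:temperature"))   # True
print(authorise("sensor-17", "write:valve"))        # False: access denied
```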

Key Exchange and Management


1. The sender's key needs to be known to the receiver for accessing the received data, and
the respondent's key needs to be known to the sender for accessing the responses.
2. The keys, therefore, need to be exchanged before the communication of authentication
codes, authorisation commands and encrypted messages.
3. Since each application/service component and each device data application or service may
need unique and distinct keys, an FC provisions the functions of key management and
exchange.
4. Figure 3 shows the steps for key exchange and management, authentication and
authorisation, followed by secure communication of an application or service message to the
gateway and device.
The steps for a use case of key exchange and of encrypting and decrypting messages are (a
sketch follows the list):
1. Steps 1 and 2: Device/gateway D generates secret key K
2. Steps 3 and 4: Application/service A generates secret key K'
3. Steps 5 and 6: D and A exchange keys K and K'
4. Steps 7 and 8: Authentication and authorisation of D and A
5. Step 9: Message M is given to the encryption algorithm
6. Step 10: The message encrypted using K' is sent to D
7. Step 11: The decryption algorithm decrypts the encrypted message using K' for D
8. Step 12: Message M is retrieved at D
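Steps 9 to 12 can be illustrated with symmetric keys as below. Fernet from the 'cryptography' package stands in for the unspecified encryption algorithm, and the in-memory key "exchange" is a placeholder for a real key-exchange protocol run over a secure channel.

```python
# Sketch of the encrypt/decrypt steps: A encrypts message M with the
# exchanged key K', and D decrypts it to retrieve M.
from cryptography.fernet import Fernet

# Steps 1-4: D and A each generate a secret key.
key_K = Fernet.generate_key()        # device/gateway D
key_K_prime = Fernet.generate_key()  # application/service A

# Steps 5-6: keys K and K' are exchanged (placeholder for a secure channel).
d_knows, a_knows = key_K_prime, key_K

# Steps 9-10: A encrypts message M using K' and sends it to D.
message_M = b"set-thermostat:21.5"
ciphertext = Fernet(key_K_prime).encrypt(message_M)

# Steps 11-12: D decrypts with K' and retrieves M.
retrieved = Fernet(d_knows).decrypt(ciphertext)
print(retrieved == message_M)   # True
```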

Figure 3: Steps during key exchanges, management, authentication and authorisations
followed by secure communication of application/service message to the device/gateway

MD5
https://www.youtube.com/watch?v=S9PMQsbMqUk
https://www.youtube.com/watch?v=Q2H2ndbHUFQ

Authentication and Authorisation of devices


https://developer.akamai.com/iot-edge-connect/authentication-authorization
https://www.keyfactor.com/blog/the-top-iot-authentication-methods-and-options/
https://www.talend.com/resources/what-is-data-integration/#:~:text=Manual%20data%20integration%20is%20simply,combines%20it%20into%20one%20warehouse.

Additional information
Unstructured data cloud storage system structure design
In order to achieve effective management of unstructured data, many domestic and foreign
companies and individuals have carried out a great deal of research. The main management
approaches divide into two: one is the technology-based conversion of unstructured data to
semi-structured data; the other is the conversion of unstructured data to structured data,
with the data eventually stored in a relational database. Unstructured-to-structured data
conversion mostly uses the gradual conversion "unstructured data, to semi-structured data, to
structured data". The structured data obtained through the conversion then gets relational
database storage and management. Based on the project requirements, this gradual conversion
method is used and further expanded: the concept of standard structure members is introduced
to implement general-purpose conversion of file-name data structures, templates are
introduced, the converted file is saved, the file's metadata is extracted, and document
templates and document association tables are created to achieve the association of
unstructured data with structured data, as shown in Figure 4.

Figure 4: Unstructured data cloud storage system structure

The system consists of a database, a file system, template libraries, a file format definition
module, a metadata extraction module, a template creation and management module, an
intermediate data representation module, a data conversion module and other components. The
overall system architecture is divided into three levels: the interface application layer, the
application logic layer and the data storage layer. The interface application layer provides a
graphical interface for user data conversion; through the application interface, users can
perform unstructured-to-structured data conversion operations without having to be concerned
about the specifics of the data conversion.
The program logic layer consists of the five functional modules of the system structure; its
work focuses on implementing the business logic of the unstructured-to-structured data
conversion system. After the interface application layer client obtains the simulation output
file from the file system, it issues a request for data conversion. The application then
receives the request sent by the client and passes the file that needs conversion to the data
conversion module. After the module receives the file, it determines from the file type
classification which program to use for the conversion. The five functional modules then work
together to extract the metadata of the file, establish the appropriate document template, and
implement the unstructured-to-semi-structured data conversion; the processed data is written
to the simulation results table in the database. The application then returns the conversion
result to the user and prompts the user whether to perform the next data conversion,
finalising the whole process of data conversion.
The data storage layer collects the database tables used by the system, such as the document
templates, document association tables and simulation results tables. The document templates
and document association tables need to be created before the system runs. The simulation
results table holds the unstructured file data after its conversion to structured data. After
the data conversion is completed, the system associates the relevant information into the file
table.

MD5: A digest is the irreversible result of a process involving many operations. A standard
algorithm called MD5 (Message Digest 5) is also used for digests, similar to the hash value.
The receiver end stores the digest value expected to be obtained after the MD5 operations,
and compares it with the received value. If both are equal, then the sender's message is
authenticated.
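A minimal illustration of the MD5 digest comparison using Python's standard hashlib follows; the message is invented. MD5 is shown only because the text names it; it is considered cryptographically broken, and stronger hashes such as SHA-256 are preferred in practice.

```python
# MD5 digest comparison sketch: the receiver compares the received digest
# with the one it computes locally to authenticate the sender's message.
import hashlib

message = b"reading=21.5;device=s1"
digest = hashlib.md5(message).hexdigest()   # fixed 128-bit digest

# Receiver end: recompute and compare with the stored/received value.
received_digest = digest                    # as transmitted by the sender
authenticated = hashlib.md5(message).hexdigest() == received_digest
print(digest, authenticated)
```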
