
A Web Application for Sensor Data Collection and

Visualisation
TM470 – EMA – E395923X – Parth Shah

Project Description
Problem Statement
    IoT Landscape
    Core Sensors
    Problems Addressed by the Web Application
    Benefits of Data
    Proposed Solution
    Impacts
Literature and Information Sources
Account of Project Work and its Outcome
    Initial Phase – Deciding Technology Stack
    Application Flow
        Authentication and Security
            Password Hashing
            Session Authentication
            API Key
        Defining Sensor Schema
            Frontend
            Sensor Data Fields
            Nesting
            Sensor UUID
            Input Validation
            Descriptive Error Messages
        Managing Sensors
            Instructions for Submitting Data
            Reusing Schema Definition Component
            Viewing Recent Data
        Sending Sensor Data
            Validation against Schema
            Validation Step One – Convert Schema JSON to Graph
            Validation Step Two – Validate Data Against Schema Graph
            Validation Step Three – Check Schema Graph for Missing Fields
            Descriptive Error Messages
            Data Storage
        Editing Sensor Dashboard
            Dashboard Design
            Linking Data Block with Sensor
            Data Blocks
            Idea for Streamlining the Process of Adding New Data Blocks
        Viewing Sensor Dashboard
            Time Period Based Querying
            Example Visualisation
        Front-End Implementation
    Review of Current Stage of Project Work
        Goals Analysis
        Legal, Social, Ethical, and Professional Issues (LSEPIs)
        Equality, Diversity, and Inclusion (EDI) concerns
Review of Project Management
Resources Used for Development
Review of Personal Development
References
Appendix A – Project Work


Project Description
This project, titled “A web application for sensor data collection and visualisation”, is
based on the development of a web application that allows users to store data
collected from various sensors and visualise it on one central platform through a
dashboard that can be designed with an editor. Through this dashboard, users gain
the ability to not only store their sensor data but also learn valuable insights through
intuitive visualisations.

Problem Statement

IoT Landscape
With the rise of Internet of Things (IoT) devices in recent years (Dahlqvist et al.,
2019), consumers are outfitting their homes with multiple smart-home devices
including thermostats, humidity sensors, motion sensors, and more. These devices
offer enhanced peace of mind and a better quality of life by granting consumers
greater control over their homes, even from thousands of miles away. However,
amidst this rapid adoption, a significant disparity has emerged between the price of
packaged smart-home device solutions and the core sensors that drive their
functionality.

Core Sensors
The heart of many smart home IoT devices is their core sensors: the essential
components that gather data and relay it to a microprocessor, which uses this data
to control the device. For example, in a smart thermostat, the temperature
sensor serves as the core sensor. It collects room temperature data, enabling the
microprocessor to control the central heating system to achieve the desired
temperature setting. These core sensors themselves come at a very low cost, and
their prices have been steadily declining thanks to the rise of IoT device
manufacturers (Dukes, 2018). However, the overall cost of packaged solutions
offered by manufacturers is much greater, and this disparity is to be expected
because the cost to develop packaged solutions consists of many complex aspects,
such as the extensive research and development phase, which is essential for
creating reliable devices that must blend into living spaces, along with the
development of user-friendly applications that facilitate data presentation and
intuitive user control. Moreover, these applications are also designed to ensure
interoperability with other smart home products within the brand's ecosystem, which
further contributes to costs.

Problems Addressed by the Web Application


This web application leverages the affordability of individual core sensors. It
addresses the interest among individuals who wish to gather data from sensors
around their homes and seek insights from the collected data. The target audience
for this web application is tech-savvy individuals who have the skills to wire up
devices using core sensors and a data-sending device capable of transmitting
information to a remote server, which in this case, is the web server of the proposed
application. By doing so, these users can harness the data-capturing potential of
sensors to learn more about their living environments and extract valuable insights.
Benefits of Data
There are many benefits to collecting data from within the home:

1. By visualising home temperature data collected from multiple rooms over a
prolonged period of time, a user can understand which rooms face excessive
heat loss and use this insight to efficiently target the source of the heat loss,
fix it, and save on heating costs.

2. Data collected from humidity sensors can reveal which rooms have high
humidity values and allow users to take preventative action before respiratory
health is affected, or walls are damaged because of mould.

3. Visualising electricity generation data from solar panels can allow users to
identify the different periods of the day that generate the most electricity and
optimally plan their electricity consumption to utilise solar energy and reduce
their dependency on the grid.

Proposed Solution
The proposed solution will consist of an easy-to-use web application that aims to
achieve the following goals as set out from the start of the project:

1. Provides an interface for users to define the schema for each sensor, i.e., the
different fields the data will contain and their corresponding data types. For
example, a temperature sensor will transmit two fields — the time,
represented as a UTC string, and the temperature, represented as a floating-
point number. Enforcing a strict schema will assist in creating consistent
visualisation options for each field the sensor collects data for.

2. Provides an API endpoint for users to push their sensor data to. It is expected
that the application will be used by at least 500 users, and as such, the
endpoint must be able to handle many requests per second.

3. Provides interesting data blocks that can be designed as UI Components,
which can be combined to form the visualisation dashboard. A simple
example of a data block could be a 2D graph chart component, which allows
the user to choose any two fields from the data and display it. For example, a
simple temperature plot would show time on the x-axis and the temperature
on the y-axis.

4. Has an easy-to-use drag and drop editor that allows users to link data blocks
to their sensors and drag and drop them to the editor to form their custom
visualisation dashboard.

5. Provides tools to analyse the data in further detail. At the very least, the user
should be able to view and compare current data overlayed with historical
data.
Impacts
In essence, individuals proficient in technology can leverage the cost-effectiveness
of smart home sensors to gather data from various aspects of their living spaces,
enabling a deeper understanding of factors such as temperature variations, humidity
levels, and electricity generation statistics. The web application allows users to do
this by providing a solution for the challenges associated with collecting, storing, and
analysing data. It achieves this by offering a comprehensive solution that not only
addresses the management of the large volume of data generated by the sensors,
but also provides robust storage and visualisation tools, facilitating a seamless and
insightful exploration of the captured information.

Literature and Information Sources


The table below lists the literature and information sources consulted during the
development of the project.

This source provides an overview of how certain server-side backend frameworks can provide a mechanism for accepting web requests, processing them, and generating an output in the form of an HTML template. It also provides a comparison of popular backend frameworks based on metrics like performance and ease of development. This information was useful in the initial stages of the project when the technology stack was decided. This source can be trusted as it was updated recently and written by the Mozilla Corporation, which is a prominent web technology company.
Source: https://developer.mozilla.org/en-US/docs/Learn/Server-side/First_steps/Web_frameworks

This source provides a comparison between different API architectures — namely SOAP, REST, GraphQL, and RPC — on key metrics like performance and ease of use. This source was useful for the project as it helped in making an informed decision to choose REST as the API style for this application. This source is almost three years old and is written by a software development company, suggesting that it may be slightly biased. However, the quality of the article is very good, and it does not convey any strong points without logical justification.
Source: https://www.altexsoft.com/blog/soap-vs-rest-vs-graphql-vs-rpc/

This source explains the importance of data visualisation and lists many types of visualisation methods. This source also has links to many other data visualisation examples, which was useful when brainstorming about different data blocks to add in the web application. This article is written by a leading company in the data visualisation and business intelligence field, which makes it trustworthy.
Source: https://www.tableau.com/en-gb/learn/articles/data-visualization

This source describes design aspects for drag and drop interfaces and provides an understanding of accessibility issues on smaller screens. It also compares several drag and drop libraries in React, which was useful for the design and development of the drag and drop editor in this web application. This article is a public blogpost, but given the number of positive reactions it has received, it can be deemed trustworthy.
Source: https://blog.prototypr.io/building-a-responsive-drag-and-drop-ui-5761fd5281d5

This article describes the different ways in which sensor data can be stored. It also compares database technologies for storing such data and provides information on how large-scale data can be processed. This was useful in the architectural and development stage of the web application. This source can be trusted as it is written by IBM, a leading technology company.
Source: https://developer.ibm.com/tutorials/iot-lp301-iot-manage-data/

This article explains some rules which good user interfaces follow. The information gained from this article helped shape some design elements that are explained in the Project Work section.
Source: https://www.interaction-design.org/literature/article/user-interface-design-guidelines-10-rules-of-thumb

This article explains common techniques for storing and generating analytics from IoT devices, which inherently generate large amounts of data. This source was very useful in understanding scalable storage and retrieval architectures. Although this article is sponsored by InfluxDB and promotes the use of InfluxDB and their services, the generic information gained from this article regarding storing and analysing large volumes of data has been highly valuable in the development of the application.
Source: https://thenewstack.io/best-practices-to-build-iot-analytics/
Account of Project Work and its Outcome

Initial Phase – Deciding Technology Stack


The initial phase of the project focused on refining the project idea, exploring its
potential use-cases, and researching suitable technologies. This involved
understanding various backend, database, and frontend technologies with the aim of
enabling the web application to meet its initial requirements effectively and be
developed efficiently within the timeframe of the project.

Ultimately, after reading through many sources, the decision was made to choose
NodeJS as the backend runtime, and ExpressJS as the backend framework. The
selection of NodeJS and ExpressJS was based on their performance capabilities
compared to alternatives like Django and PHP, as indicated by Oskow (2022).
Another reason for choosing a JavaScript based backend was my familiarity with the
language, which would help save time on development.

As the web application involves intricate user interfaces, such as the screen allowing
users to define their schema, as well as the drag and drop editor interface, it was
clear that using a frontend framework like React or Angular would significantly
expedite the development process. Using a frontend framework is beneficial due to
features such as simplified state and event management, re-usable components,
and extensive third-party library support, making the development process much
simpler and maintainable. ReactJS was the chosen framework due to its simpler
learning curve and better documentation. Using a JavaScript based frontend and
backend also offers the advantage of having a consistent technology stack
throughout the project, helping in streamlining the development process.

During the initial phase, PostgreSQL was selected as the database technology for
several reasons. Its relational architecture made it an ideal choice for storing the
application's structured data: users, each associated with multiple sensors, which in
turn store a sensor schema object along with the linked data points that are
submitted by the user at regular intervals. Another reason for choosing PostgreSQL
was its support for advanced
data types like JSON, which is used for the sensor schemas.

This decision served as a good starting point. However, it was anticipated that
adjustments would most probably be needed in the future for the storage approach
for the sensor data. This is because the sensor data will grow substantially in volume
as users use the application, and because there will be many queries made to the
sensor data to display it visually, which could lead to performance issues.
Solving this challenge would require careful considerations, and therefore, this
decision was postponed until after completing the core implementation of the
application. This deliberate delay was intended to ensure that a more informed and
well-considered choice could be made regarding how to efficiently store and query
sensor data.
Application Flow
This section explains the various aspects of the application and details their
implementation. The following subsections are structured to facilitate the reader's
understanding of the various components in a sequential order.

Authentication and Security


To enable seamless usage of the web application by multiple users each having the
capability to manage multiple sensors, the implementation of a user account system
was necessary. This allows users to log in to the web application with their email
address and password and have sensors segregated per user.

The development of the user authentication system consisted of creating a
registration webpage and a login webpage, as well as the backend logic for input
validation and making the necessary calls to the database for inserting/reading user
information. Screenshots of the Registration and Login page can be found in Figure
1 and Figure 2 of Appendix A.

Password Hashing
Given that users may store sensitive information as part of their sensor data, it is
important that sufficient security measures are in place to secure a user’s account
from potential compromise. One such security measure implemented in the web
application is the hashing of the user’s password using a one-way hash function.
Instead of storing the plaintext password in the database, the hash of the password
is stored instead. Whenever the user enters their password, the server checks
whether the hash of the entered password is equal to the hash stored in the
database to authenticate the user. Storing hashed passwords instead of plaintext
passwords means that even if the database were to be compromised, an adversary
would not be able to extract the user’s password in plaintext, preventing their
information from being compromised on other sites where the same credentials are
used. The password hashing is implemented using the Argon2 hash with the help of
the “argon2” library in NodeJS. A screenshot showing the hash of the password
stored in the “app_user” database is shown in Figure 3 of Appendix A.
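As an illustration, a minimal sketch of this flow using the “argon2” package is shown below. The database query helper and column names are assumptions for the example, not the project's exact code.

// A minimal sketch of hashing and verifying passwords with the "argon2" package.
// The "db" query helper and column names are hypothetical.
const argon2 = require("argon2");
const db = require("./db"); // hypothetical database helper (e.g. a pg Pool)

// Registration: only the hash of the password is stored, never the plaintext.
const registerUser = async (email, plainPassword) => {
    const passwordHash = await argon2.hash(plainPassword);
    await db.query(
        "INSERT INTO app_user (email, password_hash) VALUES ($1, $2)",
        [email, passwordHash]
    );
};

// Login: argon2.verify re-hashes the entered password with the stored parameters
// and compares the result against the stored hash.
const verifyPassword = async (storedHash, enteredPassword) => {
    return await argon2.verify(storedHash, enteredPassword);
};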

Session Authentication
To prevent users from entering their credentials before every interaction they have
with the web application, it was necessary to implement session authentication
cookies. This involves generating a unique session ID on the server, which is then
linked to the user by saving it in the database alongside the user's ID. Additionally, a
session cookie containing the unique session ID is sent to the user's browser. For
every subsequent request, the user sends back this cookie to the server, and
authentication is established if the unique ID in the cookie matches the one stored on
the server against the user. Implementing session-based authentication was
necessary to securely identify and authenticate users across the app's multiple
screens without requiring login credentials on every screen. This was achieved with
the “express-session” middleware for Express, which stores session data under the
“session” table of the application database. Figure 4 of Appendix A provides further
explanation about a session entry in the database.
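A simplified sketch of how such middleware can be wired up is shown below. The use of “connect-pg-simple” as the PostgreSQL-backed session store, and the specific option values, are assumptions made for illustration; the project's actual configuration may differ.

// Illustrative session middleware setup; store choice and option values are assumptions.
const express = require("express");
const session = require("express-session");
const pgSession = require("connect-pg-simple")(session);
const pool = require("./db"); // hypothetical PostgreSQL connection pool

const app = express();

app.use(session({
    store: new pgSession({ pool: pool, tableName: "session" }), // "session" table
    secret: process.env.SESSION_SECRET, // signs the session ID cookie
    resave: false,              // do not rewrite unchanged sessions
    saveUninitialized: false,   // only create a session once the user logs in
    cookie: { httpOnly: true, maxAge: 1000 * 60 * 60 * 24 } // 1 day
}));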
API Key
The “app_user” table also contains a column that stores the user’s API key. The
API key is a long and unique string that the user must include in the requests to push
sensor data in order to authenticate themselves. This is a common form of
authentication in APIs (RestCase, 2019) and prevents unauthorised entities from
sending data for sensors not belonging to their account.

Defining Sensor Schema

Frontend
The frontend development for the screen where users can define the schema of their
sensors required careful implementation, primarily because of the need to present an
abundance of initial information concisely so that it is easy to understand, and
because of the need to develop an intuitive schema creation interface that supports
up to three nested levels. The initial information consists of (1)
explaining what the term “schema” refers to in this context, (2) explaining the
supported data types, and (3) some information about nested fields should the user
wish to utilise this feature. Figure 6 of Appendix A shows a screenshot of the
interface.

Sensor Data Fields


The sensor's schema consists of its individual fields and their corresponding data
types, which include integer, float, string, boolean, datetime, or an object. Individual
fields can be specified as follows:

More fields can be added by following the thin grey bar on the left of the field to the
bottom and clicking on the blue button with a “+” icon underneath. Individual fields
can be deleted by clicking on the red trashcan icon to the right of the screen.
Nesting
The "object" data type provides users the capability to create nested structures,
enabling more complex and hierarchical data representation within the sensor's
schema.

The task was challenging from a UX perspective, as it is difficult to present
comprehensive information and complex schemas concisely, considering the diverse
possibilities of fields users might create. To address these challenges, the nested
levels were made more prominent by adding incremental left indentation margins for
each level and introducing horizontal stripes to the left of the input boxes to indicate
their nesting level as shown below:

The code responsible for adding, removing, and defining field names and types was
housed within a React component called "DefineSchema". To ensure simplicity in the
frontend code for this screen, React's recursive component feature was utilised when
the "object" field type is chosen, as an object itself can contain multiple fields,
making it a perfect candidate for this recursive approach. As explained further in
Figure 7 of Appendix A, this allowed for a streamlined and efficient implementation of
nested levels.
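The sketch below illustrates the recursive idea in a heavily simplified form. It is not the project's actual DefineSchema component; the prop names, markup, and styling are assumptions.

// Simplified recursive schema editor; prop names and handlers are illustrative.
const FIELD_TYPES = ["String", "Integer", "Float", "Boolean", "Datetime", "Object"];

const DefineSchema = ({ fields, onChange, level = 0 }) => (
    <div style={{ marginLeft: level * 24 }}>
        {fields.map((field, i) => (
            <div key={i}>
                <input
                    value={field.name}
                    onChange={(e) => onChange(i, { ...field, name: e.target.value })}
                />
                <select
                    value={field.type}
                    onChange={(e) => onChange(i, { ...field, type: e.target.value })}
                >
                    {FIELD_TYPES.map((t) => <option key={t}>{t}</option>)}
                </select>
                {/* An "Object" field simply renders the same component again for its children */}
                {field.type === "Object" && (
                    <DefineSchema
                        fields={field.child || []}
                        level={level + 1}
                        onChange={(j, updatedChild) => {
                            const child = [...(field.child || [])];
                            child[j] = updatedChild;
                            onChange(i, { ...field, child });
                        }}
                    />
                )}
            </div>
        ))}
    </div>
);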

Sensor UUID
Each sensor is assigned an ID that is unique among all sensors. This ID string is
used as part of the requests for pushing sensor data. As part of the request, users
authenticate themselves using their API key, and specify the sensor ID of the sensor
for which they intend to push data.
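The report does not detail how the ID is generated; one common approach in NodeJS is shown below as an assumption (the “uuid” package would work equally well).

// Illustrative only: generating a unique sensor ID with Node's built-in crypto module.
const crypto = require("crypto");

const sensorId = crypto.randomUUID(); // e.g. "7f9c2ba4-0e1d-4a6e-9f51-..."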

Input Validation
Validation for user input on the Add Sensor page was implemented through recursive
backend logic that checks whether each specified field conforms to the accepted
data types, which include "string," "integer," "float," "boolean," "datetime," or "object."
Furthermore, thorough checks are implemented to guarantee that each field name is
of appropriate length, and that unique names are enforced for every field respective
to their level. This validation is performed for every field, and each child of each sub-
field, enhancing the reliability and integrity of the application.
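An illustrative sketch of this recursive check is given below. The accepted types and the uniqueness rule mirror the description above, but the function name and the length limit are assumptions rather than the project's actual code.

// Recursively validates a user-defined schema; names and limits are illustrative.
const ACCEPTED_TYPES = ["String", "Integer", "Float", "Boolean", "Datetime", "Object"];

const validateSchemaDefinition = (fields, path = "", errors = []) => {
    const seenNames = new Set();

    for (const field of fields) {
        const fieldPath = path ? `${path}.${field.name}` : field.name;

        if (!field.name || field.name.length > 64) { // length limit is assumed
            errors.push(`${fieldPath}: field name is missing or too long`);
        }
        if (seenNames.has(field.name)) {
            errors.push(`${fieldPath}: field names must be unique within their level`);
        }
        seenNames.add(field.name);

        if (!ACCEPTED_TYPES.includes(field.type)) {
            errors.push(`${fieldPath}: unsupported data type "${field.type}"`);
        }
        // Recurse into the children of "Object" fields
        if (field.type === "Object" && Array.isArray(field.child)) {
            validateSchemaDefinition(field.child, fieldPath, errors);
        }
    }
    return errors;
};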
Descriptive Error Messages
Another feature of the input validation implementation is that it provides
detailed error messages. The implemented logic detects errors at every step and
provides specific error messages linked to the particular field responsible for the
error. This information is then relayed to the front-end, allowing the specific incorrect
field to be highlighted and shown to the user. The approach is particularly beneficial,
given that users may have defined intricate schemas, and vague error messages
would otherwise require significant effort to identify the cause of the problem. An
example of this is shown in Figure 8 of Appendix A.

Managing Sensors

Users also have the ability to view all of the sensors they have added on the web
application. The functionality of managing sensors is implemented through two
screens. The first screen allows the user to view all of their sensors at a quick
glance. By clicking the button denoted by an "eye” icon, users can view detailed
information about that particular sensor. This functionality is useful if the user wants
to revisit the schema definition they provided for a particular sensor, or also if they
want to simply check the sensor’s ID before pushing data. It is important to note that
the user cannot modify the sensor schema after creating it. This is because all
sensor data is validated against the original schema, and changing the schema
could invalidate the data that has already been received earlier. Figures 9a and 9b of
Appendix A explain the two screens in further detail.

Instructions for Submitting Data


At the top of the “Manage Sensors” screen, the user is provided with the contextual
information needed for pushing sensor data to the web application. This information
includes details like how the user can authenticate themselves in the request using
their API key, as well as information regarding what format/structure to provide the
data in. The user can also verify that the information has been successfully relayed
to the web application by checking for an HTTP status code of 200, which indicates a
successful transmission.
Reusing Schema Definition Component
The screen that allows the user to view the schema definition of the sensor they
have added on the web application reuses the SchemaDefinition component, but
with the property “isReadOnly” set to true. Although the SchemaDefinition
component required some changes to support both cases of the “isReadOnly”
property, the changes were minimal with respect to the overall complexity of the
component. Using this approach improves code quality as it significantly reduces
code duplication in the project. Another advantage of this approach is that if changes
are made to the SchemaDefinition component, it will automatically apply on this
screen, making the code easier to maintain.
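For illustration, the reuse might look like the snippet below; apart from “isReadOnly”, the prop names are assumptions.

// Hypothetical usage on the sensor detail screen; only "isReadOnly" is named in this report.
<SchemaDefinition schema={sensor.schema} isReadOnly={true} />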

Viewing Recent Data


Another way to verify if the web application has successfully received the sensor
data pushed by the user is to check the “Recently received data” section on the
sensor page. This section lists the ten most recently submitted data points and acts
as an effective method of troubleshooting common errors faced by the user when
trying to transmit the sensor data. The screenshot below shows the “Recently
received data” section on the “Manage Sensor” page.

Sending Sensor Data

Sensor data can be transmitted to the web application by sending an HTTP POST
request to the “/push-sensor-data” endpoint. The data specified in the request is
validated against the schema definition the user specified when adding the sensor to
the web application. If the data complies with the schema, the data is stored in a
time-series database that allows for quick retrieval.
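An illustrative request from a data-sending device is sketched below. The host, the header used for the API key, and the body field names are assumptions based on the description in this report; the payload mirrors the example schema used later in the validation section.

// Illustrative client-side request; the header and body field names are assumptions.
const pushSensorData = async () => {
    const response = await fetch("https://example.com/push-sensor-data", {
        method: "POST",
        headers: {
            "Content-Type": "application/json",
            "x-api-key": "<YOUR_API_KEY>" // the user's API key
        },
        body: JSON.stringify({
            sensorId: "<SENSOR_UUID>", // the ID of the sensor being pushed to
            data: {
                temperature: 21.4,
                humidity: 0.56,
                metadata: { current_firmware_version: "1.2.0", battery_health: true }
            }
        })
    });
    // A 200 status code indicates the data passed validation and was stored.
    console.log(response.status);
};

pushSensorData();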
Validation against Schema
It is necessary to validate the sensor data against the schema, as downstream
applications like the dashboard data blocks depend on certain fields always bearing
a value and all being of the same type. If the data is not validated, it could cause
downstream applications to not function as intended.

Validation Step One – Convert Schema JSON to Graph


The sensor data is validated by first converting both the JSON data pushed by the
user and the JSON sensor schema into a graph-like data structure that will allow for
easier comparison for validity between the schema and the user-submitted
data. Consider the following JSON sensor schema as an example:

[
    {
        "name": "temperature",
        "type": "Float",
        "child": null
    },
    {
        "name": "humidity",
        "type": "Float",
        "child": null
    },
    {
        "name": "metadata",
        "type": "Object",
        "child": [
            {
                "name": "current_firmware_version",
                "type": "String",
                "child": null
            },
            {
                "name": "battery_health",
                "type": "Boolean",
                "child": null
            }
        ]
    }
]

This sensor schema can be converted into a graph using the depth-first-search
algorithm. The algorithm starts with a list of top-level fields, which in the case of the
example above are the “temperature”, “humidity”, and “metadata” fields. Notice that
the “current_firmware_version” and “battery_health” fields are omitted for now as
they are not top-level fields, but in fact a child of “metadata”.
For each of these top-level fields, the algorithm checks if the field has any children
by checking whether the “child” key in the field object is “null”.

If the field has no children, as is the case for the “temperature” and “humidity” fields,
a Node class is instantiated for these fields but with no children elements. The Node
class requires three constructor parameters: the name of the field, the type of
the field, and the children of the field as a JavaScript object. While an array could
have been chosen for storing the class member for “child”, an object was selected
for its better lookup performance, which will prove beneficial in later stages of the
validation.

In the case of the “temperature” and “humidity” fields, there are no children, so the
Node class is instantiated with an empty object of children. This Node class is then
added to a JavaScript object that stores the entry-point nodes of the graph. Each
node is added to this object under the key of its respective name. It is important to
emphasise that field names are unique within their nesting levels, preventing any
overwrites when indexing by name.

If the field has children, the algorithm will still create a Node class, but this time, it will
pass in an invocation of the graph construction function as the child parameter,
resulting in a recursive process that returns an object representing the field's
children. This recursive approach drills down through the field's children, giving the
algorithm its apt name of “depth-first search”.

The following code excerpt shows an implementation of the recursive depth-first-search
algorithm, returning the top level of nodes of the schema as a JavaScript
object.

// Returns a list of nodes that represent the sensor schema
const constructGraph = (sensorSchema) => {
    let nodes = {};

    for (const key in sensorSchema) {
        const field = sensorSchema[key];
        if (field.child) {
            nodes[field.name] = new Node(field.name, field.type, constructGraph(field.child));
        } else {
            nodes[field.name] = new Node(field.name, field.type, {});
        }
    }

    return nodes;
};

The resulting graph generated from the schema JSON object is visualised below:
Note: Instead of storing the “Entry Point” as a node as the diagram above depicts,
the starting nodes (“temperature”, “humidity”, and “metadata”) are stored in a
JavaScript object.
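The excerpt above relies on a small Node class. A minimal sketch that is consistent with the description (name, type, children stored as an object, plus the “found” flag and error message used in the later validation steps) is shown below; the exact implementation in the project may differ.

// Minimal Node class consistent with how it is used in the validation code.
class Node {
    constructor(name, type, children) {
        this.name = name;
        this.type = type;
        this.children = children; // object keyed by child field name
        this.found = false;       // set once the field appears in the submitted data
        this.error = null;        // set when the submitted value has the wrong type
        this.value = null;        // the submitted value for leaf fields
    }

    setFound(found) {
        this.found = found;
    }

    setError(message) {
        this.error = message;
    }
}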

Validation Step Two – Validate Data Against Schema Graph

Once having converted the schema into a graph data structure, the data submitted
by the user can be checked against the graph. This is achieved through the use of
depth-first search again, but in a slightly different manner to adapt to this problem.
An excerpt of a section of the code responsible for this functionality is shown below:

// Stack of [data, schema] pairs
const dataStack = [];
// List of fields that were not found in the schema
const notFoundFields = [];

// Initialise the stack with the top level fields
for (const key in data) {
    dataStack.push([{ name: key, value: data[key] }, rootChildren]);
}

// Perform a DFS on the data and schema
while (dataStack.length > 0) {
    const [field, schema] = dataStack.pop();

    const node = findNode(field, schema);

    // If the field is not found in the schema, add it to the list of not found fields
    if (node == null) {
        notFoundFields.push(field.name);
        continue;
    }
    // If the field is found, check if the type matches
    if (TypeMapping[node.type] != typeof field.value) {
        node.setError(`Field ${field.name} is of type ${typeof field.value} but should be ${TypeMapping[node.type]}`);
        // Mark the node as found so that it is not marked as missing
        node.setFound(true);
        continue;
    }

    node.setFound(true);

    if (node.type == SensorType.Object) {
        // If the field is an object, add its children to the stack
        for (const name in field.value) {
            dataStack.push([{ name: name, value: field.value[name] }, node.children]);
        }
    } else {
        node.value = field.value;
    }
}

First, a stack (a data structure that resembles an array but limited to
removing/appending an element at the end) denoted by the variable name
“dataStack” is created. This stack is then appended with the initial top-level fields
submitted by the user. From the earlier example, the initial top-level fields would be
“temperature”, “humidity”, and “metadata” assuming the user has submitted the data
correctly. Each entry in the stack is accompanied by an object containing the initial
entry-point nodes, denoted as "rootChildren" in the code, that were obtained in the
previous step of converting the schema JSON into an object of entry-point Node
classes. Together, this forms a tuple pair of the field, and the top-level schema nodes
representing the list of possible options.

Then, a while loop is used on the stack until it becomes empty. For every iteration of
the while loop, the last element of the stack is removed and stored in a variable
denoting the current element to be processed. The current field is checked with its
corresponding possible field options. A helper function called “findNode()” checks for
a match between the field specified by the user and the list of possible options at that
level. If a match has been found, further checks such as checking whether the type
of the field matches the schema are also performed. Error messages related to this
check are stored within the schema Node object itself in order to assist the
functionality of providing descriptive error messages. The schema Node object is
also marked as “found” — this step is crucial for the next validation stage that checks
whether all expected fields have been submitted by the user. The stack is also
appended with the tuple of the field’s children and the schema node’s children.

If, however, the field has not been found, it is added to a special array for fields that
could not be matched with the schema. It is important to store this information and
continue validating the remaining data rather than just declaring the data as invalid
immediately. This is so that helpful error messages can be provided to the user.

Validation Step Three – Check Schema Graph for Missing Fields


In the previous validation step, nodes that were found to be valid in the data
submitted by the user were marked as being “found” by using the “setFound()” setter
function. If all of the data submitted by the user is valid, then all the nodes in the
schema graph must be set to “found”, and no nodes should have specific errors
attached to them. To determine validity, the depth-first search algorithm is used again
to traverse through all of the schema nodes to check if each node has been found
and that there are no errors attached to them.

The final check required to determine that the data is valid is to check if the array of
“notFoundFields” is empty. When both conditions have been met, it can be
concluded that the data submitted by the user is indeed valid.
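A sketch of this final traversal is shown below; the function and variable names are assumptions, and “rootChildren” and “notFoundFields” refer to the objects introduced in the earlier excerpt.

// Recursively collects schema nodes that were never found, or that carry type errors.
const collectIssues = (nodes, missing = [], typeErrors = []) => {
    for (const name in nodes) {
        const node = nodes[name];
        if (!node.found) {
            missing.push(node.name);
        }
        if (node.error) {
            typeErrors.push(node.error);
        }
        collectIssues(node.children, missing, typeErrors);
    }
    return { missing, typeErrors };
};

const { missing, typeErrors } = collectIssues(rootChildren);
// The data is valid only when nothing is missing, no types were wrong,
// and no submitted field failed to match the schema.
const isValid = missing.length === 0 &&
    typeErrors.length === 0 &&
    notFoundFields.length === 0;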

Descriptive Error Messages


The requirement of providing descriptive error messages is important when creating
applications that are user-friendly, and especially more so in this case, as there are
numerous possibilities of errors occurring, making troubleshooting vague error
messages a very difficult task for the user.

To address this concern, this application implements the functionality of capturing
errors in the validation process as they occur and storing them in a manner that
facilitates a clear output describing errors in detail.
As demonstrated in the sensor data submission example above, it becomes
apparent that an issue arises with the "is_heating_on" field. In this instance, the
value is provided as the string "false" within quotation marks, when it should be
specified without quotation marks, as the field is expected to be a Boolean. Further,
there is an issue with the “wind_speed” field that is the child of the “outside_weather”
object. The field is expected to be a float but was incorrectly specified as a string
bearing the value “<INCORRECT_VALUE>”. The sensor data validator picks up on
these issues and returns a clear error output that matches the nesting of the data.
For example, the error associated with the field “wind_speed” is provided under the
“outside_weather” key as “wind_speed” is a child of “outside_weather”, allowing the
user to easily trace this while troubleshooting.
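Based on the error message format used in the validation excerpt, the response for the example described above might look something like the following; the exact shape of the response body is illustrative rather than a verbatim output of the application.

{
    "is_heating_on": "Field is_heating_on is of type string but should be boolean",
    "outside_weather": {
        "wind_speed": "Field wind_speed is of type string but should be number"
    }
}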

Data Storage
As the fields of the sensor data will be queried extensively for the process of
visualisation, the way in which the data is stored directly impacts the usability of the
visualisation feature. Therefore, it is important to store the data in a manner that
facilitates quick retrieval for the application to be scalable and function as intended
for a large number of users.
To achieve this, the decision was made to store the sensor data into a time-series
database, as the visualisations in this web application are dependent on time and the
required data being queried with respect to a specific time period. The chosen time-
series database was InfluxDB, a NoSQL database, primarily due to its popularity and
ease of querying data (Timescale, 2023).

Time-series databases are able to perform better than traditional relational
databases at storing and ingesting the high-volume data generated by sensors.
This is because they often include additional optimisation features, such as the
ability to schedule cron jobs that aggregate data, making aggregate queries faster,
and special optimisation techniques that allow for better data ingestion (Timescale,
2023).

InfluxDB requires data to be stored in buckets, which are designated containers for
data. For this web application, it was decided that each sensor has its own assigned
bucket. Within a bucket, the data for each field of the sensor can be stored as a
“Point”. This term is used to describe a field comprising a key, value, and
timestamp. For this web application, a Point is created for each of the fields and sub-fields
and inserted into InfluxDB. InfluxDB attaches the timestamp of insertion to the
Points, and these timestamps are used when querying data over a time period.
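A minimal write sketch using the official “@influxdata/influxdb-client” package is shown below; the organisation name, measurement name, and field values are assumptions for illustration.

// Illustrative write path; organisation, measurement name, and values are assumptions.
const { InfluxDB, Point } = require("@influxdata/influxdb-client");

const influx = new InfluxDB({
    url: process.env.INFLUX_URL,
    token: process.env.INFLUX_TOKEN
});

const storeSensorData = async (sensorId, data) => {
    // One bucket per sensor, as described above.
    const writeApi = influx.getWriteApi("my-org", sensorId);

    // Each field becomes a Point; InfluxDB attaches the insertion timestamp automatically.
    writeApi.writePoint(new Point("sensor_data").floatField("temperature", data.temperature));
    writeApi.writePoint(new Point("sensor_data").floatField("humidity", data.humidity));
    writeApi.writePoint(new Point("sensor_data").booleanField("metadata.battery_health", data.metadata.battery_health));

    await writeApi.close(); // flushes any buffered points
};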

Editing Sensor Dashboard

The sensor dashboard can be created or edited by clicking on the “Edit Visualisation
Dashboard” on the main dashboard as shown in Figure 10 of Appendix A.

Dashboard Design
On this screen, the user can click on the “Add Block” button to add a data block to
their visualisation dashboard, as shown below:

The initial plan was to create a drag and drop editor, but upon starting to research
drag and drop libraries in React, it became evident that a substantial amount of time
would have been required to implement a drag and drop editor due to the steep
learning curve associated with those libraries. This would have significantly hindered
the completion of the project. Therefore, it was decided to implement a simple grid-
based editor with two columns. Users can add a data block by clicking on the “Add
Block” button and can start editing the block by clicking anywhere on the box, as
shown above.

Linking Data Block with Sensor

When clicking on an uninitialised block, the user is presented with a modal screen
allowing them to link the block to a sensor and its specific fields. The interface is
displayed below:

In this interface, users can define a data block's name, associate it with a specific
sensor, and, for the Line Chart data block, specify the fields to be used as the x and
y-axis data. Utilising React's state management features, the dropdown fields are
revealed progressively, appearing only after the previous selections have been
made. This approach allows for the automatic adjustment of subsequent dropdown
fields based on the choices made previously.

Upon completing the initial setup of a data block, a summary of the user's selections
is displayed within the same container. The user also has the ability to edit the data
block by clicking on the “Edit Block” button. This enables them to make changes to
any aspect of the data block, with the alterations being saved to the database.

Data Blocks

The web application currently only supports two data blocks, both of which utilise the
Chart.JS implementation of Line Chart and Scatter Chart respectively. Chart.JS is an
open-source charting library that has implementations for various chart types. It only
requires a data source, and it is highly configurable in terms of adding designs or
animations to the graphs. The usage of a third-party library for implementing the data
blocks was essential due to the intricate elements of front-end design associated
with creating charts, as well as the performance optimisations that are required to
render the charts — potentially displaying vast amounts of data — in the user’s
browser. Chart.js, with its years of development and refinement, seemed to be a
logical choice to better the overall usability of the web application.
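As an illustration, a Line Chart data block might be rendered along the lines of the sketch below. The use of the “react-chartjs-2” wrapper, and the component and prop names, are assumptions rather than the project's actual code.

// Illustrative Line Chart data block; the react-chartjs-2 wrapper is an assumption.
import { Chart as ChartJS, CategoryScale, LinearScale, LineElement, PointElement, Tooltip } from "chart.js";
import { Line } from "react-chartjs-2";

ChartJS.register(CategoryScale, LinearScale, LineElement, PointElement, Tooltip);

const LineChartBlock = ({ title, points, xField, yField }) => (
    <Line
        data={{
            labels: points.map((p) => p[xField]),   // e.g. the time field
            datasets: [{
                label: title,
                data: points.map((p) => p[yField]), // e.g. the temperature field
                borderColor: "teal"
            }]
        }}
    />
);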

Although the original plan for the web application was to have multiple and more
complex data blocks, pursuing this would have required more time than initially
anticipated. This is because there are various types of data blocks, and each require
their own field mappings and configuration options. For example, a 2D graph
requires only 2 fields — a field for the x-axis, and another field for the y-axis.
However, other complex data blocks like multi-dimensional graphs, heatmaps, or
historical bar charts require more intricate mapping of fields as well as configurable
options. Creating a user interface for users to (1) define these options, and (2) to
accurately map their chosen configurations to the front-end code responsible for
rendering the data block would have amounted to a lot of work, which was not
feasible for the completion of the project. Therefore, to streamline the development
process and to ensure a more manageable project scope, it was decided that only
two data blocks would be implemented while dedicating time to researching and
addressing the infrastructure for adding support for more data blocks on the web
application.

Idea for Streamlining the Process of Adding New Data Blocks

One potential method of streamlining the process of adding new data blocks is to
define a JSON object consisting of the fields of information to collect from the user
via the “Edit Data Block” modal and store it for every sensor.

This JSON object should be used to render the dropdown fields on the modal, and
therefore, the object should contain a list of entries each consisting of:

• The label to display for the field.
• The type of the field, i.e., an input field, a dropdown field, a numeric field, etc.
• The URL of an API endpoint that returns the options of the field if it is a
dropdown.

The JSON object should also contain a list of properties to pass to the React
component handling the display of the data block. If the properties require any
values of the data collected from the user, they should be referenced using a custom
template literal for referencing variables collected from the user.
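For illustration, a definition for the existing Line Chart block might look like the JSON below. The endpoint URLs and the “{{ }}” template syntax are hypothetical choices used only to demonstrate the idea.

{
    "name": "Line Chart",
    "fields": [
        { "label": "Block name", "type": "input" },
        { "label": "Sensor", "type": "dropdown", "optionsUrl": "/api/sensors" },
        { "label": "X-axis field", "type": "dropdown", "optionsUrl": "/api/sensors/{{Sensor}}/fields" },
        { "label": "Y-axis field", "type": "dropdown", "optionsUrl": "/api/sensors/{{Sensor}}/fields" }
    ],
    "componentProps": {
        "sensorId": "{{Sensor}}",
        "xField": "{{X-axis field}}",
        "yField": "{{Y-axis field}}"
    }
}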

By using this architecture, the process of introducing a new data block simplifies to
the following two key tasks:

• The development of a JSON file that outlines the required user input and
properties for the React component.
    o The "Edit Data Block" modal will display the fields specified by the
      JSON object corresponding to the user’s chosen data block.
• The creation of the React component tasked with rendering the data block.

This makes the process of adding a new data block much more structured and
allows for scalability and consistent user experiences. It is important to note that this
architecture has not been implemented in the web application; however, it presents a
promising framework for systematically implementing more complex data blocks in
the future.

Viewing Sensor Dashboard

The sensor dashboard can be viewed by clicking on the “View Visualisation
Dashboard” on the main dashboard as shown in Figure 10 of Appendix A.

Time Period Based Querying

Users can specify a desired time range to view the data on the sensor dashboard.
This allows the limiting of the amount of data that is passed to the frontend. Limiting
the data is essential for reducing the strain on the frontend, which is rendered on the
user’s browser. As the sensors are expected to push vast amounts of data, fetching
all data at once would lead to slower loading times and potentially crash the user’s
browser, making the web application unusable. Allowing users to select a specific
time range ensures that only relevant data is transmitted to the frontend, optimising
performance, and preventing data overload.
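An illustrative time-range query using the InfluxDB JavaScript client is sketched below, reusing the “influx” client instance from the earlier write sketch; the bucket naming, measurement filter, and variable names are assumptions for the example.

// Illustrative query limited to a user-selected time range; names are assumptions.
const queryApi = influx.getQueryApi("my-org");

const fetchDataForRange = async (sensorId, start, stop) => {
    const fluxQuery = `
        from(bucket: "${sensorId}")
            |> range(start: ${start}, stop: ${stop})
            |> filter(fn: (r) => r._measurement == "sensor_data")
    `;
    // collectRows resolves with the matching points as plain JavaScript objects.
    return await queryApi.collectRows(fluxQuery);
};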

While this approach effectively addresses performance concerns, an even more
refined solution involves users being able to specify aggregation configurations,
allowing them to condense the data from extensive time periods into smaller sizes,
while also keeping the amount of data sent to the frontend at a minimum. One
example of an aggregation configuration would be the presentation of weekly or
monthly average values. This would allow users to utilise the visualisation dashboard
for sensor data spanning numerous months — potentially comprising millions of data
points — with ease, as all of the data points for entire weeks or months are
condensed into a single value. Implementing the idea discussed for streamlining the
process of adding new data blocks in the “Editing Sensor Dashboard” section can
pave the way for incorporating aggregate visualisations.

Example Visualisation
An example of a Line Chart and Scatter Chart visualisation for displaying
temperature data against time, and humidity data against time respectively is shown
below:
The data for this visualisation was generated and sent to the web application by
using Python, offering the opportunity to simulate data transmission from sensor
devices to the web application. Using this approach provided a way to validate the
functionality and reliability of the entire web application and test it as it would be used
in a real-world setting.

Front-End Implementation

The front-end of the web application was created with a UI Library called Chakra UI.
This library is available as a package that can be used with React and has nice
implementations of common web elements. These elements are also highly
configurable, facilitating the ability to make quick changes and adapt the style of the
element to match the design of the application. Utilising a UI library saved a lot of
time in development, as it eliminated the tedious process of manually applying CSS
styles to elements and allowed time to be better spent on other important parts of
development, enhancing overall project efficiency.

Several usability features were implemented based on the list of user interface
design guidelines by Wong (2022), including the addition of a global “toast” UI
component — a brief, closable, notification at the bottom of the screen providing
assurance messages to relay to users that their action has been performed. This
design element meets the guideline of “Visibility of system status”. Examples of
some implemented toast messages are shown in Figure 11 of Appendix A.

To adhere to the guideline of "Recognition rather than recall," action buttons
throughout the dashboard and sensor schema definition pages were supplemented
with icons summarising their respective actions. By incorporating icons, users can
quickly recognise the purpose of each button without having to recall its functionality,
thereby reducing cognitive load.

The user interface design elements, including a common teal colour scheme, re-
used UI components, a consistent navigation bar that is present across all screens,
with a link to return to the main dashboard, and the box-shadow CSS
property indicating interactive buttons, collectively satisfy the guidelines of
"Consistency and Standards." The consistent use of the teal colour scheme creates
visual cohesion across the application, while re-using UI components makes each
page feel familiar and easy to use. The presence of a standardised navigation bar on
every screen serves as a reliable anchor point for orientation when the user wants to
return to the dashboard. Finally, the increase of the box-shadow CSS
property provides visual feedback for interactive buttons when hovered over. By
adhering to these principles, the interface promotes a coherent and intuitive
experience, reducing user confusion.

Review of Current Stage of Project Work


At its current stage, the project has reached a significant milestone. The web
application has been successfully developed and stands as a robust proof of concept
for the idea of collecting and visualising sensor data. Although not every initial goal
has been fully achieved, the current iteration of the web application represents a
foundational version of the overall concept. The shortcomings have been identified
and innovative solutions to address them have been suggested.

Goals Analysis

This section revisits the initial goals of the project and assesses whether they have
been achieved.

1. “Provides an interface for users to define the schema for each sensor, i.e., the
different fields the data will contain and their corresponding data types. For example,
a temperature sensor will transmit two fields — the time, represented as a UTC
string, and the temperature, represented as a floating-point number. Enforcing a
strict schema will assist in creating consistent visualisation options for each field the
sensor collects data for.”

This goal has been fully achieved. Users can define their sensor schemas,
with support for up to three nested fields. This schema allows for various
variable types, and the sensor data is strictly validated against the schema.

2. “Provides an API endpoint for users to push their sensor data to. It is expected that
the application will be used by at least 500 users, and as such, the endpoint must be
able to handle many requests per second.”

This goal has been partially achieved. Although the web application has been
implemented in a way that is performant and can efficiently process and store
the sensor data, other techniques such as horizontal scaling, involving the use
of additional web servers, and the use of message queues such as Apache
Kafka, would be required to scale the web application to handle many
requests per second.

3. “Provides interesting data blocks that can be designed as UI Components, which can
be combined to form the visualisation dashboard. A simple example of a data block
could be a 2D graph chart component, which allows the user to choose any two
fields from the data and display it. For example, a simple temperature plot would
show time on the x-axis and the temperature on the y-axis.”

This goal has been partially achieved as only two simple types of data blocks
have been implemented. The main issue faced in fulfilling this goal was the
complex adaptations required to the frontend and backend code to add a data
block. However, a solution for this problem has been identified in the
subsection titled “Idea for Streamlining the Process of Adding New Data
Blocks”.

4. “Has an easy-to-use drag and drop editor that allows users to link data blocks to their
sensors and drag and drop them to the editor to form their custom visualisation
dashboard.”

This goal has not been achieved due to time constraints, but a suitable
alternative has been implemented. This alternative uses a 2 x 2 grid, where
users can click the “Add Block” button to add a data block to the 2 x 2 grid.
From a usability and accessibility perspective, this is a much better solution.
However, the current implementation may be too restrictive for users wanting
to customise their dashboards further, and therefore a future iteration that
would not be too challenging to implement would be to allow the user to
customise the width and height of each block, allowing the user to create
custom dashboard layouts.

5. “Provides tools to analyse the data in further detail. At the very least, the user should
be able to view and compare current data overlayed with historical data.”

This goal has not been achieved. However, as detailed in the analysis for
Goal 3, a solution for this problem has been identified in the subsection titled
“Idea for Streamlining the Process of Adding New Data Blocks”.

Legal, Social, Ethical, and Professional Issues (LSEPIs)


There are several legal, social, ethical, and professional issues that arise
from the use of the web application. Since users could store sensitive data on
the web application through their sensors, for example a heart rate sensor
that could reveal the user’s health status or daily routines, one major
concern is that this sensor data could be accessed by a malicious adversary,
resulting in a significant breach of the user’s privacy and of the security of
the web application.

Identifying this issue early in the project led to a deeper consideration of
the security aspects of the web application, as it is the responsibility of
the developer of such systems to implement sufficient protection when dealing
with sensitive user data. As a result, a user authentication system was
implemented, so that a user’s sensors and their corresponding data are
accessible only to the user who holds the password to their account.

To strengthen this form of protection, rules enforcing password strength were
also implemented on the backend, making passwords more resilient to
brute-force attacks, which have a higher probability of success against
shorter, weaker passwords. Further, locked routes were implemented in the web
application using the “react-router” library. These require users to be
authenticated before they can view webpages containing sensitive information,
such as the page listing the user’s sensors, the visualisation dashboard, and
the most recently received sensor data. A minimal sketch of such a locked
route is shown below.
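
The sketch below shows one way such locked routes can be expressed with react-router version 6; it is not taken from the project code, and the useAuth hook and page components are hypothetical stand-ins for the application's real ones.

```js
// Minimal sketch of locked routes with react-router v6. useAuth() and the
// page components are hypothetical; the real session check may differ.
import { Routes, Route, Navigate, Outlet } from 'react-router-dom';
import { useAuth } from './auth';                                 // hypothetical auth hook
import { LoginPage, SensorsPage, DashboardPage } from './pages';  // hypothetical pages

function RequireAuth() {
  const { user } = useAuth();
  // Unauthenticated visitors are redirected to the login page instead of
  // seeing pages that contain sensitive sensor information.
  return user ? <Outlet /> : <Navigate to="/login" replace />;
}

export default function AppRoutes() {
  return (
    <Routes>
      <Route path="/login" element={<LoginPage />} />
      <Route element={<RequireAuth />}>
        <Route path="/sensors" element={<SensorsPage />} />
        <Route path="/dashboard" element={<DashboardPage />} />
      </Route>
    </Routes>
  );
}
```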

Another concern relates to data control and ownership. Users trust the web
application to store their data, often without clearly understanding how the
platform hosting the data may use it. It is therefore important to give users
relevant controls over their data and to inform them clearly about how it is
used. Although this has not been addressed in the web application due to time
constraints, a potential mitigation would be to allow users to easily request
a copy of all of the data the web application stores about them, and to
request the deletion of the sensor data entries they have sent, giving them
complete control over their data. A sketch of how such an export control
could look is given below.
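
The following sketch illustrates the proposed “request a copy of my data” control only; it is not part of the implementation. The two query helpers are placeholders standing in for reads from PostgreSQL and InfluxDB, and the route path is an assumption.

```js
// Sketch of a data-export endpoint. The query helpers are placeholders; the
// real application would read sensor definitions from PostgreSQL and readings
// from InfluxDB for the logged-in user.
const express = require('express');
const router = express.Router();

async function fetchSensorDefinitions(userId) { /* PostgreSQL query (placeholder) */ return []; }
async function fetchSensorReadings(userId) { /* InfluxDB query (placeholder) */ return []; }

router.get('/api/me/export', async (req, res) => {
  // Assumes the session middleware has already populated req.session.userId.
  const userId = req.session && req.session.userId;
  if (!userId) return res.status(401).json({ message: 'Not authenticated' });

  const [sensors, readings] = await Promise.all([
    fetchSensorDefinitions(userId),
    fetchSensorReadings(userId),
  ]);

  // Offer the bundle as a downloadable JSON file so the user holds a complete
  // copy of the data the application stores about them.
  res.setHeader('Content-Disposition', 'attachment; filename="my-data.json"');
  res.json({ sensors, readings });
});

module.exports = router;
```
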
Equality, Diversity, and Inclusion (EDI) Concerns
A primary concern relating to equality, diversity, and inclusion is the
accessibility of the web application. Accessibility is fundamental to ensuring
that the web application can be used easily by individuals from diverse
demographics and with different abilities. To address this, efforts were made
to follow the guidelines outlined in the WCAG (Web Content Accessibility
Guidelines) 2.1 during development. This includes the use of well-contrasted
text, typically black text on a white background or white text against a dark
teal backdrop, both of which exceed the required contrast ratios.

Furthermore, the application follows Guideline 3.3 of WCAG 2.1, “Input
Assistance”, by providing descriptive error messages on the sensor definition
screens as well as in the API responses of the sensor data endpoint. These
error messages tell users which specific field caused the issue and clearly
explain the problem, helping them understand the error and take corrective
action to resolve it. This is complemented by frontend error cues, such as
outlining fields containing errors in red, which guide users towards
correcting them, thereby enhancing accessibility and usability for all users.
An illustrative sketch of such a field-level error response is given below.
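
The application's exact response format is shown in the appendix figures; the sketch below, assuming an Express route, merely illustrates the general idea of field-level error messages that the frontend can map onto individual inputs. The route path, field names, and wording are assumptions.

```js
// Illustrative only: a validation response whose entries name the offending
// field and explain the problem, so the frontend can outline that input in
// red and show the message beneath it (WCAG 2.1 Guideline 3.3).
const express = require('express');
const app = express();
app.use(express.json());

app.post('/api/sensors', (req, res) => {
  const errors = [];

  if (typeof req.body.name !== 'string' || req.body.name.trim() === '') {
    errors.push({ field: 'name', message: 'The sensor name cannot be empty.' });
  }
  if (!Array.isArray(req.body.fields) || req.body.fields.length === 0) {
    errors.push({ field: 'fields', message: 'At least one schema field must be defined.' });
  }

  if (errors.length > 0) {
    // Return every problem at once so the user can correct the form in one pass.
    return res.status(400).json({ errors });
  }

  // ... create the sensor ...
  return res.status(201).json({ message: 'Sensor created' });
});
```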

Additionally, the application follows Guideline 3.1, which requires
informational text to be both readable and comprehensible. This is achieved by
providing clear and concise instructions, particularly within more intricate
user interfaces such as the manage sensor page and the schema definition page.
These instructions use plain language presented in a way that makes the
important information stand out, through bold formatting for key information
and deliberate line breaks that separate different pieces of important
information, further enhancing the user experience and meeting accessibility
standards.
Review of Project Management

For this project, an agile methodology was adopted, which involves breaking
the project down into smaller, manageable subtasks and iterating on them in
sprints. Subtasks are defined in one or two short lines of text describing the
work to be completed. This terse approach was used to avoid spending too much
time on planning and task management at the expense of development time.

The use of a Kanban board has been instrumental in visualising and organising
these tasks, enabling tasks and sub-tasks to be worked on in a systematic
manner. This approach allows for flexibility, as the order in which tasks are
tackled can be freely chosen. Tasks most closely related to the one most
recently worked on can be picked up next, making it easier to transition from
completing one task to starting another, reducing friction and saving time in
the development process. GitHub was used not only for source control but also
for its Kanban board functionality, as pictured below:

However, the original project plan had certain limitations. The initial plan was
relatively coarse, and some assumptions made at the beginning proved to be naive
as the project progressed, such as the implementation of a drag and drop editor.
Consequently, the plan became outdated and, over time, fell out of use
entirely. Instead, a mental plan was relied upon, and this hindered
productivity due to the lack of clarity regarding the product roadmap. At that
point, the focus was primarily on immediate priorities, which led to too much
time being spent on a single area. As a result, the initial plan was later
revised based on the knowledge and experience gained from working on the
project, leading to a more controlled project management experience, as the
Kanban board could be filled with more relevant tasks aligned with the revised
goals of the project.

In hindsight, the Kanban board, and especially the decision to keep task
descriptions terse, proved to be a good project management tool because it
made adding new tasks quick and easy, and therefore less likely to be put off
and eventually forgotten. As a result, I developed a habit of creating tasks for
every bug encountered during development, as well as for any spontaneous ideas I
had regarding new implementations. This habit greatly enhanced the project's agility
and responsiveness and enabled a more efficient way of addressing issues and
exploring innovative ideas.

While an agile project management approach was initially chosen, it soon
became evident that the project's lifecycle began to resemble a more
traditional waterfall style. This was primarily driven by the
interdependencies between various modules within the project. For instance,
the Edit Dashboard page requires the sensor schema work to be complete before
it can retrieve the data fields that users associate with a data block.
Another strong dependency arises in the Manage Sensors page, which could only
be developed effectively once the Sensor Definition page was fully in place.
These significant interdependencies were not anticipated when the project
began.

Resources Used for Development


The following table details the technological resources used in the
development of the web application:

Resource | Description | Source
Node JS | The JavaScript runtime used for the backend. | https://nodejs.org/en
Express JS | The backend web framework. | https://expressjs.com
React JS | The frontend framework. | https://react.dev
React-Router | The library for implementing routes (different screens) within the single-page React application. | https://reactrouter.com/en/main
PostgreSQL | The primary database technology used to store information about the user, the sensor schema, and the dashboard. | https://www.postgresql.org
InfluxDB | The time-series database used to store sensor data, making it easier to query. | https://www.influxdata.com
Chakra UI | The UI library used to keep the application's styling consistent and to save time on tedious CSS adjustments. | https://chakra-ui.com
Chart JS | The library used to implement the presentation of the two data blocks, Line Chart and Scatter Chart. | https://www.chartjs.org
Postman | The tool used during testing to push data to the API endpoint of the web application. | https://www.postman.com
Python | The language used to generate sample sensor data. | https://www.python.org

Review of Personal Development


The undertaking of this project has been a great experience of personal growth for
me on multiple levels. This project is the single largest and most structured task I
have ever undertaken, and throughout its course, I have honed and improved my
abilities in project management, time management, software development, and
research, all of which have contributed to my overall personal growth and
competence.

One of the most important project management skills I have learned is
adaptability. It is very difficult to plan the individual steps of a project
exactly, as new insights and experience gained while working on it inevitably
lead to necessary changes. Plans must be adapted when things do not work out,
so that the goals of the project can still be met. This happened several times
during this project; however, the adaptations made helped carry it towards
completion.

Another area of personal growth is software development and, more
specifically, database technologies. Working on this project has allowed me to
gain experience with traditional database technologies including PostgreSQL,
as well as newer NoSQL time-series databases like InfluxDB. I have also gained a deeper
understanding of the challenges faced when storing and querying large volumes of
data, such as sensor data.

Throughout this project, I have identified that my productivity and learning efficiency
are maximised when I allocate extended periods of focused attention to a single
task, as opposed to juggling multiple tasks simultaneously or working for short
periods of time. Working on one task for a long period of time helps me work more
effectively as it immerses my mind completely in the task at hand, helping in
generating creative ideas and quickly finding solutions to encountered problems.

To effectively extend this project and fully achieve its goal of providing
more complex visualisation options, I would need to improve my skills and
knowledge in developing highly configurable systems. This would include
improving my ability to design and implement adaptable frontends that allow
users to customise and configure their visualisations according to their
specific needs, as well as expertise in data modelling and structuring to
support the implementation of complex data blocks.
Appendix A – Project Work

Figure 1: Screenshot of the account registration page.

Figure 2: Screenshot of the login page showing an example of input validation.


Figure 3: The highlighted cell shows the Argon2 hash generated from the user-
inputted password. Only the hash is stored in the database.

Figure 4: The highlighted row shows a session entry consisting of the mapping
between the session ID, denoted as “sid”, and the user ID which is embedded in the
“sess” JSON field under the “userId” key.
Figure 5: A screenshot of the “Add new sensor” screen, showcasing the
information box as well as the schema creation interface, which allows users
to specify a schema that also supports nested fields.
Figure 6: The schema validation process carefully checks each field entered by the
user, and it accurately associates errors with the fields that caused them. As a result,
the frontend can easily highlight the particular field where an error occurred.
Figure 7: The recursive DefineSchema component. When the chosen field type is
“Object”, as specified by the condition on Line 158, the React component generates
a new instance of itself, with the nesting level property being incremented by 1 (line
172). Starting from line 162 onwards, it becomes necessary to establish a custom
field state update function for this recursive component. In essence, this function is
responsible for updating the fields within the parent field's context whenever a
change occurs within the Object schema.
Figure 8: The input validation logic pinpoints the field(s) that caused an error in
saving the data. The frontend uses this information to highlight the specific field
causing the error in red and displays the corresponding error message under the
field.
Figure 9a: The screen displays all of the user's sensors and their
corresponding sensor IDs, accompanied by a button that allows the user to
access a page with more detailed information about the sensor. At the top of
the screen, instructions provide the contextual information the user needs in
order to understand how to push data for their sensor.
Figure 9b: When clicking on the button denoted by an “eye” icon, the user can view
the schema definition for the sensor.

Figure 10: A screenshot of the main dashboard of the web application.


Figure 11: Examples of “Toast” messages displaying errors to the user.
