
Data Security: Unstructured Data, Storage and Databases

ISSD: Chapters 8, 11 & 12, Information Security: The Complete Reference


|
Structured and Unstructured Data
• Structured data conforms to a strict data model and is confined by that model. The model defines
business processes that control the flow of information across a range of service-oriented
architecture (SOA) systems. E.g. it might define how data is formatted before storage or before
being rendered. In most cases structured data “lives” in a database and is organized based on the
DB schema and associated database rules.

• Unstructured data refers to information that either does not have a pre-defined data model or is
not organized in a pre-defined manner. It is most often categorized as qualitative data.

|
Security for Structured Data
• Databases reside within a data centre that is surrounded by brick walls, metal cages,
network firewalls, and other security mechanisms that allow you to control access to the
data.
• The data itself is structured in a manner that typically allows for easy classification of the
data. For example, you can identify a specific person’s medical record in a database and
apply security controls accordingly.
• Since structure and location of data is known, it is possible to have tight control over who
can access it. The security controls are relatively easy to define and apply to structured
data using either the built-in features of the structure or third-party tools designed for the
specific structure.

|
Unstructured Data

• Unstructured data may exist anywhere, in any format, and on any device, and can move
across any network. Consider a patient record that is extracted from the database,
displayed on a web page, copied from the web page into a spreadsheet, attached to an
e-mail, and then e-mailed to another location. Compared with structured data, unstructured
data is more difficult to manage and secure because of its loosely defined formats.

• Suppose that, for the same patient record, a user alters the contents (maybe removing
certain fields) before copying the data from the web page into the spreadsheet, and then
modifies the spreadsheet further (say, making the headers more descriptive). As this
information flows from one format to another, its original structure is effectively changed.

|
Unstructured Data
The key areas where unstructured
data can reside include:
• Databases
• Applications
• Networks
• Computers
• Storage
• Physical world (printed documents)

|
Databases as Sources of Unstructured Data

• The database is the center of the data world. The majority of information you are
trying to secure was either created and inserted into, is stored in, or has been
retrieved from a database.

• The database was once considered the realm of structured data, but with new
developments in database technology, increasing amounts of unstructured data
are now stored in the database. E.g. the storage of an application’s images,
videos, and other unstructured data

|
Databases as Sources of Unstructured Data

• In its most basic form, the database is accessed over a network and a query is run against
the database service.
• This causes a database process to run and access the data store to retrieve the queried
data, which is then piped back over the network.
• The data store can also export data into backups that are restored on development systems
or staging environments.
• Unstructured data can therefore reside in different areas of the database: at rest in the
schema in the database data files, in backups, or sometimes exported to other development
or staging databases.

|
Security of Unstructured Data in Databases
1. Encryption of Data at rest in a DB:

– Encryption of data itself such that it is stored in normal data files in an encrypted state. The database doesn’t necessarily
know (or care) whether or how the data is encrypted, so it passes the encrypted data to the application to decrypt.

– Partial encryption of the database schema so that specific rows, columns, or records are encrypted as a function of the
storage of the data. In this case the database handles the encryption of the data and decrypts it for the application.

– Full encryption of the database data files such that any information that resides in them is encrypted.

Depending on the method of encryption applied, database exports and backups can be protected without additional technology. Often,
data that needs to be used in development environments can be declassified by using data masking technologies.

2. Access control: credentials used to authenticate and provide authorization to access data can either be stored within the
database platform or reside within an external identity directory.

3. Securing Data Exports and Export points: Encryption can be applied at the export phase. The mechanism is different from that
used for the encryption of data at rest. A data export may be accompanied by a passphrase that serves as the key for one-time
encryption of that specific export. This allows sets of data to be protected in transit, because the encrypted export and the
passphrase are shared separately with the user importing the data.
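
The data masking mentioned above for development environments can be sketched as follows. This is only a minimal illustration, assuming a keyed hash is an acceptable masking method; the key, the field names and the `tok_` prefix are hypothetical and do not come from any particular masking product.

```python
import hashlib
import hmac

# Hypothetical secret held only by the masking job; it is never copied to dev systems.
MASKING_KEY = b"example-masking-key"

def mask_value(value: str, key: bytes = MASKING_KEY) -> str:
    """Replace a sensitive value with a stable token that cannot be reversed without the key."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]

def mask_record(record: dict, sensitive_fields: set) -> dict:
    """Return a copy of the record with only the sensitive fields masked."""
    return {
        field: mask_value(str(value)) if field in sensitive_fields else value
        for field, value in record.items()
    }

# Hypothetical patient row exported for a staging database.
patient = {"id": 17, "name": "Jane Doe", "national_id": "12345678", "ward": "B2"}
masked = mask_record(patient, {"name", "national_id"})
```

Because the same input always yields the same token, joins between masked tables still work, while the original values stay out of the development copy.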

|
Applications as Sources of Unstructured Data
• Unstructured data is typically created through user activity on workstations, or as applications access and manipulate
structured data and reformat it into a document, e-mail, or image. The number of applications is growing; with cloud
development platforms, it is now a relatively trivial task to collect data from a wide variety of sources and
consolidate it in a different application.

• Application security can be categorized into the following groups:

– Application access controls that ensure an identity is authenticated and authorized before it can view protected data
through the application

– Network and session security to ensure the connection between the database, application, and user is secure

– Auditing and logging of activity to provide reporting of valid and invalid application activity

– Application code and configuration management that ensure code and changes to the application configuration are secure

• Securing applications is one of the most important ways to protect data, because applications are the interface between the end
user and the data; a great deal of the security investment is devoted to the development of an application

|
Networks as sources of Unstructured Data
• As data moves from the protected realm of the database into the application and on to the end
user, the communication occurs via a local process or via a network. End users are rarely on the
same network as the application and server, hence the need to secure the network.

• Some of the technologies designed to secure unstructured data on the network include:

– Network security technologies that are able to analyse traffic and detect threats. Network
intrusion prevention and detection systems actively monitor the network for malicious activity
and, upon detection, prevent intrusion into the network or alert on successful intrusions.

– Malware protection technologies that prevent Trojans from deploying and planting back doors on
your trusted network clients. The newest polymorphic advanced persistent threats (APTs) can
be blocked using solutions that detect illegitimate traffic and deny it.

|
Computers as sources of Unstructured Data
• Once a legitimate user has securely connected across the network to the application to access data residing in the database, the
information is ultimately presented in a web page on the client side. The user can then move the data into unstructured files such
as PDFs or Excel spreadsheets, or download and store the data on a local unprotected drive.

• Therefore, the security of the device from which the user interacts with the application and resulting unstructured content
becomes critical. Servers are usually limited in number and physically under your control, or at least under the control of your
cloud services provider. Networks and their gateways are also limited in number and usually within your control. But end-user
computers may number in the hundreds or thousands and may often be beyond your security control.

• Furthermore, those computers may be running a large number of platforms, a wide variety of OS versions, and a wide variety of
software, and may be used by a wide variety of people. Within the context of the security of unstructured content on computers,
the following areas may be considered:

– User Control: only legitimate users can access the computer (identity access control)

– Ports and peripherals control: Controlling the flow of information over local interfaces and connection points (USB, DVD, etc.)

– Securing data residing at rest on the computer

|
DLP for Security of Unstructured Data
• Data loss prevention (DLP) refers to a set of tools and processes designed to monitor, discover, and
protect data. They are broader in scope, more data-centric, and less platform dependent. DLP solutions
can typically be broken down into three types:
– Network DLP: Usually a network appliance that acts as a gateway between major network
perimeters (most commonly between your corporate network and the Internet). Network DLP
monitors traffic that passes through the gateway in an attempt to detect sensitive data and do
something about it, typically block it from leaving the network.
– Storage DLP: Software running either on an appliance or directly on the file server, performing the
same functions as network DLP. Storage DLP scans storage systems looking for sensitive data.
When found, it can delete it, move it to quarantine, or simply notify an administrator.
– Endpoint DLP: Software running on endpoint systems that monitors operating system activity and
applications, watching memory and network traffic to detect inappropriate use of sensitive
information.
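
The "detect sensitive data" step shared by all three DLP types can be illustrated with a small content-inspection sketch. It assumes that payment card numbers are the sensitive data of interest and uses a regular expression plus the Luhn checksum to cut down false positives; the function names are hypothetical, and real DLP products use much richer detection rules.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right and sum the results."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        n = int(char)
        if index % 2 == 1:
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0

def find_card_numbers(text: str) -> list:
    """Return the digit strings in the text that look like valid card numbers."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

A network DLP gateway would run a check like this on outbound traffic, a storage DLP scanner on files at rest, and an endpoint agent on data about to be copied or sent.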

|
DLP for Security of Unstructured Data
• Network, storage, and endpoint DLP are often used together as part of a comprehensive DLP solution to meet
some or all of the following objectives:

– Monitoring: Passive monitoring and reporting of network traffic and other information communication
channels such as file copies to attached storage

– Discovery: Scanning local or remote data storage and classifying information in data repositories or on
endpoints

– Capture: Storage of reconstructed network sessions for later analysis and classification/policy refinement

– Prevention/blocking: Prevention of data transfers based on information from the monitoring and discovery
components, either by interrupting a network session or by interacting with a computer via local agents to stop
the flow of information

• DLP solutions may comprise a mixture of the above, and almost all DLP solutions leverage some form of
centralized server where policies are configured to define what data should be protected and how.
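
The idea of centrally configured policies that define what data is protected and how can be sketched as a simple lookup. The classifications, channels and actions below are hypothetical examples, and the default of allowing unmatched transfers is an assumption made for this sketch, not a recommendation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    classification: str  # label assigned during discovery, e.g. "confidential"
    channel: str         # where the data is going, e.g. "email" or "usb"
    action: str          # "allow", "alert" or "block"

# Hypothetical policy set, as it might be configured on the central DLP server.
POLICIES = [
    Policy("confidential", "email", "block"),
    Policy("confidential", "usb", "block"),
    Policy("internal", "email", "alert"),
]

def decide(classification: str, channel: str, policies=POLICIES) -> str:
    """Return the action of the first matching policy, or "allow" if none matches."""
    for policy in policies:
        if policy.classification == classification and policy.channel == channel:
            return policy.action
    return "allow"
```

Network gateways and endpoint agents would both consult the same policy set, which keeps monitoring and blocking decisions consistent across channels.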

|
Holistic Approach to security of Unstructured data
• Identify critical unstructured information assets: The value of information assets depends on nature and relevance
at a point in time. Profiling what constitutes vital information assets, and the time at which they are relevant is an
important step. While it may not be possible to create a register for this information, employees can be made to
comply with information classification policies and specific mechanisms can be set up to secure unstructured
data.

• Identify users that possess critical unstructured data: Vital information assets are not in the possession of many
employees. Critical unstructured data usually remains with a few employees. Identification of employees who hold
vital data can help with targeted security safeguards. This will be more effective than a generic widespread security
program.

• Implement separate technology and process controls to protect data assets: Technical and procedural safeguards
for workstations can be enhanced for the identified key employees. Safeguards to secure unstructured data should
be a combination of procedural measures to reduce the exposure of information, and technical countermeasures
such as drive encryption, encrypted communication channels or DLP or email monitoring

|
Holistic Approach to security of Unstructured data
• Set up a reporting mechanism to identify suspicious transactions: Suspicious events should be
carefully examined for the possibility of information leaks, and for determining if a certain employee is
a common factor in more than one such occurrence. A confidential reporting mechanism backed by
clear processes may also help in identifying suspicious behaviour in employees.
• Design training programs for targeted employees: Many employees do not understand risks that are
unseen or occur over long periods of time. Develop training programs that deal specifically with
security risks related to unstructured information.
• Create a culture where it is not improper to refuse information on a need-to-know basis: Top
management should lead in creating a security-aware culture in the enterprise. Information should be
shared and circulated only on a need-to-know basis to reduce unauthorized disclosure.
• Iterative Audit: Review data for governance and compliance for unstructured data as per regulatory
standards.

|
Storage Security

|
Storage Architecture
• Primary storage is composed of a storage device such as
a NAS appliance or a storage array.
• The contents of the storage components are managed
and served via a server infrastructure, with an operating
system that is compatible with servers and workstations
in the end-user environment (e.g. EMC’s Hypermax or
Dell’s SCOS).
• The network connections between the primary storage
and the storage servers should be independent of the
corporate IP network, because the communications that
take place on these local connections are internal and
don’t require access from the rest of the network—they
are specialized in their functionality.

|
Storage Architecture
• Offsite storage is primarily used to replicate and
back up data from a local storage device or
facility to a storage facility located away from the
client's main premises.

• The key objective behind offsite storage is to
maintain backup copies of data, in case the
primary site is offline, unavailable or is destroyed.

• The data is usually transferred through a secure
VPN or Internet connection (FTP).

• Securing storage components must take into
account the three primary components:
– Storage networks
– Arrays
– Servers

|
Storage Networks
• Separation of duties should be applied within the storage
infrastructure. Since all storage devices are connected physically,
separating access to the physical servers prevents a storage
administrator from connecting a rogue server into the
environment and then provisioning it access to restricted logical
unit numbers (LUNs).

• A LUN is the mechanism an array uses to present its storage to a
host operating system. Likewise, while someone may connect a
server to the environment and configure it, methods of protecting
the LUNs are applied so that the server cannot gain access to
restricted LUNs.

• Isolating data traffic between LUNs via the switch is accomplished
through the use of zoning, which is comparable to virtual LANs (VLANs) in
the network world.

|
Storage Networks
• Port Zoning: the accessibility of the host to the LUNs is defined by
the switch port. The advantage to zoning in this manner is that an
intruder cannot connect a host to the switch, spoof a good name,
and access LUNs of another host. Since the protection is enforced
on the port interface, the intruder would need to disconnect the
good host interface and connect the intruding host into the
defined port.

• WWN Zoning: an alternative to port zoning. Instead of creating
zones relative to the switch ports the servers are connected to,
WWN zoning defines each zone based on the WWN ID of the host bus
adapter (HBA). The WWN is very much like the MAC address of a
network card. It is a 16-digit hexadecimal number that uniquely
identifies the HBA within the SAN fabric.
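
The difference between the two zoning styles can be shown with a toy model. This is a sketch under stated assumptions: the zone name, port number and WWN value are made up, and a real switch enforces zoning in the fabric rather than in application code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Login:
    switch_port: int  # physical switch port the host is plugged into
    wwn: str          # WWN claimed by the host's HBA

# Hypothetical zone definitions for a zone named "finance".
PORT_ZONE = {"finance": {4}}                          # allowed switch ports
WWN_ZONE = {"finance": {"10:00:00:00:c9:2b:5f:01"}}   # allowed HBA WWNs

def port_zone_allows(login: Login, zone: str) -> bool:
    """Port zoning: access depends only on the switch port."""
    return login.switch_port in PORT_ZONE.get(zone, set())

def wwn_zone_allows(login: Login, zone: str) -> bool:
    """WWN zoning: access depends only on the WWN the HBA presents."""
    return login.wwn.lower() in WWN_ZONE.get(zone, set())

good_host = Login(switch_port=4, wwn="10:00:00:00:c9:2b:5f:01")
# An intruder on a spare port who spoofs the good host's WWN.
intruder = Login(switch_port=9, wwn="10:00:00:00:c9:2b:5f:01")
```

In this model the spoofing intruder passes the WWN check but fails the port check, which is the advantage of port zoning described above; the trade-off is that a port-zoned host cannot be moved to another port without rezoning.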

|
Storage Arrays
• Another area of risk is the storage array itself. When LUNs are created, it is necessary for
the array to provide a screen to prevent the data that resides on the array from being
accessed by other hosts that are able to connect to the array.

• Storage arrays therefore ought to be equipped with a mechanism that provides LUN
masking. This allows multiple hosts to communicate with the array and only access LUNs
that are assigned through the application that provides the LUN-masking protection.

• Once data is stored on the server, the potential still exists for that data to be accessed by
other hosts on other networks. LUN masking adds a layer of protection to the data once
that data resides on the storage array
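
LUN masking can be pictured as a per-host filter applied by the array. The sketch below assumes the array identifies hosts by initiator WWN; the masking table and LUN numbers are hypothetical.

```python
# Hypothetical masking table held by the array: initiator WWN -> LUN IDs it may see.
LUN_MASKS = {
    "10:00:00:00:c9:2b:5f:01": {0, 1},
    "10:00:00:00:c9:2b:5f:02": {2},
}

def visible_luns(initiator_wwn: str, all_luns: set) -> set:
    """Return only the LUNs the array presents to this initiator."""
    allowed = LUN_MASKS.get(initiator_wwn.lower(), set())
    return all_luns & allowed

ARRAY_LUNS = {0, 1, 2, 3}
```

A host that is not in the table sees no LUNs at all, even if it can reach the array over the storage network.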

|
Servers

• Finally, there is a need to consider the risks that reside on the host itself.

• As long as the data “rests” on the server, the potential to access that data exists.
In the worst-case scenario, an attacker may obtain access to the server and
escalate privileges in an attempt to read the data.

• Therefore, when securing data, a comprehensive solution is necessary. The
operating system must be secured and patched, file permissions must be
planned and applied to reduce access as much as possible, and monitoring
needs to be performed. Finally, confidential data should also be encrypted to
protect it from unwanted access.
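
As one small example of checking file permissions, the sketch below looks for files that grant any access to "other" users. It assumes a POSIX-style permission model and a hypothetical directory tree to scan; it does not cover ACLs or Windows permissions.

```python
import os
import stat

def world_accessible(path: str) -> bool:
    """True if 'other' users have any read, write or execute bit on the file."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IRWXO)

def find_exposed_files(root: str) -> list:
    """Walk a directory tree and list files that are accessible to all users."""
    exposed = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(directory, name)
            if world_accessible(path):
                exposed.append(path)
    return sorted(exposed)
```

Running a check like this on a schedule and alerting on new entries combines the permission planning and monitoring steps mentioned above.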

|
Risk Remediation for Storage Security

• Risk remediation categorizes the risks associated with data storage according to the
classic CIA triad of Confidentiality, Integrity, and Availability.

• For each identified risk, where possible, security controls consistent with
defence, detection, and deterrence are applied in an effort to mitigate the risk.

• What’s left after those controls are applied to mitigate the risks is then identified
as residual risk.

|
Defense vs. Detection Controls
Defense controls:
• Data Leakage: DLP to block inappropriate data access.
• Sniffing: Encrypt data at rest as well as in transit
• Inappropriate Admin use: Reduce the number of administrators for each function and log activities.
• Storage persistence: wiping or file shredding when disks are decommissioned or replaced
• Strong compartmentalization and role-based access control
• Data Misuse: Network based access control
• Fraud: checks and balances along with separation of duties and approvals
• Integrated identity management solutions

Detection controls:
• Use watermarking and data classification along with monitoring software to track data flow
• Use IAM and PAM to detect inappropriate data access as well as IDSs
• Review provider’s administrative access logs
• Frequent disk scanning and failure alerts
• Implement an IDS on the storage network
• Use watermarking and data classification along with monitoring software to track data flow
• Regular audits on computing system access and data usage, giving special attention to unauthorized access
• Routinely monitor logs, looking for unexpected behaviour
|
Defense vs. Detection Controls
Defense controls:
• Phishing: anti-phishing technologies to block rogue web sites and detect false URLs
• Malfunctions: appropriate built in RAID redundancy
• Data Loss: critical data is redundantly stored
• Data Tampering: Utilize version control applications
• App failure/instability: Ensure that all software updates are applied
• Slow Service: Using adaptive redundant storage and network connections
• High Availability: Monitor the health of secondary systems
• Backup failure: Leverage storage elasticity

Detection controls:
• Use an application firewall to detect when remote web sites are trying to copy or emulate your sites
• Employ integrity verification software that uses checksums or other means of data verification
• Maintain and review audit logs of data deletion
• Use integrity-checking apps to monitor and report key data alterations
• Implement service monitoring to detect and alert when an application does not respond correctly.
• Monitor response time of applications
• Frequent fail-over testing
• Frequently perform recovery testing to validate the resilience of data
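
Checksum-based integrity verification, listed among the detection controls, can be sketched as a manifest of hashes that is compared against the current data. The sketch assumes SHA-256 and keeps file contents in memory for simplicity; the file names are hypothetical.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def build_manifest(files: dict) -> dict:
    """Record a checksum for every file (name -> content bytes)."""
    return {name: sha256_of(content) for name, content in files.items()}

def changed_files(manifest: dict, files: dict) -> list:
    """List files that are missing or whose content no longer matches the manifest."""
    changed = []
    for name, expected in manifest.items():
        content = files.get(name)
        if content is None or sha256_of(content) != expected:
            changed.append(name)
    return sorted(changed)

# Hypothetical baseline and a later state in which one file was altered and one deleted.
baseline = {"ledger.csv": b"id,amount\n1,100\n", "notes.txt": b"ok", "old.log": b"x"}
manifest = build_manifest(baseline)
later = {"ledger.csv": b"id,amount\n1,900\n", "notes.txt": b"ok"}
```

The manifest itself has to be stored where an attacker who can alter the data cannot also alter the checksums.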

|
Deterrence Controls vs. Residual Risk
Deterrence controls:
• Data Leaks: Establish security policies with penalties for data leakage
• Packet Sniffing: NDAs and SLAs for third parties
• Inappropriate admin access: security policies especially for administrators
• Storage persistence: Establish data-wiping requirements before selecting a storage product
• Platform attacks: commitment to identifying and prosecuting attackers.
• Data Misuse, fraud and hijacking: Strict security policy, awareness training and penalties for fraud
• Phishing: Maintain educational and awareness programs
• Malfunctions: get/purchase technical product support
• DOS: outsource for DR with 99.999% SLA
• Instability and App failure: Warranties and SLAs by vendors

Residual risk:
• Data persistence within the storage environment can expose data long after it is no longer needed
• Data can be stolen from the network
• Administrators can abuse their access privileges either intentionally or accidentally
• Storage is persistent across different media
• SANs will still be vulnerable to attacks
• Users will always find ways around controls, so misuse will occur despite controls
• Even the best-trained employees still fall for phishing scams
• Technical failures are inevitable
• DOS may affect both the enterprise and the SP
• Unforeseen bugs will always exist

|
Databases

|
Database Goals
Application support: Ranging from simple employee lists to enterprise-level tracking software,
relational databases are the most used method for storing data.

Secure storage of sensitive information: Relational databases offer one of the most secure
methods of centrally storing important data.

Online transaction processing (OLTP): OLTP services are often the most common functions of
databases in many organizations. These systems are responsible for receiving and storing
information accessed by client applications and other servers.

Data warehousing: Many organizations go to great lengths to collect and store as much
information as possible. The primary business reason for storing many types of information is
to use this data eventually to help make business decisions.

|
Database Security Layers
Server-Level Security: A database application is only as secure as the server it is running on.
Therefore, it’s important to start considering security settings at the level of the physical server or
servers on which your databases will be hosted

Network-Level Security: Databases are accessed by users through network links for the data they
need. Therefore, general operating system and network-level security also applies to databases. If
the underlying platform is not secure, this can create significant vulnerabilities for the database.

Data Encryption: This helps ensure the safety of information stored in the database. Most modern
databases also support encrypted connections between the client and the server.

Operating System Security: On most platforms, database security goes hand in hand with operating
system security. Network configuration settings, file system permissions, authentication mechanisms,
and operating system encryption features can all play a role in ensuring that databases remain secure

|
Database Level Security
• Database Administration Security: One important task related to working with a relational
database is maintenance of the server itself. Important tasks include creating databases,
removing unneeded databases, managing disk space allocation, monitoring performance, and
performing backup and recovery operations

• Database Roles and Permissions: Having a valid server login only allows a user the permission to
connect to a server. In order to actually access a database, the user’s login must be authorized to
use it.

• Object-Level Security: Relational databases support many different types of objects. Tables,
however, are the fundamental unit of data storage. Permissions on a table are granted per SQL
command, for the most commonly used commands such as SELECT, INSERT, UPDATE and DELETE.
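
The relationship between logins, roles and object-level permissions can be modelled in a few lines. This is a conceptual sketch only: the users, roles and table name are hypothetical, and a real database server enforces these rules itself through its own role and grant mechanisms.

```python
# Hypothetical role definitions: role -> set of (table, SQL command) pairs it may execute.
ROLE_PERMISSIONS = {
    "clerk": {("patients", "SELECT")},
    "nurse": {("patients", "SELECT"), ("patients", "UPDATE")},
}

# Hypothetical role membership: login -> roles. A login without roles can connect
# to the server but cannot touch any table.
USER_ROLES = {"alice": {"nurse"}, "bob": {"clerk"}, "carol": set()}

def is_allowed(user: str, table: str, command: str) -> bool:
    """True if any of the user's roles grants the command on the table."""
    wanted = (table, command.upper())
    return any(
        wanted in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )
```

Granting permissions to roles rather than to individual users keeps the permission set small and auditable as staff change.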

|
Database Security
Two key security issues in Databases are aggregation and inference:
• Aggregation: The act of combining information from separate sources. This
combination generates new information which otherwise would not be available.
Prevention:
– Content-dependent access control: subjects should be prevented from
accessing any information and its associated components beyond their
clearance level
– Context-dependent access control: a subject’s previous actions are recorded and
access is provisioned based on them

|
Database Security

• Inference: Ability to derive information not explicitly available. It is the intended
result of aggregation.

Prevention:

– Content and context dependent access control

– Cell suppression, partitioning, noise and perturbation

• Cell Suppression: Hiding specific cells that may have information

• Partitioning: Dividing the database into different parts and applying access control

• Noise and perturbation: Technique of inserting additional information (padding info) to confuse
the attacker
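
Cell suppression and noise can be illustrated on a simple count query. The sketch assumes a suppression threshold of 5 and integer noise of at most 2 in either direction; both values, and the function names, are arbitrary choices for illustration.

```python
import random
from collections import Counter

def suppressed_counts(values, threshold=5):
    """Count each value, hiding (as None) any cell smaller than the threshold."""
    counts = Counter(values)
    return {key: (n if n >= threshold else None) for key, n in counts.items()}

def noisy_count(true_count, spread=2, rng=random):
    """Return the count perturbed by a random integer in [-spread, spread], never below 0."""
    return max(0, true_count + rng.randint(-spread, spread))

# Hypothetical diagnosis column: a count of 1 would identify a single patient.
diagnoses = ["flu"] * 8 + ["rare condition"]
report = suppressed_counts(diagnoses)
```

Suppression hides small cells that would point to an individual, while noise makes it harder to infer an exact value by combining several overlapping queries.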

|
Other Database Objects for Security

• Use of Views
• Use of Stored Procedures
• Use of triggers
• Using Application-Security
• Use of tablespace quotas for users
• System resource limits for the users
• Database Backup and Recovery
• Database Auditing and Monitoring
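
Two of the objects listed above, views and triggers, can be tried out with SQLite from Python's standard library. The table, view and trigger names are hypothetical. SQLite has no per-user permissions, so this sketch only shows the mechanism; on a multi-user database server, users would be granted access to the view rather than to the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
CREATE TABLE salary_audit (employee_id INTEGER, old_salary INTEGER, new_salary INTEGER);

-- View: exposes names but not salaries.
CREATE VIEW employee_directory AS SELECT id, name FROM employees;

-- Trigger: records every salary change in the audit table.
CREATE TRIGGER log_salary_change AFTER UPDATE OF salary ON employees
BEGIN
    INSERT INTO salary_audit VALUES (OLD.id, OLD.salary, NEW.salary);
END;

INSERT INTO employees VALUES (1, 'Amina', 5000);
UPDATE employees SET salary = 5500 WHERE id = 1;
""")

# Columns visible through the view, and the audit trail written by the trigger.
directory_columns = [col[0] for col in conn.execute("SELECT * FROM employee_directory").description]
audit_rows = conn.execute("SELECT * FROM salary_audit").fetchall()
```

The view limits which columns a query can reach, and the trigger produces an audit record without relying on the application to log the change.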

|
Embedded Systems Security
• Embedded System: cyber-physical computing device that is part of an electrical or
mechanical system. They are small, cheap, rugged and use very little power.
Ensuring security of the software is the biggest challenge in protecting these devices.
• Industrial Control Systems: specifically designed to control physical devices in industrial
processes.
– Programmable Logic Controllers (PLC): designed to control electromechanical
processes. Devices connect to PLCs via standard RS-232 interfaces
– Distributed Control Systems (DCS): Network of control devices that are part of one or
more industrial processes within close distance. Protocols are not optimized for WAN
communications. DCS consists of devices within a single plant

|
Embedded Systems Security

• Supervisory Control and Data Acquisition (SCADA): Controls large scale physical
processes involving nodes across significant distances. Involves 3 kinds of
devices:

– Endpoints: Remote terminal units that connect directly to sensors or actuators

– Data Acquisition Server: Backend that receives all data from endpoints and
performs correlation or analysis

– User Station: Human machine interface that displays data from endpoints and
allows users to issue commands to the actuators

|
Ole Sangale Road, Madaraka Estate. PO Box 59857-00200, Nairobi, Kenya
Tel: (+254) (0)703 034000/200/300 Fax : +254 (0)20 607498
Email: info@strathmore.edu Website: www.strathmore.edu
|
