Unit 2
Input validation and the protection of sensitive data play a vital role in securing web and database applications.
This unit details the input validation vulnerability category, which threatens application security and can be exploited through methods such as buffer overflow, cross-site scripting, SQL injection, and canonicalization attacks.
In addition, the sensitive data topics cover sensitive information access, sensitive data in storage, information disclosure, and data tampering.
Basics of input validation
Input validation is a method in which the correctness of the user’s input data is confirmed before the server processes the request(s).
Responsibility for input validation lies with application developers, who must find ways to stop input validation attacks.
These attacks exploit weak input validation by embedding malicious strings in form fields, query strings, cookies, and HTTP headers.
Now the question arises: can input validation attacks be completely stopped?
The answer is a little tricky, and the difficulty is best understood from the programmer’s perspective.
For a software developer writing code to validate input, the challenge is that there may be no finite and unique definition of valid input within or across applications, since there are countless ways a user can enter data that violates no coded rule yet is still malicious.
Moreover, there is no single definition of malicious input that developers can use to guard against security breaches.
To aggravate the situation, the risk of exploitation depends on what the application does with the input.
For instance, does the application use data stored on your machine, such as cookies, or does it offer data services that provide data for you to use?
Necessity of input validation
The security of information is one of the most important concerns for a developer, who tries to make the software as robust and secure as possible.
Of the many ways in which application security can be breached, one of the most common is for malicious coders to search for vulnerabilities by manipulating the input that users send and receive, and then to use them to exploit the system’s database.
Carefully crafted input is the most important tool a hacker has to access an application database, modify it, or even damage it.
For this reason, input must be tested and verified to counter the malicious content that hackers inject into it.
Accurate input validation therefore safeguards applications against such attacks.
Sound validation of input is an efficient countermeasure against buffer overflow, cross-site scripting, SQL injection, and canonicalization attacks.
To make input validation dependable, adhere to the following best practices, which help mitigate input validation attacks.
Implementation of input validation
1.Consider every input to be malicious
● Every programmer should assume, unless proven otherwise, that all input generated by users is harmful.
● Irrespective of whether the input arrives from a user, a database, a file share, or elsewhere, validate it if its source lies outside your trust boundary.
● For instance, if you send a request to an external web service that returns strings, what guarantee is there that the strings do not have malicious commands embedded in them?
● Moreover, if multiple applications write to a shared database and you read data from it, how can you ensure that the data is safe?
2.Centralize the input validation process:
● One of the most essential strategies when designing an application is to place input validation and filtering code in a shared library rather than on each and every web page.
● This centralized approach not only ensures that rules are applied consistently but also provides a single place to modify or correct the code, so that changes apply uniformly to every program that needs the validation function.
● Use specifically designed regular expressions and expose them as global functions that can be accessed from anywhere within the application to validate individual fields such as emails, postal codes, titles, names, places, phone numbers, and so forth.
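The centralized approach can be sketched as a single shared module that every page calls. This is a minimal illustration in Python, not a complete rule set: the field names, the patterns, and the `is_valid` function are all assumptions made for the example.

```python
import re

# Hypothetical shared validation module: every page imports these rules
# instead of defining its own patterns. The patterns are illustrative only.
_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]{1,64}@[A-Za-z0-9.-]{1,253}\.[A-Za-z]{2,}"),
    "us_zip": re.compile(r"\d{5}(-\d{4})?"),
    "phone": re.compile(r"\+?\d{7,15}"),
}


def is_valid(field: str, value: str) -> bool:
    """Return True only if the whole value matches the central rule for field."""
    pattern = _PATTERNS.get(field)
    if pattern is None:
        # An unknown field is a programming error, not valid input.
        raise KeyError(f"no validation rule registered for {field!r}")
    return pattern.fullmatch(value) is not None
```

Because every caller goes through `is_valid`, tightening a rule means changing one entry in `_PATTERNS` rather than editing each page.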
3.Never rely on client-side validation: Client-side validation is useful for reducing the number of round trips to the server, which improves server efficiency, but it must never be relied upon completely; as a rule of thumb, always implement server-side validation as well. The two kinds of validation are i) server-side validation and ii) client-side validation.
● The reason for server-side validation is that an attacker can bypass client-side validation, for example by disabling JavaScript or by any other means; with server-side validation in place that strategy fails, and the application gains a second layer of security, thereby providing defence in depth.
4.Caution against canonicalization issues: An important design consideration is to avoid software that takes file names from the user in order to access files, as this is susceptible to canonicalization attacks.
● Rather, the application should itself determine which file needs to be opened and provide it automatically from the system. To see why, consider the meaning of canonical form and canonicalization.
● Canonical form is the simplest, most reduced standard form of a piece of data, without oversimplification, and the process of converting data to its canonical form is called canonicalization.
● Data such as Uniform Resource Locators (URLs) and file paths can be written in many equivalent forms besides the canonical one, which makes them prone to exploits. For instance, a file path in canonical form looks like c:\temp\foo.dat.
● The same file can easily be reached through other string representations such as ..\foo.dat. Therefore, guard against canonicalization issues by avoiding file name input from the user.
● If you do need to accept file names from the user, ensure that each name is strictly well formed before granting or declining permission to access a particular file, thereby helping to secure the application from canonicalization breaches.
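When file names must be accepted from the user, one common way to apply this advice is to canonicalize the path first and only then decide whether access is allowed. The sketch below uses Python's standard `os.path` functions; the directory `/srv/app/uploads` and the function name `resolve_user_path` are hypothetical.

```python
import os

# Hypothetical directory that the application is allowed to serve files from.
BASE_DIR = os.path.realpath("/srv/app/uploads")


def resolve_user_path(file_name: str) -> str:
    """Canonicalize a user-supplied name and refuse anything outside BASE_DIR."""
    # realpath collapses ".." segments and resolves symbolic links.
    candidate = os.path.realpath(os.path.join(BASE_DIR, file_name))
    # commonpath compares whole path components, so a sibling such as
    # "/srv/app/uploads_evil" is not mistaken for a child of BASE_DIR.
    if os.path.commonpath([BASE_DIR, candidate]) != BASE_DIR:
        raise ValueError("path escapes the allowed directory")
    return candidate
```

The check is made on the canonical form, so representations such as `../foo.dat` are judged by where they actually point rather than by how they are spelled.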
5.Undergo strict testing of the input data: Strict testing of input data gives developers considerable control over input validation. The first step is to constrain input to known good data, that is, to let through only what is permitted and to restrict invalid data via pattern, type, range, and length matching.
Understand what your input is expected to look like, and code to that expectation so that only valid input is received; define the range over the finite set of valid data rather than over the infinite set of malicious, and therefore invalid, data.
To provide defence in depth, the code should also reject known bad input and subsequently sanitize the input. The figure will help you to understand the technique better.
The detailed approach, description, and trade-offs of the strict testing of inputs are as follows:
● Constrain input: This is the technique of permitting only good data by filtering input through format, type, range, and length tests. Enforce the acceptable values for the form fields present in your application.
● Check all the constraints and reject all other input as bad data. On the server side, set the character set so that input can be converted to canonical form and constrained in a localized manner.
● Validate data for format, type, range, and length: Validate data in the input fields by using strong data types, parameterized stored procedures for data access, and regular expressions for string fields, thereby controlling malicious input at a finer level. Validating all input in terms of format, type, range, and length makes attacks via input much more difficult; length checks are particularly effective, because even input that gets through type checking is still limited in size.
● Reject known bad input: This is a less reliable technique, but it is quite effective when combined with the allow-good-data approach. To reject bad data, the software must know every variation of input that can be considered malicious.
● This is difficult to achieve, as the number of ways in which data can be malicious may not be finite and does not stay constant. The range of bad data changes over time, whereas valid data remains constant.
● For instance, there are numerous ways to represent the same characters, which is why allowing good data is the preferred approach. In addition, the deny approach is not as robust as the allow approach, because bad data, such as the patterns used in common attacks, does not remain constant.
● Sanitize input: If the range of allowed input does not guarantee that the input is safe, sanitize it by stripping a null character from the end of the user input string, escaping values so that they are treated as literals, or using URL encoding or HTML encoding so that data is treated as literal text instead of executable script.
To make a valid URI request, escape HTML characters and encode URLs via the HtmlEncode and UrlEncode methods respectively.
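The sanitizing step can be illustrated with Python's standard library, whose `html.escape` and `urllib.parse.quote` play a role comparable to the HtmlEncode and UrlEncode methods named above. This is a sketch rather than those methods themselves, and the two helper names are assumptions.

```python
import html
from urllib.parse import quote


def sanitize_for_html(text: str) -> str:
    """Strip trailing null characters, then escape HTML metacharacters."""
    return html.escape(text.rstrip("\x00"), quote=True)


def sanitize_for_url(text: str) -> str:
    """Percent-encode a value so it is treated as data inside a URL."""
    # safe="" also encodes "/" so the value cannot add path segments.
    return quote(text, safe="")
```

After encoding, a value such as `<script>` is rendered by the browser as literal text instead of being executed.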
Practical solutions
To validate input fields in practice using the preceding approaches, consider the following:
● Last name field: Constrain input by permitting string data in the ASCII ranges A to Z and a to z, together with apostrophes and hyphens to handle names such as O’Connor and James-Carter. Also, limit the length of the field to the longest value that is likely to occur.
● Quantity field: Constrain input by checking the type and range of the data; for example, if the field needs an integer in the range of 0 to 100, reject all data that is not an integer, as well as any integer that does not lie in the given range.
● Free-text field: Constrain input by permitting letters, spaces, and generic characters such as hyphens, commas, and apostrophes, and disallowing signs such as greater than, less than, braces, and brackets.
● An exception arises if you want URL links or mark-up tags such as <i> for italics and <b> for bold to be allowed in the free-text field. In the case of URL input, ensure that the value is encoded so that the URL is treated as a URL.
● User input is not validated in an existing web application: Ideally, a web application checks user input at each entry point or input field, but an existing web application may not be validating its users’ input data. In that case you need a makeshift approach to minimize the inherent risk until a proper countermeasure is built into the application’s input validation process.
● There are two approaches to mitigating the risk. They do not ensure safe handling of data but provide short-term, quick fixes, as described below:
● Sanitize input while writing back to the vulnerable website using HTML/URL encoding: In this approach potentially malicious input is made safe by encoding HTML or URL data, and output is written back in a protected format. This is sanitization of input data in action.
● Reject malicious script characters that come from the vulnerable website: In this approach bad input is rejected by using a configurable set of malicious characters. As you already know, the definition of bad data does not remain constant and is context dependent, so this approach is not completely secure.
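The last name, quantity, and free-text rules described above can be sketched as three small validators. The length limits (40 and 500 characters) and the function names are illustrative assumptions, and the patterns accept the plain ASCII apostrophe only.

```python
import re

# Longest last name we expect to accept; 40 is an illustrative assumption.
MAX_LAST_NAME = 40
_LAST_NAME = re.compile(r"[A-Za-z'\-]+")
# Letters, spaces, commas, apostrophes, and hyphens only.
_FREE_TEXT = re.compile(r"[A-Za-z ,'\-]*")


def valid_last_name(value: str) -> bool:
    """Allow A-Z, a-z, apostrophes, and hyphens up to a maximum length."""
    return len(value) <= MAX_LAST_NAME and _LAST_NAME.fullmatch(value) is not None


def valid_quantity(value: str) -> bool:
    """Type check (ASCII digits only) followed by a range check of 0 to 100."""
    return value.isascii() and value.isdigit() and 0 <= int(value) <= 100


def valid_free_text(value: str, max_len: int = 500) -> bool:
    """Reject angle brackets, braces, brackets, and anything else not listed."""
    return len(value) <= max_len and _FREE_TEXT.fullmatch(value) is not None
```

Each validator follows the allow approach: it names what is permitted and rejects everything else, rather than trying to enumerate bad input.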
Input validation vulnerability
Input validation vulnerability can be exploited by an attacker using the following methods:
● Buffer overflow
● Cross-site scripting
● SQL injection
● Canonicalization
A buffer overflow is the oversaturation of a buffer, which in the current computing sense is a region of random access memory (RAM) that holds information temporarily.
This condition occurs when a program tries to store more data in a buffer than the buffer has space for, or when the program tries to store data in a memory location outside the bounds of the allocated buffer, thereby crashing the program, corrupting data, executing malicious code, or breaching security.
A buffer overflow is basically a software vulnerability that can be maliciously exploited by an attacker.
The steps taken to carry out a buffer overflow attack are as follows:
● Entrance: The hacker needs an entrance into the server so that he can tamper with the stack by causing an overflow or by adding commands. A Trojan horse is used for this step, as the Trojan sets up backdoor software on the server.
● Smashing the stack: In this step the stack is filled with meaningless characters. Under normal circumstances this causes the operating system to crash, as it loses track of the instructions that are necessary for its functions to be performed. The hacker can also load machine-language commands onto the smashed stack if he wants to do more.
● Running commands: The operating system can be commanded through the overflowed stack, and a command shell can be created by this method. For example, by using “inetd” in UNIX a backdoor can be created, which can be used to manipulate an X Window session. The hacker inserts code that works on the same principle as some communication software. With this code the attacker can take over control of the keyboard, the monitor, and the mouse.
If a UNIX server is attacked by creating a backdoor in this way, the attacker will eventually succeed in carrying out the attack.
If the machine to be attacked runs on a Windows platform, the attacker can create a program known as ‘wininet.dll’.
Expertise and patience are required to carry out this kind of attack, as it is highly complicated and highly technical.
Knowledge of the various languages used on a machine and knowledge of C programming are some of the prerequisites for this kind of attack.
Buffer overflow attack
Address space layout randomization (ASLR) randomizes the addresses at which a program’s memory regions are loaded, which hinders attackers who try to predict target addresses.
Buffer overflow
A web application can be affected by a buffer overflow attack, which attackers use to corrupt the execution stack.
Despite the odds, ingenious attackers have identified and exploited buffer overflows in an array of components and products quite successfully.
Buffer overflow flaws can also be present in the application server or web server that serves the dynamic or static parts of a web application.
Web applications that use a graphics library to render images are also prone to buffer overflow attacks.
Custom web applications used by organizations can also suffer from buffer overflow vulnerabilities, though the threat is less likely, since only a few hackers will try to exploit the defects of one specific web application.
Therefore, a buffer overflow flaw in a custom web application is less likely to be discovered, and even if it is discovered the threat is reduced, as error messages and source code for the application are rarely available to the attacker.
● Affected environments of buffer overflow
Buffer overflows can occur in nearly every known web application, application server, and web server environment.
The exception is J2EE and Java environments, which are immune to such attacks, though the Java Virtual Machine (JVM) itself can be affected by buffer overflows.
● Refer to the latest bug reports for the server products and software libraries you are using to determine their vulnerability, and promptly apply the patches that the original software vendor releases to fix those issues.
● Scan your web application for buffer overflow faults at regular intervals by running specific scanners that are easily available through the Internet.
● When a fault is found, fix it with appropriate size checking of inputs, which rids the application of operational issues and denial of service attacks.
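The size check mentioned in the last point can be shown in a few lines. Python is memory safe, so this sketch only illustrates the bound check that languages such as C require before every copy; the 64-byte capacity and the function name are assumptions.

```python
BUFFER_SIZE = 64  # illustrative fixed capacity, an assumption for this sketch


def copy_into_buffer(data: bytes) -> bytearray:
    """Copy data into a fixed-size buffer only if it fits."""
    if len(data) > BUFFER_SIZE:
        # Reject oversized input instead of writing past the end of the buffer.
        raise ValueError(f"input of {len(data)} bytes exceeds {BUFFER_SIZE}-byte buffer")
    buffer = bytearray(BUFFER_SIZE)
    buffer[:len(data)] = data
    return buffer
```

The important design choice is that the length is compared against the capacity before any byte is written, so oversized input is refused rather than truncated silently or allowed to spill over.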
Cross-site scripting
● Knowing about cross-site scripting (XSS) is not the same as knowing about scripting languages and their various categories.
● Rather, it means understanding the potential threat this malicious attack poses to web applications.
● This attack can help attackers evade access control mechanisms such as the same-origin policy (SOP). Under the SOP a web browser allows code in one web page to access data in a second web page only if both pages have the same origin, that is, the same URI scheme, hostname, and port number, thereby preventing malicious code on one web page from gaining access to sensitive data on another web page via that page’s document object model (DOM).
● Application security breaches are mainly carried out using XSS or SQL injection attacks, which become possible when software is coded carelessly or naively and does not sanitize its input.
● Are you concerned about protecting your sensitive business data from the prying eyes of hackers?
● If “data integrity” is the first priority, the organization needs an OWASP-compliant server.
● The following section describes some of the important leakage protection mechanisms for sensitive data:
Protection mechanism
● Enforce strict data encryption: The first step is to identify and categorize the sensitive data points. Once you have identified the critical data that requires extra protection, the next step is to implement a proven encryption technique. Make sure that sensitive data is kept encrypted at all times. Such data should be neither stored nor transmitted in clear text.
● Use SSL for user authentication: Protect all authentication gateways on your website with the secure HTTPS (SSL/TLS) protocol. SSL authentication uses the concept of a public/private key pair. This means a user must supply the corresponding decryption key to gain access to sensitive data points.
● Implement a strong password hashing algorithm: Password hashing is pretty straightforward to implement, yet it is taken lightly by a large number of web developers. Hackers can exploit weaknesses in the password hashing algorithm to steal sensitive information stored on a web or application server. Only cryptographic hash functions should be used to implement password hashing.
● Make use of penetration testing: It's a good idea to have your application undergo a third-party penetration test, which gives a fair insight into how secure the application is. Even the best programmers make occasional mistakes, so it makes sense to employ a trustworthy security expert to review the application for potential vulnerabilities. For best results, the security review should continue at periodic intervals throughout the application development life cycle.
Sensitive data access
Basics of sensitive data access
● Useful data generated for a specific purpose can be categorized as sensitive or non-sensitive.
● Sensitive data is data whose value is important to the user; its loss can expose personal, bank, and business information, which in turn can lead to loss of money, loss of privacy, and data leaks.
● So, in order to manage, protect, and secure sensitive data, you need to understand how to restrict access to it.
● For that you need to understand how sensitive data is stored, whether it is encrypted before storing, whether it is placed on a secure server, and a lot more. Hence, safeguarding access to sensitive data is of utmost priority for any organization, institution, or other data management body.
Protection of sensitive data access
● To protect access to sensitive information in an organization, take stringent security measures. Allow only authorized personnel past the security processes, on a need-to-know basis. Allow the transfer, retrieval, pickup, and delivery of sensitive information only by authorized personnel. Defend against theft of sensitive information and its revelation to unauthorized individuals on laptops, desktops, mobile devices, networks, and postal service workstations.
● Restrict unauthorized persons from accessing sensitive information in hardcopy, such as printouts, or in softcopy on endpoint media such as USB flash drives, hard disk drives, optical disks, and memory cards.
● Protect stored or archived sensitive information by encrypting it with an advanced encryption algorithm.
● Label sensitive information in electronic or printed material as ‘restricted information’, thereby restraining others from accessing it.
● In the work environment, create a strong login password and change it once a month; also lock your computer every time you leave it unattended.
● Track and inventory sensitive information from creation to destruction.
● Before disposing of sensitive information in hardcopy, shred it so that no one else can access it.
Some of the don’ts for restricting access to sensitive information include:
● Never disclose sensitive information without the permission of higher management.
● Never print sensitive information on a publicly accessible printer, as it can result in unauthorized viewing of the hardcopy.
● Never copy sensitive information unless you can secure the copied data using a cryptographic technique.
● Never email sensitive data unless it is encrypted.
● Never talk about sensitive information where others can overhear.
● Never transmit sensitive information by fax without permission from higher management.
Types of sensitive data
● Confidential data is used in a general sense to mean sensitive data whose access is subject
to restriction, and may refer to data about an individual as well as that which pertains to
a business.
● Even though they are often used interchangeably, personal data is sometimes
distinguished from private data, or personally identifiable data.
● The latter is distinct from the former in that private data can be used to identify a unique
individual.
● Personal data, on the other hand, is data belonging to the private life of an individual
that cannot be used to uniquely identify that individual.
● This can range from an individual’s favourite colour, to the details of their domestic life.
● The latter is a common example of personal data that is also regarded as sensitive, where
the individual sharing these details with a trusted listener would prefer for it not to be
shared with anyone else, and the sharing of which may result in unwanted
consequences.
Confidential business data
● Confidential business data is data whose revelation can ruin or harm the business and its operations and hamper its sustainability.
● Examples of such confidential data include patentable inventions, customer and supplier data,
financial data, trade secrets, and more.
Classified sensitive data
● Classified sensitive data is data that falls within the purview of the security classification regulations imposed by national governments, and whose disclosure may cause harm to national interests and security.
● The restrictions imposed upon such data are organized into a hierarchy of classification levels in almost every national government worldwide, with the most restricted levels containing information that may cause the greatest danger to national security if leaked.
● Access is granted on a need-to-know basis to individuals who have also passed the appropriate level of security clearance.
● Classified sensitive data can be reclassified or declassified.
Sensitive data in storage
Basics of sensitive data in storage
● As a security measure, do not store sensitive data: if no sensitive data is stored in digital format, it cannot be stolen, so there is no loss.
● This is rarely possible in practice; otherwise the study of application security would lose its importance.
● If storing sensitive data is necessary, as it is for financial institutions, health care institutions, schools and colleges, government and private organizations, and a nation’s military, ammunition, and other secret agencies, then identify the crucial sensitive data whose breach of integrity or privacy could bring catastrophe to individuals such as bank account holders, patients and doctors, students and teachers, and government and private employees, or to a nation, where it could even create the potential for war.
Techniques
● Cryptographic hash functions, also known as one-way hash functions, generate a hash value that cannot be reversed except by an extensive brute force attack, which is practically infeasible.
● Therefore, one cannot obtain the original data from the hashed value in a feasible amount of time.
● If a database containing a table of hashed passwords is compromised, the breach does not directly disclose the passwords, because the hashes cannot be turned back into clear text, thereby safeguarding their integrity and security.
● Hash functions include the message digest algorithm (MD5) and the secure hash algorithms (SHA-1, SHA-2, and SHA-3); note that MD5 and SHA-1 are no longer considered secure. On the other hand, if you are storing sensitive data such as customers’ credit card numbers, do not store it directly, and do not hash it either, as hashing will not help.
● The reason is that you need the number itself for verification rather than for comparison, unlike passwords, where the password entered by the user is hashed and compared with the stored hash table to grant access on a match or deny access on a mismatch.
● Instead, use a strong cryptographic technique: either symmetric key cryptography, which uses the same key for encryption and decryption, or asymmetric key cryptography, which uses one key for encryption and another key for decryption.
● To secure credit card numbers, for example, use a symmetric key cryptographic technique such as Triple DES (3DES) or the more modern Advanced Encryption Standard (AES) to encrypt the sensitive data.
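The hash-and-compare flow for passwords described above can be sketched with Python's standard library. Note that the sketch swaps in PBKDF2-HMAC-SHA256 with a random per-user salt rather than a bare MD5 or SHA digest; the iteration count and the function names are illustrative assumptions.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # illustrative work factor, an assumption to tune per deployment


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the clear-text password is never stored."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the supplied password and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

Only the salt and digest are stored, so a match grants access and a mismatch denies it without the application ever needing the original password again.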
Protection of sensitive data in storage
Sensitive data disclosure is one of the most challenging issues; it can be prevented by using encryption, Secure Sockets Layer (SSL) technology, and similar measures.
The severity of a disclosure depends entirely on the sensitivity of the data being revealed.
For instance, revelation of bank information could cause financial loss, revelation of personal information could lead to identity theft, and revelation of user credentials could mean loss of privacy on social networking sites or other websites where a user name and password are mandatory to log in.
For a server-based application, path disclosure can reveal the structure of the web application and can pose a threat to information security.
Information disclosure is influenced by the following
● Security flaw: If stored sensitive data is not encrypted, it exposes a security weakness and is susceptible to attack. Therefore, always store sensitive data encrypted, using a strong encryption algorithm, a strong key generation and management methodology, and a strong password hashing technique.
● Flaws in a browser are easy to notice but difficult to abuse on a large scale, while flaws on the server side are difficult to discover owing to limited access and are equally difficult to exploit.
● Attack vectors: Cyber attackers generally do not try to decrypt cryptographic data directly; rather, they perform man-in-the-middle attacks, steal cryptographic keys, steal clear text information, or do something else to breach security, either from the user’s browser or from the data in transit.
● Threat agents: These include both internal and external threats that can gain access to sensitive data present in your cloud or system, during the transmission process, or directly from the client’s web browser.
● Business impacts: Loss of sensitive data can damage business reputation, bring penalties from customers whose data is breached, and consequently result in decline of
● Technical impacts: Loss of sensitive data exposes technological shortcomings: data that should be protected is compromised through frequent failures of the organization’s security system.
[Figure: source, destination, and intruder]
Information disclosure vulnerability
● It is essential to understand the importance of sensitive data and to take extra precautions when storing it in digital format.
● If sensitive data on your storage device can be accessed through the network, or needs to be transmitted to an external party for a specific purpose, then you need to protect it and minimize the risk of information disclosure.
● To safeguard your sensitive data from exposure, keep in mind the following points:
● Never store sensitive data in clear text for the long term, including in backups.
● Never transmit sensitive data in clear text, either internally or externally.
● Always use up-to-date, well-regarded cryptographic algorithms to encrypt sensitive data.
● Use strong cryptographic keys and key management techniques to prevent data breaches.
● Ensure that browser security headers or directives are present when transmitting sensitive data.
Information disclosure prevention
● It is essential to use advanced data protection methods, strong cryptographic algorithms, and the Secure Sockets Layer to prevent accidental disclosure of information, malicious attacks by hackers, and attacks by insiders.
● Encrypt all sensitive data both in storage and during transmission.
● Avoid unnecessary storage of sensitive data, and discard it when it is no longer needed.
● Always use an advanced encryption algorithm, strong keys, and sound key management practices.
● Use a specifically designed password protection algorithm to store and manage passwords.
● Disable caching for web pages, and autocomplete on forms, that gather sensitive data.
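The caching advice in the last point, together with the browser security headers mentioned earlier, can be expressed as a set of HTTP response headers. The helper below is a hypothetical sketch: the function name is an assumption, and how the headers are attached depends on the web framework in use.

```python
def sensitive_response_headers() -> dict:
    """Headers for pages that display or collect sensitive data."""
    return {
        # Tell browsers and intermediaries not to keep a copy of the response.
        "Cache-Control": "no-store",
        # Legacy equivalent for HTTP/1.0 caches.
        "Pragma": "no-cache",
        # Ask browsers to use HTTPS only for the next year.
        "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
        # Stop browsers from guessing a different content type.
        "X-Content-Type-Options": "nosniff",
    }
```

Autocomplete is controlled in the page markup rather than in headers, for example by setting the `autocomplete="off"` attribute on the form or on individual input fields.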
Data tampering
● You have already learned the basics of sensitive data, its storage, and its susceptibility to network eavesdropping.
● In this section you will learn about data tampering and how to protect sensitive data from being tampered with when it is transmitted.
Data tampering protection in URL
A legitimate website sometimes needs to transmit sensitive data from one page to another, or between a client and the server, using the URL.
This needs special care, as data in transit by way of the URL can be accessed by an attacker.
Data can be transmitted using the GET or POST method in the <form> tag or via key/value pairs in the URL.
You can guard sensitive data in the URL using various techniques, starting with input validation.
For instance, suppose a database query requires that an integer input be sent via the URL.
By checking that the data is of integer type, the input can be verified for correctness.
Input validation of this kind can help prevent unexpected behavior of the program, but on its own it does not defend against data tampering.
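The integer check on a URL parameter can be sketched as follows, using Python's standard `urllib.parse` module. The function name `read_int_param` and the example URLs are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlparse


def read_int_param(url: str, name: str) -> int:
    """Return the named query parameter as an int, or raise ValueError."""
    values = parse_qs(urlparse(url).query).get(name, [])
    if len(values) != 1:
        raise ValueError(f"expected exactly one {name!r} parameter")
    raw = values[0]
    # Type check: accept ASCII digits only, so "42 OR 1=1" or "4e2" are rejected.
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"{name!r} must be a non-negative integer")
    return int(raw)
```

For example, `read_int_param("https://example.com/item?id=42", "id")` returns the integer 42, while a value carrying extra SQL text raises `ValueError` before it reaches the query. The check confirms the type of the value, not that the value is the one the server originally issued.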
So, how can data be safeguarded against tampering? The following topics briefly discuss how to prevent data tampering:
● Data tampering protection in database:
● If you want to reference a row in a database without leaking information, for example how many rows are present in the database, you can simply store a randomly generated unique identifier in an extra column and reference that instead.
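A randomly generated identifier of this kind can be produced with Python's `secrets` module. The function name and the 16-byte length are assumptions for this sketch.

```python
import secrets


def new_public_id() -> str:
    """Generate an unguessable identifier to store in the extra column."""
    # 16 random bytes, encoded with URL-safe characters.
    return secrets.token_urlsafe(16)
```

Unlike a sequential primary key, such a value reveals nothing about how many rows exist and cannot be guessed by counting up or down from a known identifier.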
As a further example, when the id variable of a user is passed from one web page to another as the user browses the website, the web application expects the id to remain unchanged.
Each successive page can check the id for modification. On the sender side, the page calculates a hash value over the data, the id value, and a secret, and transfers the hash together with the id.
On the receiver side, re-hashing and matching the result against the hash value from the previous web page tells the application whether the data has been tampered with by a malicious user.
If both hash values match, the data has not been tampered with; otherwise something fishy is going on.
There is one caveat to the preceding procedure: the id variable is visible to the attacker. Nevertheless, the attacker does not know the secret and the hashing algorithm used to generate the hash value, and so cannot tamper with the data undetected.
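The sender and receiver steps above can be sketched with a keyed hash (HMAC) from Python's standard library. The secret value and the function names are hypothetical; a real application would load the secret from protected configuration.

```python
import hashlib
import hmac

# Hypothetical server-side secret; never sent to the client.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def sign_id(user_id: str) -> str:
    """Sender side: compute a keyed hash of the id to send alongside it."""
    return hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def is_untampered(user_id: str, received_tag: str) -> bool:
    """Receiver side: re-hash the received id and compare with the received tag."""
    return hmac.compare_digest(sign_id(user_id), received_tag)
```

If an attacker changes the id in transit, the recomputed hash no longer matches the transmitted one, and without the secret the attacker cannot produce a matching hash for the altered id.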
Data tampering protection through encryption
● Sensitive data can be protected by encrypting it with a symmetric key (the same key is used for encryption and decryption), so that the actual data value is not disclosed.
● The idea is similar to the hashing approach, but it uses symmetric keys to encrypt and decrypt the sensitive data.
● You can also use a secure encryption library to encrypt sensitive data.
● When sensitive data is sent via the URL, the value of the input id is encrypted and encoded into the URL; on the recipient system the URL value is decoded and decrypted to retrieve the original data.