Unit 2

Unit-II Introduction to input validation & sensitive data
Input validation requires the understanding of the following:
● Basics of input validation.

● Necessity of input validation.
● Implementation of input validation.
Input validation and sensitive data play a vital role in securing web and database applications.
This unit covers the input validation category of vulnerabilities, which threaten application security and can be exploited through methods
such as buffer overflow, cross-site scripting, SQL injection, and canonicalization attacks.
In addition, the sensitive data part of the unit covers access to sensitive information, sensitive data in storage, information
disclosure, and data tampering.
Basics of input validation
Input validation is a method of confirming that the user's input data is correct before the application processes the request.
Responsibility for input validation lies with application developers, who must find ways to stop input validation attacks.
These attacks exploit weak input validation by embedding malicious strings in form fields, query strings, cookies, and HTTP headers.
Now the question arises: can input validation attacks be completely stopped?
The answer is a little tricky, and the difficulty is easiest to see from the programmer's perspective.
For a software developer writing validation code, the challenge is that there may be no finite and unique definition of valid input
within or across applications: there are many ways a user can enter data that violates no coded rule
yet is still malicious.
Moreover, there is no single definition of malicious input that a developer can guard against.
To make matters worse, the risk of exploitation depends on what the application does with the input.
For instance, does the application use data stored on your machine, such as cookies, or does it offer data services that provide data for you
to use?
Necessity of input validation
The modern world is heavily reliant on web and database applications for information.
Information security is one of the most important concerns for a developer, who tries to make the software as robust and secure as
possible.
Of the many ways to breach application security, one of the most common is to search for and exploit vulnerabilities by manipulating the input
that the application sends and receives, often in order to reach the system's database.
Carefully crafted input is an attacker's most important tool for accessing an application database, modifying it, or even damaging it.
For this reason, input must be tested and verified to counter the malicious content that attackers inject into it.
Accurate input validation therefore safeguards the application against many such attacks.
Sound input validation is an effective countermeasure against buffer overflow, cross-site scripting, SQL injection, and canonicalization attacks.
Implementation of input validation
To make input validation dependable, adhere to the following best practices, which help mitigate input validation attacks.
1. Consider every input to be malicious
● Every programmer must assume, until proven otherwise, that all input is harmful.
● Whether the input arrives from a user, a database, a file share, or elsewhere, validate it if its source lies outside
your trust boundary.
● For instance, if you send a request to an external web service that returns strings, what guarantees
that the strings do not have malicious commands embedded in them?
● Moreover, if several applications write to a shared database and you read data from it,
how can you ensure that the data is safe?
2. Centralize the input validation process
● One of the most essential strategies when designing an application is to place the input validation and
filtering code in a shared library rather than repeating it on each and every web page.
● This centralized approach not only ensures that rules are applied consistently but also provides a
single place to modify or correct the code, so that changes apply uniformly to every program
that needs the validation function.
● This reduces software development effort and future maintenance issues.
● Use specifically designed regular expressions, exposed as global functions that can be
called from anywhere within the application, to validate individual fields such as
emails, postal codes, titles, names, places, phone numbers, and so forth.
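The centralized approach can be sketched as one small shared module that every page or handler imports. This is only an illustration: the field names, the patterns, and the length limits below are assumptions made for the example, not rules taken from this unit, and real applications will need patterns suited to their own data.

```python
import re

# Hypothetical shared validation module. Every handler calls is_valid()
# instead of re-implementing its own checks. All patterns are assumptions.
_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]{1,64}@[A-Za-z0-9.-]{1,253}\.[A-Za-z]{2,}"),
    "postal_code": re.compile(r"[0-9]{5,6}"),   # assumed: numeric codes of 5 or 6 digits
    "phone": re.compile(r"\+?[0-9]{7,15}"),
    "name": re.compile(r"[A-Za-z][A-Za-z' -]{0,49}"),
}


def is_valid(field: str, value: str) -> bool:
    """Return True only if the whole value matches the rule registered for field."""
    pattern = _PATTERNS.get(field)
    if pattern is None:
        # An unknown field is a programming error, not a validation result.
        raise KeyError(f"no validation rule registered for field {field!r}")
    return pattern.fullmatch(value) is not None
```

Because the rules live in one place, tightening a pattern changes the behaviour of every caller at once, for example is_valid("email", form_value) in a registration page and in a profile page.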
3. Never rely on client-side validation
● Perform client-side validation to reduce the number of round trips to the server, which improves server efficiency,
but never rely on it alone; always implement server-side validation as a rule of thumb.
● The reason is that an attacker can bypass client-side validation, for example by disabling JavaScript or by sending
requests directly to the server. If server-side validation is in place the bypass fails, and using both layers
provides defence in depth.
4. Be careful with canonicalization issues
● Avoid designing software that accepts file names from the user in order to access files, as
this is susceptible to canonicalization attacks.
● Instead, the application should itself determine which file needs to be opened and supply it from the
system.
● Canonical form is the simplest, most reduced form of a piece of data that is possible without
oversimplification, and the process of converting data to its canonical form is called
canonicalization.
● Data such as Uniform Resource Locators (URLs) and file paths can be represented in a canonical form
and are prone to exploits. For instance, a file path in canonical form
might be c:\temp\foo.dat.
● The same file can also be reached through other string representations, such as ..\foo.dat.
Therefore, guard against canonicalization issues by avoiding file name input from the
user.
● If you do need to accept file names from the user, ensure that each name is strictly
well formed before granting or declining permission to access a particular file, which
helps secure the application against canonicalization attacks.
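When a file name must be accepted from the user, one common way to apply the advice above is to canonicalize the requested path first and only then decide whether it is permitted. The sketch below assumes a hypothetical base directory (BASE_DIR) and uses POSIX-style "../" in its usage note rather than the Windows "..\" shown above; it is an illustration, not a complete defence.

```python
import os

# Hypothetical directory that the application is allowed to serve files from.
BASE_DIR = os.path.realpath(os.path.join(os.sep, "srv", "app", "files"))


def resolve_safe_path(user_supplied_name: str) -> str:
    """Canonicalize the requested path and reject it if it escapes BASE_DIR."""
    # realpath collapses "." and ".." components and resolves symbolic links.
    candidate = os.path.realpath(os.path.join(BASE_DIR, user_supplied_name))
    # commonpath compares whole path components, so a sibling directory
    # whose name merely starts with the same text does not pass the check.
    if os.path.commonpath([BASE_DIR, candidate]) != BASE_DIR:
        raise ValueError("path escapes the permitted directory")
    return candidate
```

With this check, resolve_safe_path("foo.dat") returns a path inside BASE_DIR, while resolve_safe_path("../foo.dat") raises ValueError because the canonical path lies outside it.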
● Strictly test the input data: Strict testing of input data gives developers a good degree of control over input validation. The first step is to
constrain input to known good data, that is, to permit only what is expected and to reject invalid data by matching on pattern, type, range, and
length.
Understand what your input is expected to look like and code to that expectation so that only valid input is accepted; define the finite set of data
that is valid rather than trying to describe the unbounded set of data that is malicious and therefore invalid.
For defence in depth, the code should also reject known bad input and then sanitize the input. The figure will
help you to understand the technique better.
The approach, description, and trade-offs of each step of strict input testing are as follows:
● Constrain input: Permit only good data by filtering input on format, type, range, and length.
Enforce the acceptable values for each form field in your application.
● Check every constraint and reject all other input as bad data. On the server side, set the character set so that input can be put into canonical form
before it is constrained, including localized input.
● Validate data for format, type, range, and length: Validate the data in input fields by using strong data types, parameterized
stored procedures for data access, and regular expressions for string fields, which gives finer control over malicious input.
Validating all input in terms of format, type, range, and length makes attacks through input considerably harder; an attacker may get a
payload past type checking, but a length check makes the attack tougher still.
● Reject known bad input: This is a less reliable technique, but it is effective when combined with the allow-good-data approach. To
reject bad data, the software must know every variation of input that can be considered malicious.
● This is difficult to achieve, because the number of ways in which data can be malicious may not be finite and does not
stay constant. The range of bad data changes over time, whereas the definition of valid data stays comparatively stable.
● For instance, a character can be represented in numerous ways, which is why the allow approach is preferred. In addition, the deny
approach is weaker than the allow approach because the patterns used in common attacks do not remain constant.
● Sanitize input:
Make potentially malicious data safe when the range of allowed input cannot guarantee that the input is safe.
Examples include stripping a null character from the end of a user-supplied string,
escaping values so that they are treated as literals, and using URL encoding or HTML encoding so that data is treated as literal text instead of
executable script.
Escape HTML characters with the HtmlEncode method and encode URLs with the UrlEncode method.
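The constrain, reject, and sanitize steps can be sketched as three small functions. The sketch is in Python, so html.escape and urllib.parse.quote stand in for the HtmlEncode and UrlEncode methods named above; the allow-list pattern, the 100-character limit, and the deny-list entries are assumptions chosen for illustration only.

```python
import html
import re
from urllib.parse import quote

# Assumed allow-list for a short comment field: letters, digits, spaces
# and a little punctuation, at most 100 characters.
ALLOWED = re.compile(r"[A-Za-z0-9 .,'-]{1,100}")
# Illustrative deny-list. It can never be complete, which is why it is
# only a second line of defence behind the allow-list.
KNOWN_BAD = ("<script", "javascript:")


def constrain(value: str) -> bool:
    """Step 1: accept only input that matches the allow-list in full."""
    return ALLOWED.fullmatch(value) is not None


def contains_known_bad(value: str) -> bool:
    """Step 2: reject input containing a known bad marker (case-insensitive)."""
    lowered = value.lower()
    return any(marker in lowered for marker in KNOWN_BAD)


def sanitize_for_html(value: str) -> str:
    """Step 3a: strip trailing null characters and HTML-encode the rest."""
    return html.escape(value.rstrip("\x00"), quote=True)


def sanitize_for_url(value: str) -> str:
    """Step 3b: percent-encode every reserved character for use in a URL."""
    return quote(value, safe="")
```

For example, sanitize_for_html("<script>alert(1)</script>") yields "&lt;script&gt;alert(1)&lt;/script&gt;", which a browser displays as text instead of running it.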
Practical solutions
To validate input fields in practice using the preceding approaches, consider the following:
● Last name field: Constrain input to strings in the ASCII ranges A to Z and a to z, together with
apostrophes and hyphens to handle names such as O’Connor and James-Carter. Also limit the length of the field to the longest value that
is likely to occur.
● Quantity field: Constrain input by checking the type and range of the data; for example, if the field needs an
integer in the range 0 to 100, reject any value that is not an integer and any integer that lies outside
that range.
● Free-text field: Constrain input by permitting letters, spaces, and common punctuation such as hyphens,
commas, and apostrophes, and by disallowing characters such as greater than, less than, braces, and brackets.
● An exception arises if you want to allow URL links or mark-up tags such as <i> for italics and <b> for bold in the free-text
field. For URL input, ensure that the value is encoded so that it is treated as a URL and nothing more.
● User input is not validated in an existing web application: Ideally, a web application checks user input at every entry point or input field.
If an existing web application does not validate its users' input, you need a stopgap
approach to reduce the inherent risk until proper validation is added to the application.
● Two approaches mitigate the risk. They do not ensure safe handling of data but provide short-term,
quick fixes, as described below:
● Sanitize input when the vulnerable website writes it back, using HTML/URL encoding: Potentially malicious input is
made safe by HTML-encoding or URL-encoding it, so that output is written back in a protected format. This is sanitization of input data in action.
● Reject malicious script characters that come from the vulnerable website: Bad input is rejected by using a
configurable set of malicious characters. As you already know, the definition of bad data does not remain constant and is context dependent, so
this approach is not completely secure.
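The last name, quantity, and free-text rules described above can be sketched as follows. The maximum lengths (40 and 500 characters) are assumptions, and the patterns accept only the plain ASCII apostrophe ('), so a name typed with a typographic apostrophe would need to be normalized first.

```python
import re

# Assumed limits: 40 characters for a last name, 500 for free text.
LAST_NAME = re.compile(r"[A-Za-z'-]{1,40}")
FREE_TEXT = re.compile(r"[A-Za-z ,'-]{0,500}")
QUANTITY = re.compile(r"[0-9]{1,3}")


def valid_last_name(value: str) -> bool:
    """Letters A-Z and a-z plus apostrophes and hyphens, bounded in length."""
    return LAST_NAME.fullmatch(value) is not None


def valid_quantity(value: str) -> bool:
    """Type check (ASCII digits only) followed by a range check of 0 to 100."""
    if QUANTITY.fullmatch(value) is None:
        return False
    return 0 <= int(value) <= 100


def valid_free_text(value: str) -> bool:
    """Letters, spaces, commas, apostrophes and hyphens; no < > { } [ ]."""
    return FREE_TEXT.fullmatch(value) is not None
```

Under these rules valid_last_name("James-Carter") and valid_quantity("42") pass, while valid_quantity("101") and valid_free_text("<b>hi</b>") are rejected.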
Input validation vulnerability
Input validation vulnerability can be exploited by an attacker using the following methods:
● Buffer overflow
● Cross-site scripting
● SQL injection
● Canonicalization
Basics of buffer overflow
A buffer overflow is a condition in which a buffer, a region of memory (typically in random access memory, RAM) set aside to hold
data, is overfilled.
This condition occurs when a program tries to store more data in a buffer than the buffer has space for, or when the
program writes to a memory location outside the bounds of the allocated buffer. The result can be a program crash,
corrupted data, execution of malicious code, or a security breach.
Buffer overflow is essentially a software vulnerability that can be maliciously exploited by an attacker.
The steps taken to carry out a buffer overflow attack are as follows:
● Entrance: The attacker needs an entry point to the server in order to tamper with the stack, either by
causing an overflow or by adding commands. A Trojan horse can be used for this step, as the Trojan sets up backdoor
software on the server.
● Smashing the stack: In this step the stack is filled with meaningless characters. Under normal circumstances this causes the operating system
or program to crash, because it loses the information it needs to continue executing.
An attacker who wants to go further can also overwrite the machine-language instructions that are about to be loaded.
● Running commands: Overflowing the stack can be used to issue commands to the operating system, and a command shell can be created in this way.
For example, by using “inetd” on UNIX a backdoor can be created, which can then be used to manipulate an X Window session. The
attacker inserts code that works on the same principle as remote-communication software. With this code, the attacker can take over
the keyboard, monitor, and mouse.
If a backdoor is successfully created on a UNIX server,
the attacker can then run a command shell.
On a machine that runs Windows, the attacker may instead plant a program,
for example a malicious ‘wininet.dll’.
This kind of attack is highly complicated and highly technical, so it requires
expertise and patience.
Knowledge of C programming and of the other languages used on the target machine
is a prerequisite for carrying out this kind of attack.
Buffer overflow attack
Buffer overflow attacks are most prevalent in programs written in languages such
as C and C++, which provide no built-in protection against accessing or overwriting
memory outside an array and do not check array bounds automatically;
such checks would prevent the overflow.
A technique called address space layout randomization (ASLR)
helps protect against such attacks.
ASLR randomizes the locations of a process's memory regions, which hinders
attackers who try to predict target addresses.
Buffer overflow
A web application can be affected by a buffer overflow attack, which attackers use to
corrupt the execution stack.
By sending specially crafted input to the web application, an attacker can cause it to run
malicious code and effectively take control of the machine.
Discovering a buffer overflow vulnerability in an application is difficult, and
exploiting it after discovery is even more arduous.
Despite the odds, ingenious attackers have identified and exploited buffer overflows
in a wide array of components and products.
Buffer overflow flaws can also be present in the application server or web server products that
serve the static and dynamic parts of a web application.
Web applications that use libraries, such as a graphics library to render images, are also prone to buffer
overflow attacks.
Custom web applications used by organizations can also contain buffer overflow
vulnerabilities, though the threat is lower because far fewer
attackers will try to find and exploit defects in one specific web application.
In a custom web application the flaw is therefore less likely to be discovered, and
even if it is discovered, the ability to exploit it is reduced because error messages
and source code for the application are rarely available to the attacker.
● Affected environments of buffer overflow
Nearly all known web application, application server, and web server environments
are susceptible to buffer overflows.
The notable exception is Java and J2EE environments, which are immune to these attacks,
apart from overflows in the Java Virtual Machine (JVM)
itself.
● Determination of buffer overflow vulnerability
Keep up with the latest bug reports for the libraries and server
products you use, to make sure they are not vulnerable.
For a custom-built web application, review all code that accepts
input through HTTP requests and check that it handles
arbitrarily large input with appropriate size checks; code that does not is vulnerable.
Protection from buffer overflow attack
● Analyse how your software manages memory to
ensure that arbitrarily large input is handled correctly and cannot overflow a buffer.
● Refer to the latest bug reports for the server products and software libraries
you use, and promptly apply the
patches released by the original software vendor to fix those issues.
● Periodically scan your web application for buffer overflow flaws by
running scanners, several of which are readily available on the Internet.
● When a flaw is found, fix it with appropriate size checking of inputs; this also guards against
operational problems and denial-of-service attacks.
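The size checking mentioned in the last point can be sketched as a guard that runs before any data is copied into a fixed-size buffer. Python checks bounds on its own, so this example only mirrors the explicit length check that code in C or C++ would need before copying; BUFFER_SIZE and store_input are hypothetical names used for illustration.

```python
BUFFER_SIZE = 64  # assumed fixed capacity of the destination buffer


def store_input(buffer: bytearray, data: bytes) -> int:
    """Copy data into a fixed-size buffer only if it fits; return bytes written."""
    if len(data) > len(buffer):
        # Reject oversized input instead of writing past the end of the buffer.
        raise ValueError(
            f"input of {len(data)} bytes exceeds buffer of {len(buffer)} bytes"
        )
    # The slice has exactly len(data) elements, so the buffer keeps its size.
    buffer[:len(data)] = data
    return len(data)
```

A caller would allocate the buffer once, for example buf = bytearray(BUFFER_SIZE), and treat the ValueError as a rejected request.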
Cross-site scripting
● Understanding cross-site scripting (XSS) is less about knowing scripting languages and
their categories, and more about understanding the threat that this
attack poses to web applications.
● Principally, cross-site scripting is an application security vulnerability in which attackers
inject client-side code into vulnerable web pages that are browsed by other users.
● The attack can help attackers evade access control mechanisms such as the same-origin policy
(SOP). Under the SOP, a web browser allows code in one web page to access
data in a second web page only if both pages have the same origin,
meaning the same URI scheme, hostname, and port number. This prevents malicious code on
one web page from gaining access to sensitive data on another web page via that page's document
object model (DOM).
● Application security breaches are commonly carried out using XSS or
SQL injection, which become possible when code is flawed or
unsophisticated and does not sanitize input.
● The main goals of an XSS attack are to steal session cookies for session
hijacking, steal passwords through credential theft, forge the data
presented to the user for monetary gain by deception, and
steal private, financial, or other sensitive data whose disclosure implies
loss of identity, money, or privacy.
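The same-origin comparison described above (scheme, hostname, and port number) can be sketched with the standard URL parser. This is a simplified approximation of what browsers do; the default-port table and the function names are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Default ports assumed when a URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, hostname, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a: str, url_b: str) -> bool:
    """True only when scheme, hostname and port all match."""
    return origin(url_a) == origin(url_b)
```

Under this rule https://example.com/a and https://example.com:443/b share an origin, while http://example.com/ and https://example.com/ do not, because the scheme differs.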
Detailed description of cross-site scripting
● A cross-site scripting vulnerability is present in any web application in which
developers either do not validate the input that users enter into web
forms, or use validation that is not strong enough to filter out encoded malicious
code.
● In the former case, the web application accepts any input, whether correct or incorrect,
from users.
● In the latter case, the filtering mechanism needs to be updated to perform more
rigorous validation.
● In either case, the vulnerability allows the application to
use the input data unhindered when generating its output.
● An attacker uses tools to search the Internet for vulnerable websites that do not
validate user input.
● Once a vulnerable website is found, the attacker injects
malicious code, usually in the form of a script, into the input data.
● The script is then executed together with the input data, making the attack successful. The end
user's web browser has no means of knowing that the script is untrustworthy and allows it to run.
● The browser assumes the script has arrived from a trusted source, when in fact it
originated elsewhere and should not be trusted
without strong validation.
● Attackers can also use encoding techniques such as Unicode encoding to make the malicious code look
unsuspicious to the user, which also makes it difficult to detect with filtering techniques.
● If no encoding is used, a simple detection method is to look for tags written with the
< > symbols.
● Nevertheless, attackers use numerous variations that do not rely on tags. Moreover,
attackers can find ingenious ways to trick web applications into relaying malicious code that developers are
unable to filter out, so there is a high probability of it being overlooked.
● In addition, XSS attack can propagate through embedded active content such as JavaScript,
VBScript, ActiveX (OLE), Flash, etc.
● XSS attacks are of four types: reflected, stored, DOM injection, and hybrid.
● In the reflected type, the injected script is reflected off the web server, for example in a search
result, an error message, or another response.
● The attacker delivers the script to users by way of another web server, an email
containing a deceptive form to fill in, or a malicious link to be clicked.
● In the stored type, the injected script is stored on the targeted web server, for example in a
visitor log, message forum, database, or comment section.
● Stored XSS is highly dangerous in blogs, forums, and content management
systems (CMS), where users see a large amount of input from other users.
● In DOM injection, the web application's JavaScript variables and code are altered
instead of its HTML elements.
● Finally, the hybrid type is a blend of the other three types.
● All of these XSS attack types cause the user's browser to execute the XSS code unknowingly.
● The outcome of an XSS attack is therefore the same regardless of type; the only
difference lies in the way the payload reaches the server.
Affected environments of cross-site scripting
● Nearly all known web application, application server, and web server environments
are susceptible to cross-site scripting.
Determination of cross-site scripting vulnerability

● When an unexpected condition occurs, web servers and web applications
return an error page displaying a message such as HTTP 404 (page
not found) or HTTP 500 (internal server error).
● These error pages can help you determine whether the website or
web server is vulnerable.
● Check whether the error page reflects any data that
the user supplied, such as the requested URL. If it does so without encoding the data,
the site is likely to be vulnerable to cross-site scripting.
Protection from cross-site scripting attack
● The best way to protect a web application from XSS attacks is to ensure that your application
performs validation of all headers, cookies, query strings, form fields, and hidden fields (i.e.,
all parameters) against a rigorous specification of what should be allowed.
● The validation should not attempt to identify active content and remove, filter, or sanitize it.
● There are too many types of active content and too many ways of encoding it to get around
filters for such content.
● We strongly recommend a ‘positive’ security policy that specifies what is allowed.
● ‘Negative’ or attack signature based policies are difficult to maintain and are likely to be
incomplete.
● Encoding user supplied output can also defeat XSS vulnerabilities by preventing inserted
scripts from being transmitted to users in an executable form.
● XSS has a surprising number of variants that make it easy to bypass blacklist validation.
● Watch out for canonicalization errors. Inputs must be decoded and canonicalized to the
application’s current internal representation before being validated.
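The last point, decoding and canonicalizing input before it is validated against a positive policy, can be sketched as follows. The allow-list pattern and the limit of five decoding rounds are assumptions for illustration; note that repeated decoding may not suit fields that legitimately contain percent signs.

```python
import re
import unicodedata
from urllib.parse import unquote

# Assumed positive policy for a search term: letters, digits, space, _ and -.
ALLOWED_QUERY = re.compile(r"[A-Za-z0-9 _-]{1,64}")


def canonicalize(value: str, max_rounds: int = 5) -> str:
    """Percent-decode until the value stops changing, then apply NFKC."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    else:
        # Still changing after max_rounds: treat as hostile over-encoding.
        raise ValueError("input is encoded too many times")
    # NFKC folds compatibility characters, such as full-width '<', to ASCII.
    return unicodedata.normalize("NFKC", value)


def is_allowed(raw: str) -> bool:
    """Validate the canonical form, never the raw encoded form."""
    try:
        return ALLOWED_QUERY.fullmatch(canonicalize(raw)) is not None
    except ValueError:
        return False
```

A double-encoded payload such as "%253Cscript%253E" would pass a naive check for "<" on the raw string, but it canonicalizes to "<script>" and is rejected by is_allowed.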
SQL injection
Basics of SQL injection
● Structured Query Language (SQL, often pronounced “sequel”) is one of the foundations of database handling.
Data needs to be saved in persistent storage; how and where it is stored is handled
by the database, which is controlled through SQL, commonly classed as a fourth-generation language.
● Once you are familiar with the role of SQL in programming, you can understand the essence
of SQL injection.
● It means injecting, or inserting, SQL into a query that the application builds dynamically, so that the attacker
can read, insert, or delete data in the database.
● SQL injection is at present one of the best-known and most common methods of attack.
● With this method, the database behind a website can be accessed by an unauthorized person.
● All the details of the database can be acquired by the attacker.
● Below are some of the ways in which this type of attack can be used:
● Logins can be bypassed.
● Secret data can be accessed.
● Website content can be modified.
● The MySQL server can be shut down.
● SQL injection is a code injection technique, used to attack data-driven applications, in
which malicious SQL statements are inserted into an entry field for execution,
for example to dump the database contents to the attacker.
● SQL injection must exploit a security vulnerability in an application’s software;
for instance, when user input is either incorrectly filtered for string literal escape
characters embedded in SQL statements or user input is not strongly typed and
unexpectedly executed.
● SQL injection is mostly known as an attack vector for websites but can be used to
attack any type of SQL database.
● SQL injection attacks allow attackers to spoof identity, tamper with existing data,
cause repudiation issues such as voiding transactions or changing balances, allow the
complete disclosure of all data on the system, destroy the data or make it otherwise
unavailable, and become administrators of the database server.
Detailed description of SQL injection flaws
● Injection flaws allow attackers to relay malicious code through a web application to another system.
● These attacks include calls to the operating system via system calls, the use of external programs via
shell commands, as well as calls to backend databases via SQL that is SQL injection.
● Whole scripts written in Perl, Python, and other languages can be injected into poorly designed web
applications and executed.
● Any time a web application uses an interpreter of any type there is a danger of an injection attack.
● Many web applications use operating system features and external programs to perform their
functions.
● Sendmail is probably the most frequently invoked external program, but many other programs are
used as well.
● When a web application passes information from an HTTP request through as part of an external
request, it must be carefully scrubbed.
● Otherwise, the attacker can inject special (meta) characters, malicious commands, or command
modifiers into the information and the web application will blindly pass these on to the external
system for execution.
● SQL injection is a particularly widespread and dangerous form of injection.
● To exploit a SQL injection flaw, the attacker must find a parameter that the web application passes
through to a database.
● By carefully embedding malicious SQL commands into the content of the parameter, the
attacker can trick the web application into forwarding a malicious query to the database.
● These attacks are not difficult to attempt and more tools are emerging that scan for these
flaws.
● The consequences are particularly damaging, as an attacker can obtain, corrupt, or destroy
database contents.
● Injection attacks can be very easy to discover and exploit, but they can also be extremely
obscure.
● The consequences can also run the entire range of severity, from trivial to complete system
compromise or destruction.
● In any case, the use of external calls is quite widespread, so the likelihood of a web
application having a command injection flaw should be considered high.
Affected environments of SQL Injection
● Every web application environment allows the execution of external
commands such as system calls, shell commands, and SQL requests.
● The susceptibility of an external call to command injection depends on
how the call is made and the specific component that is being called, but
almost all external calls can be attacked if the web application is not
properly coded.
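How the call is made matters: passing user input to an external program as a separate argument, rather than splicing it into a shell command string, keeps shell metacharacters inert. The sketch below is a minimal illustration that uses the running Python interpreter as a stand-in for an external program; echo_argument is a hypothetical helper.

```python
import subprocess
import sys


def echo_argument(user_input: str) -> str:
    """Pass user_input to a child process as one literal argument (no shell)."""
    completed = subprocess.run(
        # An argument list with no shell: ';', '*' and '|' have no special meaning.
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", user_input],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

Here echo_argument("report.txt; rm -r *") simply returns the same string, because the child process receives it as data and no shell ever interprets the semicolon.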
Illustration of SQL injection
● A malicious parameter could modify the actions taken by a system call that
normally retrieves the current user’s file to access another user’s file for example
by including path traversal “../” characters as part of a filename request.
● Additional commands could be tacked on to the end of a parameter that is
passed to a shell script to execute an additional shell command.
● For instance, “; rm -r *” can be appended to the existing command to carry out
a command injection attack; the appended command removes all files and directories
under the current directory recursively.
● SQL queries can also be altered by appending constraints such as “ OR 1=1”
to the WHERE clause in order to access data from the attacked database.
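The effect of an appended “ OR 1=1” constraint can be reproduced safely against an in-memory SQLite database. The table, its rows, and the function names below are invented for the demonstration, and the lookup function is deliberately unsafe so that the flaw is visible.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_vulnerable(conn: sqlite3.Connection, name: str) -> list:
    """DELIBERATELY UNSAFE: user input is concatenated into the SQL text."""
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()
```

A normal call such as lookup_vulnerable(conn, "alice") returns one row, but the input x' OR 1=1 -- turns the WHERE clause into a condition that is always true and returns every row in the table.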
Determination of SQL injection vulnerability
● To determine whether your web application is vulnerable to SQL
injection, search the entire code base for places where the
application makes requests to databases, interpreters, or other external resources,
using any kind of syntax such as structured queries, fork, exec, system, and so on.
● Understand that there are numerous ways in which external commands or
SQL code can be executed.
● Developers must therefore review their programs to find places
where input from an HTTP request can make its way into such calls
to the system.
● Knowing the steps that attackers use to carry out SQL injection will help you to
implement robust application security. These steps are as
follows:
● Finding a vulnerable website: Websites that may be vulnerable
can be found using “Google dorks”, which are search queries that use
Google's advanced search operators to locate such
websites.
● Checking the vulnerability: A single quote (') is used to check whether
the website is vulnerable.
● The single quote is appended to the URL, which is then requested.
● There should be no space between the last character of the URL and the single quote.
● The website is probably not vulnerable if the modified URL leads
to the same page or reports that the page is not found. If an error
associated with the SQL query is shown, then the website is vulnerable.
Protection from SQL injection attack
● In order to make your application immune to SQL injection vulnerability, avoid accessing
external interpreters as much as possible.
● If the same functionality is essential, then use programming language specific
libraries to perform the task, as these libraries do not require the involvement of the
interpreter. This eliminates the need to make system calls or write shell commands that
provide the same feature but create vulnerability.
● For the case where access to a backend database is essential, validate the
data so that no malicious content can pass through.
● Also, structure requests in such a way that all the input parameters are treated as
data and not as executable code.
● By adhering to the preceding protection mechanisms you can mitigate the
vulnerability involved in making external calls, but never eliminate the threat.
● So make sure that you always, as a rule of thumb, validate your input data
rigorously.
● The second and most important countermeasure is to make your application
free from the need to use admin rights and to grant only those permissions
that are essential to do the operation.
● Therefore, do not provide functionality in your application that uses root
to run a web server or DBADMIN to access a database.
● This will protect your application from an attacker who could misuse admin rights
via SQL injection to read, write, or delete your database content.
● Use the reusable components produced by the OWASP Filters project to stop
SQL injection, and the application-level firewall CodeSeeker from OWASP.
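The advice above to prefer language libraries over shell commands can be sketched as follows. This is a minimal illustration, assuming a hypothetical uploads directory and a hypothetical helper named delete_user_file. It deletes a file through the standard library instead of passing a filename to a shell, so text such as “; rm -r *” is never interpreted as a command, and a “../” request is rejected.

```python
import tempfile
from pathlib import Path


def delete_user_file(base_dir: Path, filename: str) -> bool:
    # Resolve both paths so "../" sequences are collapsed before checking.
    base = base_dir.resolve()
    target = (base / filename).resolve()
    # Refuse anything that ends up outside the permitted directory.
    if base not in target.parents:
        return False
    if not target.is_file():
        return False
    # Library call: no shell is involved, so nothing in the name is executed.
    target.unlink()
    return True


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    uploads = root / "uploads"          # hypothetical per-user directory
    uploads.mkdir()
    (uploads / "report.txt").write_text("draft")
    (root / "secret.txt").write_text("keep me")

    removed = delete_user_file(uploads, "report.txt")
    traversal = delete_user_file(uploads, "../secret.txt")
    injected = delete_user_file(uploads, "report.txt; rm -r *")
    secret_survives = (root / "secret.txt").exists()
```

Only the first call deletes anything; the traversal attempt and the injected command text are both refused.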
Canonicalization
Basics of canonicalization
● Canonicalization is the technique of reducing something to the simplest and most
significant form possible without loss of generality so as to manage ambiguity.
● This simplicity implies vulnerability.
● The simplest form can be easily and accurately guessed, which can then be exploited
by an attacker for malicious intent.
● For this reason, web application developer has to ensure that the developed software
has built-in safety mechanisms to deal with canonicalization issues from URL encoding to
IP address translation.
● Unicode encodings such as UTF-8 and UTF-16 can store a single character using multiple bytes.
● Attackers can conceal malicious code using Unicode and inject this code into user
input data, thereby accomplishing a wide range of attacks.
● Apart from this, traditional data transfer uses the GET or POST method to send input
data from client to server.
● In the GET method data is transmitted in the query section of a URL, while in the POST
method data is transmitted in the body of the HTTP request.
● URL encoding thus transmits encoded data complying with URL syntax; RFC 1738 defines
URLs and RFC 2396 defines URIs.
● Since the URL encoding technique allows virtually any data to be passed to the server, you must
develop your web application taking utmost precautionary measures to deal with malicious attacks.
● For instance, use the HTTP POST method instead of the HTTP GET method to submit form data; this
will avoid appending sensitive data to the URL.
● If transmitting data to the server using the URL is required, then limit the size and data type of the
input data, disallow free text through strict validation code, and sanitize the URL encoded
suffix.
● Finally, perform server-side validation in addition to client-side validation for complete
precaution.
● As a developer, you must understand that you need not transform your
application to an internationalized format just to make it secure; rather, it
should have security measures to handle cases when malformed or
Unicode input is inserted.
● Employ strict validation rules to sanitise and ensure submitted data is the
correct type and size.
Affected environments of canonicalization
● Every application environment is susceptible to canonicalization issue.
Determination of canonicalization vulnerability
● Internally a web application functions via ASCII, Unicode, UTF-16, or ISO 8859-1 encoding
technique.
● The user of your web application may be utilizing a different locale, and an attacker might opt
for their own character set and locale with complete freedom.
● In order to determine canonicalization vulnerability due to diverse input formats, scan the web
application code to determine if it has an internal code page, culture, or locale.
● If the default character set or locale is not asserted, it may be determined by any of the following: HTTP POSTs, HTTP
GETs, .NET, JSP, Java, or PHP.
● In order to protect from this vulnerability, set the character set and asserted language locale
as per the need.
● If an attacker injects malicious code into the user input via double encoding, then a single
check that decodes the input once to Unicode values will fail to detect it.
● To test for this, double encode an XSS payload using the XSS Cheat Sheet double encoder utility;
if the payload still takes effect, the application is vulnerable.
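The double encoding problem described above can be sketched in a few lines. The filter below is a deliberately naive, hypothetical check (naive_filter_allows is not from any library); it decodes the input once and looks for angle brackets, which is exactly the single check that double encoding defeats.

```python
from urllib.parse import unquote

# "<script>" with "<" and ">" percent-encoded twice: "%" itself becomes "%25".
double_encoded = "%253Cscript%253E"

once = unquote(double_encoded)   # one decoding pass
twice = unquote(once)            # a later layer decodes again


def naive_filter_allows(value: str) -> bool:
    # Hypothetical single-pass check: decode once, then look for "<" or ">".
    decoded = unquote(value)
    return "<" not in decoded and ">" not in decoded


passes_filter = naive_filter_allows(double_encoded)
```

After one pass the value is still percent-encoded, so the naive filter lets it through; the second pass, performed by some later component, produces the actual markup.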
Protection from canonicalization attack
● It is essential that an appropriate canonical form is chosen and that all
input from the users is canonicalized into that form prior to making any
authorization judgement.
● Ensure that the application performs its security checks after the UTF-8
decoding is complete.
● Also, make certain that the UTF-8 encoding is a correct canonical
encoding for the symbol it denotes.
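The points above can be sketched as a canonicalize-then-check routine. This is a minimal sketch, not a complete defence: the function names canonicalize and is_safe_path, the limit of five decoding rounds, and the simple “..” check are all assumptions made for illustration. Input is percent-decoded until it stops changing, decoded as strict UTF-8 (Python's decoder rejects overlong forms such as C0 AF for “/”), and only then inspected.

```python
from urllib.parse import unquote_to_bytes


def canonicalize(raw: str, max_rounds: int = 5) -> str:
    # Decode repeatedly until the value is stable, so layered encodings
    # are fully unwrapped before any security decision is made.
    current = raw
    for _ in range(max_rounds):
        # Strict UTF-8 decoding raises UnicodeDecodeError (a ValueError)
        # on invalid or non-canonical byte sequences.
        decoded = unquote_to_bytes(current).decode("utf-8")
        if decoded == current:
            return current
        current = decoded
    raise ValueError("too many encoding layers")


def is_safe_path(raw: str) -> bool:
    try:
        canonical = canonicalize(raw)
    except ValueError:
        return False
    # The check runs on the canonical form only.
    return ".." not in canonical.split("/")


unwrapped = canonicalize("%252e%252e%252fetc")
```

Here a doubly encoded “../etc” is unwrapped before the check, and an overlong UTF-8 encoding is rejected outright rather than silently normalized.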
Sensitive data
● Sensitive data comprises varied information including personal details such as
identification name, permanent account number, Aadhaar number, parent’s name, date of birth, etc.;
financial details including credit card data, bank account details such as bank name, bank account
number, IFSC code, user name, and password for internet banking, etc.; organizational details
including the details of organization, customers, employees, students, or patients; and even
information pertaining to classified matters or matters affecting national security. Protection of sensitive
data is essential to prevent identity theft, financial theft, or national security breach, and such data must be
protected by laws, policies, and regulations. Because of its value, sensitive data
is subject to diverse threats.
● Data can be exposed in three generic ways: social engineering, phishing, and intrusion.
● In social engineering, swindlers collect your sensitive data by posing as representatives
of an authentic organization and use your sensitive data for fraudulent purposes.
● In phishing, attackers use ingenious methods to extract your sensitive data over the network by sending an
email with attachments or links that ask you to input your sensitive information.
● Intrusion, on the other hand, searches for vulnerabilities in your computer to gain access to your
files containing sensitive data.
Detailed description of sensitive data
● In day to day administration, organizations collect large amounts of personal data. A large
part of this collected data is not considered sensitive, as it is
easily available from the Internet through search engines that can retrieve your data
from LinkedIn, Facebook, Twitter, or other social networking sites. The information
that is stored on these sites includes your name, telephone number, address, date of birth,
educational details, professional details, and preferences.
● None of this information is considered sensitive data, as a breach of it will
not harm you or cause financial loss. Nevertheless, an organization can ask for data
such as user Permanent Account Number, Aadhaar Number, Bank Account Number,
Legal Information, etc. that is considered sensitive.
● Data is considered sensitive if it is protected by government law or organizational
policy. The term "sensitive" is descriptive. Sensitive data may fit into various
classifications based on the legal requirements and use.
Examples of sensitive data
Personal data
● Permanent Account Number (PAN)
● Aadhaar Number
● Driving License Number
● Passport Number
● Visa Number
● Mother’s maiden name
Financial data
● Debit Card Number
● Credit Card Number
● Card Expiry Date
● Card Verification Value (CVV) Number
● Credit Details
● Tax Information
Government protected data
● Student information & grades.
● Medical, health, or psychological information.
Protection of sensitive data leakages
● Organizations are concerned about protecting their sensitive business data
from the prying eyes of hackers.
● If “data integrity” is the first priority, the organization needs an OWASP
compliant server.
● The following section describes some of the important leakage protection
mechanisms for sensitive data:
Protection mechanism
● Enforce strict data encryption: The first step is to categorize and identify
the sensitive data points. Once you have identified the critical data which
requires an extra protective cover, the next step would be to implement a
proven encryption technique. Make sure that sensitive data is kept under
the wrap of encryption all the time. Such data should neither be stored nor
transmitted in clear text format.
● Use SSL for user authentication: Protect all authentication gateways on
your website with the secure HTTPS (SSL/TLS) protocol. SSL/TLS authentication
is based on public/private key pairs, which means that only the holder of the
corresponding private key can decrypt the protected data and gain access to
sensitive data points.
● Implement strong password hashing algorithm: Password hashing is one of those things
that's pretty straightforward to implement, yet it is taken lightly by a large number of web
developers. Hackers can exploit the weakness in a password hashing algorithm to steal
sensitive information stored on a web or application server. Only cryptographic hash functions
should be used to implement password hashing, ideally combined with a per-user salt and a
deliberately slow key derivation function.
● Make use of penetration testing: It's a good idea to make your application undergo a third
party 'penetration test'. It will give you a fair insight on how secure the application is.
It's a fact that even the best programmers are susceptible to occasional mistakes. So it
makes sense to employ a trustworthy security expert to review the application for
potential vulnerabilities. For best results, the security review process should ideally
continue throughout the life-cycle of application development at a periodic interval.
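The password hashing advice above can be sketched with the standard library. This is one possible approach, not the only acceptable one: PBKDF2 with SHA-256 is swapped in here as a concrete example, and the iteration count, salt length, and function names hash_password and verify_password are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os

# Assumed work factor for this sketch; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password: str):
    # A fresh random salt per password defeats precomputed lookup tables.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # Constant-time comparison avoids leaking where the digests differ.
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("correct horse")
```

Only the salt and the derived digest are stored; the clear text password is never written anywhere.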
Sensitive data access
Basics of sensitive data access
● The useful data generated for specific purpose can be categorized as sensitive
and non-sensitive.
● Sensitive data are those data whose value is quite important for the user
and loss of the sensitive information can lead to loss of personal
information, bank information, and business information, which can lead to
loss of money, privacy, and data leak.
● So, in order to manage, protect, and secure sensitive data you need to
understand the methodology of how to restrict access to the sensitive data.
● For that you need to understand how sensitive data is stored, whether the data
is encrypted before storing, whether the data is placed on a secure server, and a lot more.
Hence, safeguarding access to sensitive data is of utmost priority for any
organization, institution, or other data management body.
Protection of sensitive data access
● In order to protect sensitive information access in an organization, take stringent security measures.
Allow authorized personnel to get past the security processes on a need to know basis. Allow the
sensitive information transfer, retrieval, pickup, and delivery only to the authorized personnel.
Defend against sensitive information theft and revelation to unauthorized individuals on laptops, desktops,
mobile devices, network, and postal service workstations.
● Restrict unauthorized persons from accessing sensitive information in
hardcopy such as printouts, or in softcopy on endpoint media such as USB flash drive, hard disk
drive, optical disk, and memory card.
● Protect stored or archived sensitive information by encrypting it using an advanced
encryption algorithm.
● Label sensitive information in electronic or printed material as ‘restricted information’, thereby restraining
others from accessing it.
● In the work environment, create a strong login password and change it once a month; also lock your
computer every time you leave it unattended.
● Track and inventory sensitive information from creation to destruction.
● Before disposal of sensitive information present in hardcopy, shred it so that no
one else can access it. Some of the don'ts for restricting access to sensitive
information include:
● Never disclose sensitive information without the permission of higher
management.
● Never print sensitive information on a publicly accessible
printing machine, as it can result in unauthorized viewing of the hardcopy.
● Never copy sensitive information unless you can secure the copied data
using a cryptographic technique.
● Never email sensitive data unless it is encrypted.
● Never talk about sensitive information where others can overhear.
● Never transmit sensitive information using FAX without the permission of
higher management.
Types of sensitive data
● Confidential data is used in a general sense to mean sensitive data whose access is subject
to restriction, and may refer to data about an individual as well as that which pertains to
a business.
● Even though they are often used interchangeably, personal data is sometimes
distinguished from private data, or personally identifiable data.
● The latter is distinct from the former in that private data can be used to identify a unique
individual.
● Personal data, on the other hand, is data belonging to the private life of an individual
that cannot be used to uniquely identify that individual.
● This can range from an individual’s favourite colour, to the details of their domestic life.
● The latter is a common example of personal data that is also regarded as sensitive, where
the individual sharing these details with a trusted listener would prefer for it not to be
shared with anyone else, and the sharing of which may result in unwanted
consequences.
Confidential business data
● Confidential business data is that data whose revelation can ruin or harm the business, its
operations, and hamper its sustainability.
● Examples of such confidential data include patentable inventions, customer and supplier data,
financial data, trade secrets, and more.
Classified sensitive data
● Classified sensitive data are those data that come in the purview of distinct security categorization
regulations as levied by several national governments, and the disclosure of which may cause
harm to national interests and security.
● The protocol of restriction imposed upon such data is categorized into a hierarchy of
classification levels in almost every national government worldwide, with the most restricted
levels containing information that may cause the greatest danger to national security if leaked.
● Authorized access is granted to individuals on a need to know basis who have also passed the
appropriate level of security clearance.
● Classified sensitive data can be reclassified or declassified.
Sensitive data in storage
Basics of sensitive data in storage
● As a measure of security, do not store sensitive data: if no sensitive data is
stored in digital format, there is no fear of it being stolen and so no loss.
● This is rarely possible in practice; otherwise the study of application security would lose its
importance.
● If storage of sensitive data is of utmost priority and necessary as in the case for financial
institutions, health care institutions, schools and colleges, government and private
organizations, and nation’s military, ammunitions, and other secret agencies; then,
identify those crucial sensitive data whose integrity and privacy breach can bring
catastrophe to individuals such as bank account holders, patients and doctors,
students and teachers, government and private employees, as well as to nation that
can create potentiality for war.
Techniques
● Cryptographic hash functions, otherwise known as one-way hash functions, generate a hash value that
cannot be reversed except by an extensive brute force attack, which is practically infeasible.
● Therefore, one cannot obtain the original data from the hashed value in a feasible amount of time.
● If the database containing the table of hashed passwords is compromised, the breach does not directly disclose
the passwords, because the hashes cannot be converted back into clear text, thereby safeguarding their security.
● Some of the hash functions include message digest algorithm (MD5) and secure hash algorithm (SHA-1, 2, or 3).
MD5 and SHA-1 are no longer considered secure and should be avoided in new applications.
● On the other hand, if you are storing sensitive data such as credit card numbers of customers, do not store it
directly or hash it, as hashing will not help.
● This is because you need the number itself for verification, not just for comparison, unlike the case of passwords
where the user input password is hashed and compared with the hash table to get a match to grant access or a mismatch to deny access.
● Instead, use a strong cryptographic technique: either symmetric key cryptography, which uses the same key for
encryption and decryption, or asymmetric key cryptography, which uses one key for encryption and another key for
decryption.
● For securing credit card numbers, use a symmetric key cryptographic technique such as Triple DES (3DES) to
encrypt the sensitive data. Note that 3DES is now deprecated, and AES is the usual choice for new systems.
Protection of sensitive data in storage
● In order to protect sensitive data in storage follow the succeeding points:
○ Utilize restricted access control list (ACL) to store sensitive data.
○ Utilize latest and advanced encryption algorithm to encrypt sensitive data.
○ Utilize role-based and identity-based authorization to access sensitive data.
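
The role-based authorization point above can be sketched as a simple permission lookup. This is only an illustration: the roles, permission names, and the can_access helper are hypothetical, and a real system would load them from a protected policy store rather than from a constant in code.

```python
# Hypothetical mapping of roles to the permissions they are granted.
ROLE_PERMISSIONS = {
    "teller": {"read_balance"},
    "auditor": {"read_balance", "read_card_number"},
}


def can_access(role: str, permission: str) -> bool:
    # Unknown roles get an empty permission set, so access is denied by default.
    return permission in ROLE_PERMISSIONS.get(role, set())


teller_sees_card = can_access("teller", "read_card_number")
auditor_sees_card = can_access("auditor", "read_card_number")
```

Denying by default means a missing or misspelled role never grants access to sensitive data.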

Information disclosure
● Basics of information disclosure:
Sensitive data disclosure is one of the most challenging issues; it can be prevented by using
encryption, secure socket layer technology, etc.
The severity of the data revelation is completely reliant on the sensitivity of the data being revealed.
For instance, revelation of bank information could cause financial loss, personal information could
cause identity theft, and user credentials could mean loss of privacy to the social networking sites,
or other websites where user name and password is mandatory to login to the website.
For a server-based application path disclosure can reveal the structure of the web application and
can be a threat for information security breach.
Information disclosure is influenced by the following
● Security flaw: If the stored sensitive data is not encrypted then it exposes security weakness and is susceptible to security
attack. Therefore, always store sensitive data by encrypting it using a strong encrypting algorithm, strong key
generation and management methodology, and strong password hashing technique.
● Flaws in a browser are easy to notice but difficult to abuse on a large scale, while flaws on the server side are difficult to discover
owing to access limitations and are equally difficult to
● Attack vectors: Cyber attackers generally do not try to decrypt cryptographic data directly; rather they
perform man-in-the-middle attacks, steal cryptographic keys, steal clear text information, or do something else to
breach security either from the user’s browser or from the data during the transfer process.
● Threat agents: Include both internal as well as external threats that can gain access to sensitive data present in your
cloud or system, during the transmission process, or directly from the client’s web browser.
● Business impacts: Loss of sensitive data can impact business as it can damage business reputation, imply penalty
from customers whose data is breached, and consequently result in decline of
● Technical impacts: Loss of sensitive data exposes technical shortcomings. The data which should be protected is
compromised due to frequent failures of the security system of an organization.
[Figure: source, destination, and intruder]
Information disclosure vulnerability
● It is essential that you should understand the importance of sensitive data and take extra precautionary measure while storing it
in digital format.
● If the sensitive data is present in your storage device and can be accessed through the network or needs to be transmitted to
external source for specific purpose, then you need to protect your data and minimize the vulnerability of information
disclosure.
● In order to safeguard your sensitive data from exposure keep in mind the following points:
● Never store sensitive data in clear text format including the backup for long term.
● Never transmit sensitive data in clear text format either internally or externally.
● Always use advanced and latest cryptographic algorithms to encrypt sensitive data.
● Use strong cryptographic keys and key management technique to prevent data breach.
● Ensure the presence of browser security headers or directives while transmitting sensitive data.
Information disclosure prevention
● It is essential to use advanced data protection methods, strong cryptographic algorithms, and secure
socket layer to prevent accidental disclosure of information, malicious attack by a hacker, or
attack by an insider.
● For all sensitive data, do the following:
● Encrypt all sensitive data in stored form and during transmission process.
● Avoid unnecessary storage of sensitive data and throw away if not needed.
● Always use advanced encrypting algorithm, strong key, & sound key management practice.
● Use specifically designed password protection algorithm to store and manage passwords.
● Disable caching for web pages and auto complete on forms that gather sensitive data
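The last point above can be sketched as a set of HTTP response headers plus a form field attribute. This is a minimal sketch: the helper name sensitive_page_headers, the field name card_pin, and the one-year HSTS lifetime are assumptions, and how the headers are attached depends on the web framework in use.

```python
def sensitive_page_headers() -> dict:
    # Headers intended for responses that contain sensitive data.
    return {
        # Tell browsers and proxies not to keep a copy of the page.
        "Cache-Control": "no-store",
        # Legacy equivalent for HTTP/1.0 caches.
        "Pragma": "no-cache",
        # Ask browsers to use HTTPS only for this site (assumed one year).
        "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    }


# Hypothetical form field with browser auto complete switched off.
PIN_FIELD = '<input type="password" name="card_pin" autocomplete="off">'

headers = sensitive_page_headers()
```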
Data tampering
● Tampering of data is one of the challenges that the contemporary world
is facing.
● You have already understood the basics of sensitive data, its storage,
and its susceptibility to network eavesdropping.
● In this section you will go through the data tampering and how to protect
sensitive data from being tampered once it is transmitted.
Data tampering protection in URL
It is sometimes required by a legitimate website to transmit sensitive data from one page to another or from a client to
the server machine and vice versa using URL.
This is something that needs special care in order to protect the sensitive data in transit by way of URL, as the data can
be accessed by the attacker.
Transmission of data through URL can be completed using GET or POST method in the <form> tag or via the key/value
pair.
You can protect sensitive data from tampering using various techniques including input validation technique.
For instance, suppose a database query requires that an integer input be sent via the URL.
By verifying that the data is of integer type, the input can be checked for correctness.
The input validation method can of course help prevent unexpected behavior of the program, but it does not
defend against data tampering.
So, how can data be safeguarded against a tampering attack? The following topics discuss in brief how to prevent data tampering:
● Data tampering protection in database:
● If you want to reference a row in a database without leaking information, for example how many rows
are present in the database, you can simply store a randomly generated unique identifier in an extra
column and reference that instead.
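The random identifier idea above can be sketched as follows. The orders table, the public_id column, and the helper names are hypothetical; the point is only that the value exposed in a URL is an unguessable token rather than a sequential row number.

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "          # internal sequential key, never exposed
    "public_id TEXT UNIQUE NOT NULL, "  # random reference used in URLs
    "item TEXT)"
)


def add_order(conn, item: str) -> str:
    # 16 random bytes, URL-safe encoded; reveals nothing about row count.
    public_id = secrets.token_urlsafe(16)
    conn.execute(
        "INSERT INTO orders (public_id, item) VALUES (?, ?)",
        (public_id, item),
    )
    return public_id


def get_order(conn, public_id: str):
    row = conn.execute(
        "SELECT item FROM orders WHERE public_id = ?", (public_id,)
    ).fetchone()
    return row[0] if row else None


first = add_order(conn, "book")
second = add_order(conn, "lamp")
```

A request for the internal key value, such as "1", finds nothing, because lookups go through the random column only.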
● Data tampering protection through hashing.

Data tampering can be detected using a hashing algorithm.
For example, when an id variable of a user is passed from one web page to another as the user keeps
on browsing the website, the web application expects the id to remain unchanged.
On the sending side, each page calculates the hash value of the id together with a secret
and transfers this hash value along with the id value.
On the receiving side, re-hashing the id and matching the result against the
hash value sent by the previous web page tells the application whether the data
has been tampered with by a malicious user or not.
If both hash values match, then the data has not been tampered with; otherwise
something fishy is going on.
There is one caveat to the preceding procedure: the id variable is
visible to the attacker. Nevertheless, the attacker does not
know the secret used for generating the hash
value, and so cannot tamper with the data undetected.
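The sender and receiver steps above can be sketched with a keyed hash (HMAC). This is a minimal sketch under stated assumptions: HMAC-SHA256 is one suitable keyed hashing construction, the SECRET_KEY value is a placeholder that would be kept server-side, and the names sign_id, make_query, and verify_id are hypothetical.

```python
import hashlib
import hmac

# Placeholder secret; in practice it is random and never sent to the client.
SECRET_KEY = b"server-side-secret-change-me"


def sign_id(user_id: str) -> str:
    # Keyed hash of the id: cannot be recomputed without the secret.
    return hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def make_query(user_id: str) -> str:
    # Sender: transfer the id together with its hash value.
    return f"id={user_id}&sig={sign_id(user_id)}"


def verify_id(user_id: str, sig: str) -> bool:
    # Receiver: re-hash the received id and compare in constant time.
    return hmac.compare_digest(sign_id(user_id), sig)


signature = sign_id("42")
query = make_query("42")
```

If an attacker changes the id from 42 to 43 in the URL, the signature no longer matches and the request can be rejected.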
Data tampering protection through encryption
● Sensitive data can be protected by using symmetric keys (the same key is
used for encryption and decryption) so that the actual data value is not disclosed.
● The idea is similar to the hashing approach, but it uses symmetric keys for
encryption and decryption of sensitive data.
● You can also use a secure encryption library to encrypt sensitive data.
● When sensitive data is sent via URL, the value of the input id is encrypted
and then encoded for use in the URL. On the recipient system the value is
decoded and then decrypted to retrieve the original data. Note that URL
encoding by itself is not encryption.
