Computer Networking and Practical Security

Fall 2006

Web Hacking

As demands for interaction in websites continue to increase, web developers must create ever more complicated solutions operating just under the surface. Web hacking, that is, hacking with only (or little more than) a web browser, is a common way to circumvent a network's defenses and get a web program to execute according to the attacker's own parameters. Since it doesn't require any actual networking knowledge, it tends to appeal to script kiddies and neophyte hackers, but its effectiveness should not be underestimated.

XSS - Cross Site Scripting

Cross site scripting refers to the act of injecting code into a webpage. The name itself is a bit misleading, since there is often no cross site behavior (or even scripting[1]). One of the most common occurrences is code entered through page forms that aren't "sanitized". An unsanitized form reads input without filtering potentially harmful characters or strings.
For a trivial example, imagine a "guest book" form on a website. A simple underlying CGI or PHP script appends the form's input string to a text file and, when the page loads, outputs the file with the appropriate HTML tags. But what happens when a guest enters their own HTML? If a guest enters the infamous "blink" tag, the browser will interpret it as just another HTML tag and perform the expected action. If the guest also fails to include an ending tag, the behavior could continue until the end of the page. Conversely, a "" tag could bring the page to a screeching halt.
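The guest book problem can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names and the sample entry are hypothetical, and a real CGI or PHP script would differ in detail. It shows the same entry rendered once with the input pasted in unchanged and once with the special characters converted by the standard library's html.escape.

```python
import html

def render_entry_unsafe(entry: str) -> str:
    # Vulnerable: the guest's text is pasted straight into the page markup,
    # so any tag it contains is interpreted by the browser.
    return "<p>" + entry + "</p>"

def render_entry_escaped(entry: str) -> str:
    # html.escape converts &, <, > and quote marks to character entities,
    # so the browser displays the tag as text instead of interpreting it.
    return "<p>" + html.escape(entry) + "</p>"

# Hypothetical guest book entry with an unclosed tag
entry = "Nice site! <blink>look at me"
print(render_entry_unsafe(entry))
print(render_entry_escaped(entry))
```

In the first line of output the blink tag survives intact; in the second it arrives as harmless text.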

A Simple Sanitization Example

Of course there are situations where site users have very valid reasons to enter HTML tags, such as modifying the appearance of a "livejournal" post, or sharing code in a forum. Here we cannot simply cut out any code-shaped input. Note that these two uses have different purposes: the first should be interpreted by the browser normally, while the second should be displayed verbatim. How do we do both (possibly at the same time)? Since we have two purposes, we need two different notations. One easy solution is to use double brackets " " instead of greater/less than signs for "interpreted" HTML. Then we run two stages of parsing: first replacing greater/less than signs with HTML character entities (for a guide, see [2]), and then replacing the double brackets with greater/less than signs. Of course we could just force users to type character entities themselves when they need them, but this would be distracting, and a hassle to convert if the user is copy-pasting.
We can also choose which interpreted tags we want to allow, since there's no reason a user would need to use the "" tag. Other dangerous "escape" characters such as quote marks and backticks (`) that might allow a user to enter their own script should always be converted to character entities or removed outright.
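The two-stage parsing described above might look roughly like the following Python sketch. Several things here are assumptions rather than part of the original scheme: the double brackets are taken to be [[ and ]], the allowlist of b, i, and u is hypothetical, and tags with attributes are deliberately not supported. Backticks are not handled by html.escape and would need a separate replacement.

```python
import html
import re

# Hypothetical allowlist of tags users may have interpreted as HTML
ALLOWED_TAGS = {"b", "i", "u"}

# Matches [[tag]] or [[/tag]] where tag is a bare alphabetic name (no attributes)
_BRACKET_TAG = re.compile(r"\[\[(/?)([a-zA-Z]+)\]\]")

def sanitize(text: str) -> str:
    # Stage 1: convert <, >, & and quote marks to character entities,
    # so nothing the user typed is interpreted as HTML.
    escaped = html.escape(text, quote=True)

    # Stage 2: turn allowlisted double-bracket tags into real tags;
    # anything not on the allowlist is left as literal text.
    def to_tag(match):
        slash, name = match.group(1), match.group(2).lower()
        if name in ALLOWED_TAGS:
            return f"<{slash}{name}>"
        return match.group(0)

    return _BRACKET_TAG.sub(to_tag, escaped)

print(sanitize("[[b]]bold[[/b]] and <script>alert(1)</script> and [[blink]]no"))
```

Running the stages in this order matters: if the brackets were converted first, the escaping stage would immediately neutralize the tags we meant to allow.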

JavaScript

The above HTML examples are fairly harmless. They disrupt the user's experience, but don't allow the attacker to gain any secret information or initiate a larger attack. Actual programming languages like Perl and JavaScript can have much more devastating effects. In a moment we will give an example that is little more than social engineering, but that can be used to trick users into downloading malicious files or revealing their passwords.
As you should know by now, WikiAcademia is a modified "wiki" that provides an effective collaboration method for students and teachers. It performs less sanitization than most implementations, relying primarily on authentication to limit who may edit pages rather than what may be placed on them.
JavaScript is a very popular scripting language for websites (more information and language descriptions can be found at [3]). Unlike Perl and PHP, which run on the server, it is usually placed directly inside the HTML page and interpreted by the browser. It can be used to modify behavior and appearance of the page based on user input, browser information, or the page's own data. For example, this code:
<script type="text/javascript">
if (navigator.appName == "Microsoft Internet Explorer") {
    document.write("<font color=red><b>ATTENTION! Your browser, " + navigator.appName + ", is not<br>");
    document.write("standards compliant! It is highly recommended that you<br>");
    document.write("upgrade to Firefox.<br></b></font>");
}
</script>
displays this message when viewed in IE:
ATTENTION! Your browser, Microsoft Internet Explorer, is not
standards compliant! It is highly recommended that you
upgrade to Firefox.
But the same code does nothing when viewed with a different browser.
Modern web developers use cascading stylesheets (CSS) and div tags to maintain consistent, easily modifiable pages. The following example uses this fact to subvert a WikiAcademia page outside of the normal editable text area:
<script type="text/javascript">
document.getElementById('column-one').innerHTML="<h1><a " +
    "href=\"http://www.maliciouswebsite.com\">" +
    "<span style='font-size:85%'>Computer<br>Networking<br>and<br>Practical<br>Security</span></a></h1><div>" +
    "<div id=\"semester\">Fall 2006</div></div><div><!-- navigation --><div><h5>course</h5><div><ul>" +
    "<li class='unselect'> <a href=\"http://www.maliciouswebsite.com\">home</a></li>" +
    "<li class='unselect'> " +
    "<a href=\"http://www.maliciouswebsite.com\">syllabus</a></li><li class='unselect'> " +
    "<a href=\"http://www.maliciouswebsite.com\">wiki</a></li><li class='unselect'> " +
    "<a href=\"http://www.maliciouswebsite.com\">assignments</a>" +
    "</li></ul></div></div><div><h5>student</h5><div><ul><li class='unselect'> " +
    "<a href=\"http://www.maliciouswebsite.com\">roster</a></li><li class='unselect'> <a " +
    "href=\"http://www.maliciouswebsite.com\">grades</a></li></ul></div></div><div>" +
    "<h5>navigation</h5><div><ul><li class='unselect'> <a href=\"/courses/help\">help</a>" +
    "</li><li class='unselect'> <a href=\"http://www.maliciouswebsite.com\">.. /</a></li>" +
    "<li class='unselect'> <a href=\"http://www.maliciouswebsite.com\">. /</a></li>" +
    "</ul></div></div></div><div><a href=\"http://www.maliciouswebsite.com\">" +
    "<img id=\"mc-logo\" alt=\"Marlboro College\" src=\"/courses/source/images/mc-logo.gif\"/></a>" +
    "</div></div><!-- end column-one -->";
</script>
This code was created by simply viewing the page source, copying the "column-one" div block, and changing links to point to a "malicious" website. The attacker then enters this code on the wiki page of their choice. The next time an unsuspecting user points their browser at the page, it modifies the page as specified before displaying it, and the user can walk into whatever trap the attacker has laid.

SQL - Structured Query Language

SQL is a database language used for querying and modifying a database. Most websites that provide dynamic user-specific content rely on some database software, and most of that software speaks SQL. We will only give a quick overview (since database theory could easily fill an entire class), focusing on the parts needed to orchestrate an attack. A SQL database consists of interconnected tables, each with some collection of columns specifying attributes. By naming a table and giving conditions on its columns, we can query the database for all records that match our criteria. For example, say we have a table called baseballplayers, with columns "name", "position", and "average". If our query is this:
SELECT name FROM baseballplayers WHERE average > .300;
the output will be a list of names of all players with an average greater than .300. If instead we wanted all information about a particular player, we could use this query:
SELECT * FROM baseballplayers WHERE name='Bobby Joe';
Remember that "*" is a wildcard meaning "anything" in shell filename patterns. Here it means essentially the same thing: every column. The above line translates as "select all columns from the table where the name is Bobby Joe" and would output all data fields. Other important commands to be aware of are "CREATE" and "DROP". These are used for the creation and destruction of tables (as well as other things we won't cover here). For example,
DROP TABLE baseballplayers;
removes our table of players. If this were an important table with many entries, the result could be disastrous.
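The three statements above can be tried out safely against a throwaway database. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the choice of SQLite and the sample rows (including the names and averages) are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE baseballplayers (name TEXT, position TEXT, average REAL)")
conn.executemany(
    "INSERT INTO baseballplayers VALUES (?, ?, ?)",
    [
        # Hypothetical sample data
        ("Bobby Joe", "shortstop", 0.312),
        ("Al Smith", "catcher", 0.254),
        ("Lou Diaz", "pitcher", 0.301),
    ],
)

# Names of all players with an average greater than .300
high = conn.execute(
    "SELECT name FROM baseballplayers WHERE average > .300 ORDER BY name"
).fetchall()
print(high)

# Every column for one particular player
bobby = conn.execute(
    "SELECT * FROM baseballplayers WHERE name='Bobby Joe'"
).fetchall()
print(bobby)

# DROP removes the table and everything in it
conn.execute("DROP TABLE baseballplayers")
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'"
).fetchall()
print(tables)  # the list of remaining tables is empty
```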

SQL Injection

SQL Injection is similar to the code injections we've already seen, again resulting from improperly sanitized input. If we're running a Perl CGI script for our webpage, a query might look like the following:
SELECT lastsession FROM users WHERE (userid='$user' AND password='$password');
Here $user and $password are variables corresponding to form fields for the page (it's worth noting that passwords should never be stored in plaintext like this, but we'll ignore that here). So when the SQL line is actually executed, it looks like
SELECT lastsession FROM users WHERE (userid='person92' AND password='cheesecake');
Now let's see what happens if the user "accidentally" enters in a single quote in the password field:
SELECT lastsession FROM users WHERE (userid='person92' AND password=' ' ');
Suddenly we have too many quote marks. Our SQL syntax is invalid, and we'll probably get an error message (this can actually be a good way to find out more information about the database). Let's expand our user's password entry to "' OR '1'='1":
SELECT lastsession FROM users WHERE (userid='person92' AND password='' OR '1'='1');
Now our syntax is valid again, but the query is not doing what it originally did. '1'='1' is just a boolean expression that evaluates to true. Because the AND operator has higher precedence (is performed first) than the OR operator, the condition reads as (userid='person92' AND password='') OR '1'='1', a logical OR statement, one side of which is always true. The result should then be that the lastsession column for every record is returned. If we wanted, we could append our own SQL statement by entering the following: "'); DROP TABLE users; --". The first statement should return nothing. The trailing "--" starts a comment, so the leftover "');" is ignored. That second statement is devastating though. DROP TABLE needs a table name (there is no "DROP TABLE *"), but once the attacker has learned the names, every table and record can be lost. Anybody want to cripple an online retailer?
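The always-true attack above can be reproduced against a toy database. This is a sketch under assumptions: it uses Python's sqlite3 module in place of the Perl script, and the second user and the session dates are invented sample data. The build_query helper is hypothetical and vulnerable on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userid TEXT, password TEXT, lastsession TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        # Hypothetical sample data (plaintext passwords only to mirror the text)
        ("person92", "cheesecake", "2006-10-01"),
        ("person93", "hunter2", "2006-10-02"),
    ],
)

def build_query(user: str, password: str) -> str:
    # Vulnerable on purpose: form input is pasted into the SQL text unchanged.
    return (
        "SELECT lastsession FROM users "
        f"WHERE (userid='{user}' AND password='{password}');"
    )

def login(user: str, password: str):
    return conn.execute(build_query(user, password)).fetchall()

print(login("person92", "cheesecake"))         # the one matching record
print(login("person92", "wrong"))              # no rows
print(build_query("person92", "' OR '1'='1"))  # the rewritten query text
print(login("person92", "' OR '1'='1"))        # every record in the table
```

Python's sqlite3 execute() accepts only one statement per call, so a stacked payload containing DROP TABLE would be rejected in this particular setup; other database interfaces may behave differently.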
Of course it's rarely that easy in the real world. SQL databases have users and permissions just like operating systems, so even if the site designer failed to sanitize input, we can hope they knew better than to grant all privileges to the script user. Because designers would have to make so many mistakes (and more often are using heavily tested software suites), these attacks are growing harder and harder to pull off, but they can still occur through inexperience and laziness.
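For contrast with the unsanitized queries above, one widely used safeguard (not described in the text, so treat this as a supplementary sketch) is to pass form input through query placeholders instead of building the SQL string by hand. The example assumes Python's sqlite3 module; login_safe is a hypothetical name and the row is sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userid TEXT, password TEXT, lastsession TEXT)")
# Hypothetical sample row
conn.execute("INSERT INTO users VALUES ('person92', 'cheesecake', '2006-10-01')")

def login_safe(user: str, password: str):
    # The ? placeholders bind the input as data; it is never parsed as SQL,
    # so quote marks in the input cannot change the query's structure.
    return conn.execute(
        "SELECT lastsession FROM users WHERE (userid=? AND password=?)",
        (user, password),
    ).fetchall()

print(login_safe("person92", "cheesecake"))   # the matching record
print(login_safe("person92", "' OR '1'='1"))  # no rows: treated as a literal password
```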

Sources and Further Reading

1) http://en.wikipedia.org/wiki/XSS
2) http://www.web-source.net/symbols.htm
3) http://www.w3schools.com/js/default.asp