Understanding Rewrite Rules, Part 2 of 3: Redirecting Referrers
Written by Jeff Dunn   

Conventions in this tutorial:
yourdomain.com – replace with your domain
console.html – the page where the user is sent (we send our hotlinkers to a console hell page)

As explained in Part 1, rewrite rules can help stop hot-linkers to your site and cut down on your bandwidth usage. The simple example explained in Part 1 is all most webmasters need, but the functionality of rewrite rules allows a webmaster to have far more control over a website.

In Part 1, I explained how to keep referrals from sites other than your own from accessing important directories. Now I will explain how to block specific referring URLs from your site and how to send visitors from different URLs to different areas of your site. This is very useful for blocking chat sites that link to your images, and for sending quality traffic, such as search engine traffic, to a page with more marketing material.

First, here is the code to keep out the chat sites:

RewriteEngine on
RewriteCond %{HTTP_REFERER} .*bianca.com.* [NC,OR]
RewriteCond %{HTTP_REFERER} .*uol.com.* [NC,OR]
RewriteCond %{HTTP_REFERER} .*chatropolis.com.* [NC]
RewriteRule .* http://www.yourdomain.com/console.html [L]

NOTE: Some requests arrive with no referrer at all (bookmarks, direct type-ins, and older browsers, for example). To match those, use ^$ as the test pattern.
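
As a minimal sketch of the empty-referrer test, the negated pattern !^$ can be used to leave visitors without a referrer alone. The example assumes you only want to act on image requests that carry a referrer from somewhere other than your own site; the image extensions and the own-site pattern are illustrative assumptions, not part of the chat-site example above:

RewriteEngine on
# Skip requests that sent no referrer (bookmarks, type-ins, older browsers)
RewriteCond %{HTTP_REFERER} !^$
# Skip requests referred by your own pages
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain\.com [NC]
# Anything left is an outside site linking to an image
RewriteRule \.(gif|jpg)$ http://www.yourdomain.com/console.html [L]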

If you do not already understand the "RewriteEngine" statement or the "RewriteRule" statement, please refer to Part 1 of this article series.

As you can see, the "RewriteCond" statements in this example each have the same syntax. The only difference is the domain that we are looking for in the user's referral. The "RewriteCond" statement takes this format:

RewriteCond %{<test variable>} <test pattern> [flags]

The first argument, contained within %{}, is a property of the request that will be examined. The <test pattern> is a regular expression that will be looked for within that variable. In our example above, we are looking at "HTTP_REFERER", which is the referring URL of the visitor (the page he came to your site from). If you remember from Part 1, the expression ".*bianca.com.*" looks for bianca.com with anything before or after it. The ".*" is essentially a wildcard: a '.' means "any single character" and '*' means "0 or more of the preceding character." The 'NC' flag stands for "No-Case", meaning that the server should not care whether the test string is in upper- or lower-case letters.
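
Because an unescaped '.' matches any character, the pattern ".*bianca.com.*" would also match a referrer containing, say, "biancaXcom". A slightly stricter sketch, offered as an optional variation and not a required change, escapes the dot with a backslash. It also drops the leading and trailing ".*", which should not be needed because the pattern is not anchored and can match anywhere in the referrer:

RewriteEngine on
# \. matches a literal dot only
RewriteCond %{HTTP_REFERER} bianca\.com [NC,OR]
RewriteCond %{HTTP_REFERER} uol\.com [NC,OR]
RewriteCond %{HTTP_REFERER} chatropolis\.com [NC]
RewriteRule .* http://www.yourdomain.com/console.html [L]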

The flags in this example are a bit different from Part 1's example. When the server encounters more than one RewriteCond statement, it will only execute the following rewrite rule if all the statements match. In the example above, we need to tell the server that we don't want to match all the statements, only one of them. The "OR" flag at the end of a RewriteCond statement overrides the server default and tells it to match the first statement OR the second statement OR the third statement. When the server matches at least one of the statements, the RewriteRule statement is executed.

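NOTE: On the last RewriteCond, there is no "OR" flag. It is not needed since there is not another RewriteCond statement below it, and it is best left off: with a trailing "OR", a failed final condition can be ignored, and the RewriteRule may then run even when nothing matched.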
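
Conditions with and without the "OR" flag can be mixed. The following is a sketch, assuming that a run of OR-joined conditions is evaluated as one group and that any condition without "OR" must also hold. The extra REQUEST_URI condition is not part of the original example; it is added here as an illustration that keeps the rule from acting on requests for console.html itself:

RewriteEngine on
# Group: referrer is any one of the chat sites
RewriteCond %{HTTP_REFERER} .*bianca.com.* [NC,OR]
RewriteCond %{HTTP_REFERER} .*uol.com.* [NC,OR]
RewriteCond %{HTTP_REFERER} .*chatropolis.com.* [NC]
# AND the request is not already for the console page
RewriteCond %{REQUEST_URI} !console\.html$
RewriteRule .* http://www.yourdomain.com/console.html [L]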

Besides referrals, a webmaster can use RewriteCond for other purposes. The most common is to send the user to a different page based on the user's browser. The following would look for browsers other than Internet Explorer and Netscape.

RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} !^Mozilla.* [NC]
RewriteCond %{HTTP_USER_AGENT} !^Windows-Media.* [NC]
RewriteRule .* http://www.yourdomain.com/oldbrowser.html [L]

The above code looks for any client that is out of the mainstream. The most common use for this is to block access to a members' section by anything other than a browser. This would limit the use of auto-download software such as GetRight.

The '!' means "match if the pattern is NOT found." So, each condition is true if the user's browser does NOT match the test pattern. Because neither condition carries an "OR" flag, both must be true before the rule runs.

 

NOTE: Since many download programs will emulate major browsers, this code will not work fully for the purpose of blocking download programs. The code needed for the full job of blocking all download agents is out of the scope of this article.

Another use for the RewriteCond statement would be to send users from different hosts to different pages. For example, if we wanted to send all aol.com users to a different page more heavily laden with banners, we could use the following code:

RewriteEngine on
RewriteCond %{REMOTE_HOST} .*aol.com.* [NC]
RewriteRule .* http://www.yourdomain.com/aol.html [L]

NOTE: This will not work on many providers. It only works when the Apache server is set up to perform a reverse DNS lookup on the visitor's IP address. Because DNS lookups slow down every request, most providers are unwilling to turn them on unless you have your own server.
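
For readers who do run their own server, the reverse lookup is controlled by Apache's HostnameLookups directive. A sketch, assuming you have access to the main server configuration file (commonly httpd.conf) and accept the extra DNS query on each request; the directive is not allowed in .htaccess files:

# Main server configuration: resolve each client IP address to a hostname
# so that REMOTE_HOST contains a name such as an aol.com host
HostnameLookups On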

Other variables that may be useful for member site webmasters:

REMOTE_USER – For members' sections, this contains the username that was entered to access the site.

TIME_HOUR – This contains the hour of the day as a two-digit number (00 to 23). This could be useful if you wanted the user to see different pages at different times of the day. To compare values, use a comparison symbol such as =, <, or >. These compare the values as strings, so write the hour with its leading zero. For example, to send visitors to a different page before 6 in the morning, use this: RewriteCond %{TIME_HOUR} <06. There are many other time variables; look for a link to Apache's full list below.

HTTP_COOKIE – This contains a string with the user's cookies from your site. I just included this as an example of the functionality of rewrite. The usage of this variable is beyond the scope of this article.
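
The REMOTE_USER and TIME_HOUR variables above might be used along these lines. This is only a sketch: the member name "trial" and the pages upgrade.html and night.html are hypothetical, and the REMOTE_USER rule assumes it lives in the .htaccess file of the password-protected directory, so that the login name is already known when the rule runs. It also assumes TIME_HOUR is a zero-padded string compared character by character, which is why the hour is written as 06 and not 6:

RewriteEngine on

# Send the hypothetical member "trial" to an upgrade page
RewriteCond %{REMOTE_USER} ^trial$
RewriteCond %{REQUEST_URI} !upgrade\.html$
RewriteRule .* http://www.yourdomain.com/upgrade.html [L]

# Between 00:00 and 05:59 server time, send visitors to a night page
RewriteCond %{TIME_HOUR} <06
RewriteCond %{REQUEST_URI} !night\.html$
RewriteRule .* http://www.yourdomain.com/night.html [L]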

For more information on Rewrite and a complete list of variables, visit Apache's rewrite page:

http://httpd.apache.org/docs/mod/mod_rewrite.html#RewriteCond