Understanding Rewrite Rules, Part 3 of 3: Advanced Rewrites
Written by Jeff Dunn   

Conventions in this tutorial:
yourdomain.com – replace with your domain
console.html – page where the user is sent (we send our hotlinkers to a console hell page)

As explained in Part 1, rewrite rules can help stop hot-linkers to your site and cut down on your bandwidth usage. The simple example explained in Part 1 is all most webmasters need, but rewrite rules give webmasters far more control over their websites than that.

In Part 2, I explained how the RewriteCond statement functions and gave more uses for it.

In this article, I will break down and give more advanced uses for the real workhorse of Rewriting, the RewriteRule statement. I will also give some more RewriteCond statement examples.

RewriteRule <Test Pattern> <Substitution> [Flags]

When the server encounters a RewriteRule, it takes the URL the surfer requested and compares it to the Test Pattern, which is a regular expression. If the URL matches the Test Pattern, it is changed to the Substitution URL.

In my previous articles, I have included examples with a RewriteEngine statement, then a RewriteCond statement, and then a RewriteRule statement. However, once the engine is switched on, the RewriteRule statement is all that is needed. Many webmasters will wonder what use a RewriteRule statement is without a RewriteCond statement. The most useful example I can give you is a situation where you have renamed all the files on your website. This happens, for instance, when a webmaster converts all .html pages to PHP pages, which must have the extension .php. You may have search engine and link traffic pointing to your old .html pages, and you simply want those hits sent to the matching pages with the new extension.

RewriteEngine on
RewriteRule ^(.*)\.html$ $1.php [L]

This is a short and sweet way to map all requests with .html extensions to .php files. The test pattern here is to the point, but it contains parentheses, which I haven't explained yet. Parentheses () capture the text they match so that it can be reused in the substituted URL. Here, I have put .* inside the parentheses. Recall that '.*' is basically a wildcard. The '.' means "any character" and the '*' means "0 or more of the preceding character". The '^' in front marks the start of the URL.

Following the parentheses is "\.html$". The backslash '\' tells the server that the period following it should be interpreted literally, as an actual period instead of its other meaning of "any character." The '$' marks the end of the URL, so only requests that end in ".html" match. Without the '^' and '$', a request such as "page.html.bak" would match as well. The server will look for ".html" at the end, and anything before it will be passed to the substituted URL in the form of a variable.

The substitution in this example is "$1.php". The $1 is a variable that holds whatever the first pair of parentheses in the test pattern matched. In this example, '$1' is everything before the ".html". The ".html" is not included inside the () and is therefore replaced with ".php". The [L] flag at the end means "last": if this rule matches, the server stops processing the rules that follow it.

OK, so let's put this statement into English. It says, "If the URL has the extension '.html', then take the text in the URL before it, replace '.html' with '.php', and serve the page at the new URL to the surfer." Note that the change happens inside the server: the surfer's browser still shows the .html address.
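
If you would rather have search engines and visitors see the new address, the same rule can issue an external redirect instead of a silent rewrite. The following is only a sketch, assuming the rule lives in the .htaccess file in your site's root directory. The R=301 flag (a permanent redirect) and the leading slash on the substitution are assumptions added for illustration; they are not part of the example above.

```apache
RewriteEngine on
# Match any path ending in .html; the parentheses capture everything before the extension.
# R=301 sends a "moved permanently" redirect, so the browser requests the .php URL itself.
RewriteRule ^(.*)\.html$ /$1.php [R=301,L]
```

With a redirect, the browser's address bar and any search engine following the link end up at the .php address, at the cost of one extra request per hit.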

What about situations where some of your site has been switched to .php but there are still .html pages left that you'd like to serve? In that case, we need a test to see if the .html file exists before replacing the extension.

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)\.html$ $1.php [L]

In this example, we have added a RewriteCond statement that checks whether the requested file exists before the RewriteRule statement is executed (for a basic understanding of the RewriteCond statement, please refer to Part 2). The REQUEST_FILENAME variable in the RewriteCond statement stores the full system path of the requested file. The '-f' in the Test Pattern means "see if the file exists." In this case, we want to see if the file does NOT exist, so we put a '!' in front of the -f. Recall that '!' means "Not Match." The result is that .html pages that still exist are served as they are, and only requests for missing .html pages are rewritten to .php.
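
As a further safeguard, a second condition can confirm that the .php replacement actually exists before the rewrite happens. This is only a sketch: it assumes the rules sit in the .htaccess file at the document root, so that $1 holds the path relative to that root. The extra RewriteCond line is an addition for illustration, not part of the example above.

```apache
RewriteEngine on
# Only rewrite when the requested .html file does not exist...
RewriteCond %{REQUEST_FILENAME} !-f
# ...and the matching .php file does exist ($1 comes from the RewriteRule pattern below).
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.*)\.html$ $1.php [L]
```

Requests for pages that exist under neither name then fall through untouched and get the normal 404 page.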

OK, time to move on to helping webmasters lower their bandwidth costs. We need to block other sites from using our images. If we want to make money from our content, we want people to come to a page with our ads, not directly to our images.

RewriteEngine on
RewriteCond %{HTTP_REFERER} !.*yourdomain.com.* [NC]
RewriteRule \.(gif|jpg|jpeg|png)$ - [NC,F]

(For info on this RewriteCond statement, refer to Part 1)

This example looks at the URL to see if the extension is that of an image. Because we are looking for whole sequences of characters, such as gif, we put the alternatives inside parentheses () and separate them with '|'. Brackets [] would not work here: they tell the server to look for a single character out of those stored inside. The '$' marks the end of the URL, so the server looks for anything ending in .gif, .jpg, .jpeg, or .png, and if the URL matches, the rule is applied. Here, we put a '-' as the substitution to signify that the URL is left unchanged. The reason is that rewriting an image request to an HTML file does not work well. Once the browser starts looking for an image, it expects an image and will not be happy if it doesn't get one. Sending an HTML page where the browser expects an image will not show the page to the surfer, and it will cost you more bandwidth. So, I would suggest just forbidding access to images. In this example, there is a new flag 'F', which stands for "Forbidden". This tells the server to send the error code 403 to the browser.

NOTE: In this example, make sure to use the 'NC' flag, otherwise a request to .GIF would go through.
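
Some browsers and proxies send no referer at all, and the rule above would forbid images for those visitors too. A common variation, sketched here as an assumption and not as part of the original example, adds a condition that lets requests with an empty referer through.

```apache
RewriteEngine on
# Skip the block when no referer was sent (the referer is an empty string).
RewriteCond %{HTTP_REFERER} !^$
# Skip the block when the referer is our own site.
RewriteCond %{HTTP_REFERER} !yourdomain\.com [NC]
# Forbid every other request that asks for an image.
RewriteRule \.(gif|jpg|jpeg|png)$ - [NC,F]
```

The trade-off is that a hotlinker whose visitors hide their referer will still get the image, so whether to include the first condition depends on how strict you want to be.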

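If you ever rewrite a URL to a script that accepts arguments in the form of ?something=something, you may need to use the 'NE' (no escape) flag. By default, the server converts special characters in the rewritten URL to their hex codes before sending it to the browser. The 'NE' flag tells the server not to do that and to send characters such as the ? and = signs as they are.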
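
To make the 'NE' flag concrete, here is a hedged sketch of a redirect to a script URL. The script name view.php, the item/ path, and the #details anchor are all hypothetical. Without 'NE', the '#' in the new URL would reach the browser as %23 and the anchor would be lost.

```apache
RewriteEngine on
# Redirect item/42 to /view.php?id=42#details
# NE keeps the special characters as written instead of hex-escaping them.
RewriteRule ^item/(.+)$ /view.php?id=$1#details [NE,R,L]
```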

Ok ok... so you really really want to send those image hotlinkers to an html page. It is possible. It will be a little fickle in operation though, so you will want to make sure it doesn't load down your server when you use it.

RewriteEngine on
RewriteCond %{HTTP_REFERER} !.*yourdomain.com.* [NC]
RewriteRule \.(gif|jpg|jpeg|png)$ http://www.yourdomain.com/console.html [NC,T=text/html]

The 'T' flag sets the "MIME type" of the response. When the browser requests an image, it expects a MIME type such as "image/gif", which is why it will be confused if it receives something else. So, in this example, we set the MIME type to "text/html", which tells the browser that it is getting an HTML page instead of an image.

NOTE: You may want to ask your administrator about the RewriteLog option. This makes it possible to log all rewriting that is done on the server. There is a limitation: the RewriteLog statement itself must be in the server's config files and cannot go in an .htaccess file. Once logging is switched on there, rewrites done by .htaccess rules show up in the log as well. Also note that Apache 2.4 removed RewriteLog; on those servers, rewrite logging is turned on through the LogLevel statement instead.
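
For reference, the logging setup might look like the sketch below. The log file path is hypothetical, and the lines belong in the server's config files, not in an .htaccess file. The first pair of statements applies to Apache versions that still have RewriteLog (2.2 and earlier); the commented-out last line is the Apache 2.4 equivalent.

```apache
# Apache 2.2 and earlier: write rewrite activity to its own log file.
RewriteLog "/var/log/apache/rewrite.log"
# 0 disables logging; higher numbers log more detail.
RewriteLogLevel 3

# Apache 2.4: rewrite messages go to the error log instead.
# LogLevel alert rewrite:trace3
```

Detailed rewrite logging slows the server down, so it is best switched on only while debugging.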

That is it for now. There are many other useful features of the Rewrite Code. Most of it is for expert users and will not benefit the average webmaster. I hope this gets you on your way to being able to write your own rewrite code.

For more information visit the following sites:

Apache's mod_rewrite page: http://httpd.apache.org/docs/mod/mod_rewrite.html

Apache's Rewriting Guide: http://httpd.apache.org/docs/misc/rewriteguide.html