Today I needed to prevent a Drupal 6 website from being indexed by search engines. Normally this is an easy task: have the robots.txt file disallow everything. The problem was that I was using Drupal's multisite feature, and I needed to keep a sub-domain out of the index while the main site continued to be indexed.
After a bit of searching I came across an article that used .htaccess to rewrite requests for robots.txt depending on the domain the visitor requests.
First, I created a new file, testsite_robots.txt, containing directives that disallow all crawling.
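For reference, a disallow-all robots.txt looks something like this (the exact contents of my file may differ, but this is the standard form):

User-agent: *
Disallow: /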
Second, I opened Drupal's .htaccess file and added the following rewrite condition and rule inside the existing mod_rewrite section:
# Only act on requests for the test sub-domain
RewriteCond %{HTTP_HOST} =testsite.example.com
# Serve the restrictive robots file in place of the default robots.txt
RewriteRule ^robots\.txt$ /testsite_robots.txt [L]
These directives check which host the visitor is requesting; when the request is for the sub-domain's robots.txt, Apache silently serves testsite_robots.txt instead, while the main site keeps serving its normal robots.txt.
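For context, here is a rough sketch of how the addition sits inside the mod_rewrite block of Drupal 6's stock .htaccess (the surrounding directives are abbreviated; only the two lines flagged as new are mine, and your copy of the file may vary):

<IfModule mod_rewrite.c>
  RewriteEngine on

  # New: serve a restrictive robots.txt only to the test sub-domain
  RewriteCond %{HTTP_HOST} =testsite.example.com
  RewriteRule ^robots\.txt$ /testsite_robots.txt [L]

  # Drupal's existing clean-URL rules follow; since robots.txt exists
  # on disk, the !-f condition keeps it from being routed to index.php
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteRule ^(.*)$ index.php?q=$1 [L,QSA]
</IfModule>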