Best robots.txt for Drupal hosting

Every day, millions of people use Google Image Search to find pictures, products, and people. If you're using Drupal, chances are you're not getting any of this traffic.

Drupal's robots.txt file contains a major mistake. Amazingly, the mistake has been there for years, and very few people seem to know about it.

Take a look at this excerpt from the default Drupal robots.txt file. Can you spot the problem?

[Image: excerpt from the default Drupal robots.txt file]

By default, every image you upload to your Drupal site gets stored somewhere inside the "sites" directory. And, by default, Drupal is blocking every search engine from looking inside your "sites" directory. In other words, your images aren't getting indexed!
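One way to see the effect of that rule is to run it through Python's standard urllib.robotparser module. The sketch below is only an illustration: the robots.txt text is a two-line sample rather than the full default file, and the image URL (example.com, sites/default/files/photo.jpg) is a hypothetical upload path, so substitute your own.

```python
from urllib.robotparser import RobotFileParser

# Two-line sample containing only the rule discussed here,
# not the complete default Drupal robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /sites/
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text forbids user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical upload location; the real path depends on your site's file settings.
IMAGE_URL = "http://example.com/sites/default/files/photo.jpg"

print("Image blocked:", is_blocked(ROBOTS_TXT, IMAGE_URL))
```

Any URL whose path starts with /sites/ is reported as blocked, while pages outside that directory are still allowed.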

If you've got a Drupal site with images you want other people to find, this is a serious problem. (I discovered this by accident last week, when I noticed that none of the images on my Photoshop Text Effects site were getting indexed by Google.)

To illustrate just how common this problem is, let's take a quick look at Dries Buytaert's blog. Dries is, of course, the creator of Drupal, but he's also a very good photographer. In fact, Dries has uploaded thousands of photos to his blog, including hundreds of pictures from DrupalCon and dozens of insightful graphs and charts. But how many of these images has Google actually indexed?

Only 13. Unfortunately, Dries's robots.txt file contains the standard "Disallow: /sites/" line.

If Dries is affected, you probably are, too. Running an e-commerce site? Your entire product line could be missing from Google Image Search. Have a photography blog? Yahoo and Bing are probably ignoring everything you post.

If no one can search for your images, you're literally turning away traffic. And not just image search traffic: High-quality, indexable images are a key feature of any high-ranking site. If your images aren't indexable, you're making a major SEO mistake.

Even worse, this problem doesn't just affect images. PDFs, Flash files, text documents, and other uploads all go into the same "sites" folder. Google knows how to index these files, but your robots.txt file is stopping Googlebot cold.

Fortunately, the solution is easy: Just remove "Disallow: /sites/" from your robots.txt file. The file is located in your main Drupal directory and can be edited with a standard text editor. Google should pick up the changes within a few days and start indexing your files shortly after.
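If you would rather script the change than edit the file by hand, the removal might look like the following Python sketch. The function name remove_sites_rule and the sample text are illustrative assumptions, not part of Drupal. Back up your real robots.txt before overwriting it.

```python
def remove_sites_rule(robots_txt: str) -> str:
    """Return robots_txt with every 'Disallow: /sites/' line removed."""
    kept = []
    for line in robots_txt.splitlines(keepends=True):
        # Ignore trailing comments and surrounding whitespace when comparing.
        rule = line.split("#", 1)[0].strip()
        field, _, value = rule.partition(":")
        if field.strip().lower() == "disallow" and value.strip() == "/sites/":
            continue  # drop the rule that hides uploaded files
        kept.append(line)
    return "".join(kept)


# Shortened sample for illustration, not the full default file.
SAMPLE = "User-agent: *\nDisallow: /sites/\nDisallow: /modules/\n"

print(remove_sites_rule(SAMPLE))
```

To apply it, read robots.txt from your main Drupal directory, pass the text to remove_sites_rule, and write the result back. All other lines, including comments, are left untouched.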

Fixing the robots.txt file should be a priority for the next Drupal point release. This is a major problem with a simple solution. Fortunately, someone has already created an issue on Drupal.org. Unfortunately, it's been unresolved for over a year. Let's change that.


Update: A fix for Drupal 6 was released on December 12th. If you're running Drupal 6.20 or later (including Drupal 7), this issue no longer affects you.

Did you find this article helpful? Check out my Drupal hosting review.

Posted by John on 2010-08-30
