Improve Magento SEO for your store using the robots.txt file

Magento is a great open-source e-commerce system, but when it comes to SEO there are some simple changes you can make that will improve your standing in the search engines. The robots.txt file helps search engine crawlers (such as Googlebot and Bingbot) determine which sections of your website they should crawl and index.

Creating a robots.txt file for your Magento store is important if you want to improve its SEO. By default there is no robots.txt file in either the Magento Community or Enterprise distribution, so you need to create one yourself. Place the file in the root of your Magento installation so that it is served from the top level of your domain (i.e. at /robots.txt — crawlers will not look for it anywhere else).

Please note: a robots.txt file only applies to the domain it is served from. If your Magento store spans multiple domains or sub-domains, you'll need to copy the robots.txt file to each of them.
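
If each domain has its own document root on the same server, a small script can keep those copies in sync. Here is a minimal sketch in Python — the paths are placeholders, so substitute the document roots of your own domains:

import shutil

# Placeholder paths - replace these with your own document roots
SOURCE = "/var/www/store-main/robots.txt"
DOCROOTS = [
    "/var/www/store-uk",
    "/var/www/store-eu",
]

for docroot in DOCROOTS:
    # Copy the master robots.txt into each additional document root
    shutil.copy(SOURCE, f"{docroot}/robots.txt")
    print(f"Copied robots.txt to {docroot}")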

How will robots.txt improve the SEO of my Magento store?

  • It will help you prevent duplicate content issues, such as sorted and filtered category URLs, that could damage your ranking in search engines (see the example below)
  • You can help prevent error logs, reports, core files, and .svn/.git files from being indexed accidentally
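
For instance, Magento's sorting and display options generate several URLs that all serve the same category content. The URLs below are purely illustrative (example.com stands in for your own domain):

https://example.com/shoes.html
https://example.com/shoes.html?dir=asc
https://example.com/shoes.html?dir=desc
https://example.com/shoes.html?limit=all
https://example.com/shoes.html?mode=list

The Disallow patterns in the example file further down block crawlers from these duplicate variations, so only the canonical category URL gets crawled.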

Example robots.txt file for your Magento Store

You should never blindly copy and paste an example file and use it on your store without reviewing it first. Every Magento store has its own structure, so you may need to adapt the robots.txt file below to suit your needs.

## Enable robots.txt rules for all crawlers
User-agent: *

## Don't crawl development files and folders
Disallow: /CVS
Disallow: /*.svn$
Disallow: /*.idea$
Disallow: /*.sql$
Disallow: /*.tgz$

## Don't crawl Magento admin page
Disallow: /admin/
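## (If you have changed the admin frontName in app/etc/local.xml, list that custom path here instead of /admin/)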

## Don't crawl common Magento folders
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /js/
Disallow: /lib/
Disallow: /magento/
Disallow: /media/
Disallow: /pkginfo/
Disallow: /report/
Disallow: /shell/
Disallow: /skin/
Disallow: /stats/
Disallow: /var/

## Don't crawl common Magento files
Disallow: /api.php
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /get.php
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /README.txt
Disallow: /RELEASE_NOTES.txt
Disallow: /STATUS.txt

## Don't crawl sub-category pages that are sorted or filtered.
Disallow: /*?dir*
Disallow: /*?dir=desc
Disallow: /*?dir=asc
Disallow: /*?limit=all
Disallow: /*?mode*

## Do not crawl links with session IDs
Disallow: /*?SID=

## Don't crawl the checkout and user account pages
Disallow: /checkout/
Disallow: /onestepcheckout/
Disallow: /customer/
Disallow: /customer/account/
Disallow: /customer/account/login/

## Don't crawl search pages and catalog links
Disallow: /catalogsearch/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/

## Don't crawl common server folders / files
Disallow: /cgi-bin/
Disallow: /cleanup.php
Disallow: /apc.php
Disallow: /memcache.php
Disallow: /phpinfo.php

## Uncomment these rules to stop Google and Bing indexing your images (not recommended for most stores)
# User-agent: Googlebot-Image
# Disallow: /
# User-agent: msnbot-media
# Disallow: /
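
Once your robots.txt file is live, it is worth sanity-checking the rules before relying on them. Below is a minimal sketch using Python's built-in robots.txt parser. Note that urllib.robotparser only implements the original robots.txt standard and does not understand the Google-style * and $ wildcards used above, so test plain path prefixes only:

from urllib.robotparser import RobotFileParser

# A few of the plain prefix rules from the example file above.
# urllib.robotparser does not support the "*" and "$" wildcard
# extensions, so only simple path prefixes are tested here.
rules = """\
User-agent: *
Disallow: /checkout/
Disallow: /customer/
Disallow: /app/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

print(parser.can_fetch("*", "/checkout/cart/"))     # False - blocked
print(parser.can_fetch("*", "/customer/account/"))  # False - blocked
print(parser.can_fetch("*", "/blue-shoes.html"))    # True - crawlable

For the wildcard rules, Google Search Console's robots.txt testing tool understands the extended syntax and will show you exactly which URLs are blocked.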

There are Magento extensions that can be installed to give you more flexibility over what pages are indexed in the search results, but this is the simplest approach that will help improve your search engine optimisation efforts.

Ben Lacey is the Managing Director of Lacey Tech Solutions. He is passionate about everything to do with websites, from design and development through to search optimisation and hosting. He started the company blog as a platform to help educate current and prospective customers about the ever-changing website development industry.

What are your thoughts?

We hope you found our article useful and look forward to answering your questions.