How to put rel="canonical" on non-HTML resources

When the rel="canonical" tag was introduced in 2009, it was quickly adopted by SEOs. Unfortunately, because the canonical tag resides in the HTML head, you cannot insert it into non-HTML resources such as images or PDF documents.

Why is this a problem? If images or PDF documents play an important role on your website, they can outrank the HTML pages on your site. A redirect is not the answer: if you redirected the file, no one could read the document or see the image.

The solution is to declare a rel="canonical" for the image or document. Since you cannot place the canonical tag in an HTML head on non-HTML documents, search engines accept it as an HTTP header instead.

Instead of just showing you how to serve the rel="canonical" as an HTTP header, I am going to show you how to create any custom HTTP header.

How to create a custom HTTP header

The syntax for creating a custom HTTP header in an Apache .htaccess file is simple. You begin by identifying the file the HTTP header will be served for. This can be done with <Files> or <FilesMatch>.

Inside the opening <Files> or <FilesMatch> tag, place either the file name or a regular expression in quotation marks to identify the file. If you use a regular expression in <Files>, you must insert ~ before it.

As a rule, it is recommended that you use <Files> for file names and <FilesMatch> for matching with regular expressions.
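The three ways of matching files described above can be sketched as follows. This is an illustration only: it assumes mod_headers is enabled, and X-Example is a placeholder header name that does not appear in the original article.

```apache
# Assumes mod_headers is available; X-Example is a hypothetical header name.
<IfModule mod_headers.c>
    # Exact file name: use <Files>
    <Files "white-paper.pdf">
        Header set X-Example "matched by file name"
    </Files>

    # Regular expression inside <Files>: note the ~ before the pattern
    <Files ~ "\.(gif|jpe?g|png)$">
        Header set X-Example "matched by Files regex"
    </Files>

    # The same pattern with <FilesMatch>, the recommended form for regular expressions
    <FilesMatch "\.(gif|jpe?g|png)$">
        Header set X-Example "matched by FilesMatch regex"
    </FilesMatch>
</IfModule>
```

The two regular expression blocks match the same files, so in a real .htaccess file you would keep only one of them.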

Once you have matched your file(s), you will need to create your HTTP header. To do this, use the syntax Header ACTION NAME "VALUE", where ACTION is a keyword such as set or add.

In the HTTP header Content-Type: application/pdf, Content-Type is the name and application/pdf is the value. You do not need to place a colon after the NAME in your .htaccess file.
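As a rough sketch of how the Header directive is written, the fragment below uses the set and add actions from Apache's mod_headers. The header names X-Document-Owner and X-Document-Tag and their values are made up for illustration; only the directive syntax is the point.

```apache
# Assumes mod_headers is available; the header names and values are hypothetical.
<IfModule mod_headers.c>
    <Files "white-paper.pdf">
        # set: replaces any existing response header with this name
        Header set X-Document-Owner "Marketing"

        # add: adds another header line even if one with this name already exists
        Header add X-Document-Tag "white-paper"
        Header add X-Document-Tag "download"
    </Files>
</IfModule>
```

With set you end up with a single X-Document-Owner header, while the two add lines can produce two X-Document-Tag headers in the same response.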

Example: Custom HTTP Header

The following code will create a canonical tag on the file white-paper.pdf pointing to the desired HTML page.

<Files white-paper.pdf>
    Header add Link '<http://www.example.com/white-paper-download.html>; rel="canonical"'
</Files>

Creating canonical tags one file at a time is tedious on a large site. A global rule lets you add the canonical tag programmatically. The best way to do this is to save the file name in an environment variable using the RewriteRule E flag. Once we have the file name, we can create an HTTP header for each matching file.

Example: Dynamic HTTP Header

The following code will create a canonical tag for each PDF pointing to an HTML page with the same name.

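RewriteEngine On
# Save the PDF's file name, without the extension, in the FILENAME environment variable
RewriteRule ([^/]+)\.pdf$ - [E=FILENAME:$1]
<FilesMatch "\.pdf$">
    # %{FILENAME}e inserts the saved file name into the header value
    Header add Link '<http://www.example.com/download/%{FILENAME}e>; rel="canonical"'
</FilesMatch>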

This rule tells search engines that the canonical version of any PDF on the website is the HTML page with the same file name located in the /download directory.

For example, a PDF named epic-white-paper.pdf will have a canonical link pointing to http://www.example.com/download/epic-white-paper.

You can include a file extension after the e and before the closing > if you use file extensions on your website.
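For instance, if your HTML pages end in .html, the dynamic rule might look like the sketch below. The .html extension is an assumption about your site; substitute whatever extension your pages use.

```apache
RewriteEngine On
# Save the PDF's file name, without the extension, in the FILENAME environment variable
RewriteRule ([^/]+)\.pdf$ - [E=FILENAME:$1]
<FilesMatch "\.pdf$">
    # .html (an assumed extension) follows the e that closes %{FILENAME}e
    Header add Link '<http://www.example.com/download/%{FILENAME}e.html>; rel="canonical"'
</FilesMatch>
```

With this variant, epic-white-paper.pdf would get a canonical link pointing to http://www.example.com/download/epic-white-paper.html.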

Be careful when using this method. It will cause every PDF file covered by the .htaccess file to have a canonical link to an HTML page with the same file name. To ensure it works properly, I recommend following these rules.

  • Do not use this method in your root .htaccess file. Place it in an additional .htaccess file in a child directory such as /downloads.
  • Use all lowercase characters and hyphens between words in the name of your PDF.
  • Create the canonical HTML page prior to uploading your PDF document.
  • Enter the canonical link from each PDF into your web browser and ensure it works properly.

If you use Joomla, you may want to read more about how to create a canonical tag in Joomla.

Warning:

If you are using Nginx as a reverse proxy in front of an Apache web server and you are using Google PageSpeed (mod_pagespeed), image requests may bypass your .htaccess file. If this is the case, your canonical links will not be placed on images. To correct this, you may need to change some of your mod_pagespeed settings.

Daniel Morell

© 2018 Daniel Morell.