A redirect happens when someone asks for a specific page but gets sent to a different page. Often, the site owner deleted the page and set up a redirect to send visitors and search engine crawlers to a relevant page instead. That’s a much better approach than serving them an annoying, user-experience-breaking 404 message. Redirects play a big part in the lives of site owners, developers, and SEOs. So let’s answer a couple of recurring questions about redirects for SEO.

Optimize your site for search & social media and keep it optimized with Yoast SEO Premium »


1. Are redirects bad for SEO?

Well, it depends, but in most cases, no. Redirects are not bad for SEO, but — as with so many things — only if you put them in place correctly. A bad implementation might cause all kinds of trouble, from loss of PageRank to loss of traffic. Redirecting pages is a must if you make any changes to your URLs. After all, you don’t want to see all the hard work you put into building an audience and gathering links go down the drain.

2. Why should I redirect a URL?

By redirecting a changed URL, you send both users and crawlers to a new URL, keeping annoyances to a minimum. Whenever you perform any kind of maintenance on your site, you are actually taking stuff out: you could be deleting a post, changing your URL structure, or moving your site to a new domain. You have to redirect those old URLs, or visitors will land on those dreaded 404 pages. If you make small changes, like deleting an outdated article, you can redirect the old URL with a 301 to a relevant new article, or give it a 410 to say that you deleted it. Don’t delete stuff without a plan. And don’t redirect your URLs to random articles that don’t have anything to do with the article you’re deleting.

Bigger projects need a URL migration strategy: going from HTTP to HTTPS, for instance (more on that later in this article), changing your URL paths, or moving your site to a new domain. In these cases, you should look at all the URLs on your site and map them to their future locations on the new domain. After determining what goes where, you can start redirecting the URLs. Use the change of address tool in Google Search Console to notify Google of the changes.

3. What is a 301 redirect? And a 302 redirect?

Use a 301 redirect to permanently redirect a URL to a new destination. This way, you tell both visitors and search engine crawlers that this URL has changed and that its content can be found at a new destination. This is the most common redirect. Don’t use a 301 if you ever want to use that specific URL again; in that case, you need a 302 redirect.

A 302 redirect is a so-called temporary redirect. You can use it to say that a piece of content is temporarily unavailable at this address, but it is going to come back. Need more information on which redirect to pick?
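On an Apache server, for instance, both kinds of redirect can be set up with a couple of lines in the .htaccess file. This is only a sketch; the paths and domain below are placeholders:

```apache
# 301: the old post has moved to a new URL for good
Redirect 301 /old-post/ https://example.com/new-post/

# 302: the promo page is temporarily served elsewhere, but will return
Redirect 302 /promo/ https://example.com/temporary-promo/
```

NGINX and other servers have their own equivalents, so check your server’s documentation before copying anything over.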

4. What’s an easy way to manage redirects in WordPress?

We might be a bit biased, but we think the redirects manager in our Yoast SEO Premium WordPress plugin is incredible. We know that a lot of people struggle to understand the concept of redirects and the kind of work that goes into adding and managing them. That’s why one of the first things we wanted our WordPress SEO plugin to have was an easy-to-use redirect tool. I think we succeeded, but don’t take my word for it.

The redirects manager helps you set up and manage redirects on your WordPress site. It’s an indispensable tool if you want to keep your site fresh and healthy. We made it as easy as possible. Here’s what happens when you delete a post:

  • Move a post to trash
  • A message pops up saying that you moved a post to trash
  • Choose one of two options given by the redirects manager:
    • Redirect to another URL
    • Serve a 410 Content deleted header
  • If you pick redirect, a modal opens where you can enter the new URL for this particular post
  • Save and you’re done!

So convenient, right? Our article ‘What does the redirects manager in Yoast SEO do?’ answers that question in more detail.

5. What is a redirect checker?

A redirect checker is a tool to determine if a certain URL is redirected and to analyze the path it follows. You can use this information to find bottlenecks, like a redirect chain in which a URL is redirected many times, making it much harder for Google to crawl that URL — and giving users a less than stellar user experience. These chains often happen without you knowing about it: if you delete a page that was already redirected, you add another piece to the chain. So, you need to keep an eye on your redirects and one of the tools to do that is a redirect checker.
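To make the chain problem concrete, here’s a small, hypothetical Python sketch that follows a URL through a mapping of known redirects and flags chains and loops. The data and function name are made up for illustration, not taken from any real tool:

```python
def follow_redirects(redirects, url, max_hops=10):
    """Follow a URL through a {source: destination} redirect mapping.

    Returns the final URL and the number of hops taken.
    Raises ValueError on a redirect loop or an overly long chain.
    """
    seen = set()
    hops = 0
    while url in redirects:
        if url in seen:
            raise ValueError(f"Redirect loop detected at {url}")
        seen.add(url)
        url = redirects[url]
        hops += 1
        if hops > max_hops:
            raise ValueError("Redirect chain too long")
    return url, hops

# A chain: /a -> /b -> /c takes two hops instead of one direct redirect
redirects = {"/a": "/b", "/b": "/c"}
print(follow_redirects(redirects, "/a"))  # ('/c', 2)
```

A result with more than one hop means you can shorten the path by redirecting the first URL straight to the final destination.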

You can use one of the SEO suites, such as Sitebulb, Ahrefs, or Screaming Frog, to test your redirects and links. If you only need a quick check, you can also use a simpler tool like httpstatus.io to give you insight into the life of a URL on your site. Another must-have tool is the Redirect Path extension for Chrome, made by Ayima.

6. Do I need to redirect HTTP to HTTPS?

Whenever you plan to move to the much-preferred HTTPS protocol for your site (you know, the one with the green padlock in the address bar), you must redirect your HTTP traffic to HTTPS. You could get into trouble with Google if you make your site available on both HTTP and HTTPS, so watch out for that. Also, browsers will show a NOT SECURE message when the site is, you guessed it, not secured by an HTTPS connection. Plus, Google prefers HTTPS sites, because these tend to be faster and more secure. Your visitors expect the extra security as well.

So, you need to set up a 301 redirect from HTTP to HTTPS. There are a couple of ways of doing this, and you must plan it to make sure everything goes as it should. The preferred way is to do it at server level. Find out what kind of server your site is running on (NGINX, Apache, or something else) and find the code you need to add to your server config file or .htaccess file. Most often, your host will have a guide to help you set up a redirect from HTTP to HTTPS at server level. Jimmy, one of our developers, also wrote a guide helping you move your website from HTTP to HTTPS.
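On Apache, for example, a common way to do this in the .htaccess file uses mod_rewrite. Treat this as a sketch and check your host’s own guide first, since server setups differ:

```apache
RewriteEngine On
# Only rewrite requests that did not arrive over HTTPS
RewriteCond %{HTTPS} off
# Send a permanent (301) redirect to the HTTPS version of the same URL
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
```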

There are also WordPress plugins that can handle the HTTPS/SSL stuff for your site, but for this specific issue, I wouldn’t rely on a plugin, but manage your redirect at a server level. Don’t forget to let Google know of the changes in Search Console.

Redirects for SEO

There are loads of questions about redirects to answer. If you think about it, the concept of a redirect isn’t too hard to grasp. Getting started with redirects isn’t that hard either. The hard part of working with redirects is managing them. Where are all these redirects leading? What if something breaks? Can you find redirect chains or redirect loops? Can you shorten the paths? You can gain a lot from optimizing your redirects, so you should dive in and fix them. Do you have burning questions about redirects? Let us know in the comments!

Read more: ‘How to properly delete a page from your site’ »

The post 6 questions about redirects for SEO appeared first on Yoast.

A few years ago, Google announced the AMP (Accelerated Mobile Pages) project, and it’s becoming increasingly important for all kinds of websites. AMP is a technology to make webpages faster on mobile devices, improving loading times by stripping some of the design.

Initially, AMP was mainly relevant for static content, like blog posts or news articles, that didn’t need interaction from the user. But these days, it’s also useful for dynamic types of pages that (small) business site owners might want to use. Implementing AMP on your site can be a bit daunting if you’re new to technical SEO. But if you manage to get it right, you may even end up preferring the clean, focused look of your AMP pages.

That was definitely the case for William Anderson, who emailed us on the subject:

I’m thinking of redirecting all my responsive pages to my AMP pages because I prefer their look. The AMP pages’ click-through rate is astounding, but I’m wondering what the SEO implications will be.

Watch the video or read the transcript further down the page for my answer!

Redirecting responsive pages to AMP pages

“Well, to be honest, what do you call a responsive page? If you have separate mobile pages that you can redirect to your AMP pages: perfectly fine, go for it. If you have a responsive version of your website, then doing that is actually technically very hard and not something I’d recommend.

Become a technical SEO expert with our Technical SEO 1 training! »


Google is pushing the idea of what they call canonical AMP, so the idea that AMP is the only version of your page. If that fits your business, by all means go for it. Because I think it’s a very good idea for your click-through rate and a lot of other things in terms of rankings. I hope that helps. Good luck.”

Ask Yoast

In the series Ask Yoast, we answer SEO questions from our readers. Have an SEO-related question? Perhaps we can help you out! Send an email to ask@yoast.com, and your question may be featured in one of our weekly Ask Yoast vlogs.

Note: please check our blog and knowledge base first; the answer to your question may already be out there! For urgent questions, for example about the Yoast SEO plugin not working properly, we’d like to refer you to our support page.

Read more: ‘Setting up WordPress for AMP’ »

The post Ask Yoast: Redirect responsive pages to AMP pages? appeared first on Yoast.

If you use meta robots tags on your pages, you can give search engines instructions on how you’d like them to crawl or index parts of your website. This page lists an overview of all the different values you can have in the meta robots tag, what they do, and which search engines support each value.

The different robots meta tag values

The following values (‘parameters’) can be placed on their own, or together (separated by commas), in the content attribute of the meta robots tag to control how search engines interact with your page.

Scroll down for an overview of which search engines support which specific parameters.

index
Allow search engines to add the page to their index, so that it can be discovered by people searching.
Note: This is assumed by default on all pages – you generally don’t need to add this parameter.

noindex
Disallow search engines from adding this page to their index, and therefore disallow them from showing it in their results.

follow
Tells the search engines that they may follow links on the page, to discover other pages.
Note: This is assumed by default on all pages – you generally don’t need to add this parameter.

nofollow
Tells the search engine robots not to follow any links on the page.
Note: It’s unclear whether this attribute prevents search engines from following links, or just prevents them from assigning any value to those links.
Note: It’s also unclear (and inconsistent between search engines) whether this applies to all links, or only internal links.

none
A shortcut for noindex, nofollow.

all
A shortcut for index, follow.
Note: This is assumed by default on all pages, and does nothing if specified.

noimageindex
Disallow search engines from indexing images on the page.
Note: If images are linked to directly from elsewhere, search engines can still index them, so using an X-Robots-Tag HTTP header is generally a better idea.

noarchive
Prevents the search engines from showing a cached copy of this page in their search results listings.

nocache
Same as noarchive, but only used by MSN/Live.

nosnippet
Prevents the search engines from showing a text or video snippet (i.e., a meta description) of this page in the search results, and prevents them from showing a cached copy of this page in their search results listings.
Note: Snippets may still show an image thumbnail, unless noimageindex is also used.

notranslate
Prevents search engines from showing translations of the page in their search results.

unavailable_after
Tells search engines a date/time after which they should not show the page in search results; a ‘timed’ version of noindex.
Note: Must be in RFC850 format (e.g., Monday, 15-Aug-05 15:52:01 UTC).

noodp
Blocks search engines from using the DMOZ (Open Directory Project) description of this page as the snippet in the search results.
Note: Since DMOZ closed, this value is deprecated and likely does nothing.

noydir
Blocks Yahoo from using the description for this page in the Yahoo directory as the snippet for your page in the search results.
Note: Since Yahoo closed its directory, this tag is deprecated, but you might come across it once in a while.

noyaca
Prevents the search results snippet from using the page description from the Yandex Directory.
Note: Only supported by Yandex.
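As an example, a page that should stay out of the index while search engines may still follow its links could combine two of the values above in a single tag in the page’s <head>:

```html
<!-- Keep this page out of the index, but follow its links -->
<meta name="robots" content="noindex, follow">
```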

Which search engine supports which robots meta tag values?

This table shows which search engines support which values. Note that the documentation provided by some search engines is sparse, so there are many unknowns.

Robots value Google Yahoo Bing Ask Baidu Yandex
index Y* Y* Y* ? Y Y
noindex Y Y Y ? Y Y
follow Y* Y* Y* ? Y Y
nofollow Y Y Y ? Y Y
none Y ? ? ? N Y
all Y ? ? ? N Y
noimageindex Y N N ? N N
noarchive Y Y Y ? Y Y
nocache N N Y ? N N
nosnippet Y N Y ? N N
notranslate Y N N ? N N
unavailable_after Y N N ? N N
noodp N Y** Y** ? N N
noydir N Y** N ? N N
noyaca N N N N N Y

* Most search engines have no specific documentation for this, but we’re assuming that support for excluding parameters (e.g., nofollow) implies support for the positive equivalent (e.g., follow).
** Whilst the noodp and noydir attributes may still be ‘supported’, these directories no longer exist, and it’s likely that these values do nothing.

Rules for specific search engines

Sometimes, you might want to provide specific instructions to a specific search engine, but not to others. Or you may want to provide completely different instructions to different search engines.

In these cases, you can change the value of the name attribute to target a specific search engine’s crawler (e.g., GOOGLEBOT or MSNBOT) instead of all robots.

Note: Given that search engines will simply ignore instructions which they don’t support or understand, it’s very rare to need to use multiple meta robots tags to set instructions for specific crawlers.
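For illustration, a page could noindex itself for Bing’s crawler only, while a generic tag keeps the defaults for everyone else. The combination below is a sketch:

```html
<!-- Applies only to Bing's crawler -->
<meta name="msnbot" content="noindex">
<!-- All other crawlers use the generic tag -->
<meta name="robots" content="index, follow">
```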

Conflicting parameters, and robots.txt files

It’s important to remember that meta robots tags work differently to instructions in your robots.txt file, and that conflicting rules may cause unexpected behaviours. For example, search engines won’t be able to see your meta tags if the page is blocked via robots.txt.

You should also take care to avoid setting conflicting values in your meta robots tag (such as using both index and noindex parameters) – particularly if you’re setting different rules for different search engines. In cases of conflict, the most restrictive interpretation is usually chosen (i.e., “don’t show” usually beats “show”).


The resources from the search engines

The search engines themselves have pages about this subject as well:

And of course, there are always the official robots.txt pages and Danny Sullivan’s big robots meta write-up.

Read more: ‘robots.txt: the ultimate guide’ »

The post The ultimate guide to the meta robots tag appeared first on Yoast.

If you want to keep your page out of the search results, there are a number of things you can do. Most of ’em are not hard, and you can implement them without a ton of technical knowledge. If you can check a box, your content management system will probably have an option for that, or it will allow nifty plugins like our own Yoast SEO to help you prevent the page from showing up in search results. In this post, I won’t give you complicated ways to go about this. I will simply tell you what steps to take and things to consider.


Why do you want to keep your page out of the search results?

It sounds like a simple question, but it’s not, really. Why do you want to keep your page out of the search results in the first place? If you don’t want that page indexed, perhaps you shouldn’t publish it? There are obvious candidates, like your internal search result pages, or a “Thank you” page after an order or newsletter subscription that is of no use to other visitors. But when it comes to your actual, informative pages, there really should be a good reason to block these. Feel free to drop yours in the comments below this post.

If you don’t have a good reason, simply don’t write that page.

Private pages

If your website contains a section that is targeted at an internal audience, for instance, or a so-called extranet, you should consider password-protecting that information. A section of your site that can only be reached after filling out login details won’t be indexed. Search engines simply have no way to log in and visit these pages.


If you are using WordPress, and are planning a section like this on your site, please read Chris Lema’s article about the membership plugins he compared.

Noindex your page

Besides the aforementioned “Thank you” page, there might be more pages you want to block. And you might even have some pages left after critically assessing whether they should be on your site at all. The right way to keep a page out of the search results is to add a robots meta tag. We have written a lengthy article about that robots meta tag before; be sure to read that.

Adding it to your page is simple: you need to add that tag to the <head> section of your page, in the source code. You’ll find examples from the major search engines linked in the robots meta article as well.
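For instance, in the page’s source it might look like this:

```html
<head>
  <title>Internal search results</title>
  <!-- Tells search engines not to index this page -->
  <meta name="robots" content="noindex">
</head>
```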

Are you using WordPress, TYPO3 or Magento? Things are even easier. Please read on.

Noindex your page with Yoast SEO

The above-mentioned content management systems have the option to install our Yoast SEO plugin/extension. In that plugin or extension, you have the option to noindex a page right from your editor.

In this example, I’ll use screenshots from the meta box in Yoast SEO for WordPress. You’ll find it in the post or page editor, below the copy you’ve written. In Magento and TYPO3 you can find it in similar locations.

[Screenshot: the Advanced tab of the Yoast SEO meta box]

Click the Advanced tab in our Yoast SEO meta box. It’s the cog symbol on the left.
Use the selector at “Allow search engines to show this post/page in search results”, simply set that to “No” and you are done.

The second option in the screenshot is about following the links on that page. That allows you to keep your page out of the search results, but follow links on that page as these (internal) links matter for the other pages (again, read the robots meta article for more information). The third option: leave that as is, this is what you have set for the site-wide robots meta settings.

It’s really that simple: select the right value and your page will tell search engines to either keep the page in or out of the search results.

The last thing I want to mention here is: use this with care. This robots meta setting will truly prevent a page from being indexed, unlike a robots.txt instruction to leave a page out of the search result pages. Google might ignore the latter, for instance when triggered by a lot of inbound links to the page.

If you want to read up on how to keep your site from being indexed, please read Preventing your site from being indexed, the right way. Good luck optimizing!

The post How to keep your page out of the search results appeared first on Yoast.

In our major Yoast SEO 7.0 update, there was a bug concerning attachment URLs. We quickly resolved the bug, but some people have suffered anyhow (because they updated before our patch). This post serves both as a warning and an apology. We want to ask all of you to check whether your settings for the redirect of the attachment URLs are correct. And, for those of you who suffered from a decrease in rankings because of incorrect settings, we offer a solution that Google has OK’d as well.

Is “Redirect attachment URLs” set to “Yes”?

You need to check this manually: unless you have a very specific reason to allow attachment URLs to exist (more on that below), the setting should be set to “Yes”. If the setting says “Yes”, you’re all set. You can find this setting in Search Appearance, in the Media tab.

[Screenshot: the media attachment URLs setting in Yoast SEO]

Is your attachment URL setting set to “No”?

If your attachment URL setting is set to “No”, there are two different scenarios which could apply to you. You could have set it to “No” intentionally, but the setting could also have been switched to “No” without your intent.

Intentionally set to “No”

If you intentionally set the attachment URL setting to “No”, you’ll probably be aware of that fact. In that case, your attachment URLs are an important aspect of your site. You’re linking actively to these pages, and these pages have real content on them (more than just a photo). This could, for instance, apply to a photography site. If you want this setting to say “No”, you’ll probably have put a lot of thought into this. In this case, you can leave your setting at “No”. You’re all set!

Unintentionally set to “No”

It is also possible that you notice that the setting is set to “No” and this was not intentional. You’ve suffered from our bug. We’re so very sorry. You should switch your toggle to “Yes” and save the changes. Perhaps you need to do a little bit more, though. There are (again) two scenarios:

Traffic and ranking is normal

Ask yourself the following question: have you noticed any dramatic differences in your rankings and traffic in the last three months (since our 7.0 update of March 6th)? If the answer to this question is no, then you should just turn the redirect setting of the attachment URLs to “Yes” and leave it at that. You did not suffer any harm in rankings, probably because you’re not using attachment URLs all that much anyway. This will be the case for most sites. After switching your toggle to “Yes” and saving the changes, you’re good to go!

Traffic and ranking have decreased

In the second scenario, you notice that the redirect attachment URL setting is set to “No” and you did indeed suffer from a dramatic decrease in traffic and rankings. We’re so very sorry about that. Make sure to switch the setting of the attachment URLs to “Yes” immediately. In order to help you solve your ranking problem, we have built a search index purge plugin. Download and install this plugin here. More on the workings of this separate plugin below.

What to do if you’re not sure

If you’re not sure whether you’ve been affected by this, and your Google Search Console is inconclusive: don’t do anything other than setting the setting to “Yes”. See “What did Google say” below for the rationale.

What do attachment URLs do anyway?

When you upload an image in WordPress, WordPress does not only store the image, it also creates a separate, so-called attachment URL for every image. These attachment URLs are very “thin”: they have little to no content outside of the image. Because of that, they’re bad for SEO: they inflate the number of pages on your site while not increasing the amount of quality content. This is something that WordPress does, which our plugin takes care of (if the setting is correctly turned to “Yes”).

Historically, we had a (default off) setting that would redirect the attachment URL for an image to the post the image was attached to. So if I uploaded an image to this post, the attachment URL for that image would redirect to this post. In the old way of dealing with this, it meant that images added for other reasons (say, a site icon, or a page header you’d add in the WordPress customizer) would not redirect. It also meant that if you used an image twice, you could not be certain where it would redirect.

In Yoast SEO 7.0 we introduced a new feature to deal with these pages. Now, we default to redirecting the attachment URL to the image itself. This basically means attachment URLs no longer exist on your site at all. This actually is a significant improvement.

What did the bug do (wrong)?

The bug was simple yet very painful: when you updated from an earlier version of Yoast SEO to Yoast SEO 7.0-7.0.2 (specifically those versions), we would not always correctly convert your old setting into the new one. We accidentally set the setting to “No”. Because we overwrote the old settings during the update, we could not revert this bug later on.

The impact of the bug

For some sites, our bug might have had a truly bad impact. In Twitter and Facebook discussions, I’ve been shown sites that had the number of indexed URLs on their site quintupled, without adding any content. With that setting set to “No”, XML sitemaps were enabled for attachments. As a result, lots and lots of attachment URLs got into Google’s index. Some of those sites are now suffering from Panda-like problems. The problem will be especially big if you have a lot of pictures on your website and few high-quality content pages. In these cases, Google will think you’ve created a lot of ‘thin content’ pages all of a sudden.

The vast majority of the websites running Yoast SEO probably haven’t suffered at all. Still, we messed up. I myself am sorry. More so than normal, because I came up with and coded this change myself…

What did Google say?

We have good contacts at Google and talk to them regularly about issues like these. In this case, we discussed it with John Mueller and his first assessment was similar to mine: sites should normally not suffer from this. That’s why we don’t think drastic measures are needed for everyone. Let me quote him:

“Sites generally shouldn’t be negatively affected by something like this. We often index pages like that for normal sites, and they usually don’t show up in search. If they do show up for normal queries, usually that’s a sign that the site has other, bigger problems. Also, over the time you mentioned, there have been various reports on twitter & co about changes in rankings, so if sites are seeing changes, I’d imagine it’s more due to normal search changes than anything like this.”

We’ve also discussed potential solutions with him. The following solution has been OK’d by him as the best and fastest solution.

What does this search index purge plugin do?

The purpose of the search index purge plugin is to purge attachment URLs out of the search results as fast as possible. Just setting the Yoast SEO attachment URL redirect setting to “Yes” isn’t fast enough. When you do that, you no longer have XML sitemaps or anything else that would make Google crawl those pages, and thus it could take months for Google to remove those URLs. That’s why I needed to be creative.

Installing this plugin will do the following two things:

  • Every attachment URL will return a 410 status code.
  • A static XML sitemap, containing all the attachment URLs on a given site will be created. The post modified date for each of those URLs is the activation date and time of the plugin.

The XML sitemap with recent post modified date will make sure that Google spiders all those URLs again. The 410 status code will make sure Google takes them out of its search results in the fastest way possible.
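For each attachment URL, an entry in that static sitemap looks roughly like this; the URL and date below are placeholders:

```xml
<url>
  <loc>https://example.com/example-image-attachment/</loc>
  <!-- Set to the plugin's activation date and time -->
  <lastmod>2018-06-01T12:00:00+00:00</lastmod>
</url>
```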

After six months the attachment URLs should be gone from the search results. You should then remove the search index purge plugin, and keep the redirect setting of the attachment URLs set to “Yes”.

Advice: keep informed!

We try to do the very best we can to help you get the best SEO out of your site. We regularly update our configuration wizard and there is no harm whatsoever in running through it again. Please regularly check if your site’s settings are still current for your site. We do make mistakes, and this release in particular has led us to a rigorous post mortem on all the stages of this release’s process.

We regularly write about things that change in Google, so stay up to date by subscribing to our newsletter below. If you want to understand more of the how and why of all this, please do also take our new, free, SEO for Beginners course, which you’ll get access to when you sign up.

The post Media / attachment URL: what to do with them? appeared first on Yoast.

Do you have a recipe site? If so, you might already be using structured data to mark up your recipes so they can get highlighted in the search results. Good work! But, Google recently made some changes that might make your implementation incomplete. It also expanded the possibilities of structured data for recipes by bringing guidance into the mix. The result? Google Home can now read your structured data powered recipes out loud!

Want rich snippets for your site? Try our Structured data training »


Structured data, recipes and Google Home

Google is betting big on voice search. While voice search is still in its infancy, there are signs that we are moving towards a future where we are relying much less on our screens. There are many instances where talking to your digital assistant makes much more sense than typing commands. What’s more, the AI in digital assistants will become smarter and more apt at entering into a natural dialogue with you. We’re talking about a real natural language interface here.

A lot is going on right now. Take for instance that Google Duplex demo, showing a digital assistant calling a hairdresser to make an appointment. Joost wrote a post about Google Duplex and the ethics and implications. If AI is this smart, we need to take note.

To get voice search and actions to work, Google relies on structured data. Structured data makes it immediately clear what all the different parts of a page mean so search engines can use that to do cool stuff with. Google Actions, the big database featuring things you can let Assistant do, uses structured data. For instance, here is Google’s page on recipe actions — which is the same as the regular structured data for recipes documentation. If you want to learn all about structured data, please read our Ultimate Guide to Structured Data.

New rules, new results

Earlier this month, Google announced a new and improved way of targeting people who search for recipes. By adding the correct structured data, you can get your recipes read out loud. Google even said that by implementing this you might “receive traffic from more sources, since users can now discover your recipes through the Google Assistant on Google Home.”

But when throwing random recipes from some of the largest recipe sites in the world into the Structured Data Testing Tool, you’ll find that hardly any fully comply with these new rules yet. What’s more, even the implementation of the recipe Schema.org markup itself differs widely between sites. That being said, there’s still a lot to win, even for the big boys.

As of now, Google recommends four new properties in addition to the ones you probably already use:

  • keywords: additional terms to describe the recipe
  • recipeCategory: in which category does the recipe fit?
  • recipeCuisine: from which region is the recipe?
  • video: use this if you have a video showing how to make the recipe

You’ll see the recommendations as warnings in Google’s Structured Data Testing Tool. [Screenshot: warnings in the Structured Data Testing Tool]

Guidance: reading it out loud

How cool would it be if your Google Home could assist while you were cooking? Not by setting timers and the like, but by reading the recipe for you. That’s now possible thanks to guidance. In addition to the four new recommended properties for structured data for recipes, Google states that:

“To enable your recipe for guidance with the Google Home and Google Assistant, make sure you add recipeIngredient and recipeInstructions. If your recipe doesn’t have these properties, the recipe isn’t eligible for guidance, but may still be eligible to appear in Search results.”

To get your Google Home to pronounce the steps of your recipes correctly, you need to set the value of recipeInstructions using HowToStep or HowToSection. The latter, of course, should be used if your recipe instructions consist of multiple parts or sections. You can also keep everything in one block of recipeInstructions, but then you are at the mercy of Google, as it has to try and work everything out itself. If you have distinct steps, please use HowToStep and/or HowToSection.
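A JSON-LD fragment with distinct HowToStep entries might look like this; the recipe content is made up for illustration:

```json
{
  "@context": "https://schema.org",
  "@type": "Recipe",
  "name": "Chocolate cheesecake",
  "recipeIngredient": ["200 g dark chocolate", "600 g cream cheese"],
  "recipeInstructions": [
    { "@type": "HowToStep", "text": "Melt the chocolate over low heat." },
    { "@type": "HowToStep", "text": "Fold the chocolate into the cream cheese and chill." }
  ]
}
```

With the steps marked up individually like this, Google Home can read the instructions one step at a time.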

Hello keywords, we meet again

In a move straight out of 1997, we see keywords pop up again. Google now recommends using the keywords property to add context to your recipes. Now, this shouldn’t be confused with the recipeCategory and recipeCuisine properties. It is an extra way of describing your recipes, using words that don’t relate to a category or type of cuisine. We’ll just have to wait and see if the spammers can keep themselves under control.

Getting into that carousel

One of the coolest ways to discover content is the swipeable carousel you see when you search for certain types of content on mobile. To increase the chance of your site appearing in this overview, you can add an ItemList with one or more ListItem entries to your content.

Now, Google is quick to add that this might not be necessary if you only want to appear in the regular search carousel. If you throw several well-known recipe sites into the Structured Data Testing Tool, you will see that hardly any have added ItemList to their pages. Still, they rank high and appear in the carousel. If, however, you want to get site-specific entries — like your list of 5 best chocolate cheesecake recipes, or another type of landing page with recipes — into that carousel, you need to add the ItemList structured data. There are several ways of doing this; you can find out more on Google’s documentation pages.
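For a hypothetical landing page listing three cheesecake recipes, that ItemList could look something like this (the URLs are placeholders, not a real site):

```json
{
  "@context": "http://schema.org",
  "@type": "ItemList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "url": "http://example.com/recipes/chocolate-cheesecake/" },
    { "@type": "ListItem", "position": 2, "url": "http://example.com/recipes/raspberry-cheesecake/" },
    { "@type": "ListItem", "position": 3, "url": "http://example.com/recipes/vanilla-cheesecake/" }
  ]
}
```

The position values should match the order the recipes appear on the page, and each URL should point to a page carrying its own Recipe markup.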

Applying structured data for recipes

If you look at Schema.org/Recipe, you might be starting to go a little bit green around the gills. Where do you even start? It’s massive! These are all the properties you could add, but that doesn’t mean that you should. Google requires just a couple but recommends a lot more.

These are the required properties:

  • @context: set to Schema.org
  • @type: set to Recipe
  • image: can be a URL or an ImageObject
  • name: name of the dish

That’s it! But, as you might have guessed, this won’t get you very far. By providing Google with as much data about your recipe as possible, you increase the chance that Google ‘gets’ your recipe. After that, it can apply the rich results and corresponding Actions accordingly.
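Put differently, this bare-bones snippet already satisfies the required properties (the image URL is a placeholder):

```json
{
  "@context": "http://schema.org",
  "@type": "Recipe",
  "name": "Party Coffee Cake",
  "image": [ "http://example.com/photos/coffee-cake.jpg" ]
}
```

Valid, but bare. Everything Google shows in rich results beyond the name and picture has to come from the recommended properties below.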

Here are the recommended properties:

  • aggregateRating: average review score for this recipe
  • author: who made it? Use Schema.org/Person
  • cookTime: the time it takes to cook the recipe
  • datePublished: when was the recipe published?
  • description: a description of the recipe
  • keywords: terms to describe the recipe
  • nutrition.calories: the number of calories. Use Schema.org/Energy
  • prepTime: how long do the preparations take?
  • recipeCategory: is it breakfast, lunch, dinner or something else?
  • recipeCuisine: where in the world is the recipe from originally?
  • recipeIngredient: every ingredient you need to make the recipe. This property is required if you want Google Home to read your recipe out loud.
  • recipeInstructions: mark up the steps with HowToStep, or use HowToSection with an embedded itemListElement of HowToStep items.
  • recipeYield: for how many servings is this?
  • review: add any review you might have
  • totalTime: how long does it all take?
  • video: add a video showing how to make the recipe, if applicable

To show you how this all translates to code, we need an example. So, here’s Google’s example recipe in JSON-LD format. You’ll see that it is all pretty obvious and easy to understand. If you want to implement JSON-LD code on your page, Google Tag Manager might be your best bet.

<!doctype html>
<html amp lang="en">
  <head>
    <meta charset="utf-8">
    <title>Party Coffee Cake</title>
    <link rel="canonical" href="http://example.ampproject.org/recipe-metadata.html" />
    <meta name="viewport" content="width=device-width,minimum-scale=1,initial-scale=1">
    <script type="application/ld+json">
    {
      "@context": "http://schema.org/",
      "@type": "Recipe",
      "name": "Party Coffee Cake",
      "image": [
        "http://example.com/photos/coffee-cake.jpg"
      ],
      "author": {
        "@type": "Person",
        "name": "Mary Stone"
      },
      "datePublished": "2018-03-10",
      "description": "This coffee cake is awesome and perfect for parties.",
      "prepTime": "PT20M",
      "cookTime": "PT30M",
      "totalTime": "PT50M",
      "keywords": "cake for a party, coffee",
      "recipeYield": "10 servings",
      "recipeCategory": "Dessert",
      "recipeCuisine": "American",
      "nutrition": {
        "@type": "NutritionInformation",
        "calories": "270 calories"
      },
      "recipeIngredient": [
        "2 cups of flour",
        "3/4 cup white sugar",
        "2 teaspoons baking powder",
        "1/2 teaspoon salt",
        "1/2 cup butter",
        "2 eggs",
        "3/4 cup milk"
      ],
      "recipeInstructions": [
        {
          "@type": "HowToStep",
          "text": "Preheat the oven to 350 degrees F. Grease and flour a 9x9 inch pan."
        },
        {
          "@type": "HowToStep",
          "text": "In a large bowl, combine flour, sugar, baking powder, and salt."
        },
        {
          "@type": "HowToStep",
          "text": "Mix in the butter, eggs, and milk."
        },
        {
          "@type": "HowToStep",
          "text": "Spread into the prepared pan."
        },
        {
          "@type": "HowToStep",
          "text": "Bake for 30 to 35 minutes, or until firm."
        },
        {
          "@type": "HowToStep",
          "text": "Allow to cool."
        }
      ],
      "review": {
        "@type": "Review",
        "reviewRating": {
          "@type": "Rating",
          "ratingValue": "4",
          "bestRating": "5"
        },
        "author": {
          "@type": "Person",
          "name": "Julia Benson"
        },
        "datePublished": "2018-05-01",
        "reviewBody": "This cake is delicious!",
        "publisher": "The cake makery"
      },
      "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "5",
        "ratingCount": "18"
      },
      "video": [
        {
          "@type": "VideoObject",
          "name": "How to make a Party Coffee Cake",
          "description": "This is how you make a Party Coffee Cake.",
          "thumbnailUrl": [
            "http://example.com/photos/coffee-cake-thumbnail.jpg"
          ],
          "contentUrl": "http://www.example.com/video123.flv",
          "embedUrl": "http://www.example.com/videoplayer.swf?video=123",
          "uploadDate": "2018-02-05T08:00:00+08:00",
          "duration": "PT1M33S",
          "interactionCount": "2347",
          "expires": "2019-02-05T08:00:00+08:00"
        }
      ]
    }
    </script>
  </head>
  <body>
    <h1>The best coffee cake you’ll ever try!</h1>
  </body>
</html>

Conclusion: Got a recipe site? Add structured data now!

Recipe sites are in a very cool position. It seems that they get everything first. By marking up your recipes with structured data, you can get Google to do a lot of cool stuff with them. You can have Google Home read them out loud, or find other ways of interacting with them through Actions on the Assistant. And this is probably only the beginning.

Read more: ‘Structured data: the ultimate guide’ »

The post Structured data for recipes: getting content read out loud appeared first on Yoast.

Perhaps you heard about Google Duplex? You know, the artificial assistant that called a hairdresser to make an appointment? Fascinating technology, but what is it exactly? And does it affect search or SEO? In this post, I’ll explain what Google Duplex is. And, I’ll raise some ethical issues I have with it. Finally, I will go into the consequences of Google Duplex for SEO.

What is Google Duplex?

Google Duplex is an experimental new technology that Google demoed at Google I/O, the company’s annual developer conference. This technology allows Google to mimic human conversation. Google is quick to say that at this point, it’s only trained for specific fields.

Google showed a video in which a robot books a hairdresser’s appointment by calling the salon and having an actual conversation. If you haven’t seen a demo of it yet, check out this video first.

Is it good?

Last Wednesday at the Google I/O conference, John Hennessy said about Google Duplex: “In the domain of making appointments, it passes the Turing test.” The Turing test determines whether a machine’s behavior is indistinguishable from a human’s. In other words, the robot used in Google Duplex cannot be told apart from an actual human being.

John Hennessy is the chairman of the board of Google’s parent company Alphabet. He is also quite a hero in the field of computer science. When he says something like that — even about his own company — it’s worth thinking about.

John Hennessy was pretty quick to point out that it passes in only one specific field: the task of booking appointments. “It doesn’t pass it in general terms, but it passes in that specific domain. And that’s really an indication of what’s coming.” Which gets us to ethics.

The ethics of AI that’s this good

When you have an Artificial Intelligence (AI) that can interact with people, as Google Duplex can, you need to think about ethics. Luckily, people have been thinking about precisely these kinds of ethics problems for a long time. The first set of rules you’ll run into when you search around ethics concerning AI are Isaac Asimov’s famous three laws of robotics, introduced in his 1942 (!) short story Runaround:

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.

While these laws laid the groundwork for almost all of the science fiction around robots, most of that doesn’t necessarily come into play just yet. Since then, though, people have kept adding to the three laws of robotics. The most well-known “fourth law of robotics” was added by novelist Lyuben Dilov, in his book Icarus’s Way. This law is as follows:

A robot must establish its identity as a robot in all cases.

Now, go back to the video linked above. Nowhere does that assistant state it’s a bot. In fact, it has mannerisms that make it very human. I think that’s wrong, and I think people were right to call Google out on it. Google has already stated that they will change that. I’m curious how exactly. Let’s say I am skeptical. Google does not always communicate its intentions clearly. I mean: Google says it discloses which results in its search results are ads and which aren’t. However, most ‘non-tech’ people don’t know exactly what is an ad and what is an organic result. We’ll have to wait and see, or maybe, hear.

Security implications

In the wrong hands, this type of technology is incredibly scary. Did you know that it now takes less than 1 minute of recorded audio to reasonably accurately simulate somebody’s voice? Combine that with the inferior systems of security we currently have for phone conversations with, for instance, banks, and you have a potential disaster on your hands.

What will Google Duplex be used for?

The examples we’ve seen so far indicate that Google Duplex can be used to make straightforward phone calls – to plan meetings and make reservations. These examples fit the personal assistant purpose for which Google Assistant is promoted. But if an AI becomes this good at consumer interaction, of course, businesses will want to use it to receive phone calls as well. They could use it for helpdesks and other types of calls that we now task entire call centers with.

Future use of Google Duplex?

It is hard to say when Google Duplex will be used on a large scale. This might not happen next year or even the year after. But it’s definitely going faster than most people outside of the tech bubble realize. If Google Duplex can be trained to make a restaurant booking, it can also be trained to take your new credit card application. And, since it is an AI, it would be much faster and less error-prone than a human would be at performing your credit check.

Look at a Google Duplex-like system for receiving calls as a nice extension to the phone call conversion tracking system Google already has. Google could indeed take your credit card application. Or, without even all that much training, do the other side of the second example call in the video above and take the entire reservation system for a restaurant and automate it. The question then becomes: what if your digital assistant calls into the Duplex powered system on the other side? Will they use human-like conversation to get the job done? Will we end up with human speech as the ultimate computer to computer language?

How does this impact search and SEO?

Google Duplex might not seem to have a direct impact on search, but consider this: if your Google Assistant can have conversations like this with your hairdresser and your restaurant of choice, will you have these conversations with it too? Suddenly you can talk to your phone and sound like you’re talking to your secretary, instead of sounding like the freak who talks to his phone or watch. Search becomes even more conversational, and queries get more complicated.

When queries get more complicated, context becomes more important than ever. And now we’re back to what we’ve been writing about for quite a while here at Yoast: you need to write awesome content. I really can’t add much to what Marieke wrote in that post, so read it.

The other side of how this impacts SEO is more technical. For AIs to be efficient, it’s far easier to rely on structured data. If you use a standards-based system like Schema.org for things like reservations, all Google has to do is tie into that. Suddenly, it doesn’t have to retrain its system for a new booking engine; it can just detect that you use Schema.org for that, and poof, it just works.

Next steps

So what’s next? Well, now we wait. We wait until we get to play with this. We’ll have to figure out how good this truly is before we can do anything else.

Read more: ‘Readability ranks!’ »

The post Google Duplex: The what, the ethics, the SEO appeared first on Yoast.

There are many occasions when you may want to put a PDF on your site. For example, when you’ve made an online magazine, when an article you wrote was featured in a book or magazine, or when you’ve written detailed instructions for a DIY project. So far, so good.

But things can get a bit more complicated when you also have the content from this PDF somewhere else on your site, or on another website. To avoid duplicate content, you need to set a canonical URL. But how do you do that for a PDF document? And what is the best way to do that? Let’s discuss in today’s Ask Yoast!

Karen Schousboe emailed us her question:

“I plan to publish a PDF magazine under medieval.news. Some of the articles in each issue will also be freely available on a sister website. How should I handle that? Do I link canonical from the articles to the PDF magazine or from the magazine to the website?”

Watch the video or read the transcript further down the page for my answer!

Canonicalization and PDFs

“Well, you can have a canonical HTTP header and what I would suggest doing is canonicalizing from the PDF magazine to the sister website, because HTML pages just rank a lot better than PDFs, usually.

In fact, I would suggest publishing everything in HTML and not necessarily in PDF because PDF is just not very easy to land on from search. You can’t do any tracking, you can’t do a whole lot of things that you can do with HTML. So I would seriously consider doing all of it in HTML pages and then canonicalizing between them. Good luck.”
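For context: since a PDF has no <head> to put a canonical link in, the canonical has to be sent as an HTTP header. On an Apache server with mod_headers enabled, that could be set roughly like this (the file name and target URL are placeholders):

```
<Files "my-magazine.pdf">
  Header set Link "<https://sister-site.example/articles/my-article/>; rel=\"canonical\""
</Files>
```

Each PDF needs its own rule pointing at its own HTML counterpart; a blanket rule sending every PDF to one URL would be wrong.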

Ask Yoast

In the series Ask Yoast, we answer SEO questions from our readers. Have an SEO-related question? Maybe we can help you out! Send an email to ask@yoast.com.

Note: please check our blog and knowledge base first, the answer to your question may already be out there! For urgent questions, for example about our plugin not working properly, we’d like to refer you to our support page.

Read more: ‘rel=canonical: the ultimate guide’ »

The post Ask Yoast: Canonical for PDF magazine appeared first on Yoast.

Optimizing how your content looks when it’s shared on third-party platforms like Facebook, Twitter or WhatsApp can drive improved visibility, clickthrough, and conversions. But it’s not as simple as just picking a great image…

When you share a URL on Facebook, Twitter, or other platforms, they’ll typically show a preview of the page, with a title, description, and image. These elements are typically taken from Open Graph tags defined in the source code of the page you’re sharing.

How does this work?

The way in which this works is defined by the Open Graph Protocol. This is an open source standard (like WordPress, and even the Yoast SEO plugin), which allows webmasters to tell third-party systems (like Facebook, Twitter, Pinterest, or even WhatsApp, Skype or Hotmail) about their pages.

Star Wars

It defines a set of meta tags which allow you to provide information about the type of content on a page (e.g., “this is a page about a movie”), metadata about that thing (e.g., “it’s called Star Wars – The Last Jedi”), and how it should be presented when shared.

They look something like this:

<meta property="og:title" content="Star Wars - The Last Jedi" />
<meta property="og:type" content="video.movie" />
<meta property="og:url" content="https://www.imdb.com/title/tt2527336/" />
<meta property="og:image" content="https://ia.media-imdb.com/images/M/MV5BMjQ1MzcxNjg4N15BMl5BanBnXkFtZTgwNzgwMjY4MzI@._V1_SY1000_CR0,0,675,1000_AL_.jpg" />
<meta property="og:image:width" content="675" />
<meta property="og:image:height" content="1000" />

Most websites (and those running the Yoast SEO plugin) will automatically output elements like these for all pages, posts and archives.

The og:image tags are particularly important because Open Graph tags most commonly play a role in social sharing dialogues. This tag defines the picture which shows up when users share your content across social networks, apps, and other systems. Optimizing the composition, dimensions, and even the file size of the image you use can influence whether somebody clicks and the quality of their experience.

Using images which are too large, too small, or the wrong dimensions can result in errors, or in platforms omitting your images entirely. But optimizing your Open Graph markup isn’t as simple as just picking a good image. There are some complexities around how different platforms use these tags, treat your images, and support different rules.


  • It’s impossible to specify different images/formats/files for different networks, other than for Facebook and Twitter; the Facebook image is used, by default, for all other networks/systems. This is a limitation of how these platforms work. The same goes for titles and descriptions.
  • The Yoast SEO plugin will automatically try and specify the best image for each platform where you share your content, based on the constraints of these platforms.
  • The image size and cropping won’t always be perfect across different platforms, as the way in which they work is inconsistent.
  • Specifically, your images should look great on ‘broadcast’ platforms like Facebook and Twitter, but might sometimes crop awkwardly on platforms designed for 1:1 or small group conversations, like WhatsApp or Telegram.
  • For best results, you should manually specify og:image tags for each post, through the plugin. You should ensure that your primary og:image is between 1200x800px and 2000x1600px, and is less than 2mb in size.
  • We’ll be adding developer support for more advanced customization via theme functions and filters. 
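The size rule of thumb above can be sketched as a quick check. Note that the bounds here are this article’s recommendation, not any platform’s official limit:

```python
# Quick sanity check for a primary og:image, using the rule of thumb above:
# between 1200x800 and 2000x1600 pixels, and under 2 MB. These bounds are
# a recommendation from this article, not an official platform spec.
def og_image_ok(width_px, height_px, size_bytes):
    return (
        1200 <= width_px <= 2000
        and 800 <= height_px <= 1600
        and size_bytes < 2 * 1024 * 1024
    )

print(og_image_ok(1200, 800, 500_000))  # → True
print(og_image_ok(640, 480, 100_000))   # → False (too small)
```

Running an audit like this over your media library is a cheap way to spot posts whose featured image will be rejected or cropped badly when shared.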

It’s not as simple as just picking a good image

Even though Open Graph tags use an open standard, each platform (Facebook, Twitter, WhatsApp, etc) treats Open Graph tags slightly differently. Some even have their own proprietary ‘versions’ of Open Graph tagging, too. Twitter’s ‘twitter card’ markup, for example, bears a strong resemblance to, and overlaps with, Open Graph tagging.

As an open project, the Open Graph is constantly changing and improving, too. Features and support come and go, the documentation isn’t always up to date, and individual platforms make (and change) their own decisions on how they interpret it, and which bits they’ll implement.

And as the web itself continues to evolve, there are more and more ‘platforms’ where people can share content, and where the Open Graph is used. From Slack, to WeChat, to tomorrow’s productivity and social media apps, they’ll all rely on the Open Graph, but use it in subtly different ways.

So when we’re trying to define a ‘best practice’ approach to support as part of the Yoast SEO plugin, it’s not as simple as just picking the ‘best image’ – we need to make sure that we provide the right tags and image formats for each platform and network.

To make things more complex, these tags and approaches sometimes conflict with or override each other. Twitter’s twitter:image property, for example, overrides an og:image value for images shared via Twitter, when both sets of tags are on the same page.

Lastly, the Open Graph specification allows us to provide multiple og:image values. This, in theory, allows the platform to make the best decision about which size to use, and allows people who are sharing some choice over which image they pick. How different platforms interpret these values, however, varies considerably.

The logic behind which platforms use which images, in which scenarios, gets complex pretty quickly! So more often than not, we’re stuck relying on the og:image value(s) as a general default for all platforms, and adding specific support where we can, whilst trying to minimize conflict. This doesn’t always produce perfect results, so we’re always on the lookout for better ways to ‘get it right’ without requiring end users to spend hours specifying multiple image formats for each post they write.

The challenge of og:image as a default

In a perfect world, platforms would take one of two approaches to handling Open Graph tagging:

  1. All platforms only use og:image tags. When multiple images are set, they automatically select (and crop) the best version for their context.
  2. All platforms have specific Open Graph tags (or their own versions). They allow fine-grain control over every scenario, by enabling us to specify the exact image which should be used in each case.

Unfortunately, we’re stuck somewhere in-between. Some platforms allow a degree of control, but the og:image tag functions as a general fallback for all other scenarios.


This is particularly problematic, as the og:image is also Facebook’s primary image. This is a huge challenge for the Yoast SEO plugin team, and for anybody else trying to define a ‘best practice’ approach to tagging. The image we specify as the main image for Facebook sharing (usually a large, high-res picture) also has to be suitable as a general default for all platforms which don’t have their own specific tags.

For many of these platforms, a large file size optimized for sharing in a Facebook newsfeed won’t be appropriate for their context. For example, Pinterest expects a relatively small, square cropped thumbnail image when sharing from a page – and whilst it has its own tagging mechanisms, the presence of an og:image tag on the page overrides those.

There’s more complexity, too. Different platforms have varying restrictions on image dimensions, ratios, and file size. A high-res og:image optimized for Facebook (with a large file size) will, more often than not, not display at all when someone shares it on Slack, for example.

Frequently, Yoast SEO has to share the same default og:image between multiple platforms – even though they have different expectations and apply different rules and treatments. Trying to work out what the ‘default’ image tag(s) should be, when it has to be the main image for Facebook and a universal default, is tricky. But it’s a problem we need to solve if we’re to provide a best practice ‘default’ setting for WordPress users.

There are lots of unknowns

Because each platform maintains its own rules and documentation on how they treat og:image tags, there are often gaps in our knowledge. Specific restrictions, edge cases, and in particular, information on which rules override other rules, are rarely well-documented. The documentation which does exist is often ambiguous at best. Google+ “snippet” documentation, for example, states that they won’t use images which are “not square enough”. It’s unclear what this means in specific, technical terms.

In order to determine the best universal approach to image sharing markup, we had to do some digging and some experimentation.

We were particularly interested in understanding how different platforms react to the presence of multiple og:image tags. If we can specify more than one image, and different platforms handle that differently, perhaps there’s a way in which we can get them to pick the most suitable image for their needs.

What we found

The way in which different platforms handle og:image tags (and in particular, multiple og:image tags) is often inconsistent, and frequently complex. Thankfully, most small platforms and apps simply crop and use the og:image tag (or the first og:image tag, if there are multiple in the set), and apply some reasonable constraints around dimensions, ratio, and file size. Some of the larger and more popular platforms, however, exhibit some particularly challenging behaviors, which complicate matters.

Here are some examples of the undocumented behaviors we’ve discovered (note that we’ve not talked about platforms which simply pick the first og:image tag, and which don’t exhibit any other ‘odd’ behavior). If you find any other undocumented features or behaviors which we haven’t covered or supported (or if we’ve made any mistakes in our research), we’d love to hear from you!


When multiple og:image tags are specified, Facebook uses the first tag in the set. This is in line with their documentation, but contrary to popular opinion (which assumes that the largest valid image is chosen). It’s also interesting to note that it uses the first image even if it’s invalid/broken.

Additional images are available for selection by the user at the point of sharing (on desktop only). Images are hard-cropped and sized to fit the sharing dialogue window, based on the size of the window.


Instagram behaves similarly to Facebook, except that it will only show an image preview for ‘small’ images (if the image file size is smaller than 300KB, and the dimensions are ~256×256 – though we’ve seen up to 400×400 work), and only supports JPG and PNG formats.


Twitter uses the last image in an og:image set, unless a twitter:image tag exists. Using the twitter:image tag allows us to control Twitter images independently of all other types, though unfortunately doesn’t allow us to specify multiple values (to accommodate for different tweet contexts/layouts).


To add some complexity, Twitter supports multiple card layouts, which can be defined in a twitter:card tag. The default value is ‘summary’ (1:1 ratio), but it’s also possible to specify ‘summary_large_image’ for a larger, full-width image (2:1) ratio.

Unhelpfully, Twitter’s documentation shows the same layout for both card versions (summary, summary with large image).

Interestingly, Twitter used to support a ‘gallery’ type of card which held multiple images – however, they deprecated this into the ‘summary with large image’ card some time ago.
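For reference, a minimal large-image Twitter card might be marked up like this (the image URL is a placeholder); note that Twitter card tags use the name attribute, where Open Graph tags use property:

```html
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:image" content="https://example.com/images/large-2x1.jpg" />
```

If twitter:card is omitted, Twitter falls back to the default ‘summary’ layout with its square crop.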


WhatsApp also uses the last image in the og:image set, which it hard crops to a small square. Note that this appears to accept enormous images, both in terms of file size and dimensions; in our testing, the cropped square came from a 10000×10000, 1.49mb image.


Takes the first og:image, but caches it seemingly permanently (both locally and in the cloud), making it impossible to change/update an image thumbnail for a URL (without, e.g., manipulating the og:url value to include cache-busting elements).


Telegram takes values from Facebook’s cache (typically the first og:image in a set). Cached images can be updated by messaging https://telegram.me/webpagebot (as a Telegram user) with up to 10 URLs. Note: caches can only be updated if a page’s <html> tag contains a prefix=”og: http://ogp.me/ns#” attribute.


Pinterest’s documentation mentions that they “support” up to six og:image tags. However, sharing the page only ever utilizes the first* image in the set.

They also support marking up inline images through Schema.org markup – however, when an og:image tag is present on the page (which will almost always be the case), it uses this instead.


There’s also some ambiguity around the difference between how they handle Open Graph data with ‘article pins’ vs ‘rich pins’. The latter is an ‘upgraded’ version which displays more information on the card, but using these requires the site owner to validate their domain.

*There’s an edge-case where, if there are more than six images in the array, the sharing dialogue periodically seems to choose the seventh value (and there’s some other weirdness depending on the total numbers of images in the array).


Google’s Web Fundamentals documentation implies that Google+ may prefer (and will prioritize) schema markup over Open Graph markup. Theoretically, that might allow us to enable users to set a specific image for Google+. They also do some of the smartest cropping and ratio management (or, at least, the documentation on their approach is more complete and transparent than others).

As an interesting aside, Google+ ignores robots.txt directives, and so may unwittingly expose private/hidden assets.
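Distilling the observations above into code gives something like the following sketch. These behaviors are empirical and undocumented, so treat this as a rough summary, not a reference:

```python
# Rough lookup distilling the behaviors observed above. These rules are
# empirical, undocumented, and liable to change without notice.
def pick_og_image(platform, og_images, twitter_image=None):
    """Return the image a platform will most likely use from an og:image set."""
    if not og_images:
        return None
    if platform == "twitter":
        # twitter:image overrides og:image; otherwise Twitter takes the last one.
        return twitter_image or og_images[-1]
    if platform == "whatsapp":
        return og_images[-1]  # observed: last image, hard-cropped to a square
    # Facebook, Pinterest, and most other platforms take the first image.
    return og_images[0]

images = ["large.jpg", "medium.jpg", "small.jpg"]
print(pick_og_image("facebook", images))                           # → large.jpg
print(pick_og_image("whatsapp", images))                           # → small.jpg
print(pick_og_image("twitter", images, twitter_image="card.jpg"))  # → card.jpg
```

The takeaway: the first and last positions in an og:image set are the only ones you can meaningfully control, which shapes the decisions described below.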

What we’ve considered, and our decisions

That’s a lot of moving parts. We need to compare all of these rules and decide which og:image tags we output for any given post or page, on any given WordPress site running the Yoast SEO plugin. That means optimizing for the most common and general use-cases, without breaking too many edge-cases. It also means providing tools, hooks, and filters in WordPress to allow users with special requirements to alter the behavior to meet their particular needs.

That’s why we’re choosing to optimize the first image in the og:image set for large, high-resolution sharing – the kind which Facebook supports and requires, but which causes issues with networks which expect a smaller image (like Instagram, or Telegram).

Whilst you could argue that Facebook might not necessarily wield the influence and domination that it used to, it’s undoubtedly still a significant platform in terms of audience size, and a place where page/post sharing is prolific – and where the quality/treatment of the image is critical to click through.

Given that the default for both Facebook and most other platforms is the first og:image tag in a set, we must ensure that this image is a large, high-quality image (with a suitable aspect ratio for Facebook). Unfortunately, this approach has some side-effects. There’ll be many cases where the image used is too large for Instagram (and other platforms which expect small thumbnails) to feature in shared post links.

We’re not completely happy with this as a solution, but it’s the best compromise we can come up with. As an aside, we also believe that, in its current state, Open Graph markup is a bit broken. We think that it feels intuitively right that the first and default og:image in a set should be a high quality, high-resolution image – and that it’s the responsibility of the platform to crop this down appropriately, or to use secondary/smaller og:images, or to provide their own markup/solutions.

Ideally, Open Graph tags would inherit some of the kinds of thinking behind CSS media queries, where you can specify the different screen widths at which different sets of logic apply. We’ll be seeking to lobby and work with the various platforms to improve their support and collaboration in the coming months and years.

User context is an important factor

We also think that this compromise makes more sense than optimizing for smaller images, because the context in which smaller thumbnails are used is different.

We believe that individuals sharing URLs in one-to-one chat, or in small groups (e.g., in WhatsApp), are less likely to be negatively affected by a missing (or awkwardly sized/cropped) image. They’re usually chatting, engaged, and know the sender.

In the context of a newsfeed, like on Facebook or Twitter, the quality of the image is much more important – you’re scrolling through lots of noise, you’re less engaged, and a better image means an increased chance of a click/share/like.

In the case of Pinterest, and other systems where your interest is the image itself, we believe that most interaction occurs directly on the image, rather than from the page it’s on or a browser bookmarklet. Given this, we’re less concerned about Pinterest using the first og:image tag (which is a large image, optimized for Facebook) as a small, square thumbnail.

There’s an upper size limit, too

The biggest image you can have (both in terms of file size and dimensions) varies by platform, too. Some platforms support huge images (Facebook allows images of up to 8MB) – but they chop these down into smaller thumbnails depending on the context. Others have relatively small maximum sizes, like LinkedIn and Telegram’s 2MB limit.

This makes it even more challenging to determine what the ‘best’ image should be, and which images should feature in the og:image set. We want to show a large, high-resolution image, but not too large.

It’s particularly tricky to pick the best size with WordPress, where we’re not always sure what image sizes we’ll be working with. That’s because, when a user uploads an image to WordPress, their site creates multiple versions of that image with different sizes and cropping. Typically, these are the original ‘full’ size, plus ‘large’ (max 1024×1024px), ‘medium_large’ (768px wide), ‘medium’ (max 300×300px) and ‘thumbnail’ (150×150px, cropped) versions. But these default sizes are often altered by WordPress theme or plugin code, and by server configuration – and frequently, some might be too large for general use.

Because we need to make sure that the first og:image is suitable as a general default for as many platforms as possible, the Yoast SEO plugin evaluates post content, spots all of the images, and tries to pick the best size for each post. To get this right, we’ve evaluated the maximum file sizes and dimensions of a number of platforms, and we’ve set some automatic restrictions in the plugin.


  • When the ‘full’ size image is over 2MB in file size, and/or over 2,000 pixels on either axis, and/or its aspect ratio exceeds 3:1, we’ll try to fall back to a smaller standard WordPress image size.
  • If we can’t find a suitable smaller image, we’ll continue to use the ‘full’ size. Note that this may result in the image not appearing in some sharing contexts.
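As a sketch, the fallback rules above might look something like this (function and variable names are illustrative, not the plugin’s actual code):

```python
# Hypothetical sketch of the size-fallback heuristic described above.

def suitable_for_og(width, height, file_size_bytes):
    """Check an image against the limits described above."""
    if file_size_bytes > 2 * 1024 * 1024:      # over 2MB file size
        return False
    if width > 2000 or height > 2000:          # over 2,000px on either axis
        return False
    if max(width, height) / min(width, height) > 3:  # ratio exceeds 3:1
        return False
    return True

def pick_og_image(sizes):
    """sizes: list of (name, width, height, bytes), ordered 'full' first.
    Fall back through smaller standard sizes; keep 'full' as last resort."""
    for name, w, h, b in sizes:
        if suitable_for_og(w, h, b):
            return name
    return sizes[0][0]  # no suitable smaller image: continue using 'full'
```

Note that, as the second bullet says, keeping the ‘full’ size as a last resort may still result in the image not appearing in some sharing contexts.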

In conclusion…

We don’t want our users to have to micromanage the details of how all of this works. Of course, when you’re producing great content for your audience, you should consider how that content appears on third-party and social platforms. But it should be as simple as picking an appropriate image and letting the system do the rest – from sizing and file management, to ensuring that the best version shows up when it’s shared in other locations. Because every platform follows its own rules, however, we’ve had to make some decisions which won’t please every user and won’t solve for every scenario.

For most normal use-cases, we’d suggest that you manually set og:image values on your posts via the Yoast SEO plugin, and ensure that their dimensions are between 1200×800px and 2000×1600px (and that they’re less than 2MB in size).

If you disagree with the decisions we’ve made, or want to help us improve our solution, we’d love for you to get in touch. WordPress and Yoast SEO are both open source products – you can help by explaining your use-cases, reporting your bugs, and thinking about how a better solution might work.

We’d love to hear your thoughts; the Open Graph is a mess at the moment, and it’s up to all of us to fix it.

Some additional technical details

We’ve taken some liberties with the og:image markup, and we’re aware that we’re adding quite a lot of weight and markup with this approach. Specifically, we’ll output HTML which looks something like this:

<meta property="og:image" content="https://www.example.com/main-image.jpg" />
<meta property="og:image:secure_url" content="https://www.example.com/main-image.jpg" />
<meta property="og:image:height" content="2000" />
<meta property="og:image:width" content="2000" />
<meta property="og:image:alt" content="A description of the image" />
<meta property="og:image:type" content="image/jpeg" />
<meta property="og:image" content="https://www.example.com/second-image.jpg" />
<meta property="og:image:secure_url" content="https://www.example.com/second-image.jpg" />
<meta property="og:image:height" content="800" />
<meta property="og:image:width" content="800" />
<meta property="og:image:alt" content="A description of the image" />
<meta property="og:image:type" content="image/jpeg" />
<meta property="og:image" content="https://www.example.com/third-image.jpg" />
<meta property="og:image:secure_url" content="https://www.example.com/third-image.jpg" />
<meta property="og:image:height" content="600" />
<meta property="og:image:width" content="400" />
<meta property="og:image:alt" content="A description of the image" />
<meta property="og:image:type" content="image/jpeg" />
<meta property="og:image" content="https://www.example.com/fourth-image.jpg" />
<meta property="og:image:secure_url" content="https://www.example.com/fourth-image.jpg" />
<meta property="og:image:height" content="256" />
<meta property="og:image:width" content="256" />
<meta property="og:image:alt" content="A description of the image" />
<meta property="og:image:type" content="image/jpeg" />

Note the progression ‘down’ from ‘large, high-quality image’, through different media sizes (depending on the site/theme setup), ending at a universal ‘small’ size.
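For illustration, a tag set like the one above could be generated with a small helper along these lines (a hypothetical sketch in Python, not the plugin’s actual PHP):

```python
from html import escape

def og_image_tags(images):
    """images: list of dicts with url, width, height, alt, mime — ordered
    from a large, high-quality image down to a small universal size."""
    lines = []
    for img in images:
        url = escape(img["url"], quote=True)
        lines.append(f'<meta property="og:image" content="{url}" />')
        lines.append(f'<meta property="og:image:secure_url" content="{url}" />')
        height = img["height"]
        width = img["width"]
        lines.append(f'<meta property="og:image:height" content="{height}" />')
        lines.append(f'<meta property="og:image:width" content="{width}" />')
        alt = img.get("alt")
        if alt:  # only output alt when the image has been labeled
            alt = escape(alt, quote=True)
            lines.append(f'<meta property="og:image:alt" content="{alt}" />')
        mime = img["mime"]
        lines.append(f'<meta property="og:image:type" content="{mime}" />')
    return "\n".join(lines)
```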

We’ll also output a twitter:image tag, alongside the other twitter:card requirements (unless you’ve chosen to disable it in the Yoast SEO plugin config).

We’ll likely continue to iterate and improve on the approach, but here’s our rationale behind some of our assumptions:

  • The og:image:type may not be strictly necessary in all cases, but there are many websites and server configurations where the images don’t have clean and recognizable ‘.jpg’ (or similar) file extensions. By making sure that we signpost the type of file, rather than making networks work it out, we reduce the risk of errors.
  • Facebook’s documentation around how it uses secure_url tags is unclear, especially for sites which are fully HTTPS. However, in the case of video tags, it mentions explicitly that both tags are required, even if both feature the same HTTPS URL. As such, we’ll retain the secure_url tags even when your site and image are served over HTTPS.
  • It’s generally considered best practice to label images with descriptive alt attributes, in order to support users who rely on screen readers and assistive technologies. We believe that Open Graph image tags shouldn’t be any different. This tag is only output when your images are labeled, so we’d encourage you to write some descriptive text during your image upload workflow.
  • Our 2MB file size limit aligns, incidentally, with the default upload size limit set in most WordPress installations which run on ‘off the shelf’ hosting.
  • Our 2000×2000 pixel size cap should be a suitable maximum for almost all websites and screen sizes. Most browsers on desktop monitors have a width of less than 2,000 pixels (4k monitors and upwards often use image scaling to prevent everything from looking tiny). It’s also rare for any sharing ‘thumbnail’ element (e.g., a Facebook message preview box) to take up the full width of the screen.
  • As Google+ isn’t widely used, we’ve chosen not to add additional complexity to the Yoast SEO plugin by offering the ability to specify dedicated, schema-based image markup for Google+ images. In most cases, we believe that the default og:image should be a suitable option for Google+ sharing – though we’re keen to hear from you if you find that this is not the case.
  • Unlike most other networks, WhatsApp supports SVG file formats. That means that, in theory, you could achieve optimal sharing for both WhatsApp and Facebook by setting your first og:image to be an SVG, and your second og:image to be your full-resolution, large image. However, many other networks only read the first image, and won’t use the SVG file. SVG formats also come with a myriad of security risks, so we’re not comfortable recommending their general use in this context.
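The file-type signposting described in the first bullet above can be sketched with Python’s standard library (a simplified illustration; a real implementation would inspect the file bytes when the URL has no recognizable extension):

```python
import mimetypes

def og_image_type(url):
    """Guess a MIME type for the og:image:type tag from the URL's extension.
    Falls back to JPEG as an assumed default when the extension is missing
    or unrecognized — the messy-server-configuration case described above."""
    mime, _ = mimetypes.guess_type(url)
    return mime or "image/jpeg"
```

Signposting the type this way, rather than making networks work it out, reduces the risk of errors on sites where images lack clean file extensions.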

Some undocumented Facebook ‘features’

If you’re feeling particularly geeky, you might also enjoy the following discoveries!

  • In addition to specifying the URL of an image, you can specify its height and width. This has some benefits, including encouraging Facebook to pre-cache the image on the first time it’s shared. However, when you specify multiple og:image tags, invalid/malformed height/width in any of those tags may cause problems with all of your images. E.g., an invalid og:image:height or og:image:width value on an image which isn’t chosen, prevents pre-rendering.
  • Specifying an image triggers the pre-caching process, regardless of whether it’s correct or not.
  • Facebook has different ‘share layout’ sizes, depending on the image size. Small images don’t scale up to a large preview very well, so they provide a condensed layout. However, the share layout size sometimes defaults to accommodating the smallest image from a set (e.g., if you have 10 huge og:image tags, and 1 small one, you sometimes get the small share layout).
  • Facebook also sometimes falls back to the ‘small’ layout if you have too many broken images in your set (as Facebook’s broken image file is only 540×300).
  • Setting incorrect image sizes lets you upscale small images in the Facebook debugger, but it doesn’t look like this is respected in the share dialogue. You cannot ‘downscale’ images, as far as I can tell. There’s an “upscale=1” parameter in the version of the image which Facebook creates, which I suspect controls this.
  • Large images break! The maximum square image size appears to be in the region of 9200×9200. However, some images with unequal dimensions larger than this, but a lower total area (e.g., 10000×3000), work, as long as they maintain a 3:1 ratio or higher. This suggests that the boundaries might be based, in part, on a maximum square area of ~95,000,000 pixels.
  • As a minor additional note, when sharing a too-large image directly (i.e., linking directly to the file itself), Facebook just shows a blank image – there’s no fallback file/function used in their ‘safe image’ cleaner in this context.
  • Facebook supports a really interesting feature which lets you build relationships between pages featuring partial/linked Open Graph information. This, theoretically, allows you to ‘inherit’ and/or place centralized og markup elsewhere, reducing page weight. This might be useful for mobile/responsive subdomains, and some internationalization/versioning scenarios, perhaps, if other platforms supported it.
  • Additionally, while cloaking is often a risky tactic in SEO (and frowned upon by Google), Facebook actively suggests (see “You can target one of these user agents to serve the crawler a nonpublic version of your page”) cloaking mechanisms as a method of managing paywalls, struggling servers, and other scenarios.
  • Despite claiming otherwise in its documentation, Facebook’s crawler – like Google+’s – appears to disregard robots.txt directives; it’ll happily fetch Open Graph values from pages which are blocked by robots.txt files.

Some potential “hacks”?

Being stuck having to ‘share’ the first og:image tag between multiple networks is, as we’re discovering, limiting and not ideal. Sticking to the Open Graph standards, as they’re written, lumps us with all sorts of unfortunate compromises and dead-ends.

So what if we think outside the rules a bit?

If we’re creative, there may be some clever tricks or undocumented approaches which we can use to bypass, confuse, or force certain behaviors from some of the trickier platforms.

Here are a few approaches we’ve considered, but ultimately discarded.

Can we use platform detection to create a dynamic solution?

Imagine for a moment, that every time a page is shared, the platform visits that page, reads the og:image tag(s), and grabs the image.

Theoretically, the Yoast SEO plugin could detect which platform is requesting the page or image, and execute some clever code to serve it the perfect image for its requirements.

That’d enable us to ensure that, regardless of what’s being shared, and where, we could intelligently make sure that the first og:image is the best option for the scenario.

However, the platforms don’t visit every time – they visit once, and save a copy of the tags (and the images) they found for a little while. That length of time, and the scenarios which cause them to revisit and/or update their cached version vary wildly by platform.

Still, theoretically, the Yoast SEO plugin could try and serve the right tags to the platform when they do visit, at the moment when they create their cache. But this approach relies heavily on the platforms identifying themselves when they visit (and on us recognizing their identities), on the website in question having a certain type of server configuration, and on our software doing some tricksy logic around deciding which tag(s) to show.

It might also open things up to forms of abuse by users and platforms who falsify their identities, and it won’t work for any website running any form of advanced caching (where the static pages are served to most visitors).

It all gets pretty complicated, and it’s not a robust enough approach to rely on.
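To make the discarded idea concrete, the detection step might have looked something like this (the user-agent substrings are real crawler identifiers, but the size mapping is illustrative – and, as noted, identities can be spoofed, which is part of why we rejected the approach):

```python
# Sketch of the (discarded) platform-detection idea: match the crawler's
# User-Agent and choose which og:image should come first accordingly.
PLATFORM_HINTS = {
    "facebookexternalhit": "large",  # Facebook's link-preview crawler
    "twitterbot": "large",
    "telegrambot": "small",
    "whatsapp": "small",
}

def preferred_size(user_agent):
    """Return which image size to lead with for this crawler."""
    ua = user_agent.lower()
    for hint, size in PLATFORM_HINTS.items():
        if hint in ua:
            return size
    return "large"  # default: the big, Facebook-friendly image first
```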

Hidden image techniques

Some platforms, like Pinterest, do more than just grab the og:image tag(s) – they scan the page and look for other images too. That means that we can place sharing-optimized images outside of the Open Graph tags, as part of the page content, and hope that users select these when sharing.

In most scenarios, those images don’t necessarily need to be visible, in order to be discovered. We can place hidden images in the source code of a page.

Unfortunately, this technique doesn’t help in most cases, other than increasing the chance of an image showing up in a selection process (e.g., where Facebook allows you to select a thumbnail from an og:image set, or where Pinterest allows you to choose an image to share from a page when using their browser bookmarklet).

In most scenarios, in-page images are ignored when og:image tags are specified. Hiding images can also cause unexpected side-effects, such as accessibility issues, or slower page loads – especially when sharing the kinds of large images which platforms like Pinterest or Facebook look for in cases like this.

Publish-and-switch techniques

For some platforms, like Facebook, we can ‘push’ a specific image to them, by setting the first og:image and sending a request to their cache-busting URL.

We could adapt the Yoast SEO plugin to set a specific image as the first in the set, then ping the relevant platform’s cache-busting system to update the image. Then we could change the og:image ordering and repeat the process, setting the best image for each platform which allows for remote cache updating.

Unfortunately, this only allows us to set the initial image. When their caches expire, we’re back to square one – they’ll use the logic we’ve outlined to pick their preferred images.

To repeat the cache-setting process, we’d need to constantly juggle the order of the tags, in line with the specific cache-expiry times of each platform. This adds a wealth of complexity, and simply isn’t feasible in most site setups (especially those running their own caching solutions), and only a few platforms support this kind of cache-busting.

More importantly, we want to avoid tactical hacks

As we’ve explored, we’re not keen on going to these kinds of lengths to try and fix a problem which could be fixed so much more effectively, and comprehensively, by the platforms themselves, and improvements to the Open Graph protocol.

We’re not averse to implementing clever, technical solutions to help get the image selection process right, but we’d rather work with the platforms to address the problems at the source than tackle the symptoms across the 8 million+ websites running the Yoast SEO plugin.

We’d love to hear your thoughts on how you think WordPress, Yoast SEO, the Open Graph Protocol and big platforms like Facebook and Twitter might be able to work together better!

Read on: ‘Social media optimization with Yoast SEO’ »

The post Advanced Technical SEO: How social image sharing works and how to optimize your og:image tags appeared first on Yoast.

Search engines like Google have a problem. It’s called ‘duplicate content.’ Duplicate content means that similar content is being shown on multiple locations (URLs) on the web. As a result, search engines don’t know which URL to show in the search results. This can hurt the ranking of a webpage. Especially when people start linking to all the different versions of the content, the problem becomes bigger. This article will help you to understand the various causes of duplicate content, and to find the solution for each of them.

What is duplicate content?

You can compare duplicate content to being on a crossroad. Road signs are pointing in two different directions for the same final destination: which road should you take? And now, to make it ‘worse’ the final destination is different too, but only ever so slightly. As a reader, you don’t mind: you get the content you came for. A search engine has to pick which one to show in the search results. It, of course, doesn’t want to show the same content twice.

Let’s say your article about ‘keyword x’ appears on http://www.example.com/keyword-x/ and the same content also appears on http://www.example.com/article-category/keyword-x/. This situation is not fictitious: it happens in lots of modern Content Management Systems. Your article has been picked up by several bloggers. Some of them link to the first URL; others link to the second URL. This is when the search engine’s problem shows its real nature: it’s your problem. The duplicate content is your problem because those links are both promoting different URLs. If they were all linking to the same URL, your chance of ranking for ‘keyword x’ would be higher.

Learn how to write awesome and SEO friendly articles in our SEO Copywriting training »

SEO copywriting training Info

Table of contents

1 Causes for duplicate content

There are dozens of causes of duplicate content. Most of them are technical: it’s not very often that a human decides to put the same content in two different places without distinguishing the source – it feels unnatural to most of us. The technical causes are plentiful though. They arise mostly because developers don’t think like a browser or a user, let alone a search engine spider – they think like a developer. That aforementioned article, that appears on http://www.example.com/keyword-x/ and http://www.example.com/article-category/keyword-x/? If you ask the developer, he’ll say it only exists once.

1.1 Misunderstanding the concept of a URL

Has that developer gone mad? No, he’s just speaking a different language. You see, a database system probably powers the whole website. In that database, there’s only one article; the website’s software just allows that same article in the database to be retrieved through several URLs. That’s because, in the eyes of the developer, the unique identifier for that article is the ID the article has in the database, not the URL. For the search engine, though, the URL is the unique identifier for a piece of content. If you explain that to a developer, he’ll start to get the problem. And after reading this article, you’ll even be able to provide him with a solution right away.

1.2 Session IDs

You often want to keep track of your visitors and make it possible, for instance, to store items they want to buy in a shopping cart. To do that, you need to give them a ‘session.’ A session is a brief history of what the visitor did on your site and can contain things like the items in their shopping cart. To maintain that session as a visitor clicks from one page to another, the unique identifier for that session, the so-called Session ID, needs to be stored somewhere. The most common solution is to do that with cookies. However, search engines usually don’t store cookies.

At that point, some systems fall back to using Session IDs in the URL. This means that every internal link on the website gets that Session ID appended to the URL, and because that Session ID is unique to that session, it creates a new URL, and thus duplicate content.

1.3 URL parameters used for tracking and sorting

Another cause for duplicate content is the use of URL parameters that do not change the content of a page, for instance in tracking links. You see, http://www.example.com/keyword-x/ and http://www.example.com/keyword-x/?source=rss are not the same URL for a search engine. The latter might allow you to track what source people came from, but it might also make it harder for you to rank well. A very unwanted side effect!

This doesn’t just go for tracking parameters, of course. It goes for every parameter you can add to a URL that doesn’t change the vital piece of content, whether that parameter is for ‘changing the sorting on a set of products’ or for ‘showing another sidebar’: they all cause duplicate content.
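One way to think about the fix is to collapse every variant back to one URL by stripping the non-content parameters. A minimal sketch, assuming an illustrative list of parameter names (extend it with whatever tracking, session, or display parameters your site actually uses):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Illustrative list of parameters that don't change the page's content.
NON_CONTENT_PARAMS = {"source", "utm_source", "utm_medium", "sessionid", "sort"}

def canonicalize(url):
    """Strip non-content query parameters so every variant of a page
    collapses to one canonical URL (fragments are dropped too, since
    they're never sent to the server)."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query)
            if k.lower() not in NON_CONTENT_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))
```

With this, http://www.example.com/keyword-x/?source=rss and http://www.example.com/keyword-x/ resolve to the same URL.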

1.4 Scrapers & content syndication

Most of the causes of duplicate content are your own, or at the very least your website’s, ‘fault.’ Sometimes, however, other websites use your content, with or without your consent. They don’t always link to your original article, so the search engine doesn’t ‘get’ it and has to deal with yet another version of the same article. The more popular your site becomes, the more scrapers it attracts, making this issue bigger and bigger.

1.5 Order of parameters

Another common cause is that a CMS doesn’t use nice and clean URLs, but rather URLs like /?id=1&cat=2, where ID refers to the article and cat refers to the category. The URL /?cat=2&id=1 will render the same results in most website systems, but they’re completely different for a search engine.
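A ‘URL factory’ that always emits parameters in the same order (an approach revisited in the solutions section below) can be sketched like this:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def url_factory(url):
    """Always emit query parameters in the same (sorted) order, so
    /?id=1&cat=2 and /?cat=2&id=1 become one and the same URL."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query))
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(params), parts.fragment))
```

If every internal link is built through a function like this, the parameter-order duplicates never appear in the first place.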

1.6 Comment pagination

In my beloved WordPress, but also in some other systems, there is an option to paginate your comments. This leads to the content being duplicated across the article URL, and the article URL + /comment-page-1/, /comment-page-2/ etc.

1.7 Printer friendly pages

If your content management system creates printer friendly pages and you link to those from your article pages, in most cases Google will find those, unless you specifically block them. Now, which version should Google show? The one laden with ads and peripheral content, or the one with just your article?

1.8 WWW vs. non-WWW

One of the oldest issues in the book, but sometimes search engines still get it wrong: WWW vs. non-WWW duplicate content, when both versions of your site are accessible. A less common situation, but one I’ve seen as well: HTTP vs. HTTPS duplicate content, where the same content is served out over both.

2 Conceptual solution: a ‘canonical’ URL

As determined above, the fact that several URLs lead to the same content is a problem, but it can be solved. A human working at a publication will normally be able to tell you quite easily what the ‘correct’ URL for a certain article should be. The funny thing is, though, sometimes when you ask three people in the same company, they’ll give three different answers…

That’s a problem that needs solving in those cases because, in the end, there can be only one (URL). That ‘correct’ URL for a piece of content has been dubbed the Canonical URL by the search engines.


Ironic side note

Canonical is a term stemming from the Roman Catholic tradition, where a list of sacred books was created and accepted as genuine. They were dubbed the canonical Gospels of the New Testament. The irony is: it took the Roman Catholic church about 300 years and numerous fights to come up with that canonical list, and they eventually chose four versions of the same story.

3 Identifying duplicate content issues

You might not know whether you have a duplicate content issue on your site or with your content. Let me give you some methods of finding out whether you do.

3.1 Google Search Console

Google Search Console is a great tool for identifying duplicate content. If you go into Search Console for your site, check under Search Appearance » HTML Improvements for an overview of pages with duplicate titles and meta descriptions.

If pages have duplicate titles or duplicate descriptions, that’s almost never a good thing. Clicking on it will reveal the URLs that have duplicate titles or descriptions and will help you identify the problem. The issue is that if you have an article like the one about keyword X, and it shows up in two categories, the titles might be different. They might, for instance, be ‘Keyword X – Category X – Example Site’ and ‘Keyword X – Category Y – Example Site’. Google won’t pick those up as duplicate titles, but you can find them by searching.

3.2 Searching for titles or snippets

There are several search operators that are very helpful in cases like these. If you want to find all the URLs on your site that contain your keyword X article, you’d type the following search phrase into Google:

site:example.com intitle:"Keyword X"

Google will then show you all pages on example.com that contain that keyword. The more specific you make that intitle part, the easier it is to weed out duplicate content. You can use the same method to identify duplicate content across the web. Let’s say the full title of your article was ‘Keyword X – why it is awesome’, you’d search for:

intitle:"Keyword X - why it is awesome"

And Google would give you all sites that match that title. Sometimes it’s worth even searching for one or two complete sentences from your article, as some scrapers might change the title. In some cases, when you do a search like that, Google might show a notice on the last page of results saying it has omitted some entries very similar to those already displayed.

This is a sign that Google is already ‘de-duping’ the results. It’s still not good, so it’s worth clicking the link and looking at all the other results to see whether you can fix some of those.

4 Practical solutions for duplicate content

Once you’ve decided which URL is the canonical URL for your piece of content, you have to start a process of canonicalization (yeah I know, try to say that three times out loud fast). This means we have to let the search engine know about the canonical version of a page and let it find that version as soon as possible. There are four methods of solving the problem, in order of preference:

  1. Not creating duplicate content
  2. Redirecting duplicate content to the canonical URL
  3. Adding a canonical link element to the duplicate page
  4. Adding an HTML link from the duplicate page to the canonical page

4.1 Avoiding duplicate content

Some of the above causes for duplicate content have very simple fixes to them:

  • Session IDs in your URLs?
    These can often just be disabled in your system’s settings.
  • Have duplicate printer friendly pages?
    These are completely unnecessary: you should just use a print style sheet.
  • Using comment pagination in WordPress?
    You should just disable this feature (under Settings » Discussion) on 99% of sites.
  • Parameters in a different order?
    Tell your programmer to build a script that always orders parameters in the same order (this is often referred to as a URL factory).
  • Tracking links issues?
    In most cases, you can use hash-based campaign tracking instead of parameter-based campaign tracking.
  • WWW vs. non-WWW issues?
    Pick one and stick with it by redirecting the one to the other. You can also set a preference in Google Search Console, but you’ll have to claim both versions of the domain name.

If you can’t fix your problem that easily, it might still be worth it to put in the effort. The goal would be to prevent the duplicate content from appearing altogether. It’s by far the best solution to the problem.

4.2 301 Redirecting duplicate content

In some cases, it’s impossible to entirely prevent the system you’re using from creating wrong URLs for content, but sometimes it is possible to redirect them. If this isn’t logical to you (which I can understand), do keep it in mind while talking to your developers. If you do get rid of some of the duplicate content issues, make sure that you redirect all the old duplicate content URLs to the proper canonical URLs. 

4.3 Using rel=”canonical” links

Sometimes you don’t want to, or can’t, get rid of a duplicate version of an article, even when you do know that it’s the wrong URL. For that particular issue, the search engines have introduced the canonical link element. It’s placed in the <head> section of your page, and it looks like this:

<link rel="canonical" href="http://example.com/wordpress/seo-plugin/">

In the href attribute of the canonical link, you place the correct canonical URL for your article. When a search engine that supports the canonical link element finds it, it performs what amounts to a soft 301 redirect, transferring most of the link value gathered by that page to your canonical page.
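As a quick way to verify what a page actually declares, you can pull the canonical link out of its HTML with Python’s standard-library parser (a checking aid for your own pages, not something the search engines do):

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Extract the rel="canonical" href from a page's HTML."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical":
            self.canonical = a.get("href")

finder = CanonicalFinder()
finder.feed('<link rel="canonical" href="http://example.com/wordpress/seo-plugin/">')
# finder.canonical is now "http://example.com/wordpress/seo-plugin/"
```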

This process is a bit slower than the 301 redirect though, so if you can do a 301 redirect that would be preferable, as mentioned by Google’s John Mueller.

Read more: ‘rel=canonical • What it is and how (not) to use it’ »

4.4 Linking back to the original content

If you can’t do any of the above, possibly because you don’t control the <head> section of the site your content appears on, adding a link back to the original article on top of or below the article is always a good idea. You might also want to do this in your RSS feed: add a link back to the article in it. Some scrapers will filter that link out, but others might leave it in. If Google encounters several links pointing to your article, it will figure out soon enough that that’s the actual canonical version of the article.

5 Conclusion: duplicate content is fixable, and should be fixed

Duplicate content happens everywhere. I have yet to encounter a site of more than 1,000 pages that doesn’t have at least a tiny duplicate content problem. It’s something you need to keep an eye on at all times. It is fixable though, and the rewards can be plentiful. Your quality content might soar in the rankings just by getting rid of duplicate content on your site!

Keep reading: ‘ Ask Yoast: webshops and duplicate content’ »

The post Duplicate content: causes and solutions appeared first on Yoast.